Scaffold proteins derived from plant cystatins

Abstract

The present invention relates to scaffold proteins derived from plant cystatins and to nucleic acids encoding them. The scaffolds are highly stable and have the ability to display peptides. The scaffolds are particularly well suited for constructing libraries, e.g., in phage display or related systems. The invention also relates to various uses of the scaffolds, including in therapy, diagnosis, environmental and security monitoring, synthetic biology and research, and to cells and cell cultures expressing the scaffold proteins.

Claims

1. A nucleic acid comprising a coding sequence encoding a synthetic scaffold protein including a scaffold portion having an amino acid sequence which is at least 85% identical to TABLE-US-00024 (SEQ ID NO: 1) NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQVVAGTMYYLTL EAKDGGKKKLYEAKVWVKPWENFKELQEFKPVGDA, wherein the scaffold portion further comprises at least one heterologous peptide of from 3 to 20 amino acids in length, wherein the heterologous peptide is inserted: (a) adjacent to any of amino acid residues VVAG (SEQ ID NO:82); (b) in replacement of amino acid residues VVAG (SEQ ID NO:82); (c) adjacent to any of amino acid residues PWE; (d) in replacement of amino acid residues PWE; or (e) any combination of (a)-(d) thereof; wherein said amino acid residues VVAG (SEQ ID NO:82) or PWE are present as a contiguous sequence in the amino acid sequence of SEQ ID NO:1, wherein the heterologous peptide is heterologous to the amino acid sequence of SEQ ID NO:1, and wherein the at least one heterologous peptide inserted into the scaffold portion is disregarded when calculating sequence identity of the scaffold portion.

2. The nucleic acid of claim 1, wherein the scaffold portion comprises an additional amino acid sequence at the N-terminus.

3. The nucleic acid of claim 2, wherein the additional amino acid sequence at the N-terminus comprises the sequence MATGVRAVPGNE (SEQ ID NO:80).

4. The nucleic acid of claim 1, wherein the scaffold portion of the synthetic scaffold protein has a melting temperature (Tm) of at least 70° C.

5. The nucleic acid of claim 1, wherein the synthetic scaffold protein is a fusion protein further comprising one or more additional polypeptide sequences attached at the N- or C-terminal, or both, to the scaffold portion.

6. The nucleic acid of claim 5, wherein the one or more additional polypeptide sequences include polypeptides selected from signal sequences, leader sequences, targeting sequences, purification tag, linker sequences, phage coat proteins or other protein for surface display.

7. The nucleic acid of claim 5, wherein the one or more additional polypeptide sequences includes one or more additional scaffold portions, each of which independently can bind the same or different target entities.

8. The nucleic acid of claim 7, wherein the synthetic scaffold protein is a homo-multimer of two or more of the same scaffold portions.

9. The nucleic acid of claim 7, wherein the synthetic scaffold protein is a hetero-multimer of two or more different scaffold portions.

10. The nucleic acid of claim 1, further comprising one or more expression control sequences operably linked to the coding sequence encoding the synthetic scaffold protein.

11. The nucleic acid of claim 10, wherein the expression control sequences include one or more of a promoter sequence and/or enhancer sequence.

12. The nucleic acid of claim 10, wherein the expression control sequences control expression of the synthetic scaffold protein in eukaryotic cells.

13. The nucleic acid of claim 12, wherein the expression control sequences control expression of the synthetic scaffold protein in human and/or yeast cells.

14. The nucleic acid of claim 10, wherein the expression control sequences control expression of the synthetic scaffold protein in prokaryotic cells.

15. The nucleic acid of claim 10, wherein the expression control sequences control expression of the synthetic scaffold protein in plant cells.

16. A vector comprising the nucleic acid of claim 1, one or more expression control sequences operably linked to the coding sequence encoding the synthetic scaffold, and an origin of replication.

17. A cell comprising the vector of claim 16.

18. A cell comprising the nucleic acid of claim 1 and expressing the synthetic scaffold protein.

19. A library comprising a population of different nucleic acids of claim 1, wherein the library of nucleic acids encode a variety of synthetic scaffolds having different sequences.

20. The library of claim 19 having a complexity of 10.sup.8 or higher.

21. A nucleic acid comprising a coding sequence encoding a synthetic scaffold protein comprising a scaffold portion having an amino acid sequence represented in: TABLE-US-00025 (SEQ ID NO: 4) NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ(X.sub.n)TMYYLTLEAKD GGKKKLYEAKVWVK(X.sub.n)NFKELQEFKPVGDA or an amino acid sequence which is at least 85% identical to the amino acid sequence of SEQ ID NO:4, wherein X is independently for each occurrence any amino acid, and n is from 3 to 20.

22. The nucleic acid of claim 21, wherein the scaffold portion has an amino acid sequence represented in: TABLE-US-00026 (SEQ ID NO: 5) NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ(X.sub.5-13)TMYYLTL EAKDGGKKKLYEAKVWVK(X.sub.5-13)NFKELQEFKPVGDA wherein X is independently for each occurrence any amino acid.

23. The nucleic acid of claim 21, wherein the scaffold portion has an amino acid sequence represented in: TABLE-US-00027 (SEQ ID NO: 6) NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ(X.sub.9)TMYYLTLEA KDGGKKKLYEAKVWVK(X.sub.9) NFKELQEFKPVGDA wherein X is independently for each occurrence any amino acid.

24. A transgene comprising a coding sequence encoding a synthetic scaffold protein comprising a scaffold portion which binds to a plant nematode, the scaffold portion having an amino acid sequence represented in: TABLE-US-00028 (SEQ ID NO: 4) NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ(X.sub.n)TMYYLTLEAKD GGKKKLYEAKVWVK(X.sub.n)NFKELQEFKPVGDA or an amino acid sequence which is at least 85% identical to the amino acid sequence of SEQ ID NO:4, wherein X is independently for each occurrence any amino acid, and n is from 3 to 20, and wherein the coding sequence for the synthetic scaffold protein is operably linked to expression control sequences for expressing synthetic scaffold protein in plant cells.

Description

SPECIFIC EMBODIMENTS OF THE INVENTION

(1) Embodiments of the invention will now be described, by way of example only, with reference to the accompanying drawings in which:

(2) FIGS. 1A-1B show a consensus phytocystatin, PHYTC57, derived from 57 phytocystatin amino acid sequences. (FIG. 1A) PHYTC57 synthetic gene shown as a double-stranded sequence (SEQ ID NOS 91 and 92) together with the overlapping oligonucleotides (P1 to P6, SEQ ID NOS 85-90, respectively) used to generate the gene by recursive PCR. The coding sequence is also shown in single-letter amino acid code (SEQ ID NO 94). The positions of the two restriction sites SfiI and NotI are shown. (FIG. 1B) Schematic representation of the cloning region of pDHisII, which is based on pHEN1. The positions of the relevant regions encoding the pelB signal sequence, hexahistidine tag and N-terminal section of the M13 phage PIII protein are shown. The positions of the standard M13 primer binding sites for M13R and P10 are indicated, as are the unique SfiI and NotI restriction sites.

(3) FIG. 2 shows a graph of the results of papain inhibition assays for a modified oryzacystatin lacking residue Asp86 (OSA-ID86) and for PHYTC57 at varying concentrations of phytocystatin, showing the enhanced efficacy of PHYTC57.

(4) FIG. 3 shows surface plasmon resonance measurement of the interaction between papain and cystatins immobilized on an NTA sensorchip. The experiments were performed at several concentrations of papain and data were analysed using the Biaevaluation3 software package (BIACORE) with global fitting of the data to the Langmuir 1:1 binding model. OSA-ID86, a modified rice cystatin I; CPA, papaya cystatin; CUN, orange cystatin; CEWC, chicken egg white cystatin; PHYTC57, consensus cystatin.

(5) FIGS. 4A-4D show: (FIG. 4A) Thermal stability of OSA-ID86 (grey square) and PHYTC57 (black triangle) shown as residual enzyme activity assayed at 25° C. following incubation at 100° C. for the times shown. PHYTC57 displays greater thermal stability than does OSA-ID86. (FIG. 4B) Effect of simulated gastric fluid treatment on the inhibitory properties of OSA-ID86 (grey square) and PHYTC57 (black triangle), showing that PHYTC57 retains activity over a longer period of time than does OSA-ID86. (FIG. 4C) SDS-PAGE analysis of the stability of PHYTC57 and OSA-ID86 to incubation with simulated gastric fluid. The time of incubation is shown in seconds and the positions of marker proteins (M) are indicated on the left in kDa, with the positions of pepsin from the assay and of the cystatin indicated. (FIG. 4D) Western blot analysis of PHYTC57 following simulated gastric fluid treatment for the times indicated (seconds) using an anti-His tag antibody. The positions of marker proteins (M) are indicated in kDa.

(6) FIGS. 5A-5B show a comparison of PHYTC57 and OSA-ID86, indicating the amino acid differences between the proteins PHYTC57 (SEQ ID NO 94) and OSA-ID86 (SEQ ID NO 93). (FIG. 5A) Alignment of protein sequences. (FIG. 5B) Representation of the positions of the amino acid changes from FIG. 5A on the 3D structure of OSA-I, with residue changes labeled. The binding region is also shown with the N-terminal, QVVAG (SEQ ID NO 83) and PW loops labeled.

(7) FIGS. 6A-6B show a crystal structure of an Adhiron isolated from the library and sequence of an Adhiron scaffold derived from PHYTC57. (FIG. 6A) The Adhiron scaffold is shown and the inserted loops are indicated. (FIG. 6B) Codon optimised nucleic acid sequence (SEQ ID NO 96) and amino acid sequence (SEQ ID NO 97) for the Adhiron 81 amino acid scaffold. The alpha helix, beta sheets and the insertion regions for loop1 and loop2 are highlighted.

(8) FIGS. 7A-7C show biochemical characterisation of the Adhiron scaffold and sequencing of the library. (FIG. 7A) Differential scanning calorimetry was performed to determine the melting temperature of the Adhiron scaffold (Tm 101° C.). (FIG. 7B) Circular dichroism was used to examine the structure of the Adhiron scaffold and of three selected Adhiron proteins containing loop insertions; all are highly structured. (FIG. 7C) The Adhiron phage library was used to infect E. coli ER2738 cells. 96 random clones were isolated and sequenced. The graph represents the percentage of each amino acid within the loop regions. An ideal library would contain 5.26% of each of the 19 amino acids included; cysteine was not included in the library.

(9) FIGS. 8A-8C show a comparison of the stability of (FIG. 8A) the Adhiron scaffold with that of (FIG. 8B) a representative small, soluble, well-characterised protein, lysozyme, by differential scanning calorimetry. (FIG. 8C) shows that, for an Adhiron selected to bind to a myc antibody, the addition of the loops into the scaffold reduces the Tm to 85° C., but this still represents a higher melting temperature than that of most scaffold proteins. This Adhiron protein can undergo repeated cycles of denaturation and renaturation, as shown by the series of scans.

(10) FIGS. 9A-9B show phage ELISA results for yeast SUMO (ySUMO). (FIG. 9A) Phage ELISA using 24 clones isolated from the third panning round. Phage produced by each clone were incubated in wells containing ySUMO or control. The image was recorded three minutes after addition of 3,3′,5,5′-tetramethylbenzidine (TMB) substrate. (FIG. 9B) Graph showing the absorbance at 560 nm of the phage ELISA for ySUMO (hatched) and control (white) wells.

(11) FIGS. 10A-10D show purification and characterisation of an Adhiron specific for yeast SUMO (Ad-ySUMO). (FIG. 10A) Ad-ySUMO was expressed in BL21 cells and cell lysates were heated to 50° C., 60° C., 70° C., 80° C., 90° C. and 100° C. for 20 minutes, and the precipitate was removed by centrifugation at 15,000×g. Aliquots of 5 μl of cleared lysates for each temperature were separated on a 15% SDS-PAGE gel and stained with coomassie to visualise the proteins. (FIG. 10B) Lysates were incubated with Ni-NTA beads for 1 hr. Post-incubation lysate (5 μl) and purified Ad-ySUMO (10 μl) were run on a 15% SDS-PAGE gel and stained with coomassie for visualisation. (FIG. 10C) Biotinylated Ad-ySUMOs were used to detect ySUMO (hatched bars) and did not detect human SUMO (white bars) by ELISA. (FIG. 10D) Western blots using biotinylated Ad-ySUMO clones 10, 15, 20, and 22 against 0.5 μg of yeast SUMO (upper panel) and against the same amount mixed with 20 μg of HEK293 cell lysate (lower panel).

(12) FIGS. 11A-11C show phage ELISA results for Adhirons identified in screens against a series of targets, i.e. growth factor protein FGF1 (FIG. 11A), a cell surface receptor CD31 (FIG. 11B), and a peptide (FIG. 11C). Graphs represent the absorbance readings of each well after the addition of TMB. The wells containing the target are shown as hatched bars whilst the control wells are shown as white bars.

(13) FIG. 12 shows immunofluorescence images of HPV16 E5 GFP (target) and HPV16 E5 GFP without epitope for Adhiron (control). Adhiron E5 was conjugated to quantum dots and used to detect E5 protein in the mammalian cells. Cells were stained with DAPI (DNA stain), GFP, and E5 (with the Adhiron-Quantum dots).

(14) FIGS. 13A-13C show an Adhiron targeting hSUMO2. (FIG. 13A) An ABP raised against hSUMO2 was tested by ELISA to determine its specificity for hSUMO1 and hSUMO2. The hSUMO2 ABP specifically bound to hSUMO2, which was reflected in the binding affinity. (FIG. 13B) Blot showing the ability of the hSUMO2 ABP to inhibit the SUMO-targeted ubiquitin ligase activity of RNF4. (FIG. 13C) A control vector or hSUMO2 binder was expressed in cells, and the cells were cultured with arsenic to induce nuclear bodies and analysed using an anti-FLAG antibody; PML is shown in green. RNF4 promotes degradation of PML. Blocking the interaction between hSUMO2 and RNF4 alters PML degradation. This is the first description of a hSUMO2 binding protein that specifically binds to and inhibits hSUMO2 without interacting with hSUMO1. The hSUMO2 Adhiron also specifically blocks the domain with which it interacts and does not affect other functions of hSUMO2, as demonstrated in FIG. 13C.

(15) FIG. 14 shows a graph of a blood clot formation and lysis turbidity assay. The solid line represents normal clot formation and lysis. The dashed lines represent the effect of five different Adhirons on this process. The five Adhirons bind different epitopes and have different effects on clot formation and lysis. Some prolong clotting time, some prolong lysis time and some prolong both clotting and lysis time. One of the Adhirons completely inhibits clot formation. This demonstrates that the different Adhirons bind to and inhibit different regions of fibrinogen and therefore represent a novel way of studying protein function.

(16) FIG. 15 shows a confocal image of FITC labelled fibrinogen after clot formation with a fibrinogen binding Adhiron and a control Adhiron that does not bind fibrinogen. This demonstrates the ability of the Adhiron to modify the normal clot response.

(17) FIG. 16 shows fluoroscopy images that demonstrate the expression of functional Adhirons in mammalian cells. It can be seen that the human SUMO2 binding Adhiron alters the degradation of the nuclear phosphoprotein PML leading to an increase in these PML nuclear bodies.

(18) FIGS. 17A-17B show the co-crystal structure of FcγRIIIa and bound Adhiron. (FIG. 17A) The Adhiron binds to an allosteric site that affects receptor binding to IgG. This site has also been identified as a target for small molecule drug design, providing evidence that Adhirons can provide important information about druggable sites. (FIG. 17B) Adhirons have also been found to target the direct binding site of IgG on FcγRIIIa. The Adhiron contained two loops and a 4 amino acid unstructured N-terminal sequence. The N-terminal peptide also contributes to the interaction between Adhiron and FcγRIIIa. The information gained by understanding the binding interaction between the Adhiron and FcγRIIIa will help identify small molecules via in silico screening. This provides a novel approach for future drug discovery methodologies. In addition, the crystal structures demonstrate the ability of the scaffold to present loops that can either extend the beta sheets of the core scaffold or form more flexible loops that create alternative conformational interaction surfaces. This is facilitated by the inherent core stability of the scaffold, which potentially makes this scaffold unique.

(19) FIGS. 18A-18B. NMR spectra. (FIG. 18A) Overlay of 1H-15N HSQC fingerprint spectra for the ABP scaffold (light grey) and Yeast Sumo Adhiron 15 (dark grey). (FIG. 18B) Overlay of 1H-15N HSQC fingerprint spectrum for Yeast SUMO Adhiron 15 (dark grey) and 1H-15N TROSY HSQC spectrum for the Yeast SUMO protein and Yeast SUMO-Adhiron 15 complex (light grey).

(20) FIG. 19 shows a graph of the change of impedance versus concentration of SUMO binding Adhiron. This provides an example of the potential use of Adhirons in impedance based biosensor devices and shows that Adhirons bound to a surface have a larger dynamic range compared to antibodies.

(21) FIGS. 20 to 24 show sequence alignments of the LOOP1 and LOOP2 regions of Adhirons selected against a range of targets. This analysis allows identification of the range of binders against a given target and facilitates the development of potential consensus binding regions in some cases.

(22) FIG. 20 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 98-115) that bind to Lectin-like oxidized LDL receptor-1 (LOX1).

(23) FIG. 21 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 116-151) that bind to Human Growth Hormone (HGH).

(24) FIG. 22 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 152-176) that bind to yeast small ubiquitin-like modifier (SUMO).

(25) FIG. 23 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 177-191) that bind to penicillin binding protein 2a (PBP2a).

(26) FIG. 24 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 192-194) that bind to a peptide target.

(27) FIG. 25 shows phage ELISA results for Adhirons identified in screens against an organic compound, posaconazole. Graphs represent the absorbance readings of each well after the addition of TMB. The wells containing the target are shown as hatched bars whilst the control wells are shown as white bars.

(28) FIG. 26 shows a graph of impedance-based detection of various concentrations of fibrinogen, from micromolar to attomolar, by a fibrinogen-binding Adhiron. This provides an example of the potential use of Adhirons in impedance-based biosensor devices and shows that Adhirons bound to a surface can detect low concentrations of target protein within a 15 minute incubation and display a large linear dynamic range. The two data points at 3×10.sup.-12 micromolar were measured immediately after adding the analyte (lower data point) and after a 15 minute incubation (upper data point).

(29) FIG. 27 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 195-218) that bind to Growth factor receptor-bound protein 2 (Grb2) Src homology 2 domain.

(30) FIG. 28 shows a sequence alignment of LOOP1 and LOOP2 regions of several Adhirons (SEQ ID NOS 219-241) that bind to Signal Transducer and Activator of Transcription 3 (STAT3) Src homology 2 domain.

(31) It has been stated that "To prompt wider interest in a particular protein scaffold, it is necessary to demonstrate that specificities for different kinds of relevant ligands can be generated, that the derived binding proteins are practically useful, and that they offer at least some benefits over conventional antibody fragments. These criteria are not met by many of the protein scaffolds proposed so far and for most of them merely initial engineering efforts have been described." (Skerra 2007). The present invention is therefore of clear value in providing a useful, versatile and attractive protein scaffold, as demonstrated below.

(32) The high level of success in identifying important bioactive binding proteins with the novel Adhiron scaffold and derived library is due in large part to the very high stability of the core scaffold, which provides a highly rigid framework upon which variable loop regions can be displayed. These loop regions have sufficient flexibility to adopt a range of conformations and can thus interact with a wide range of conformational features on target molecules, allowing selection of binding reagents against a wide range of molecular targets, including, notably, small organic molecules. These structural aspects of this unique scaffold lead to functional outcomes that have not been achieved with other scaffold proteins.

(33) There may also be benefits in the use of a plant-based protein for many applications, particularly those involving humans, since the protein is not derived from a human protein and therefore will not be involved in natural interactions with human proteins. The only potential interactions would be with cysteine proteases, but the active binding regions that could interact with such proteins are typically replaced or removed in the Adhiron scaffolds. Moreover, humans come into contact with and tolerate plant-derived proteins constantly through, for example, food, cosmetic products and medicinal compositions.

(34) Creation of an Exemplary Scaffold Protein (Termed PHYTC57)

(35) Introduction

(36) A consensus approach to protein design starts with a multiple sequence alignment of members of a protein family to derive a single consensus sequence in which each position is normally occupied by the residue that occurs most frequently. Residues that define the structure, folding pathway, and stability of the folded protein will tend to be conserved, while those required for a common biological function such as catalytic residues in an enzyme, or residues that interact with a conserved target protein, are also likely to be conserved. A natural protein will not usually contain all the conserved consensus residues because proteins only evolve to be sufficiently stable to perform their biological role in vivo, and in many cases this may include a degree of instability to facilitate turnover and regulation of biological processes (Steipe, Schiller et al. 1994).
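The column-wise majority rule described above can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure used to derive PHYTC57: the function name `consensus`, the gap character and the toy input are assumptions made for the example, and ties are resolved arbitrarily in favour of the residue seen first.

```python
from collections import Counter


def consensus(aligned, gap="-"):
    """Return the most frequent residue at each column of an alignment.

    `aligned` is a list of equal-length strings taken from a multiple
    sequence alignment. Columns whose most frequent symbol is the gap
    character are dropped (an assumption made for this sketch).
    """
    if not aligned:
        raise ValueError("need at least one aligned sequence")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")

    residues = []
    for column in zip(*aligned):
        # most_common(1) returns the symbol with the highest count;
        # equal counts keep first-seen order.
        symbol, _count = Counter(column).most_common(1)[0]
        if symbol != gap:
            residues.append(symbol)
    return "".join(residues)


# Toy alignment (hypothetical sequences, not real phytocystatins).
toy_alignment = ["QVVAG", "QVVAG", "QLVAG"]
print(consensus(toy_alignment))
```

A real alignment would also need a policy for ambiguous columns and for terminal regions, which the text handles by manual adjustment and truncation of the variable ends.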

(37) We were interested to explore whether a consensus approach could be applied to enhance the inhibitory properties and stability of phytocystatins. Phytocystatins are small (approximately 100 aa) protein inhibitors of cysteine proteases (Kondo et al. 1991). A detailed phylogenomic analysis of the cystatin superfamily reveals the relationships between the different classes of cystatins (Kordis and Turk 2009). Phytocystatins comprise three regions that are involved in binding to cysteine proteases: the N-terminal region, a QVVAG (SEQ ID NO 83) loop and a PW loop (Margis et al. 1998). Previously, based on a structural model and sequence alignments, we generated a variant of rice cystatin, oryzacystatin I (OSA-I or OC-I), that lacks residue Asp 86, close to the conserved PW loop, and named it OSA-ID86. This variant displayed a 13-fold improvement in K.sub.i against both papain and the Caenorhabditis elegans gut-specific cysteine protease GCP-1 (Urwin et al. 1995; McPherson et al. 1997; Urwin et al. 1997; Urwin et al. 1998; Urwin et al. 2000; Urwin et al. 2001; Urwin et al. 2003; Lilley et al. 2004). We then demonstrated good levels of transgenic resistance against plant nematodes, conferred by OSA-ID86 and other cystatins when expressed in the specialised feeding cells that develop within a nematode-parasitized plant (Urwin et al. 1995; McPherson et al. 1997; Urwin et al. 1997; Urwin et al. 1998; Urwin et al. 2000; Urwin et al. 2001; Urwin et al. 2003; Lilley et al. 2004).

(38) Here we describe the design, construction and characterisation of a consensus phytocystatin based on multiple sequence alignment of the amino acid sequences of 57 phytocystatins. We show that this consensus phytocystatin displays efficient inhibition of the cysteine protease papain and enhanced stability in thermostability and digestibility assays.

(39) Materials and Methods

(40) Design of the Consensus Phytocystatin Hereinbelow Referred to as PHYTC57

(41) Following a tBLAST search of the GenBank database, a multiple alignment of phytocystatin sequences was performed using CLUSTALW (clustalw.genome.ad.jp; BLOSUM62; Gap opening penalty=12; Gap extension penalty=2) with further manual alignment using the program CINEMA (bioinf.manchester.ac.uk/dbbrowser/CINEMA2.1) (Parry-Smith et al. 1998; Lord et al. 2002). The most commonly used amino acid at each position was determined and the variable N- and C-terminal ends were truncated to give a consensus protein of 95 amino acids in length. A synthetic coding region was designed to encode the consensus phytocystatin (phytc57) with codon usage optimised for expression in E. coli by using the appropriate codon frequency table (kazusa.or.jp/codon). The coding region was constructed from six oligonucleotides (P1-P6; see FIG. 1A), each of approximately 70 nt in length, that included regions of at least 10 nt overlap between adjacent oligonucleotides. The flanking oligonucleotides P1 and P6 also contained an SfiI site and a NotI site, respectively, for cloning into the vector pDHisII, where the coding sequence was fused to the vector-derived pelB signal sequence coding region. Retention of a signal peptidase cleavage site was checked by submitting the N-terminal sequence of the PELB/PHYTC57 protein to the automatic signal sequence prediction programme SignalP (Bendtsen et al. 2004; cbs.dtu.dk/services/SignalP). Pairs of oligonucleotides (P1+P2), (P3+P4) and (P5+P6) were annealed and converted to double stranded fragments by self-priming. The reactions (10 μl) contained 25 pmole of each primer, 0.2 mM of each dNTP, 1× Pwo buffer (10 mM Tris-HCl pH 8.85, 25 mM KCl, 5 mM (NH.sub.4).sub.2SO.sub.4, 2 mM MgSO.sub.4) and 5 units Pwo DNA polymerase (Boehringer) and were subjected to one cycle of 95° C. for 2 min, 25° C. for 4 min and then 72° C. for 5 min.
The subsequent joining of the products (P1/P2) to (P3/P4) and then (P1/P2/P3/P4) to (P5/P6) was performed in an identical manner, to generate full-length product. A 5 μL aliquot of the final reaction was used as template in a 50 μL PCR together with 50 pmole each of the flanking primers P1 and P6, 0.2 mM each dNTP, 1× Pwo buffer and 5 units Pwo. The reaction was subjected to 95° C. for 1 min, then 30 cycles of 95° C. for 30 sec, 55° C. for 30 sec and 72° C. for 30 sec. The product was digested with SfiI and NotI, recovered from an agarose gel and cloned into SfiI and NotI restricted pDHisII. The newly constructed gene region was sequenced to confirm correct primer assembly and the expected DNA sequence. The sequence of the consensus coding region is shown in FIG. 1A.
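The design constraint stated above, that adjacent oligonucleotides share an overlap of at least 10 nt so that each pair can anneal and self-prime, can be checked computationally before synthesis. The following Python sketch is illustrative only: the helper names and the toy sequences are hypothetical (they are not P1 to P6), and it assumes that paired oligonucleotides are written 5′ to 3′ on opposite strands.

```python
# Hypothetical helper for checking 3' overlaps between paired oligonucleotides.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overlap_length(top, bottom):
    """Length of the 3' end of `top` that base-pairs with the 3' end of `bottom`.

    Both oligonucleotides are given 5'->3'. `bottom` lies on the opposite
    strand, so its reverse complement is compared with the end of `top`.
    """
    top = top.upper()
    bottom_rc = reverse_complement(bottom)
    best = 0
    for k in range(1, min(len(top), len(bottom_rc)) + 1):
        if top[-k:] == bottom_rc[:k]:
            best = k
    return best


# Toy example: the last 9 nt of `top` pair with the 3' end of `bottom`.
top = "ATGGCCAACTCGCTG"
bottom = reverse_complement("AACTCGCTGGAAATC")
print(overlap_length(top, bottom), overlap_length(top, bottom) >= 10)
```

In this toy case the overlap is 9 nt, so the pair would fail a 10 nt minimum and one of the oligonucleotides would need to be extended.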

(42) Synthesis of Other Phytocystatin Genes

(43) Synthetic DNA sequences optimised for E. coli expression were similarly designed, constructed and cloned for phytocystatins from rice (Oryza sativa; osa-ID86, a modified form of oc-I, GenBank accession number U54702, lacking the codon for Asp 86 (Urwin et al. 1995)), satsuma orange (Citrus unshiu; cun, GenBank accession number C95263) and papaya (Carica papaya; cpa, GenBank accession number X71124 (Song et al. 1995)). The chicken egg white cystatin gene (cewc) sequence was amplified from a pQE30-derived recombinant plasmid using the gene specific primers:

(44) cewcF (SEQ ID NO 10): 5′-ATTAGCGGCCCAGCCGGCCATGGCCAGCGAGGACCGCTCCCGGC-3′
cewcR (SEQ ID NO 11): 5′-CGCTGTACTTGCGGCCGCCCTGGCACTTGCTTTCCAGC-3′
that introduced an SfiI site and NotI site (underscored), respectively.

(45) PCR reactions were carried out with 50 pmole of each primer, a final concentration of 0.2 mM of each dNTP (Promega) and approximately 10 ng template in a volume of 50 μl containing 1× SuperTaq buffer (10 mM Tris-HCl pH 9.0, 1.5 mM MgCl.sub.2, 50 mM KCl, 0.1% (v/v) Triton X-100). To ensure high fidelity amplification during the PCR, a 10:1 mix of SuperTaq (HT Biotechnology):Pfu (Boehringer Mannheim) DNA polymerases was used, with 1 unit of polymerase mix per reaction. The reactions were subjected to 1 cycle of 95° C. for 1 min, then 20 cycles of 95° C. for 30 sec, 55° C. for 30 sec and 72° C. for 30 sec. A final step of 72° C. for 30 sec was used to ensure that all products were full length.

(46) pDHisII Construction

(47) Phytocystatin coding regions were initially cloned into a modified version of the phagemid vector pHEN1 (Hoogenboom et al. 1991) as gene III fusions. The pHEN1 vector was modified by addition of a hexa-histidine region encoded by complementary oligonucleotides that were phosphorylated and annealed to create a linker with appropriate single strand ends for ligation with the NotI cleaved vector. The oligonucleotide sequences were:

(48) HisF (SEQ ID NO 12): 5′-GGCCGCAGAGGATCGCATCACCATCACCATCACGG-3′
HisR (SEQ ID NO 13): 5′-GGCCCCGTGATGGTGATGGTGATGCGATCCTCTGC-3′
and the NotI complementary ends are underscored.

(49) The pHEN1 vector was digested with NotI and dephosphorylated using 5 units of shrimp alkaline phosphatase (NEB) in a 50 μl reaction containing 100 mM NaCl, 50 mM Tris-HCl (pH 7.9), 10 mM MgCl.sub.2 and 1 mM DTT for 30 minutes at 37° C. before heating to 65° C. for 15 min. The distal NotI site was destroyed by PCR mutagenesis using the primers

(50) XhoF (SEQ ID NO 14): 5′-ATCACGCTCGAGCAGAACAAAAACTCATCTCAG-3′
XhoR (SEQ ID NO 15): 5′-TGTTCTGCTCGAGCGTGATGGTGATGGTGATGGCG-3′
BamHIR (SEQ ID NO 16): 5′-TGGCCTTGATATTCACAAACG-3′
M13R (SEQ ID NO 17): 5′-AGCGGATAACAATTTCACACAGGA-3′

(51) The XhoF primer was used together with BamHIR primer and the XhoR together with the M13R primer to generate two fragments that were annealed and amplified in a second PCR that included primers M13R and BamHIR. The introduced XhoI restriction site is underscored. The resulting product was cloned as an SfiI/BamHI fragment into pHEN1 vector similarly digested. The presence of the XhoI site was screened for by restriction analysis of isolated phagemid DNA and the insert in a positive clone, pDHisII (FIG. 1B), was confirmed by DNA sequence analysis.

(52) Expression of Cystatins in pDHisII

(53) Initially, expression of cystatins was performed using constructs in the vector pDHisII, which adds C-terminal RGS(H).sub.6 and myc tags. Expression studies in the E. coli host strain HB2151 allow suppression of the amber codon within the cystatin-gene III fusion, resulting in accumulation of cystatin in the periplasm. Cultures (1 L) were grown in 2×TY medium with 0.1% (v/v) glucose and 100 μg/mL ampicillin, then induced by IPTG to 1 mM and grown at 30° C. for 16 hours. Cell pellets were resuspended in 20 mL 50 mM Tris pH 8, 20% (w/v) sucrose at 4° C. and periplasmic preparations performed. To the periplasmic fraction, 1 mL Ni-NTA resin (GE Healthcare) was added and incubated with mixing for 16 hours at 4° C. The resin was washed 8 times with 1 mL aliquots of wash buffer (50 mM NaHPO.sub.4, 500 mM NaCl, pH 6), then 3 times with wash buffer containing 40 mM imidazole. Cystatin was eluted with 200 μL wash buffer containing 250 mM imidazole at 4° C. for 1 hour, before dialysis against PBS. Protein was aliquoted and frozen in liquid nitrogen. Samples were analysed by SDS-PAGE and electrospray mass spectrometry.

(54) Expression of Cystatins in pET101

(55) For more efficient expression of soluble protein, the phytocystatin genes were sub-cloned together with the C-terminal 6-His tag from pDHisII into pET101 by directional TOPO cloning (Invitrogen). The primers

(56) TABLE-US-00009
cystaSDFWD (SEQ ID NO 18) 5'-CACCATGAAATCACTATTGCTTACG-3'
cystaSDREV (SEQ ID NO 19) 5'-CTACTAGTGATGGTGATGGTGATGCG-3'
were used to PCR amplify the phytocystatin genes from the original cloning vector pDHisII using the conditions described above. Positive clones were confirmed by DNA sequence analysis and were introduced into BL21(DE3) Star cells (Invitrogen). Cultures were grown at 30 °C and expression of phytocystatins was induced by addition of IPTG (to 1 mM) for 16 h. The cells were harvested by centrifugation (4,000×g, 10 min) and sonicated on ice (3×1 min), and the cell debris was pelleted by centrifugation (10,000×g, 10 min). The supernatant was loaded onto a metal chelating column (Pharmacia) charged with 0.1 mM NiCl.sub.2. After extensive washing the phytocystatins were eluted with 100 mM imidazole, dialysed extensively into HBS buffer (10 mM HEPES pH 7.4, 150 mM NaCl, 0.005% P20) and stored at -70 °C.
Papain Assay

(57) The inhibition of papain (Sigma) was assayed using pGlu-Phe-Leu-p-NA (Sigma), a synthetic substrate for cysteine proteinases (Filippova et al. 1984). 10 ng of papain, in 50 µl of incubation buffer (0.15 M MES/OH pH 5.8, 4 mM NaEDTA, 4 mM DTT), was added to 50 µl of sample, either with no cystatin or containing a known concentration of cystatin, and pre-incubated for 30 minutes. 50 µl of 1 mM pGlu-Phe-Leu-p-NA (in incubation buffer and 30% DMSO) was added and papain activity monitored by a linear increase in OD at 415 nm, over 15 min at 25 °C.

(58) Surface Plasmon Resonance (SPR) Experiments

(59) Papain was obtained from Sigma as a lyophilised powder. For the SPR experiments a stock solution of 1 mg/ml was prepared by dissolving in HBS buffer containing 1 mM DTT. This stock was diluted to the required concentrations with the same buffer. All SPR experiments were performed on a BIACORE 3000 instrument using an NTA sensorchip. The running buffer was HBS-EP (10 mM HEPES, 0.15 M NaCl, 0.005% P20, pH 7.4) and all experiments were carried out at 25 °C. Phytocystatins were immobilised (approximately 400 response units (RU)) on flowcells 2 to 4, with flowcell 1 left blank. To monitor binding of papain to the phytocystatins, 240 µl of HBS buffer was injected over all flow cells at a flow rate of 80 µl/min (association phase) followed by normal buffer flow for 3 min (dissociation phase) to provide a subtraction blank. This injection was repeated with papain. Proteins were stripped from the sensorchip using EDTA and the surface re-charged with nickel before repeating the experiment with a different concentration of papain. At least 5 cycles were performed for each concentration to allow kinetic analysis. The data from these binding experiments were analysed using the Biaevaluation3 software package (BIACORE) and by global fitting of the data to the Langmuir 1:1 binding model.
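The Langmuir 1:1 model referred to above has a simple closed form, and the following minimal Python sketch shows the shape of sensorgram it predicts. Only the injection timing (240 µl at 80 µl/min, followed by 3 min of buffer flow) is taken from the text. The rate constants, analyte concentration and maximum response are hypothetical illustration values, and the function name is an assumption. It is not the Biaevaluation fitting routine.

```python
import math

def langmuir_response(t, conc, k_on, k_off, rmax, t_inject):
    """Response (RU) at time t (s) for a 1:1 interaction.

    Association (t <= t_inject): R = Req * (1 - exp(-(k_on*C + k_off) * t))
    Dissociation (t > t_inject): R = R(t_inject) * exp(-k_off * (t - t_inject))
    """
    k_obs = k_on * conc + k_off
    r_eq = rmax * k_on * conc / k_obs
    if t <= t_inject:
        return r_eq * (1.0 - math.exp(-k_obs * t))
    r_end = r_eq * (1.0 - math.exp(-k_obs * t_inject))
    return r_end * math.exp(-k_off * (t - t_inject))

# Injection time from the text: 240 µl at 80 µl/min = 3 min = 180 s.
T_INJECT = 240 / 80 * 60

# Hypothetical parameters, chosen only to illustrate the curve shape.
K_ON = 1e5     # 1/(M*s)
K_OFF = 1e-3   # 1/s
RMAX = 100.0   # RU
CONC = 1e-8    # M (10 nM analyte)

for t in range(0, 361, 60):
    print(t, round(langmuir_response(t, CONC, K_ON, K_OFF, RMAX, T_INJECT), 2))
```

Global fitting, as described in the text, would adjust a single pair of rate constants so that curves of this form match the sensorgrams at all analyte concentrations at once.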

(60) Thermal Stability of Phytocystatins

(61) Aliquots of PHYTC57 and OSA-ID86 were prepared at 0.5 µg protein/mL in 50 mM phosphate buffer pH 7.4. The samples were immediately placed in a boiling water bath. Samples were removed at various times and plunged into liquid nitrogen, before storage at -70 °C. Residual inhibitory activity, which indicates the remaining level of functional cystatin, was determined by the papain inhibition assay.

(62) Digestibility Assay

(63) Simulated gastric fluid was made up as described (Astwood et al. 1996) immediately before use and contained 0.32% (w/v) pepsin (Sigma) in 0.03 M NaCl adjusted to pH 1.2 with concentrated HCl. Four replicates of PHYTC57 and of OSA-ID86, at 0.2 mg/ml final concentration, were incubated in SGF at 37 °C. At intervals 200 µl aliquots were removed and immediately mixed with 75 µl of 0.2 M Na.sub.2CO.sub.3 to terminate digestion. Each sample was divided into three sub-samples and, as previously described (Atkinson et al. 2004), these were separated on 15% SDS-PAGE gels. Two gels were stained for total protein with Coomassie blue, one to show the effect of simulated gastric fluid on PHYTC57 and the second to compare digestibility of PHYTC57 with OSA-ID86. The third gel was subjected to western blot analysis using a mouse antibody to detect the 6×His-tag (Qiagen) of PHYTC57, with visualisation using alkaline phosphatase activity. A final sample from each time point was analysed for residual inhibitory activity against papain by the enzyme assay as outlined above.

(64) Results

(65) Synthesis of Phytocystatin Genes

(66) For the design of the consensus phytocystatin coding region a tBLASTN search of the GenBank database was undertaken using OSA-I (Oryza sativa; U54702), ZMA2 (Zea mays; D38130) and HAN1 (Helianthus annuus; Q10993) protein sequences as search probes. The list of sequences used to derive the consensus sequence is shown in Table 1. Sequences were identified from databases by homology searching. The table shows a systematic name for each cystatin together with the organism name and common name of the plant and the GenBank accession number. Coding sequences were translated and aligned using the program CLUSTALW and the alignment was displayed using the program CINEMA (Parry-Smith et al. 1998; Lord et al. 2002) to allow improvement by manual alignment. A consensus sequence was then derived by identifying the most common amino acid at each position (FIG. 1A). The length of the consensus protein was set at 95 amino acids with the N-terminus positioned four residues before the conserved N-terminal glycine residue, and thus before the first β-strand (β1). The C-terminus was set 15 residues after the conserved PW motif and thus after the last β-strand (β5). These criteria were based on the X-ray structures of CEWC (Bode et al. 1988) and human stefin B (Stubbs et al. 1990) and the NMR structure of OSA-I (Nagata et al. 2000).
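The consensus rule described above (the most common amino acid at each aligned position) can be illustrated with a minimal Python sketch. The function name, the gap handling and the toy alignment are assumptions made for illustration. The sequences are not taken from Table 1, and the manual alignment refinement described in the text is not modelled.

```python
from collections import Counter

def consensus(aligned, gap="-"):
    """Return the most common residue in each column of an alignment."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        counts = Counter(r for r in column if r != gap)
        # A column made only of gaps contributes a gap to the consensus.
        out.append(counts.most_common(1)[0][0] if counts else gap)
    return "".join(out)

# Toy alignment (hypothetical sequences, for illustration only).
toy = ["QVVAG", "QVVSG", "QLVAG", "Q-VAG"]
print(consensus(toy))
```

In this sketch ties are broken by order of first appearance in the column. A real design would need an explicit tie-breaking rule, which the text does not specify.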

(67) TABLE-US-00010 TABLE 1 Phytocystatin sequences used to derive the consensus (PHYTC57) sequence.
Phytocystatin  Organism name                  Common name      Accession number
AAR            Ambrosia artemisiifolia        Short ragweed    L16624
ACE            Allium cepa                    Onion            AA508918
ATH1           Arabidopsis thaliana           Arabidopsis      Z17618
ATH2           Arabidopsis thaliana           Arabidopsis      Z97341
ATH3           Arabidopsis thaliana           Arabidopsis      Z17675
ATH4           Arabidopsis thaliana           Arabidopsis      ATAJ110
ATH6           Arabidopsis thaliana           Arabidopsis      AC002409
ATH8           Arabidopsis thaliana           Arabidopsis      Z37263
AVU            Artemisia vulgaris             Mugwort          AF143677
BCA1           Brassica campestris            Chinese cabbage  L41355
BCA2           Brassica campestris            Chinese cabbage  L48182
BCA3           Brassica campestris            Chinese cabbage  U51119
CPA            Carica papaya                  Papaya           X71124
CSA            Cucumis sativus                Cucumber         AB014760
CSAT           Castanea sativa                Chestnut         AJ224331
CUN            Citrus unshiu                  Satsuma orange   C95263
DCA            Daucus carota                  Carrot           D85623
DCAR           Dianthus caryophyllus          Carnation        AF064734
GHI2           Gossypium hirsutum             Cotton           AI728662
GHI3           Gossypium hirsutum             Cotton           AI726250
GMA1           Glycine max                    Soybean          D64115
GMA2           Glycine max                    Soybean          U51583
GMA3           Glycine max                    Soybean          U51855
GMA4           Glycine max                    Soybean          U51854
GMA5           Glycine max                    Soybean          AI495568
GMA7           Glycine max                    Soybean          AI938438
HAN1           Helianthus annuus              Sunflower        Q10993
IBA            Ipomoea batatas                Sweet potato     AF117334
MCR1           Mesembryanthemum crystallinum  Ice plant        AA856241
MCR2           Mesembryanthemum crystallinum  Ice plant        AA887617
MDO            Malus domestica                Apple tree       AT000283
OSA1           Oryza sativa                   Rice             U54702
OSA2           Oryza sativa                   Rice             X57658
OSA5           Oryza sativa                   Rice             C25431
PAM            Persea americana               Avocado          JH0269
PBA            Populus balsamifera            Poplar           AI167046
PCO            Pyrus communis                 Pear             U82220
PTA            Pinus taeda                    Pine             AI812403
PTR            Populus tremula                Poplar           AI162398
RCO1           Ricinus communis               Castor bean      Z49697
RCO2           Ricinus communis               Castor bean      T23262
SBI            Sorghum bicolor                Sorghum          X87168
SLA            Silene latifolia               White campion    Z93053
SLY1           Lycopersicon esculentum        Tomato           AF083253
SLY2           Lycopersicon esculentum        Tomato           X73986
SLY5           Lycopersicon esculentum        Tomato           AI1781497
STU1           Solanum tuberosum              Potato           L16450
STU2           Solanum tuberosum              Potato           L16450
STU3           Solanum tuberosum              Potato           L16450
STU4           Solanum tuberosum              Potato           L16450
STU5           Solanum tuberosum              Potato           L16450
STU6           Solanum tuberosum              Potato           L16450
STU7           Solanum tuberosum              Potato           L16450
STU8           Solanum tuberosum              Potato           L16450
STU10          Solanum tuberosum              Potato           X74985
VUN            Vigna unguiculata              Cowpea           Z21954
ZMA1           Zea mays                       Maize            D10622
ZMA2           Zea mays                       Maize            D38130
ZMA4           Zea mays                       Maize            AI001246
ZMA5           Zea mays                       Maize            AI740162

(68) The synthetic gene encoding PHYTC57 was generated from six oligonucleotides (P1-P6), each approximately 70 nt, designed to include overlaps of at least 10 nt between adjacent oligonucleotides as described in Materials and Methods. For comparative study of naturally occurring phytocystatins we adopted a similar synthetic gene approach using oligonucleotides for the osa-ID86, cun (Citrus unshiu; Satsuma) and cpa (Carica papaya; Papaya) phytocystatin coding regions, designed from the unique sequence of the appropriate protein with codon changes to reflect E. coli codon usage. The chicken egg white cystatin (cewc) coding region was PCR amplified from a pQE-derived plasmid. The genes were initially cloned into the phage display vector pDHisII by exploiting the SfiI and NotI sites (FIG. 1B).

(69) Cystatin Expression

(70) Cystatins were initially expressed from pDHisII constructs and purified by Ni-NTA affinity chromatography. Analysis of the purified cystatins by electrospray ionisation mass spectrometry indicated that, unexpectedly, in each case there was a C-terminal truncation of the expressed protein. Table 2 shows the expected masses of the full-length cystatins and those with a 16 amino acid C-terminal truncation, calculated using the Protein Calculator program (scripps.edu/cdputnam/protcalc), together with the determined molecular masses. The sequence of the C-terminal end of the proteins is shown with the major truncation site indicated by an arrow. With the exception of CUN, the determined molecular mass is in excellent agreement with forms of the fusion proteins that had lost the C-terminal 16 amino acids, but which retain the His tag (Table 2). In the case of CUN there is truncation of fewer than 16 residues, but this was not characterised further. To ensure expression of defined protein sequences, the cystatin coding regions plus His-tag were therefore sub-cloned into pET101 by PCR and TOPO-facilitated cloning and expressed in BL21 (DE3) Star cells (Stratagene). Protein expression was induced by addition of IPTG to 1 mM for 16 hours and the cystatins were purified under native conditions by Ni-NTA affinity chromatography by virtue of their C-terminal hexahistidine tag. The cystatins were separated by SDS-PAGE to examine their purity, which was estimated to be >98%. Their masses, determined by ESI-MS, were in excellent agreement with the expected values (data not shown), and so these proteins were used for further analyses.

(71) TABLE-US-00011 TABLE 2 Electrospray mass spectrometry results for cystatins expressed from HB2151.
Phytocystatin  Predicted mass, full length (Da)  Predicted mass, truncated (Da)  Experimental mass (Da)
OSA-ID86       14329                             12603                           12603.3 ± 0.8
CPA            14321                             12596                           12593.5 ± 6.8
PHYTC57        13855                             12129                           12128.3 ± 0.8
CEWC           16399                             14608                           14607.4 ± 1.5
CUN            14222                             12496                           12728.0 ± 1.1
.............RGSHHHHHHARAEQKLISEEDLNGAA (SEQ ID NO 20)
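The mass lost on truncation can be cross-checked against Table 2 with a short Python sketch. It assumes that the truncated segment is the last 16 residues of the C-terminal sequence shown (SEQ ID NO 20) and uses standard average residue masses. The function and variable names are illustrative only. Removing a C-terminal segment lowers the mass by the sum of its residue masses, which here comes to about 1,726 Da, the difference between the predicted full-length and truncated masses for four of the five proteins in Table 2.

```python
# Standard average residue masses (Da) for the amino acids in the tail.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "H": 137.1411, "R": 156.1875,
}

def residue_mass_sum(seq):
    """Sum of average residue masses (no terminal water added)."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq)

C_TERMINUS = "RGSHHHHHHARAEQKLISEEDLNGAA"   # SEQ ID NO 20
tail = C_TERMINUS[-16:]                     # assumed truncated segment
lost = residue_mass_sum(tail)

# Predicted (full length, truncated) masses in Da, from Table 2.
table2 = {
    "OSA-ID86": (14329, 12603),
    "CPA": (14321, 12596),
    "PHYTC57": (13855, 12129),
    "CEWC": (16399, 14608),
    "CUN": (14222, 12496),
}

print("tail:", tail, "mass lost: %.1f Da" % lost)
for name, (full, truncated) in table2.items():
    print(name, "predicted difference:", full - truncated, "Da")
```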
Cystatin Binding Activity

(72) Enzyme assays with papain using the artificial substrate pGlu-Phe-Leu-p-nitroanilide confirmed that PHYTC57 was an active cysteine protease inhibitor. Ki values were not determined, but from the IC50 values for PHYTC57 and OSA-ID86 (4.6×10.sup.-8 M and 1.8×10.sup.-7 M respectively) it is clear that PHYTC57 is a more potent inhibitor than OSA-ID86. To directly measure the interaction between the cystatins and protease we measured the binding kinetics using BIAcore surface plasmon resonance analysis. The cystatins were immobilised onto nickel coated sensor chips by the C-terminal His-tags. Papain was then allowed to bind to the immobilised cystatin and measurements were made at several papain concentrations. Sensorgrams for the binding of OSA-ID86, CPA, CUN, CEWC and PHYTC57 to papain are shown in FIG. 3. The data at each concentration were fitted to the Langmuir 1:1 binding model and the kinetic constants were determined. The data showed a good fit to the model, consistent with the known 1:1 stoichiometry of cystatin inhibition of cysteine proteases. A summary of the kinetic constants for these cystatins is shown in Table 3. PHYTC57 displays higher association and lower dissociation kinetics compared with the naturally occurring cystatins tested, with an equilibrium constant K.sub.D of 6.3×10.sup.-12 M, indicating a tight binding complex with papain. This value is two orders of magnitude lower than the K.sub.D value measured for chicken egg white cystatin (3.9×10.sup.-10 M), three orders of magnitude lower than the improved phytocystatin OSA-ID86 (4.7×10.sup.-9 M) and four orders of magnitude lower than the phytocystatins CUN (1.4×10.sup.-8 M) and CPA (2×10.sup.-8 M).

(73) TABLE-US-00012 TABLE 3 Kinetic parameters determined by surface plasmon resonance. Association and dissociation rate constants (K.sub.a and K.sub.d) and equilibrium constants (K.sub.A and K.sub.D) are shown.
           K.sub.a (1/M·s)  K.sub.d (1/s)    Chi.sup.2  K.sub.A (1/M)    K.sub.D (M)
OSA-ID86   2.6×10.sup.5     1.2×10.sup.-3    0.98       2.1×10.sup.8     4.7×10.sup.-9
CUN        2.6×10.sup.5     3.5×10.sup.-3    2.3        7.3×10.sup.7     1.4×10.sup.-8
CPA        2.6×10.sup.5     5.2×10.sup.-3    3.2        5.0×10.sup.7     2.0×10.sup.-8
CEWC       1.3×10.sup.6     4.9×10.sup.-4    5.3        2.6×10.sup.9     3.9×10.sup.-10
PHYTC57    3.5×10.sup.5     2.2×10.sup.-6    13.9       1.6×10.sup.11    6.3×10.sup.-12
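The equilibrium constants in Table 3 follow from the rate constants through the standard 1:1 relations K.sub.A = K.sub.a/K.sub.d and K.sub.D = K.sub.d/K.sub.a. The short Python sketch below recomputes them as a consistency check. It assumes that the exponents of the dissociation rate constants are negative (the signs are not legible in the flattened table), and the function and variable names are illustrative.

```python
# (association rate in 1/(M*s), dissociation rate in 1/s) from Table 3.
# Negative exponents for the dissociation rates are an assumption.
rates = {
    "OSA-ID86": (2.6e5, 1.2e-3),
    "CUN": (2.6e5, 3.5e-3),
    "CPA": (2.6e5, 5.2e-3),
    "CEWC": (1.3e6, 4.9e-4),
    "PHYTC57": (3.5e5, 2.2e-6),
}

def equilibrium_constants(k_assoc, k_dissoc):
    """Return (K_A in 1/M, K_D in M) for a 1:1 interaction."""
    return k_assoc / k_dissoc, k_dissoc / k_assoc

for name, (k_assoc, k_dissoc) in rates.items():
    k_a_eq, k_d_eq = equilibrium_constants(k_assoc, k_dissoc)
    print(f"{name:9s} K_A = {k_a_eq:.1e} 1/M   K_D = {k_d_eq:.1e} M")
```

Under that assumption the recomputed values agree with the tabulated K.sub.A and K.sub.D columns to within rounding.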
Phytocystatin Stability

(74) We were interested to explore whether PHYTC57 displayed greater stability when compared with a well-characterised parental phytocystatin, OSA-ID86. Samples of these phytocystatins were incubated for various times in a boiling water bath, chilled and then tested for residual inhibitory activity in the papain assay (FIG. 4A). The consensus protein PHYTC57 displays greater thermostability (half-life t.sub.1/2=17 min) than OSA-ID86 (t.sub.1/2=6 min), and inhibitory activity can still be detected at 80 min with PHYTC57 compared with only 58 min for OSA-ID86.

(75) We also tested the stability of OSA-ID86 and PHYTC57 using a simulated gastric fluid (SGF) digestibility assay. The proteins were incubated for various times in freshly prepared SGF before neutralising the reaction. Samples were then analysed in two ways: by enzyme assays to determine the residual inhibitory activity (FIG. 4B) and by SDS-PAGE (FIG. 4C) to determine whether the protein was digested. Enhanced stability of PHYTC57 was observed in the digestion studies, with PHYTC57 and OSA-ID86 displaying t.sub.1/2 values of 260 sec and 30 sec respectively in enzyme assays. These assay data indicate that some 99% of OSA-ID86 is destroyed in the simulated gastric fluid between 30 and 60 seconds after incubation. For PHYTC57 this level of inactivation does not occur until between 2 and 5 minutes, demonstrating that PHYTC57 is more resistant to the digestion conditions. The Coomassie stained SDS-PAGE results (FIG. 4C), which identify full-length protein, support these data, with only trace OSA-ID86 present at 30 sec whereas for PHYTC57 some protein is still present at 120 sec. To confirm these results for PHYTC57 we analysed the SGF digestion products by western blot analysis using an anti-6×His tag antibody. As shown in FIG. 4D this reveals that the majority of the protein has indeed been destroyed by 300 sec of SGF treatment, although trace amounts of intact PHYTC57 remain at 300 and even 420 sec of treatment. If PHYTC57 were to be used for transgenic plant expression, the fact that the majority of full-length PHYTC57 protein and all inhibitory activity has been lost following a 10 min incubation in SGF would mean that PHYTC57 should be readily destroyed during the digestive process following any inadvertent host digestion.

(76) Discussion

(77) The consensus approach to protein design provides a method to generate a sequence that does not exist in nature. Such a sequence should optimise the conserved functional sequence parameters of a family of homologous proteins. In particular, critical residues that are involved in the structure and folding of the family are likely to be highly conserved (Lehmann and Wyss 2001; Main et al. 2003a; Main et al. 2003b; Forrer et al. 2004; Steipe 2004; Main et al. 2005). Depending upon the biological roles of individual members of the family, functional residues may or may not be conserved. In the case of the phytocystatins, which show functional conservation in the form of cysteine protease binding and inhibition, there is a reinforcement of the highly conserved sequence motifs that are involved in interaction with the cysteine protease active site. The consensus approach should therefore give rise to an optimised protein in which the biological function, cysteine protease binding, as well as general stability are enhanced.

(78) Plants contain a class of cystatins whose characteristics are distinct from those of animal cystatins and stefins, and plants do not contain stefins. In particular, for the animal cystatins and stefins the N-terminal Gly sequence is important for inhibition of cysteine proteases, whilst this is not the case for plant cystatins [Abe, K. et al. (1988) J. Biol. Chem 263, 7655-7659]. In addition, the plant cystatins contain a typical sequence [LVI][AGT][RKE][FY][AS][VI]X[EDQV][HYFQ]N (SEQ ID NO 81) that is not found in other cystatins or in stefins.

(79) SPR analysis of immobilised cystatins with papain revealed that PHYTC57 is more effective at forming and maintaining a protein:protein interaction complex with papain than are the 3 parental phytocystatins tested here. In particular the dissociation rate constant is reduced indicating that once bound, the PHYTC57:papain complex is more stable than those formed by the other cystatins tested. It has previously been reported that the animal-derived cystatins, which contain disulphide bonds are more efficient cystatins than characterised phytocystatins whose efficacy ranges from around 10 nM for OSA-I (Urwin et al. 1995) to pM for soybean cystatins (Koiwa et al. 2001) against papain. In our studies we observe that PHYTC57 is more effective at binding papain than the animal cystatin chicken egg white cystatin. In addition PHYTC57 displays enhanced thermal stability as well as greater stability in a simulated digestibility assay.

(80) The fact that PHYTC57 displays enhanced properties does not mean that it could not be enhanced further. The reinforcement of the conserved motifs that comprise the binding region may represent the biologically optimal binding sequence, but perhaps not the most effective. For example, Koiwa demonstrated that alteration of the third loop region in soybean cystatins enhanced efficacy of the resulting variants (Koiwa et al. 2001). There has been a report that novel sequences within the QVVAG (SEQ ID NO 83) region confer enhanced inhibitory activity (Melo et al. 2003). In terms of potential use in transgenic plant systems it is unlikely that further enhancements in stability would be beneficial due to the need to ensure that transgenic products do not accumulate in the environment.

(81) We have focussed here on the relative binding efficiency of PHYTC57 compared with OSA-ID86. This rice cystatin variant was the first enhanced phytocystatin generated and it has been subjected to a wide range of studies leading to transgenic plant expression and plant nematode resistance trials (Urwin et al. 1995; McPherson et al. 1997; Urwin et al. 1997; Urwin et al. 1998; Urwin et al. 2000; Urwin et al. 2001; Urwin et al. 2003; Lilley et al. 2004). PHYTC57 displays 34 amino acid differences from OSA-ID86. A comparison of the positions of these amino acid differences between OSA-ID86 and PHYTC57 is shown in FIG. 5, in which the substitutions are colour coded and represented upon the structural model for OSA-I (pdb code 1EQK) (Nagata et al. 2000). The differences are dispersed throughout the structural scaffold and include both conservative and non-conservative changes. It is interesting that 6 of the changes involve the introduction of negatively charged amino acids. Detailed comparative biochemical, sequence analysis and mutagenesis studies of individual parental cystatin molecules with the consensus protein could be undertaken to define the most critical consensus substitutions leading to enhanced properties, thus enhancing our understanding of protein structure-function relationships.

(82) The surprisingly large enhancement in stability of PHYTC57 led us to consider the potential for this small consensus protein as a new scaffold. Protein scaffolds for the selection of new binding functions are proving to be useful in a wide range of applications as replacements for antibodies (Skerra 2007), including in medical applications (Wurch, Pierre et al. 2012). An example of the development of a cystatin as a scaffold for peptide aptamer selection based on human stefin A has been reported (Woodman, Yeh et al. 2005). There has been a recent report of the development of an Fn3-like consensus protein from 15 fibronectin or tenascin Fn3-like domain sequences, which is proposed as a potential scaffold (Jacobs, Diem et al. 2012). The naturally occurring Fn3 domain 10 is a well-studied scaffold developed by Koide and colleagues (Koide, Bailey et al. 1998; Karatan, Merguerian et al. 2004; Koide, Gilbreth et al. 2007). Due to its enhanced thermostability, small size and lack of cysteines we anticipate that the consensus PHYTC57 will prove useful as a binding protein scaffold for displaying variable peptide loop libraries for screening against a range of target molecules to identify novel artificial binding proteins.

(83) PHYTC57 therefore offers potential benefits for transgenic plant defence schemes as an improved cysteine protease inhibitor targeted at pathogens such as plant nematodes, and for development as a scaffold protein for selecting new binding functionalities.

(84) Creation of Scaffold Protein Library, Screening and Testing Function of Adhirons

(85) Introduction

(86) The present inventors designed a novel artificial binding (scaffold) protein based on the consensus sequence of 57 plant-derived phytocystatins described above (termed PHYTC57 above, but referred to below as Adhiron). This artificial protein meets all the requirements (small, monomeric, high solubility and high stability and the lack of disulphide bonds and glycosylation sites) to be a good scaffold for peptide presentation. We chose the VVAG (SEQ ID NO 82) and the PWE regions of the Adhiron scaffold for peptide presentation with nine randomized positions in each loop.

(87) Based on high yields of Adhiron scaffold expressed in E. coli we hypothesised that protein alterations within the two loops have a tolerable effect on the protein expression level and stability of the scaffold. Our scaffold therefore seems to be amenable for use in generating combinatorial libraries for screening with phage display technology (Smith 1985). The success of a phage display system relies on the quality of the initial DNA library, which is mainly determined by its diversity. Improved library diversity can be achieved by using trinucleotide (trimer)-synthesized oligos (Kayushin, Korosteleva et al. 1996), which provide theoretically equal levels of introduction of the different amino acids as well as avoidance of stop codons and cysteine (Virnekas, Ge et al. 1994; Krumpe, Schumacher et al. 2007). Furthermore, trimer insertions or deletions will not lead to a frameshift mutation, so potentially functional proteins are still produced. We have therefore chosen a trimer mixture encoding the 19 naturally occurring amino acids excluding cysteine for the loop-randomised oligos.
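The effect of trimer-based randomisation can be illustrated with a minimal Python sketch that builds a nine-codon loop from a set of 19 trinucleotides, one per amino acid and excluding cysteine. The particular codon chosen for each amino acid is an assumption made for illustration (the text does not list the trimers used), as are the function and variable names.

```python
import random

# One codon per amino acid, cysteine excluded. The codon choice is a
# hypothetical illustration, not the trimer mixture used for the library.
TRIMERS = {
    "GCG": "A", "GAT": "D", "GAA": "E", "TTT": "F", "GGC": "G",
    "CAT": "H", "ATT": "I", "AAA": "K", "CTG": "L", "ATG": "M",
    "AAC": "N", "CCG": "P", "CAG": "Q", "CGT": "R", "AGC": "S",
    "ACC": "T", "GTG": "V", "TGG": "W", "TAT": "Y",
}
STOP_CODONS = {"TAA", "TAG", "TGA"}

def random_loop(n_codons=9, rng=random):
    """Return (dna, peptide) for a loop built from whole trimers."""
    codons = [rng.choice(sorted(TRIMERS)) for _ in range(n_codons)]
    dna = "".join(codons)
    peptide = "".join(TRIMERS[c] for c in codons)
    return dna, peptide

rng = random.Random(0)  # fixed seed so the sketch is repeatable
dna, peptide = random_loop(rng=rng)
print(dna, peptide)
```

Because every randomised position is drawn from whole codons, each amino acid is equally likely at each position, and neither a stop codon nor cysteine can appear in the loop.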

(88) Our work demonstrates that Adhirons have a high potential to play a key role in generating research reagents, diagnostics as well as therapeutics (drug discovery).

(89) The Adhiron scaffold shows remarkably high thermal stability (Tm ca. 101 °C), above that reported for any other non-repeat scaffold protein, and can be expressed at high levels in prokaryotic expression systems to produce recombinant protein reproducibly. We have constructed a phage-display library based on the insertion of randomised amino acid sequences to replace residues at two loop regions within the Adhiron. The library has a complexity of approximately 3×10.sup.10, with greater than 86% full length clones after phage production, indicating the very high quality of the library. As a demonstration of the efficacy of the library, the yeast Small Ubiquitin-like Modifier protein (SUMO) was screened to identify artificial binding protein (Adhiron) reagents capable of binding to this target protein. More than 20 individual Adhirons were identified that bind to yeast SUMO (ySUMO) as assessed by a phage enzyme-linked immunosorbent assay (ELISA). DNA sequencing indicated that the majority show partial sequence homology within one of the two loop regions to the known SUMO interactive motif (Val/Ile-X-Val/Ile-Val/Ile; where X is any amino acid). Four Adhiron coding regions were sub-cloned into the vector pET11 and recombinant protein was expressed, purified and tested in ELISA and Western blot analyses. The four Adhirons had low nanomolar affinities for ySUMO and showed high specificity to ySUMO with low level binding to human SUMO1 protein. Furthermore, we screened the Adhiron library against a number of other targets, namely fibroblast growth factor (FGF1), platelet endothelial cell adhesion molecule (PECAM-1), also known as cluster of differentiation 31 (CD31), and a 10 amino acid peptide with a cysteine on the N-terminus for thiol linkage to biotin (Cys-Thr-His-Asp-Leu-Tyr-Met-Ile-Met-Arg-Glu, SEQ ID NO 84), and also identified Adhirons against these targets as confirmed by phage ELISA.
We have, therefore, developed a versatile, highly stable and well expressed scaffold protein, termed Adhiron, that is capable of displaying randomised peptide loops and we have demonstrated the ability to select highly specific, high affinity binding reagents from an Adhiron library against a range of targets for use in multiple applications.
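Selected loop sequences can be checked for the SUMO interactive motif quoted above (Val/Ile-X-Val/Ile-Val/Ile) with a short regular expression. The Python sketch below is an illustration only: the loop sequences are hypothetical and are not Adhiron sequences from this work, and the function name is an assumption.

```python
import re

# Val/Ile - any residue - Val/Ile - Val/Ile, in one-letter code.
SUMO_MOTIF = re.compile(r"[VI].[VI][VI]")

def has_sumo_motif(loop):
    """True if the loop contains the SUMO interactive motif."""
    return SUMO_MOTIF.search(loop) is not None

# Hypothetical nine-residue loops, for illustration only.
loops = ["AEIDVIWSP", "GHWKRPNQT", "TVEIIDDYA"]
for loop in loops:
    print(loop, has_sumo_motif(loop))
```

This tests only for an exact match to the motif. The partial sequence homology reported in the text would need a looser criterion, for example allowing one mismatch.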

(90) Materials and Methods

(91) Construction of Adhiron Library

(92) A consensus sequence derived from alignment of 57 phytocystatin sequences was identified as described above and a codon-modified gene designed for expression in E. coli was synthesised (GenScript). The Adhiron scaffold coding region and Adhiron library coding regions were cloned between NheI and NotI restriction sites to create a fusion coding region with the 3' half of gene III of bacteriophage M13 in the phagemid vector pBSTG1, a derivative of pDHisII, which is itself derived from pHEN1 (Hoogenboom, Griffiths et al. 1991) and which also contains a DsbA signal peptide (pBSTG1-DsbA-Adhiron). The library was constructed by splice overlap extension (SOE) of two PCR products (Horton, Cai et al. 1990) and all primers were synthesised by Ella Biotech.

(93) The first PCR product extended from the DsbA coding sequence to the first inserted loop and was generated by the primers:

(94) TABLE-US-00013
Forward primer (SEQ ID NO 21) 5'-TCTGGCGTTTTCTGCGTC-3'
Reverse primer (SEQ ID NO 22) 5'-CTGTTCTTTCGCTTTAACAAC-3'

(95) The second PCR product introduced two nine-amino-acid loop regions into the scaffold protein, at loop 1 and loop 2, by using the following primers. The PstI site (CTGCAG) used for cloning is underscored:

(96) TABLE-US-00014
Forward loop (SEQ ID NO 23) 5'-GTTGTTAAAGCGAAAGAACAGNNNNNNNNNNNNNNNNNNNNNNNNNNNACCATGTACCACTTGACCCTG-3'
Reverse loop (SEQ ID NO 24) 5'-CTGCGGAACTCCTGCAGTTCTTTGAAGTTNNNNNNNNNNNNNNNNNNNNNNNNNNNCTTAACCCAAACTTTCGCTTCG-3'

(97) The degenerate positions (NNN) were introduced as trimers representing a single codon for each of the 19 amino acids excluding cysteine, and there were no termination codons. The primers were also designed to introduce NheI (forward) and PstI (reverse) restriction sites to facilitate cloning into the pBSTG1 phagemid vector, which contains an in-frame amber stop codon to allow translational read through to create an Adhiron-truncated pIII fusion protein. PCR was performed using Phusion High Fidelity Polymerase (NEB) at 98 °C for 5 minutes followed by 20 cycles of 98 °C, 10 sec; 56 °C, 15 sec; 72 °C, 15 sec, followed by 72 °C for 5 minutes. PCR products were purified by gel extraction (Qiagen), and used for SOEing with 10 cycles using the protocol above. The PCR product was digested with NheI and PstI, gel extracted, and then cloned into the pBSTG1-Adhiron phagemid that had also been digested with NheI and PstI to leave the DsbA signal sequence and C-terminal coding region of the Adhiron, to generate the DNA based Adhiron library. Electroporation was used to introduce the ligated library products into E. coli ER2738 electrocompetent cells (Lucigen). In total 20 mL of ER2738 cells were electroporated with 50 ng of library DNA per 50 µl of ER2738 cells. Cells were allowed to recover for 1 hr in 2TY medium and were then grown at 37 °C, 225 rpm to an OD.sub.600 of 0.6 in 2 litres of 2TY medium. 1 µl of M13KO7 helper phage (NEB) (10.sup.14/ml) was added and allowed to infect the cells with shaking at 90 rpm for 1 hr, and then the culture was allowed to produce phage particles overnight at 25 °C in the presence of kanamycin (50 µg/ml). The phage were precipitated with 6% polyethylene glycol 8000 and 0.3 M NaCl, and suspended in 50% glycerol for storage. Library size was determined to be 3×10.sup.10 with a minimal vector-only background.
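The numbers in this paragraph can be put in context with a short calculation. The Python sketch below compares the theoretical diversity of two nine-residue loops, each position drawn from 19 amino acids, with the reported library size, and works out the scale of the electroporation from the stated volumes. It assumes that the 20 mL of cells was used entirely as 50 µl aliquots, which the text implies but does not state. Variable names are illustrative.

```python
AMINO_ACIDS = 19          # trimer mixture, cysteine excluded
POSITIONS_PER_LOOP = 9
LOOPS = 2

per_loop = AMINO_ACIDS ** POSITIONS_PER_LOOP
theoretical = per_loop ** LOOPS
library_size = 3e10       # reported library size

print("sequences per loop:", per_loop)
print("theoretical diversity: %.2e" % theoretical)
print("fraction sampled: %.1e" % (library_size / theoretical))

# Electroporation scale: 20 mL of cells, 50 ng of DNA per 50 µl of cells.
total_cells_ul = 20 * 1000
aliquot_ul = 50
dna_per_aliquot_ng = 50

n_electroporations = total_cells_ul // aliquot_ul
total_dna_ug = n_electroporations * dna_per_aliquot_ng / 1000
print("electroporations:", n_electroporations)
print("total library DNA (µg):", total_dna_ug)
```

On these figures the library samples only a very small fraction of the theoretical two-loop sequence space.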

(98) Target Preparation and Phage Display

(99) The following protocols are described for yeast SUMO but an identical protocol was used for the screening of other targets. Yeast SUMO (ySUMO) protein was expressed in BL21 (DE3) cells using IPTG induction and purified by Ni-NTA resin (Qiagen) affinity chromatography according to the manufacturer's instructions. Purity was confirmed by SDS-PAGE. Yeast SUMO was biotinylated using EZ-link NHS-SS-biotin (Pierce), according to the manufacturer's instructions. Biotinylation was confirmed using streptavidin conjugated to horseradish peroxidase (HRP) to detect the biotin on ySUMO adsorbed onto Immuno 96 Microwell Nunc MaxiSorp (Nunc) plates. Phage display library biopanning was performed as follows:

(100) 5 µl of the phagemid library, containing 10.sup.12 phagemid particles, was mixed with 95 µl phosphate-buffered saline, 0.1% Tween-20 (PBST) and pre-panned three times in high binding capacity streptavidin coated wells (Pierce) for a total of 1 hour. 100 µl of 100 nM biotinylated ySUMO was added to the panning well for 1 hour with shaking on a Heidolph VIBRAMAX 100 at speed setting 3, prior to adding the pre-panned phage for 2.5 hours, also on the vibrating platform. Panning wells were washed 10 times in 300 µl PBST using a plate washer (Tecan Hydroflex), and eluted with 100 µl of 50 mM glycine-HCl (pH 2.2) for 10 minutes, neutralised with 1 M Tris-HCl (pH 9.1), and further eluted with 100 µl of 100 mM triethylamine for 6 minutes and neutralised with 50 µl of 1 M Tris-HCl (pH 7). Eluted phage were incubated with exponentially growing ER2738 cells (OD.sub.600=0.6) for 1 hr at 37 °C and 90 rpm. Cells were plated onto Lysogeny Broth agar plates supplemented with 100 µg/ml carbenicillin and grown at room temperature overnight. The next day, the colonies were scraped into 5 ml of 2TY medium and inoculated into 25 ml of 2TY medium supplemented with carbenicillin (100 µg/ml) to reach an OD.sub.600 of 0.2, incubated at 37 °C, 225 rpm for 1 hr and infected with ca. 1×10.sup.9 M13KO7 helper phage. After 1 hr incubation at 90 rpm, kanamycin was added to 25 µg/ml, cells were incubated overnight at 25 °C at 170 rpm, and phage were precipitated with 6% polyethylene glycol 8000, 0.3 M NaCl and resuspended in 1 ml of 10 mM Tris, pH 8.0, 1 mM EDTA (TE buffer). 2 µl of this phage suspension was used for the second round of selection. This time phage display was performed using streptavidin magnetic beads (Invitrogen). Phage were pre-panned with 10 µl of washed beads for 1 hr on a Stuart SB2 fixed speed rotator (20 rpm), and 10 µl of beads were labelled with 100 µl of 100 nM biotinylated ySUMO for four hours on the same rotator.
Yeast SUMO labelled beads were washed three times in PBST prior to adding the pre-panned phage for 1 hr. Beads were washed 5 times in PBST using a magnet to separate beads from solution after each wash, then eluted and amplified as above. The final pan was performed using neutravidin high binding capacity plates (Pierce), as previously described for the first panning round, but this time the phage were eluted using 100 l 100 mM dithiothreitol (DTT) on a vibrating platform for 20 min prior to infection of ER2738 cells. Phage were recovered from wells containing target protein and controls wells to determine the level of amplification in target wells.

(101) Phage ELISA

(102) Individual ER2738 colonies from the final pan were picked and grown in 100 µl of 2TY with 100 µg/ml of carbenicillin in a 96 deep well plate at 37° C. (900 rpm) for 6 hr. 25 µl of the culture was added to 200 µl of 2TY containing carbenicillin and grown at 37° C. (900 rpm) for 1 hr. M13KO7 helper phage (10 µl of 10.sup.11/ml) were added, followed by kanamycin to 25 µg/ml, and the bacteria were grown overnight at 25° C. (450 rpm). Streptavidin coated plates (Pierce) were blocked with 2× casein blocking buffer (Sigma) overnight at 37° C. The following day the plates were labelled with 0.4 nM of biotinylated yeast SUMO for 1 hr, the bacteria were collected by centrifugation at 3000 rpm for 5 min, and 45 µl of growth medium containing the phage was added to wells containing biotinylated yeast SUMO or a well containing the biotinylated linker and incubated for 1 hr. Wells were washed using a Tecan Hydroflex plate washer 3 times in 300 µl PBST, and a 1:1000 dilution of HRP-conjugated anti-phage antibody (Seramun) in 100 µl PBST was added for 1 hr. Wells were washed 10 times in 300 µl PBST and binding was visualised with 100 µl 3,3′,5,5′-tetramethylbenzidine (TMB) liquid substrate (Seramun) and measured at 560 nm.

(103) Adhiron Protein Production

(104) The DNA coding sequences of Adhirons that bound to yeast SUMO were amplified by PCR, and the product was restriction digested with NheI and PstI and cloned into pET11a containing the Adhiron scaffold and digested with the same restriction enzymes. Colonies were picked and grown overnight in 5 ml LB medium at 37° C., 225 rpm, and plasmid DNA was purified as minipreps (Qiagen) and sequenced to confirm the presence of the correct insert. Plasmids were transformed into BL21 (DE3) cells by heat shock and colonies were grown overnight at 37° C. The following day the culture was added to 400 ml of LB medium, grown to an OD.sub.600 of 0.6 at 250 rpm at 37° C., and isopropyl β-D-1-thiogalactopyranoside (IPTG) was added to 1 mM final concentration. Cells were grown for a further 6 hr, harvested by centrifugation at 3000×g and re-suspended in 25 ml of 1× Bugbuster (Novagen). Benzonase was added according to the manufacturer's instructions and the suspension was mixed at room temperature for 20 minutes, heated to 50° C. for 20 minutes and centrifuged for 20 minutes at 9400×g. The cleared supernatant was mixed with 500 µl of Ni-NTA resin slurry for 1 hr, washed 3 times in 30 ml wash buffer (50 mM PBS, 500 mM NaCl, 20 mM imidazole, pH 7.4) and eluted in 1 ml of elution buffer (50 mM PBS, 500 mM NaCl, 300 mM imidazole, pH 7.4). 100 µg of the SUMO binding Adhirons (Ad-ySUMO) were biotinylated using NHS-SS-biotin (Pierce) according to the manufacturer's instructions for use in ELISAs and Western blotting.

(105) ELISA Analysis

(106) 5 ng (unless otherwise indicated) of target protein in PBS was adsorbed onto Immuno 96 Microwell Nunc MaxiSorp plate wells overnight at 4° C. The next day 200 µl of 3× blocking buffer was added to the wells and incubated at 37° C. for 4 hours with no shaking. Biotinylated yeast SUMO binding Adhirons at 100 µg/ml were diluted 1:1000 in PBST containing 2× blocking buffer and 50 µl aliquots were incubated in target wells for 1 hr with shaking. Wells were washed 3× in 300 µl PBST, and streptavidin conjugated to horseradish peroxidase (HRP) (Invitrogen) diluted 1:1000 in 50 µl PBST was added to the wells for 1 hr. Wells were washed 6× in 300 µl PBST and binding was visualised with 50 µl TMB liquid substrate and the absorbance measured at 560 nm.

(107) Western Blot Analysis

(108) Target protein or target protein mixed with HEK293 cell lysate (20 µg) was mixed with loading buffer (Laemmli, 60 mM Tris-Cl pH 6.8, 2% SDS, 10% glycerol, 5% β-mercaptoethanol, 0.01% bromophenol blue), boiled for 3 min, centrifuged for 1 min at 15,000×g and then resolved in a 15% SDS-polyacrylamide gel. Proteins were transferred to PVDF membranes for 45 minutes at 4 Watts (Amersham Biosciences) and incubated for 1 hr in blocking buffer (5% BSA in PBS 0.1% Tween) followed by incubation for 1 hr with Ad-ySUMO (100 µg/ml diluted 1:1000 in PBST). Bound Ad-ySUMOs were detected using streptavidin conjugated HRP and chemiluminescence (ECL Plus kit, Amersham).

(109) Protein-Protein Interaction Affinity Measurement

(110) The BLitz (ForteBio) dip and read streptavidin biosensors were used to estimate the binding affinity of the biotinylated Ad-ySUMO binders, according to the manufacturer's instructions. In brief, at least 4 readings at different ySUMO concentrations (0.25 mM-1 mM) were taken for each Ad-ySUMO, and a global fit was used to calculate its affinity. These readings were comparable to affinity measurements made using a Biacore surface plasmon resonance instrument.

(111) Results

(112) Adhiron Design and Phage Display

(113) The Adhiron gene was originally designed to create a more potent protease inhibitor. However, because of its potential as a scaffold protein for presenting constrained peptide regions for molecular recognition (FIG. 6A), we decided to investigate its use as such a scaffold. The gene sequence was codon optimised to enhance expression in an E. coli expression system (FIG. 6B). Restriction sites were introduced to facilitate cloning of the gene into the pBSTG1 phagemid vector so that in-frame translational read-through of an amber stop codon produces an Adhiron-truncated pIII fusion protein in suppressor cells such as ER2738, whereas only the Adhiron is produced in non-suppressor cells such as JM83. The Adhiron-pIII fusion protein was expressed from the phagemid vector pBSTG1, while the other components needed for replication and packaging of the phagemid DNA into M13 phage particles were introduced using M13KO7 helper phage. Expression of the Adhiron-pIII fusion protein was confirmed by Western blot analysis using an anti-pIII antibody. The thermal stability of the Adhiron scaffold was tested by differential scanning calorimetry, which showed a melting temperature of 101° C. (FIG. 7A). The structural integrity of the consensus sequence was examined using circular dichroism, which demonstrated a high ratio of beta sheet to alpha helix and random coil (FIG. 7B).

(114) We then compared, by differential scanning calorimetry, the thermal stability of the Adhiron scaffold (FIG. 8A) with that of a representative small, soluble, well characterised protein, lysozyme (FIG. 8B). Lysozyme is significantly less stable (Tm ca. 65° C.) than the Adhiron. We then tested an Adhiron selected to bind to a myc antibody. The addition of the loops into the scaffold reduces the Tm to 85° C., but this is still a higher melting temperature than that of most scaffold proteins. This Adhiron protein can undergo repeated cycles of denaturation and renaturation, as shown by the series of scans (FIG. 8C).

(115) Library Design

(116) The introduction of peptide encoding sequences suitable for molecular recognition was guided by the predicted loop positions within the structure of the Adhiron (FIG. 6). Loop1 was positioned between the first and second beta strands and loop2 was positioned between the third and fourth beta strands. Sequences comprising nine random amino acids (excluding cysteine) were introduced at both loop positions, replacing four and three amino acids in loop1 and loop2, respectively. To determine whether extension of these loop regions disrupts the structure of the Adhiron, three individual Adhirons with loop insertions were isolated, expressed and examined by circular dichroism (FIG. 7B). All three clones maintained a high proportion of beta structure, with one clone displaying an increase in beta structure content, likely indicating extension of the beta strands into the new loop regions. This demonstrates that loop insertion does not disrupt the overall structure of the scaffold. We generated a phage display library of complexity approximately 3×10.sup.10. To check the amino acid composition, 96 clones were isolated from ER2738 cells infected with the library. We examined the sequences of these phage clones to determine any bias in amino acid composition or other undesirable consequences introduced during phage production (FIG. 7C). No bias in amino acid distribution was observed. 86.5% of clones were full-length variants, 3.1% of clones were the Adhiron scaffold with no inserts, and 10.4% of clones showed frame shifts and so were likely of no value in the library. This very high proportion of full-length coding sequences at the level of the phage genome demonstrates the high quality of the library generated.
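The clone statistics above can be checked with a short calculation. The following is a minimal sketch only: the individual clone counts (83, 3 and 10) are an assumption, since the text reports percentages rather than counts, and they were chosen as a whole-number split of the 96 sequenced clones that agrees with the reported figures to one decimal place. The names `REPORTED`, `ASSUMED_COUNTS`, `clone_percentages` and `consistent` are illustrative.

```python
# Reported percentages for the 96 sequenced clones (from the text).
REPORTED = {"full_length": 86.5, "no_insert": 3.1, "frame_shift": 10.4}

# Assumed clone counts: not stated in the text, chosen to match the percentages.
ASSUMED_COUNTS = {"full_length": 83, "no_insert": 3, "frame_shift": 10}


def clone_percentages(counts):
    """Return the percentage of the total represented by each clone class."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


def consistent(counts, reported, tol=0.05):
    """True if every computed percentage is within `tol` of the reported value."""
    pct = clone_percentages(counts)
    return all(abs(pct[name] - reported[name]) <= tol for name in reported)


if __name__ == "__main__":
    for name, value in clone_percentages(ASSUMED_COUNTS).items():
        print(f"{name}: {value:.3f}% (reported {REPORTED[name]}%)")
    print("consistent with reported values:", consistent(ASSUMED_COUNTS, REPORTED))
```

A tolerance check is used instead of rounding so that the comparison does not depend on how a value lying exactly halfway between two decimals is rounded.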

(117) Library Screening

(118) Library screening was performed initially using yeast SUMO as the target. Yeast SUMO was biotinylated to allow immobilisation of the protein via avidin binding proteins and to ensure that the target was not adsorbed directly onto plastic or particle surfaces, which can sometimes lead to denaturation of the target protein. This ensures that the target protein maintains its three dimensional structure, allowing for the selection of binding proteins that recognise either linear or conformational epitopes. Over 1000-fold amplification in colony recovery was observed compared to control samples by panning round three. Twenty four clones were isolated and their ability to bind to the SUMO target was confirmed by phage ELISA (FIG. 9). All clones tested showed strong binding to yeast SUMO with little or no binding to the control wells, demonstrating the specificity of the Adhirons. The clones were sequenced, which identified 22 distinct Adhirons, and the sequences of the loop regions in these clones are shown in Table 4.

(119) TABLE-US-00015 TABLE 4 Showing the two loop sequences for the 24 Ad-ySUMO binders identified from the screen.

Ad-ySUMO   Loop1 (SEQ ID No)   Loop2 (SEQ ID No)
1          WDLTGNVDT (25)      WDDWGERFW (49)
2          IDLTNSFAS (26)      DINQYWHSM (50)
3          INLMMVSPM (27)      GIQQNPSHA (51)
4          IDLTHSLNY (28)      GLTNEIQKM (52)
5          IDLTHSLNY (29)      GLTNEIQKM (53)
6          IDLTEWQDR (30)      PEPIHSHHS (54)
7          WVDMDYYWR (31)      MDEIWAEYA (55)
8          IDLTQTEIV (32)      EPGIIPIVH (56)
9          IDLTDVWID (33)      GLMTQTNSM (57)
10         IIIHENDAD (34)      GIMDGLNKY (58)
11         WILNNTQFI (35)      VLEGPDRWTV (59)
12         WYERSENWD (36)      RDYGFTLVP (60)
13         WDLTTPINI (37)      YEDYQTPMY (61)
14         WFDDEYDWI (38)      DYAATDLYW (62)
15         IDLTQPHDS (39)      YEEDEYWRM (63)
16         IDLTQSFDM (40)      PIDSNFTGT (64)
17         WYLLDVMDD (41)      HDRRYKQAE (65)
18         WIDRGQYWD (42)      IHNGYTIMD (66)
19         WSEADNDWH (43)      LDLETWQHF (67)
20         IDLTGQWLF (44)      PLWQYDAQY (68)
21         IDLTQSFDM (45)      PSHHNYQTM (69)
22         IDLTQSFDM (46)      PIDSNFTGT (70)
23         IDLTQPHDS (47)      PHDELNWNM (71)
24         WEDFQTHWE (48)      DVGQLLSGI (72)

(120) Clones 4 and 5 are identical and clones 16 and 22 are also identical. Interestingly, clones 15 and 23, as well as clones 21 and 22, contain the same amino acid sequence in loop1 but different sequences in loop2. This sequence variation further supports the complex nature of the library. Analysis of the sequences identified a commonly occurring sequence of IDLT in positions 1 to 4 of loop1 in 12 of the clones, indicating that this may be an important motif in binding to at least one epitope on the ySUMO. Also, either a P or a G occurs at position 1 of the second loop in 9 distinct clones, and a P or G occurs at a position within residues 2 to 5 in another 6 clones, potentially indicating that some structural feature may be important in binding. Interestingly, the IDLT motif is similar to the human SUMO1 binding site of the MEF2 E3 ligase PIASx (VDVIDLT, SEQ ID NO 73) (Song, Durrin et al. 2004; Song, Zhang et al. 2005). Four clones were selected for further characterisation: clones 15 and 22, as their loop1 sequences occurred more than once; clone 20, as it also contained the IDLT motif; and clone 10, as it contained a distinct motif in loop1.
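The motif counts quoted above can be reproduced directly from Table 4. The following is a minimal sketch: the loop sequences are copied from Table 4, while the dictionary and variable names (`CLONES`, `idlt_clones`, `distinct_pairs`, `pg_first_distinct`) are illustrative and not part of the original work. "Distinct" is taken here to mean a distinct loop1/loop2 pair, which is an interpretation of the text.

```python
# Loop1 and loop2 sequences of the 24 Ad-ySUMO clones, copied from Table 4.
CLONES = {
    1: ("WDLTGNVDT", "WDDWGERFW"), 2: ("IDLTNSFAS", "DINQYWHSM"),
    3: ("INLMMVSPM", "GIQQNPSHA"), 4: ("IDLTHSLNY", "GLTNEIQKM"),
    5: ("IDLTHSLNY", "GLTNEIQKM"), 6: ("IDLTEWQDR", "PEPIHSHHS"),
    7: ("WVDMDYYWR", "MDEIWAEYA"), 8: ("IDLTQTEIV", "EPGIIPIVH"),
    9: ("IDLTDVWID", "GLMTQTNSM"), 10: ("IIIHENDAD", "GIMDGLNKY"),
    11: ("WILNNTQFI", "VLEGPDRWTV"), 12: ("WYERSENWD", "RDYGFTLVP"),
    13: ("WDLTTPINI", "YEDYQTPMY"), 14: ("WFDDEYDWI", "DYAATDLYW"),
    15: ("IDLTQPHDS", "YEEDEYWRM"), 16: ("IDLTQSFDM", "PIDSNFTGT"),
    17: ("WYLLDVMDD", "HDRRYKQAE"), 18: ("WIDRGQYWD", "IHNGYTIMD"),
    19: ("WSEADNDWH", "LDLETWQHF"), 20: ("IDLTGQWLF", "PLWQYDAQY"),
    21: ("IDLTQSFDM", "PSHHNYQTM"), 22: ("IDLTQSFDM", "PIDSNFTGT"),
    23: ("IDLTQPHDS", "PHDELNWNM"), 24: ("WEDFQTHWE", "DVGQLLSGI"),
}

# Clones whose loop1 carries IDLT at positions 1 to 4.
idlt_clones = [n for n, (loop1, _) in CLONES.items() if loop1.startswith("IDLT")]

# Distinct Adhirons, treating a loop1/loop2 pair as the identity of a clone.
distinct_pairs = set(CLONES.values())

# Distinct Adhirons with P or G at position 1 of loop2.
pg_first_distinct = {pair for pair in distinct_pairs if pair[1][0] in "PG"}

if __name__ == "__main__":
    print("IDLT in loop1:", len(idlt_clones), "clones", idlt_clones)
    print("distinct Adhirons:", len(distinct_pairs))
    print("distinct with P/G at loop2 position 1:", len(pg_first_distinct))
```

Run against Table 4, this gives 12 clones with the IDLT motif, 22 distinct Adhirons, and 9 distinct Adhirons with P or G at position 1 of loop2, matching the figures in the text.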

(121) Characterisation of the Adhiron-ySUMO (Ad-ySUMO) Proteins

(122) Due to the high thermal stability of the Adhiron scaffold (FIG. 7A), we predicted that, to aid purification, it should be possible to heat denature and precipitate the majority of E. coli proteins without affecting the integrity of the expressed Adhirons. To test this we heated lysates for 20 minutes at 50° C., 60° C., 70° C., 80° C., 90° C. and 100° C., centrifuged to pellet the denatured protein and analysed the supernatants by SDS-PAGE (FIG. 10A). Heating the lysate dramatically decreased the quantity of bacterial protein in the supernatant but did not significantly reduce Adhiron levels. A temperature of 50° C. was suitable to precipitate the majority of bacterial proteins and so was adopted in future studies. FIG. 9B demonstrates that the purified Ad-ySUMOs show high purity using a batch metal affinity purification method and that in some samples, such as clone 10, the majority of the protein was not isolated, potentially due to the limiting amount of the resin used during this purification. The estimated level of protein expressed was approximately 100 mg/L. The affinities of the Adhirons were estimated using BioLayer Interferometry with BLitz (ForteBio) dip and read biosensors. The affinities were 11.5, 2.4, 14.2, and 9.0 nM for Ad-ySUMO10, 15, 20 and 22, respectively. These values are in line with affinities normally seen for good antibodies.

(123) To further evaluate the use of the Adhirons as research reagents the Ad-ySUMOs were biotinylated and used in ELISA (FIG. 10C) and Western blot analysis (FIG. 10D). The Ad-ySUMOs bind to yeast SUMO but not to human SUMO1 (data not shown for SUMO1 Western blots) (n=3). To determine the specificity of the reagents, yeast SUMO was mixed with HEK293 cell lysates. Interestingly, Ad-ySUMO10 and 15 show specific binding to yeast SUMO with no binding to other proteins but Ad-ySUMO20 and 22 bind to many proteins in the lysates by Western blotting (n=3).

(124) Further Example Screens

(125) To further evaluate the ability of the Adhiron library to identify specific reagents capable of binding to a range of targets we screened against a growth factor (FGF1), a receptor (CD31), and a peptide sequence. All screens were performed over three panning rounds. Phage ELISA was used to examine the ability of Adhirons to bind to the corresponding target (FIG. 11).

(126) Interestingly, the majority of the clones tested for FGF1 and CD31 showed specific binding, whereas only three clones from the peptide screen showed specific binding. Further panning rounds against the peptide increased the ratio of hits to background so that 80% of the clones picked showed binding to target. This result is not unexpected due to the small size and limited likelihood of appropriate epitope presentation of the peptide compared to the larger and therefore potentially multiple epitope sites of the proteins.

(127) To confirm that expressed Adhirons bind to their targets we have used the BLitz to analyse three distinct recombinant Adhirons for both CD31 and the peptide target. The Adhirons were expressed and purified as soluble proteins. The K.sub.D values for CD31 Adhirons ranged from 8.5×10.sup.-8 to 6.8×10.sup.-9 M while those for the peptide ranged from 3.3×10.sup.-8 to 3.5×10.sup.-8 M.

(128) Additionally, as shown in FIG. 25, we have identified Adhirons that bind to an organic molecule, posaconazole.

(129) Adhiron Crystal Structure

(130) FIG. 17 shows the crystal structure of an Adhiron complexed with a FcgRIIIa receptor domain. This reveals the 3D structure of the Adhiron showing the key structural elements of 4 beta strands and an alpha helix. The structure is unexpected in terms of a more compact nature and an apparently less twisted structure than is seen for X-ray structures of other cystatins, including for example stefin A. It is interesting that the beta structure extends into the loop regions to varying degrees.

(131) The importance of displaying two randomised loops for selection of binding molecules for at least some targets is highlighted in this structure by the intimate interaction of both loops with the receptor protein. These loops correspond to LOOP1 and LOOP2 described in more detail below.

(132) Discussion

(133) We have developed a new scaffold protein, termed Adhiron, based on a consensus design of plant cystatin proteins, which displays an extremely high thermal stability with a Tm of 101° C. This scaffold has been used to produce libraries by the introduction of two 9 amino acid variable regions. These variable sequences were encoded by oligonucleotides in which the variable positions are a subset of trimers comprising a single codon for each amino acid with the exception of cysteine. This resulted in a very similar distribution of each amino acid within the library. The library was of very high quality, with 86.5% of clones representing full length variant clones.
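To put the library size in context, the theoretical sequence space of the two variable regions can be compared with the reported library complexity. The following is a minimal sketch under stated assumptions: it assumes 19 equally likely amino acids (the 20 standard residues minus cysteine) at each of the 9 positions in each of the two loops, and it uses the approximate complexity of 3×10.sup.10 reported above. The constant and function names are illustrative.

```python
# Assumptions: 19 amino acids per variable position (no cysteine),
# 9 positions per loop, 2 loops, all combinations equally likely.
AMINO_ACIDS_PER_POSITION = 19
LOOP_LENGTH = 9
NUMBER_OF_LOOPS = 2

# Approximate library complexity reported in the text.
REPORTED_COMPLEXITY = 3e10


def theoretical_diversity(alphabet, loop_length, loops):
    """Number of possible sequences across all variable positions."""
    return alphabet ** (loop_length * loops)


if __name__ == "__main__":
    space = theoretical_diversity(AMINO_ACIDS_PER_POSITION, LOOP_LENGTH, NUMBER_OF_LOOPS)
    fraction = REPORTED_COMPLEXITY / space
    print(f"theoretical sequence space: {space:.3e}")
    print(f"fraction sampled by a library of 3e10 clones: {fraction:.1e}")
```

Under these assumptions the library samples only a very small fraction of the possible loop combinations, which is typical of display libraries and is one reason an even amino acid distribution matters.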

(134) The library was configured in a filamentous phage display format as a truncated gp3 fusion and has been screened against various target proteins. Analysis of Adhirons identified by screening against yeast SUMO revealed a number of proteins with distinct sequences in their variable regions. In some cases there are similarities, which implies binding to the same site on the SUMO, whereas other clones do not show sequence conservation. All clones bind to ySUMO and not to a range of control proteins, indicating specificity. We have also identified Adhirons that bind specifically to a growth factor, human FGF1, a human receptor protein domain from CD31, a peptide and an organic compound. The ability to select binders against organic compounds as well as a wide range of proteins is an important finding, as most scaffolds have structural features that favour particular classes of target molecules.

(135) The Adhirons can be conveniently purified by the inclusion of a temperature step at 50° C., which denatures many of the endogenous E. coli host proteins, thus enhancing the efficiency of affinity purification of the Adhiron. The X-ray structure of a complex of an Adhiron that binds to the human FcgRIII receptor provides useful information not only on the complex but also on the Adhiron protein, and reveals a more compact and largely beta structure that supports the CD data. The compact nature of the scaffold, which is more pronounced than seen in other structures of stefins or cystatins, seems likely to contribute to its high thermal stability. The interaction interface revealed by the X-ray structure indicates that both loop regions are involved in the interaction with the receptor domain.

(136) The demonstration of successful and high quality libraries based on this highly stable, small and easily purified scaffold coupled with our robust and effective strategy for screening against a range of proteins makes our Adhiron library a valuable and novel resource for the development of reagents for a wide range of scientific, medical and commercial applications.

(137) Variants of Adhirons

(138) Variants of the Adhiron scaffold have been produced. The sequence shown in SEQ ID NO 1 and FIG. 6 is the shortest version of Adhiron examined, and is 81 residues in length before addition of further functional sequences such as a linker and His-tag or other sequences. However, two longer scaffold proteins have also been produced (SEQ ID NOS 2 and 3), and each may be preferable in certain circumstances. Further deletion from the scaffold may be possible, but it is believed that SEQ ID NO 1, or variants thereof, is near to the optimum minimal length without stability of the scaffold protein being unduly compromised.

(139) Full length Adhiron 92 (92 aa) has the following sequence (which consists of the scaffold sequence SEQ ID NO 3 plus a linker and His-tag, which are underlined):

(140) TABLE-US-00016 (SEQ ID NO 74) ATGVRAVPGN ENSLEIEELA RFAVDEHNKK ENALLEFVRV VKAKEQVVAG TMYYLTLEAK DGGKKKLYEA KVWVKPWENF KELQEFKPVG DA AAAHHHHHH

(141) Short Adhiron 84 (84 aa) has the following sequence (which consists of the scaffold sequence SEQ ID No 2 plus a linker and His-tag, which are underlined):

(142) TABLE-US-00017 (SEQ ID NO 75) GNENSLEIEE LARFAVDEHN KKENALLEFV RVVKAKEQVV AGTMYYLTLE AKDGGKKKLY EAKVWVKPWE NFKELQEFKP VGDA AAAHHHHHH

(143) Shortest Adhiron 81 (81 aa), which is shown in FIG. 6, has the following sequence (which consists of the scaffold sequence SEQ ID No 1 plus a linker and His-tag, which are underlined):

(144) TABLE-US-00018 (SEQ ID NO 76) NSLEIEELAR FAVDEHNKKE NALLEFVRVV KAKEQVVAGT MYYLTLEAKD GGKKKLYEAK VWVKPWENFK ELQEFKPVGD A AAAHHHHHH

(145) The underlined sequence comprises an additional 3 Ala linker and 6 His detection/purification tag. This tag is not part of the scaffold per se, but is a useful addition to the protein for obvious reasons.

(146) Specific Examples of Adhirons for Libraries

(147) Exemplary Adhiron sequences which are useful for preparing scaffold protein libraries, i.e. libraries in which a variety of peptides have been inserted into the scaffold, are as follows:

(148) TABLE-US-00019 Adhiron 92 (sequence shown includes an additional Met, linker and tag) (SEQ ID NO 77) M ATG VRAVPGNENSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ VVAG TMYYLTLEAKDGGKKKLYEAKVWVK PWE NFKELQEFKPVGDA AAAHHHHHH

(149) One or more of the following modifications can be/have been made: An additional methionine residue (in bold) has been added at the N-terminus to facilitate translation. An N-terminal portion is located at amino acid residues 2-4 (in bold and italics), and suitably these 3 amino acids are replaced by an insert of typically from 3 up to about 20 amino acids. LOOP1 is located at amino acid residues 47-50 (numbered to exclude the N-terminal met) (in bold and italics), and suitably these 4 amino acids can be replaced by an insert of typically from 4 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. LOOP2 is located at amino acid residues 76-78 (numbered to exclude the N-terminal met) (in bold and italics), and suitably these 3 amino acids can be replaced by an insert of typically from 3 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. A C-terminal linker and His-tag is present. The length and composition of the linker can be varied, and the tag could of course be adapted to any suitable purification system.

(150) TABLE-US-00020 Adhiron 84 (sequence shown includes an additional Met, linker and tag) (SEQ ID NO 77) M GNENSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ VVAG TMYYLTLEAKDGGKKKLYEAKVWVK PWE NFKELQEFKPVGDA AAAHHHHHH

(151) One or more of the following modifications can be/have been made: An additional methionine residue (in bold) has been added at the N-terminus to facilitate translation. An N-terminal peptide sequence can be added to the N-terminus of the Adhiron (i.e. between the methionine residue and the first glycine as shown above), and this addition can be typically from 3 up to about 20 amino acids. LOOP1 is located at amino acid residues 39-42 (numbered to exclude the N-terminal met) (shown in bold and italics), and suitably these 4 amino acids can be replaced by an insert of typically from 4 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. LOOP2 is located at amino acid residues 68-70 (numbered to exclude the N-terminal met) (shown in bold and italics), and suitably these 3 amino acids can be replaced by an insert of typically from 3 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. A C-terminal linker and His-tag is present. The length and composition of the linker can be varied, and the tag could of course be adapted to any suitable purification system.

(152) TABLE-US-00021 Adhiron 81 (excludes the Met) (SEQ ID NO 78) M NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ VVAG TMYYLTLEAKDGGKKKLYEAKVWVK PWE NFKELQEFKPVGDA AAAHHHHHH

(153) One or more of the following modifications can be/have been made: An additional methionine residue (in bold) has been added at the N-terminus to facilitate translation. An N-terminal loop can be added to the N-terminus of the Adhiron, and this addition can be typically from 3 up to about 20 amino acids. LOOP1 is located at amino acid residues 36-39 (numbered to exclude the N-terminal met) (shown in bold and italics), and suitably these 4 amino acids can be replaced by an insert of typically from 4 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. LOOP2 is located at amino acid residues 65-67 (numbered to exclude the N-terminal met) (shown in bold and italics), and suitably these 3 amino acids can be replaced by an insert of typically from 3 up to about 20 amino acids. A loop length of from 5 to 13 amino acids is preferred, and it is believed that a loop length of 9 amino acids is optimal. A C-terminal linker and His-tag is present. The length and composition of the linker can be varied, and the tag could of course be adapted to any suitable purification system.
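The loop positions quoted for the three scaffold lengths can be cross-checked against the scaffold sequences. The following is a minimal sketch: the 81-residue core is SEQ ID NO 1 as given above and the longer scaffolds are built by adding the N-terminal residues shown in SEQ ID NOS 74 and 75; taking VVAG and PWE as the native LOOP1 and LOOP2 residues follows the claims. The names `CORE_81`, `SCAFFOLDS` and `motif_span` are illustrative.

```python
# SEQ ID NO 1 (81 residues); the longer scaffolds add N-terminal residues.
CORE_81 = ("NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQVVAGT"
           "MYYLTLEAKDGGKKKLYEAKVWVKPWENFKELQEFKPVGDA")

SCAFFOLDS = {
    "Adhiron 92": "ATGVRAVPGNE" + CORE_81,
    "Adhiron 84": "GNE" + CORE_81,
    "Adhiron 81": CORE_81,
}


def motif_span(sequence, motif):
    """Return the 1-based (first, last) residue numbers of a motif.

    Numbering excludes any N-terminal Met, as in the text. The motif must
    occur exactly once, otherwise the position would be ambiguous.
    """
    if sequence.count(motif) != 1:
        raise ValueError(f"{motif!r} must occur exactly once")
    first = sequence.index(motif) + 1
    return first, first + len(motif) - 1


if __name__ == "__main__":
    for name, seq in SCAFFOLDS.items():
        print(name, len(seq), "aa",
              "LOOP1", motif_span(seq, "VVAG"),
              "LOOP2", motif_span(seq, "PWE"))
```

With this numbering the spans come out as 47-50 and 76-78 for Adhiron 92, 39-42 and 68-70 for Adhiron 84, and 36-39 and 65-67 for Adhiron 81, in agreement with the positions stated above.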

(154) Thus, taking Adhiron 92 as an example, a particularly suitable scaffold protein for use in a display system may take the form:

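(155) TABLE-US-00022 (SEQ ID NO 79)
M XXXXXX VRAVPGNENSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQ XXXXXXXXX TMYYLTLEAKDGGKKKLYEAKVWVK XXXXXXXXX NFKELQEFKPVGDA AAAHHHHHH
where X is any amino acid. Reading from the N-terminus, the first run of X is the N-TERM PEPTIDE, the second is LOOP1 and the third is LOOP2; AAAHHHHHH is the linker and tag.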
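The replacement of the native loops by variable peptides can be illustrated in code. The following is a minimal sketch, not the method used to construct the library (which was built at the DNA level): it splices two peptides into the Adhiron 92 scaffold in place of VVAG and PWE. The function name `graft_loops`, the length limits (taken from the "4 up to about 20" and "3 up to about 20" ranges above) and the choice of the Ad-ySUMO15 loops from Table 4 as the example are assumptions made for illustration. No N-terminal insert, Met, linker or tag is added.

```python
# Adhiron 92 scaffold: 11 N-terminal residues plus SEQ ID NO 1.
ADHIRON_92 = ("ATGVRAVPGNE"
              "NSLEIEELARFAVDEHNKKENALLEFVRVVKAKEQVVAGT"
              "MYYLTLEAKDGGKKKLYEAKVWVKPWENFKELQEFKPVGDA")

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def graft_loops(scaffold, loop1, loop2, native1="VVAG", native2="PWE"):
    """Replace the native LOOP1 and LOOP2 residues with the given peptides."""
    for name, loop, shortest in (("loop1", loop1, 4), ("loop2", loop2, 3)):
        if not shortest <= len(loop) <= 20:
            raise ValueError(f"{name} must be {shortest} to 20 residues long")
        if not set(loop) <= STANDARD_AMINO_ACIDS:
            raise ValueError(f"{name} contains a non-standard residue code")
    if scaffold.count(native1) != 1 or scaffold.count(native2) != 1:
        raise ValueError("native loop motifs must each occur exactly once")
    # Locate both motifs on the unmodified scaffold, then splice by position,
    # so that an insert resembling a motif cannot be replaced by mistake.
    start1 = scaffold.index(native1)
    start2 = scaffold.index(native2)
    return (scaffold[:start1] + loop1
            + scaffold[start1 + len(native1):start2] + loop2
            + scaffold[start2 + len(native2):])


if __name__ == "__main__":
    # Loop sequences of Ad-ySUMO15, copied from Table 4.
    variant = graft_loops(ADHIRON_92, "IDLTQPHDS", "YEEDEYWRM")
    print(len(variant), variant)
```

Replacing 4 and 3 native residues with two 9-residue peptides lengthens the 92-residue scaffold by 11 residues, giving a 103-residue variant.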

(156) In general LOOP1 and LOOP2 are believed to be of primary importance in target binding, as supported by the crystal structures (FIG. 17). In certain embodiments of the invention only one of LOOP1 and LOOP2 can be replaced with a peptide sequence, but in general it is preferred that both are replaced. Whilst the N-TERM is not envisaged as being quite as important as LOOPs 1 and 2, in many circumstances inserting a suitable peptide at the N-TERM may result in improved binding affinity and specificity.

(157) FIGS. 20 to 24 show sequence alignments of the LOOP1 and LOOP2 regions of several Adhirons which bind to LOX1, HGH, yeast SUMO, PBP2 and a peptide respectively.

(158) Additionally, FIGS. 27 and 28 show LOOP1 and LOOP2 regions of several Adhirons which bind to Grb2 and STAT3.

(159) It should be noted that peptides can optionally be inserted into the loop regions without removal of the existing amino acids, but this is typically less preferred.

(160) Additional Examples of the Wide Utilities of Adhirons

(161) Immunofluorescence:

EXAMPLE

Reagents to Detect Viral Proteins

(162) Despite repeated attempts over many years it has not proven possible to raise an antibody that identifies the Human Papilloma Virus E5 protein. (Quantitative measurement of human papillomavirus type 16 E5 oncoprotein levels in epithelial cell lines by mass spectrometry. Sahab et al., J Virol. 2012 September; 86(17):9465-73; Wetherill, L F, Ross, R and Macdonald, A (2012). HPV E5: An enigmatic oncoprotein Small DNA Tumour Viruses Ed. K. Gaston. Caister Academic Press. Pp. 55-70)

(163) The utility of the Adhiron system is thus demonstrated by the example that Adhirons were raised against HPV16 E5 viral protein using as a target a peptide with identity to a region of the E5 protein. The E5 Adhiron was biotinylated and used in immunofluorescence to detect E5 protein in cells overexpressing E5. The Adhiron does not cross react with other HPV serotypes. In addition, the E5 Adhirons have been conjugated to Quantum Dots and used to detect E5 protein in human samples. Conjugation to Quantum Dots increased the sensitivity of the reagents. See FIG. 12. HPV16 E5 GFP (target) and HPV16 E5 GFP without the epitope for the Adhiron (control) were expressed in mammalian cells. Adhiron E5 was conjugated to quantum dots and used to detect E5 protein in the mammalian cells. Cells were stained with DAPI (DNA stain), GFP, and E5 (with the Adhiron-Quantum dots). The Adhiron binds only to the target, showing specificity for the E5 protein.

(164) Inhibiting and Modifying Protein-Protein Interactions:

Example 1

Reagents that Inhibit Binding to SUMO

(165) There have been no antibodies raised that are able to specifically and differentially bind to human SUMO 2 (hSUMO2). Adhirons were raised against hSUMO2 and multiple Adhirons that specifically bind to SUMO2 rather than human SUMO1 were identified. FIG. 13 demonstrates that the hSUMO2 Adhirons have a functional effect on a protein-protein interaction by binding to hSUMO2 and preventing RNF4, a polySUMO specific E3 ubiquitin ligase from binding with hSUMO2. The hSUMO2 Adhiron has this effect without affecting ubiquitination of other proteins. In the presence of ATP hSUMO2 normally binds to RNF4 causing ubiquitination of the target proteins (black smear at the top of the gel in lane 2). In the presence of increasing concentrations of the Adhirons (lanes 3 to 9) the level of ubiquitination decreases.

Example 2

Reagents that Alter Fibrin Clot and Lysis

(166) Fibrinogen was screened to identify Adhirons that could alter clot formation and lysis. Numerous Adhirons have been identified that alter this process in plasma samples. The graph shown in FIG. 14 represents the clot formation and lysis turbidity assay. The black line represents the normal time course of clot formation and lysis. The grey lines represent the effects of five different Adhirons on this process. Control non-fibrinogen binding Adhirons have no effect in this assay. The effects of the Adhirons include reduced clot formation, increased lysis time, and increased clotting time. This demonstrates an ability of the Adhirons to modulate protein function by inhibiting protein-protein interactions. FIG. 15 shows a confocal image of FITC fluorescently labelled fibrinogen after clot formation in the presence of a fibrinogen binding Adhiron and a control Adhiron.

(167) Expression of Adhirons in Mammalian Cells:

(168) Adhirons were raised against human SUMO2 as described in Example 1 and expressed in mammalian HEK293 cells using the pcDNA3.1 mammalian expression vector. The Adhirons were fused with a FLAG tag and a nuclear localisation signal. Control cells (no Adhiron expressed) and cells expressing a human SUMO2 specific Adhiron were treated with arsenic (As) for 2 hrs then washed and allowed to recover for 12 and 24 hr. Arsenic causes an increase in promyelocytic leukaemia (PML) protein bodies but SUMO regulates the degradation of these bodies. Cells were stained using an anti-FLAG antibody to identify the Adhiron, and for PML (organiser of nuclear bodies). The human SUMO2 Adhiron alters the degradation of PML leading to an increase in these bodies. This demonstrates ability to express functional Adhirons in mammalian cells. The results are shown in FIG. 16.

(169) Co-Crystallisation and Other Structural Biology Methods to Identify Druggable Sites on Proteins:

(170) FIG. 17 shows the co-crystal structure of FcgRIIIa (grey) and bound Adhiron (white). Adhirons were identified that bind to FcgRIIIa and were then used in a range of assays, including cell- and SPR-based assays, to show that they inhibit IgG binding. The Adhirons were then co-crystallised and the structure solved (diagram above). This identified druggable sites, including an allosteric site, on FcgRIIIa. The Adhiron is also suitable for NMR studies and, as an example, FIG. 18 shows 1H-15N HSQC spectra of an anti-yeast SUMO Adhiron and the Adhiron-yeast SUMO complex. The ability to rapidly collect structural data on Adhirons will have many applications, including identifying potential drug binding sites.

(171) Incorporation Into Electronic Devices for Developing Point of Care Devices:

(172) Site-directed mutagenesis of the coding sequence of the human SUMO2 Adhiron (described in Example 1) allowed the introduction of a cysteine at the C-terminal end of the oligohistidine tag. This allowed directional immobilisation of the Adhiron on an electronic device surface such that the molecular recognition loops are accessible to analyte. When the target protein binds to the Adhiron on the device, a change in impedance can be measured. The change is concentration-dependent (as shown in FIG. 19), demonstrating the effective presentation of the Adhiron and the ability of Adhirons to be productively incorporated into electronic devices as a platform for biosensor applications.

(173) This protocol was repeated for another Adhiron, in this case a human fibrinogen Adhiron. The results of this work are shown in FIG. 26. Once again a concentration-dependent change was observed, and this was demonstrated over a range from attomolar to micromolar concentrations.
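Because the tested concentrations span roughly twelve orders of magnitude (attomolar to micromolar), concentration dependence of this kind is conveniently examined on a logarithmic concentration axis. The following is a minimal sketch, not the analysis behind FIG. 19 or FIG. 26: it fits a straight line to the response against log10(concentration) by ordinary least squares. The function name `log_concentration_fit` is hypothetical, and any response values passed to it in this text are placeholders, not measured data.

```python
import math


def log_concentration_fit(concentrations_molar, responses):
    """Least-squares line through (log10(concentration), response).

    Returns (slope, intercept). A slope clearly different from zero indicates
    that the response changes with concentration; a linear model is an
    assumption made here purely for illustration.
    """
    if len(concentrations_molar) != len(responses) or len(responses) < 2:
        raise ValueError("need at least two matching concentration/response pairs")
    if any(c <= 0 for c in concentrations_molar):
        raise ValueError("concentrations must be positive")

    xs = [math.log10(c) for c in concentrations_molar]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(responses) / len(responses)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct concentrations")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, responses))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

For example, `log_concentration_fit([1e-18, 1e-15, 1e-12, 1e-9, 1e-6], [1.0, 2.0, 3.0, 4.0, 5.0])` uses five placeholder responses spaced three decades apart; the returned slope is the change in response per tenfold change in concentration.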

(174) Adhirons Have Been Identified that Bind to a Range of Targets:

(175) The Adhiron library has been used to screen against a wide range of target molecules using the display methodology described in this document. Table 5 lists examples of targets against which Adhirons have been raised; these include proteins, peptides and small molecules. The Adhirons raised against these targets display high affinity and specificity. This demonstrates that Adhirons provide a versatile scaffold molecule and that the libraries built using this scaffold are effective in identifying artificial binding proteins capable of binding to a broad range of target molecules. Examples of some of the sequences that have been isolated from screens against some of the targets are shown in FIGS. 20-24. We have recently also screened against magnetic particles produced by magnetotactic bacteria and have identified Adhirons that bind to epitopes on these multicomponent bioinorganic complexes.

(176) TABLE-US-00023 TABLE 5 Targets against which Adhirons have been successfully raised.
FcRIIIa - protein
Interleukin 8 - protein
P7 - peptide
Human serum albumin - protein
E5 - peptide
C-reactive protein - protein
3D - protein
Beta 2-microglobulin - protein
HSV-1 gB - protein
Serum amyloid P - protein
HSV-1 gD - protein
Vascular endothelial growth factor receptor 2 - protein
HSV-2 gD - protein
Oxidized low-density lipoprotein receptor 1 - protein
M2 - protein
Allograft inflammatory factor 1 - protein
HE4 - protein
CD30 - protein
S100 calcium binding protein B - protein
CD31 - protein
Yeast SUMO - protein
Beta secretase 1 - protein
Human SUMO1 - protein
Proprotein convertase subtilisin/kexin type 9 - protein
Human SUMO2 - protein
Myosin e1 - protein
GST - protein
Enhanced green fluorescent protein - protein
Growth Hormone - protein
Fyn - protein
Fibroblast growth factor 1 - protein
Lck - protein
Fibroblast growth factor receptor 1 - protein
ZAP70 - protein
Fibroblast growth factor receptor 3 - protein
TUBA8 - peptide
Phospho Fibroblast growth factor receptor 3
MRP1 - peptide
Epidermal growth factor receptor 1 - protein
Penicillin-binding protein 2a - protein
HER2 - protein
Dog IgE - protein
Epiregulin - protein
Horse IgG - protein
Amphiregulin - protein
Dog IgG - protein
Fibrinogen - protein
Dog CRP - protein
Complement C3 - protein
3 small molecules
Myoglobin - protein
Posaconazole - small molecule
CK19 - protein
GP-73 - protein
HE4 - protein
Thioredoxin - protein
CD27 - protein
Signal transducer and activator of transcription 1 - protein
Signal transducer and activator of transcription 3 - protein
Signal transducer and activator of transcription 4 - protein
Signal transducer and activator of transcription 5 - protein
Phosphoinositide 3-kinase p85 alpha - protein
Phosphoinositide 3-kinase p85 beta - protein
Phosphoinositide 3-kinase p55 - protein
Myeloid leukemia cell differentiation protein - protein
B-cell lymphoma-extra large - protein
Nucleoside transporter protein C - membrane protein
Factor XIII - protein
Breakpoint cluster region protein - protein
Casein kinase II A1 - protein
Casein kinase II A2 - protein
Protein kinase C zeta - protein
Protein kinase cGMP dependent type II - protein
Vaccinia related kinase 1 - protein
Hydrophobin protein 1 - protein
FusB - protein
Interleukin 17A - protein
Interleukin 6 - protein
Osteocalcin - protein
Osteopontin - protein
Parathyroid hormone - protein
Bacterial spores - protein
Matrix metalloproteinase-3 - protein
Aminotransferase - protein
Cytokeratin 8 - protein
S100 A3 - protein
S100 A6 - protein
Transacetylase - protein
Serine protease inhibitors A1 - protein
Serine protease inhibitors A3 - protein
GAPDH - protein
p53 - protein
Resistin - protein
Lipocalin 2 - protein
Procalcitonin - protein

