Process for generation of protein and uses thereof
09598688 · 2017-03-21
Assignee
Inventors
Cpc classification
C07K1/36
CHEMISTRY; METALLURGY
C07K1/00
CHEMISTRY; METALLURGY
C07K2299/00
CHEMISTRY; METALLURGY
C12N15/1089
CHEMISTRY; METALLURGY
C12Y401/01039
CHEMISTRY; METALLURGY
International classification
C12N15/00
CHEMISTRY; METALLURGY
C07K1/00
CHEMISTRY; METALLURGY
C07K1/36
CHEMISTRY; METALLURGY
Abstract
A method of generating a protein with an improved functional property, the method comprising: (a) identifying at least one Target amino acid Residue in a first protein, wherein said Target amino acid Residue is associated with said functional property; (b) comparing at least one homologous second protein from the same or a different phylogenetic branch as the first protein with the first protein and identifying at least one Variant amino acid Residue between the first protein and the second protein; (c) selecting at least one Candidate amino acid Residue from the Variant amino acid Residue identified in (b) on the basis of said Candidate amino acid Residue affecting said Target amino acid Residue with respect to said functional property; (d) forming at least one Candidate Mutant protein in silico or producing at least one Candidate Mutant protein in vitro in which said at least one Candidate amino acid Residue from the second protein substitutes a corresponding residue in the first protein; and (e) screening said at least one Candidate Mutant protein produced in (d) to identify a protein having said improved functional property.
Claims
1. A method of generating a Rubisco protein with an improved functional property selected from the group consisting of improved kinetic efficiency of the Rubisco protein, an altered specificity of the Rubisco protein for one or more substrates, an altered specificity for one or more products of the Rubisco protein, and an altered effective temperature range for Rubisco protein catalysis, the method comprising: (a) identifying at least one Target amino acid Residue in a first Rubisco protein, wherein said at least one Target amino acid Residue is associated with said functional property, wherein said at least one Target amino acid Residue is one or more first shell residues that directly coordinate with a Rubisco protein reaction center, or one or more second shell residues that directly coordinate with one or more first shell residues, wherein said one or more first shell residues are selected from the group consisting of Glu60, Asn123, Lys175, Lys177, Kcx201, Asp203, Glu204, His294, and Lys334, wherein said one or more second shell residues are residues that directly coordinate with said one or more first shell residues, and wherein residue numbering is from the amino acid sequence of spinach Rubisco protein, and Kcx201 denotes carbamylated Lys201; (b) comparing at least one homologous second Rubisco protein from the same or a different phylogenetic branch as the first Rubisco protein with the first Rubisco protein and identifying at least one Variant amino acid Residue between the first Rubisco protein and the second Rubisco protein, wherein the residues of the second Rubisco protein used for comparison with the first Rubisco protein comprise residues selected from the group consisting of all residues directly coordinated to a Rubisco protein active site, all residues interacting with a Rubisco protein substrate reactive center or a Rubisco protein intermediate reaction species, other residues within a proximate distance of between 3 and 28 Å from any atom of substrate or intermediate reaction species from the active site of a Rubisco protein, and a subset of such residues; (c) selecting at least one Candidate amino acid Residue from the Variant amino acid Residues identified in (b) based on assessing the spatial proximity of the Variant Residues to the Target Residues and estimating and ranking their ability to influence the electrostatics and orientation of the Target Residues and to modulate the functional property of a Rubisco protein; (d) forming or producing at least one Candidate Mutant Rubisco protein in which said at least one Candidate amino acid Residue from the second Rubisco protein substitutes a corresponding residue in the first Rubisco protein; and (e) screening said at least one Candidate Mutant Rubisco protein formed or produced in (d) to identify a Rubisco protein having said improved functional property.
2. The method according to claim 1, wherein step (a) and step (b) are performed simultaneously.
3. The method according to claim 1, wherein step (d) comprises forming or producing at least one Candidate Mutant Rubisco protein by using the at least one Candidate amino acid Residue from the second Rubisco protein to substitute a corresponding residue in a homologous Rubisco protein or in homologous Rubisco proteins other than or in addition to the first Rubisco protein.
4. The method according to claim 1, wherein step (d) comprises forming or producing at least one Candidate Mutant Rubisco protein in which at least two Candidate amino acid Residues from the second Rubisco protein substitute corresponding residues in the first Rubisco protein and/or in a homologous Rubisco protein other than the first Rubisco protein.
5. The method according to claim 1, wherein the improved kinetic efficiency of the Rubisco protein is selected from the group consisting of improved carboxylation efficiency, improved k.sup.c.sub.cat, and improved specificity (S.sub.c/o).
6. The method according to claim 1, wherein the Target amino acid Residues of the Rubisco protein are in the N-terminal domain of the Rubisco Large subunit.
7. The method according to claim 1, wherein step (c) comprises selecting at least one Divergent Candidate amino acid Residue and/or at least one Alternative Candidate amino acid Residue, instead of at least one Candidate amino acid Residue, from the Variant amino acid Residues identified in (b), wherein said Divergent Candidate amino acid Residue is an amino acid residue that is selected from a plurality of Variant Residues and that is able to influence one or more Target Residues sterically and/or electrostatically, thereby influencing the function of the Rubisco protein mediated by the one or more Target Residues; and wherein said Alternate Candidate amino acid Residue is an alternative amino acid residue at the position of a Candidate Residue that is expressed in the second Rubisco protein, but is not an amino acid that is expressed in the consensus sequence of the second Rubisco protein, and that is able to influence one or more Target Residues sterically and/or electrostatically thereby influencing the function of the Rubisco protein mediated by the one or more Target Residues.
8. The method according to claim 1, wherein the first Rubisco protein is taken from species of green plants, flowering plants or cyanobacteria and the second Rubisco protein is taken from species of red algae.
9. The method according to claim 1, wherein step (b) is performed before step (a).
10. The method according to claim 1, wherein the at least one Target amino acid Residue is contained in a set of Rubisco proteins containing the first Rubisco protein.
11. The method according to claim 1, further comprising performing directed evolution on said Rubisco protein having said improved functional property and screening products thereof.
12. The method according to claim 1, wherein the altered specificity of the Rubisco protein for one or more substrates comprises an improved K.sub.c or improved specificity (S.sub.c/o).
Description
BRIEF DESCRIPTION OF THE FIGURES
(1) The present invention will now be described by way of example with reference to the accompanying figures.
DETAILED DESCRIPTION OF THE INVENTION
(30) The present invention provides a method of generating a protein with an improved functional property. The invention comprises a procedure to narrow the sequence space for conventional mutational testing by first defining mutants with one or, preferably, multiple mutations using mechanistic (computational), bioinformatic and database information (phylogenetic-specific/environment-specific sequence changes, kinetic data, 3-D structure and modelling). A summary of how the problem of reduction of sequence space may be solved by certain embodiments of the invention is illustrated for Rubisco in
(31) A first step of the method comprises the process of identifying Target amino acid Residues in a first protein. As described in more detail below, the process of identifying Target amino acid Residues may comprise active-site fragment QM calculations (step (i) using, for example, DFT methods), and hybrid QM/QM or QM/MM calculations (step (ii) using, for example, ONIOM methods), in conjunction with use of empirical data (for example, kinetic data and X-ray crystal structures). Molecular dynamics (MD) simulations (step (iii)), and calculations using combinations of QM, QM/QM or QM/MM, and MD methods (steps (iv) and (v)) may be used, respectively, for evaluating the stability of the predicted Candidate Mutant proteins, or for more detailed understanding of the roles of the Target Residues in the enzymic reaction. These Target amino acid Residues will usually be cornerstone residues for the proper functioning of the protein with respect to the functional property under investigation. It is a preferred feature of the invention that the improved functional property is achieved by modifying the chemical properties of these residues (by mutation of other residues) in order to improve, for example, their kinetic activity.
(32) Convention for Numbering of Rubisco Amino Acid Residues
(33) Throughout this specification, when identifying a residue of a Rubisco LSU by number, the residue numbering was based on the numbering of the residue from the amino acid sequence of spinach Rubisco LSU (SEQ ID NO: 17). This numbering convention was used for all residues identified in computational chemistry, in protein structures and in mutations. This does not create ambiguity as these numbers can be mapped to sequence numbers on alignments and from structural comparisons, and accordingly a given spinach residue number can be mapped with total confidence to a structurally equivalent cyanobacterial or red algal residue number.
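By way of illustration only, the mapping of spinach residue numbers onto the residue numbers of another Rubisco LSU may be sketched as follows, assuming that a pairwise alignment of the two sequences is already available. The function name, the gap character and the toy aligned fragments are hypothetical and are not taken from the sequence listing.

```python
def map_reference_numbering(ref_aligned: str, other_aligned: str, gap: str = "-") -> dict:
    """Map 1-based residue numbers of the reference sequence (e.g. spinach)
    to 1-based residue numbers of a second sequence, using a pairwise alignment.
    Positions aligned to a gap in either sequence are left out of the mapping."""
    if len(ref_aligned) != len(other_aligned):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    ref_pos = 0
    other_pos = 0
    for ref_char, other_char in zip(ref_aligned, other_aligned):
        if ref_char != gap:
            ref_pos += 1
        if other_char != gap:
            other_pos += 1
        if ref_char != gap and other_char != gap:
            mapping[ref_pos] = other_pos
    return mapping


# Toy aligned fragments (hypothetical, not real Rubisco sequences).
toy_map = map_reference_numbering("MSPQ-TET", "M-PQATET")
print(toy_map)
```

In this sketch a reference residue that faces a gap has no entry in the mapping, which signals that no structurally equivalent residue was found by the alignment alone and that a structural comparison would be needed.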
(34) Sequence Listings
(35) Table 1 provides a summary of the naturally occurring and 50% consensus Rubisco amino acid sequences discussed herein and which are provided in the computer-readable sequence listing, with the SEQ ID NOS as shown in the SEQ ID column. Where multiple sequences were used to produce a 50% consensus sequence, the total number of sequences involved in the consensus sequence creation is listed in brackets in the Description column. The database accession numbers provide unique identifiers for each of the sequences which were used, including those sequences which were considered in the creation of consensus sequences. The numbers of the figures in which the sequences are used in alignments are also given in the Figure Nos column.
(36) TABLE 1. Database accession numbers for Rubisco LSU sequences used in FIGS. 9, 10, 11 and 18. The Sequence ID numbers correspond to those in the computer-readable sequence listing.
SEQ ID NO | Description | FIG. Nos | Database accession number(s)
1 | Protista glaucophyta | 9 | P24312
2 | Protista rhodophyta (9) | 9, 10, 11, 18 | BAA75676 AAB17222 BAA75796 AAR13681 ABU53651 BAE78417 CAB58236 AAD04746 BAE78409
3 | Eubacteria cyanobacteria (11) | 9, 10, 11, 18 | P00879 400856410 4797620 402041420 5199140 P27568 403124030 Q8DIS5 4296100 403238160 P00880
4 | Plantae anthocerotophyta | 9 | Q31795
5 | Plantae bryophyta (28) | 9 | Q95G53 Q76GQ2 Q5TM96 Q75W61 Q76GQ0 Q50L57 Q5TMB1 Q75W60 Q76HL3 Q5TM93 Q9BB41 Q95G63 Q94N80 Q5TMA1 Q9TM58 Q75VP7 Q8HW62 Q8SN97 Q95G62 Q9TM63 Q5TM99 Q5TMA5 Q9GIF5 Q5TMA3 Q5TM97 Q9GGM2 Q5TM95 NP_904194
6 | Plantae charophyta (4) | 9 | Q8SN66 P48716 Q32RY7 Q32RQ1
7 | Plantae chlorophyta (4) | 9 | NP_958405 BAE48225 AAD00447 BAC06367
8 | Plantae coniferophyta | 9 | P41621
9 | Plantae equisetophyta | 9 | P48702
10 | Plantae gnetophyta | 9 | Q9THI3
11 | Plantae magnoliophyta (134) | 9, 10, 11, 18 | Q3T5C7 Q5EKM0 Q95EI0 Q06021 Q31857 Q9MRW9 Q3V6P6 Q9GDM8 Q3V6M3 Q31670 O63085 Q3T5G3 Q51221 Q06022 Q37167 AAF78948 Q6R615 Q3T5C1 P48688 Q06023 P48690 Q75VD8 Q42664 Q7YKF9 Q9XQE3 Q9XPK2 Q8WJD8 P48693 Q5EKL6 O63123 Q8SLM3 Q7YKF8 CAA57001 Q42674 P92255 Q98530 Q9XQB9 Q6R613 Q32072 Q3T575 Q9XQE7 Q5EKM4 Q8WLJ3 Q95F13 Q3L237 Q5EKM2 P92287 Q37319 Q33449 Q9ZT30 Q32188 Q9SB16 Q9BBC7 P19161 P48703 CAB08877 Q9GGC1 Q42828 Q4VWN7 Q5XLF7 Q95F23 Q3T5G1 Q95BC5 Q95F12 Q32488 Q5EKL5 AAK72524 Q01873 Q75VD6 Q9BBU1 Q32518 Q95EH6 Q9GHT6 Q8WIB0 Q32622 Q9MVF1 Q42916 O98611 Q5EKL8 Q3T5E7 Q32685 NP_054507 Q5C9P7 BAA00147 Q9GHS0 Q68RZ8 Q3T5F6 Q7YL87 Q37257 Q36849 Q95F15 Q95F24 Q95A48 Q32916 P04717 Q8WGU4 Q9XK53 Q9XQA7 Q32820 Q8ME88 Q5MB28 Q5EKL2 Q8M962 Q75VD3 Q98612 Q8WKT8 Q6R614 Q9MTS7 Q8WIA8 Q95F20 AAX44985 O62943 Q8WIC4 P48715 ABB90049 Q9XQ93 Q9GHN0 Q3T5E2 Q33064 Q37281 Q8WIC3 Q9XPK3 Q6R617 Q8LUX7 Q8WKR2 AAP92166 Q6USP5 Q3T5E4 P28459 Q95F10 P92364 CAA60294 Q75VD7 NP_054944
12 | Plantae pinophyta (2) | 9 | P26961 P26962
13 | Plantae pteridophyta (2) | 9 | Q85WR7 Q33015
14 | Galdieria partita | 18 | BAA75796
15 | Griffithsia monolis | 18 | ABU53651
16 | Synechococcus elongatus PCC6301 | 18 | P00880
17 | Spinach (Spinacia oleracea) | 18 | NP_054944
18 | Tobacco (Nicotiana tabacum) | 18 | NP_054507
19 | Rice (Oryza sativa) | 18 | BAA00147
20 | Soybean (Glycine max) | 18 | YP_538747
21 | Sugarcane (Saccharum officinarum) | 18 | BAD27301
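By way of illustration only, a 50% consensus sequence of the kind summarised in Table 1 might be derived from a set of pre-aligned sequences as sketched below. The specification does not state the exact consensus rule. The sketch assumes that a residue is reported at a column when it occurs in at least 50% of the sequences and that "X" is reported otherwise. The function name and the toy sequences are hypothetical.

```python
from collections import Counter


def consensus_sequence(aligned_seqs, threshold=0.5, unknown="X"):
    """Return a consensus string for equal-length aligned sequences.
    A column yields its most common residue if that residue reaches the
    threshold fraction; otherwise the 'unknown' symbol is emitted.
    Assumption: exact ties at the threshold go to the first residue seen."""
    if not aligned_seqs:
        raise ValueError("at least one sequence is required")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*aligned_seqs):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count / len(column) >= threshold else unknown)
    return "".join(result)


# Toy alignment (hypothetical fragments, not taken from the sequence listing).
toy_alignment = ["MKTAL", "MKSAI", "MRSAV", "MQSGL", "MKTCF"]
toy_consensus = consensus_sequence(toy_alignment)
print(toy_consensus)
```

The last column of the toy alignment has no residue reaching 50%, so the sketch reports "X" there rather than choosing a minority residue.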
Identification of Target Residues: Computational Mechanism
(37) Steps (i) and (ii) hereafter relating to the identification of Target amino acid Residues were performed using the GAUSSIAN program package, for ab initio QM and ONIOM calculations (Frisch et al., 2004), but there are several other proprietary or free-to-use programs available which might be used alternatively. Step (iii) uses the generally available AMBER program (Case et al., 2006) to perform protein MD simulations; this capability is also available in many other programs. Steps (iv) and (v) employ published theory, protocols and programs for enzyme mechanism simulations (Gready et al., 2006); the core semi-empirical QM/MM MD simulation methods (Cummins and Gready, 1997, 1998, 1999, 2003, 2005; Cummins et al., 2007) are implemented in the program MOPS (Cummins, 1996).
(38) (i) Active-Site Fragment-Complex QM Calculations
(39) The following description relates to calculations in respect of the Rubisco LSU, in which active-site residues are totally conserved between species. These calculations use a high-level ab initio QM method (B3LYP/6-31G(d,p)) to define the energetics and structures of the reaction species (substrate, transition-state (TS), intermediate, and product complexes) in the multi-step Rubisco reaction mechanism, as shown in
(40) (ii) ONIOM Hybrid QM/QM and QM/MM Calculations:
(41) These calculations define the perturbations to the energetics and structures of the reaction-pathway species that arise from the next-nearest neighbours of the active-site residues and beyond, and focused mainly on the gas-addition step. This was done using methods which apply a high-level, but computationally expensive, ab initio QM model to the system core (i.e. as in (i)) and a less expensive semi-empirical QM or MM model to an extended region.
(42) The calculations (QM/QM and QM/MM) were performed at several stages using the ONIOM module in GAUSSIAN 03. The ONIOM QM/QM calculations used a model of a high-level ab initio QM core layer of 93 atoms. For study of the starting point at the gas-addition step, the QM core layer comprises the magnesium atom (Mg.sup.2+), enediolate of RuBP (to compute the structure and energies of subsequent reaction species the corresponding RuBP-derived chemical species were used), GLU60, ASN123, LYS175, LYS177, carbamylated LYS201, ASP203, GLU204, HIS294 and LYS334. The core layer is surrounded by a further 711 atoms in the outer layer computed at the PM3 (semi-empirical QM) level, which comprises amino acid residues up to 12 Å from the magnesium atom. The starting co-ordinates were taken from the X-ray structure of the spinach Rubisco-2CABP complex (PDB entry 8RUC). This model is illustrated in
(43) (iii) Molecular Dynamics (MD) Simulations of Protein Complexes
(44) These simulations assessed whether the protein structure of grafted Rubisco Candidate Mutants could accommodate the changed residues, i.e. whether the mutant protein structure was conformationally stable or whether it tended to unravel. MD simulations are particularly useful for multiply-grafted mutants, and provide a global stability screening test to complement the electronic tests on the chemical mechanism from (ii).
(45) These calculations were performed with the AMBER8 or AMBER9 program package (Case et al., 2006), but other protein MD simulation packages (e.g. GROMACS) could be used to obtain similar results.
(46) (iv) Multiple ONIOM Hybrid QM/QM or QM/MM Calculations of Different Sampled Conformational States of Complexes
(47) In this method a series of calculations is undertaken using coordinates for protein complexes (e.g. for different reaction steps) taken from snapshots of trajectories of QM/MM MD simulations, as described by Gready et al. (2006). These calculations allowed a more detailed examination of features of the catalytic pathway, namely the effects of protein conformational flexibility on the enzyme-complex geometries and the activation and reaction energies.
(48) (v) Generation of the Full Reaction Free Energy Surfaces for the Gas-Addition Reactions
(49) A complete statistical ensemble (conformational average) of enzyme states over the complete course of a reaction step may be generated from semi-empirical QM/MM MD simulations for a grid of points defined by the reaction co-ordinates (a free energy hypersurface). A more accurate free-energy surface may then be generated at ab initio QM level by ONIOM QM/MM calculations using multiple configurations, for example up to 120, sampled from the points on the semi-empirical QM/MM reaction hypersurface (Gready et al., 2006; Cummins et al., 2007). These enzymic free energy surfaces provide reaction and activation free energies that may be compared directly with experimental data, such as experimentally measured kinetic constants, and also may be used to calculate differences in the reaction and activation free energies between wild type and mutants.
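By way of illustration only, a computed difference in activation free energy between a mutant and the wild type may be related to a ratio of rate constants as sketched below. The sketch assumes simple transition-state theory with an identical pre-exponential factor for both enzymes, which is an assumption made for illustration and is not a protocol stated in this specification. The function name and the example value are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def rate_ratio_from_barrier_change(ddg_kcal_per_mol, temperature_k=298.15):
    """Estimate k_mutant / k_wild_type from the change in activation free energy,
    ddG = dG(mutant) - dG(wild type), using k proportional to exp(-dG / RT).
    A negative ddG (lower barrier in the mutant) gives a ratio above 1."""
    return math.exp(-ddg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))


# Hypothetical example: the mutant barrier is 1 kcal/mol lower than wild type.
example_ratio = rate_ratio_from_barrier_change(-1.0)
print(round(example_ratio, 2))
```

Under these assumptions a lowering of the barrier by about 1 kcal/mol at room temperature corresponds to a rate increase of roughly five-fold, which indicates the accuracy required of the computed surfaces if such comparisons are to be meaningful.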
(50) (vi) Definition of Reaction Mechanism and Target Residues
(51) Based on the results of these computations, the inventors were able to deduce a mechanism for the entire sequence of reactions in the carboxylase catalysis, and to define precise roles for the active-site residues, singly and in concert (Kannappan and Gready, 2008). From the QM fragment calculations, a pair of key amino acid residues was identified for each reaction step, one acting as a base and the other acting as an acid. In particular, the pair HIS294 and LYS334 was identified for the gas-addition step.
(52) For the Rubisco carboxylase reaction, the starting point is the Rubisco complex with the enediolate form of RuBP bound to the active site and the CO.sub.2 molecule held at a van der Waals interaction distance to the C2 carbon of the enediolate. This state is represented as I in
(53) In the gas-addition step, HIS294 acts as a base to remove a proton completely from the O3 atom; this transfers a partial negative charge to the C2 carbon and enables it to form a covalent bond with the carbon atom of CO.sub.2, and also transfers the negative charge to the nascent carboxylate group. LYS334, which is positively charged, helps in stabilizing this negative charge developing on the nascent carboxylate group. These features can be seen in the detailed structures for I-III in
(54) Modifying the properties of these two residues, for example, by sterically altering the orientation/distance of their interactions with the enediolate substrate or β-keto intermediate or electronically altering the charge on the atoms interacting with the enediolate substrate or β-keto intermediate may affect the energetics of the gas-addition step. HIS294 and LYS334 are, thus, identified as Target Residues, broadly defined to be residues predicted to have a significant effect on the reaction mechanism and energetics. HIS294 and LYS334 are in the C-terminal domain, are spatially separated, and affect different parts of the enediolate substrate or β-keto intermediate. Hence, the amino acids which may affect their properties have been classified into different regions; Region 2 for His294 and Region 3 for Lys334, as shown in
(55) Although residue ASN123 was not included in the FM20 active-site fragment model for the QM calculations, examination of crystal structures and the preliminary QM/QM calculations suggested that it also is involved in stabilizing the charge on the nascent carboxylate group added at C2. Furthermore, examination of crystal structures showed residues E60 and Y20 are positioned to directly alter the charge on LYS334 (i.e. the charge/orientation of LYS334 can be altered by manipulating E60 and Y20), and the C2-carboxylate group of the intermediate, 2C3KABP. Thus, E60, Y20 and N123 were also identified as Target Residues. These three residues are in the N-terminal domain of the LSU and, thus, amino acids which may affect their properties were classified as belonging to a different region (Region 1) from those of HIS294 and LYS334 (
(56) In summary, the above method comprises a full suite of computational methods for investigating mechanistic, energetic and stability issues at global or more detailed levels for the carboxylation and oxygenation steps for wild type and any predicted Candidate Mutant of Rubisco. In this manner, it was possible to identify one or more Target Residues to act as the focus for the phylogenetic grafting.
(57) Protein Comparisons: Phylogenetic Grafting
(58) In its broadest form, the method described herein also comprises the comparison of at least one second protein with at least the first protein. The second protein may originate from the same or a different phylogenetic branch as the first protein. The process of comparison entails the identification of at least one Variant amino acid Residue between the first protein and the second protein. A plurality of Variant Residues of the second protein act as a pool of different specific amino acid residue identities which may be grafted onto the first protein in an attempt to improve the functional property of the first protein mediated by the Target Residues.
(59) Taking Rubisco as an example, phylogenetic branch-specific changes in the Rubisco amino acid sequence, such as changes in Rubiscos from phylogenetic groups of different evolutionary lineages or in Rubiscos which express environment-specific changes, represent possible partial optimizations of the Rubiscos' catalytic efficiency. A strategy, termed phylogenetic grafting, was developed to identify the key residues which represent these partial evolutionary solutions and to selectively transplant these residues into a host Rubisco, such as a Rubisco from Synechococcus sp., by changing the specific host residues to those of the donor Rubisco or donor group of Rubiscos with one or more improved (or preferred) kinetic features, with a view to producing a host Rubisco with these improved kinetic features.
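By way of illustration only, the in silico transplanting of donor residues into a host sequence may be sketched as follows. The function name, the toy host fragment and the substitutions are hypothetical. Positions in this sketch are 1-based positions in the host sequence itself, so residue numbers expressed in spinach numbering would first have to be mapped onto the host sequence.

```python
def graft_residues(host_seq, substitutions):
    """Return a mutant sequence in which each (position, host_residue, donor_residue)
    substitution has been applied. Positions are 1-based in the host sequence.
    The expected host residue is checked so that a numbering error is caught early."""
    residues = list(host_seq)
    for position, host_residue, donor_residue in substitutions:
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the host sequence")
        if residues[position - 1] != host_residue:
            raise ValueError(
                f"expected {host_residue} at position {position}, "
                f"found {residues[position - 1]}"
            )
        residues[position - 1] = donor_residue
    return "".join(residues)


# Toy host fragment and a doubly grafted mutant (hypothetical values).
toy_host = "MSPQTETKAS"
toy_mutant = graft_residues(toy_host, [(4, "Q", "K"), (7, "T", "V")])
print(toy_mutant)
```

Checking the expected host residue before each substitution is a design choice of the sketch; it guards against applying a substitution at a misnumbered position when several residues are grafted together.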
(60) Partial evolutionary solutions described above were identified by a procedure of combining the results of the computational studies (the Target Residues), as shown in
(61) The integration of the results of the computational studies with those of the phylogenetic analysis to identify a specific subset of Variant Residues (i.e. the Candidate Residues) allows differentiation between residues which may affect a functional property, for example, the efficiency of the gas-addition step, from other characteristic (consensus) conserved sequence changes between phylogenetic branches, which may represent, for example, neutral phylogenetic drift or a branch-specific physiological role. Taking the example of a Rubisco enzyme, a branch-specific physiological role may include folding and assembly of the protein, including interactions with the small subunit, or protein stability.
(62) (i) Identification of Variant Residues by Phylogenetic Analyses
(63) The combined use of the computationally-deduced mechanisms to identify Target Residues with sequence conservation and phylogenetic information in order to identify the Variant Residues is illustrated by the following discussion of specificity factors of Rubisco. The very high specificity factors of red-algal Rubiscos may be attributed to residues which are in common between cyanobacterial and flowering plant Rubiscos, but differ in red-algal Rubiscos. Such residues are defined herein as Variant Residues. If single Variant Residues or a plurality of Variant Residues which act as specificity-determining factors in red-algal Rubiscos are identified and selectively incorporated into flowering plant/cyanobacterial Rubiscos, then a Rubisco which is physiologically active in the host organism may be produced which has higher specificity for CO.sub.2 than the native enzyme.
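By way of illustration only, the identification of Variant Residues as defined above (residues common to two host consensus sequences but different in the donor consensus sequence) may be sketched as follows. The sketch assumes that the three consensus sequences are already aligned to equal length. The function name, the symbols "-" and "X" for gaps and undetermined positions, and the toy fragments are hypothetical. Column numbers are alignment columns and not spinach residue numbers.

```python
def find_variant_positions(host_a, host_b, donor, gap="-", unknown="X"):
    """Return (column, host_residue, donor_residue) for every 1-based alignment
    column where the two host consensus sequences agree with each other but
    differ from the donor consensus. Gapped or undetermined columns are skipped."""
    if not (len(host_a) == len(host_b) == len(donor)):
        raise ValueError("aligned sequences must have the same length")
    variants = []
    for column, (a, b, d) in enumerate(zip(host_a, host_b, donor), start=1):
        if a != b:
            continue  # hosts disagree, so the column is not a Variant Residue
        if a in (gap, unknown) or d in (gap, unknown):
            continue  # no defined residue to compare
        if a != d:
            variants.append((column, a, d))
    return variants


# Toy consensus fragments (hypothetical): flowering plant, cyanobacterial, red algal.
toy_plant = "MSPQTETKAS"
toy_cyano = "MPPQTETKAG"
toy_red = "MSPKTEVKAG"
toy_variants = find_variant_positions(toy_plant, toy_cyano, toy_red)
print(toy_variants)
```

Columns 2 and 10 of the toy alignment are not reported because the two host sequences themselves differ there, which corresponds to the requirement that the residue be common to the host branches.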
(64) First-shell residues, i.e. those residues directly coordinating to the reaction centre (Glu60, Asn123, Lys175, Lys177, Kcx201, Asp203, Glu204, His294 and Lys334), are totally conserved among Rubiscos. This conservation is illustrated in
(65) However, residues in the second and subsequent shells surrounding the reaction centre show variation among the main Rubisco branches of flowering plants, red algae and cyanobacteria. Red algae show the greatest specificity for CO.sub.2, as identified by the CO.sub.2/O.sub.2 ratio of 160 compared with 80 for green plants and 40 for cyanobacteria. The sequence variation among flowering plants, red algae and cyanobacteria is more clearly illustrated in the alignment in
(66) In this example, 134 Variant Residues were identified from the Rubisco LSU, amongst which are distributed the residues which are responsible for the partial evolutionary solution for increased specificity which is exhibited by red algal Rubiscos. These are shown as grey-shaded residues in the alignment in
(67) (ii)(a) Identification of Candidate Residues
(68) Using methods described below, specific Variant Residues were identified which have the potential to affect the gas-addition step of the reaction catalysed by Rubisco, and these were termed Candidate Residues. This allowed conserved changes between phylogenetic branches or sub-branches/sub-species, which may represent neutral phylogenetic drift or which may have a branch-specific physiological role, such as in the stability, folding or assembly of the Rubisco LSU, to be disregarded.
(69) Many of the Variant Residues may not contribute to the improved property of the Rubisco and consequently may not be part of the evolutionary solution for, in this example, increased CO.sub.2 specificity, but rather are silent mutations or mutations relevant to other enzyme properties, such as the folding and assembly of the protein, or its stability, in the cell. In order to identify the Variant Residues most likely to be part of the evolutionary solution for the improved property, in this example increased CO.sub.2 specificity, the mechanistic insights obtained from the QM calculations were employed to select Candidate Residues from the plurality of Variant Residues. This procedure was based on the hypothesis that Variant Residues which can influence the functionality of the Target Residues identified by the computational chemistry step as involved in the gas-addition step in Rubisco, form a part of the evolutionary solution. This was the primary criterion used to select from the Variant Residues to obtain a subset of residues here called Candidate Residues.
(70) In general, the selection process was based on assessing the spatial proximity of the Variant Residues to the Target Residues and estimating and ranking their ability to influence the electrostatics and orientation of the Target Residues. Selection may utilise visual screening of crystallographic structures using a molecular modelling and visualization program package such as Accelrys Discovery Studio v2.0 (Accelrys Software Inc., San Diego, Calif., 2007), although other similar modelling packages could be used. Standard chemical concepts for intermolecular interactions, such as charge-charge electrostatic pairing, typical van der Waals and hydrogen-bonding distances, and space-filling models for amino acid sidechains may be used in an initial scan of residues for selection. The procedure may also be systematized, for example, by mapping all the atom-to-atom electrostatic and hydrophobic interactions of each of the Variant Residues with all other amino acid residues that are within 3-5 Å in distance and excluding those interactions which are equivalent in the sequences of the first and second protein.
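By way of illustration only, the spatial proximity part of such a systematized procedure may be sketched as a distance screen between atoms of Variant Residues and atoms of Target Residues. The sketch covers only the distance criterion and does not classify interactions as electrostatic, hydrophobic or equivalent. The function names, the 5 Å default cutoff and the toy coordinates are hypothetical; in practice the coordinates would be read from a crystal structure.

```python
import math


def minimum_distance(atoms_a, atoms_b):
    """Smallest atom-to-atom distance (in the units of the coordinates)
    between two residues, each given as a list of (x, y, z) tuples."""
    return min(math.dist(p, q) for p in atoms_a for q in atoms_b)


def variants_near_targets(variant_atoms, target_atoms, cutoff=5.0):
    """Return {variant_name: [(target_name, distance), ...]} for every
    Variant Residue with at least one atom within the cutoff of a Target Residue."""
    hits = {}
    for variant_name, v_coords in variant_atoms.items():
        for target_name, t_coords in target_atoms.items():
            distance = minimum_distance(v_coords, t_coords)
            if distance <= cutoff:
                hits.setdefault(variant_name, []).append((target_name, round(distance, 2)))
    return hits


# Toy coordinates in angstroms (hypothetical, not taken from any structure).
toy_targets = {"LYS334": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]}
toy_variant_atoms = {
    "variant_near": [(4.5, 0.0, 0.0)],
    "variant_far": [(20.0, 0.0, 0.0)],
}
toy_hits = variants_near_targets(toy_variant_atoms, toy_targets)
print(toy_hits)
```

A screen of this kind would only shortlist residues for the subsequent comparison of interaction patterns between the first and second proteins; it does not by itself establish that a residue is a Candidate Residue.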
(71) Examples of such equivalent interactions include backbone-backbone interactions which are generally, but not always, unaltered by mutation. Interactions were also considered equivalent, for example, if a hydrophobic interaction in the sequence of one protein involved an or aliphatic carbon of the side chain of a Variant Residue with an atom of a non-Variant Residue, and in the sequence of the second protein the amino acid Variant Residue, while different from that in the sequence of the other protein, also had an or aliphatic carbon in the side chain interacting with the same non-Variant Residue as in the first protein sequence.
(72) Interactions of methyl groups in the amino acid side chains, such as those in valine, leucine and isoleucine were considered equivalent if only the particular methyl group was involved in the interaction with a non-Variant Residue. Hydrogen bonds formed by the carboxylate groups of aspartate and glutamate residues were also considered equivalent if the corresponding hydrogen-bonding distances were similar.
(73) After screening the Variant Residues for differences in interaction patterns between the sequences of the first and second protein, only those Variant Residues which had the potential to affect an identified Target Residue through changed interaction patterns were retained. The potential of a Variant Residue to affect a Target Residue was recognized by the interaction of a Variant Residue with the Target Residue or with amino acid residues adjacent to Target Residues or with amino acid residues in the secondary structural unit harbouring the Target Residue. Even Variant Residues that are parts of loops, turns or unstructured strands, but which are connected to the secondary structural units harbouring the Target Residue, have the potential to alter the orientation of a Target Residue by assisting in repositioning of the secondary structural units through changed interactions. The selected Variant Residues, which had one or more variant interactions and the potential to influence Target Residues constituted the set of Candidate Residues. The 20 Candidate Residues identified as able to influence the Target Residues in Region 1, i.e. ASN123, Glu60 and TYR20, are identified by reverse shading in the alignment in
(74) (ii)(b) Identification of Alternative Candidate Residues and Divergent Candidate Residues (ACRs and DVRs)
(75) The above-described scheme for the selection of a Candidate Residue relies on a selection criterion that the Candidate Residue is present in a consensus sequence of a second protein exhibiting an improved or desirable property, while being different from the corresponding residue in a plurality of consensus sequences of first proteins not exhibiting the improved or desirable property, and where the consensus sequences of the first proteins share the same residue. Thus, using the example of CO.sub.2 specificity in Rubisco, the Candidate Residue will be a residue which is present in the red algae consensus sequence, and which is different from the residue common to both the consensus sequences of flowering plants and cyanobacteria.
(76) Other partial evolutionary solutions may also be expressed in residues found in the pool of Variant Residues. These have been termed Alternative Candidate Residues and Divergent Candidate Residues. These partial solutions found in nature may be used to extend the process of selecting Candidate Mutants, to produce an expanded pool of alternative or supplementary residues with which to influence Target Residues.
(77) For example, where the selected residue of the second protein is not the majority residue found in the consensus sequence of the second protein but instead is found at lower frequency in a plurality of the second proteins, while still different from the residues present in a plurality of consensus sequences of first proteins, and where the consensus sequences of the first proteins share the same residue, the residue is termed an Alternative Candidate Residue (ACR). The ACR represents an alternative to the consensus residue at a Candidate Residue position. As for a Candidate Residue, an ACR must still satisfy the selection criteria related to influencing a relevant Target Residue. A purpose for introducing ACRs may be to provide residues for grafting which are suspected to be associated with a greater improvement of the desirable property than the majority residue of the consensus sequence of the second protein.
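As a rough illustration of the ACR definition, the following Python sketch lists minority residues of the second-protein group that differ from the residue shared by the first-protein consensus sequences. The function name and the frequencies in the example are hypothetical, and the requirement that an ACR must influence a Target Residue is not modelled.

```python
def alternative_candidate_residues(
    second_freqs: dict[str, float], first_consensuses: list[str]
) -> list[str]:
    """Return sequence-level ACRs at one alignment position (hypothetical helper).

    second_freqs maps each residue seen in the second-protein group to its
    percentage frequency; first_consensuses holds the consensus residue of
    each first-protein group at the same position.
    """
    shared = set(first_consensuses)
    if len(shared) != 1 or not second_freqs:
        return []  # the first-protein groups must agree with one another
    majority = max(second_freqs, key=second_freqs.get)
    return sorted(
        res
        for res, pct in second_freqs.items()
        if res != majority and pct > 0 and res not in shared
    )


# Illustrative frequencies only: majority V, minority I, first groups share E
print(alternative_candidate_residues({"V": 56, "I": 44}, ["E", "E"]))
```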
(78) Thus, using the example of CO.sub.2 specificity in Rubisco, the Alternative Candidate Residue will be a residue which is expressed by at least one of the red algae which contribute to the red algae consensus sequence but which is not the majority residue which is contained in the consensus sequence, and which is different from the residue common to both the consensus sequences of green plants and cyanobacteria. The red algal species Griffithsia monilis has a higher catalytic rate compared with other red algal species while maintaining the high specificity typical of red algae (Whitney et al., 2001), and hence the Griffithsia monilis sequence may be used as a source of ACRs. The variation of the Rubisco N-terminal domain sequence between G. monilis and that of the red algae consensus sequence and a typical (reference) red algal species, Galdieria partita, is illustrated in
(79) A Co-variant Residue (CvR) may be identified in the sequence of a second protein from a particular species, as being in the vicinity of an Alternative Candidate Residue (ACR) and showing complementary variation to the ACR. This variation at the position of the CvR, which is not present in the consensus sequence for the second protein, may be suspected to reflect complementary changes in the structural and/or electrostatic properties of the ACR and CvR. These complementary changes may constitute a partial evolutionary solution. Identification of CvRs provides a means to identify auxiliary residue positions which may be mutated in the first protein to better accommodate changes made from transplanting ACRs.
(80) Yet another partial evolutionary solution may also be expressed in residues found in the pool of Variant Residues where residues from at least three phylogenetic branches are examined and the selected residue is predicted to influence at least one Target Residue and is found in the consensus sequence of the second protein from one branch while being different from the residues present in the consensus sequences of at least two other branches, and where the consensus sequences of the first proteins are also different at the same position. Such a residue is termed a Divergent Candidate Residue (DCR). As for a Candidate Residue, a DCR must still satisfy the selection criteria related to influencing a relevant Target Residue. A purpose for introducing a DCR may be to provide residues for grafting which are suspected of being associated with the improved or desirable property, but which may also be expected to produce a greater variation of the expressed properties when substituted into different first (host) proteins from the at least two phylogenetic branches than would be expected if the substitution was with a Candidate Residue.
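The sequence-level part of the DCR definition can be sketched as follows. This is a minimal Python illustration with a hypothetical function name; it reads "at least two other branches that also differ from each other" as requiring at least two distinct consensus residues among the other branches, and it does not check the influence on a Target Residue. The example residues are those listed for positions 117 and 23 in Table 2.

```python
def is_divergent_candidate_residue(
    second_consensus: str, other_consensuses: list[str]
) -> bool:
    """Sequence-level Divergent Candidate Residue test (hypothetical helper).

    True when the second-protein consensus residue differs from the consensus
    residue of every other branch, and those other branches (at least two)
    also differ among themselves at this position.
    """
    return (
        len(other_consensuses) >= 2
        and len(set(other_consensuses)) >= 2
        and second_consensus not in other_consensuses
    )


# Position 117: cyanobacteria L, flowering plants F, red algae T
print(is_divergent_candidate_residue("T", ["L", "F"]))
# Position 23: cyanobacteria T, flowering plants T, red algae G (a CR, not a DCR)
print(is_divergent_candidate_residue("G", ["T", "T"]))
```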
(81) Thus, using the example of CO.sub.2 specificity in Rubisco, the Divergent Candidate Residue may be a residue which is expressed in the red algae consensus sequence, but which is different from the residue expressed in the consensus sequences of flowering plants and in the consensus sequence of cyanobacteria, while at the same time residues of the flowering plant and cyanobacteria consensus sequences also differ at this position. Six Divergent Candidate Residues identified as able to influence the Target Residues in Region 1 (ASN123, Glu60 and TYR20) are identified by grey shading in the alignment in
(82) TABLE 2 Residue composition table (in percentage) for cyanobacteria, flowering plants and red algae at current 20 Candidate Residues (including three from the C-terminal domain of partner LSU) and 6 Divergent Candidate Residue sites. Number of sequences used is shown in brackets. The DCRs are at the bottom of the table (36, 86, 116, 117, 138, 140).
Columns: Res No; Cyanobacteria (11) Res, %; Flowering plants (134) Res, %; Red algae (9) Res, %; Substitution in Synechococcus PCC6301 by residue in red algae, flowering plants, cyanobacteria; Sub-region
18 K 73 K 100 I 67 K18I K18Q 1A Q 27 33
19 D 81 D 92 P 67 D19P D19E D19E 1A E 18 E 8 33
23 T 100 T 99 G 100 T23G T23N 1A N 1
25 Y 82 Y 98 W 100 Y25W Y25H W 18 H 2
51 E 91 E 100 V 56 D51V D51E D 9 I 44 D51I
54 G 55 G 99 S 78 G54A G54R 1C A 45 R 1 A 22 G54S
59 A 100 A 100 G 100 A59G
64 G 100 G 100 A 100 G64A
68 T 100 T 99 V 100 T68V 1A A 1
81 K 100 K 100 R 100 K81R 1A
84 C 100 C 100 A 89 C84A 1C C 11
87 I 82 I 98 V 100 I87V I87L 1C L 9 L 2 V 9
88 E 100 E 100 D 78 E88D 1C E 22
104 P 100 P 100 D 89 P104D 1A E 11 P104E
114 T 100 T 100 A 100 T114A
118 T 100 T 100 A 100 T118A
121 V 100 V 100 I 100 V121I 1B
271 T 100 T 100 V 100 T271V
297 M 100 M 100 G 100 M297G
300 V 100 V 100 T 100 V300T
36 L 73 I 100 V 89 L36V I 18 I 11 L36I V 9
86 (see note) D 63 H 57 K 89 H86K H86G H86D 1C R 18 G 20 R 11 H86R H86D H86R D 16
116 V 64 M 99 L 100 I116L I116M I116V 1B M 18 L 1 I 18
117 L 100 F 99 T 100 L117T L117F 1B L 1
138 I 82 L 100 M 100 I138M I138L I138L 1B L 18
140 F 82 I 97 L 56 F140L F140V 1B I 18 V 2 I 44 F140I F140S S 1
Note: For residue 86, only those alternate residues with more than 5% occurrence are displayed for cyanobacteria and flowering plants.
(83) As shown in Table 2, most CRs have an alternative residue in one or more of the 3 groups (cyanobacteria, flowering plants, red algae) although in most cases the consensus residue occurs with >70% frequency. Exceptions are 59, 64, 81, 114, 118 and 121, and all three CRs (271, 297 and 300) in the C-terminal domain segments of the adjacent LSU (Sub-region 1B). However, the magnitude of this variation is of little significance as the sequence sets for each group are not phylogenetically balanced. Rather, this variation should only be taken as an approximate indication of the degree of conformity with the CR definition. The strongest Mutant #4 (T23G/K81R) (see Examples 7 and 11) showed almost no variation. Similarly the strongest mutant of Sub-region 1B, Mutant #8 (V121I/M297G/V300T) (see Example 7), also showed no variation of the 3 CRs.
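A residue composition of the kind tabulated in Table 2 can be derived from one column of a multiple sequence alignment. The Python sketch below is an assumption about how such percentages might be computed, not the inventors' procedure: the function name is hypothetical, gaps are assumed to be written as "-" and are excluded from the total, and the example column is made up.

```python
from collections import Counter


def residue_composition(column: str) -> dict[str, int]:
    """Percentage of each residue in one alignment column (hypothetical helper).

    Gap characters ("-") are ignored. Keys are ordered from most to least
    frequent, so the first key is the consensus residue of the group.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return {}
    counts = Counter(residues)
    return {res: round(100 * n / len(residues)) for res, n in counts.most_common()}


# Made-up column from nine aligned sequences of one phylogenetic group
composition = residue_composition("IIIIIIQQQ")
consensus = next(iter(composition))
print(composition, consensus)
```

Because the sequence sets are not phylogenetically balanced, percentages obtained this way are only a rough guide to how closely a position conforms to the CR definition.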
(84) (iii) Grouping Candidate Residues, Alternative Candidate Residues and Divergent Candidate Residues into Candidate Mutants
(85) More than one Candidate Residue (CR), Alternative Candidate Residue (ACR) or Divergent Candidate Residue (DCR), or combinations thereof, may contribute to changes in a single contiguous interaction region between the sequences of the first and second protein. For example, a single non-Variant amino acid Residue may interact with two CRs, two ACRs, two DCRs, or a combination of two residues derived from two of these groups, such that both of the changed interactions may affect the same Target Residue. The two CRs, two ACRs, two DCRs, or combination may then be grouped into a single Candidate Mutant. Similarly, a given CR, ACR or DCR may contribute to changes in more than one contiguous interaction region affecting a Target Residue. Thus, there may be other CRs and/or ACRs and/or DCRs with which a given CR, ACR or DCR may be grouped to form other Candidate Mutants. These grouped CRs, ACRs, DCRs or combinations thereof may contribute to a proposed Candidate Mutant enzyme in which each group of CRs, ACRs, DCRs or combination thereof is grafted onto the host, replacing the corresponding residues in the sequence of the first (host) protein.
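Grafting a grouped set of residues onto a host sequence in silico amounts to applying a list of point substitutions. The Python sketch below assumes the slash-separated notation used for the mutants in this document (for example T23G/K81R); the function name and the six-residue host are hypothetical, and positions index the toy string directly, whereas a real host would need a mapping from spinach numbering to its own sequence positions.

```python
import re

_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z])")


def graft(host: str, mutant: str) -> str:
    """Apply substitutions such as 'T3G/K6R' to a host sequence (hypothetical helper).

    Positions are 1-based indices into `host`. A ValueError is raised if a
    token is malformed or the host residue does not match the substitution.
    """
    seq = list(host)
    for token in mutant.split("/"):
        match = _SUBSTITUTION.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= pos <= len(seq) or seq[pos - 1] != old:
            raise ValueError(f"host does not have {old} at position {pos}")
        seq[pos - 1] = new
    return "".join(seq)


# Toy host sequence, not a real Rubisco fragment
print(graft("MKTAYK", "T3G/K6R"))  # MKGAYR
```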
(86) (iv) Combining Candidate Mutants to Produce a Cumulative Effect
(87) If the residue mutations from two or more Candidate Mutants (groups of one or more CRs and/or ACRs and/or DCRs) affect the same secondary structural unit harbouring a Target Residue and the mutations of each of these Candidate Mutants are expected to act in a coordinated fashion in changing the interaction, then the Candidate Mutants can be further combined to form a single combined Candidate Mutant. For example, if a Target Residue is part of a helix with one Candidate Mutant interacting with the N-terminal end of the helix and another Candidate Mutant interacting with the C-terminal end of the helix, with both interactions involving addition of a strong hydrophobic interaction in going from the sequence of the first protein to the sequence of the second protein, then a combined Candidate Mutant with both the Candidate Mutants grafted into the sequence of the first protein would be predicted to show concerted effects in repositioning the helix.
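At the bookkeeping level, combining Candidate Mutants is a merge of their substitution lists. The Python sketch below is illustrative only: the function name and the example substitutions are hypothetical, and the decision that two mutants act in a coordinated fashion is a structural judgement that the code does not attempt.

```python
def combine_mutants(*mutants: str) -> str:
    """Merge slash-separated substitution lists into one combined mutant.

    Hypothetical helper: duplicate substitutions are kept once, and two
    different substitutions at the same position raise a ValueError.
    """
    by_position: dict[int, str] = {}
    for mutant in mutants:
        for token in mutant.split("/"):
            position = int(token[1:-1])  # e.g. 'A10V' -> 10
            if position in by_position and by_position[position] != token:
                raise ValueError(
                    f"conflict at position {position}: {by_position[position]} vs {token}"
                )
            by_position[position] = token
    return "/".join(by_position[p] for p in sorted(by_position))


# Hypothetical mutants acting on the two ends of the same helix
print(combine_mutants("A10V/L14I", "L14I/G30A"))  # A10V/L14I/G30A
```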
(88) (v) Ranking Mutants
(89) Despite the large reduction in the number of residues for consideration for mutation that may be effected by the initial selection of the Variant Residues against Target Residues to produce the list of Candidate Residues (CRs), and, optionally, Alternative Candidate Residues (ACRs) and Divergent Candidate Residues (DCRs), the number of predicted Candidate Mutants that could result from grouping the CRs, ACRs and DCRs could still be large. Consequently, it is useful to rank the Candidate Mutants based on their predicted potential to show enhancement of the desired functional property. The higher ranked Candidate Mutants may then be the first choice for further computational and experimental assessments, thus minimizing effort and cost. As illustrated in
(vi) Extended Phylogenetic Grafting Predictions
(90) Extended phylogenetic grafting predictions may be used to exploit the capacity of the phylogenetic grafting procedure to learn from experience through interpretation of the results (successes and failures) from cycles of application of the core method of prediction and testing of Candidate Mutants for proteins of interest described above. While the core method described above focuses on recognizing Candidate Residues, Alternative Candidate Residues and Divergent Candidate Residues and grouping them into Candidate Mutants based on networks of interactions of these residues, the initial results for Rubisco detailed in Example 3 showed that a proportion of these Candidate Mutants were ineffective or even slightly deleterious to the Rubisco function, although not greatly. The extended phylogenetic grafting strategy allowed this accumulated knowledge, both successes and failures, to be re-interpreted and built into a method further customized for the particular protein application, and that may continue to be refined as additional results are obtained. The interaction between the extended phylogenetic grafting technology and the core method described above is shown in
(91) The extended phylogenetic grafting strategy has three main components.
(92) Firstly, refinement of the concept of interacting networks of Candidate Residues and/or Alternative Candidate Residues and/or Divergent Candidate Residues as the basis of the core method for grouping into Candidate Mutants. Examination of the initial successes and failures of Region 1 mutants of Rubisco, and in particular comparing Synechococcus Mutant #6a and its component Mutants #4 and #1a (discussed further in Examples 3 and 4), suggested that an improved framework for prediction would involve recognising that there are hotspots for evolutionary adaptation which contain the partial evolutionary solutions to improved Rubisco, and that identification of these mutatable subregions provides a better, or supplementary, basis for identifying Candidate Residues, Alternative Candidate Residues and Divergent Candidate Residues which should be preferentially grouped to form Candidate Mutants in order to manipulate Rubisco's functional properties, rather than using solely the interaction networks within the Regions, as in the core method. Thus, in the extended method, the focus of investigation was more on identifying spatial regions than on identifying interacting networks of residues.
(93) Secondly, re-interpretation of the initial results for Region 1 mutants of Rubisco within this framework allowed identification of spatially contiguous volumes of protein structure, called Sub-regions, containing a subset of the Region's CRs (including initially identified ACRs and DCRs), and which could be predicted to preferentially influence the properties of a particular Target Residue or Target Residues linked to the Region. Identification of these subsets of the Region's CRs, ACRs and DCRs provides a means by which they may be preferentially grouped to form Candidate Mutants.
(94) Thirdly, identification of Sub-regions with imprecise and overlapping boundaries provides a basis for identifying additional CRs, ACRs, CvRs and DCRs which are predicted to interact with the core subsets of CRs, ACRs and DCRs and which may be recruited to the core subsets to provide additional residues for grouping into Candidate Mutants.
(95) Furthermore, identification of Sub-regions as hotspots of natural sequence variation provides the means to identify and exploit other types of sequence diversity data, such as Species-specific Variant Residues (SsVRs) which vary among closely related species, and which may represent partial evolutionary solutions, such as for adaptation to particular environments such as hot/dry or cold/wet. Using Rubisco as an example,
(96) For example, three Sub-regions of Region 1 of the Rubisco LSU were identified as mutatable hotspots, and labelled regions 1A, 1B and 1C, as shown in
(97) In summary, the development of the extended phylogenetic grafting method provides a means to improve the functional properties of lead mutants. The development of the concept of Sub-regions as evolutionary hotspots, and as a framework for building a database of protein sequence mapped to experimental data from tests of predictions, also positions the phylogenetic grafting technology to better exploit sequence-diversity data. For example, the aforementioned database may be interrogated against sequence-diversity data to identify residues which may be recruited as Species-specific Variant Residues (SsVRs) for possible inclusion in Candidate Mutants. For Rubisco, for example, sequence data and, in some cases, kinetic data are available for photosynthetic organisms growing under varied or atypical environments, including for related C3 plant species such as those found in the Balearic Islands with different tolerances to drought and temperature (Galmés et al., 2005), the drought-adapted southern African Marama bean (Parry et al., 2007), and extremophilic Cyanidiales red algae (Ciniglia et al., 2004). Alternatively for Rubisco, in cases where such sequence diversity and other functional data are available for crop plants such as wheat (Evans and Austin, 1986), the aforementioned database may be used to identify natural species with improved functional properties that may be used as germplasm in selective breeding.
(98) Producing Proteins
(99) Proteins with at least one Candidate Residue and/or Alternative Candidate Residue and/or Divergent Candidate Residue, and optionally including other substitutions of CvRs and/or SsVRs, from the second protein may be modelled in silico, or may be engineered in vitro and/or in vivo, for example by site-directed mutagenesis of a polynucleotide encoding the protein, and then expressed in an expression system.
(100) Screening Mutant Proteins
(101) Candidate Mutant proteins comprising the combination of Target Residues with one or more Candidate Residues and/or Alternative Candidate Residues and/or Divergent Candidate Residues, and optionally including other substitutions of Co-variant Residues (CvRs) and/or Species-specific Variant Residues (SsVRs) are thereafter screened to identify those Candidate Mutant proteins having said improved functional property. The process of screening the Candidate Mutant proteins may use any one or more of several techniques and may comprise catalytic assessment, biochemical assessment, and physiological assessment.
(102) Directed Evolution
(103) The proteins of the invention may be modified by directed evolution. Accordingly, the function of a protein produced by the methods of the invention may be improved or otherwise modified. In general, directed evolution may involve mutagenizing one or more parental molecular templates and identifying any desirable molecules among the progeny molecules. Progeny molecules may then be screened for the desired property, by assessing, for example, the activity of the molecule, the stability of the molecule, and/or frequency of mutation in the molecule. Progeny molecules with desirable properties may then be selected and further rounds of mutagenesis and screening performed. Methods by which directed evolution may be performed are well known in the art. Exemplary methods include, among others, rational directed evolution methods described in U.S. application Ser. No. 10/022,249; and U.S. Published Application No. US-2004-0132977-A1. For a general description of experimental methodology and techniques involved in directed evolution, reference may be made to Sambrook et al., Molecular Cloning, A Laboratory Manual 2.sup.nd ed., Cold Spring Harbor Laboratory Press, 1989.
(104) In one embodiment of the invention, mutant proteins produced by the methods of the invention may be used as a starting point for directed evolution. Taking the example of a Rubisco protein, Candidate Mutant proteins found to have improved enzymatic activity may be selected and further optimised by directed evolution. This optimisation may facilitate, for example, the relief of steric conflicts by recruitment of SsVRs and other naturally occurring variant residues which may be identified as complementary to mutations in the Candidate Mutant proteins. Candidate Mutant proteins with improved enzymatic activity may thus serve as a novel starting point for directed evolution as they have different potential for exploring sequence space compared with wild type Rubisco protein. Systems suitable for directed evolution of Rubisco include, but are not limited to, those which employ E. coli strain MM1, as has recently been reported (Mueller-Cajar et al., 2007).
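The cycle of mutagenizing a parental template, screening the progeny and selecting the best for a further round can be pictured with a toy loop. The Python sketch below is purely illustrative: all names are hypothetical, mutagenesis is reduced to one random point substitution per progeny, and the scoring function is a stand-in for an experimental assay, not a model of Rubisco activity.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def mutagenize(parent: str, rng: random.Random, n_progeny: int = 20) -> list[str]:
    """Generate progeny that each carry one random point substitution."""
    progeny = []
    for _ in range(n_progeny):
        pos = rng.randrange(len(parent))
        progeny.append(parent[:pos] + rng.choice(AMINO_ACIDS) + parent[pos + 1:])
    return progeny


def directed_evolution(parent: str, score, rounds: int = 5, seed: int = 0) -> str:
    """Repeat mutagenesis, screening and selection, keeping the best sequence."""
    rng = random.Random(seed)
    best = parent
    for _ in range(rounds):
        candidates = mutagenize(best, rng) + [best]  # the parent stays in the pool
        best = max(candidates, key=score)  # screening step
    return best


def toy_score(seq: str) -> int:
    """Hypothetical assay: number of positions matching an arbitrary target."""
    return sum(a == b for a, b in zip(seq, "MKGAYR"))


start = "MKTAYK"  # e.g. a Candidate Mutant used as the starting point
evolved = directed_evolution(start, toy_score)
print(evolved, toy_score(evolved))
```

Keeping the parent in each selection pool means the score never decreases between rounds, which mirrors carrying forward the best molecule found so far.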
(105) Rubisco Proteins
(106) The proteins generated in accordance with the invention include functional equivalents, variants, active fragments and fusion proteins. For the avoidance of doubt, the following are included within the scope of the invention: functional equivalents of the active fragments and fusion proteins; active fragments of the functional equivalents and fusion proteins; and fusion proteins comprising a functional equivalent or active fragment.
(107) The term fragment refers to a nucleic acid or polypeptide sequence that encodes a constituent or is a constituent of full-length protein. In terms of the polypeptide, the fragment possesses qualitative biological activity in common with the full-length protein.
(108) A biologically active fragment of use in accordance with the present invention may typically possess at least about 50% of the activity of the corresponding full-length protein, more typically at least about 60% of such activity, more typically at least about 70% of such activity, more typically at least about 80% of such activity, more typically at least about 90% of such activity, and more typically at least about 95% of such activity.
(109) Methods of measuring protein sequence identity are well known in the art and it will be understood by those of skill in the art that in the present context, sequence identity is calculated on the basis of amino acid identity (sometimes referred to as hard homology). Sequence identity is calculated after aligning the sequences. The inventors have used the ClustalW (Thompson et al., 1994) program provided within the BioEdit Sequence Alignment Editor (Hall, 1999) to align the sequences. There are several free-to-use and proprietary software packages available that perform sequence alignments and yield effectively the same results. The identification of Variant Residues may also be performed by collecting Rubisco sequences, using, for example, BLAST searches from one or more phylogenetic groups that differ in the kinetic property selected for improvement, and aligning them against said first sequence, using, for example, CLUSTALW.
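Once two sequences have been aligned (for example with ClustalW), amino acid identity and the positions of Variant Residues can be read directly from the aligned strings. The Python sketch below is one possible convention, not necessarily the one used by the inventors: the function names are hypothetical, gaps are assumed to be written as "-", and columns containing a gap are excluded from the identity calculation.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def variant_columns(aligned_a: str, aligned_b: str) -> list[int]:
    """1-based alignment columns where both sequences have different residues."""
    return [
        i
        for i, (a, b) in enumerate(zip(aligned_a, aligned_b), start=1)
        if a != "-" and b != "-" and a != b
    ]


# Toy alignment, not real Rubisco sequence
print(percent_identity("MKT-AY", "MRTGAY"))  # 80.0
print(variant_columns("MKT-AY", "MRTGAY"))   # [2]
```

Other tools may count gapped columns in the denominator, so identity values from different programs are not always directly comparable.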
(110) The functional equivalents, active fragments and fusion proteins of the invention retain the ability of the protein (SEQ ID NO: 23 for Synechococcus sp. PCC7942 and SEQ ID NO: 72 for tobacco) to act as a Rubisco enzyme with improved efficiency. Persons skilled in the art will, however, be able to devise assays or means for assessing enzymatic activity.
(111) Functionally-equivalent proteins according to the invention are, therefore, intended to include mutants (such as mutants containing amino acid substitutions, insertions or deletions). Such mutants may include proteins in which one or more of the amino acid residues are substituted with a conservative or non-conservative amino acid residue and such substituted amino acid residue(s) may or may not be one encoded by the genetic code.
(112) Particularly preferred are proteins in which several, i.e. between 30 and 50, between 20 and 30, between 15 and 20, between 10 and 15, between 5 and 10, between 1 and 5, between 1 and 3, between 1 and 2, or just 1 amino acids are substituted, deleted or added in any combination. Mutant proteins also include proteins in which one or more of the amino acid residues include a substituent group.
(113) Such fragments may be free-standing, i.e. not part of or fused to other amino acids or proteins, or they may be comprised within a larger protein of which they form a part or region. When comprised within a larger protein, the fragment of the invention in one embodiment forms a single continuous region. Additionally, several fragments may be comprised within a single larger protein.
(114) In one embodiment of the invention there is provided a fusion protein comprising a protein of the invention fused to a peptide or other protein, such as a label, which may be, for instance, bioactive, radioactive, enzymatic or fluorescent, or an antibody.
(115) For example, it is often advantageous to include one or more additional amino acid sequences which may contain secretory or leader sequences, pro-sequences, sequences which aid in purification, or sequences that confer higher protein stability, for example during recombinant production. Alternatively or additionally, the mature protein may be fused with another compound, such as a compound to increase the half-life of the protein (for example, polyethylene glycol).
(116) Enzyme Functions
(117) In another embodiment the protein generated in accordance with the invention is an enzyme. Where the protein is an enzyme, the function may be the catalysis of at least one chemical reaction. In other embodiments the function may be structural (e.g. serving as a cytoskeletal protein). The function may involve the active or passive transport of a substance within the cell or between the cell interior and exterior, or between different compartments within the cell, or between different regions of the organism, for example where the protein is involved in a channel or a membrane pore, or the protein is involved in trafficking of materials to specific cellular compartments or the protein acts as a chaperone or a transporter. The function may be involved with ligand/receptor interactions, for example where the protein is a growth factor, a cytokine, a neurotransmitter or an intracellular or extracellular ligand, or the protein is a receptor for the growth factor, cytokine, neurotransmitter or the intracellular or extracellular ligand.
(118) Where the protein is an enzyme, the enzyme may be involved in catabolism or metabolism. The enzyme may be involved in the synthesis of at least one product. The enzyme may be involved in the breakdown of at least one substrate. The enzyme may be involved in the chemical modification of at least one substrate, for example the addition or deletion of one or more phosphate groups from a molecule.
(119) The enzymes may be suitable for use in, for example, degradation of pesticides and detergent residues, for mineral extraction, or for bulk or fine chemical processes, such as amylases. The enzymes may also be suitable for use in medical applications, and in particular may be used for minimizing changes to biological and physicochemical stability.
(120) The enzymes may have specifically engineered properties, for example, the ability to perform optimally in a desired temperature range, a narrower, wider or altered substrate specificity, or the ability to prevent the production and/or release of toxic or potentially toxic byproducts. The enzyme may be re-designed such that it is an efficient catalyst for a minor reaction of the wildtype enzyme using either its natural substrate or an alternative substrate to produce a different product.
(121) In the context of Rubisco, an improved functional property of Rubisco may comprise any one or more of improved specificity for CO.sub.2 over O.sub.2 (S.sub.c/o), improved carboxylation efficiency k.sup.c.sub.cat/K.sub.c.sup.air or improvements in one or both of its component parameters k.sup.c.sub.cat, the carboxylation rate, or K.sub.c, the affinity for substrate (CO.sub.2), or improvements in these functional properties over a range of temperatures, especially at higher temperature. At higher temperature wildtype Rubisco efficiency is limited by decreased specificity due mostly to a relative increase in the efficiency of the oxygenation reaction compared with that of the carboxylation reaction. Improved S.sub.c/o over a range of temperatures may be exhibited by a Rubisco if the efficiency of the oxygenation reaction does not increase with increasing temperature to the extent exhibited by a wildtype Rubisco, i.e. there is a decreased rate of increase or no increase in the efficiency of the oxygenation reaction catalyzed by the Rubisco with elevated temperatures, for example as measured between 25° C. and 35° C., when compared with a wild-type Rubisco.
(122) The improved functional property of a Rubisco may be any two of improved specificity for CO.sub.2 over O.sub.2 (S.sub.c/o), improved carboxylation efficiency k.sup.c.sub.cat/K.sub.c.sup.air or improvements in one or both of its component parameters k.sup.c.sub.cat, the carboxylation rate, or K.sub.c, the affinity for substrate (CO.sub.2), or improvements in these functional properties over a range of temperatures, especially at higher temperature. The improved functional property may be any three of improved specificity for CO.sub.2 over O.sub.2 (S.sub.c/o), improved carboxylation efficiency k.sup.c.sub.cat/K.sub.c.sup.air or improvements in one or both of its component parameters k.sup.c.sub.cat, the carboxylation rate, or K.sub.c, the affinity for substrate (CO.sub.2), or improvements in these functional properties over a range of temperatures, especially at higher temperature.
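For orientation, the kinetic quantities named above are commonly related in the Rubisco literature as follows. These are the standard textbook definitions rather than formulas stated in this specification; the oxygenation parameters k.sup.o.sub.cat and K.sub.o and the dissolved oxygen concentration [O.sub.2] are symbols introduced here for illustration.

```latex
S_{c/o} = \frac{k^{c}_{cat} / K_{c}}{k^{o}_{cat} / K_{o}},
\qquad
K_{c}^{air} = K_{c}\left(1 + \frac{[\mathrm{O_2}]}{K_{o}}\right),
\qquad
\text{carboxylation efficiency in air} = \frac{k^{c}_{cat}}{K_{c}^{air}}
```

Under these definitions, a mutant can raise S.sub.c/o either by increasing the carboxylation term or by decreasing the oxygenation term, which is why a smaller temperature-dependent rise in oxygenation efficiency appears as improved specificity at higher temperature.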
(123) The improved functional property of Rubisco, when functionally incorporated into a plant may result in the generation of a plant with improvements in any one or more of growth rate, biomass production, leaf index area, Rubisco content (Rubisco mRNA and protein content), carbon to nitrogen ratios of plant leaves, starch content, and photosynthetic performance. The improvements may be exhibited under optimal growth conditions for the plant. The improvements may be exhibited under sub-optimal growth conditions for the plant, for example but not limited to under elevated temperatures for growth, or where water, nitrogen or illumination is limiting plant growth, or a combination of any two or more of the above.
(124) Purification of Rubisco Proteins
(125) The invention provides a method of purifying a Rubisco protein produced according to the methods of the invention. The holoenzyme of the functional form of Rubisco from eukaryotic organisms (form I Rubisco) is a hexadecamer made of 8 large subunits (LSUs) and 8 small subunits (SSUs), and requires appropriate chaperones to fold and assemble correctly. E. coli is the most widely used microbial host for expressing recombinant DNA and proteins. When the operon coding for the Rubisco genes (rbcLS and rbcSS) from Synechococcus sp. PCC7942 is expressed in E. coli, both subunits are abundantly synthesized; however, only about 1 to 5% of the expressed LSUs are correctly folded and assembled into functional form, with the amount of functional Rubisco accumulating to 1 to 3% (wt/wt) of the E. coli soluble protein.
(126) In order to overcome these difficulties, a recently adapted system (Baker et al., 2005, the entire contents of which are incorporated herein by reference) may be used to purify native or mutant Rubisco proteins. In this case, the aforementioned system was used for the purification of Synechococcus sp. PCC7942 Rubisco expressed in E. coli. The first step of the purification method involves fusing into a first vector the coding sequence for an H.sub.6-tagged ubiquitin (Ub) sequence (H.sub.6Ub) to the 5′ end of an rbcSS gene. A host is then co-transformed with the first vector and a second vector coding for the native (or mutated) large subunit and small subunit of the Rubisco protein, and expression of the Rubisco protein and vectors is then induced, producing all three Rubisco subunit peptides (i.e. LSU, SSU and H.sub.6UbSSU). Some are assembled into functional Rubisco hexadecamers made up of 8LSU octameric cores and different ratios of SSU (at most 8) and H.sub.6UbSSU. The Rubisco protein is purified based on the expression of the H.sub.6 tag fused to the Rubisco small subunit. This purification may be performed, for example, using chromatography techniques such as metal affinity chromatography. The Ub fragments may then be removed from the Rubisco using, for example, a Ub-specific protease.
(127) The present invention will now be described with reference to specific examples, which should not be construed as in any way limiting the scope of the invention.
EXAMPLES
Example 1
Identification of Target Residues
Computational Chemistry
(128) A computational study of the complete Rubisco carboxylation mechanism (Kannappan and Gready, 2008) and complementary oxygenation mechanism using ab initio QM calculations on an extended active-site fragment complex provides the basis for the strategy.
(129) This fragment complex model comprises fragments of most of the active-site amino acid residues that have either been established or mooted to have a key role in the series of reactions that are catalyzed at the Rubisco active site. It contains all residues directly co-ordinated to Mg.sup.2+ or interacting with the reactive centre of the substrate. This fragment complex model was built from the coordinates of the crystal structure with PDB code 8ruc (crystal structure of the complex of activated Rubisco with Mg.sup.2+ and 2-carboxyarabinitol 1,5-bisphosphate (2CABP)). Initially, the fragment complex model of Rubisco with 2-carboxy-3-ketoarabinitol 1,5-bisphosphate (2C3KABP), a structural analogue of 2CABP and the actual reaction intermediate produced during the Rubisco carboxylase activity, was built from the crystallographic coordinates. As shown in
(130) The analysis is focused entirely on the large subunit (LSU). Although most Rubiscos, including green plants, algae and cyanobacteria, are complex multimeric (hexadecameric) proteins consisting of 8 large subunits (LSU; 475 residues) and 8 small subunits (SSU; 140 residues), the active-site chemistry is conducted by a protein region consisting of a dimer of LSUs only, with 8 such dimer active sites in the hexadecameric protein. A moiety of the C-terminal (TIM-barrel) domain of one LSU contains most of the active-site residues while a smaller region of the N-terminal domain of the adjacent LSU completes the active site. However, predictions arising more generally from bioinformatics studies suggest other regions may be involved in modulating the chemistry, e.g., intersubunit contacts (LSU-LSU or LSU-SSU).
(131)
(132) The most significant features of the proposed reaction mechanism are discussed hereafter.
(133) Firstly the inventors have made the surprising discovery that H.sub.2O[Mg] is not displaced from Mg-coordination by CO.sub.2 during carboxylation. The water molecule in fact assists in binding CO.sub.2 to the active site and contributes to the stability of the carboxylated product and the corresponding TS. The same water molecule acts as the water of hydration in the later step. Previously this role of hydration had been assigned to a water molecule found in the vicinity of the coordination sphere.
(134) The inventors have made the further surprising discovery that the O2 atom remains unprotonated in the enediolate intermediate, despite expectations from general chemical principles that it would need to be protonated in order to direct carboxylation exclusively to C2, rather than to C3. ESP-derived atomic charges also show that O3 is more negative than O2. However, this unexpected result is explained by the observation of strong hydrogen bonds between LYS175 and protonated KCX201 with O2, which effectively prevent O2 from directing carboxylation to C3. Additionally, as LYS334 is H-bonded to the P1-phosphate group in the enzyme, its interaction with the substrate CO.sub.2 would be disrupted if C3 carboxylation were to take place, leaving no scope for stabilization of the corresponding TS and intermediate.
(135) Further features of the reaction mechanism elucidated by the inventors are as follows. KCX201 has a direct role only in the initial enolization reaction and it remains in a protonated state after enolization. KCX201 and LYS175 have a role in hindering C3-carboxylation by partially neutralizing the negative charge on O2. HIS294 has a significant role in the multi-step catalysis of Rubisco. It shuttles the proton between N.sub. and O3, modulating the C3-O3 bond length appropriately. GLU204 activates the Mg-coordinated water molecule for hydration by abstracting its proton. Thus, both carboxylation and hydration take place on the same face of the enediolate intermediate. The H3 proton is eventually transferred to O2 only after the formation of the aci-acid intermediate (VII). The charge on the aci-acid intermediate is stabilized by LYS175 and LYS334. LYS175 ensures stereospecific protonation of the C2-carbon to yield the final products. LYS334 shares its proton with LYS175.
(136) On the basis of the above findings, the inventors have identified two amino acid residues, one acting as a base the other as an acid, for each reaction step. For the gas-addition step, HIS294 acts as a base by abstracting the proton from O3, while LYS334 is the acid donating a proton to stabilize the carboxylate group formed by addition of CO.sub.2. Alteration in the steric or electronic environment of these two key residues, or any other residues that are structurally or chemically (through electrostatic interactions) linked to them, would impact the specificity of the enzyme and likely also affect k.sub.cat.
(137) Residues TYR20, GLU60 and ASN123 were also identified as being crucial for appropriate orientation of gas molecules relative to the substrate prior to addition, and for the stability of the gas-adduct and the corresponding transition state structures. Together with HIS294 and LYS334, these five residues comprise an initial group of Target Residues for further examination.
(138) At this stage in the summary of the solution of the problem of reduction of sequence space for Rubisco illustrated in
Example 2
Phylogenetic Analysis to Identify Variant Residues Containing Specificity-Determining Residues
(139) Rubisco LSU sequences from available phyla of photosynthetic organisms were collected for phylogenetic analysis from publicly available databases at NCBI (www.ncbi.nlm.nih.gov/) and JGI (http://img.jgi.doe.gov/cgi-bin/pub/main.cgi) by performing protein BLAST searches (Altschul et al., 1997) using the spinach Rubisco LSU sequence as the query sequence. As the LSU sequences are so distinctive, and the conservation relatively high compared with most protein-homologue classes over such wide evolutionary distances, they are easy to identify and other free-to-use or proprietary search software would work equally well.
(140) Alignment of the extracted Rubisco LSU sequences from photosynthetic organisms belonging to thirteen different phyla covering red algae, cyanobacteria, glaucophyta and plants (10 phyla) was carried out using ClustalW software (Thompson et al., 1994) within the BioEdit Sequence Alignment Editor (Hall, 1999) to assess their diversity. An alignment is shown in
(141)
(142) To solve this problem, the inventors have developed the hypothesis-based phylogenetic grafting method and applied it to identification of Rubisco residues which already represent natural partial evolutionary solutions for enhanced specificity (S.sub.C/O). It is well known that diverse photosynthetic organisms exhibit characteristically different values for S.sub.C/O. The majority of land plants possess a typical S.sub.C/O value of 80, while red algae are known to have the highest specificity factor (160). The specificity factor for cyanobacteria has a modest value of about 40.
(143) As cyanobacteria are a common ancestor to both land plants and red algae, and as land plants diverged from cyanobacteria earlier in evolution compared with red algae (http://www.geocities.com/we_evolve/Plants/chloroplast.html) the partial evolutionary solution for enhanced specificity embedded in amino acid residue changes in red algae can be partially revealed by comparison of Rubisco LSU sequences from these three groups.
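The comparison of aligned LSU sequences between groups can be illustrated with a minimal sketch. It is not the inventors' procedure: the selection rule (a column that is conserved within each group but differs between the groups) and the function name `variant_positions` are assumptions made for illustration, and the toy alignment fragments are hypothetical rather than real Rubisco sequences.

```python
def variant_positions(group_a, group_b, numbering_start=1):
    """Flag alignment columns conserved within each group but differing between groups.

    group_a, group_b: lists of equal-length aligned sequences (gaps as '-').
    Returns a list of (position, residue_in_a, residue_in_b) tuples.
    """
    length = len(group_a[0])
    if any(len(s) != length for s in group_a + group_b):
        raise ValueError("all sequences must be aligned to the same length")
    variants = []
    for i in range(length):
        col_a = {s[i] for s in group_a}  # residues seen in group A at this column
        col_b = {s[i] for s in group_b}  # residues seen in group B at this column
        if len(col_a) == 1 and len(col_b) == 1 and col_a != col_b:
            variants.append((numbering_start + i, col_a.pop(), col_b.pop()))
    return variants


# Hypothetical toy fragments (assumption): numbering starts at residue 21.
cyanobacteria = ["KLTYYTPD", "KLTYYTPD"]
red_algae = ["KLGYWTPD", "KLGYWTPD"]
print(variant_positions(cyanobacteria, red_algae, numbering_start=21))
```

With real data the two lists would be filled from the ClustalW alignment, and a third group (land plants) could be compared pairwise in the same way.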
(144) At this stage in the summary of the solution of the problem of reduction of sequence space for Rubisco illustrated in
Example 3
Identification, Grouping and Ranking of Candidate Residues, and Prediction of Candidate Mutants Using Core Phylogenetic Grafting Method
(145) In order to identify the specificity-determining residues that account for the increased specificity of red algal Rubiscos, the inventors used the enzyme-mechanistic insights from the QM calculations described in Example 1 to develop a procedure for selecting a subset of the 134 Variant Residues identified in Example 2. The Variant Residues, identified in Example 2, were selected against the Target Residues, identified in Example 1 to have a functional role in the gas-addition step, in order to identify Candidate Residues. Several of the selected Candidate Residues were also Alternative Candidate Residues. In addition, several Divergent Candidate Residues were also selected. This used a procedure based on the general principle that evolutionary changes of residues which modified the properties of these Target Residues would alter their specificity and kinetic efficiency. Thus, any Variant Residue that has the potential to affect the Target Residues electronically or structurally, and, hence, cause a change in the functional property of the enzyme, in this case specificity and kinetic efficiency, is deemed to be part of the partial evolutionary solution to optimisation of the property.
(146) The selection procedure was applied to identify such Candidate Residues, Alternative Candidate Residues and Divergent Candidate Residues. Selection was carried out by visual analysis and comparison of differences in inter-residue interactions with Accelrys Discovery Studio v1.5.1 (Accelrys Software Inc., San Diego, Calif., 2005), using the crystal structures of Rubiscos from spinach (PDB code: 8ruc) and Galdieria partita (PDB code: 1BWV). The crystal structure of spinach Rubisco was used instead of Rubisco from Synechococcus sp. PCC6301 (PDB code: 1RB1) because the resolution of the available structures is superior for the spinach Rubisco structure. However, the inventors accounted for residue changes between LSUs of spinach and Synechococcus sp. PCC6301 Rubiscos when analyzing the inter-residue interactions, and used the Synechococcus structure also for reference.
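The selection described above was carried out by visual analysis. As a rough computational stand-in, inter-residue contacts around a Target Residue could be listed with a simple distance test. The sketch below assumes a plain dictionary of atom coordinates and a 3.0 angstrom cutoff; both are illustrative assumptions, and the coordinates are hypothetical rather than taken from 8ruc or 1BWV.

```python
import math


def residues_within(coords, target, cutoff):
    """Return residues with any atom within `cutoff` of any atom of `target`.

    coords: dict mapping a residue label to a list of (x, y, z) atom positions.
    """
    hits = []
    for label, atoms in coords.items():
        if label == target:
            continue
        if any(math.dist(a, b) <= cutoff for a in atoms for b in coords[target]):
            hits.append(label)
    return sorted(hits)


# Hypothetical coordinates in angstroms (assumption, not crystallographic data).
toy = {
    "GLU60": [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    "ILE51": [(4.0, 0.0, 0.0)],
    "TRP25": [(9.0, 0.0, 0.0)],
}
print(residues_within(toy, "GLU60", cutoff=3.0))
```

Running the same query on two structures and comparing the resulting lists would highlight contacts present in one Rubisco but absent in the other.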
(147) Three of the five Target Residues, TYR20, GLU60 and ASN123 are in the N-terminal domain of the Rubisco LSU, while HIS294 and LYS334 are in the C-terminal domain that forms the TIM-barrel structure (
(148) At this stage in the summary of the solution of the problem of reduction of sequence space for Rubisco illustrated in
(149) The initial analysis and Candidate Mutant design is illustrated by consideration of the Variant Residues in Region 1. As the N-terminal domain of the LSU of Rubisco is a compact domain (
(150) A Variant Residue whose sidechain interacts with the backbone or sidechains of Target Residues or with residues that are part of the secondary structural units harbouring the Target Residues was considered a Candidate Residue (or Divergent Candidate Residue). As an example, Target Residue GLU60 is at the C-terminal end of helix B and the Variant Residue ILE51, at the N-terminal end of this helix, interacts with another Variant Residue TRP25 in the Galdieria partita LSU (see
(151) Before grafting, the groups of Candidate Residues can be further combined into larger groups that coordinate and amplify the perturbative effect on a given Target Residue or Target Residues. Such extended grouping is useful if two different groups of Candidate Residues affect the same secondary structural unit. For example, both of the Candidate-Residue groups {25, 51} and {54, 84, 87} affect helix B, which harbours GLU60, through two different interactions. Hence, one of the predicted mutants comprised these combined groups (shown as Mutant #7a in Table 3).
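The extended grouping can be sketched as a merge of Candidate-Residue groups keyed by the secondary structural unit they affect. The function name `combine_groups` and the label "unit X" for the third group are hypothetical; only the two helix B groups come from the text.

```python
def combine_groups(groups):
    """Merge Candidate-Residue groups that affect the same secondary structural unit.

    groups: list of (set_of_positions, structural_unit) pairs.
    Returns a dict mapping each structural unit to a sorted list of positions.
    """
    combined = {}
    for positions, unit in groups:
        combined.setdefault(unit, set()).update(positions)
    return {unit: sorted(pos) for unit, pos in combined.items()}


groups = [
    ({25, 51}, "helix B"),
    ({54, 84, 87}, "helix B"),
    ({23, 81}, "unit X"),  # hypothetical unit label, for illustration only
]
print(combine_groups(groups))
```

The merged helix B entry contains positions 25, 51, 54, 84 and 87, which are the positions mutated in Mutant #7a.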
(152) The potential grafted Candidate Mutants were further assessed to check for new unfavourable steric interactions introduced by grafting residues from Galdieria partita into Synechococcus sp. PCC6301. Such undesirable steric interactions may be rectified by adding spatially complementing mutations to the Candidate Mutant, or could be investigated by MD simulations to assess whether structural relaxation to relieve such bad contacts is energetically accessible.
(153) The final step in the Candidate-Mutant prediction procedure is to rank the potential grafted mutants, producing a ranked list for use in prioritising experimental testing or detailed computational in silico pre-screening. The ranking reflects the degree to which the combined mutations in each Candidate Mutant are expected to change the functional property, in this example in the direction towards improvement of specificity by influencing the chemistry, and relative chemistry, of the gas-addition steps for CO.sub.2 and O.sub.2. Ranking depends on a number of parameters such as the Target Residue affected by the Candidate-Residue group, the strength of the changed interactions of the Candidate-Residue group within the Target-Residue Region, and the number of such interactions for each Candidate-Residue group.
(154) In addition to the Rubisco sequence from Galdieria partita, the sequence of another red-algal species Griffithsia monolis, which is known to have a better k.sup.c.sub.cat than Galdieria partita, was considered in the analysis. For the initial set of Candidate Residues selected, two residues (51 and 54) show differences between G. partita and G. monolis, i.e. they are Alternative Candidate Residues. Candidate Mutants with both residue variants in positions 51 and 54 were formed, as shown in Table 3 for CMs #1, #5, #6, #7 and #13.
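The text names the parameters that enter the ranking but gives no formula. The sketch below shows one hypothetical way to turn them into an ordered list: the multiplicative score, the field names and all numerical values are assumptions for illustration and are not the inventors' ranking scheme.

```python
from dataclasses import dataclass


@dataclass
class CandidateGroup:
    name: str
    target_weight: float         # assumed importance of the affected Target Residue
    interaction_strength: float  # assumed strength of the changed interactions
    n_interactions: int          # number of changed interactions


def score(group):
    """Hypothetical score: product of the three ranking parameters."""
    return group.target_weight * group.interaction_strength * group.n_interactions


def rank(groups):
    """Return (rank, name) pairs, highest score first."""
    ordered = sorted(groups, key=score, reverse=True)
    return [(i + 1, g.name) for i, g in enumerate(ordered)]


# Hypothetical groups and values (assumption).
toy = [
    CandidateGroup("A", 1.0, 2.0, 3),
    CandidateGroup("B", 1.0, 1.0, 2),
    CandidateGroup("C", 0.5, 3.0, 2),
]
print(rank(toy))
```

Any monotonic scoring function could be substituted; the point is only that the ranked list is a sort over per-group parameters.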
(155) As an example, the set of sixteen Candidate-Residue groups shown in Table 3 was predicted from initial analysis of Region 1 Target Residues, forming 21 potential Candidate Mutants (with G. partita and G. monolis variants). For each Candidate Mutant, the last column in Table 3 details the predicted structural change associated with the mutations. For some Candidate Mutants, these changes are explained graphically in a figure; the second column of the table gives the figure number for these CMs. Table 3 also shows the rankings of priorities for experimental test.
(156) Table 3 shows that three Divergent Candidate Residues (36, 116 and 140; see Table 2) were selected in this initial analysis for the reasons summarized for the relevant Candidate Mutants (#9 and #10). Two of these DCRs (36 and 140) also show differences between G. partita and G. monolis (see Table 2); for Candidate Mutants #9 and #10, the Gm variant of 36 and the Gp variant of 140 were judged to be the most promising for transplant into Synechococcus.
(157) At this stage in the summary of the solution of the problem of reduction of sequence space for Rubisco illustrated in
(158) TABLE-US-00003 TABLE 3 Predicted and Ranked Candidate Mutants from Analysis of Region 1 in the N-terminal Domain of the Rubisco LSU surrounding Target Residues TYR20, GLU60 and ASN123. Superscript Gp and Gm denote the residue from Galdieria partita and Griffithsia monolis, respectively.
No # | FIG..sup.a | Mutant | Rank | Region Affected
1a | 13, 14 | Y25W/D51I.sup.Gp | 3 | Adds a new hydrophobic interaction between the C-terminal end of B and A. A is close to Y20 in sequence.
1b | 13, 14 | Y25W/D51V.sup.Gm | 3 | Adds a new hydrophobic interaction between the C-terminal end of B and A. A is close to Y20 in sequence.
2 |  | A59G/G64A | 6 | Swapping mutation. Interaction between B and the long chain that connects B to C. The swapped methyl group is spatially close to Y20 and adjacent to E60.
3 |  | P49D/D51I | 10 | Alters the interaction in the short loop connecting B and B.
4 | 13, 14 | T23G/K81R | 4 | Interaction between A and C is broken. Affects the positioning of Y20.
5a | 14 | G54A.sup.Gp/C84A/I87V | 5 | Introduces a strong hydrophobic interaction between B and C.
5b | 14 | G54S.sup.Gm/C84A/I87V | 5 | Introduces a strong hydrophobic interaction between B and C.
6a | 13 | T23G/Y25W/D51I.sup.Gp/K81R | 1 | Mutant #1a adds a hydrophobic interaction between A and B, while Mutant #4 breaks the interaction of A with C. These two sets of mutations together may have a large effect on Y20.
6b | 13 | T23G/Y25W/D51V.sup.Gm/K81R | 1 | Mutant #1b adds a hydrophobic interaction between A and B, while Mutant #4 breaks the interaction of A with C. These two sets of mutations together may have a large effect on Y20.
7a | 14 | Y25W/D51I.sup.Gp/G54A.sup.Gp/C84A/I87V | 2 | Cumulative effect of Mutants #1a and #5a on E60.
7b | 14 | Y25W/D51I.sup.Gp/G54S.sup.Gm/C84A/I87V | 2 | Cumulative effect of Mutants #1a and #5b on E60.
7c | 14 | Y25W/D51V.sup.Gm/G54A.sup.Gp/C84A/I87V | 2 | Cumulative effect of Mutants #1b and #5a on E60.
7d | 14 | Y25W/D51V.sup.Gm/G54S.sup.Gm/C84A/I87V | 2 | Cumulative effect of Mutants #1b and #5b on E60.
8 | 15 | V121I/M297G/V300T | 7 | Hydrophobic interaction between the two LSUs are broken (residues 297 and 300 are from the neighbouring LSU that contains the Mg-complex of the active site being considered). Affects N123.
9 | 15 | L36I.sup.Gm/I116L/F140L.sup.Gp | 8 | Forms a large hydrophobic region involving the ends of two adjacent -strands (B and E) and C. Could alter the orientation/positioning of N123.
10 | 15 | L36I.sup.Gm/I116L/V121I/F140L.sup.Gp/M297G/V300T | 3 | Could together act on - C and have a cumulative effect on N123.
11 |  | T114A/T118A/T271V/V121I | 2 | Polar interaction of 114 and 118 with 271 in the partner LSU is broken. 271 forms a new hydrophobic interaction with 121. Impacts N123.
12 | 21 | K18I/T23G | 5 | Polar interaction between side-chains of T23 and K18 is broken. Impacts Y20.
13a | 13, 14 | Y25W/D51I.sup.Gp/insert A21 | 6 | (1a + A21) Insertion of A21 moves K21 away from residue 51 and forms a new hydrophobic interaction with 151.
14 | 21 | KLTYY-(21-25)-AKMGYW | 4 | Shape of a coil adjacent to Y20 is altered, affects orientation of Y20. Involves an insertion (M); see FIG. 6.
15 | 21 | K18I/KLTYY-(21-25)-AKMGYW/K81R | 1 | The change of shape of coil next to Y20 is associated with changes to its interaction with 18 (loss of polar interaction) and 81 (change in length) and may have a coordinated effect on Y20.
16 |  | A15S/K18I/T68V/L407I | 9 | Possible interaction between residues 18 and 68 which could bind N-terminal tail to rest of domain; conserved residue 69 interacts with 407 of partner LSU; S15 can form strong H-bonds with backbone carbonyl groups of 408 and 409 of partner LSU in red algae. Targets Y20.
.sup.aFIGS. 13-15 and 21 show the predicted mutation sites.
Example 4
Identification, Grouping and Ranking of Candidate Residues, and Prediction of Candidate Mutants Using Extended Phylogenetic Grafting Method
(159) As aforementioned and as shown in
(160) The extended method was first used to interpret the core-method results. This resulted in identification of a new model for grouping of already identified CRs, ACRs and DCRs (see CMs in Table 3), which was based on Sub-regions as mutatable hotspots rather than focussed on networks of interactions as in the core method. The three Sub-regions (1A, 1B and 1C) exhibited the property of being predicted to preferentially influence the properties of one of the three Target Residues linked to Region 1: Sub-region 1A to TYR20, 1B to ASN123 and 1C to GLU60. This anchoring of the three TRs to the Sub-regions of Region 1 is shown graphically in
(161) Identification of the Sub-regions provided the basis for identification of additional CRs and DCRs, using visual inspection and other analyses described previously for identifying CRs and DCRs using the core method, which may be preferentially grouped with CRs and DCRs already identified by the core method to form new Candidate Mutants (see Table 4). Additional CRs and DCRs so identified were positions 19, 68, 88 and 104, and 86, 117 and 138, respectively. Alternatively, the identification of the Sub-regions provides a new basis for preferential regrouping of the CRs and DCRs already identified by the core method to form new Candidate Mutants.
(162) The use of the extended method, and the basis for recruiting particular additional CRs and DCRs, is illustrated below with reference to examples for refinement of activity of mutants predicted by the core method and initially tested (Table 5). The new predicted CMs are shown in Table 4.
(163) TABLE-US-00004 TABLE 4 Candidate Mutants predicted using the extended phylogenetic grafting method to refine the most promising initially predicted Candidate Mutants (Table 3). Also listed are additional single and double-residue component mutations of these most promising initially predicted Candidate Mutants.
No. | FIG. | Mutation | Rank | Comments
17-1A | 21 | T23G/K18I/T68V/K81R | 2 | Prediction based on refined phylogenetic grafting on subregion 1A.
18a-1A | 21 | T23G/K81R/P104E.sup.Gp | 5 | Prediction based on refined phylogenetic grafting on subregion 1A.
18b-1A | 21 | T23G/K81R/P104D.sup.Gm | 5 | Prediction based on refined phylogenetic grafting on subregion 1A.
19-1A | 21 | T23G/D19P/K81R | 4 | Prediction based on refined phylogenetic grafting on subregion 1A.
20-1B | 20, 21 | I116L/L117T/V121I/I138M/F140L | 3 | Prediction based on refined phylogenetic grafting on subregion 1B. Targets residue N123.
21-1A, B | 21 | T23G/K81R/V121I/M297G/V300T | 1 | Combination of two predicted CMs that showed maximum overall improvement in the enzyme efficiency in kinetic assessment study.
22-1A | 13, 14 | T23G |  | Single-residue component of Mutant #4.
23-1A | 13, 14 | K81R |  | Single-residue component of Mutant #4.
24 | 15 | V121I/M297G |  | Double-residue component of Mutant #8.
25 | 15 | M297G |  | Single-residue component of Mutant #8.
26-1B | 15 | V121I |  | Single-residue component of Mutant #8.
(164) The first example below focuses on the refinement of the best mutant, Mutant #4 (T23G/K81R), from first-round testing, and on subregion 1A. As shown in Table 3, Mutant #4 is a component with Mutant #1a (Y25W/D51I.sup.Gp) of Mutant #6a (T23G/Y25W/D51I.sup.Gp/K81R). However, although Mutant #6a showed increased specificity (8.5%, see Table 5), it showed no efficiency improvement, and overall is inferior to Mutant #4, which showed both increased specificity (10%) and efficiency (8%). Reference to the results for Mutant #1a indicated it is a relatively poor mutant with increased specificity of only 3.4% and no change in efficiency. Thus, it was concluded that grouping of the Mutant #4 mutations with those of Mutant #1a was ineffective, as the net effect was not a cumulative improvement but rather the two groups of mutations apparently interfered disadvantageously. As Sub-region 1A contains the residue positions of the mutations in Mutant #4, it was deduced that coupling of the Mutant #4 mutations with mutations of other CRs and/or DCRs in subregion 1A, may provide a better strategy for further improving on the properties of this lead mutant.
(165) Examination of the sequence and structural data by visual and other analysis procedures already described, especially an analysis of species-specific covariation data for already identified CRs and DCRs in Sub-region 1A, identified three additional CRs, 19, 68 and 104, which may influence TR TYR20 or residue 81 (a component of Mutant #4). Hence, the following new CMs, showing the indicated changes in interactions, were predicted as potential improvements on Mutant #4.
(166) In relation to Mutant #17-1A (#4+K18I/T68V), the mutations of K18I and T68V are predicted to result in an altered interaction between the N-terminal tail and the region between B and helix B. In combination with the mutations in Mutant #4 (T23G and K81R) this is expected to have an increased impact on TR TYR20.
(167) In relation to Mutant #18a(b)-1A (#4+P104E.sup.Gp(D.sup.Gm)), the mutation of CR 104 (which is also an ACR), i.e. P104E.sup.Gp or P104D.sup.Gm, is expected to result in a new interaction with the backbone N of residue 81. It is expected that the three mutations in the combined CM #18-1A would have an increased impact on TR TYR20.
(168) In relation to Mutant #19-1A (#4+D19P), residue 19 forms a salt bridge with residue 21 (K or R) in cyanobacteria and plants, which is absent in red algae where residue 19 is P. It is expected that the mutation D19P will influence the reach of TR TYR20 into the active site, and that in combination with the Mutant #4 mutations, this would produce an increased impact on TR TYR20.
(169) These predictions are shown graphically in
(170) The second example below relates to the identification of a major new mutational area (subregion 1B) which in turn led to the initial prediction of Mutant #20-1B (I116L/L117T/V121I/I138M/F140L). This mutant has Mutant #8 (V121I/M297G/V300T), identified by the core method (Table 3), as a component. As shown in Table 5, initial testing of Mutant #8 showed improvement in specificity (5.6%), which provided a basis for further characterisation. Following identification of Sub-region 1B, as described, examination of residue variations near residue 121 (CR adjacent to TR N123; see
Example 5
Generation of Mutant Rubiscos and Screening Thereof
(171) Based on the ranked predicted Candidate Mutants in Tables 3 and 4, proof-of-principle studies were conducted by mutating the red-algal specific residues into Region 1 of Synechococcus sp. PCC7942, in groups of multiple mutations. Initially fourteen of the predicted LSU mutants detailed in Example 3 and listed in Table 3 (Mutants #1a, #1b, #4, #5a, #5b, #6a, #7a, #7b, #7d, #8, #9, #10, #12 and #14, where a and b refer to variants where the CR for G. partita or the ACR for G. monolis, respectively, was transplanted) were engineered by mutating the Synechococcus sp. PCC7942 rbcLS gene with the QuikChange multi-mutagenesis kit (Stratagene) using appropriate primers. In addition, several mutants with single (#22-1A, #23-1A, #25, #26-1B) or double (#24) mutations, which are components of predicted mutants (#4, #8, #21-1A,B), were engineered as controls.
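The substitution notation used for the mutants (for example T23G/K81R) can be applied to a sequence in silico with a short routine. The sketch below is an illustrative assumption rather than part of the described method: the function name `apply_mutations` is hypothetical, the toy fragment uses only the KLTYY(21-25) segment quoted in Table 3, and the Gp/Gm superscripts, insertions and loop replacements are not handled.

```python
import re

# One substitution token: wild-type letter, residue number, replacement letter.
MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_mutations(sequence, mutations, first_residue_number=1):
    """Apply substitutions written as e.g. 'T23G/K81R' to a protein sequence."""
    residues = list(sequence)
    for token in mutations.split("/"):
        match = MUTATION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised mutation: {token!r}")
        wild, number, new = match.group(1), int(match.group(2)), match.group(3)
        index = number - first_residue_number
        if not 0 <= index < len(residues):
            raise ValueError(f"position {number} is outside the sequence")
        if residues[index] != wild:
            # Guard against numbering errors before grafting the new residue.
            raise ValueError(f"expected {wild} at {number}, found {residues[index]}")
        residues[index] = new
    return "".join(residues)


fragment = "KLTYY"  # residues 21-25 only
print(apply_mutations(fragment, "T23G/Y25W", first_residue_number=21))
```

The wild-type check is the useful part in practice: it catches an offset between the numbering of the template sequence and the spinach numbering used for the mutant names.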
(172) The genomic sequence for the rbcL-rbcS sequence (operon) of wildtype Synechococcus sp. PCC7942 is shown in SEQ ID No. 22. This sequence is the same as for Synechococcus sp. PCC6301. In the native genome sequence, the rbcLS coding sequence reads ATG CCC (coding Met then Pro) but in all the cloned rbcLS sequences it is ATG GCC (coding Met then Ala). This nucleotide substitution was introduced to code for a unique restriction site (NcoI) used for cloning of the gene. A silent mutation in the second last codon (Arg) in the rbcSS sequence (CGA to CGC) was also introduced for cloning purposes. The translated sequence for the wildtype large subunit is shown in SEQ ID No. 24 and the translated sequence for the wildtype small subunit is shown in SEQ ID No. 23. The nucleotide and protein sequences for mutants are shown by SEQ ID NOS in Table 5, for example SEQ ID NOS: 25 and 26, respectively, for Mutant #1a.
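The two cloning changes described above can be checked with a few lines of code. The sketch assumes that the two bases immediately 5′ of the start codon are CC, which the text does not state; that assumption is what makes ATG GCC complete an NcoI recognition site (CCATGG). The codon table is restricted to the five codons mentioned.

```python
# Standard genetic code entries for the codons mentioned in the text.
CODON_TABLE = {"ATG": "M", "CCC": "P", "GCC": "A", "CGA": "R", "CGC": "R"}
NCOI_SITE = "CCATGG"


def translate(dna):
    """Translate an in-frame DNA string using the small codon table above."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def has_ncoi(dna):
    """Return True if the NcoI recognition sequence occurs in the DNA string."""
    return NCOI_SITE in dna


upstream = "CC"  # ASSUMPTION: bases 5' of the start codon, not given in the text
native = upstream + "ATGCCC"
cloned = upstream + "ATGGCC"

print(translate("ATGCCC"), has_ncoi(native))    # native start: Met-Pro
print(translate("ATGGCC"), has_ncoi(cloned))    # cloned start: Met-Ala
print(translate("CGA") == translate("CGC"))     # Arg codon change is silent
```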
(173) After consideration of these initial results and development of the extended phylogenetic grafting method, a further four of the predicted LSU mutants detailed in Example 4 and listed in Table 4 (Mutant #17-1A, #18-1A, #19-1A, #21-1A,B) were engineered by the same method. All of the control mutants (#22-1A, #23-1A, #24, #25, #26-1B) mentioned above are components of these four mutants and are relevant, in particular, to Mutants #4 and #21-1A,B. The nucleotide and protein sequences for these mutants are shown by SEQ ID NOS in Table 5, for example SEQ ID NOS: 53 and 54, respectively, for Mutant #17-1A.
(174) The mutated rbcLS genes were sequenced before cloning back into the second expression plasmid which coded for the mutated LSU and the native SSU. The mutant Rubiscos were expressed and purified using a procedure described in Example 6. These initial experiments on the eighteen Synechococcus sp. PCC7942 mutants and five control mutants showed good expression in E. coli of active (i.e. properly folded and assembled hexadecameric) mutant Rubiscos, with specificity and kinetic constants comparable with, or better than, wild type, as described in Example 7. The way in which experimental test and optimisation procedures, detailed in Examples 6-11 are integrated with the prediction and in silico screening steps is shown in
Example 6
Expression and Purification of Mutant Rubiscos
(175) Although E. coli is the most widely used microbial host for expressing recombinant DNA and proteins, obtaining the functional form of Rubisco from eukaryotic organisms (defined herein as form I protein) is more complex. As the holoenzyme of form I Rubisco is a hexadecamer made of 8 large subunits (LSUs) and 8 small subunits (SSUs), appropriate chaperones are required to fold and assemble the enzyme correctly. Conveniently, however, when the operon coding for the Rubisco genes (rbcLS and rbcSS) from Synechococcus sp. PCC7942 is expressed in E. coli both LSU and SSU subunits are abundantly synthesized. Hence, Synechococcus sp. PCC7942 Rubisco was used here as the model L.sub.8S.sub.8 enzyme for initial testing of the Rubisco predictions, and E. coli was used as the natural choice of expression host for producing the mutant Rubiscos.
(176) However, only about 1 to 5% of the expressed LSUs are correctly folded and assembled into functional form with the amount of functional Rubisco accumulating to 1 to 3% (wt/wt) of the E. coli soluble protein. Purification of the functional Rubisco by traditional methods is a laborious and protracted process that may take up to 3 days.
(177) The use of 6×Histidine (H.sub.6) affinity tags is an attractive alternative that could save substantial effort and time in enzyme purification. However, experiments have shown that fusion of H.sub.6 tags to either terminus of the LSU or SSU of form I Rubisco can compromise the catalytic activity. To overcome these difficulties, a recently adapted system (Baker et al., 2005, the entire contents of which are incorporated herein by reference) was used to simplify and speed up the purification of Synechococcus sp. PCC7942 Rubisco expressed in E. coli. This system involved the construction of a unique (pACYC-based) plasmid vector in which the coding sequence for a H.sub.6-tagged ubiquitin (Ub) sequence (H.sub.6Ub) is fused in frame to the 5′ end of rbcSS. The wild-type rbcLS in plasmid pTrcSynLS (that contains the PCC7942 rbcL-rbcS operon; Emlyn-Jones et al., 2006) was replaced with the mutated rbcLS (rbcLS*) and then co-transformed into E. coli with the pACYC-based plasmid that codes for H.sub.6Ub-tagged wild-type SSU (H.sub.6Ub-SSU). When Rubisco subunit expression was induced with IPTG, all three Rubisco subunit peptides were produced (i.e. LSU, SSU and H.sub.6UbSSU). Some were assembled into functional Rubisco hexadecamers made up of 8LSU octameric cores and different ratios of SSU (at most 8) and H.sub.6UbSSU. Rubiscos with one or more H.sub.6 tags were easily purified from other E. coli proteins using immobilized metal affinity chromatography (IMAC), and the H.sub.6Ub sequence was then cleaved with a H.sub.6Ub-specific protease which, along with unassembled H.sub.6Ub peptides, may be removed by IMAC. Using this method, purified Rubiscos were isolated from the E. coli in approximately 1 hour.
(178) Eighteen of the mutant Rubiscos and five control mutants specified in Example 5 and identified in Tables 3 or 4, as well as wild type, were expressed and purified using the above procedure.
Example 7
In Vitro Kinetic Assay of Mutant Rubiscos
(179) Rubisco proteins purified using the method described in Example 6 were used to measure the Michaelis constants for CO.sub.2 (K.sub.c) and substrate-saturated carboxylation rates (V.sub.c.sup.max) using .sup.14CO.sub.2-fixation assays at 25° C., pH 8 according to the method described in Andrews (1988), the entire contents of which are incorporated herein by reference.
(180) The purified enzyme was pre-incubated at 25° C. for 30 min in buffer containing 20 mM MgCl.sub.2 and 25 mM NaHCO.sub.3, and K.sub.c measurements were performed in nitrogen-sparged, septum-capped scintillation vials. The reactions were initiated by adding 10 μL of purified enzyme to 0.5 mL of N.sub.2-equilibrated assay buffer (100 mM EPPS-NaOH, 20 mM MgCl.sub.2, 0.8 mM ribulose-P.sub.2, 0.1 mg/ml carbonic anhydrase) containing varying concentrations of NaH.sup.14CO.sub.3.
(181) The Michaelis constants were determined by fitting the data to the Michaelis-Menten equation. Rubisco content in the assays was quantified using the [2-.sup.14C] carboxyarabinitol-P.sub.2 (.sup.142CABP) binding assay described by Ruuska et al. (1998) and Whitney and Andrews (2001). The substrate-saturated carboxylation turnover rate (k.sup.c.sub.cat) was calculated by dividing the extrapolated maximal carboxylase activity (V.sub.c.sup.max) by the concentration of Rubisco active sites in the assay. The purified Rubisco preparations were also used to measure the CO.sub.2/O.sub.2 specificity (S.sub.c/o) at pH 8.3 as described in Kane et al. (1994).
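The calculation of K.sub.c, the maximal rate and k.sup.c.sub.cat can be illustrated as follows. The text does not state which fitting procedure was used; the sketch substitutes a Hanes-Woolf linearisation ([S]/v plotted against [S]) solved by ordinary least squares so that it needs only the standard library. The data are synthetic and noiseless, and the active-site concentration of 0.2 μM is a hypothetical value chosen so that the result reproduces the wild-type figures of Table 5.

```python
def michaelis_menten(s, vmax, km):
    """Rate predicted by the Michaelis-Menten equation at substrate level s."""
    return vmax * s / (km + s)


def fit_hanes_woolf(substrate, rates):
    """Estimate (Vmax, Km) from the linear form S/v = S/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


def turnover_rate(vmax, active_site_conc):
    """k_cat: maximal rate divided by the concentration of active sites."""
    return vmax / active_site_conc


# Synthetic, noiseless data (assumption): Vmax 2.64 uM/s, Km 203 uM.
co2 = [25.0, 50.0, 100.0, 200.0, 400.0, 800.0]
v = [michaelis_menten(s, 2.64, 203.0) for s in co2]
vmax, km = fit_hanes_woolf(co2, v)
print(vmax, km, turnover_rate(vmax, 0.2))  # 0.2 uM active sites is hypothetical
```

With noisy experimental data a nonlinear least-squares fit to the untransformed equation is generally preferred, because linearisation distorts the error structure.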
(182) A summary of the results obtained is presented in Table 5. For specificity, average results are given as % change compared with wild type: a positive value represents an improvement. For kinetics, the values and % change compared with wild type are given; a positive % change for the catalytic rate k.sup.c.sub.cat and the catalytic efficiency k.sup.c.sub.cat/K.sub.c represents an improvement, while a negative % change for the Michaelis constant K.sub.c represents an improvement.
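The derived quantities reported in Table 5 can be reproduced from the tabulated k.sup.c.sub.cat and K.sub.c values. The sketch below uses the Table 5 entries for wild type (replicate b) and Mutant #4; the function names are hypothetical, and K.sub.c is taken to be in micromolar so that the efficiency comes out in s.sup.-1 mM.sup.-1.

```python
def catalytic_efficiency(kcat_per_s, kc_micromolar):
    """k_cat / K_c in s^-1 mM^-1, given k_cat in s^-1 and K_c in micromolar."""
    return kcat_per_s / (kc_micromolar / 1000.0)


def percent_change(mutant, wild_type):
    """Percentage change of a mutant value relative to the wild-type value."""
    return (mutant - wild_type) / wild_type * 100.0


wt_kcat, wt_kc = 13.0, 197.0  # wild type, replicate b (Table 5)
m4_kcat, m4_kc = 14.1, 198.0  # Mutant #4 (Table 5)

wt_eff = catalytic_efficiency(wt_kcat, wt_kc)
m4_eff = catalytic_efficiency(m4_kcat, m4_kc)
print(round(wt_eff), round(m4_eff))
print(round(percent_change(m4_kcat, wt_kcat)))
print(round(percent_change(m4_eff, wt_eff)))
```

For k.sup.c.sub.cat and k.sup.c.sub.cat/K.sub.c a positive result is an improvement, whereas for K.sub.c the improvement corresponds to a negative result.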
(183) TABLE-US-00005 TABLE 5 Specificity and kinetic results for Synechococcus sp. PCC7942 mutant Rubiscos.
Columns: Mutant # (modified residue(s)).sup.c | SEQ ID.sup.e | S.sub.c/o average.sup.d (% change from wild type) | Kinetics: k.sup.c.sub.cat (s.sup.-1) (% change from wild type) | Kinetics: K.sub.c (μM) (% change from wild type) | Kinetics: k.sup.c.sub.cat/K.sub.c (s.sup.-1 mM.sup.-1) (% change from wild type)
Wild-Type | 22-24 | 41.6 ± 0.7 | 13.2 ± 0.2.sup.a; 13.0 ± 0.2.sup.b | 203 ± 10.sup.a; 197 ± 10.sup.b | 65.sup.a; 66.sup.b
#1a (Y25W, D51I.sup.Gp) | 25, 26 | (3.4%) | 11.5 ± 0.2.sup.a (-13%) | 176 ± 6.sup.a (-13%) | 65 (0%)
#1b (Y25W, D51V.sup.Gm) | 27, 28 | (1.1%) | | |
#4 (T23G, K81R) | 29, 30 | (10.0%) | 14.1 ± 0.2.sup.b (8%) | 198 ± 8.sup.b (0%) | 71 (8%)
#5a (G54A.sup.Gp, C84A, I87V) | 31, 32 | (6%) | 11.3 ± 0.3.sup.a (-14%) | 182 ± 12.sup.a (-10%) | 62 (-4%)
#5b (G54S.sup.Gm, C84A, I87V) | 33, 34 | (1.9%) | 10.6 ± 0.2.sup.a (-20%) | 167 ± 8.sup.a (-18%) | 63 (-3%)
#6a (T23G, Y25W, D51I.sup.Gp, K81R) | 35, 36 | (8.5%) | 12.2 ± 0.2.sup.b (-6%) | 185 ± 9.sup.b (-6%) | 66 (0%)
#7a (Y25W, D51I.sup.Gp, G54A.sup.Gp, C84A, I87V) | 37, 38 | (8.8%) | 12.4 ± 0.2.sup.b (-5%) | 195 ± 11.sup.b (-1%) | 64 (-3%)
#7b (Y25W, D51I.sup.Gp, G54S.sup.Gm, C84A, I87V) | 39, 40 | (1.4%) | n.m. | n.m. | n.m.
#7d (Y25W, D51V.sup.Gm, G54A.sup.Gp, C84A, I87V) | 41, 42 | (0.9%) | n.m. | n.m. | n.m.
#8 (V121I, M297G, V300T) | 43, 44 | (5.6%) | n.m. | n.m. | n.m.
#9 (L36I, I116L, F140L) | 45, 46 | (3.6%) | n.m. | n.m. | n.m.
#10 (L36I, I116L, V121I, F140L, M297G, V300T) | 47, 48 | (2.9%) | 10.8 ± 0.4.sup.b (-17%) | 325 ± 27.sup.b (65%) | 33 (-50%)
#12 (K18I, T23G) | 49, 50 | (1%) | n.m. | n.m. | n.m.
#14 (loop AKMGYW) | 51, 52 | (1.7%) | 7.1 ± 0.4.sup.b (-45%) | 260 ± 36.sup.b (32%) | 27 (-59%)
#17-1A (T23G, K18I, T68V, K81R) | 53, 54 | (0.6%) | n.m. | n.m. | n.m.
#18-1A (T23G, K81R, P104E) | 55, 56 | (0.5%) | n.m. | n.m. | n.m.
#19-1A (T23G, D19P, K81R) | 57, 58 | (2.0%) | n.m. | n.m. | n.m.
#21-1A, B (T23G, K81R, V121I, M297G, V300T) | 59, 60 | (0.9%) | n.m. | n.m. | n.m.
#22-1A (T23G) | 61, 62 | (5.0%) | n.m. | n.m. | n.m.
#23-1A (K81R) | 63, 64 | (7.0%) | n.m. | n.m. | n.m.
#24 (V121I, M297G) | 65, 66 | n.m. | n.m. | n.m. | n.m.
#25 (M297G) | 67, 68 | n.m. | n.m. | n.m. | n.m.
#26-1B (V121I) | 69, 70 | n.m. | n.m. | n.m. | n.m.
.sup.a,b Refers to replicate measurements on different Rubisco samples done on different days.
.sup.c Where given, specifies that the mutations are in Sub-regions 1A, 1A,B or 1B.
.sup.d Average of replicate measurements on different Rubisco samples done on different days.
.sup.e Sequence ID numbers correspond to those in the sequence file. The numbers in the SEQ ID column correspond to the sequences on the computer-generated sequence listings.
n.m. represents not measured at time of filing.
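The catalytic efficiencies and percentage changes reported in Table 5 follow from simple arithmetic on the measured k.sup.c.sub.cat and K.sub.c values. The following is a minimal Python sketch of that arithmetic, assuming K.sub.c is expressed in μM and efficiency in s.sup.-1 mM.sup.-1; the function names are illustrative only and are not part of the method described here.

```python
def catalytic_efficiency(kcat_per_s, kc_micromolar):
    """Return kcat/Kc in s^-1 mM^-1, given kcat in s^-1 and Kc in micromolar."""
    return kcat_per_s / kc_micromolar * 1000.0  # 1 mM = 1000 uM


def percent_change(mutant_value, wild_type_value):
    """Return the % change of a mutant value relative to the wild-type value."""
    return (mutant_value - wild_type_value) / wild_type_value * 100.0


# Table 5 values: wild type (replicate set b) and Mutant #4 (T23G, K81R)
wt_eff = catalytic_efficiency(13.0, 197.0)
m4_eff = catalytic_efficiency(14.1, 198.0)
m4_change = percent_change(m4_eff, wt_eff)
print(round(wt_eff), round(m4_eff), round(m4_change))
```

With these inputs the rounded values agree with the wild-type and Mutant #4 entries in the efficiency column of Table 5.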
(184) Of the eighteen mutants and two single (control) mutants for which specificity measurements were made, all showed specificity comparable with wild type, i.e. none was significantly impaired, while five mutants and the two controls showed improvements of 5% or better. Of interest, four of these include the mutations T23G and/or K81R. Also, of interest is that mutants with ACR variants (#1a and #1b for position 51, #5a and #5b for position 54, and #7a, #7b and #7d for positions 51 and 54) showed significant differences (in the order of 8% for the Mutant #5 and #7 variants) demonstrating the sensitivity of specificity to these changes which spatially are relatively far from the active site (see
(185) Of the 9 mutants for which k.sup.c.sub.cat was assayed, most showed slightly to significantly lower (i.e. poorer) values compared with wild type, with the exception of Mutant #4, which showed a significant improvement of 8%. The corresponding 9 mutants assayed for K.sub.c exhibited a range of values, with 4 showing moderately improved CO.sub.2 binding (#1a, #5a, #5b, #6a), 2 showing little change (#4, #7a) and 2 showing significant impairment (#10, #14). The overall catalytic efficiency values (k.sup.c.sub.cat/K.sub.c) similarly show a range, from small to significant improvement (#4, #5a, #5b), to little change (#1a, #6a, #7a), to significant impairment (#10, #14) when compared with wild type. It is notable that in tobacco (see Example 11), Mutant #23-1A showed a significantly poorer k.sup.c.sub.cat value and also overall catalytic efficiency, whereas Mutant #4 showed significant improvements in all three kinetic measures.
(186) From the results of the initial set of mutants predicted by the core method (#1-#14), the stand-out mutants in terms of overall performance were Mutants #4 and #6a, which showed improvements in specificity and kinetic efficiency of 10% and 8%, and 8.5% and 0%, respectively. The properties of the more complex mutant (#6a), which comprised the mutations in #4 and #1a, were overall inferior to Mutant #4 (the best mutant on the current list). This observation suggests that other mutations may be more advantageously grouped with those for Mutant #4 than those of Mutant #1a. As discussed in Example 4, this observation prompted the development of the concept of Sub-regions and the extended method, which led to the first predictions of Mutants #17-1A, #18-1A and #19-1A, as well as a more complex mutant including CRs from Sub-region 1B also (#21-1A,B). The specificity results for these extended-method predictions showed little change compared with wild type.
Example 8
Directed Evolution of Synechococcus Mutants in E. coli
(187) The phylogenetic grafted mutants represent one directed strategy for exploring regions of sequence space not sampled naturally. However, any increase in the activity of these mutants may be limited by areas of poor sequence fit, as the grafted residues may not be optimized for the host Rubisco structure. The extended phylogenetic grafting method provides a rational in silico strategy which may be used for optimising lead mutants, including relieving steric conflicts by recruitment of SvRs and other naturally occurring variant residues which may be identified as complementary to mutations in the leads. An alternative option to minimize the effects of these conflicts may be to use an experimental directed evolution method to optimize them, i.e. to use these partially optimized Rubiscos as starting points. These mutants also provide in themselves a novel starting point for directed evolution, as they have different potential for exploring sequence space compared with wild type.
(188) As detailed in Example 6, unlike all other Form I Rubiscos (i.e. hexadecameric) from eukaryotic organisms, Rubisco from Synechococcus PCC7942 can assemble correctly in E. coli and has been chosen for a mutant screening procedure. Using methods described in Examples 5-7, Candidate Mutant predictions (which may include groups of correlated mutations, and independent mutations in different structural regions surrounding the Target Residues) can be screened initially in E. coli to confirm they are active and to obtain in vitro kinetic constants for comparison against each other and wild type. Mutants with up to 10-12 mutations can be produced routinely using the current technology. As detailed in Examples 5-7, selected single mutants can be made to test the general hypothesis underlying the phylogenetic grafting method that single mutants are likely to be poorly active/inactive, and that at least two correlated mutations are necessary to produce an acceptably active enzyme.
(189) A system suitable for directed evolution of Rubisco in E. coli has recently been reported (Mueller-Cajar et al., 2007). This uses an engineered E. coli strain, MM1, whose growth can be made dependent on functional expression of Rubisco, when co-expressed with phosphoribulokinase (PRK). Glycolysis in MM1 was blocked by deletion of the glyceraldehyde 3-phosphate dehydrogenase gene (gapA) and a metabolic bypass shunt comprising a Synechococcus PRK and Rhodospirillum rubrum Rubisco was introduced. As a result, MM1 is dependent on functional Rubisco expression to metabolize the product of PRK catalysis, ribulose-1,5-bisphosphate, that is toxic to E. coli.
(190) This general method may be used to evaluate whether Rubiscos with significantly enhanced activity can be more efficiently evolved starting from inactivated forms of the most promising Synechococcus sp. PCC7942 grafted mutants detailed in Example 7 and Table 5. For this purpose, randomly mutagenised libraries (made using methods reported by Mueller-Cajar et al., 2007) of these inactivated genes may be transformed into MM1 cells grown under differing selective conditions (e.g. varying the growth CO.sub.2/O.sub.2 pressures, changing the extent of PRK production). Colonies expressing evolved Rubisco variants with improved fitness (i.e. those that survive the screen) may be isolated, sequenced and the kinetics of the purified mutated Rubiscos characterised as detailed in Example 7.
Example 9
Testing Biochemical and Physiological Competence of Synechococcus Mutants In Vivo
(191) The in vitro functional tests in Example 7 identified several Synechococcus mutants with improved Rubisco activity, which were therefore necessarily correctly folded and assembled when expressed from the E. coli expression system and purified as described in Example 6. In the Rubisco re-engineering strategy, Synechococcus has been used as the most convenient initial host for experimental testing of mutant predictions to identify lead candidates. Owing to recent advances in engineering mutant Rubisco in a model plant, tobacco, using plastid transformation, as described in Example 10, in the work described here a Candidate Mutant (#4) identified as a promising Synechococcus mutant was tested directly in the test flowering plant (tobacco) without undertaking the intermediate step, shown in
(192) In Synechococcus sp. PCC7942, the Rubisco genes (rbcLS and rbcSS) are encoded by a single operon on the chromosome (of which there are typically 5 copies per cell) and, analogous to E. coli, this cyanobacterial strain is naturally competent and can be genetically transformed either by targeted modification of its chromosome (e.g. gene deletion, gene substitution) by homologous recombination or by stable retention of plasmid shuttle vectors within its cells.
(193) Synechococcus PCC7942 mutant strains have been developed (see Emlyn-Jones et al., 2006 and Price et al., 1993 for examples). A Synechococcus PCC7942 mutant strain in which the chromosomal rbcLS-rbcSS operon is deleted (i.e. a 7942ΔrbcLS strain) may be used to facilitate the re-introduction of mutant Rubisco rbcLS-rbcSS genes. As Synechococcus sp. PCC7942 cannot grow heterotrophically (i.e. on an external carbon source) and requires a Rubisco for growth, 7942ΔrbcLS strains can be generated by: (1) introducing a second Rubisco gene (e.g. rbcM coding for the structurally different Form II Rubisco homodimer (L.sub.2) from the bacterium Rhodospirillum rubrum or the native L.sub.8S.sub.8 PCC7942 Rubisco) on a plasmid shuttle vector into Synechococcus sp. PCC7942, then (2) homologously recombining in an antibiotic resistance gene km.sup.R to replace the rbcLS-rbcSS operon in each chromosome copy (i.e. so the mutation can be fully segregated). Synechococcus PCC7942 cells transformed with rbcM expressed on a shuttle vector can be subsequently transformed with another plasmid to homologously replace the chromosomal rbcLS-rbcSS coding region with a km.sup.R gene. Upon isolation of completely segregated rbcLS-rbcSS::km.sup.R transformants (i.e. all the chromosomes have rbcLS-rbcSS replaced with km.sup.R) the PCC7942ΔrbcLS cells may be used to homologously re-introduce the mutated rbcLS* and rbcSS genes and the cells cured of the shuttle vector.
(194) Using established techniques (e.g. see Emlyn-Jones et al., 2006), the phenotype of the transformed cells may then be comprehensively characterised biochemically, for example, assessing whether the mutated Rubisco LSU subunits are readily folded and assembled properly, and physiologically, for example, assessing whether there are differences in photosynthetic capacity, inorganic carbon partitioning or growth rate.
Example 10
Generation of Rubisco Plastome Transformants in Tobacco
(195) A strategy for re-engineering Rubisco which is applicable to higher plant Rubiscos is by plastome transformation using mutated rbcL* genes. The plastome of tobacco is readily transformable (Andrews and Whitney, 2003) and was used as a model to conduct proof-of-principle tests for the transformation of other plants.
(196) The most promising Synechococcus Rubisco Mutant #4 was used as the initial test case. The component single mutants, T23G (Mutant #22-1A), and K81R (Mutant #23-1A), and the more complex mutant #6a (T23G, Y25W, E51I, K81R) were also tested. The nucleotide and protein SEQ ID NOS for tobacco Mutants #4, #22-1A, #23-1A and #6a are given in Table 6. As these residues are Candidate Residues (see Table 2) these same mutations were used in tobacco as in Synechococcus.
(197) Transplastomic tobacco lines are available which allow more rapid screening of the kinetics of mutated tobacco Rubiscos than the traditional lengthy chloroplast transformation methods in which the native rbcL genes in the plastome are substituted with mutated versions (Andrews and Whitney, 2003 (supra)) using the biolistic transformation technique described in Svab and Maliga (1993). In a recent improvement to the method for transforming mutated or foreign rbcL genes back into the tobacco plastome, the native rbcL gene was replaced with the rbcM gene from Rhodospirillum rubrum and the aadA selectable marker gene (coding for spectinomycin resistance), and the aadA gene was then removed to produce marker-free (ΔaadA) tobacco-rubrum transplastomic lines (Whitney and Sharwood, 2008, the entire contents of which are incorporated by reference). These lines were generated by biolistically transforming the plastome of wild-type tobacco (Nicotiana tabacum L. cv Petit Havana [N,N]) with plasmid p.sup.cmtrLA (Genbank accession number AY827488). The aadA gene in the transformants is flanked by 34-bp loxP sites that enable its excision by CRE-lox recombination. To excise aadA, leaves from a p.sup.cmtrLA-transformed line were biolistically bombarded with the CRE-expressing plasmid pKO27 (Corneille et al., 2001). The bombarded leaves were dissected (0.5 cm.sup.2) and propagated in kanamycin-selective medium (agar-solidified Murashige-Skoog salts containing 3% (w/v) sucrose, 15 μg ml.sup.-1 kanamycin and hormones (Svab and Maliga, 1993)). The first plantlets to emerge from the bleached bombarded leaf sections were transferred to MS medium (selective medium without kanamycin or hormones) and loss of aadA was confirmed by routine PCR analyses.
(198) The ΔaadA tobacco-rubrum lines permit re-use of the aadA marker gene for subsequently transforming the plastome, such as re-transforming back in mutated tobacco rbcL* variants to replace rbcM. The transformation efficiency of replacing rbcM in the ΔaadA tobacco-rubrum line with variant rbcL* genes is 3- to 10-fold higher than that of transforming wild-type tobacco, and the procedure is immune to unwanted recombination events that may occur when transforming rbcL* genes into wild-type tobacco plastomes. The method allowed the production of transformants containing only rbcL*-transformed plastome copies (i.e. homoplasmic transformants) within 6 to 8 weeks, as the plastome copies containing the rbcM gene were rapidly eliminated. As the R. rubrum Rubisco is a small homodimer of LSUs (L.sub.2, 100 kDa), rbcL*-transformed lines producing the larger form I L.sub.8S.sub.8 Rubiscos (520 kDa) may be identified by separating the soluble leaf protein by non-denaturing polyacrylamide gel electrophoresis as described in Whitney and Sharwood (2007), with homoplasmicity measured by the absence of L.sub.2 Rubisco.
(199) This system is potentially adaptable to other plants where plastid transformation has been reported. The genetic transformation of plastids in a variety of different plants has been reported, for example in Koop et al. (2007), the entire contents of which are incorporated herein by reference.
(200) A straightforward variation of the transformation system (Whitney and Sharwood, 2008) offers the potential to rapidly test the kinetics of predicted mutant Rubiscos of other flowering plants or crops of interest without performing a full plastid transformation in the plant of interest itself. Sharwood et al. (2008), the entire contents of which are incorporated herein by reference, have shown that the rbcL gene of another plant (sunflower) can be successfully transformed into the ΔaadA tobacco-rubrum line to produce active chimeric Rubisco consisting of sunflower LSUs and tobacco SSUs, which can be isolated and characterized kinetically. Its kinetic parameters mimic those of sunflower Rubisco. Use of this method would allow development and optimisation of Rubisco phenotype for a range of mutants of different plants of interest, using the convenient tobacco transformation model.
(201) Mutations for Mutants #4, #22-1A, #23-1A and #6a were made to the wild-type tobacco (Nicotiana tabacum) plastome rbcL coding sequence using the QuickChange multi-mutagenesis kit (Stratagene) with appropriate primers, and introduced into the tobacco plastome transforming plasmid pLEV1, where selection of transformants is facilitated by the incorporation of a promoter-less aadA gene downstream of rbcL (Whitney et al., 1999). The pLEV1-derived transforming plasmids carrying the mutagenized tobacco rbcL* copies with the nucleotide sequences coding for Mutants #4, #22-1A, #23-1A and #6a, as well as wild type, were biolistically transformed into a ΔaadA tobacco-rubrum line and spectinomycin-resistant plantlets were selected as described (Svab and Maliga, 1993). Transformants where the rbcM had been replaced with the rbcL* or rbcL genes were identified by the production of L.sub.8S.sub.8 Rubisco using non-denaturing polyacrylamide gel electrophoresis. A sample gel for transformants for wildtype, and Mutants #4 and #23-1A is shown in
Example 11
Biochemical Characterization of Tobacco Rubisco Transformants
(202) The following methods were used to extract and purify Rubisco expressed in homoplasmic transplastomic and wild-type tobacco cells to carry out kinetic analyses.
(203) Radiolabeled .sup.14CO.sub.2 fixation assays were used to measure the substrate-saturated turnover rate (k.sup.c.sub.cat) and the Michaelis constant for CO.sub.2 at 0% (K.sub.c.sup.0%) or 21% O.sub.2 (K.sub.c.sup.air) using soluble leaf protein extract. Leaf discs (1 cm.sup.2) were taken during the photoperiod and extracted on ice using glass homogenisers (Wheaton, USA) into 0.8 ml CO.sub.2-free extraction buffer (50 mM Bicine-NaOH, pH 8.0, 1 mM EDTA, 2 mM DTT, 1% (v/v) plant protease inhibitor cocktail (Sigma-Aldrich) and 1% (w/v) PVPP). The sample was centrifuged (36,000 g, 5 min, 4° C.) and the soluble protein was incubated (activated) with NaHCO.sub.3 and MgCl.sub.2 (15 mM each) for 15 min. The activated protein was used either to measure Rubisco content, by incubating duplicate aliquots with 40 μM of .sup.142-CABP and recovering the Rubisco-bound-.sup.142-CABP by gel filtration (Ruuska et al., 1998), or to measure K.sub.c.sup.0% and K.sub.c.sup.air at 25° C., pH 8.0 using .sup.14CO.sub.2 fixation assays (Andrews 1988; Whitney and Sharwood, 2007). To confirm the samples used were homoplasmic (i.e. only contain plastome copies transformed with the rbcL* genes and none with rbcM) the protein was also separated on non-denaturing polyacrylamide gels (
(204) The kinetic assays were initiated by adding activated soluble protein extract into septum-capped scintillation vials containing either N.sub.2 (for K.sub.c.sup.0%) or CO.sub.2-free air (for K.sub.c.sup.air) equilibrated assay buffer (100 mM Bicine-NaOH, 15 mM MgCl.sub.2, 0.6 mM ribulose-P.sub.2, 0.1 mg·ml.sup.-1 carbonic anhydrase) containing 0 to 90 μM .sup.14CO.sub.2. Ribulose-P.sub.2 was synthesized according to Kane et al. (1998). The assays were stopped after 1 min with 0.5 volumes of 25% (v/v) formic acid, the reactions were dried at 80° C., the residue was dissolved in water, two volumes of scintillant were added, the samples were vortexed, and .sup.14C was measured by scintillation counting. K.sub.c.sup.0% and K.sub.c.sup.air were calculated from the Michaelis-Menten plot of carboxylation rate versus [CO.sub.2].
(205) Values of k.sup.c.sub.cat were calculated from comparable .sup.14CO.sub.2 fixation assays containing 15 mM NaH.sup.14CO.sub.3 by dividing the substrate-saturated carboxylation rate, V.sub.c.sup.max, by the concentration of Rubisco active sites measured by .sup.142-CABP binding (see above).
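The two calculations described above, extracting a Michaelis constant from rate versus [CO.sub.2] data and dividing V.sub.c.sup.max by the active-site amount, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the fitting procedure actually used: it substitutes a Hanes-Woolf linearisation with ordinary least squares for whatever Michaelis-Menten fitting was applied, the data are synthetic and noise-free, and the active-site amount is a hypothetical value.

```python
def fit_michaelis_menten(co2_conc, rates):
    """Estimate (Vmax, Km) by ordinary least squares on the Hanes-Woolf form
    S/v = S/Vmax + Km/Vmax. Points with v <= 0 are skipped (S/v is undefined)."""
    points = [(s, s / v) for s, v in zip(co2_conc, rates) if v > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx                    # equals 1/Vmax
    intercept = mean_y - slope * mean_x  # equals Km/Vmax
    vmax = 1.0 / slope
    return vmax, intercept * vmax


def turnover_rate(vmax, active_sites):
    """k_cat = Vmax / active-site amount (e.g. nmol s^-1 divided by nmol sites)."""
    return vmax / active_sites


# Synthetic data generated from Vmax = 1.0 and Km = 12.2 uM (illustrative only)
co2 = [0.0, 5.0, 10.0, 20.0, 40.0, 60.0, 90.0]   # uM, spanning the 0 to 90 uM range
rates = [1.0 * s / (12.2 + s) for s in co2]
vmax, km = fit_michaelis_menten(co2, rates)
kcat = turnover_rate(vmax, 0.3125)               # hypothetical active-site amount
print(vmax, km, kcat)
```

The Hanes-Woolf form was chosen here only because it needs no external fitting library; with real, noisy assay data a non-linear least-squares fit of the Michaelis-Menten equation would normally be preferred.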
(206) Specificity measurements were done using purified Rubisco. Soluble leaf protein was extracted as described above and 0.2 mL was chromatographed through a Superdex 200HR 10/30 column equilibrated with specificity buffer (30 mM Triethanolamine pH 8.3, 30 mM Mg acetate) using an ÄKTA explorer system (APBiotech). The three peak fractions (0.3 ml) containing L.sub.8S.sub.8 Rubisco were pooled (100-150 pmol L-subunit sites) and used to measure CO.sub.2/O.sub.2 specificity at 25° C. as described (Kane et al., 1994) after equilibrating with an atmosphere containing 500 ppm CO.sub.2 in O.sub.2 controlled using three Wösthoff precision gas-mixing pumps.
(207) A summary of the results obtained is presented in Tables 6 and 7. For specificity, average results are given as % change compared with wild type: a positive value represents an improvement. For kinetics, the values and % change compared with wild type are given; a positive % change for the catalytic rate k.sup.c.sub.cat and the catalytic efficiency k.sup.c.sub.cat/K.sub.c represents an improvement, while a negative % change for the Michaelis constant K.sub.c (in air or 0% O.sub.2) represents an improvement.
(208) TABLE-US-00006 TABLE 6 Specificity and kinetic results measured in air (O.sub.2 21%) for Tobacco mutant Rubiscos.
Columns: Mutant # (modified residue(s)) | Seq ID.sup.b | S.sub.c/o average.sup.a (% change from wildtype) | Kinetics: k.sup.c.sub.cat (s.sup.-1).sup.a (% change from wildtype) | Kinetics: K.sub.c.sup.air (μM) (% change from wildtype) | Kinetics: k.sup.c.sub.cat/K.sub.c.sup.air (s.sup.-1 mM.sup.-1) (% change from wildtype)
wild-type | 71, 72 | 81.0 ± 1.6 | 3.2 ± 0.1 | 24.1 | 132
#4 (T23G, K81R) | 73, 74 | 77.6 ± 2.8 (-4%) | 3.6 ± 0.1 (13%) | 20.4 (-15%) | 176 (33%)
#23-1A (K81R) | 75, 76 | 76.1 ± 3.4 (-6%) | 3.5 ± 0.1 (9%) | 23.8 (-1%) | 147 (11%)
#22-1A (T23G) | 77, 78 | 76.2 ± 1.0 (-6%) | 3.5 ± 0.1 (9%) | 20.5 (-15%) | 173 (31%)
#6a (T23G, Y25W, E51I, K81R) | 79, 80 | 66.4 ± 4.4 (-18%) | 3.5 ± 0.1 (9%) | 19.4 (-20%) | 179 (36%)
.sup.a Average or calculated value from measurements made on 3 separate protein assays ± S.D.
.sup.b Sequence ID numbers correspond to those in the sequence listing; the first number is the nucleotide SEQ ID NO and the second number is the protein SEQ ID NO.
(209) The results in Table 6 for measurements in air (21% O.sub.2) show significant differences in specificity and kinetic parameters for the different mutants. Significant improvements over wild type are evident in both k.sup.c.sub.cat and K.sub.c.sup.air values for Mutant #4 (T23G, K81R) of 13% and 15%, respectively, producing an overall improvement in catalytic efficiency of 33%. These can be compared with k.sup.c.sub.cat values of 9%, i.e. less improvement, and K.sub.c.sup.air values of 1%, 15% and 20%, i.e. minimal (#23-1A) or similar improvement, for the single Mutants #23-1A (K81R) and #22-1A (T23G), and Mutant #6a (T23G, Y25W, E51I, K81R) which includes these two mutations. These translate into comparable improvement in catalytic efficiency of 31 and 36%, respectively, for #22-1A and #6a, but only 11% for #23-1A. The kinetic results (triplicate assays) were obtained using soluble leaf protein extracted from two or three different transformants for wildtype and mutants. The replicate isolated Rubiscos gave similar results, contributing to the small errors of the measurements.
(210) The specificity results for Mutant #4 show a modest impairment (4%) compared with wild type, while both the single mutants, #22-1A and #23-1A, show slightly greater impairment (6%). However, the specificity for Mutant #6a is greatly impaired (18%). These values were obtained from triplicate measurements, as shown in Table 6, using different purified Rubiscos from two wildtype and two of each of the mutant transformants.
(211) Consideration of the specificity and kinetic results together indicates that the improvement in carboxylation efficiency has been matched by a comparable improvement in oxygenation efficiency for Mutant #4, but by a much greater improvement for #6a. As an initial test of whether changes in oxygenation efficiency might be due to changes in k.sup.o.sub.cat or K.sub.o, K.sub.c.sup.0% was measured in 0% O.sub.2. These results, given in Table 7, show that all the mutants have poorer K.sub.c.sup.0% values than wild type (10, 30, 18 and 5% higher, respectively, for #4, #23-1A, #22-1A and #6a), but that the deterioration in the K.sub.o (i.e. K.sub.i(O.sub.2)) values is much greater: 87%, 92%, 133% and 89% higher, respectively. The significantly higher K.sub.o values for the mutants compared with wild type indicate that the mutants are less inhibited by O.sub.2. Values for K.sub.o were calculated according to Whitney et al. (1999). The improved oxygenation efficiency is explained by the values for k.sup.o.sub.cat shown in Table 7, which show significant improvements of 201%, 173%, 234% and 241%, respectively, for mutants #4, #23-1A, #22-1A and #6a compared with wildtype.
(212) TABLE-US-00007 TABLE 7 Specificity and kinetic results measured in 0% O.sub.2 for Tobacco mutant Rubiscos.
Columns: Mutant # (modified residue(s)) | Kinetics: k.sup.c.sub.cat (s.sup.-1) (% change from wild type) | Kinetics: K.sub.c.sup.0% (μM) (% change from wild type) | Kinetics: k.sup.c.sub.cat/K.sub.c.sup.0% (s.sup.-1 mM.sup.-1) (% change from wild type) | K.sub.i(O.sub.2) (μM) (% change from wild type) | k.sup.o.sub.cat (s.sup.-1) (% change from wild type)
wild-type | 3.2 ± 0.1 .sup.a | 12.2 ± 0.4 .sup.b | 262 | 259 ± 23 .sup.b | 0.83 .sup.d
#4 (T23G, K81R) | 3.6 ± 0.1 .sup.a (13%) | 13.4 ± 0.5 .sup.b (10%) | 269 (3%) | 485 ± 56 .sup.b (87%) | 1.67 .sup.d (201%)
#23-1A (K81R) | 3.5 ± 0.1 .sup.a (9%) | 15.8 ± 0.4 .sup.a (30%) | 222 (-15%) | 497 ± 70 .sup.a (92%) | 1.44 .sup.d (173%)
#22-1A (T23G) | 3.5 ± 0.1 .sup.a (9%) | 14.4 ± 0.7 .sup.c (18%) | 243 (-7%) | 603 ± 120 .sup.c (133%) | 1.94 .sup.d (234%)
#6a (T23G, Y25W, E51I, K81R) | 3.5 ± 0.1 .sup.a (9%) | 12.8 ± 0.5 .sup.a (5%) | 273 (4%) | 490 ± 68 .sup.a (89%) | 2.00 .sup.d (241%)
.sup.a Average or calculated value from measurements made on 3 separate protein assays ± S.D.
.sup.b Calculated value from measurements made on 4 separate protein assays ± S.D.
.sup.c Calculated value from measurements made on 2 separate protein assays ± S.D.
.sup.d Calculated using the equation S.sub.c/o = (k.sup.c.sub.cat/K.sub.c)/(k.sup.o.sub.cat/K.sub.o).
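The derived quantities in Table 7 can be approximately reproduced from the measured values. The Python sketch below rearranges the equation in footnote d to obtain k.sup.o.sub.cat, and estimates K.sub.o from the two Michaelis constants. Two points are assumptions not stated in the text: that the K.sub.o calculation treats O.sub.2 as a competitive inhibitor of carboxylation, so that K.sub.c.sup.air = K.sub.c.sup.0% (1 + [O.sub.2]/K.sub.o), and that the dissolved O.sub.2 concentration in air-equilibrated buffer at 25° C. is about 252 μM. The function names are illustrative.

```python
O2_AIR_MICROMOLAR = 252.0  # assumed dissolved O2 at 21% O2, 25 C; not given in the text


def ko_from_kc(kc_air, kc_zero, o2=O2_AIR_MICROMOLAR):
    """Ko (= Ki(O2)) in uM, assuming competitive inhibition of carboxylation by O2:
    Kc_air = Kc_0% * (1 + [O2]/Ko)."""
    return o2 / (kc_air / kc_zero - 1.0)


def ko_cat(kc_cat, kc, ko, specificity):
    """Footnote d rearranged: k_o_cat = (k_c_cat / Kc) * Ko / S_c/o."""
    return kc_cat / kc * ko / specificity


# Wild-type tobacco values from Tables 6 and 7
wt_ko = ko_from_kc(kc_air=24.1, kc_zero=12.2)
wt_ko_cat = ko_cat(kc_cat=3.2, kc=12.2, ko=259.0, specificity=81.0)
print(round(wt_ko), round(wt_ko_cat, 2))
```

Under these assumptions the wild-type results fall within rounding distance of the tabulated K.sub.i(O.sub.2) and k.sup.o.sub.cat values; small differences are expected because the tabulated inputs are themselves rounded.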
(213) In summary, the results show that Mutant #4 retains its superior properties in tobacco, although they are expressed more as improvements in catalytic efficiency than as roughly equal improvements in specificity and efficiency, as is the case for the Synechococcus Mutant #4. Although Mutant #6a shows comparable catalytic efficiency to Mutant #4, it has significantly impaired specificity S.sub.c/o.
Example 12
Simulation of Phenotype of Tobacco Rubisco Transformants
(214) Results for Synechococcus and tobacco mutants in Tables 5 and 6 show a range of levels of improvement in key kinetic parameters (S.sub.c/o, k.sup.c.sub.cat, K.sub.c) compared with wildtype. These parameters are also expected to show different temperature dependence compared with wildtype. Thus, for each branch of photosynthetic organism the mutants show different profiles of improvements in the kinetic parameters; that is, there is a range of percentage changes in the components of the parameter set {S.sub.c/o, k.sup.c.sub.cat, K.sub.c} for a given mutant. This parameter set is termed Rubisco phenotype.
(215) The rate of CO.sub.2 assimilation in C.sub.3 plants reflects Rubisco's kinetic properties and content in the plant (Farquhar et al., 1980; von Caemmerer, 2000). This correlation has been validated for mutant, foreign-transformant or differently expressed tobacco Rubiscos. Methods for simulating CO.sub.2 assimilation under variable growth conditions, such as CO.sub.2 concentration, water, nutrients, temperature and light intensity, have been used to predict photosynthetic performance in leaves using Rubisco kinetic data for a range of plants (von Caemmerer, 2000, the entire contents of which are incorporated herein by reference). An analogous study of performance within the leaf canopy of the whole plant has been reported (Zhu et al., 2004, the entire contents of which are incorporated herein by reference). By these means it is possible to predict how a plant with a given Rubisco and Rubisco phenotype, as determined by kinetic measurements in vitro, would perform under different sets of growth conditions in planta. There is also extensive experimental evidence that increases in leaf photosynthesis are translated into increases in biomass and crop yield. Together these studies have shown that increased efficiency of photosynthesis from improved Rubiscos benefits plant growth, improves water-use efficiency and increases the C/N ratio.
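As an illustration of how in vitro kinetic constants feed into such simulations, the Python sketch below evaluates the Rubisco-limited rate expression of the Farquhar et al. (1980) model, A = V.sub.c.sup.max (C - Γ*)/(C + K.sub.c(1 + O/K.sub.o)), with Γ* = 0.5 O/S.sub.c/o. It is a simplified sketch, not the simulation used here: the rate is expressed per active site (V.sub.c.sup.max replaced by k.sup.c.sub.cat), respiration and RuBP-regeneration limits are ignored, and the chloroplast CO.sub.2 (8 μM) and O.sub.2 (252 μM) concentrations are assumed values.

```python
def rubisco_limited_rate(c, o, kcat, kc, ko, specificity):
    """Net carboxylation per active site (s^-1) under Rubisco limitation.

    c, o : CO2 and O2 concentrations (uM); kc, ko : Michaelis constants (uM);
    kcat : carboxylation turnover (s^-1); specificity : S_c/o (dimensionless).
    """
    gamma_star = 0.5 * o / specificity       # CO2 compensation point without respiration
    return kcat * (c - gamma_star) / (c + kc * (1.0 + o / ko))


# Kinetic constants from Tables 6 and 7 (Kc measured at 0% O2, Ko = Ki(O2))
wild_type = dict(kcat=3.2, kc=12.2, ko=259.0, specificity=81.0)
mutant_4 = dict(kcat=3.6, kc=13.4, ko=485.0, specificity=77.6)

CO2_UM, O2_UM = 8.0, 252.0  # assumed chloroplast concentrations, for illustration only
rate_wt = rubisco_limited_rate(CO2_UM, O2_UM, **wild_type)
rate_m4 = rubisco_limited_rate(CO2_UM, O2_UM, **mutant_4)
print(rate_wt, rate_m4)
```

Under these assumed conditions the computed rate for Mutant #4 exceeds that for wild type, because its higher k.sup.c.sub.cat and higher K.sub.o outweigh its slightly lower specificity; different assumed CO.sub.2 or O.sub.2 levels would shift the balance between these terms.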
(225) The effectiveness of the double mutation in tobacco mutant #4 in enhancing carbon assimilation may be seen from analogous modelling of the kinetic data (Tables 6 and 7) of mutant #23-1A, which contains one of its mutations (K81R). Mutant #23-1A has been modelled with the kinetic profile of {S.sub.c/o=-6%; k.sup.c.sub.cat=+9%; K.sub.c=+30%; K.sub.o=+92%}. Compared with the results in
(227) In summary, this analysis demonstrates that particular improved mutant Rubisco phenotypes may be more suitable for particular growing conditions. The methods described herein provide the capability to produce mutant Rubiscos with kinetic profiles optimized for particular preferred plant growing conditions. It is expected that particular Rubisco phenotypes will show a range of benefits in the phenotype of the plant, such as faster growth rate and shorter time to flowering, lower requirements for water and/or nitrogen fertilizer, or ability to grow efficiently at higher temperatures. These are expected to translate into increased productivity of plant growth (and grain production) under growth conditions such as drought and/or heat stressed environments (hot arid climates), nutrient-poor soils, or low-light conditions with a short growing season (higher latitudes). Accordingly, the methods described in this example allow the prediction of the rate of CO.sub.2 assimilation by a plant expressing a Rubisco, such as a mutant Rubisco produced by methods described herein, by modelling based on parameters for Rubisco functional properties obtained by in vitro measurements. The methods described in this example thus allow the prediction of plant performance under both optimal growth conditions and sub-optimal growth conditions, such as where illumination, water, or nitrogen are limiting, or where temperature is elevated.
Example 13
Experimental Characterization of Phenotype of Tobacco Rubisco Transformants
(228) Homoplasmic transplastomic lines and wild-type tobacco controls are grown to maturity in soil in controlled environment growth chambers or glass houses and standard physiological tests are undertaken (as described in Whitney et al., 1999; Whitney and Andrews, 2001; and Whitney et al., 2001, the entire contents of which are incorporated herein by reference). Tests comprise an assessment of growth rate, biomass production, leaf area index, Rubisco mRNA and protein content, carbon to nitrogen ratios of plant leaves, starch content, and photosynthetic performance.
(229) These tests are performed under optimum conditions (high light, non-limiting water and nutrients, temperature set at 25° C.), under resource-limiting conditions (e.g. reduced N, water or CO.sub.2) and at elevated temperatures. Tests under optimum conditions assess differences in various growth (e.g. exponential growth rate, biomass, leaf area index), biochemical (e.g. Rubisco mRNA and enzyme content, leaf C:N ratio, starch content) and metabolite (e.g. RuBP:PGA) indices. Tests under limiting conditions assess the performance of the mutants under growth conditions mimicking environmental stress, such as drought and/or heat stress (hot arid climates), nutrient-poor soils, or low-light conditions with a short growing season (higher latitudes), as detailed in Example 12. Gas-exchange measurements of photosynthetic performance at varying CO.sub.2 and light levels are performed on comparable fully expanded leaves from younger plants during their exponential growth phase (20-30 cm tall). The photosynthetic rates are quantified relative to Rubisco active-site content in the assayed leaves.
(230) As demonstrated by the models in Example 12, and reported in the literature (Parry et al., 2005), improved photosynthesis may be obtained in different limiting growth conditions by different mutant Rubisco phenotypes.
REFERENCES
(231) Altschul S F, Madden T L, Schaffer A A, Zhang J, Zhang Z, Miller W, Lipman D J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389-3402.
Andrews T J. (1988) Catalysis by cyanobacterial ribulosebisphosphate carboxylase large subunits in the complete absence of small subunits. J. Biol. Chem. 263, 12213-12220.
Andrews T J, Whitney S M. (2003) Manipulating ribulose bisphosphate carboxylase/oxygenase in the chloroplasts of higher plants. Arch. Biochem. Biophys. 414, 159-169.
Baker R T, Catanzariti A M, Karunasekara Y, Soboleva T A, Sharwood R, Whitney S, Board P G. (2005) Using deubiquitylating enzymes as research tools. Methods Enzymol. 398, 540-554.
Case D A, Darden T A, Cheatham III, T E, Simmerling C L, Wang J, Duke R E, Luo R, Merz K M, Pearlman D A, Crowley M, Walker R C, Zhang W, Wang B, Hayik S, Roitberg A, Seabra G, Wong K F, Paesani F, Wu X, Brozell S, Tsui V, Gohlke H, Yang L, Tan C, Mongan J, Hornak V, Cui G, Beroza P, Mathews D H, Schafmeister C, Ross W S, Kollman P A. (2006) AMBER 9, University of California, San Francisco.
Ciniglia C, Yoon H S, Pollio A, Pinto G, Bhattacharya D. (2004) Hidden biodiversity of the extremophilic Cyanidiales red algae. Mol. Ecol. 13, 1827-1838.
Corneille S, Lutz K, Svab Z, Maliga P. (2001) Efficient elimination of selectable marker genes from the plastid genome by the CRE-lox site-specific recombination system. Plant J. 27, 171-178.
Cummins P L. (1996) Molecular Orbital Programs for Simulations (MOPS), Australian National University, Canberra.
Cummins P L, Gready J E. (1997) A coupled semiempirical molecular orbital and molecular mechanical model (QM/MM) for organic molecules in aqueous solution. J. Comput. Chem. 18, 1496-1512.
Cummins P L, Gready J E. (1998) A molecular dynamics and free energy perturbation (MD/FEP) study of the hydride-ion transfer step in dihydrofolate reductase using a combined quantum and molecular mechanical (QM/MM) model. J. Comput. Chem. 19, 977-988.
Cummins P L, Gready J E. (1999) Coupled semiempirical quantum mechanics and molecular mechanics model (QM/MM) calculations on the aqueous solvation energies of ionised molecules. J. Comput. Chem. 20, 1028-1038.
Cummins P L, Gready J E. (2003) Computational methods for the study of enzymic reaction mechanisms II: An overlapping mechanically embedded method for hybrid semiempirical-QM/MM calculations. THEOCHEM 632, 245-255.
Cummins P L, Gready J E. (2005) Computational methods for the study of enzymic reaction mechanisms III: a perturbation plus QM/MM approach for calculating relative free energies of protonation. J. Comput. Chem. 26, 561-568.
Cummins P L, Rostov I, Gready J E. (2007) Calculation of a complete enzymic reaction surface: reaction and activation free energies for hydride-ion transfer in dihydrofolate reductase. J. Chem. Theor. Comput. 3, 1203-1211.
Emlyn-Jones D, Woodger F J, Price G P, Whitney S M. (2006) RbcX can function as a rubisco chaperonin, but is non-essential in Synechococcus PCC7942. Plant Cell Physiol. 47, 1630-1640.
Evans J R, Austin R B. (1986) The specific activity of ribulose-1,5-bisphosphate carboxylase in relation to genotype in wheat. Planta 167, 344-350.
Farquhar G D, von Caemmerer S, Berry J A. (1980) A biochemical model of photosynthetic CO.sub.2 assimilation in leaves of C.sub.3 species. Planta 149, 78-90.
Fersht A. (1998) Structure and mechanism in protein science: guide to enzyme catalysis and protein folding. W. H. Freeman & Co., 1998.
Frey P A, Hegeman A. (2007) Enzymatic reaction mechanisms. Oxford University Press USA, 2007.
Frisch M J, (80 co-authors) and Pople J A. (2004) Gaussian 03, Revision C.02, Gaussian Inc., Wallingford, Conn.
Galmés J, Flexas J, Keys A J, Cifre J, Mitchell R A C, Madgwick P J, Haslam R P, Medrano H, Parry M A J. (2005) Rubisco specificity factor tends to be larger in plant species from drier habitats and in species with persistent leaves. Plant Cell Environ. 28, 571-579.
Gready J E, Rostov I, Cummins P L. (2006) Simulations of enzyme reaction mechanisms in active sites: accounting for an environment which is much more than a solvent perturbation. In Modelling Molecular Structure and Reactivity in Biological Systems, K. J. Naidoo, M. Hann, J. Gao, M. Field and J. Brady, eds, Royal Society of Chemistry, London, pp. 101-118.
Hall T A. (1999) BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucl. Acids Symp. Ser. 41, 95.
Kane H J, Viil J, Entsch B, Paul B K, Morell M K, Andrews T J. (1994) An improved method for measuring the CO.sub.2/O.sub.2 specificity of ribulosebisphosphate carboxylase-oxygenase. Aust. J. Plant Physiol. 21, 449-461.
Kane H J, Wilkin J M, Portis A R, Andrews T J. (1998) Potent inhibition of ribulose-bisphosphate carboxylase by an oxidized impurity in ribulose-1,5-bisphosphate. Plant Physiol. 117, 1059-1069.
Kannappan B, Gready J E. (2008) Redefinition of Rubisco carboxylase reaction reveals origin of water for hydration and new roles for active-site residues. J. Am. Chem. Soc. 130, 15063-15080.
Koop H-U, Herz S, Golds T J, Nickelsen J. (2007) The genetic transformation of plastids. Topics in Current Genetics. DOI 10.1007/4735_2007_0225/Published online: 15 May 2007.
Mueller-Cajar O, Morell M, Whitney S M. (2007) Directed evolution of Rubisco in Escherichia coli reveals a specificity-determining hydrogen bond in the Form II enzyme. Biochemistry, in press, September 2007.
Parry M A, Andralojc P J, Mitchell R A, Madgwick P J, Keys A J. (2003) Manipulation of Rubisco: the amount, activity, function and regulation. J. Exptl Bot. 54, 1321-1333.
Parry M A J, Flexas J, Medrano H. (2005) Prospects for crop production under drought: research priorities and future directions. Ann. Appl. Biol. 147, 211-226.
Parry M A J, Madgwick P J, Carvalho H U, Andralojc P J. (2007) Prospects from increasing photosynthesis by overcoming the limitations of Rubisco. J. Ag. Sci. 145, 31-43.
Price G D, Howitt S M, Harrison K, Badger M R. (1993) Analysis of a genomic DNA region from the cyanobacterium Synechococcus sp. strain PCC7942 involved in carboxysome assembly and function. J. Bacteriol. 175, 2871-2879. Ruuska S, Andrews T J, Badger M R, Hudson G S, Laisk A, Price G D, von Caemmerer S. (1998). The interplay between limiting processes in C-3 photosynthesis studied by rapid-response gas exchange using transgenic tobacco impaired in photosynthesis. Aust. J. Plant Physiol. 25, 859-870. Sharwood R E, von Caemmerer S, Maliga P, Whitney S M (2008) The catalytic properties of hybrid rubisco comprising tobacco small and sunflower large subunits mirror the kinetically equivalent source Rubiscos and can support tobacco growth. Plant Physiol. 146, 83-96. Svab Z, Maliga P. (1993) High-frequency plastid transformation in tobacco by selection for a chimeric aadA gene. Proc. Natl Acad. Sci. USA 90, 913-917. Spreitzer R J, Salvucci M E. (2002) Rubisco: structure, regulatory interactions, and possibilities for a better enzyme. Annu. Rev. Plant. Biol. 53, 449-475. Thompson J D, Higgins D G, Gibson T J. (1994). CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673-4680. von Caemmerer S. (2000) Biochemical Models of Leaf Photosynthesis, CSIRO Publishing, ISBN 0 643 06379 X. Whitney S M, von Caemmerer S, Hudson G S, Andrews T J. (1999) Directed mutation of the Rubisco large subunit of tobacco influences photorespiration and growth. Plant Physiol. 121, 579-588. Whitney S M, Andrews T J. (2001) Plastome-encoded bacterial ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco) supports photosynthesis and growth in tobacco. Proc. Natl Acad. Sci. USA 98, 14738-14743. Whitney S M, Baldet P, Hudson G S, Andrews T J. (2001). Form I Rubiscos from non-green algae are expressed abundantly but not assembled in tobacco chloroplasts. Plant J. 
26, 535-547. Whitney S M, Sharwood R E. (2007) Linked Rubisco subunits can assemble into functional oligomers without impeding catalytic performance. J. Biol. Chem. 282, 3809-3818. Whitney S M, Sharwood R E. (2008) Construction of a tobacco master line to improve Rubisco engineering in chloroplasts. J. Exp. Bot. 59, 1909-1921. Zhu X G, Portis A R, Long S P. (2004) Would transformation of C.sub.3 crop plants with foreign Rubisco increase productivity? A computational analysis extrapolating from kinetic properties to canopy photosynthesis. Plant, Cell Environ. 27, 155-165.