Membrane Transport Protein and Uses Thereof
20220275406 · 2022-09-01
Inventors
- Steven KELLY (Botley, Oxfordshire, GB)
- Michael NIKLAUS (Botley, Oxfordshire, GB)
- Oliver MATTINSON (Botley, Oxfordshire, GB)
- Basel ABU-JAMOUS (Botley, Oxfordshire, GB)
CPC classification
C12N15/8218
CHEMISTRY; METALLURGY
C12P7/46
CHEMISTRY; METALLURGY
C12P7/40
CHEMISTRY; METALLURGY
C12N15/8243
CHEMISTRY; METALLURGY
C12N15/8245
CHEMISTRY; METALLURGY
C12N15/625
CHEMISTRY; METALLURGY
International classification
C12P7/40
CHEMISTRY; METALLURGY
C12N15/82
CHEMISTRY; METALLURGY
Abstract
Recombinant cells expressing membrane transport proteins are provided, along with methods for their use in various applications. These applications include, without limitation, industrial biotechnology and the reproduction/emulation of biochemical pathways or components thereof (e.g. photosynthetic pathways or components thereof). The recombinant cells may be provided as a component of a transgenic organism (e.g. a transgenic plant).
Claims
1. A recombinant cell engineered to overexpress a UPF0114 family protein as compared to a corresponding wild-type form of the cell, wherein the UPF0114 family protein is encoded by a recombinant nucleic acid sequence stably or transiently introduced into the recombinant cell, and is capable of transporting carboxylates and/or carboxylic acids across a membrane of the recombinant cell.
2. The recombinant cell of claim 1, wherein: the carboxylates comprise any one of: (i) monocarboxylates; (ii) dicarboxylates; or (iii) tricarboxylates; or (iv) monocarboxylates and dicarboxylates; or (v) monocarboxylates and tricarboxylates; or (vi) dicarboxylates and tricarboxylates; or (vii) monocarboxylates, dicarboxylates and tricarboxylates; the carboxylic acids comprise any one of: (i) monocarboxylic acids; (ii) dicarboxylic acids; or (iii) tricarboxylic acids; or (iv) monocarboxylic acids and dicarboxylic acids; or (v) monocarboxylic acids and tricarboxylic acids; or (vi) dicarboxylic acids and tricarboxylic acids; or (vii) monocarboxylic acids, dicarboxylic acids and tricarboxylic acids.
3. The recombinant cell of claim 1, wherein: (i) the corresponding wild-type form of the cell does not express the UPF0114 family protein; or (ii) the UPF0114 family protein is exogenous to the recombinant cell; or (iii) the carboxylates comprise any one or more of: malate, pyruvate, succinate, fumarate, α-ketoglutarate, citrate, glycerate-3-phosphate, phosphoenolpyruvate; or (iv) the carboxylic acids comprise any one or more of: malic acid, pyruvic acid, succinic acid, fumaric acid, α-ketoglutaric acid, citric acid, 3-phosphoglyceric acid, phosphoenolpyruvic acid.
4. (canceled)
5. (canceled)
6. The recombinant cell of claim 1, wherein the UPF0114 family protein is capable of bidirectional transport of the carboxylates and/or carboxylic acids across the membrane.
7. (canceled)
8. The recombinant cell of claim 1, wherein the membrane is selected from a cytoplasmic membrane, a cell-internal membrane, a chloroplast membrane, an inner chloroplast envelope membrane, an outer chloroplast envelope membrane, a chloroplast internal membrane, a thylakoid membrane, a peroxisomal membrane, a mitochondrial membrane, an inner mitochondrial membrane, or an outer mitochondrial membrane.
9. The recombinant cell of claim 1, wherein the UPF0114 family protein is capable of transporting carboxylates and/or carboxylic acids across a membrane of the recombinant cell against a concentration gradient existing on one side of the membrane.
10. (canceled)
11. The recombinant cell of claim 1, wherein the recombinant cell is: (i) a prokaryotic, eukaryotic, archaeal, plant, algal, bacterial, yeast, fungal, animal, mammalian, or synthetic cell; or (ii) a recombinant Corynebacterium species, a recombinant Xanthomonas species, a recombinant Escherichia species, a recombinant Bacillus species, a recombinant Clostridium species, a recombinant Lactobacillus species, a recombinant Lactococcus species, a recombinant Streptococcus species, a recombinant Actinomycetes species, a recombinant Streptomyces species, or a recombinant Actinobacillus species; or (iii) a recombinant Escherichia coli cell; or (iv) a plant cell or an algal cell; or (v) a plant cell that is: a vascular sheath cell, a bundle sheath cell, a mestome sheath cell, or a mesophyll cell; of a C.sub.3 photosynthetic plant, a CAM photosynthetic plant, or a C.sub.4 photosynthetic plant.
12. (canceled)
13. (canceled)
14. The recombinant cell of claim 11, wherein: the carboxylates comprise any one or more of: succinate, pyruvate, fumarate, malate, citrate, phosphoenolpyruvate, α-ketoglutarate, 3-phosphoglycerate; or the carboxylic acids comprise any one or more of: succinic acid, pyruvic acid, fumaric acid, malic acid, citric acid, phosphoenolpyruvic acid, α-ketoglutaric acid, 3-phosphoglyceric acid.
15. (canceled)
16. The recombinant cell of claim 11, wherein the recombinant cell is a plant cell and the plant cell is: a vascular sheath cell, a bundle sheath cell, a mestome sheath cell, or a mesophyll cell; of a C.sub.3 photosynthetic plant, a CAM photosynthetic plant, or a C.sub.4 photosynthetic plant.
17. The recombinant cell of claim 11, wherein the recombinant cell is a plant cell, and: the carboxylates comprise malate and/or pyruvate; or the carboxylic acids comprise malic acid and/or pyruvic acid.
18. The recombinant cell of claim 17, wherein: (i) the UPF0114 family protein is capable of uptaking malate and/or malic acid into the recombinant cell and exporting pyruvate and/or pyruvic acid from the recombinant cell; or (ii) the UPF0114 family protein is capable of uptaking malate and/or malic acid into the recombinant cell and exporting pyruvate and/or pyruvic acid from the recombinant cell against a concentration gradient.
19. (canceled)
20. The recombinant cell of claim 11, wherein the recombinant cell is a plant cell and the recombinant nucleic acid sequence comprises a sequence encoding a targeting peptide targeting the UPF0114 family protein to a chloroplast membrane, a cytoplasmic membrane, a peroxisomal membrane, or a mitochondrial membrane.
21. The recombinant cell of claim 1, wherein the UPF0114 family protein comprises: (i) a PFAM protein domain UPF0114 (PF03350) amino acid sequence as defined in any one of SEQ ID NOs: 28-37; or (ii) a PFAM protein domain UPF0114 (PF03350) amino acid sequence having at least: 70%, 75%, 80%, 85%, 87%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%; sequence identity to any one of SEQ ID NOs: 28-37; or (iii) a homolog, analog, ortholog or paralog of the PFAM protein domain UPF0114 (PF03350) amino acid sequence of (i) or (ii).
22. (canceled)
23. The recombinant cell of claim 11, wherein the recombinant cell is a plant cell and the plant cell is a genus Oryza plant (e.g. a rice plant), an Oryza sativa or Oryza glaberrima plant, or from a: Soy (Glycine max), Cotton (Gossypium hirsutum), Oilseed rape/Canola (Brassica napus subsp. napus), Potato (Solanum tuberosum), tomato (Solanum lycopersicum), Cassava (Manihot esculenta), Wheat (Triticum aestivum), Barley (Hordeum vulgare), pigeon pea (Cajanus cajan), cowpea (Vigna unguiculata), pea (Pisum sativum), cannabis (Cannabis sativa), sugar beet (Beta vulgaris), oat (Avena sativa), rye (Secale cereale), peanut (Arachis hypogaea), Sunflower (Helianthus annuus), flax (Linum spp.), beans (Phaseolus vulgaris), lima bean (Phaseolus lunatus), mung bean (Phaseolus mung), Adzuki bean (Phaseolus angularis), Chickpea (Cicer arietinum), tobacco (Nicotiana tabacum), buckwheat (Fagopyrum esculentum), oil palm (Elaeis guineensis), or rubber (Hevea brasiliensis); plant.
24. The recombinant cell of claim 1, wherein the UPF0114 family protein is: (i) a C.sub.4 photosynthetic plant UPF0114 protein, a C.sub.3 photosynthetic plant UPF0114 protein, an algal UPF0114 protein, a bacterial UPF0114 protein, or an archaeal UPF0114 protein; or (ii) an Arabidopsis thaliana UPF0114 protein; or (ii) a Setaria italica UPF0114 protein; or (iii) a Setaria viridis UPF0114 protein; or (iv) an Escherichia coli UPF0114 protein; or (v) a Zea mays UPF0114 protein; or (vi) a UPF0114 protein comprising or consisting of an amino acid sequence having at least: 70%, 75%, 80%, 85%, 87%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%; sequence identity to the UPF0114 protein of (i), (ii), (iii), (iv) or (v); or (vii) a homolog, analog, ortholog or paralog of the UPF0114 protein of (i), (ii), (iii), (iv) or (v).
25. (canceled)
26. The recombinant cell of claim 1, wherein the UPF0114 family protein: (i) comprises or consists of an amino acid sequence as defined in SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 15, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, or SEQ ID NO: 27; or (ii) comprises or consists of an amino acid sequence having at least: 70%, 75%, 80%, 85%, 87%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%; sequence identity to SEQ ID NO: 1, SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 15, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, or SEQ ID NO: 27; or (iii) is a homolog, analog, ortholog or paralog of the UPF0114 family protein comprising or consisting of an amino acid sequence of (i) or (ii); or (iv) is encoded by a nucleotide sequence comprising or consisting of SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, or SEQ ID NO: 16; or (v) is encoded by a nucleotide sequence comprising or consisting of a nucleotide sequence having at least: 70%, 75%, 80%, 85%, 87%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99%; sequence identity to SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, or SEQ ID NO: 16; or (vi) is a homolog, analog, ortholog or paralog of the UPF0114 family protein encoded by the nucleotide sequence of (iv) or (v).
27. The recombinant cell of claim 1, wherein the recombinant nucleic acid sequence: (i) is operably linked to a regulatory sequence; and/or (ii) is a component of an expression vector; and/or (iii) is codon optimised for expression in the recombinant cell type; and/or (iv) has intronic sequences removed; and/or (v) comprises a signal peptide sequence for directing the UPF0114 family protein to an internal membrane or cytoplasmic membrane of the recombinant cell.
28. The recombinant cell of claim 1, wherein: (i) the carboxylates and/or carboxylic acids are phosphorylated; or (ii) the recombinant cell is further engineered to produce or overexpress an enzyme and/or regulatory protein of a biochemical pathway, for production of the carboxylates and/or carboxylic acids.
29. (canceled)
30. (canceled)
31. A transgenic plant or a seed thereof comprising the recombinant plant cell of claim 11.
32. (canceled)
33. (canceled)
34. A process for production of carboxylic acids and/or carboxylates comprising: (i) producing the carboxylates in the recombinant cell according to claim 1, and (ii) exporting the carboxylates from the recombinant cell using a UPF0114 family protein embedded within the membrane of the recombinant cell.
35. (canceled)
36. (canceled)
37. (canceled)
38. (canceled)
39. (canceled)
40. (canceled)
Description
BRIEF DESCRIPTION OF THE FIGURES
[0122] Preferred embodiments of the present invention will now be described by way of example only, with reference to the accompanying figures wherein:
[0123]-[0145] [Figure descriptions not reproduced.]
DETAILED DESCRIPTION
[0146] The following detailed description conveys exemplary embodiments of the present invention in sufficient detail to enable those of ordinary skill in the art to practice the present invention. Features or limitations of the various embodiments described do not necessarily limit other embodiments of the present invention, or the present invention as a whole. Hence, the following detailed description does not limit the scope of the present invention, which is defined only by the claims.
[0147] It will be appreciated by persons of ordinary skill in the art that numerous variations and/or modifications can be made to the present invention as disclosed in the specific embodiments without departing from the spirit or scope of the present invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
[0148] Known transporters of monocarboxylates, dicarboxylates and tricarboxylates are suboptimal for many applications in industrial biotechnology due to their inability to export these molecules from the cells in which they are produced or overexpressed. This adds to the complexity, time and/or cost of processes aimed at the mass production of these metabolites. Additionally, although the C.sub.4 photosynthetic pathway is well-characterised, the missing/unknown molecular components of the C.sub.4 cycle in most C.sub.4 species are the monocarboxylate/monocarboxylic acid and dicarboxylate/dicarboxylic acid transporters. Specifically, in C.sub.4 plants it is unknown how the dicarboxylate malate enters the bundle sheath chloroplast and how the monocarboxylate pyruvate exits the bundle sheath chloroplast.
[0149] The present inventors have identified that UPF0114 family proteins provide a means of transporting monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids, across cell membranes (internal and/or external), and in particular a means of exporting these molecules from cells into the external environment. In doing so, they have provided a solution to current difficulties experienced in isolating these molecules from cells in the industrial biotechnology setting.
[0150] Additionally, as noted above, the identity of the transporters facilitating movement of the dicarboxylate malate into the bundle sheath chloroplast and the exit of the monocarboxylate pyruvate from the bundle sheath chloroplast is needed to engineer C.sub.4 photosynthesis into C.sub.3 plants. The present inventors have demonstrated that UPF0114 family proteins from C.sub.4 photosynthetic plants facilitate both uptake of malate and export of pyruvate, as required for the bundle sheath cell chloroplast to conduct C.sub.4 photosynthesis. They have also shown that reduction of the amount of transcript encoding the UPF0114 protein in the C.sub.4 plant Setaria viridis severely disrupts C.sub.4 photosynthesis, and thus that the UPF0114 family protein is required for C.sub.4 photosynthesis. They have additionally shown that UPF0114 family proteins can be over-expressed in both C.sub.3 and C.sub.4 plant cells including rice (Oryza sativa).
UPF0114 Protein Family
[0151] The present invention provides recombinant cells expressing UPF0114 family proteins, and methods and processes for using them.
[0152] Prior to the present invention, the UPF0114 protein family (also known as the yqhA gene family) had not been functionally characterized and its biological role was unknown. Genes encoding members of the UPF0114 protein family can be found in the genomes of viruses, bacteria, archaea, algae, plants and some other eukaryotic organisms, and are defined by the presence of the PFAM protein domain of the same name: UPF0114 (PF03350). This PFAM domain typically comprises three or four transmembrane helices. Members of the UPF0114 protein family may comprise further domains in addition to the UPF0114 domain. Non-limiting examples include any one or more of: AAA+ ATPase domains, ATP-binding domains, nucleotide triphosphate hydrolase domains, SHOCT domains, Fe-S hydro-lyase domains, NB-ARC domains, cytochrome C oxidase domains, reverse transcriptase domains, structural maintenance of chromosomes domains, major facilitator superfamily domains. Members of the UPF0114 protein family may also comprise a chloroplast and/or a mitochondrial targeting peptide (e.g. algae and plant UPF0114 family proteins). Non-limiting/representative UPF0114 protein family sequences from various organisms including viruses, archaea, bacteria, green algae and plants (SEQ ID NOs: 18-27) and their individual PFAM domain PF03350 sequences (SEQ ID NOs: 28-37) are provided below.
[0153] A non-limiting example of a viral protein in the UPF0114 family is the AXQ68784.1 protein in the Caulobacter phage CcrPW. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00001 (SEQ ID NO: 18) MIFETRWLLVPIYLAMIIAIAAYVILFTKQAIDMG LGVWHWDAEHLLLASLALVDMSMVANLIVMILAGG FSTFVAEFDQSLFPNRPRWMNGLDSTTLKIQMGKS LIGVTSVHLLQTFMRLHDILKEENGLVLVIAEIAI HMVFIVTTVSYCYISKLTHGHKVAPAALPTPATAE GH
Caulobacter phage CcrPW AXQ68784.1 protein PFAM domain PF03350 sequence:
TABLE-US-00002 (SEQ ID NO: 28) IFETRWLLVPIYLAMIIAIAAYVILFTKQAIDMGL GVWHWDAEHLLLASLALVDMSMVANLIVMILAGGF STFVAEFDQSLFPNRPRWMNGLDSTTLKIQMGKSL IGVTSVHLLQTFMRLHDILKEENGLVLVIAEIA
[0154] A non-limiting example of an archaeal protein in the UPF0114 family is the WP_095643983.1 protein in Methanosarcina spelaei. The UPF0114 domain is shown underneath.
TABLE-US-00003 (SEQ ID NO: 19) MKVVRFIAGMRFFVLIPVIGLAIAACVLFIKGGID IIHFMGELIIGMSEEGPEKSIIVEIVETVHLFLVG TVLFLTSFGLYQLFIQPLPLPEWVKVNNIEELELN LVGLTVVVLGVNFLSIIFEPQETDLAIYGIGYALP IAALAYFMKVRSHIRKGSNDEEEMRNIGEVTSVNS ESNWLINKKGD
Methanosarcina spelaei WP_095643983.1 protein PFAM domain PF03350 sequence:
TABLE-US-00004 (SEQ ID NO: 29) VVRFIAGMRFFVLIPVIGLAIAACVLFIKGGIDII HFMGELIIGMSEEGPEKSIIVEIVETVHLFLVGTV LFLTSFGLYQLFIQPLPLPEWVKVNNIEELELNLV GLTVVVLGVNFLSIIFEPQETDLAIYGIGYALPIA ALAYF
[0155] Another non-limiting example of an archaeal protein in the UPF0114 family is the WP_012192968.1 protein in Methanococcus maripaludis. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00005 (SEQ ID NO: 20) MGKSDKLKKKYGIKNISEQGFFEHFFELILWNSRF IVVLAVIFGTLGSIMLFLAGSAEIFHTILSYISDP MSSEQHNQILIGVIGAVDLYLIGVVLLIFSFGIYE LFISKIDIARVDGDVSNILEIYTLDELKSKIIKVI IMVLVVSFFQRVLSMHFETSLDMIYMAISIFAISL GVYFMHRQKM
Methanococcus maripaludis WP_012192968.1 protein PFAM domain PF03350 sequence:
TABLE-US-00006 (SEQ ID NO: 30) FEHFFELILWNSRFIVVLAVIFGTLGSIMLFLAGSAEIFHTILSYISDPM SSEQHNQILIGVIGAVDLYLIGVVLLIFSFGIYELFISKIDIARVDGDVS NILEIYTLDELKSKIIKVIIMVLVVSFFQRVLSMHFETSLDMIYMAISIF AISLGVYFM
[0156] A non-limiting example of a bacterial protein in the UPF0114 family is the yqhA protein in Escherichia coli. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00007 (SEQ ID NO: 21) MERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAE SDLILVLLSLVDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKM DATSLKNKVAASIVAISSIHLLRVFMDAKNVPDNKLMWYVIIHLTFVLSA FVMGYLDRLTRHNH
Escherichia coli yqhA protein PFAM domain PF03350 sequence:
TABLE-US-00008 (SEQ ID NO: 31) ERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAES DLILVLLSLVDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKMD ATSLKNKVAASIVAISSIHLLRVFMDAKNVPDNKLMWYVIIHLTFVLSAF
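The relationship between a full-length UPF0114 family protein and its PF03350 domain sequence can be illustrated by locating the domain within the full-length sequence. The following is a minimal Python sketch using SEQ ID NO: 21 and SEQ ID NO: 31 as reproduced above. The function name `locate_domain` is hypothetical, and exact string matching is an assumption made for illustration only; a domain assigned by a profile search is not in general recoverable by literal matching.

```python
# Full-length Escherichia coli yqhA protein (SEQ ID NO: 21), as listed above.
YQHA_FULL = (
    "MERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAE"
    "SDLILVLLSLVDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKM"
    "DATSLKNKVAASIVAISSIHLLRVFMDAKNVPDNKLMWYVIIHLTFVLSA"
    "FVMGYLDRLTRHNH"
)

# PF03350 domain sequence of the same protein (SEQ ID NO: 31), as listed above.
YQHA_DOMAIN = (
    "ERFLENAMYASRWLLAPVYFGLSLALVALALKFFQEIIHVLPNIFSMAES"
    "DLILVLLSLVDMTLVGGLLVMVMFSGYENFVSQLDISENKEKLNWLGKMD"
    "ATSLKNKVAASIVAISSIHLLRVFMDAKNVPDNKLMWYVIIHLTFVLSAF"
)


def locate_domain(full_length, domain):
    """Return the 1-based inclusive (start, end) residue positions of
    `domain` within `full_length`, or None if there is no exact match.

    Hypothetical helper for illustration; it performs a literal
    substring search and does not run a PFAM/HMM search.
    """
    index = full_length.find(domain)
    if index < 0:
        return None
    return index + 1, index + len(domain)


if __name__ == "__main__":
    print(locate_domain(YQHA_FULL, YQHA_DOMAIN))
```

Under these assumptions the domain spans residues 2 to 151 of the 164-residue protein, i.e. it covers nearly the entire sequence apart from the initiator methionine and a short C-terminal tail.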
[0157] Another non-limiting example of a bacterial protein in the UPF0114 family is the WP_021087398.1 protein in Campylobacter concisus. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00009 (SEQ ID NO: 22) MRKIFERILLASNSFTLFPVVFGLLGAIVLFIIASYDVGKVLLEVYKYFF AADFHVENFHSEVVGEIVGAIDLYLMALVLYIFSFGIYELFISEITQLKQ SKQSKVLEVHSLDELKDKLGKVIVMVLIVNFFQRVLHANFTTPLEMAYLA ASILALCLGLYFLHKGDH
Campylobacter concisus WP_021087398.1 protein PFAM domain PF03350 sequence:
TABLE-US-00010 (SEQ ID NO: 32) KIFERILLASNSFTLFPVVFGLLGAIVLFIIASYDVGKVLLEVYKYFFAA DFHVENFHSEVVGEIVGAIDLYLMALVLYIFSFGIYELFISEITQLKQSK QSKVLEVHSLDELKDKLGKVIVMVLIVNFFQRVLHANFTTPLEMAYLAAS ILALCLGLYFLHKGD
[0158] Another non-limiting example of a bacterial protein in the UPF0114 family is the OUV44343.1 protein in Rhodobacteraceae bacterium TMED111. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00011 (SEQ ID NO: 23) MGFIERIGEKILWNSRFIVILAVIFSIIASISLFIIGSYEIIYSLVYENP IWSEKYKHNHAQILYKIISAVDLYLIGVVLMIFGFGIYELFISKIDIARK NPSITILEIENLDELKNKIVKVIVMVLIVSFFERILKNSDAFTSSLNLLY FAISIFAISFSIYYINKNKN
Rhodobacteraceae bacterium TMED111 PFAM domain PF03350 sequence:
TABLE-US-00012 (SEQ ID NO: 33) ERIGEKILWNSRFIVILAVIFSIIASISLFIIGSYEIIYSLVYENPIWSE KYKHNHAQILYKIISAVDLYLIGVVLMIFGFGIYELFISKIDIARKNPSI TILEIENLDELKNKIVKVIVMVLIVSFFERILKNSDAFTSSLNLLYFAIS IFAISFSIYYIN
[0159] A non-limiting example of a green algal protein in the UPF0114 family is the 108867 protein in Micromonas pusilla. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00013 (SEQ ID NO: 24) MSSSGVLSLSASARVAPRATSVRRARAPVRATQLARSRADTAAWGKKFMS VERGSRAVGVRSLVEAANTEPGASYDDGDDHVDTTYDAEDLAHPDVAMMK ASREVRKPFREFSLIEKVEYVFVRFTLISACIFVLLGVLASLLLSALLFS MGMKEVLFDAVQAWAGYSPVGLVSSAVGALDRFLLGMVCLVFGLGSFELF LARSNRAGQVRDRRLKKLAWLKVSSIDDLEQKVGEIIVAVMVVNLLEMSL HMTYAAPLDLVWAALAAVMSAGALALLHYAAGHGDHNHKDKGGHDSGAGL LH
Micromonas pusilla 108867 PFAM domain PF03350 sequence:
TABLE-US-00014 (SEQ ID NO: 34) TLISACIFVLLGVLASLLLSALLFSMGMKEVLFDAVQAWAGYSPVGLVSS AVGALDRFLLGMVCLVFGLGSFELFLARSNRAGQVRDRRLKKLAWLKVSS IDDLEQKVGEIIVAVMVVNLLEMSLHMTYAAPLDLVWAALAAVMSAGALA LL
[0160] Another non-limiting example of a green algal protein in the UPF0114 family is the GAQ84557.1 protein in Klebsormidium nitens. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00015 (SEQ ID NO: 25) MSKDGVAAIDVMMPDGASEDYPITLEEADASDGEWTRRKRHVKRLKKVES TIERVIFDCRFFALMGVVGSLIGSFLCFVKGCFYVYKAIIAAAFDVTHGL NSYKVVLKLIEALDTYLVATVMLIFGMGLYELFVNELEAVATTDSVVGCK SNLFGLFRLRERPKWLQINGLDALKEKLGHVIVMILLVGMFEKSKKVPIR NGVDLVCVATSVLLCAGSLYLLSQLSKNGNGH
Klebsormidium nitens GAQ84557.1 protein PFAM domain PF03350 sequence:
TABLE-US-00016 (SEQ ID NO: 35) ESTIERVIEDCRFFALMGVVGSLIGSFLCFVKGCFYVYKAIIAAAFDVTH GLNSYKVVLKLIEALDTYLVATVMLIFGMGLYELFVNELEAVATTDSVVG CKSNLFGLFRLRERPKWLQINGLDALKEKLGHVIVMILLVGMFEKSKKVP IRNGVDLVCVATSVLLCAGSLYLL
[0161] A non-limiting example of a plant protein in the UPF0114 family is the AT5G13720.1 protein in Arabidopsis thaliana. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00017 (SEQ ID NO: 26) MALSSLISATPLSLSVPRYLVLPTRRRFHLPLATLDSSPPESSASSSIPT SIPVNGNTLPSSYGTRKDDSPFAQFFRSTESNVERIIFDFRFLALLAVGG SLAGSLLCFLNGCVYIVEAYKVYWTNCSKGIHTGQMVLRLVEAIDVYLAG TVMLIFSMGLYGLFISHSPHDVPPESDRALRSSSLFGMFAMKERPKWMKI SSLDELKTKVGHVIVMILLVKMFERSKMVTIATGLDLLSYSVCIFLSSAS LYILHNLHKGET
Arabidopsis thaliana AT5G13720.1 protein PFAM domain PF03350 sequence:
TABLE-US-00018 (SEQ ID NO: 36) SNVERIIFDFRFLALLAVGGSLAGSLLCFLNGCVYIVEAYKVYWTNCSKG IHTGQMVLRLVEAIDVYLAGTVMLIFSMGLYGLFISHSPHDVPPESDRAL RSSSLFGMFAMKERPKWMKISSLDELKTKVGHVIVMILLVKMFERSKMVT IATGLDLLSYSVCIFLSSASLYIL
[0162] Another non-limiting example of a plant protein in the UPF0114 family is the LOC_Os03g52910.1 protein in Oryza sativa. The UPF0114 PFAM domain PF03350 is shown underneath.
TABLE-US-00019 (SEQ ID NO: 27) MAAAAAGGGGGGGGSGRLLRGATAKAFHGDGSSHHRMMPSSSSSVAAGGG GGVAGPCRIPSLKFPSLWESKRQGGGVGSRAAERKAALIALGAAGVTALE RERGGGVVLLPEEARRGADLLLPLAYEVARRLVLRQLGGATRPTQQCWSK IAEATIHQGVVRCQSFTLIGVAGSLVGSVPCFLEGCGAVVRSFFVQFRAL TQTIDQAEIIKLLIEAIDMFLIGTALLTFGMGMYIMFYGSRSIQNPGMQG DNSHLGSFNLKKLKEGARIQSITQAKTRIGHAILLLLQAGVLEKFKSVPL VTGIDMACFAGAVLASSAGVFLLSKLSTTAAQAQRQPRKRTAFA
Oryza sativa LOC_Os03g52910.1 protein PFAM domain PF03350 sequence:
TABLE-US-00020 (SEQ ID NO: 37) ATIHQGVVRCQSFTLIGVAGSLVGSVPCFLEGCGAVVRSFFVQFRALTQT IDQAEIIKLLIEAIDMFLIGTALLTFGMGMYIMFYGSRSIQNPGMQGDNS HLGSFNLKKLKEGARIQSITQAKTRIGHAILLLLQAGVLEKFKSVPLVTG IDMACFAGAVLASSAGVFLLS
[0163] As noted above, UPF0114 family proteins for use in the present invention are capable of transporting carboxylates/carboxylic acids (e.g. monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids) across biological membranes (e.g. those of organelles and/or the cytoplasmic membrane i.e. the cell membrane surrounding the cytoplasm). The proteins may thus be capable of exporting the carboxylates/carboxylic acids from cell organelles (e.g. chloroplasts, mitochondria) and/or from cells into the external environment. In some embodiments, the UPF0114 family proteins are capable of bidirectional transport of the same or different molecules into and out of cell organelles and/or cells. Additionally or alternatively, the UPF0114 family proteins may be capable of importing and/or exporting molecules (e.g. into and/or out of a cell organelle; into and/or out of a cell) against a concentration gradient, wherein the amount or concentration of the molecule in proximity to a first side of the membrane is below that of the opposing side of the membrane to which the molecule is being transported.
[0164] A non-limiting example of a bacterial member of the UPF0114 protein family is the Escherichia coli gene yqhA (UniProt ID P67244, SEQ ID NO: 1).
[0165] A non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.3 photosynthetic plant) Arabidopsis thaliana gene AT4G19390 (amino acid sequence: SEQ ID NO: 2). A second non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.4 photosynthetic plant) Setaria italica Si007164m (also known as Seita.4G275500) (amino acid sequence: SEQ ID NO: 3). A third non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.4 photosynthetic plant) Setaria viridis Sevir.4G287300 gene (amino acid sequence: SEQ ID NO: 6). A fourth non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.4 photosynthetic plant) Zea mays GRMZM2G179292 gene (amino acid sequence: SEQ ID NO: 9). A fifth non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.4 photosynthetic plant) Zea mays GRMZM2G133400 gene (amino acid sequence: SEQ ID NO: 10). A sixth non-limiting example of a plant member of the UPF0114 protein family is the (C.sub.4 photosynthetic plant) Zea mays GRMZM2G327686 gene (amino acid sequence: SEQ ID NO: 11). In some embodiments, the UPF0114 protein may be classified as an Embryophyta, Klebsormidiophyceae, Chlorophyta, Viridae, Bacteria, or Archaea protein.
[0166] The present invention encompasses homologs, analogs, orthologs and paralogs of the specific UPF0114 proteins and protein sequences provided herein. In view of the high level of evolutionary conservation evident among, for example, viral, bacterial, archaeal, algal, and plant UPF0114 family proteins, the skilled person can identify such homologs, analogs, orthologs and paralogs using routine methods without inventive effort. Numerous publicly accessible online tools are available to the skilled person which can be used to find nucleotide and protein sequences similar to a UPF0114 protein or nucleotide sequence of interest.
[0167] Methods for assessing the level of homology and identity between sequences are well known in the art. The percentage of sequence identity between two sequences may, for example, be calculated using a mathematical algorithm. A non-limiting example of a suitable mathematical algorithm is described in the publication of Karlin and colleagues (1993, PNAS USA, 90:5873-5877). This algorithm is integrated in the BLAST (Basic Local Alignment Search Tool) family of programs (see also Altschul et al. (1990), J. Mol. Biol. 215, 403-410 or Altschul et al. (1997), Nucleic Acids Res, 25:3389-3402) accessible via the National Center for Biotechnology Information (NCBI) website homepage (https://www.ncbi.nlm.nih.gov). The BLAST program is freely accessible at https://blast.ncbi.nlm.nih.gov/Blast.cgi. Other non-limiting examples include the HMMER (http://hmmer.org/), Clustal (http://www.clustal.org/) and FASTA (Pearson (1990), Methods Enzymol. 183, 63-98; Pearson and Lipman (1988), Proc. Natl. Acad. Sci. U.S.A. 85, 2444-2448) programs. These and other programs can be used to identify sequences that share at least some level of identity with a given input sequence. Additionally or alternatively, programs available in the Wisconsin Sequence Analysis Package, version 9.1 (Devereux et al. 1984, Nucleic Acids Res. 12, 387-395), for example the programs GAP and BESTFIT, may be used to determine the percentage of sequence identity between two polypeptide sequences. BESTFIT uses the local homology algorithm of Smith and Waterman (1981, J. Mol. Biol. 147, 195-197) and identifies the best single region of similarity between two sequences. Where reference herein is made to an amino acid sequence sharing a specified percentage of sequence identity to a reference amino acid sequence, the difference/s between the sequences may arise partially or completely from amino acid substitution/s. In such cases, the sequence identified with the amino acid substitution/s may substantially or completely retain the same biological activity of the reference sequence.
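By way of illustration only, the calculation of a percentage of sequence identity may be sketched as follows in Python. This sketch is not BLAST, GAP or BESTFIT: it assumes a basic global alignment (Needleman-Wunsch dynamic programming) with hypothetical scoring values (match +1, mismatch -1, gap -1), and it assumes that identity is reported as identical aligned residues divided by the total alignment length. Other programs use different scoring matrices, gap penalties and denominators, so the values it returns are not directly comparable with theirs.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two sequences (Needleman-Wunsch).

    Scoring values are illustrative assumptions, not those of any
    particular published program. Returns two gapped strings of
    equal length.
    """
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align residue with residue
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding identical residues.

    Assumption: the denominator is the full alignment length,
    including gapped columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


if __name__ == "__main__":
    gapped_a, gapped_b = global_align("ACDEFG", "ACEFG")
    print(gapped_a, gapped_b, percent_identity(gapped_a, gapped_b))
```

A candidate sequence could then be compared against a threshold such as those recited herein (e.g. at least 70%) by testing whether the returned value meets or exceeds that threshold, bearing in mind that the result depends on the alignment method and identity convention chosen.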
Sequence Modifications
[0168] UPF0114 protein family sequences of the present invention may be modified to enhance expression in a recombinant cell. Many publicly available online tools exist to enable the skilled artisan to optimise a nucleotide or protein sequence for use in the present invention (see, for example, http://genomes.urv.es/OPTIMIZER).
[0169] For example, the sequence may be modified by codon optimisation. As known to those of skill in the art, organisms differ in their tendency to use specific codons over others to encode the same amino acid. Codon optimisation may thus be employed to enhance expression of UPF0114 protein sequences in specific cell types.
[0170] Additionally or alternatively, nucleotide sequences encoding UPF0114 family proteins of the present invention may be modified by the removal of one or more introns.
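Removal of introns from a nucleotide sequence amounts to joining the exon segments in order. A minimal Python sketch follows, assuming the exon coordinates are already known (for example from a gene annotation) and are given as 0-based, half-open intervals. The function name `remove_introns`, the toy sequence and the coordinates are all hypothetical.

```python
def remove_introns(genomic, exons):
    """Join exon segments of `genomic` in order, dropping the introns.

    `exons` is a list of (start, end) pairs, 0-based and half-open,
    sorted and non-overlapping. Coordinates are assumed to be supplied
    by the user; this sketch does not predict splice sites.
    """
    parts = []
    previous_end = 0
    for start, end in exons:
        if not (previous_end <= start < end <= len(genomic)):
            raise ValueError("exons must be ordered, non-overlapping and in range")
        parts.append(genomic[start:end])
        previous_end = end
    return "".join(parts)


if __name__ == "__main__":
    # Toy sequence: exon 1, a 12-nucleotide intron, exon 2.
    toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "GAATAA"
    print(remove_introns(toy_gene, [(0, 6), (18, 24)]))
```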
[0171] Additionally or alternatively, nucleotide sequences encoding UPF0114 family proteins of the present invention may be modified by operably linking them to regulatory sequences (e.g. promoters, enhancers and the like) to manipulate the level at which they are transcribed.
[0172] Additionally or alternatively, UPF0114 protein family sequences of the present invention may be manipulated to direct the movement of the proteins to specific internal cellular locations (e.g. the envelope membranes of organelles such as a chloroplast or mitochondrion) or to the cytoplasmic membrane itself (i.e. the cell membrane surrounding the cytoplasm). For example, the sequences may be operably linked to a signal peptide or targeting peptide sequence, or alternatively have an existing signal peptide sequence removed.
[0173] Additionally or alternatively, UPF0114 protein family sequences of the present invention may be manipulated to facilitate detection and/or isolation by way of incorporating tag sequences or the like.
[0174] The skilled addressee will recognise that the examples of sequence modifications above are non-limiting, with many other known sequence modifications available that could be used as a matter of routine. The present invention contemplates any and all modifications of this nature.
Carboxylates
[0175] UPF0114 family proteins of the present invention are used to transport carboxylates, and in particular any one or more of monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids.
[0176] In some embodiments of the present invention, the carboxylates/carboxylic acids may comprise or consist of monocarboxylates/monocarboxylic acids. For example, the monocarboxylates/monocarboxylic acids may comprise or consist of pyruvate/pyruvic acid. Additionally or alternatively, the monocarboxylates/monocarboxylic acids may comprise or consist of any one or more of: lactate/lactic acid, glycerate/glyceric acid, acetate/acetic acid, branched-chain oxo acids, acetoacetate, β-hydroxybutyrate.
[0177] In some embodiments of the present invention, the carboxylates/carboxylic acids may comprise or consist of dicarboxylates/dicarboxylic acids. For example, the dicarboxylates/dicarboxylic acids may comprise or consist of any one or more of: succinate/succinic acid, malate/malic acid, fumarate/fumaric acid, α-ketoglutarate/α-ketoglutaric acid, aspartate/aspartic acid, glutamate/glutamic acid.
[0178] In other embodiments of the present invention, the carboxylates/carboxylic acids may comprise or consist of tricarboxylates/tricarboxylic acids. For example, the tricarboxylates/tricarboxylic acids may comprise or consist of any one or more of: citrate/citric acid, isocitrate/isocitric acid, aconitate/aconitic acid, propane-1,2,3-tricarboxylic acid, trimesic acid.
[0179] In still other embodiments of the present invention, the carboxylates/carboxylic acids may be phosphorylated. Accordingly, the UPF0114 family proteins of the present invention may be used to transport any one or more of: phosphorylated monocarboxylates/monocarboxylic acids, phosphorylated dicarboxylates/dicarboxylic acids, phosphorylated tricarboxylates/tricarboxylic acids. Non-limiting examples of phosphorylated carboxylic acids that may be transported by the UPF0114 family proteins include glycerate-3-phosphate/3-phosphoglyceric acid and phosphoenolpyruvate/phosphoenolpyruvic acid.
[0180] As noted above, UPF0114 family proteins of the present invention may be capable of bidirectional movement of carboxylates/carboxylic acids across biological membranes. In some embodiments, the UPF0114 family proteins may be capable of the uptake of malate and the export of pyruvate. Additionally or alternatively, the UPF0114 family proteins may be capable of exporting any one or more of lactate, succinate, malate, fumarate, glycerate, α-ketoglutarate, aspartate, aconitate, citrate, branched-chain oxo acids, acetoacetate, beta-hydroxybutyrate from an organelle (e.g. a chloroplast) or a cell (e.g. a bacterial, plant or algal cell). This transport may occur with or against a concentration gradient.
Recombinant Cells
[0181] The present invention provides recombinant cells expressing UPF0114 family proteins. The UPF0114 family protein may be encoded by a recombinant nucleic acid sequence (e.g. recombinant DNA, recombinant RNA, and the like) introduced into the base cell.
[0182] For example, a recombinant nucleic acid sequence encoding a UPF0114 family protein may be transiently introduced into the cell. This may result in transient expression of the UPF0114 family proteins for a finite period (e.g. 1, 2, 3, 4, 5, 7, 8, 9, or 10 days). Methods for achieving transient expression of recombinant nucleic acids in host cells are well known in the art. In some embodiments, transient expression may be characterised by a lack of replication of the recombinant nucleic acid sequence when the host cell replicates. In some embodiments, transient expression may be characterised by an absence of integration of the recombinant nucleic acid sequence into the genome of the host cell.
[0183] Additionally or alternatively, a recombinant nucleic acid sequence encoding a UPF0114 family protein may be stably introduced into the cell. Recombinant nucleic acid sequences that have been stably introduced into the cell will generally be replicated when the host cell replicates. In some embodiments, stable expression may be characterised by integration of the recombinant nucleic acid sequence into the genome of the host cell. In some embodiments, stable expression may be characterised by introducing the recombinant nucleic acid sequence into the cell as a component of a vector (e.g. an expression vector). Suitable vectors for this purpose are well known to those of skill in the art and include, without limitation, plasmids, cosmids, yeast vectors, yeast artificial chromosomes, bacterial artificial chromosomes, P1 artificial chromosomes, plant artificial chromosomes, algal artificial chromosomes, modified viruses (e.g. modified adenoviruses, retroviruses or phages), and mobile genetic elements (e.g. transposons).
[0184] Techniques for producing recombinant nucleic acids (e.g. recombinant DNA, recombinant RNA, and the like), including those provided in the form of a vector, are well known to those skilled in the art, as are techniques for the introduction of recombinant nucleic acids into cells (e.g. electroporation, microinjection, biolistic delivery systems, calcium phosphate co-precipitation, cationic lipid-based transfection reagents, diethylaminoethyl-dextran). General guidance on suitable methods can be found, for example, in standard texts such as Green and Sambrook (2012), Molecular cloning: a laboratory manual, fourth edition. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press; Ausubel et al. (1987-2016). Current Protocols in Molecular Biology. New York, N.Y., John Wiley & Sons; and ‘Cloning a Specific Gene.’ in Griffiths et al. 1999 Modern Genetic Analysis. New York: W.H. Freeman.
[0185] The recombinant cell may be any suitable type including, but not limited to, prokaryotic, eukaryotic, archaeal, plant, algal, bacterial, yeast, fungal, animal, mammalian, or synthetic cells.
[0186] In some embodiments, the host cell may be a bacterial cell such as, for example, Escherichia coli or Agrobacterium tumefaciens. The bacterial cell may be autotrophic (e.g. a cyanobacterium).
[0187] In other embodiments, the host cell may be a plant cell (e.g. a C.sub.3 photosynthetic plant cell, such as a C.sub.3 plant vascular sheath cell, a C.sub.3 plant bundle sheath cell, a C.sub.3 plant mestome sheath cell, or a C.sub.3 plant mesophyll cell; a C.sub.4 photosynthetic plant cell such as a C.sub.4 plant vascular sheath cell, a C.sub.4 plant bundle sheath cell, a C.sub.4 plant mestome sheath cell or a C.sub.4 plant mesophyll cell; or a CAM photosynthetic plant cell, such as a CAM plant vascular sheath cell, a CAM plant bundle sheath cell, a CAM plant mestome sheath cell or a CAM plant mesophyll cell).
[0188] In still other embodiments, the host cell may be yeast such as, for example, Saccharomyces cerevisiae, Pichia pastoris, Pichia methanolica and Hansenula polymorpha.
[0189] The recombinant cells expressing UPF0114 family proteins of the present invention may also be engineered to produce carboxylates/carboxylic acids. For example, the recombinant cells may further produce any one or more of monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids. Additionally or alternatively, the recombinant cells may be engineered to produce or overexpress enzyme/s and/or regulatory protein/s of biochemical pathway/s for production of the carboxylates/carboxylic acids (e.g. for production of monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids).
[0190] Production of the carboxylates/carboxylic acids and/or enzyme/s and/or regulatory protein/s in the recombinant cells can be achieved, for example, using the same materials and techniques as described above in relation to the overexpression of the UPF0114 family proteins.
[0191] Non-limiting examples of monocarboxylates/monocarboxylic acids that may be produced by the recombinant cells include any one or more of: pyruvate/pyruvic acid, lactate/lactic acid, glycerate/glyceric acid, acetate/acetic acid, branched-chain oxo acids, acetoacetate, beta-hydroxybutyrate.
[0192] Non-limiting examples of dicarboxylates/dicarboxylic acids that may be produced by the recombinant cells include any one or more of: succinate/succinic acid, malate/malic acid, fumarate/fumaric acid, α-ketoglutarate/α-ketoglutaric acid, aspartate/aspartic acid, glutamate/glutamic acid.
[0193] Non-limiting examples of tricarboxylates/tricarboxylic acids that may be produced by the recombinant cells include any one or more of: citrate/citric acid, isocitrate/isocitric acid, aconitate/aconitic acid, propane-1,2,3-tricarboxylic acid, trimesic acid.
[0194] The carboxylates/carboxylic acids produced in the recombinant cells may be phosphorylated (e.g. phosphorylated monocarboxylates/monocarboxylic acids, and/or phosphorylated dicarboxylates/dicarboxylic acids, and/or phosphorylated tricarboxylates/tricarboxylic acids). Non-limiting examples include glycerate-3-phosphate/3-phosphoglyceric acid and phosphoenolpyruvate/phosphoenolpyruvic acid.
[0195] The enzyme/s and/or regulatory protein/s of biochemical pathway/s for production of the carboxylates/carboxylic acids that may be produced in the recombinant cell include, for example, any one or more of: pyruvate carboxylase, pyruvate synthase, pyruvate dehydrogenase, pyruvate kinase, citrate synthase, aconitase, isocitrate dehydrogenase, α-ketoglutarate dehydrogenase, succinyl-CoA synthase, succinic dehydrogenase, fumarase, malate dehydrogenase, malic enzyme, phosphoenolpyruvate carboxykinase, malate quinone-oxidoreductase, glutamate dehydrogenase, lactate dehydrogenase, isocitrate lyase, malate synthase.
Transgenic Plants
[0196] Recombinant plant cells of the present invention may be used to generate transgenic plants. In some embodiments of the present invention, the transgenic plants have an increased rate of photosynthesis relative to the unmodified plant line.
[0197] By way of non-limiting example, a C.sub.3 photosynthetic plant cell (e.g. a C.sub.3 plant vascular sheath cell, a C.sub.3 plant mestome sheath cell, a C.sub.3 plant mesophyll cell, or a C.sub.3 plant bundle sheath cell) may be engineered to express or overexpress a UPF0114 family protein capable of importing and/or exporting carboxylates/carboxylic acids (e.g. monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids) across membrane/s of the cell (e.g. those of organelles such as chloroplasts and/or mitochondria, and/or the cytoplasmic membrane). The UPF0114 family protein may, for example, be a UPF0114 protein from a C.sub.3 plant, a C.sub.4 plant, a CAM plant, an alga, a virus, a bacterium or an archaeon.
[0198] In some embodiments, the UPF0114 family protein may be capable of importing malate into any cell type or subcellular organelle within a C.sub.3 plant including but not limited to a C.sub.3 plant mesophyll cell, a C.sub.3 plant bundle sheath cell, a C.sub.3 plant mesophyll cell chloroplast, a C.sub.3 plant bundle sheath cell chloroplast, a C.sub.3 plant mesophyll cell mitochondrion, a C.sub.3 plant bundle sheath cell mitochondrion. Additionally or alternatively, the UPF0114 family protein may be capable of exporting pyruvate from any cell type or subcellular organelle within a C.sub.3 plant including but not limited to: a C.sub.3 plant mesophyll cell, a C.sub.3 plant bundle sheath cell, a C.sub.3 plant mesophyll chloroplast, a C.sub.3 plant bundle sheath cell chloroplast.
[0199] By way of further non-limiting example, a C.sub.4 photosynthetic plant cell (e.g. a C.sub.4 plant vascular sheath cell, a C.sub.4 plant bundle sheath cell, a C.sub.4 plant mestome sheath cell or a C.sub.4 plant mesophyll cell) may be engineered to express or overexpress a UPF0114 family protein capable of importing and/or exporting carboxylates/carboxylic acids (e.g. monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids) across membrane/s of the cell (e.g. those of organelles such as chloroplasts and/or mitochondria, and/or the cytoplasmic membrane). The UPF0114 family protein may, for example, be a UPF0114 protein from a C.sub.3 plant, a C.sub.4 plant, a CAM plant, an alga, a virus, a bacterium or an archaeon.
[0200] In some embodiments, the UPF0114 family protein may be capable of importing malate into any cell type or subcellular organelle within a C.sub.4 plant including but not limited to: a C.sub.4 plant mesophyll cell, a C.sub.4 plant bundle sheath cell, a C.sub.4 plant mesophyll cell chloroplast, a C.sub.4 plant bundle sheath cell chloroplast, a C.sub.4 plant mesophyll cell mitochondrion, a C.sub.4 plant bundle sheath cell mitochondrion. Additionally or alternatively, the UPF0114 family protein may be capable of exporting pyruvate from any one or more of: a C.sub.4 plant mesophyll cell, a C.sub.4 plant bundle sheath cell, a C.sub.4 plant mesophyll chloroplast, a C.sub.4 plant bundle sheath cell chloroplast.
[0201] By way of further non-limiting example, a plant cell that conducts crassulacean acid metabolism (CAM) (e.g. a CAM plant vascular sheath cell, a CAM plant bundle sheath cell, a CAM plant mestome sheath cell, or a CAM plant mesophyll cell) may be engineered to express or overexpress a UPF0114 family protein capable of importing and/or exporting carboxylates/carboxylic acids (e.g. monocarboxylates/monocarboxylic acids, and/or dicarboxylates/dicarboxylic acids, and/or tricarboxylates/tricarboxylic acids) across membrane/s of the cell (e.g. those of organelles such as chloroplasts and/or mitochondria, and/or the cytoplasmic membrane). The UPF0114 family protein may, for example, be a UPF0114 protein from a C.sub.3 plant, a C.sub.4 plant, a CAM plant, an alga, a virus, a bacterium or an archaeon.
[0202] In some embodiments, the UPF0114 family protein may be capable of importing malate into any cell type or subcellular organelle within a CAM plant including but not limited to: a CAM plant mesophyll cell, a CAM plant bundle sheath cell, a CAM plant mesophyll cell chloroplast, a CAM plant bundle sheath cell chloroplast, a CAM plant mesophyll cell mitochondrion, a CAM plant bundle sheath cell mitochondrion. Additionally or alternatively, the UPF0114 family protein may be capable of exporting pyruvate from any one or more of: a CAM plant mesophyll cell, a CAM plant bundle sheath cell, a CAM plant mesophyll chloroplast, a CAM plant bundle sheath cell chloroplast.
[0203] Methods for producing transgenic plants are well known to persons skilled in the art (see, for example, Gamborg and Phillips, 1995, Plant cell, tissue and organ culture: fundamental methods. Springer, Berlin; Low et al. 2018, ‘Transgenic Plants: Gene Constructs, Vector and Transformation Method’ in New Visions in Plant Science, Çelik (Ed), IntechOpen; Transgenic Crop Plants, Volume 1. Principles and Development, 2010, Kole, Michler, Abbott, Hall, (Eds.)).
[0204] In some embodiments, the transgenic plants may be monocotyledonous. In other embodiments, the transgenic plants may be dicotyledonous. In still other embodiments, the transgenic plants may be a genus Oryza plant such as, for example, a rice plant (e.g. an Oryza sativa plant or an Oryza glaberrima plant).
[0205] In some embodiments, the transgenic plant may be soy (Glycine max), cotton (Gossypium hirsutum), oilseed rape/canola (Brassica napus subsp. napus), potato (Solanum tuberosum), tomato (Solanum lycopersicum), cassava (Manihot esculenta), maize (Zea mays), sorghum (Sorghum bicolor), sugar cane (Saccharum officinarum), foxtail millet (Setaria italica), proso millet (Panicum miliaceum), miscanthus (Miscanthus giganteus), wheat (Triticum aestivum), barley (Hordeum vulgare), pigeon pea (Cajanus cajan), cowpea (Vigna unguiculata), pea (Pisum sativum), cannabis (Cannabis sativa), sugar beet (Beta vulgaris), oat (Avena sativa), rye (Secale cereale), peanut (Arachis hypogaea), sunflower (Helianthus annuus), flax (Linum spp.), beans (Phaseolus vulgaris), lima bean (Phaseolus lunatus), mung bean (Phaseolus mung), adzuki bean (Phaseolus angularis), chickpea (Cicer arietinum), tobacco (Nicotiana tabacum), buckwheat (Fagopyrum esculentum), oil palm (Elaeis guineensis), or rubber (Hevea brasiliensis).
[0206] Also provided are seeds obtained from the transgenic plants of the present invention.
Methods of Use
[0207] Provided herein are methods for exploiting the recombinant cells of the present invention.
[0208] Without limitation, the recombinant cells may be used in metabolite production given that they provide a means of exporting carboxylates/carboxylic acids with or against concentration gradients. For example, the recombinant cells of the present invention can be used in the commercial production of carboxylates such as pyruvate or succinate, which may in turn be used as building blocks for a large range of complex chemicals, non-limiting examples of which include polymers, solvents and pharmaceuticals. In some embodiments, biological production of these metabolites may occur by fermentation from cheaper sugars. The microorganisms currently used for bioproduction of carboxylates either naturally accumulate, or have been engineered to accumulate, high concentrations of carboxylates within the cell. A large component of the cost of biological production of these metabolites is attributable to the process of extracting the metabolites from the cells and subsequently separating them from other cellular contaminants. Thus, the recombinant cells and methods of the present invention may provide a substantial reduction in the cost of carboxylate production by specifically exporting these metabolites from cells during the process of fermentation. In other embodiments, carboxylates may be overproduced in the recombinant cells of the present invention, and similarly exported via UPF0114 family proteins engineered into membrane/s of the cell to facilitate more efficient and simplified collection.
[0209] Further methods of the present invention involve the generation of transgenic plants as described above. The transgenic plants will ideally have an increased photosynthetic rate as compared to a corresponding wild-type plant. In some embodiments, the transgenic plants are constructed from C.sub.3 photosynthetic plants to include C.sub.4 photosynthetic traits. In other embodiments, the transgenic plants are constructed from C.sub.3 photosynthetic plants to include crassulacean acid metabolism (CAM) photosynthetic traits. In still other embodiments, the transgenic plants are constructed from C.sub.4 photosynthetic plants in which photosynthesis has been improved by overexpression of UPF0114 family proteins.
EXAMPLES
[0210] The present invention will now be described with reference to specific Examples, which should not be construed as in any way limiting.
Example One
The Gene Family Encodes a Family of Carboxylate and Phosphorylated Carboxylate Transporters
[0211] To characterise the transport activities of these representative members of this gene family the genes were cloned into an inducible expression vector (
[0212] In total, the transport activities of the proteins encoded by 8 different members of the UPF0114 gene family were subjected to experimental interrogation. These comprised: 1) the protein encoded by the yqhA gene in Escherichia coli, for which the complete amino acid sequence is shown in SEQ ID NO: 1; 2) the protein encoded by the AT4G19390 gene in Arabidopsis thaliana, for which the complete amino acid sequence is shown in SEQ ID NO: 2; 3) the protein encoded by the Sevir.4G287300 gene in Setaria viridis, for which the complete amino acid sequence is shown in SEQ ID NO: 6; 4) the protein encoded by the GRMZM2G179292 gene in Zea mays, for which the complete amino acid sequence is shown in SEQ ID NO: 9; 5) the protein encoded by the GRMZM2G133400 gene in Zea mays, for which the complete amino acid sequence is shown in SEQ ID NO: 10; and 6) the protein encoded by the GRMZM2G327686 gene in Zea mays, for which the complete amino acid sequence is shown in SEQ ID NO: 11. In the case of the Escherichia coli yqhA gene, a nucleotide sequence encoding the complete amino acid sequence shown in SEQ ID NO: 1 was used and this gene was cloned into the inducible expression plasmid to generate plasmid 1.
[0213] In the case of the Arabidopsis thaliana, Setaria viridis and Zea mays members of the gene family, the nucleotide sequences corresponding to the protein sequences described above were designed to be codon optimised for expression in E. coli. In addition, the introns present in these genes were removed such that the nucleotide sequence comprised only coding sequence. Furthermore, the chloroplast transit peptides were removed to prevent misfolding or mistargeting of the protein in E. coli. These synthetic nucleotide sequences are shown in SEQ ID NOs: 7, 8, 12, 13 and 14. These genes were individually cloned into the inducible expression plasmid to generate plasmids 2-6.
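The sequence design steps described above (removal of the transit peptide, then codon optimisation of the remaining coding sequence) can be illustrated with a minimal Python sketch. The codon table, the function names, the toy protein and the transit peptide length are all hypothetical and chosen for illustration only. They are not the sequences or the optimisation procedure used to generate SEQ ID NOs: 7, 8, 12, 13 and 14. Restoring an initiator methionine after truncation is likewise an assumption of this sketch.

```python
# Illustrative single-codon-per-residue table for E. coli expression.
# ASSUMPTION: these are commonly cited preferred codons; the table actually
# used for the synthetic sequences in this document is not stated.
ECOLI_CODONS = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}
STOP_CODON = "TAA"


def strip_transit_peptide(protein: str, transit_len: int) -> str:
    """Drop the first `transit_len` residues; prepend Met if it is missing."""
    if not 0 <= transit_len < len(protein):
        raise ValueError("transit_len must be shorter than the protein")
    mature = protein[transit_len:]
    # ASSUMPTION: an initiator Met is restored so the construct can be translated.
    return mature if mature.startswith("M") else "M" + mature


def back_translate(protein: str) -> str:
    """Return an intron-free coding sequence using one codon per residue."""
    return "".join(ECOLI_CODONS[aa] for aa in protein) + STOP_CODON


if __name__ == "__main__":
    toy_protein = "MASLLRSTAVKG"   # hypothetical, not a sequence from this document
    mature = strip_transit_peptide(toy_protein, transit_len=6)
    print(mature)                  # MSTAVKG
    print(back_translate(mature))  # ATGAGCACCGCGGTGAAAGGCTAA
```

A one-codon-per-residue table is the simplest possible form of codon optimisation. Practical designs usually also balance codon frequencies and avoid problematic motifs, which this sketch does not attempt.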
[0214] Independent E. coli cell lines were generated such that each contained one of the inducible plasmids listed above. Specifically, cell line 1 contained plasmid 1, cell line 2 contained plasmid 2, cell line 3 contained plasmid 3, cell line 4 contained plasmid 4, cell line 5 contained plasmid 5, cell line 6 contained plasmid 6.
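For reference, the relationship between cell lines, plasmids, genes and protein sequences described above can be written down as a small lookup structure. The following Python sketch is illustrative only. The class and function names are hypothetical, and the assignment of the three Zea mays genes to cell lines 4, 5 and 6 assumes that plasmids were numbered in the order in which the genes are listed, which the text implies but does not state line by line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellLine:
    number: int          # cell line number; equal to the plasmid number it carries
    gene: str            # gene identifier as given in the text
    species: str         # source organism of the gene
    protein_seq_id: int  # SEQ ID NO of the complete amino acid sequence


# ASSUMPTION: lines 4-6 follow the listing order of the Zea mays genes.
CELL_LINES = [
    CellLine(1, "yqhA", "Escherichia coli", 1),
    CellLine(2, "AT4G19390", "Arabidopsis thaliana", 2),
    CellLine(3, "Sevir.4G287300", "Setaria viridis", 6),
    CellLine(4, "GRMZM2G179292", "Zea mays", 9),
    CellLine(5, "GRMZM2G133400", "Zea mays", 10),
    CellLine(6, "GRMZM2G327686", "Zea mays", 11),
]


def lines_for_species(species: str) -> list[CellLine]:
    """Return every cell line whose transporter gene comes from `species`."""
    return [line for line in CELL_LINES if line.species == species]


if __name__ == "__main__":
    for line in lines_for_species("Zea mays"):
        print(line.number, line.gene, f"SEQ ID NO: {line.protein_seq_id}")
```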
[0215] To characterise the metabolites that were exported by the transporters, cell lines 1, 2 and 3 (containing the plasmids expressing yqhA, AT4G19390 and Sevir.4G287300 respectively) were grown in M9 minimal medium supplemented with 22 mM glucose as the sole carbon source (henceforth referred to as M9 glucose). No other carbon-containing molecules were added to the medium and thus glucose was the sole carbon source available to the cells for growth and respiration.
[0216] These three cell lines were pre-grown overnight from a cell culture with an optical density measured at a wavelength of 600 nm (OD600) of 0.1 in 50 ml of M9 glucose. The following day, each cell line was subcultured to an OD600 of 0.1 in M9 glucose in two separate flasks. Both flasks were allowed to grow to an OD600 of 0.2 and then expression of the transporter gene was induced in one flask by addition of 50 μM 2,4-diacetylphloroglucinol (DAPG) to the cell culture medium. As the DAPG stock solution was dissolved in ethanol, an equivalent volume of ethanol without DAPG was added to the non-induced control flasks. Samples of cell culture were taken from both the induced and non-induced control flasks at time 0 and at three hours following induction of transporter gene expression. The cell culture was spun at 13,000 g for five minutes at 4° C. Following centrifugation, the supernatant was aspirated and the cell pellet discarded. In each case, 20 μl of ice-cold supernatant was subjected to metabolite extraction by mixing with 350 μl of CHCl.sub.3/CH.sub.3OH (3:7 v/v) and incubating at −20° C. for two hours with mixing. At two hours, 350 μl of ice-cold water was added to this mixture, which was then allowed to warm up to 4° C. This mixture was centrifuged at 13,000 g for ten minutes at 4° C. After this, the upper aqueous-CH.sub.3OH phase was transferred to a 1.5 ml tube. The remaining CHCl.sub.3 phase was re-extracted with 300 μl of ice-cold water and the upper aqueous-CH.sub.3OH phase was removed as before. The two upper aqueous-CH.sub.3OH phases were then combined and dried using a centrifugal vacuum dryer. Samples were analysed by LC-MS/MS with authentic standards for accurate metabolite quantification.
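The volumes implied by this protocol follow from simple dilution arithmetic, which can be sketched in Python as below. The function names are hypothetical. The overnight OD600 of 2.0 and the DAPG stock concentration of 50 mM used in the example are assumed values that are not given in the text. Only the target OD600 of 0.1, the 50 μM final DAPG concentration and the 350 μl of 3:7 (v/v) extraction solvent come from the protocol above, and the 50 ml subculture volume is assumed to match the pre-culture volume.

```python
def subculture_volume_ml(overnight_od: float, target_od: float,
                         final_volume_ml: float) -> float:
    """Volume of overnight culture to dilute to `target_od` in `final_volume_ml`."""
    if overnight_od <= target_od:
        raise ValueError("overnight culture must be denser than the target")
    return target_od * final_volume_ml / overnight_od


def stock_volume_ul(stock_mM: float, final_uM: float, culture_ml: float) -> float:
    """Inducer stock volume (ul) from C1*V1 = C2*V2.

    (final_uM * 1e-3 mM) * (culture_ml * 1e3 ul) / stock_mM simplifies to
    final_uM * culture_ml / stock_mM, with the result in microlitres.
    """
    return final_uM * culture_ml / stock_mM


def split_solvent_ul(total_ul: float, parts=(3, 7)) -> tuple[float, float]:
    """Split a v/v mixture into (first, second) component volumes."""
    first = total_ul * parts[0] / sum(parts)
    return first, total_ul - first


if __name__ == "__main__":
    # ASSUMPTION: overnight OD600 of 2.0 and a 50 mM DAPG stock in ethanol.
    print(subculture_volume_ml(overnight_od=2.0, target_od=0.1, final_volume_ml=50.0))
    print(stock_volume_ul(stock_mM=50.0, final_uM=50.0, culture_ml=50.0))
    chloroform_ul, methanol_ul = split_solvent_ul(350.0)  # CHCl3/CH3OH at 3:7 v/v
    print(chloroform_ul, methanol_ul)
```

The same volume returned by `stock_volume_ul` would be added as ethanol alone to the non-induced control flask, so that both flasks receive an equivalent volume of solvent.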
[0217] Expression of all three transporters (E. coli yqhA, A. thaliana AT4G19390, and Setaria viridis Sevir.4G287300) resulted in the export of the monocarboxylate/monocarboxylic acid pyruvate to the cell culture medium (
[0218] Expression of both of the representative plant members of this gene family resulted in the export of a range of dicarboxylates/dicarboxylic acids (
[0219] Expression of the Setaria viridis member of this gene family resulted in the export of the tricarboxylates/tricarboxylic acid citrate (
[0220] Expression of both of the representative plant members of this gene family resulted in the export of a range of phosphorylated carboxylates (
[0221] To confirm that all members of the gene family share this transport function, the cell lines containing plasmids 4, 5 and 6 were also subjected to analysis. Here these cell lines were pre-grown overnight from a cell culture with an optical density measured at a wavelength of 600 nm (OD600) of 0.1 in 50 ml of M9 glucose. The following day, each cell line was subcultured to an OD600 of 0.1 in M9 glucose in two separate flasks. Both flasks were allowed to grow to an OD600 of 0.2 and then expression of the transporter gene was induced in one flask by addition of 50 μM 2,4-diacetylphloroglucinol (DAPG) to the cell culture medium. As the DAPG stock solution was dissolved in ethanol, an equivalent volume of ethanol without DAPG was added to the non-induced control flasks. Samples of cell culture were taken from both the induced and non-induced control flasks at time 0 and at six hours following induction of transporter gene expression. The cell culture was spun at 13,000 g for five minutes at 4° C. Following centrifugation, the supernatant was aspirated and the cell pellet discarded. The concentration of pyruvate in cell culture supernatants was assessed using a pyruvate oxidase-based enzymatic assay with colorimetric detection (abcam ab65342) according to the manufacturer's instructions. Colorimetric detection was performed using a plate reader (FLUOstar Omega, BMG Labtech), and pyruvate concentration was calculated by comparison to the standard curve. In all cases, the expression of the genes encoding different members of the UPF0114 protein family resulted in the export of the monocarboxylate pyruvate. Pyruvate was not exported from non-induced cells (
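Calculating a concentration by comparison to a standard curve typically means fitting a line through the absorbance readings of known standards and inverting it for each sample. A minimal Python sketch of that calculation is given below. It assumes a linear response, and the standard amounts and absorbance values are hypothetical numbers chosen for illustration. They are not data from this document or from the assay manufacturer.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amount_from_absorbance(absorbance: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to recover the amount of analyte."""
    return (absorbance - intercept) / slope


if __name__ == "__main__":
    # HYPOTHETICAL standards: pyruvate amount per well and measured absorbance.
    standards = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
    readings = [0.05, 0.25, 0.45, 0.65, 0.85, 1.05]
    slope, intercept = fit_line(standards, readings)
    sample = amount_from_absorbance(0.55, slope, intercept)
    print(f"slope={slope:.3f} intercept={intercept:.3f} sample={sample:.2f}")
```

Samples whose absorbance falls outside the range of the standards should be diluted and re-measured rather than extrapolated from the fitted line.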
Example Two
The Transporter Can Transport Metabolites Both With and Against a Concentration Gradient
[0222] The intracellular concentration of pyruvate in E. coli is 390 μM. To demonstrate that the transporter can export metabolites against a concentration gradient, the experiment described in Example One was repeated using the nucleotide sequence of the Sevir.4G287300 gene from Setaria viridis (amino acid sequence shown in SEQ ID NO: 6). This time the M9 glucose growth medium was supplemented with different concentrations of additional pyruvate such that the concentration of pyruvate outside the cell was higher than inside the cell. Initial starting concentrations were chosen to be 0 μM, 300 μM and 700 μM. In all cases, pyruvate was exported from the cells. In the case of both the 300 μM and 700 μM starting concentrations, pyruvate was exported such that pyruvate accumulated to concentrations exceeding the intracellular concentration by three hours (
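The comparison underlying this experiment can be made explicit with a short Python sketch that checks each starting concentration against the stated intracellular value of 390 μM. Treating the intracellular concentration as a fixed constant is a simplifying assumption of this sketch, and the function name is hypothetical.

```python
INTRACELLULAR_PYRUVATE_UM = 390.0        # value stated in the text
STARTING_MEDIUM_UM = [0.0, 300.0, 700.0]  # starting concentrations stated in the text


def export_is_against_gradient(extracellular_um: float,
                               intracellular_um: float = INTRACELLULAR_PYRUVATE_UM) -> bool:
    """True when exporting pyruvate would move it towards the higher concentration."""
    return extracellular_um > intracellular_um


if __name__ == "__main__":
    for start in STARTING_MEDIUM_UM:
        uphill = export_is_against_gradient(start)
        print(f"{start:>5.0f} uM at t=0: against gradient = {uphill}")
```

On these numbers, only the 700 μM medium exceeds the intracellular value at time zero. For the 300 μM medium, export becomes export against the gradient once the accumulated extracellular pyruvate passes 390 μM, which the text reports had occurred by three hours.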
[0223] Example Three: The transporters facilitate bidirectional transport of metabolites
[0224] Under aerobic conditions the dicarboxylate/dicarboxylic acid transporter dctA is solely responsible for uptake of dicarboxylates in E. coli. When the gene encoding dctA is deleted from the E. coli genome, dicarboxylates/dicarboxylic acids can no longer enter the cell and thus E. coli cannot grow on malate as a sole carbon source (
[0225] The inducible expression plasmid containing the Sevir.4G287300 gene from Setaria viridis was transformed into the dctA knockout line (ΔdctA). ΔdctA lines harbouring the inducible expression plasmid were pre-grown overnight from a cell culture with an OD600 of 0.1 in 50 ml of M9 glucose. The following day, the cell line was subcultured to an OD600 of 0.2 in M9 glucose in two separate flasks. Expression of the transporter gene was induced in one flask by addition of 50 μM 2,4-diacetylphloroglucinol (DAPG) to the cell culture medium. As the DAPG stock solution was dissolved in ethanol, an equivalent volume of ethanol without DAPG was added to the non-induced control flasks. Cell lines were incubated for 2 hours to allow transporter gene expression. Cells were subsequently isolated by centrifugation at 13,000 g for 5 min and washed twice in M9 (+/−DAPG as appropriate) with no carbon source. Cells were then resuspended in M9 malate (+/−DAPG as appropriate) and samples of cell-free supernatant were collected after two and three hours. Pyruvate levels were measured in the supernatant using a colorimetric assay. Pyruvate was readily exported from the cells in the presence of malate, but not in the absence of malate as a carbon source (
Example Four
In C.sub.3 Plants the Transporter Localises to Chloroplasts
[0226] The AT4G19390 gene from Arabidopsis thaliana was tested for subcellular localisation using C-terminal GFP fusions in Arabidopsis thaliana leaf protoplasts. The nucleotide sequence corresponding to the full length amino acid sequence including the predicted chloroplast transit peptide (SEQ ID NO: 2) and with original endogenous codon use, but lacking any introns, was expressed from a constitutive expression vector. The same vector expressing GFP was used as a control.
[0227] The Arabidopsis thaliana AT4G19390 gene expressed as a C-terminal GFP fusion in leaf cell protoplasts localised to foci on the periphery in chloroplasts (
[0228] To further confirm this localisation in C.sub.3 plants a C-terminal GFP fusion of the Seita.4G275500 gene from Setaria italica (SEQ ID NO: 8) was expressed in protoplasts isolated from Oryza sativa (rice) sheath tissue (
[0229] To further confirm this localisation in C.sub.3 plants a C-terminal GFP fusion of the AT4G19390 gene from Arabidopsis thaliana (SEQ ID NO: 2) was expressed in intact plant leaves from Nicotiana benthamiana (
Example Five
In C.sub.4 Plants the Transporter Can Localise to the Chloroplast and to the Plasma Membrane
[0230] The Setaria italica member of this gene family was tested for subcellular localisation using C-terminal GFP fusions in Setaria viridis leaf protoplasts. The nucleotide sequence corresponding to the full length amino acid sequence including the predicted chloroplast transit peptide (SEQ ID NO: 3) and with original endogenous codon use, but lacking any introns, was expressed from a constitutive expression vector. The same vector expressing GFP was used as a control.
[0231] The Setaria italica gene expressed as a C-terminal GFP fusion in leaf cell protoplasts localised to foci in chloroplasts (
[0232] Example Six: RNAi knockdown of the transporter disrupts C.sub.4 photosynthesis
[0233] As the protein encoded by the Setaria italica representative member of this gene family can take up malate and export pyruvate, and as it localises to the chloroplast envelope, and as it is very highly expressed in bundle sheath cells of the C.sub.4 plant Setaria viridis (
[0234] The construct was transformed into callus generated from the Setaria viridis ME034V ecotype. Transgenic plants were screened by PCR for presence of the insert in the T0 generation. Plants that were positive for the selectable marker gene and for the RNAi fragment were taken forward for screening by quantitative PCR. T0 plants with low levels of expression of the Setaria viridis gene Sevir.4G287300 were selected. Plants had ˜10% levels of expression of the gene compared to wild-type plants (
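Relative expression from quantitative PCR is commonly calculated with the comparative Ct (2^-ΔΔCt) method. The text does not state which calculation was used, so the Python sketch below is an illustration of that general approach only. It assumes perfect amplification efficiency, and the function name and all Ct values are hypothetical. The example values are chosen so that the result lands near the ˜10% level described above.

```python
def relative_expression(ct_target_sample: float, ct_reference_sample: float,
                        ct_target_control: float, ct_reference_control: float) -> float:
    """Comparative Ct (2^-ddCt) estimate of expression relative to a control.

    ASSUMPTION: amplification efficiency of 100% (product doubles each cycle)
    for both the target gene and the reference gene.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


if __name__ == "__main__":
    # HYPOTHETICAL Ct values for a knock-down plant and a wild-type plant.
    level = relative_expression(
        ct_target_sample=23.32, ct_reference_sample=18.0,
        ct_target_control=20.0, ct_reference_control=18.0,
    )
    print(f"expression relative to wild type: {level:.1%}")
```

Normalising to a reference gene in each plant corrects for differences in the amount of input RNA, so the ratio reflects the knock-down rather than sample loading.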
[0235] Knock-down plants were subjected to photosynthesis phenotyping using a LI-COR LI-6800 to measure photosynthetic rate. Curves of photosynthetic response to CO.sub.2 concentration (also known as CO.sub.2 response curves or A/C.sub.i curves) were measured. This revealed that knock-down of the transporter severely disrupted C.sub.4 photosynthesis (
Example Seven
Pyruvate Efflux Activity Can be Stimulated by the Presence of Exogenous Malate
[0236] The import of malate and efflux of pyruvate from cells expressing members of the UPF0114 gene family is compatible with the hypothesis that the proteins of this family can function as antiporters. A key prediction of this hypothesis is that E. coli cells expressing any member of this gene family, when fed on glucose, will show a rapid and substantial increase in pyruvate efflux if malate (and not other dicarboxylates) is added to the cell culture medium. To test this prediction, E. coli ΔdctA cells were grown on glucose, then expression of the Setaria italica Seita.4G275500 gene (SEQ ID NO: 8) was induced, different four-carbon dicarboxylates were added to the cell culture medium, and rapid changes to pyruvate efflux rate were assessed. Stimulated pyruvate efflux was only detected in cells that were supplemented with exogenous malate (
Example Eight
Members of the UPF0114 Gene Family are Highly Expressed in Plants that Conduct CAM Photosynthesis
[0237] As well as being key metabolites of the C.sub.4 photosynthetic pathway, pyruvate and malate are also key metabolites of CAM photosynthesis. In the CAM photosynthetic pathway malate is biosynthesised and accumulated during the night and then decarboxylated during the day. This process stores CO.sub.2 at night and releases it during the day to enhance CO.sub.2 concentration around RuBisCO. This process enhances the water use efficiency of the plant as it allows the plants to shut their stomata during the day and thus reduce water loss through transpiration.
[0238] Several species of plant perform inducible CAM photosynthesis whereby they can switch between C.sub.3 and CAM photosynthesis depending on conditions. Under well-watered growth conditions these plants perform normal C.sub.3 photosynthesis. However, under drought conditions, or when water is scarce, these plants switch to using CAM photosynthesis to improve their water use efficiency. Accordingly, there are two hallmarks that characterise genes that are involved in the CAM photosynthetic pathway. 1) The transcripts corresponding to the genes show a substantial increase in abundance when plants switch from C.sub.3 to CAM photosynthesis and the CAM pathway becomes active. 2) When conducting CAM photosynthesis, the transcripts corresponding to the genes accumulate differentially between the day and the night. Transcriptome analysis of two different inducible CAM plant species demonstrates that the members of the UPF0114 gene family display both of these hallmarks of functioning in CAM photosynthesis. Specifically, analysis of the transcriptome of Talinum triangulare (Brilhaus et al. 2016. Plant Physiology 170(1) 102-122) revealed that the transcripts corresponding to the ortholog of AT4G19390 in Talinum triangulare (Tt48731, SEQ ID NOs 15 and 16) substantially increase in abundance when the plant switches from C.sub.3 to CAM photosynthesis (
INCORPORATION BY CROSS REFERENCE
[0239] The present application claims priority from Australian provisional patent application number 2019902940, the entire contents of which are incorporated herein by cross-reference.