Modified expression of prolyl-4-hydroxylase in Physcomitrella patens

Abstract

The field of the invention relates to a method for the production of a recombinant protein in a plant-based system comprising the steps of providing a plant-based system comprising a modulation for a plant endogenous prolyl-4-hydroxylase gene, delivering a gene encoding the recombinant protein into the plant-based system, and cultivating the plant-based system for the expression of the gene encoding the recombinant protein. The field of the invention further relates to a recombinant protein, which has been produced in a plant-based system. A plant-based system and use of the recombinant protein are also provided.

Claims

1. Cells derived from Physcomitrella patens, comprising an ablation of expression of the plant endogenous prolyl-4-hydroxylase 1 gene according to SEQ ID NO: 1 or comprising a down-regulation of expression of the plant endogenous prolyl-4-hydroxylase 1 gene according to SEQ ID NO: 1 by amiRNA or antisense RNA.

2. Plant cells derived from Physcomitrella patens, according to claim 1, wherein the ablation of expression of the plant endogenous prolyl-4-hydroxylase 1 gene comprises knockout of the prolyl-4-hydroxylase 1 gene.

3. A method for the manufacture of a recombinant protein comprising the steps of: providing cells according to claim 1; delivering a gene encoding the recombinant protein into said cells; and cultivating said cells for the expression of the gene encoding the recombinant protein.

4. The method according to claim 3 for the manufacture of recombinant human erythropoietin (rhEPO).

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) FIG. 1 Protein sequence comparison of P. patens putative prolyl-4-hydroxylases (P4Hs): PpP4H1 (SEQ ID No: 2), PpP4H6a (SEQ ID No: 12), PpP4H6b (SEQ ID No: 14), PpP4H5 (SEQ ID No: 10), PpP4H2 (SEQ ID No: 4), PpP4H3 (SEQ ID No: 6), PpP4H4 (SEQ ID No: 8).

(2) FIG. 2 In vivo subcellular localization of P. patens P4H homologues

(3) FIG. 3 Schematic representation of the p4h knockout constructs

(4) FIG. 4 p4h gene expression analysis in recombinant moss lines

(5) FIG. 5 Mass spectrometric analysis of the hydroxylation of moss-produced rhEPO

(6) FIG. 6 MS/MS analysis of the peptide EAISPPDAASAAPLR (144-158) from moss-produced rhEPO

(7) FIG. 7 Effect of overexpression of the prolyl-hydroxylase gene p4h1

(8) FIG. 8 Analysis of the hydroxylation status of the N-terminal peptide of moss-produced rhEPO

(9) FIG. 9 Phylogenetic tree of the amino acid sequences of different plant prolyl-4-hydroxylases

DETAILED DESCRIPTION OF THE INVENTION

(10) The present disclosure provides a method for the production of a recombinant protein comprising only human-specific prolyl hydroxylation in a plant-based system, comprising the steps of providing a plant-based system, wherein the plant-based system comprises a modulation for a plant endogenous prolyl-4-hydroxylase gene, delivering a gene encoding the recombinant protein into the plant-based system, and cultivating the plant-based system for the expression of the gene encoding the recombinant protein.

(11) The term plant endogenous shall refer to the plant's own prolyl hydroxylase gene. In other words, if the plant-based system comprises plant cells derived from Physcomitrella patens, the prolyl-4-hydroxylase gene is also derived from Physcomitrella patens. It is not intended to insert an additional mammalian gene.

(12) The delivery of DNA shall be understood as the introduction of DNA into cells and tissue. Any known method in the state of the art may be used, for example transformation, particle bombardment, electroporation or viral transduction.

(13) Cultivation shall mean any cultivation technique known in the art that uses, in addition to standard laboratory equipment, the appropriate media, substituents and cultivation conditions for the respective cells.

(14) It was unexpectedly shown that the method yields recombinant proteins which may comprise only human-specific prolyl hydroxylation, meaning that all plant-specific prolyl hydroxylations can be eliminated.

(15) In this method, the plant-based system may comprise plant cells derived from Physcomitrella patens. The prolyl-4-hydroxylase gene may be the Physcomitrella patens prolyl-4-hydroxylase gene with the NCBI Accession No. XM_001753185. The recombinant protein may be recombinant human erythropoietin (rhEPO).

(16) The present disclosure also provides a recombinant protein which has been produced in a plant-based system according to the above-described method. The plant-based system therefore comprises modulation of a plant endogenous prolyl-4-hydroxylase gene. The recombinant protein may comprise only human-specific prolyl hydroxylation, or it may lack plant-specific prolyl hydroxylation at one or more plant-specific prolyl hydroxylation sites.

(17) A plant-based system refers to plant cells or cells derived from plant cells. A plant-based system comprising a knock-out allele shall mean that the plant-based system is genetically modified so that a wild-type allele of the gene is replaced by an engineered construct. The expression of the respective gene can thus be down-regulated or completely abolished. It has to be noted that even the down-regulation of a single p4h gene has been shown to be sufficient.

(18) The plant-based system before genetic modification can be wildtype or mutant. Wildtype sequences within the meaning of the present disclosure refer to the non-mutated version of a gene common in nature or the allele required to produce the wildtype phenotype. The wildtype phenotype is the most common form or phenotype in nature or in a natural breeding population.

(19) Recombinant proteins are derived from DNA sequences that in turn result from the use of molecular cloning to bring together genetic material from multiple sources, creating sequences that would not otherwise be found in biological organisms. A recombinant human protein for instance is derived from human DNA sequences which have been modified by genetic material from multiple sources.

(20) Human-specific prolyl hydroxylation shall mean that the recombinant human protein comprises no plant-specific prolyl hydroxylations. Plant-specific prolyl hydroxylation is the hydroxylation of prolines, which is performed by the plant's unmodulated enzymes. Thus, when a recombinant human protein is expressed in a plant-based system, the plant's enzymes will hydroxylate the prolines in a plant-specific manner, giving rise to non-human O-glycosylation of the recombinant human protein. Thus, elimination of the plant-specific prolyl hydroxylation has the advantage that adverse O-glycosylation is avoided. Recombinant human proteins produced in a plant-based system can thus be humanized via glyco-engineering.

(21) Given the great importance of O-glycosylated proteins for the human body, even slight differences between recombinant human proteins produced in a plant-based system and their native human counterparts in this posttranslational modification will hamper approval of the drug by the relevant authorities. Thus, the present approach is to precisely eliminate the attachment sites for plant-specific O-glycosylation, hydroxylated proline residues, on the recombinant human protein.

(22) The plant-based system may comprise plant cells derived from Physcomitrella patens. The prolyl-4-hydroxylase gene may be the Physcomitrella patens prolyl-4-hydroxylase gene with the NCBI Accession No. XM_001753185.

(23) It was unexpectedly shown that ablation of the gene with the NCBI Accession No. XM_001753185 can abolish undesired prolyl hydroxylation. Surprisingly, growth rate, differentiation, rhEPO productivity and secretion of the protein to the culture medium were not impaired in these knockout plants compared to the parental line.

(24) Physcomitrella patens shall refer to the wildtype or the mutated moss.

(25) In a further embodiment of the present disclosure, the recombinant protein is recombinant human erythropoietin (rhEPO).

(26) The present disclosure also provides a plant-based system comprising a modulation for a plant endogenous prolyl-4-hydroxylase gene, wherein the plant-based system comprises plant cells derived from Physcomitrella patens and wherein further the prolyl-4-hydroxylase gene is the Physcomitrella patens prolyl-4-hydroxylase gene with the NCBI Accession No. XM_001753185 for the production of a recombinant protein, wherein the recombinant protein does not comprise any non-human prolyl hydroxylation.

(27) The plant-based system may be the Physcomitrella patens mutant deposited with the International Moss Stock Center under IMSC No. 40218.

(28) It is a further object of the present disclosure to use the recombinant protein as a pharmaceutical, including biopharmaceuticals, or for the manufacture of a pharmaceutical.

(29) Biopharmaceuticals are pharmaceuticals produced using biotechnological means. They can be, for example, proteins (including antibodies) or nucleic acids (DNA, RNA or antisense oligonucleotides) and can be used for therapeutic or in vivo diagnostic purposes. They are produced by means other than direct extraction from a native (non-engineered) biological source. For example, biopharmaceuticals can be produced in genetically modified plants.

(30) It is intended that the recombinant protein of the present disclosure can be used as a biopharmaceutical because it comprises no non-human, and in particular no plant-specific, prolyl hydroxylation.

EXPERIMENTS

Experiment 1: Identification of Physcomitrella patens prolyl-4-hydroxylases (P4Hs)

(31) For the identification of prolyl-4-hydroxylase homologues in P. patens, the amino acid sequence of the Arabidopsis thaliana P4H1 (AT2G43080.1) was used to perform a BLAST (basic local alignment search tool) search against the gene models in the Physcomitrella patens resource (cosmoss.org). Six sequences from the Physcomitrella patens genome with homology to P4H enzymes were identified: Pp1s8_114V6.1 (PpP4H1), Pp1s192_51V6.1 (PpP4H2), Pp1s19_322V6.1 (PpP4H3), Pp1s172_91V6.1 (PpP4H4), Pp1s12_247V6.1 (PpP4H5) and Pp1s328_29V6.1 (PpP4H6). As sequence information was not complete for the Ppp4h2, Ppp4h3 and Ppp4h6 mRNAs, 5′ RACE-PCR (rapid amplification of cDNA ends; GeneRacer, Invitrogen, Karlsruhe, Germany) was employed according to the manufacturer's protocol to obtain full-length sequences. Two different cDNAs were amplified for the Ppp4h6 gene, corresponding to alternative splice forms of the mRNA, from which two protein variants with different N-termini could be predicted (Ppp4h6a and Ppp4h6b). The six genes therefore give rise to seven predicted proteins.

(32) The following sequences were identified:

(33) TABLE-US-00001 P4H1cDNA(Pp1s8_114V6.1AccessionNo.: XM_001753185;SEQIDNO.1) GCAAGATCGTCTGATTGCGCGCACGTCGGAGATCGCTTAAAGTGAAGGTT GCATTGCTCTGGCAAGAAGTATTTGCAGGTAGGACGGTAGAGTCTGGATG CGCCAGAGTTGTCGGTTTGGCCTTCTTCGCAAGGGAGAAGAAGTCATGAT GCTTGGATTTAGCGAATTCGAAGAGCTGATCCTTGTTTTTCCGTCAGACT GGCAAGGGATGGAGTAATTCTACGAAGCGAGCGCGTCAGGGTTTGGTTTT AGGAAGCTGGGCTGCCACAGACACTTTTGACGATGGGTCCCTCTAGATAT GTCATTGTGCTCCTCACATTTGTGACGATCGGCATGGCTGGGGGGGCGTT ATTGCAGCTGGCTTTCTTGAAGAAGCTAGAACAAAGTAGTGGAGCTGGGA TTTACAATTATAGAAGAGAGATAGGGGAATACGAAAACCAAACATTTGGA TCGGGATTGTCCCTTTGGGCTAATGATGAAGATGCGAGAACACTACGTGT TGGACTGGTTAAGCAAGAAGTTATTAGCTGGCAACCCAGAATCATTCTCC TGCACAATTTCCTTAGTGCTGATGAATGTGATCACCTGATAAATCTTGCT CGCCCCAGGCTCGTGAAGTCAACAGTCGTGGATGCAACCACAGGCAAGGG AATCGAGAGTAAGGTTCGAACAAGCACAGGCATGTTCCTTAATGGAAATG ACCGCAGACATCACACTATTCAGGCAATCGAAACCCGTATTGCTGCGTAT TCTATGGTACCTGTTCAAAATGGGGAGCTCCTCCAAGTTTTACGATATGA ATCTGATCAATATTACAAGGCACATCACGACTACTTTTCAGATGAGTTCA ATTTAAAAAGGGGTGGGCAACGTGTGGCGACAATGCTTATGTACTTGACC GAGGGGGTCGAGGGAGGCGAAACAATATTTCCGCAGGCTGGAGATAAAGA GTGTAGCTGTGGCGGTGAAATGAAAATCGGCGTCTGTGTGAAACCTAAAC GAGGGGATGCTGTCCTGTTTTGGAGCATTAAGCTGGATGGACAAGTTGAT CCAACAAGCCTTCATGGTGGATGCAAAGTTTTGTCAGGAGAGAAATGGTC GTCTACCAAATGGATGAGGCAGCGAGCCTTTGATTAGGGTGAACTTTGGA TGGTAGGAGCTGTAATCATAGTAGAAGACCAATAATAGCGATTATGCCTC ATCATTCCGGAAGCTTTGCGGGCTTTTCCCGATGCATCTAAGAATGTATG TAATGAGCAACTTTGAATACTGTCAGTGATTCGTAACAAGAAAAAAATCG ATTTAGTGGTATTGTGGACTTTGAAATGAAGGTTAAGATCACGAAGAGCT TT TranslationcorrespondingtoP4H1cDNA (SEQIDNO.2) MGPSRYVIVLLTFVTIGMAGGALLQLAFLKKLEQSSGAGIYNYRREIGEY ENQTFGSGLSLWANDEDARTLRVGLVKQEVISWQPRIILLHNFLSADECD HLINLARPRLVKSTVVDATTGKGIESKVRTSTGMFLNGNDRRHHTIQAIE TRIAAYSMVPVQNGELLQVLRYESDQYYKAHHDYFSDEFNLKRGGQRVAT MLMYLTEGVEGGETIFPQAGDKECSCGGEMKIGVCVKPKRGDAVLFWSIK LDGQVDPTSLHGGCKVLSGEKWSSTKWMRQRAFD P4H2cDNA(Pp1s192_51V6.1AccessionNo.:JX964780; SEQIDNO.3) GTGATGCGTGATCCTGTGCTGCTGAGCGTGGGTTTTACCGACTTTAATCG GGCAAGGGCGTTGATGTTAACTTCTGCATCGTACTGGGAGGTTTGTCTAC 
ATCTCCGCGGGAATTTTCTGCGTCTTTTGGTGTGGATCCACAGCATGGCG TTGAGAGATAGAAGATGTAGTCTTATTCTAGCTCTCTTATTACTATCGGG ATTACAAGCATTGGGAGCTCGTGTGGAAGACTTGCCTGGTTGGATGGAAG AAATCAATGAGGTGAAGGATGCTGAGGGTGGCGTGATTCAACAAGTTTCT AGGATTGATCCCACTCGTGTCAAGCAGCTTTCGTGGAAACCGCGTGCATT TCTATATTCAAACTTTTTGTCAGATGCAGAGTGTGATCATATGATATCGT TGGCAAAGGACAAGCTGGAGAAGTCAATGGTGGCCGATAATGAATCTGGG AAGAGTGTGAAGAGTGAAATTCGCACTAGCTCAGGTATGTTTTTGATGAA GGGTCAGGATGATATCATATCAAGGATTGAGGATAGGATTGCTGCATGGA CCTTTCTACCGAAGGAGAATGGGGAGGCAATCCAGGTCTTGAGGTACCAA GATGGGGAGAAGTATGAGCCACATTTTGATTATTTCCACGATAAGAACAA TCAGGCTCTTGGAGGTCACCGCATTGCCACTGTGTTAATGTACCTCTCCG ACGTCGTCAAAGGTGGAGAGACAGTATTTCCTTCTTCTGAAGATCGAGGT GGTCCCAAGGATGATTCGTGGTCTGCTTGTGGGAAAACTGGGGTGGCCGT GAAACCAAGGAAAGGCGATGCCCTGCTCTTCTTCAGCCTACACCCCTCTG CAGTTCCAGATGAGTCAAGCTTACACACAGGATGCCCAGTTATCGAAGGG GAGAAATGGTCTGCTACAAAGTGGATCCATGTTGCTGCATTTGAAAAGCC GCGTCCTAAGAATGGTGCATGTGTAAATGAGGTCGACAGTTGCGAAGAGT GGGCAGCTTATGGGGAATGTCAGAAAAATCCAGCCTACATGGTTGGGACA AAAGAGTGGCCAGGCTATTGCCGGAAAGCATGCCATGTGTGCTAGGTAGG GATATACCGTATTTCTTGGTTGCACTCTGTTGGGTTAGGGTAGGATATTT AATGTATTTGTGTCATCATCTAAGTATTAGGTCAGTTTCCAAACCAAGGA ATCAGAGTTGTGGCTTTTGAAGAAGTATTATAGATCTTACGTACTAATTA AAAGGCTTGTGACCCTTGAGATGCACTTTATAAT TranslationcorrespondingtoP4H2cDNA (SEQIDNO.4) MRDPVLLSVGFTDFNRARALMLTSASYWEVCLHLRGNFLRLLVWIHSMAL RDRRCSLILALLLLSGLQALGARVEDLPGWMEEINEVKDAEGGVIQQVSR IDPTRVKQLSWKPRAFLYSNFLSDAECDHMISLAKDKLEKSMVADNESGK SVKSEIRTSSGMFLMKGQDDIISRIEDRIAAWTFLPKENGEAIQVLRYQD GEKYEPHFDYFHDKNNQALGGHRIATVLMYLSDVVKGGETVFPSSEDRGG PKDDSWSACGKTGVAVKPRKGDALLFFSLHPSAVPDESSLHTGCPVIEGE KWSATKWIHVAAFEKPRPKNGACVNEVDSCEEWAAYGECQKNPAYMVGTK EWPGYCRKACHVC P4H3cDNA(Pp1s19_322V6.1AccessionNo.: JX964781;SEQIDNO.5) CGGCGCTTTGCAACTCCAATTTTGACCAGGCGAAGTGCACTTTGACATCT TGTTGAATGTCCTCTTCTAGAGCATTGAACGGCCCTTCTGTGAACATTTT AAACTATTCAACGGATGCCATTGACAGTCGTGGTTTTTGAAGTTCGAATC CAGAGCCCTCGCCATCAAATCGTTGCAGTAATCCTTGGTGATTTAGCAAG CTCGGGATCACTTCATGGATTTGGGGTCCTTCCTCTGCAGAGGCTGTTAG TACACACACACTGCATCAACTCCTACTGGTCTGGAAGCTTTTGAGGTTGG 
AAATAGTATGAAAGAGTCCCAGACAATTGGTGTATTGAGTGGAAGAGGGT TGTGAAGTTTGGGCGCTCGACTGAAATGACCTGCGTGGATGTTAGAAAAT AAGCCAATTGGTGTTATGTAGAGATTCGTCACAACGCCCTCATTCCTCCA ACCCTTAAATGCCTTGCCCTATTTGTGTACTCTCGTGTGCGGGAATGACG CTGTCCTTATACAATATGAAGTCATCGAAAAACAAAGGAAGAAAATGGAA TCCTTTTACATACAAGCTCAGTTTGCCACAGGTGCTATTGTGGTGCACAA TCTGCCTCTTAGCAGGCTATGCCGCCTCCAATTTCTTCCCCCAGAAAATA GAAGAGGAAGCAATATATCAGCCGTATCGGAAATCGGCTCAGCAAGAAGG GGAATTTCCATTTGGTGAATTCAGTGAAAAAGTGGTGTTAGATCATGGTA GCACTGGGGACAACTTCATCGCTGACATTCCTTTCCAGGTGTTGAGCTGG AAGCCTCGTGCGCTCTTGTATCCGAGATTTGCTAGCAAGGAGCAATGCGA GGCCATCATGAAGCTTGCAAGGACTCGTCTTGCTCCTTCTGCTCTGGCTT TGAGGAAAGGGGAGAGTGAAGACTCAACGAAAGACATCCGAACTAGTTCC GGGACTTTCTTGAGAGCCGACGAAGACACGACGCGGAGTTTGGAGCAAGT TGAAGAGAAGATGGCGAAAGCAACCATGATACCTCGCGAGAATGGAGAGG CTTTCAATGTGTTGAAGTACAATGTGGGACAAAAATACGACTGCCATTAT GATGTTTTTGACCCAGCTGAGTATGGACCTCAACCAAGCCAACGGATGGC CTCCTTTCTCTTATATCTATCGGATGTGGAAGAGGGTGGAGAGACCATGT TTCCCTTCGAAAATTTTCAAAACATGAACATAGGCTTTGACTACAAGAAG TGCATTGGAATGAAAGTCAAGCCCCGCCAAGGTGATGCATTGCTTTTCTA CTCAATGCATCCTAACGGCACATTTGATAAGAGCGCTCTGCATGGAAGCT GCCCTGTAATCAAAGGCGAGAAATGGGTTGCCACAAAGTGGATTCGCAAC ACTGACAAATTTTGATCACCACCATGCGAACGTTTTTACGTCCAAAATTA GGACATAGGAATCTGTCAATCAAATTAAAGGACATATCTTTTATATCATT TAAAAATTCTGAAACTGAGAACTCATATGAACACCAGTTGAAACATTCGG GTCAACCGGATTATCGACAT TranslationcorrespondingtoP4H3cDNA (SEQIDNO.6) MPCPICVLSCAGMTLSLYNMKSSKNKGRKWNPFTYKLSLPQVLLWCTICL LAGYAASNFFPQKIEEEAIYQPYRKSAQQEGEFPFGEFSEKVVLDHGSTG DNFIADIPFQVLSWKPRALLYPRFASKEQCEAIMKLARTRLAPSALALRK GESEDSTKDIRTSSGTFLRADEDTTRSLEQVEEKMAKATMIPRENGEAFN VLKYNVGQKYDCHYDVFDPAEYGPQPSQRMASFLLYLSDVEEGGETMFPF ENFQNMNIGFDYKKCIGMKVKPRQGDALLFYSMHPNGTFDKSALHGSCPV IKGEKWVATKWIRNTDKF P4H4cDNA(Pp1s172_91V6.1AccessionNo.: XM_001774115;SEQIDNO.7) GTTACACAAATTCATCAACCTCGAGGCATTTGGTTCATCAGTGGATCCAT TTGTTGGGGTTTCGTGTGGATTGAGCTTGTGGGTTTCCTTCTCCGACTCG GAAATCGCTCCTGACAGAGTTTTCACGGAAGCTTTTGAGGCTGGAAACGG AGAAGGATTATTCCAAAGAATCGGTTTTTTAAAGTGTCACTTATCTTGTT TTCAAGGACAGTCTCAATAACAATTTGGCGCAATTATCTGCAATGATTTA 
CATGGATTGAATCGATTTTCAGTAGCTAAATGTAGGGTCTGCTAGGCCCT CTATATTCCGACCCTTGAGTGAAGACACTGCCTCCCAGGCAGTCCGTGCC TTATTTTAATCTCCTTGCGTGCAAAGAACAGGAAGGCTGACACCGATTAT AAACGGTTGAGACATGAAAACGCCAAAGGTCCGGGCAAGGAGTGCAAACC CTTTAAGATACAAGCTTGGTTTTCCTCTGGTGCTCTTGTGTTGCACATTC TTCTTCTTGGTCGGCTTTTACGGTTCCAATTCCCTCTCCAAGGAAGAAAA ACATGTGGTGATTGACCCCGTCACCAATGAGAAACTTGTGTTCGAACATG GCCGTACTGGAGACAGTTCTGTTACTGACATTCCTTTCCAGGTGTTAAGT TGGAAACCACGTGCCCTTTTGTATCCGAATTTTGCAAGCAAAGAGCAATG TGAAGCCATCATCAAGCTTGCGAGGACACGTCTTGCTCCTTCTGGTCTGG CTTTGAGGAAAGGGGAGAGTGAAGCCACAACGAAAGAAATCAGAACTAGT TCTGGAACTTTCTTGAGAGCCAGTGAAGATAAAACACAGAGTTTAGCGGA GGTTGAGGAGAAGATGGCCAGAGCAACCATGATACCTCGGCAGAATGGGG AGGCTTTTAATGTGTTGCGGTACAACCCAGGTCAAAAATACGATTGTCAC TATGATGTTTTTGATCCAGCTGAGTATGGTCCTCAACCAAGCCAGCGGAT GGCTTCCTTTCTCCTTTATTTATCAGACGTCGAAGAGGGCGGAGAAACGA TGTTTCCCTTCGAAAACTTTCAAAATATGAACACAGGCTATAATTATAAG GACTGTATTGGGTTGAAAGTGAAACCCCGCCAAGGCGATGCTCTTCTTTT CTATTCAATGCATCCTAACGGTACATTTGACAAGACCGCATTGCATGGAA GCTGTCCAGTTATCAAAGGCGAAAAATGGGTCGCCACGAAGTGGATACGC AATACCGACAAATTTTAATCTGAAAGATCCCACTGGTGACTGTTATAACT TGCTGCCTTCTTAAAGTTCTTTCGGTAGTACTCTAGGAGCTTCAGGTTAT CTTACAAAAGTATCGGGTCTGAGAAAGTGTAAAATCTGTGCGTACCTGAA TCCATCAATTAAGTCATGGGTGTTATCTTTTAACATTCCTGGTCTCTGCC AACCAGAGTTCCAGAGAAACGGTTGTTCGCTGGATTATTGCCAGCTTAAA GTTCACTTAAGAAATTCTAAACTCTTCAACTAAGAAGACATTGTCCTTG TranslationcorrespondingtoP4H4cDNA (SEQIDNO.8) MKTPKVRARSANPLRYKLGFPLVLLCCTFFFLVGFYGSNSLSKEEKHVVI DPVTNEKLVFEHGRTGDSSVTDIPFQVLSWKPRALLYPNFASKEQCEAII KLARTRLAPSGLALRKGESEATTKEIRTSSGTFLRASEDKTQSLAEVEEK MARATMIPRQNGEAFNVLRYNPGQKYDCHYDVFDPAEYGPQPSQRMASFL LYLSDVEEGGETMFPFENFQNMNTGYNYKDCIGLKVKPRQGDALLFYSMH PNGTFDKTALHGSCPVIKGEKWVATKWIRNTDKF P4H5cDNA(Pp1s12_247V6.1AccessionNo.: JX964782;SEQIDNO.9) GCTGCTTCAGGGTAGGACAAACCATCGTCGAAGGGGATGTGGGTCGACCT ATTTTGGTCAACTTTATCTGTCTTTCTACTTCCGATGAATTGCCGTTTTT GTTGTAAGCGTTTGCACATGCAGGTTGGAGGCTGGTGAACTGCATACACA AATTTGATAGTCGGGGAGAAAGAGGAGTTTCTCACAGTGTCTTTGGTGAT TGGATCATCCTCGAGGAGCTTTTAGCTCGAAGGGTTTCCTGATTTTAAGT 
TTGGAACCGAGGTATTTCAATCGTGAGAGTGGTTCTTAGCATGCATACAT TTTGAGTGTGTAGGTATGGATCTCTATTCTAGAAGCCGTAGAGGCTGAGT AACTATTGCATTCTCTGAAATCCTGTTTACCTCGGCGCGGCCACATCTCG AAGTAGTCGGTAATTTTCTTCCTTGGGTTTCGTGGGAGCCGGGCGAAGTT CGTAACTATGGCGAAGCTGAGTCGAGGTCAAAGGAGAGGAGCTGGCACGA TGGCTTTGTTGGTGCTGGTCCTGTTGTCTCTAGCGCTCATGCTCATGTTG GCACTTGGCTTTGTAGCCATGCCATCGGCGTCCCACGGGAGTTCGGCTGA CGTTGTGGAAATCAAGCTGCCCTCACACAGGCATTTTGGTGCCAACCCCT TATCACGTTGGGTTGAAGTCCTCTCTTGGGAGCCCAGAGCCTTTCTATAT CACCACTTTCTGACAGAAGAGGAATGCAATCATCTAATTGAAGTGGCCAG GCCAAGTCTGGTGAAGTCAACGGTTGTAGATAGTGATACAGGAAAGAGCA AAGACAGCAGAGTACGCACAAGTTCAGGTACATTTTTGATGCGAGGCCAA GATCCTGTGATCAAAAGAATCGAGAAGCGAATAGCTGACTTCACATTTAT ACCTGCTGAGCAAGGTGAAGGCTTACAAGTTCTGCAGTACAAAGAAAGTG AAAAATACGAGCCCCATTATGATTACTTCCACGATGCATACAATACCAAA AATGGCGGCCAAAGAATTGCTACCGTACTGATGTACCTGTCAAATGTCGA GGAAGGAGGAGAAACAGTTTTTCCAGCTGCTCAGGTGAACAAGACTGAAG TTCCCGATTGGGATAAATTATCTGAGTGTGCTCAGAAAGGTCTTTCTGTG CGACCACGCATGGGAGATGCCTTGCTTTTCTGGAGCATGAAACCAGATGC GACACTTGATTCCACTAGCTTGCATGGTGGCTGCCCCGTGATCAAGGGTA CCAAATGGTCTGCTACTAAGTGGTTACATGTAGAAAACTATGCAGCCTGA TGAGGATGGTACAAGATGTCTTCTGCAGGAAGTGAATTGTCACAAGCACC TGGTACAAGCAGATTCGAAATGCTTGGATGTAATGCATGGATGTTGGGAG AGGACAAACATACAAATTTATGATTCTGCATTACGTGAGATGTAATGATG AACCACCTCGTGCCTATCTGAATTCATATGAACAAACGAATAGATTTCCA ATTCATACCAATAAAACAGAAAAGCCGCTTAACTTATTTGTTAACTTAGG CAGTTTTTTTGTTTTATTATTGGTGGTTTGCAATCGACCTTAACGACCAT TTCTTGTAATCACCACAAACAAGCAAAATGCATATCTGATTTCATTCAAA ATATACTTATAAAGACTGCTGAATCTATAACAAACAAAA TranslationcorrespondingtoP4H5cDNA (SEQIDNO.10) MAKLSRGQRRGAGTMALLVLVLLSLALMLMLALGFVAMPSASHGSSADVV EIKLPSHRHFGANPLSRWVEVLSWEPRAFLYHHFLTEEECNHLIEVARPS LVKSTVVDSDTGKSKDSRVRTSSGTFLMRGQDPVIKRIEKRIADFTFIPA EQGEGLQVLQYKESEKYEPHYDYFHDAYNTKNGGQRIATVLMYLSNVEEG GETVFPAAQVNKTEVPDWDKLSECAQKGLSVRPRMGDALLFWSMKPDATL DSTSLHGGCPVIKGTKWSATKWLHVENYAA P4H6_a_cDNA(Pp1s328_29V6.1AccessionNo.: JX964783;SEQIDNO.11) GAAAAAGAGCAGCAGTTGGAGTTGGAGTAGGCCAGATCGATGCTCCTCCT CCTCCCATGATGATAGATGACGAAGATTATGCTGTTGTTGTCGATGTTGT 
TGCTCGCTGATCATCAACACGAAGTTGCCGTTGCAGCTGCTCTTGCTCTT CACCGTCGACTCGGCAGAGGGGCACAGCTCAGCTGGTAATTTATTATTAG TGCCCATGGGTGGGATGGATGTGAGTGACATCGGCGCTTCTACCGACAGT GTGAAACCCCAGCGAGGCTGTGCCTTGCCTTGCCTTGGCTTGTGTGCATT GCCTCTCCCCTCCAGTTTTTTGGTGGGTTGGTGTTTGTGTGAGGGGGGAA CAGAGGAGAGGGCGGGGGCAAGGGCTGTGGCAGCTATGGCGAGGTTGAGT AGGGGGCAAAGGACTGGAGTTGGCACGATGGCATTGCTGGTGTTCGCGTT TTTGTCTTTGATAGTCATGGTCATGTTGCTTCTGGACGTGGTAGCAATGC CATCGGGACGTCGAGGCTCGATTGACGAGGGAGCCGAAGTGGAATTGAAG CTGCCTACCCACAGGCATGTGGATGAAAATCCACTGGCACCTTGGGTTGA GGTCCTTTCCTGGGAGCCCAGAGCTTTTCTGTATCACCACTTTCTGACAC AAGTGGAATGCAACCATCTTATTGAGGTGGCCAAGCCTAGCCTGGTGAAG TCAACAGTTATAGATAGTGCTACGGGAAAAAGCAAAGACAGCAGGGTTCG CACAAGTTCAGGGACATTTTTGGTGCGGGGCCAAGATCACATCATTAAGA GGATTGAGAAACGTATCGCTGACTTCACATTCATACCTGTTGAACAAGGT GAAGGCTTGCAAGTTTTGCAGTATAGAGAGAGTGAGAAATACGAGCCTCA TTATGACTACTTTCACGATGCTTTCAATACTAAAAATGGTGGTCAGCGGA TTGCTACCGTACTGATGTATCTGTCAGACGTTGAGAAAGGGGGAGAAACA GTTTTCCCGGCTTCTAAAGTGAACGCTAGTGAGGTTCCTGATTGGGATCA GCGATCCGAATGCGCTAAACGGGGCCTTTCTGTACGACCACGTATGGGAG ATGCCTTACTTTTTTGGAGCATGAAACCAGATGCGAAGCTTGACCCTACC AGTTTGCATGGCGCTTGCCCTGTGATTCAAGGTACGAAATGGTCTGCTAC AAAGTGGTTACATGTTGAAAAATACGCAGCACGGTAAACATCCTTCTAGA AGTCTTCAACAGGATTACATGAATTATGCGAGCAGTCTTCTGGCATGAGC AGAGGTGAACTTGCCCAAACTTGCTCATGGAACAACAGAATCAGCTTGCG AGTTATTTACAAGGAGCGAGTGTCCATGCCTGAATGCTGGAACACCAGCG TGATGAGAACGCTTAGGAATACCAATTCTTCACTGATTTTACAAACCACA CTAGCTACTACACATGACAAATTTCATGCTTTGACTTGGTTGATCTGCTT TTGTGTGAGGATCAGTATTTTATAAATAGGGGATGGAGCTCTTCAGCTCC TAATGTGCGATTTCG TranslationcorrespondingtoP4H6_a_cDNA (SEQIDNO.12) MGGMDVSDIGASTDSVKPQRGCALPCLGLCALPLPSSFLVGWCLCEGGTE ERAGARAVAAMARLSRGQRTGVGTMALLVFAFLSLIVMVMLLLDVVAMPS GRRGSIDEGAEVELKLPTHRHVDENPLAPWVEVLSWEPRAFLYHHFLTQV ECNHLIEVAKPSLVKSTVIDSATGKSKDSRVRTSSGTFLVRGQDHIIKRI EKRIADFTFIPVEQGEGLQVLQYRESEKYEPHYDYFHDAFNTKNGGQRIA TVLMYLSDVEKGGETVFPASKVNASEVPDWDQRSECAKRGLSVRPRMGDA LLFWSMKPDAKLDPTSLHGACPVIQGTKWSATKWLHVEKYAAR P4H6_b_cDNA(Pp1s328_29V6.1AccessionNo.: JX964784;SEQIDNO.13) 
GAAAAAGAGCAGCAGTTGGAGTTGGAGTAGGCCAGATCGATGCTCCTCCT CCTCCCATGATGATAGATGACGAAGATTATGCTGTTGTTGTCGATGTTGT TGCTCGCTGATCATCAACACGAAGTTGCCGTTGCAGCTGCTCTTGCTCTT CACCGTCGACTCGGCAGAGGGGCACAGCTCAGCTGGTAATTTATTATTAG TGCCCATGGGTGGGATGGATGTGAGTGACATCGGCGCTTCTACCGACAGT GTGAAACCCCAGCGAGGCTGTGCCTTGCCTTGCCTTGGCTTGTGTGCATT GCCTCTCCCCTCCAGTCGTAATTGAGACGTACTATTAAACACGTAGGCGG TAGTTTTTGGTGGGTTGGTGTTTGTGTGAGGGGGGAACAGAGGAGAGGGC GGGGGCAAGGGCTGTGGCAGCTATGGCGAGGTTGAGTAGGGGGCAAAGGA CTGGAGTTGGCACGATGGCATTGCTGGTGTTCGCGTTTTTGTCTTTGATA GTCATGGTCATGTTGCTTCTGGACGTGGTAGCAATGCCATCGGGACGTCG AGGCTCGATTGACGAGGGAGCCGAAGTGGAATTGAAGCTGCCTACCCACA GGCATGTGGATGAAAATCCACTGGCACCTTGGGTTGAGGTCCTTTCCTGG GAGCCCAGAGCTTTTCTGTATCACCACTTTCTGACACAAGTGGAATGCAA CCATCTTATTGAGGTGGCCAAGCCTAGCCTGGTGAAGTCAACAGTTATAG ATAGTGCTACGGGAAAAAGCAAAGACAGCAGGGTTCGCACAAGTTCAGGG ACATTTTTGGTGCGGGGCCAAGATCACATCATTAAGAGGATTGAGAAACG TATCGCTGACTTCACATTCATACCTGTTGAACAAGGTGAAGGCTTGCAAG TTTTGCAGTATAGAGAGAGTGAGAAATACGAGCCTCATTATGACTACTTT CACGATGCTTTCAATACTAAAAATGGTGGTCAGCGGATTGCTACCGTACT GATGTATCTGTCAGACGTTGAGAAAGGGGGAGAAACAGTTTTCCCGGCTT CTAAAGTGAACGCTAGTGAGGTTCCTGATTGGGATCAGCGATCCGAATGC GCTAAACGGGGCCTTTCTGTACGACCACGTATGGGAGATGCCTTACTTTT TTGGAGCATGAAACCAGATGCGAAGCTTGACCCTACCAGTTTGCATGGCG CTTGCCCTGTGATTCAAGGTACGAAATGGTCTGCTACAAAGTGGTTACAT GTTGAAAAATACGCAGCACGGTAAACATCCTTCTAGAAGTCTTCAACAGG ATTACATGAATTATGCGAGCAGTCTTCTGGCATGAGCAGAGGTGAACTTG CCCAAACTTGCTCATGGAACAACAGAATCAGCTTGCGAGTTATTTACAAG GAGCGAGTGTCCATGCCTGAATGCTGGAACACCAGCGTGATGAGAACGCT TAGGAATACCAATTCTTCACTGATTTTACAAACCACACTAGCTACTACAC ATGACAAATTTCATGCTTTGACTTGGTTGATCTGCTTTTGTGTGAGGATC AGTATTTTATAAATAGGGGATGGAGCTCTTCAGCTCCTAATGTGCGATTT CG TranslationcorrespondingtoP4H6_b_cDNA (SEQIDNO.14) MARLSRGQRTGVGTMALLVFAFLSLIVMVMLLLDVVAMPSGRRGSIDEGA EVELKLPTHRHVDENPLAPWVEVLSWEPRAFLYHHFLTQVECNHLIEVAK PSLVKSTVIDSATGKSKDSRVRTSSGTFLVRGQDHIIKRIEKRIADFTFI PVEQGEGLQVLQYRESEKYEPHYDYFHDAFNTKNGGQRIATVLMYLSDVE KGGETVFPASKVNASEVPDWDQRSECAKRGLSVRPRMGDALLFWSMKPDA KLDPTSLHGACPVIQGTKWSATKWLHVEKYAAR
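As a consistency check on the listing above, the first codons of the P4H1 open reading frame (SEQ ID NO. 1) can be translated and compared with the start of the deduced protein (SEQ ID NO. 2). The following Python sketch is illustrative only and is not part of the disclosed method: the helper name translate and the use of the standard genetic code are assumptions made here, and the 60-nucleotide excerpt is copied from the listed cDNA starting at the ATG that opens the reading frame.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna: str) -> str:
    """Translate a cDNA string codon by codon, stopping at the first stop codon."""
    protein = []
    usable_length = len(cdna) - len(cdna) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        amino_acid = CODON_TABLE[cdna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First 60 nucleotides of the P4H1 open reading frame as listed in SEQ ID NO. 1.
P4H1_ORF_START = "ATGGGTCCCTCTAGATATGTCATTGTGCTCCTCACATTTGTGACGATCGGCATGGCTGGG"

print(translate(P4H1_ORF_START))
```

The printed peptide can be compared by eye with the first 20 residues of SEQ ID NO. 2 (MGPSRYVIVLLTFVTIGMAG).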

(34) All deduced protein sequences had a prolyl-4-hydroxylase alpha subunit catalytic domain (SMART 0702). N-terminal transmembrane domains were predicted for all homologues except P4H2 (TMHMM server v.2.0, www.cbs.dtu.dk).

(35) In order to gain more information about the predicted P4H enzymes, the deduced amino acid sequences were aligned with sequences of already characterized P4Hs from human, Arabidopsis thaliana and Nicotiana tabacum. Protein sequence alignments were performed with the program CLUSTAL W (ebi.ac.uk) and visualized with Jalview (www.jalview.org). The catalytic domain at the C-terminal end of the protein is highly conserved in all seven P. patens homologues (FIG. 1). The seven putative P4Hs share 16-24% identity with the human catalytic α(I) subunit and 30-63% identity with AtP4H1. Among the moss sequences the degree of identity is between 30 and 81%. All sequences contain the motif HXD and a distal histidine, which are necessary to bind the cofactor Fe²⁺. Further, they contain the basic residue lysine, which binds the C-5 carboxyl group of 2-oxoglutarate (FIG. 1). These residues are indispensable for the activity of collagen P4Hs (Kivirikko and Myllyharju, Matrix Biol., 16:357-368, 1998) and of P4H1 from A. thaliana (Hieta and Myllyharju, J. Biol. Chem., 277:23965-23971, 2002), indicating that all seven sequences from P. patens are functional prolyl-4-hydroxylases.
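The two quantities used in this comparison, pairwise identity and the presence of the HXD motif, can be illustrated with a short Python sketch. It is a minimal example under stated assumptions and does not reproduce the CLUSTAL W analysis: the function names are hypothetical, the two aligned toy strings are invented, and percent identity is computed here as identical residues divided by the number of aligned columns without gaps, which is only one of several conventions in use. The protein excerpt is taken from SEQ ID NO. 2.

```python
import re


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the gap-free columns of two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def find_hxd(sequence: str) -> list:
    """Return (0-based index, motif) for every H-X-D occurrence, overlaps included."""
    return [(m.start(), m.group(1)) for m in re.finditer(r"(?=(H.D))", sequence)]


# Invented toy alignment, used only to show the identity convention.
print(percent_identity("ACDE-F", "ACDQGF"))

# Excerpt of the deduced PpP4H1 protein (SEQ ID NO. 2).
PPP4H1_EXCERPT = "LQVLRYESDQYYKAHHDYFSDEFNLKRGGQRVAT"
print(find_hxd(PPP4H1_EXCERPT))
```

A pattern match of this kind only flags candidate positions; whether a given HXD triplet is the iron-binding motif still depends on its position in the alignment of FIG. 1.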

Experiment 2: In Silico Prediction of Intracellular Localization

(36) Recombinant human erythropoietin (rhEPO) serves as an example of a recombinant human protein in the following examples. Non-human prolyl-hydroxylation occurred on moss-derived rhEPO which had been secreted from the tissue to the medium of the moss bioreactor culture. Therefore, it was concluded that the P4H enzyme responsible for posttranslational rhEPO modification is located in the secretory compartments, i.e. the endoplasmic reticulum (ER) or the Golgi apparatus. Accordingly, the subcellular localization of the seven P. patens P4H homologues was examined. First, their putative intracellular localization was analyzed in silico with four different programs based on different algorithms: TargetP (www.cbs.dtu.dk), MultiLoc (abi.inf.uni-tuebingen.de), SherLoc (abi.inf.uni-tuebingen.de) and WoLF PSORT (wolfpsort.org). No consistent prediction was obtained by this approach (Table 1).

(37) TABLE-US-00002 TABLE 1 In silico localization prediction of Physcomitrella patens P4Hs using different programs.

Program      P4H1          P4H2     P4H3     P4H4          P4H5          P4H6a         P4H6b
SherLoc      ER            ER       ER       Golgi         ER            secreted      mitochondria
WoLF PSORT   vacuole       plastid  plastid  nucleus       vacuole       cytoplasm     plastid
MultiLoc     mitochondria  plastid  plastid  mitochondria  mitochondria  plastid       mitochondria
TargetP      SP            /        /        mitochondria  mitochondria  mitochondria  mitochondria
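The lack of agreement in Table 1 can be made explicit by tallying the four predictions for each protein. The sketch below is a hedged illustration, not part of the original analysis: the function name summarize is hypothetical, the values are transcribed from Table 1, "SP" is the TargetP label for a secretory pathway signal peptide, and "/" is kept as its own category on the assumption that it denotes no assigned location.

```python
from collections import Counter

PROTEINS = ["P4H1", "P4H2", "P4H3", "P4H4", "P4H5", "P4H6a", "P4H6b"]

# Values transcribed from Table 1, one list per program, in the order of PROTEINS.
PREDICTIONS = {
    "SherLoc": ["ER", "ER", "ER", "Golgi", "ER", "secreted", "mitochondria"],
    "WoLF PSORT": ["vacuole", "plastid", "plastid", "nucleus", "vacuole", "cytoplasm", "plastid"],
    "MultiLoc": ["mitochondria", "plastid", "plastid", "mitochondria", "mitochondria", "plastid", "mitochondria"],
    "TargetP": ["SP", "/", "/", "mitochondria", "mitochondria", "mitochondria", "mitochondria"],
}


def summarize(predictions: dict, proteins: list) -> dict:
    """For each protein, collect the calls of all programs and the most frequent call."""
    summary = {}
    for index, protein in enumerate(proteins):
        calls = [predictions[program][index] for program in predictions]
        top_call, votes = Counter(calls).most_common(1)[0]
        summary[protein] = {
            "calls": calls,
            "top": top_call,
            "votes": votes,
            "unanimous": votes == len(calls),
        }
    return summary


for protein, info in summarize(PREDICTIONS, PROTEINS).items():
    print(protein, info["top"], f"{info['votes']}/{len(info['calls'])}", info["unanimous"])
```

Under this tally no protein receives the same call from all four programs, which is the sense in which the prediction is not consistent.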

Experiment 3: In Vivo Analysis of Intracellular Localization

(38) The in vivo intracellular localization of each of the seven P. patens P4Hs was studied by expressing them as GFP fusion proteins (green fluorescent protein, P4H-GFP) in P. patens cells (for details on the generation of plasmids and on the plant material and transformation procedure, see below). Subcellular localization of the seven different P4H-GFP fusion proteins was analyzed 3 to 14 days after transfection by Confocal Laser Scanning Microscopy (CLSM) (510 META; Carl Zeiss MicroImaging, Jena, Germany) and the corresponding software (version 3.5). Excitation at 488 nm was achieved with an argon laser, and emission was measured with a META detector at 494-558 nm for GFP and at 601-719 nm for chlorophyll. Cells were examined with a C-Apochromat 63×/1.2 W corr water immersion objective (Carl Zeiss MicroImaging). Confocal planes were exported from the ZEN2010 software (Carl Zeiss MicroImaging).

(39) In optical sections GFP signals from all seven different P4H fusion proteins were predominantly detected as defined circular structures around the nucleus, revealing labeling of the nuclear membranes (FIG. 2). As the nuclear membrane is part of the endomembrane continuum of eukaryotic cells, these signals reveal that all seven moss P4Hs were targeted to the secretory compartments. An ER-targeted GFP version (ASP-GFP-KDEL, Schaaf et al., Eur. J. Cell Biol., 83:145-152, 2004) as well as GFP without any signal peptide displaying GFP fluorescence in the cytoplasm as well as the nucleus (Schaaf et al., Eur. J. Cell Biol., 83:145-152, 2004) served as controls. Thus, these experiments provided no clear indication of a specific P4H responsible for generation of Hyp on secreted rhEPO in P. patens.

Experiment 4: Ablation of the Gene Functions of Each of the P. patens P4H Homologues

(40) In order to definitively identify those homologues responsible for plant-typical prolyl-hydroxylation of moss-produced rhEPO the gene functions of each of the P. patens P4H homologues were ablated. Accordingly, gene-targeting constructs for the six p4h genes were designed (FIG. 3).

(41) The gene targeting constructs were then transferred to the rhEPO-producing moss line 174.16 (Weise et al., Plant Biotechnol. J., 5:389-401, 2007) to generate specific knockout (KO) lines for each of the P4H-genes. After antibiotic selection, surviving plants were screened for homologous integration of the KO construct into the correct genomic locus (for details on the screening of transformed plants, see below).

(42) Loss of the respective transcript was proven by RT-PCR (FIG. 4a), confirming successful gene ablation. One line for each genetic modification was chosen for further analysis, and stored in the International Moss Stock Center (moss-stock-center.org; Table 2).

(43) TABLE-US-00003 TABLE 2 International Moss Stock Center accession numbers of plants used.

Plants                                 IMSC No.
EPO 174.16                             40216
p4h1 KO No. 192 EPO                    40218
p4h2 KO No. 6 EPO                      40234
p4h3 KO No. 21 EPO                     40230
p4h4 KO No. 95 EPO                     40231
p4h5 KO No. 29 EPO                     40223
p4h6 KO No. 31 EPO                     40239
p4h1 OE No. 12 in p4h1 KO 192 EPO      40336
p4h1 OE No. 16 in p4h1 KO 192 EPO      40337
p4h1 OE No. 32 in p4h1 KO 192 EPO      40338
p4h1 OE No. 41 in p4h1 KO 192 EPO      40339
p4h1 OE No. 45 in p4h1 KO 192 EPO      40340

Experiment 5: Analysis of the Recombinant Proteins Via Mass Spectrometry

(44) To investigate the effect of each p4h ablation on the prolyl-hydroxylation observed for moss-produced rhEPO, the recombinant protein from each of the p4h KO lines was analyzed via mass spectrometry. For this purpose, total soluble proteins were precipitated from the culture supernatant of the parental plant 174.16 and of one knockout line for each p4h homologue, and separated by SDS-PAGE. Subsequently, the main rhEPO-containing band was cut from the Coomassie-stained gel, digested with trypsin and subjected to mass spectrometry for an analysis of the tryptic peptide EAISPPDAASAAPLR (144-158; SEQ ID NO. 81) (for details on protein and peptide analysis, see below). In the parental plant 174.16, almost half of the rhEPO was hydroxylated (FIG. 5), mainly at the second proline of the SPP motif, as shown by MS/MS (FIG. 6). Surprisingly, while rhEPO produced in moss lines with ablated p4h2, p4h3, p4h4, p4h5 or p4h6 was hydroxylated at levels similar to those found in the parental plant, ablation of the p4h1 gene alone was sufficient to completely abolish prolyl-hydroxylation of the biopharmaceutical (FIG. 5). Growth rate, rhEPO productivity and secretion of the protein to the culture medium were not impaired in these knockout plants compared to the parental line 174.16 (data not shown). Thus, the complete lack of Hyp on rhEPO produced by the p4h1 knockout lines was shown.
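Hydroxylation of a proline adds one oxygen atom, so each Hyp shifts the monoisotopic mass of the peptide by about 15.995 Da. The following Python sketch shows how the expected masses and m/z values of the unmodified and hydroxylated forms of EAISPPDAASAAPLR could be computed. It is an illustration under stated assumptions rather than the analysis pipeline actually used: the function names are hypothetical, the residue masses are standard monoisotopic values rounded to five decimals, and the choice of charge state 2 for the printout is an assumption.

```python
# Monoisotopic residue masses (Da) for the amino acids occurring in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "I": 113.08406,
    "L": 113.08406, "P": 97.05276, "R": 156.10111, "S": 87.03203,
}
WATER = 18.01056    # added once per peptide for the free termini
PROTON = 1.00728    # mass of the charge carrier
OXYGEN = 15.99491   # mass gained per proline converted to hydroxyproline


def peptide_mass(sequence: str, n_hyp: int = 0) -> float:
    """Monoisotopic neutral mass of a peptide carrying n_hyp hydroxyprolines."""
    if n_hyp > sequence.count("P"):
        raise ValueError("more hydroxyprolines requested than prolines present")
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + n_hyp * OXYGEN


def mass_to_charge(mass: float, charge: int) -> float:
    """m/z of the protonated ion [M + zH]z+."""
    return (mass + charge * PROTON) / charge


PEPTIDE = "EAISPPDAASAAPLR"
for n_hyp in range(3):
    mass = peptide_mass(PEPTIDE, n_hyp)
    print(f"{n_hyp} Hyp: M = {mass:.3f} Da, m/z (2+) = {mass_to_charge(mass, 2):.3f}")
```

With these values the unmodified peptide has a neutral mass of about 1464.76 Da, and each additional Hyp raises the doubly charged m/z by roughly 8.0.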

(45) It is to be noted that neither sequence analysis nor intracellular localization of the seven proteins revealed which genes were responsible for the adverse O-glycosylation of rhEPO. Only the ablation of each of the seven genes surprisingly revealed the responsible gene.

Experiment 6: Verification of P4H1 Enzymatic Activity

(46) To verify P4H1 enzymatic activity in prolyl-hydroxylation this gene was ectopically expressed in the p4h1 knockout line #192. Strong overexpression of the p4h1 transcript was confirmed in the resulting lines via semi-quantitative RT-PCR (FIG. 4b). Five p4h1 overexpression lines (p4h1OE) were analyzed for rhEPO-Pro-hydroxylation. The LC-ESI-MS results revealed that p4h1 overexpression restored prolyl-hydroxylation of the moss-produced rhEPO (FIG. 7). The proportion of hydroxylated rhEPO, as well as the hydroxylation pattern, was altered by the elevated expression levels of the gene. While in the parental plant 174.16, with native P4H1 activity, approximately half of rhEPO displayed Hyp (FIG. 5), nearly all rhEPO was oxidized in the p4h1 overexpressors (FIG. 7). Furthermore, in these overexpressors not only one proline in the motif SPP was hydroxylated as seen in the parental plant 174.16, but both contiguous prolines were converted to Hyp (FIG. 7). Thus, it was shown that the expression of p4h1 is essential and sufficient for the prolyl-hydroxylation of the moss-produced rhEPO, and that its expression level influences its enzyme activity, not only in the proportion of hydroxylated protein molecules but also in the pattern of hydroxylation.

Experiment 7: Analysis of the rhEPO N-Terminal Peptide APPRLICDSRVL (SEQ ID NO. 82) for Prolyl-Hydroxylation in P. patens

(47) As hydroxylation and arabinosylation of the human epithelial mucin MUC1 at the sequence APP were reported upon expression in N. benthamiana (Pinkhasov et al., Plant Biotechnol. J., 9:991-1001, 2011), the rhEPO N-terminal peptide APPRLICDSRVL was analyzed for prolyl-hydroxylation in P. patens. After chymotryptic digestion of rhEPO derived from the parental plant 174.16, the knockout plant p4h1 #192 and the overexpressor p4h1OE No. 45, LC-ESI-MS analysis revealed that this peptide was not hydroxylated in any of the cases (FIG. 8), demonstrating that the mere presence of contiguous proline residues preceded by an alanine is not sufficient to be recognized by moss prolyl-hydroxylases.
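The comparison between the two peptides can be illustrated with a short script. The following is only a minimal sketch: it locates contiguous prolines and the residue preceding them in the two peptide sequences quoted above (SEQ ID NO. 81 and SEQ ID NO. 82). The function name find_xpp is hypothetical, and the script makes no prediction about hydroxylation; it only shows that both peptides carry an X-Pro-Pro motif, although only the SPP motif was found hydroxylated.

```python
import re

# Peptide sequences quoted in the description (SEQ ID NO. 81 and 82).
TRYPTIC_PEPTIDE = "EAISPPDAASAAPLR"
N_TERMINAL_PEPTIDE = "APPRLICDSRVL"


def find_xpp(sequence):
    """Return (1-based position, motif) for every residue followed by two prolines."""
    return [(m.start() + 1, m.group(0)) for m in re.finditer(r"[A-Z]PP", sequence)]


if __name__ == "__main__":
    for name, seq in (("tryptic 144-158", TRYPTIC_PEPTIDE),
                      ("N-terminal 1-12", N_TERMINAL_PEPTIDE)):
        print(name, find_xpp(seq))
```

Positions are counted within each peptide, not within the full rhEPO sequence.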

Experiment 8: Phylogenetic Comparison of the Sequences of Plant Prolyl-4-Hydroxylases

(48) A multiple sequence alignment was generated from the amino acid sequences of the prolyl-4-hydroxylases of different plants (e.g., Populus, Oryza, Arabidopsis, Physcomitrella) by using the program Jalview (MAFFT Version 5.0). A phylogenetic tree was calculated with QuickTree (Howe et al., Bioinformatics, 18:1546-1547, 2002). The phylogenetic tree is shown in FIG. 9.

(49) Methods Relating to Above Experiments

(50) Generation of Plasmid Constructs

(51) The cDNAs corresponding to the seven P4H homologues identified in Physcomitrella patens were amplified using the primers listed in Table 3 (see below).

(52) The cDNAs were cloned into pJET 1.2 (CloneJET PCR Cloning Kit, Fermentas, St Leon-Rot, Germany). Subsequently, the p4h coding sequences including a portion of the 5′ UTR were cloned into the plasmid mAV4mcs (Schaaf et al., Eur. J. Cell Biol., 83:145-152, 2004) using the XhoI and BglII sites, giving rise to N-terminal fusion P4H-GFP proteins under the control of the cauliflower mosaic virus (CaMV) 35S promoter. Unmodified mAV4mcs was used as a control for cytoplasmic and nuclear localization. As a positive control for ER localization, pASP-GFP-KDEL was used (Schaaf et al., Eur. J. Cell Biol., 83:145-152, 2004).

(53) To generate the p4h knockout constructs, P. patens genomic DNA fragments corresponding to the prolyl-4-hydroxylases were amplified using the primers listed in Table 3 and cloned either into pCR4-TOPO (Invitrogen, Karlsruhe, Germany) or into pETBlue-1 AccepTor (Novagen, Merck KGaA, Darmstadt, Germany). The pTOPO_p4h1 genomic fragment was first linearized using BstBI and SacI, thus deleting a 273 bp fragment, and recircularized by ligating a double-stranded oligonucleotide containing restriction sites for BamHI and HindIII. These sites were used for the insertion of a zeomycin resistance cassette (zeo-cassette). The zeo-cassette was obtained from pUC-zeo (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012) by digestion with HindIII and BamHI. For the p4h5 KO construct, a 1487 bp fragment was cut out from pTOPO_p4h5 using the SalI and BglII sites and replaced by a double-stranded oligonucleotide containing restriction sites for BamHI and HindIII. These restriction sites were used for the insertion of the zeo-cassette obtained from the pUC-zeo plasmid. The p4h2 KO construct was cloned into pETBlue-1 AccepTor, and the zeo-cassette replaced a 270 bp genomic fragment deleted by digestion with KpnI and HindIII. The zeo-cassette obtained from pRT101-zeo (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012) by HindIII digestion was inserted into the pET_p4h3 and the pTOPO_p4h4 KO constructs digested with the same enzyme, replacing a 990 bp and a 1183 bp genomic fragment, respectively. For the p4h6 KO construct, the zeo-cassette was obtained from pUC-zeo via digestion with HindIII and SacI and inserted into pTOPO_p4h6, replacing a 1326 bp genomic fragment. In all KO constructs, the regions homologous to the target gene had approximately the same size at both ends of the selection cassette, comprising between 500 and 1000 bp.

(54) For the overexpression construct, the p4h1 coding sequence and 79 bp of the 5′ UTR were amplified from moss WT cDNA with the primers listed in Table 3, and cloned under the control of the 35S promoter and the nos terminator into the mAV4mcs vector (Schaaf et al., Eur. J. Cell Biol., 83:145-152, 2004). For this purpose, the GFP gene was deleted from the vector by digestion with Ecl136II and SmaI and subsequent religation of the vector. The p4h1 cDNA was inserted into the vector via the XhoI and BglII restriction sites. The p4h1 overexpression construct was linearized via digestion with EcoRI and PstI and transferred into the line p4h1 No. 192 together with pUC 18 sul (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012) for sulfadiazine selection.

(55) TABLE 3 Oligonucleotides used and corresponding SEQ ID NOs.
gene / target         primer  SEQ ID NO.  oligonucleotide
P4H-GFP construct
p4h1                  fwd     15          5′-GGGATGGAGTAATTCTACGAAGC-3′
                      rev     16          5′-AATCAAAGGCTCGCTGCCTCAT-3′
p4h2                  fwd     17          5′-GTGATGCGTGATCCTGTGC-3′
                      rev     18          5′-GGCACACATGGCATGCTTTC-3′
p4h3                  fwd     19          5′-GGTGTTATGTAGAGATTCGTCACAAC-3′
                      rev     20          5′-GAAATTTGTCAGTGTTGCGAATC-3′
p4h4                  fwd     21          5′-GACTCGGAAATCGCTCCTGA-3′
                      rev     22          5′-GAAATTTGTCGGTATTGCGTATC-3′
p4h5                  fwd     23          5′-GCCACATCTCGAAGTAGTCGGTAAT-3′
                      rev     24          5′-CGGCTGCATAGTTTTCTACATGTAAC-3′
p4h6-a                fwd     25          5′-CTCTTGCTCTTCACCGTCGACTC-3′
                      rev     26          5′-ACCGTGCTGCGTATTTTTCAAC-3′
p4h6-b                fwd     27          5′-GAGACGTACTATTAAACACGTAGG-3′
                      rev     28          5′-ACCGTGCTGCGTATTTTTCAAC-3′
Genomic DNA amplification for KO construct
p4h1                  fwd     29          5′-TGAATTCTGAATGTCATAAGGCCTCTACTG-3′
                      rev     30          5′-TGAATTCAGAGGGTAGGATTGTGTGAAG-3′
p4h2                  fwd     31          5′-CGAATTCCTCTGCTCCCTGTTCTTGTTTG-3′
                      rev     32          5′-CGAATTCCACAAACTTCATCGACTTGATCC-3′
p4h3                  fwd     33          5′-GAATTCGTTGCAGTAATCCTTGGTGAT-3′
                      rev     34          5′-GAATTCTCTCCACCCTCTTCCACATC-3′
p4h4                  fwd     35          5′-TGAATTCCTGAGGGGATTGAAGAG-3′
                      rev     36          5′-TGAATTCAGAACACAGGGATCAGC-3′
p4h5                  fwd     37          5′-TGAATTCTGCAGCTTGTTACACTCCCAAT-3′
                      rev     38          5′-ATGAATTCAGATAGGCACGAGGTGGT-3′
p4h6                  fwd     39          5′-TGAATTCTGCAGTAGATGGCCAATCATGT-3′
                      rev     40          5′-GTAATCCTGCAACAAGAATTCAAAGCAG-3′
Screening of integration in the genome
p4h1 5′-integration   fwd     41          5′-GGCTAATGATGAAGATGCGAGA-3′
                      rev     42          5′-TGTCGTGCTCCACCATGTTG-3′
p4h1 3′-integration   fwd     43          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     44          5′-AGCATCCCCTCGTTTAGGTT-3′
p4h2 5′-integration   fwd     45          5′-TGTGGTATTCTCGCAGATTAGGG-3′
                      rev     46          5′-TGTCGTGCTCCACCATGTTG-3′
p4h2 3′-integration   fwd     47          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     48          5′-CGGTCATAATTTGAGTTTTGCT-3′
p4h3 5′-integration   fwd     49          5′-CAACGGATGCCATTGACAGT-3′
                      rev     50          5′-TGTCGTGCTCCACCATGTTG-3′
p4h3 3′-integration   fwd     51          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     52          5′-CATTTGGCAACTTAAGGGTGTA-3′
p4h4 5′-integration   fwd     53          5′-GACTCGGAAATCGCTCCTGA-3′
                      rev     54          5′-TGTCGTGCTCCACCATGTTG-3′
p4h4 3′-integration   fwd     55          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     56          5′-CATCGACAGTTGTTCGTGGA-3′
p4h5 5′-integration   fwd     57          5′-GTAAAGGACATTCGTTTATGCATCG-3′
                      rev     58          5′-TGTCGTGCTCCACCATGTTG-3′
p4h5 3′-integration   fwd     59          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     60          5′-TGTGGTGATTACAAGAAATGGTCGT-3′
p4h6 5′-integration   fwd     61          5′-ATAGGTGTCGCTACAGCAATCG-3′
                      rev     62          5′-TGTCGTGCTCCACCATGTTG-3′
p4h6 3′-integration   fwd     63          5′-GTTGAGCATATAAGAAACCC-3′
                      rev     64          5′-ATGGACACTCGCTCCTTGTAA-3′
p4h1 overexpression
p4h1                  fwd     65          5′-GGGATGGAGTAATTCTACGAAG-3′
                      rev     66          5′-CTAATCAAAGGCTCGCTGCCTCAT-3′
Transcript screening
p4h1                  fwd     67          5′-GGCTAATGATGAAGATGCGAGA-3′
                      rev     68          5′-AGCATCCCCTCGTTTAGGTT-3′
p4h2                  fwd     69          5′-AGGACAAGCTGGAGAAGTCAATG-3′
                      rev     70          5′-GCCTAGCACACATGGCATG-3′
p4h3                  fwd     71          5′-GGTGTTATGTAGAGATTCGTCACAAC-3′
                      rev     72          5′-GAATTCTCTCCACCCTCTTCCACATC-3′
p4h4                  fwd     73          5′-TTGGTCGGCTTTTACGGTTC-3′
                      rev     74          5′-AAAGAAGAGCATCGCCTTGG-3′
p4h5                  fwd     75          5′-TCCTGTTGTCTCTAGCGCTCAT-3′
                      rev     76          5′-CGGCTGCATAGTTTTCTACATGTAAC-3′
p4h6                  fwd     77          5′-CCAGAGCTTTTCTGTATCACCAC-3′
                      rev     78          5′-ACCGTGCTGCGTATTTTTCAAC-3′
tbp                   fwd     79          5′-GCTGAGGCAGTCTTGGAG-3′
                      rev     80          5′-TCGAGCCGGATAGGGAAC-3′

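A primer list of this kind can be sanity-checked with a few lines of code. The sketch below is illustrative only: it takes the twelve primers listed in Table 3 for genomic DNA amplification (SEQ ID NOs. 29 to 40) and reports length, GC content and whether the sequence GAATTC occurs. GAATTC is the EcoRI recognition sequence; the description does not state that EcoRI was used to clone these fragments, so this is an observation about the sequences, not a statement of the cloning strategy. The helper names gc_content and summarize are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

# Primers for genomic DNA amplification (Table 3, SEQ ID NOs. 29-40).
KO_PRIMERS = {
    "p4h1 fwd": "TGAATTCTGAATGTCATAAGGCCTCTACTG",
    "p4h1 rev": "TGAATTCAGAGGGTAGGATTGTGTGAAG",
    "p4h2 fwd": "CGAATTCCTCTGCTCCCTGTTCTTGTTTG",
    "p4h2 rev": "CGAATTCCACAAACTTCATCGACTTGATCC",
    "p4h3 fwd": "GAATTCGTTGCAGTAATCCTTGGTGAT",
    "p4h3 rev": "GAATTCTCTCCACCCTCTTCCACATC",
    "p4h4 fwd": "TGAATTCCTGAGGGGATTGAAGAG",
    "p4h4 rev": "TGAATTCAGAACACAGGGATCAGC",
    "p4h5 fwd": "TGAATTCTGCAGCTTGTTACACTCCCAAT",
    "p4h5 rev": "ATGAATTCAGATAGGCACGAGGTGGT",
    "p4h6 fwd": "TGAATTCTGCAGTAGATGGCCAATCATGT",
    "p4h6 rev": "GTAATCCTGCAACAAGAATTCAAAGCAG",
}


def gc_content(seq):
    """Fraction of G and C bases in a primer sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers):
    """Map primer name to (length, GC fraction, contains GAATTC)."""
    return {
        name: (len(seq), round(gc_content(seq), 2), ECORI_SITE in seq)
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc, has_site) in summarize(KO_PRIMERS).items():
        print(f"{name}: {length} nt, GC {gc:.2f}, GAATTC present: {has_site}")
```
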
Plant Material and Transformation Procedure

(56) Physcomitrella patens (Hedw.) Bruch & Schimp. was cultivated as described previously (Frank et al., Plant Biol., 7:220-227, 2005). Moss-produced rhEPO was shown to be hydroxylated at the prolyl-hydroxylation consensus motif SPP (amino acids 147-149); therefore, the rhEPO-producing P. patens line 174.16 (Weise et al., Plant Biotechnol. J., 5:389-401, 2007) was used as the parental line for the p4h knockout generation, and the line p4h1 #192 was used for the generation of p4h1 overexpression lines. In these moss lines the α1,3 fucosyltransferase and the β1,2 xylosyltransferase genes are disrupted (Koprivova et al., Plant Biotechnol. J., 2:517-523, 2004). Wild-type moss was used for the subcellular localization experiments with P4H-GFP.

(57) Protoplast isolation and PEG-mediated transfection were performed as described previously (Frank et al., Plant Biol., 7:220-227, 2005; Rother et al., J. Plant Physiol., 143:72-77, 1994). Mutant selection was performed with Zeocin (Invitrogen) or sulfadiazine (Sigma) as described before (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012).

(58) For rhEPO production, P. patens was cultivated as described before (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012).

(59) Screening of Transformed Plants

(60) Screening of stably transformed plants was performed via direct PCR (Schween et al., Plant Mol. Biol. Rep., 20:43-47, 2002) with genomic DNA extracted as described before (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012). From these extracts, 2 µl were used as template for PCR, using the primers listed in Table 3 to check the 5′ and 3′ integration of the knockout construct in the correct genomic locus and to check the integration of the overexpression construct into the moss genome, respectively. Plants which showed the expected PCR products were considered putative knockouts or overexpression lines, respectively, and were subsequently analyzed. The absence of the p4h transcripts in the KO lines was analyzed via RT-PCR as described before (Parsons et al., Plant Biotechnol. J., 10:851-861, 2012) using the primers listed in Table 3. Expression of p4h1 in the overexpression lines was analyzed via semi-quantitative RT-PCR. For this purpose, cDNA equivalent to 150 ng RNA was amplified with 24, 26 and 28 cycles using the p4h1 primers listed in Table 3. The primers for the constitutively expressed TATA box-binding protein, TBP fwd and TBP rev (Table 3), were used as controls.

(61) Protein and Peptide Analysis

(62) Total soluble proteins were recovered from 160 ml of a 16-day-old culture supernatant by precipitation with 10% (w/v) trichloroacetic acid (TCA, Sigma-Aldrich, Deisenhofen, Germany) as described (Büttner-Mainik et al., Plant Biotechnol. J., 9:373-383, 2011). The pellet was resuspended in Laemmli sample loading buffer (BioRad, Munich, Germany) and electrophoretic separation of proteins was carried out in 12% SDS-polyacrylamide gels (Ready Gel Tris-HCl, BioRad) at 150 V for 1 h under non-reducing conditions.

(63) For peptide analysis, the proteins in the gels were stained with PageBlue Protein Staining Solution (Fermentas) and the bands corresponding to 25 kDa were cut out, S-alkylated and digested with trypsin or chymotrypsin (Grass et al., Anal. Bioanal. Chem. 400:2427-2438, 2011). Analysis by reversed-phase liquid chromatography coupled to electrospray ionization mass spectrometry on a Q-TOF instrument (LC-ESI-MS and MS/MS) was performed as described previously (Grass et al., Anal. Bioanal. Chem. 400:2427-2438, 2011).

(64) Quantification of the moss-produced rhEPO was performed using a hEPO Quantikine IVD ELISA kit (cat. no DEP00, R&D Systems) according to the manufacturer's protocol.

DETAILED DESCRIPTION OF THE FIGURES

(65) FIG. 1 shows the protein sequence comparison of P. patens putative prolyl-4-hydroxylases (P4Hs): PpP4H1 (SEQ ID No: 2), PpP4H6a (SEQ ID No: 12), PpP4H6b (SEQ ID No: 14), PpP4H5 (SEQ ID No: 10), PpP4H2 (SEQ ID No: 4), PpP4H3 (SEQ ID No: 6), PpP4H4 (SEQ ID No: 8). Amino acids that are identical in at least 5 sequences are marked with dashes above the respective positions. The conserved residues responsible for binding Fe²⁺ and the C-5 carboxyl group of 2-oxoglutarate are marked with asterisks below the respective positions. The first 147 amino acids of the human α(I) subunit did not align with any other analyzed sequence.

(66) FIG. 2 shows the in vivo subcellular localization of P. patens P4H homologues. Fluorescence of P4H-GFP fusion proteins in P. patens protoplasts was observed by confocal microscopy 3 to 14 days after transfection. The images obtained for PpP4H1-GFP, PpP4H3-GFP and PpP4H4-GFP are shown as examples of the fluorescence pattern which was observed for all homologues. (a-c) PpP4H1-GFP, (d-f) PpP4H3-GFP, (g-i) PpP4H4-GFP, (j-l) ASP-GFP-KDEL as control for ER localization, (m-o) GFP without any signal peptide as control for cytosolic localization. (a, d, g, j and m) single optical sections emitting GFP fluorescence (494-558 nm), (b, e, h, k and n) merge of chlorophyll autofluorescence (601-719 nm) and GFP fluorescence, (c, f, i, l and o) transmitted light images. The arrows indicate the cell nucleus membrane.

(67) FIG. 3 shows the schematic representation of the p4h knockout constructs. Exons are presented as rectangles and introns as lines. White rectangles represent the regions of the genes used for the constructs and striped rectangles represent the selection cassette. The restriction sites used to insert the selection cassette are marked as RS. Arrows represent oligonucleotides used for the screening of genomic integration.

(68) FIGS. 4a and 4b show the p4h gene expression analysis in recombinant moss lines. FIG. 4a is the expression analysis of p4h1, p4h2, p4h3, p4h4, p4h5 and p4h6, respectively, in the putative knock-out plants. As a control for efficient mRNA isolation, RT-PCR was performed with primers corresponding to the constitutively expressed gene for the ribosomal protein L21 (control). FIG. 4b is the expression analysis of p4h1 in moss wild type (WT), the rhEPO producing line 174.16, and five putative moss lines overexpressing p4h1 (No. 12, 16, 32, 41 and 45). Semi-quantitative RT-PCR was performed with increasing cycle number (24, 26 and 28) and primers specific for p4h1 as well as a control with primers corresponding to the constitutively expressed gene encoding the TATA-box binding protein TBP.

(69) FIGS. 5a and 5b show the mass spectrometric analysis of the hydroxylation of moss-produced rhEPO. FIG. 5a displays the reversed-phase liquid chromatography of tryptic peptides showing peaks of oxidized and non-oxidized peptide EAISPPDAASAAPLR (144-158; SEQ ID NO. 81) derived from rhEPO produced in moss lines 174.16 (control parental plant), p4h1 No. 192, p4h2 No. 6, p4h3 No. 21, p4h4 No. 95, p4h5 No. 29 and p4h6 No. 8. Selected ion chromatograms for the doubly charged ions of the non-oxidized (m/z 733.4) and the oxidized peptide (m/z 741.4) are shown. FIG. 5b shows broad band sum spectra for peptide 144-158 showing the absence of prolyl-hydroxylation (Pro) in the line p4h1 No. 192 and the presence of hydroxylated peptide (Hyp) in the line p4h4 No. 95, as an example. The peak between Pro and Hyp is the incidentally co-eluting peptide YLLEAK (SEQ ID NO. 86). Retention time deviations are technical artifacts.
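The two m/z values given for FIG. 5a can be reproduced from the peptide sequence. The following is a minimal sketch, assuming standard monoisotopic residue masses and treating prolyl-hydroxylation as the addition of one oxygen atom (+15.995 Da); the function name mz and its parameters are hypothetical and are not part of the described method.

```python
# Monoisotopic residue masses (Da) for the amino acids occurring in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "I": 113.08406,
    "L": 113.08406, "P": 97.05276, "R": 156.10111, "S": 87.03203,
}
WATER = 18.01056    # added once per peptide (free N- and C-terminus)
PROTON = 1.00728    # mass of the charge carrier
OXYGEN = 15.99491   # mass shift of one hydroxylation (Pro -> Hyp)

PEPTIDE = "EAISPPDAASAAPLR"  # SEQ ID NO. 81


def mz(sequence, charge, hydroxylations=0):
    """m/z of the protonated peptide ion with the given number of hydroxylations."""
    neutral = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER
    neutral += hydroxylations * OXYGEN
    return (neutral + charge * PROTON) / charge


if __name__ == "__main__":
    print(round(mz(PEPTIDE, 2), 1))      # 733.4 (non-oxidized, doubly charged)
    print(round(mz(PEPTIDE, 2, 1), 1))   # 741.4 (one Hyp, doubly charged)
```

For a doubly charged ion, one hydroxylation shifts the signal by about 8 m/z units, which is the spacing between the two selected ion chromatograms.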

(70) FIG. 6 shows the MS/MS analysis of the peptide EAISPPDAASAAPLR (144-158; SEQ ID NO. 81) from moss-produced rhEPO. One spectrum (FIG. 6a) was derived from the non-oxidized peptide (m/z 933.45), faithfully showing the partial sequence SPPDAAS (SEQ ID NO. 83). The other spectrum (FIG. 6b) was derived from one of the two oxidized peptides (m/z 941.45). It gave the apparent partial sequence SPLDAAS (SEQ ID NO. 84), which stands for SPODAAS (SEQ ID NO. 85), as Hyp (O) and Leu are isobaric. A second, slightly smaller peak of m/z 941.45 eluted a bit later and probably arose from hydroxylation of the other proline of the hydroxylation motif SPP.
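Why a hydroxyproline can be read as a leucine in an MS/MS spectrum follows from the residue masses. The sketch below computes both from their elemental compositions (Hyp residue C5H7NO2, Leu residue C6H11NO), assuming standard monoisotopic atomic masses; the names used are hypothetical. The two residues share the nominal mass 113 and differ by only about 0.036 Da, which may not be resolved in a fragment spectrum.

```python
# Monoisotopic atomic masses (Da).
ATOM_MASS = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Elemental composition of the residues (amino acid minus H2O).
HYP_RESIDUE = {"C": 5, "H": 7, "N": 1, "O": 2}    # hydroxyproline
LEU_RESIDUE = {"C": 6, "H": 11, "N": 1, "O": 1}   # leucine (same as isoleucine)


def residue_mass(composition):
    """Monoisotopic mass of a residue from its elemental composition."""
    return sum(ATOM_MASS[element] * count for element, count in composition.items())


if __name__ == "__main__":
    hyp = residue_mass(HYP_RESIDUE)
    leu = residue_mass(LEU_RESIDUE)
    print(f"Hyp {hyp:.4f} Da, Leu {leu:.4f} Da, difference {leu - hyp:.4f} Da")
```
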

(71) FIG. 7 shows the effect of overexpression of the prolyl-hydroxylase gene p4h1. Comparison of reversed-phase chromatograms showing the retention time for the moss-produced rhEPO peptide EAISPPDAASAAPLR (144-158; SEQ ID NO. 81) and its hydroxylated versions in the knockout moss line p4h1 No. 192 (FIG. 7a) and in the overexpressing line p4h1OE No. 32 (FIG. 7b). The spectra of each peak are shown below the chromatograms. In the overexpressing line, the doubly hydroxylated peptide and two singly hydroxylated isomers, one coeluting with the parent peptide, were found.

(72) FIG. 8 shows the analysis of the hydroxylation status of the N-terminal peptide of moss-produced rhEPO. The N-terminal sequence APP may also constitute a target sequence for moss prolyl-hydroxylase. Therefore, the N-terminus of moss-produced rhEPO was analyzed by reversed-phase liquid chromatography coupled to electrospray ionization mass spectrometry (LC-ESI-MS) of chymotryptic peptides. Screening for the masses of the non-oxidized and the oxidized peptide APPRLICDSRVL (1-12; SEQ ID NO. 82) from rhEPO produced in the moss control line 174.16, the knockout p4h1 No. 192 and the overexpression line p4h1OE No. 45 revealed no indication of Pro hydroxylation of this peptide.

(73) Thus, the experiments show the identification and functional characterization of a plant gene responsible for non-human prolyl hydroxylation of recombinant human erythropoietin (rhEPO) produced in moss bioreactors. Targeted ablation of this gene abolished undesired prolyl hydroxylation of rhEPO and thus paves the way for recombinant human proteins produced in a plant-based system humanized via glyco-engineering.

(74) FIG. 9 shows the phylogenetic tree of the amino acid sequences of different plant prolyl-4-hydroxylases. It is shown that the different Physcomitrella prolyl-4-hydroxylase genes are not phylogenetically separated from other plants. Rather, the sequence analysis shows that the different prolyl-4-hydroxylases from green algae, mosses and seed plants are very similar to each other and also more similar to each other than within one and the same species. Thus, it is obvious for the person skilled in the art that the disclosed method not only works in Physcomitrella but also in other plants.

(75) The present disclosure is not limited to the disclosed embodiments, as it is obvious for a person skilled in the art that the recombinant human protein may be any protein which is intended to be produced in a plant-based system without adverse prolyl hydroxylation. The disclosed invention is not even restricted to recombinant human proteins and may also be used in the manufacture of proteins from other species, like animals or plants. In addition, other plant-based systems are also possible. It is conceivable that a different prolyl-4-hydroxylase gene is responsible for a different recombinant human protein or a protein from another species and also when using a different plant for the production of the recombinant protein.
