PROTEIN COMPOSITIONS AND METHODS OF PRODUCTION

20240002824 · 2024-01-04

    Abstract

    Provided are systems and methods for producing recombinant proteins in microorganisms engineered to use alternate carbon sources.

    Claims

    1. An engineered host cell comprising: an integrated coding sequence of a fusion protein comprising a catalytic domain of a heterologous glycosyl hydrolase; and an integrated coding sequence of a heterologous protein of interest (POI); wherein the engineered host cell does not endogenously express the glycosyl hydrolase and the POI; and wherein the glycosyl hydrolase is anchored on the surface of the engineered host cell.

    2. The engineered host cell of claim 1, wherein the glycosyl hydrolase is an invertase selected from: S. cerevisiae, Kluyveromyces lactis, Cyberlindnera jadinii, Oryza sativa japonica (rice), Oryza sativa japonica (rice), Arabidopsis thaliana, Arabidopsis thaliana, Arabidopsis thaliana, Rattus norvegicus (rat), Oryctolagus cuniculus (Rabbit), and Homo sapiens.

    3-4. (canceled)

    5. The engineered host cell of claim 1, wherein the invertase is encoded by a gene selected from: SUC2, MAL1, invertase (INV1), cytosolic invertase 1 (CINV1), CIN2, CINV1, INVA, INVE, and sucrase-isomaltase (SI) gene.

    6. The engineered host cell of claim 1, wherein the fusion protein is surface-displayed on the engineered host cell; wherein the surface-displayed fusion protein comprises a catalytic domain of the glycosyl hydrolase and an anchoring domain of a glycosylphosphatidylinositol (GPI)-anchored protein, wherein the anchoring domain comprises at least about 200 amino acids and/or at least about 30% of the residues in the anchoring domain are serines or threonines.

    7-8. (canceled)

    9. The engineered host cell of claim 8, wherein the serines or threonines in the anchoring domain are capable of being O-mannosylated.

    10. The engineered host cell of claim 6, wherein a fusion protein having an anchoring domain comprising at least about 325 amino acids provides greater glycosyl hydrolase activity relative to a fusion protein having an anchoring domain comprising less than about 300 amino acids or less than about 250 amino acids.

    11-12. (canceled)

    13. The engineered host cell of claim 1, wherein the fusion protein comprises the GPI anchored protein without its native signal peptide or native secretory signal to the engineering host cell.

    14. (canceled)

    15. The engineered host cell of claim 1, wherein the GPI anchored protein is naturally expressed by a S. cerevisiae cell and the engineered host cell is not a S. cerevisiae cell.

    16. The engineered host cell of claim 13, wherein the GPI anchored protein is selected from Tir4, Dan1, or Sed1.

    17. The engineered host cell of claim 1, wherein an anchoring domain of the GPI anchored protein comprises an amino acid sequence that is at least 70% identical to one of SEQ ID NO: 1 to SEQ ID NO: 14.

    18. (canceled)

    19. The engineered host cell of claim 1, wherein the engineered host cell is a yeast cell or a Pichia species.

    20. (canceled)

    21. The engineered host cell of claim 19, wherein the Pichia species is Pichia pastoris.

    22. The engineered host cell of claim 1, wherein the engineered host cell comprises a genomic modification that expresses the fusion or a portion of the glycosyl hydrolase in addition to its catalytic domain.

    23-24. (canceled)

    25. The engineered host cell of claim 1, wherein in the fusion protein, the catalytic domain is N-terminal to the anchoring domain, or wherein in the fusion protein, the catalytic domain is C-terminal to the anchoring domain.

    26. (canceled)

    27. The engineered host cell of claim 1, wherein the fusion protein comprises a linker between the catalytic domain and the anchoring domain.

    28. (canceled)

    29. The engineered host cell of claim 1, wherein a growth rate of the engineered host cell in a media containing sucrose as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the glycosyl hydrolase.

    30. The engineered eukaryotic cell of claim 1, wherein the engineered eukaryotic cell comprises a genomic modification that overexpresses a secreted recombinant protein and/or comprises an extrachromosomal modification that overexpresses a secreted recombinant protein.

    31. The engineered eukaryotic cell of claim 30, wherein the secreted recombinant protein is an egg protein.

    32. (canceled)

    33. The engineered eukaryotic cell of claim 31, wherein the egg protein is selected from the group consisting of ovalbumin, ovomucoid, lysozyme, ovoglobulin G2, ovoglobulin G3, α-ovomucin, β-ovomucin, ovotransferrin, ovoinhibitor, ovoglycoprotein, flavoprotein, ovomacroglobulin, ovostatin, cystatin, avidin, ovalbumin related protein X, and ovalbumin related protein Y.

    34. The engineered eukaryotic cell of claim 30, wherein the genomic modification and/or the extrachromosomal modification that overexpresses the secreted recombinant protein comprises an inducible promoter selected from an AOX1, DAK2, PEX11, FLD1, FGH1, DAS1, DAS2, CAT1, MDH3, HAC1, BiP, RAD30, RVS161-2, MPP10, THP3, TLR, GBP2, PMP20, SHB17, PEX8, PEX4, or TKL3 promoter, and/or a terminator selected from an AOX1, TDH3, MOX, RPS25A, or RPL2A terminator.

    35-36. (canceled)

    37. The engineered eukaryotic cell of claim 30, wherein the genomic modification and/or the extrachromosomal modification that overexpresses a secreted recombinant protein encodes a signal peptide, a secretory signal, and/or codons that are optimized for the species of the engineered eukaryotic cell.

    38. (canceled)

    39. The engineered eukaryotic cell of claim 30, wherein the secreted recombinant protein is designed to be secreted from the cell and/or is capable of being secreted from the cell.

    40. The engineered eukaryotic cell of claim 1, wherein the fusion protein comprises an amino acid sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the amino acid sequence selected from SEQ ID NOs: 315, 332-335, and 342.

    41. The engineered eukaryotic cell of claim 1, wherein the fusion protein comprises a nucleotide sequence that is at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% identical to the nucleotide sequence of SEQ ID NO: 314.

    42. A method of growing/culturing the engineered host cell of claim 1, wherein the method comprises culturing the engineered host cell with a carbon source that is not naturally utilized by the host cell in the absence of the glycosyl hydrolase.

    43. A method for growing/culturing a host cell with a carbon source that is not naturally utilized by the host cell, the method comprising: (a) recombinantly producing in the host cell, a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose; optionally, wherein the glycosyl hydrolase capable of digesting sucrose is an invertase; (b) recombinantly producing in the host cell a heterologous protein of interest (POI); wherein the host cell does not express the glycosyl hydrolase endogenously; wherein the engineered host cell prior to step (a) does not utilize sucrose as a carbon source as efficiently as glucose, and wherein the glycosyl hydrolase is expressed on the surface of the engineered host cell.

    44. A method for manufacturing a host cell capable of utilizing a carbon source that is not naturally utilized by the host cell, the method comprising: (a) obtaining a host cell that recombinantly expresses a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose, wherein the glycosyl hydrolase capable of digesting sucrose is an invertase; and (b) genetically modifying the host cell to express a heterologous protein of interest (POI); wherein the host cell does not utilize sucrose as a carbon source as efficiently as glucose in the absence of the glycosyl hydrolase.

    45. A method for manufacturing a host cell capable of utilizing a carbon source that is not naturally utilized by the host cell, the method comprising: (a) obtaining a host cell that recombinantly expresses a heterologous protein of interest (POI); and (b) genetically modifying the host cell to express a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose; wherein the glycosyl hydrolase capable of digesting sucrose is an invertase; wherein the host cell prior to step (b) does not utilize sucrose as a carbon source as efficiently as glucose.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0051] The novel features of the invention are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present invention will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the invention are utilized, and the accompanying drawings of which:

    [0052] FIG. 1 illustrates the growth of P. pastoris on minimal nutrient plates containing glucose, fructose and sucrose.

    [0053] FIG. 2 illustrates an exemplary schematic of a construct to express a surface displayed protein comprising SUC2 and an anchored protein Tir4.

    [0054] FIG. 3 illustrates the growth of P. pastoris strains using mannose as a sole carbon source.

    [0055] FIG. 4 illustrates the growth of P. pastoris strains using glucose or sucrose as a sole carbon source. The strains labelled _D in FIG. 4 denote that dextrose (glucose) was used as the carbon source in the experimental condition. The strains labelled _S in FIG. 4 denote that sucrose was used as the carbon source in the experimental condition.

    [0056] FIG. 5 is an SDS-PAGE gel comparing protein of interest production in P. pastoris strains using glucose or sucrose as a sole carbon source.

    DETAILED DESCRIPTION

    [0057] High-yielding recombinant protein expression is a cornerstone of various industries, such as therapeutic proteins, food, and cosmetics. Growing host cells in readily available media to produce such recombinant proteins is therefore one of the most important factors, not only from an economic perspective but also from an environmental perspective. There is a need for recombinant protein expression that uses commonly available carbon sources while maintaining high titers of the recombinant proteins. The present invention addresses this need. The systems and methods provide high-titer expression of recombinant proteins in large-scale production using host cells that are genetically modified to utilize carbon sources not usually utilized by the host cell, and are particularly useful for expressing pure heterologous animal-derived proteins in a microbial host.

    Host Cell

    [0058] As used herein, a host cell refers to a cell which is capable of protein expression and optionally protein secretion. Such a host cell is used in the methods of the present invention. For the host cell to express a polypeptide, a nucleotide sequence encoding the polypeptide is present in or introduced into the cell. Host cells provided by the present invention can be prokaryotes or eukaryotes. As will be appreciated by one of skill in the art, a prokaryotic cell lacks a membrane-bound nucleus, while a eukaryotic cell has a membrane-bound nucleus. Examples of eukaryotic cells include, but are not limited to, vertebrate cells, mammalian cells, human cells, animal cells, invertebrate cells, plant cells, nematodal cells, insect cells, stem cells, fungal cells or yeast cells.

    [0059] Examples of yeast cells include but are not limited to the Saccharomyces genus (e.g. Saccharomyces cerevisiae, Saccharomyces kluyveri, Saccharomyces uvarum), the Komagataella genus (Komagataella pastoris, Komagataella pseudopastoris or Komagataella phaffii), the Kluyveromyces genus (e.g. Kluyveromyces lactis, Kluyveromyces marxianus), the Candida genus (e.g. Candida utilis, Candida cacaoi), the Geotrichum genus (e.g. Geotrichum fermentans), as well as Hansenula polymorpha and Yarrowia lipolytica. A host cell may also be a member of one of the following genera or species: Arxula spp., Arxula adeninivorans, Kluyveromyces spp., Kluyveromyces lactis, Komagataella phaffii, Pichia spp., Pichia angusta, Pichia pastoris, Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces spp., Schizosaccharomyces pombe, Yarrowia spp., Yarrowia lipolytica, Agaricus spp., Agaricus bisporus, Aspergillus spp., Aspergillus awamori, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bacillus subtilis, Colletotrichum spp., Colletotrichum gloeosporioides, Endothia spp., Endothia parasitica, Escherichia coli, Fusarium spp., Fusarium graminearum, Fusarium solani, Mucor spp., Mucor miehei, Mucor pusillus, Myceliophthora spp., Myceliophthora thermophila, Neurospora spp., Neurospora crassa, Penicillium spp., Penicillium camemberti, Penicillium canescens, Penicillium chrysogenum, Penicillium (Talaromyces) emersonii, Penicillium funiculosum, Penicillium purpurogenum, Penicillium roqueforti, Pleurotus spp., Pleurotus ostreatus, Rhizomucor spp., Rhizomucor miehei, Rhizomucor pusillus, Rhizopus spp., Rhizopus arrhizus, Rhizopus oligosporus, Rhizopus oryzae, Trichoderma spp., Trichoderma atroviride, Trichoderma reesei, or Trichoderma virens.

    [0060] The genus Pichia is of particular interest. Pichia comprises a number of species, including the species Pichia pastoris, Pichia methanolica, Pichia kluyveri, and Pichia angusta. Most preferred is the species Pichia pastoris.

    [0061] The former species Pichia pastoris has been divided and renamed as Komagataella pastoris and Komagataella phaffii. Therefore, Pichia pastoris is synonymous with both Komagataella pastoris and Komagataella phaffii.

    [0062] In some embodiments, the host cell is Pichia pastoris, Hansenula polymorpha, Trichoderma reesei, Saccharomyces cerevisiae, Kluyveromyces lactis, Yarrowia lipolytica, Pichia methanolica, Candida boidinii, a Komagataella species, or Schizosaccharomyces pombe.

    Protein of Interest

    [0063] The term protein of interest (POI) as used herein refers to a protein that is produced by means of recombinant technology in a host cell. More specifically, the protein may either be a polypeptide not naturally occurring in the host cell, i.e. a heterologous protein, or else may be native to the host cell, i.e. a homologous protein to the host cell, but is produced, for example, by transformation with a self-replicating vector containing the nucleic acid sequence encoding the POI, or upon integration by recombinant techniques of one or more copies of the nucleic acid sequence encoding the POI into the genome of the host cell, or by recombinant modification of one or more regulatory sequences controlling the expression of the gene encoding the POI, e.g. of the promoter sequence. In general, the proteins of interest referred to herein may be produced by methods of recombinant expression well known to a person skilled in the art. Exemplary proteins of interest are provided in Table 6. A recombinant POI expressed in a host cell may comprise a sequence with at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97% or at least 99% sequence identity to any of the sequences in Table 6.

    [0064] There is no limitation with respect to the protein of interest (POI). The POI may comprise a eukaryotic or prokaryotic polypeptide, variant or derivative thereof. The POI can be any eukaryotic or prokaryotic protein. The protein can be a naturally secreted protein or an intracellular protein, i.e. a protein which is not naturally secreted. The present invention also includes biologically active fragments of proteins. In another embodiment, a POI may be an amino acid chain or present in a complex, such as a dimer, trimer, hetero-dimer, multimer or oligomer.

    [0065] The protein of interest may be a protein used as nutritional, dietary, digestive, supplements, such as in food products, feed products, or cosmetic products. The food products may be, for example, bouillon, desserts, cereal bars, confectionery, sports drinks, dietary products or other nutrition products. Preferably, the protein of interest is a food additive.

    Glycosyl Hydrolases

    [0066] In some cases, a heterologous glycosyl hydrolase is produced in a host cell that has been engineered to express or overexpress one or more heterologous recombinant proteins such as the proteins of interest. A glycosyl hydrolase may be a surface-displayed enzyme that hydrolyzes a disaccharide, which allows a host cell to utilize a carbon source which it previously was unable to utilize or utilize efficiently. In some embodiments, a carbon source which a host cell was previously unable to utilize or utilize efficiently may comprise sucrose, maltose, fructose, high fructose corn syrup, molasses, or some combination thereof. In some embodiments, the carbon source which a host cell was previously unable to utilize or utilize efficiently may be present in a mixture with glucose. In some examples, a glycosyl hydrolase may be an enzyme that hydrolyzes a carbon source, e.g., a disaccharide, to its monomers, e.g., glucose, fructose, and galactose, which can be utilized by the host cell. For example, the glycosyl hydrolase may be an invertase such as proteins encoded by the SUC2 or MAL1 genes, which cleave the disaccharide sucrose to release glucose and fructose, which can be utilized by a yeast such as P. pastoris. In some embodiments, the glycosyl hydrolase may be an invertase such as proteins encoded by the INV1, CINV1, CIN2, INVE, INVA, or SI genes, which cleave the disaccharide sucrose to release glucose and fructose, which can be utilized by a yeast. Additional non-limiting examples of glycosyl hydrolases include: invertase, invertase 1, cytosolic invertase 1, Beta-fructofuranosidase, insoluble isoenzyme 2, Alkaline/neutral invertase, Alkaline/neutral invertase A, Alkaline/neutral invertase E, and Sucrase-isomaltase. Exemplary sequences for glycosyl hydrolases are provided in Table 2. A recombinant glycosyl hydrolase expressed in a host cell may comprise a sequence with at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97% or at least 99% sequence identity to any of the sequences in Table 2.
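
    The sequence identity thresholds recited above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned by some external tool and are supplied as equal-length strings with "-" marking gaps, and it counts identity only over columns where neither sequence has a gap. That convention is an assumption made for illustration; the text does not state which alignment method or identity definition is intended. The function names and the toy sequences are hypothetical and are not taken from Table 2.

```python
# Thresholds recited in the text for Table 2 sequences (percent identity).
THRESHOLDS = (75, 80, 85, 90, 92, 95, 97, 99)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: inputs are pre-aligned, equal-length strings using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns under this convention
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def thresholds_met(identity):
    """Return the recited thresholds that a given percent identity satisfies."""
    return [t for t in THRESHOLDS if identity >= t]


# Toy example: 7 of 8 aligned residues match.
identity = percent_identity("MKTAYIAK", "MKTAYIAR")
print(identity, thresholds_met(identity))
```

    Other identity conventions, for example dividing by the length of the shorter sequence or by the full alignment length, would give different values for gapped alignments.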

    [0067] In certain embodiments, the glycosyl hydrolase is of the family GH5. In certain embodiments, the glycosyl hydrolase is of the family GH7. In certain embodiments, the glycosyl hydrolase is of the family GH9. Such glycosyl hydrolases are found in PCT Application Publication No.: WO2009090381, which is hereby incorporated by reference in its entirety.

    [0068] An engineered host cell expressing a heterologous glycosyl hydrolase may be cultured with a carbon source that is not naturally utilized by the host cell or not utilized as efficiently as glucose in the absence of the glycosyl hydrolase.

    [0069] An engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing sucrose as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0070] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing fructose as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0071] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing maltose as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0072] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing high fructose corn syrup as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0073] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing molasses as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0074] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing a disaccharide as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0075] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing a mixture of glucose and a disaccharide as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.

    [0076] In some embodiments, an engineered host cell expressing a heterologous glycosyl hydrolase may have a growth rate in a medium containing a carbon source that is not glucose as a primary carbon source that is higher than the growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the glycosyl hydrolase.
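
    One way such a growth rate comparison could be made is sketched below. The disclosure does not specify how growth rate is measured, so this sketch assumes exponential growth between two optical density (OD) readings and uses the standard specific growth rate, mu = ln(OD_end / OD_start) / elapsed time. The function names and the OD values are hypothetical and are not data from the disclosure.

```python
import math


def specific_growth_rate(od_start, od_end, hours):
    """Specific growth rate (per hour), assuming exponential growth.

    mu = ln(od_end / od_start) / hours
    """
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("OD readings and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


def grows_faster(engineered, control):
    """Compare two (od_start, od_end, hours) measurements on the same medium."""
    return specific_growth_rate(*engineered) > specific_growth_rate(*control)


# Hypothetical readings on a sucrose medium, for illustration only.
engineered_strain = (0.1, 0.8, 6.0)   # OD rises eightfold in 6 h
control_strain = (0.1, 0.12, 6.0)     # OD barely changes in 6 h
print(specific_growth_rate(*engineered_strain))
print(grows_faster(engineered_strain, control_strain))
```

    In practice the two readings would need to fall within the exponential phase of each culture for this estimate to be meaningful.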

    Surface Display of Glycosyl Hydrolases

    [0077] Surface displaying a catalytic domain of an enzyme provides an effective and efficient means to project the catalytic domain into the extracellular space, thereby increasing the likelihood that the catalytic domain will encounter its substrate, e.g., a protein, lipid, carbohydrate, or another compound, and catalyze an enzymatic reaction with it. In the present disclosure, a fusion protein is localized to the extracellular surface of a host cell, i.e., is surface displayed. This way, the catalytic domain is unlikely to contact an intracellular, membrane-associated, or cell wall protein, thereby lowering the opportunity for the enzyme to modify, degrade, or otherwise alter a substrate needed by the cell. In some embodiments, the fusion protein catalyzes a reaction that cleaves a disaccharide, which allows the cell to utilize an alternate carbon source that it previously could not utilize, or could not utilize efficiently. By cleaving the disaccharide into monosaccharides, the cell is able to use the monosaccharides even though the culturing medium did not initially include them. In further embodiments, the fusion protein comprises an enzyme, e.g., a sucrase, that digests an impurity secreted by the cell.

    [0078] An aspect of the present disclosure is an engineered host cell that expresses a surface-displayed fusion protein. In some embodiments, host cells that can be engineered to express a surface-displayed fusion protein provided by the present invention can be prokaryotes or eukaryotes. As will be appreciated by one of skill in the art, a prokaryotic cell lacks a membrane-bound nucleus, while a eukaryotic cell has a membrane-bound nucleus. Examples of eukaryotic cells include, but are not limited to, vertebrate cells, mammalian cells, human cells, animal cells, invertebrate cells, plant cells, nematodal cells, insect cells, stem cells, fungal cells or yeast cells.

    [0079] Examples of yeast cells that may be transformed to include one or more expression cassettes include but are not limited to the Saccharomyces genus (e.g. Saccharomyces cerevisiae, Saccharomyces kluyveri, Saccharomyces uvarum), the Komagataella genus (Komagataella pastoris, Komagataella pseudopastoris or Komagataella phaffii), the Kluyveromyces genus (e.g. Kluyveromyces lactis, Kluyveromyces marxianus), the Candida genus (e.g. Candida utilis, Candida cacaoi), the Geotrichum genus (e.g. Geotrichum fermentans), as well as Hansenula polymorpha and Yarrowia lipolytica. A host cell may also be a member of one of the following genera or species: Arxula spp., Arxula adeninivorans, Kluyveromyces spp., Kluyveromyces lactis, Komagataella phaffii, Pichia spp., Pichia angusta, Pichia pastoris, Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces spp., Schizosaccharomyces pombe, Yarrowia spp., Yarrowia lipolytica, Agaricus spp., Agaricus bisporus, Aspergillus spp., Aspergillus awamori, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bacillus subtilis, Colletotrichum spp., Colletotrichum gloeosporioides, Endothia spp., Endothia parasitica, Escherichia coli, Fusarium spp., Fusarium graminearum, Fusarium solani, Mucor spp., Mucor miehei, Mucor pusillus, Myceliophthora spp., Myceliophthora thermophila, Neurospora spp., Neurospora crassa, Penicillium spp., Penicillium camemberti, Penicillium canescens, Penicillium chrysogenum, Penicillium (Talaromyces) emersonii, Penicillium funiculosum, Penicillium purpurogenum, Penicillium roqueforti, Pleurotus spp., Pleurotus ostreatus, Rhizomucor spp., Rhizomucor miehei, Rhizomucor pusillus, Rhizopus spp., Rhizopus arrhizus, Rhizopus oligosporus, Rhizopus oryzae, Trichoderma spp., Trichoderma atroviride, Trichoderma reesei, or Trichoderma virens.

    [0080] The genus Pichia is of particular interest. Pichia comprises a number of species, including the species Pichia pastoris, Pichia methanolica, Pichia kluyveri, and Pichia angusta. Most preferred is the species Pichia pastoris.

    [0081] The former species Pichia pastoris has been divided and renamed as Komagataella pastoris and Komagataella phaffii. Therefore, Pichia pastoris is synonymous with both Komagataella pastoris and Komagataella phaffii.

    [0082] In some embodiments, the host cell is Pichia pastoris, Hansenula polymorpha, Trichoderma reesei, Saccharomyces cerevisiae, Kluyveromyces lactis, Yarrowia lipolytica, Pichia methanolica, Candida boidinii, a Komagataella species, or Schizosaccharomyces pombe.

    [0083] In some embodiments, the engineered host cell expresses a surface-displayed fusion protein. The fusion protein comprises a catalytic domain of an enzyme and an anchoring domain of a glycosylphosphatidylinositol (GPI)-anchored protein, wherein the anchoring domain comprises at least about 200 amino acids and/or at least about 30% of the residues in the anchoring domain are serines or threonines.

    [0084] A fusion protein is a protein consisting of at least two domains that are normally encoded by separate genes but have been joined so that they are transcribed and translated as a single unit, thereby producing a single (fused) polypeptide.

    [0085] In the present disclosure, a fusion protein comprises at least a catalytic domain of an enzyme, such as a glycosyl hydrolase, and an anchoring domain of a GPI-anchored protein. Typically, a GPI-anchored protein is a cell surface protein, i.e., one that is located on the extracellular surface of the cell.

    [0086] A fusion protein may further comprise linkers that separate the two domains. Linkers can be flexible or rigid; they can be semi-flexible or semi-rigid. Separating the two domains may promote activity of the catalytic domain in that it reduces steric hindrance upon the catalytic site, which may be present if the catalytic site is too closely positioned relative to an anchoring domain. Additionally, a linker may further project the catalytic domain into the extracellular space, thereby increasing the likelihood that the catalytic domain will encounter and catalyze an enzymatic reaction with its substrate, e.g., protein, lipid, carbohydrate, or other compounds.

    [0087] In embodiments, the anchoring domain comprises at least about 225 amino acids, at least about 250 amino acids, at least about 275 amino acids, at least about 300 amino acids, at least about 325 amino acids, at least about 350 amino acids, at least about 375 amino acids, or at least about 400 amino acids.

    [0088] In some embodiments, at least about 35% of the residues in the anchoring domain are serines or threonines, at least about 40% of the residues in the anchoring domain are serines or threonines, at least about 45% of the residues in the anchoring domain are serines or threonines, or at least about 50% of the residues in the anchoring domain are serines or threonines.
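
    The length and serine/threonine content criteria for the anchoring domain can be checked with a short sketch such as the one below. It treats the "at least about" thresholds as exact cutoffs (200 residues and 30% by default), which is a simplifying assumption, and it reports the two criteria separately because the text joins them with "and/or". The function names and the toy sequence are hypothetical and are not SEQ ID NOs from the disclosure.

```python
def ser_thr_fraction(seq):
    """Fraction of residues in a one-letter amino acid sequence that are S or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for residue in seq if residue in "ST") / len(seq)


def meets_anchor_criteria(seq, min_length=200, min_st_fraction=0.30):
    """Return (length_ok, ser_thr_ok) for a candidate anchoring domain.

    Assumption: 'at least about' is treated as an exact cutoff here.
    """
    length_ok = len(seq) >= min_length
    ser_thr_ok = ser_thr_fraction(seq) >= min_st_fraction
    return length_ok, ser_thr_ok


# Toy sequence for illustration only: 240 residues, half of them S or T.
toy_anchor = "STAV" * 60
print(len(toy_anchor), ser_thr_fraction(toy_anchor))
print(meets_anchor_criteria(toy_anchor))
```

    The stricter embodiments listed above (for example, at least about 325 amino acids or at least about 50% serines or threonines) can be checked by passing different values for min_length and min_st_fraction.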

    [0089] In various embodiments, the serines or threonines in the anchoring domain are capable of being O-mannosylated.

    [0090] In embodiments, a fusion protein having an anchoring domain comprising at least about 325 amino acids provides greater enzymatic activity relative to a fusion protein having an anchoring domain comprising less than about 300 amino acids.

    [0091] In some embodiments, a fusion protein having an anchoring domain comprising at least about 300 amino acids provides greater enzymatic activity relative to a fusion protein having an anchoring domain comprising less than about 250 amino acids.

    [0092] In some embodiments, the fusion protein comprises the GPI anchored protein without its native signal peptide. In some embodiments, the fusion protein comprises the GPI anchored protein without a C terminus region having amino acid sequence of GAAKAVIGMGAGALAAVAAML (SEQ ID NO: 336). In some embodiments, the fusion protein comprises the GPI anchored protein with a C terminus region having amino acid sequence of GAAKAVIGMGAGALAAVAAML (SEQ ID NO: 336).
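
    The embodiments with and without the C terminus region of SEQ ID NO: 336 can be illustrated with a minimal sketch. It assumes the region sits at the very end of the anchoring sequence and removes it only when it is present there. The function name and the toy anchoring sequence are hypothetical; only the 21-residue region itself is quoted from the text.

```python
# C terminus region quoted in the text as SEQ ID NO: 336.
C_TERMINAL_REGION = "GAAKAVIGMGAGALAAVAAML"


def without_c_terminal_region(anchor_seq, region=C_TERMINAL_REGION):
    """Return anchor_seq with the given region removed from its C terminus.

    Assumption: the region, when present, is the final stretch of the sequence.
    Sequences that do not end with the region are returned unchanged.
    """
    if region and anchor_seq.endswith(region):
        return anchor_seq[: -len(region)]
    return anchor_seq


# Toy anchoring sequence for illustration only.
toy_anchor = "STSTST" + C_TERMINAL_REGION
print(without_c_terminal_region(toy_anchor))
```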

    [0093] In some embodiments, the GPI anchored protein is not native to the engineered eukaryotic cell.

    [0094] In various embodiments, the GPI anchored protein is naturally expressed by a S. cerevisiae cell and the engineered eukaryotic cell is not a S. cerevisiae cell.

    [0095] In embodiments, the GPI anchored protein is selected from Tir4, Dan1, Dan4, Sag1, Fig2, or Sed1.

    [0096] In some embodiments, the anchoring domain of the GPI anchored protein comprises an amino acid sequence that is at least 70% identical, at least 75% identical, at least 80% identical, at least 85% identical, at least 90% identical, or at least 95% identical, to one of SEQ ID NO: 1 to SEQ ID NO: 14.

    [0097] In various embodiments, the anchoring domain of the GPI anchored protein comprises an amino acid sequence of one of SEQ ID NO: 1 to SEQ ID NO: 14.

    [0098] Sed1p is a major component of the Saccharomyces cerevisiae cell wall. It is required to stabilize the cell wall and for stress resistance in stationary-phase cells. See, e.g., the world wide web (at) uniprot.org/uniprot/Q01589. It is believed that Asn318 (with respect to SEQ ID NO: 13) is the most likely candidate for the GPI attachment site in Sed1p. In some embodiments, a fusion protein comprising a Sed1p anchoring domain has a sequence having at least 95% sequence identity with SEQ ID NO: 13 or SEQ ID NO: 14. In some cases, the sequence identity may be greater than or about 90%, 95%, 96%, 97%, 98%, 99%, or 100%. In various embodiments, the Sed1p anchoring domain of a fusion protein of the present disclosure comprises a GPI attachment site; thus, the anchoring domain may only require a short fragment of SEQ ID NO: 13 or SEQ ID NO: 14, i.e., a fragment that is 5, 10, 50, 100, 200, or 300 or more amino acids in length, as long as it is capable of projecting the catalytic domain of the fusion protein into the extracellular space. In some embodiments, the anchoring domain comprises, at least, Sed1p's GPI attachment site.

    [0099] When a linker is present, a fusion protein may have a general structure of: N terminus -(a)-(b)-(c)-C terminus, wherein (a) comprises a first domain, (b) is one or more linkers, and (c) is a second domain. The first domain may comprise a catalytic domain of an enzyme and the second domain may comprise an anchoring domain of a GPI anchored protein. In some embodiments, in the fusion protein, the catalytic domain is N-terminal to the anchoring domain. The fusion protein may comprise a linker N-terminal to the anchoring domain.

    [0100] Linkers useful in fusion proteins may comprise one or more sequences of Table 3. In one example, a tandem repeat (of two, three, four, five, six, or more copies) of a linker, e.g., of SEQ ID NO: 33 or SEQ ID NO: 34 is included in a fusion protein.
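
    The N terminus -(a)-(b)-(c)-C terminus arrangement with a tandem-repeat linker can be illustrated with a minimal Python sketch. The function name and the three short sequences are hypothetical placeholders chosen for illustration; they are not sequences from Table 3 or from the sequence listing.

```python
def assemble_fusion(catalytic: str, linker: str, anchor: str,
                    linker_copies: int = 1) -> str:
    """Concatenate (a) catalytic domain, (b) linker repeats, (c) anchoring domain.

    The returned string reads N terminus to C terminus.
    """
    if linker_copies < 0:
        raise ValueError("linker_copies must be zero or greater")
    return catalytic + linker * linker_copies + anchor


# Hypothetical placeholder sequences (NOT taken from the disclosure).
CATALYTIC = "MKTAYIAK"   # stands in for an enzyme catalytic domain
LINKER = "GGGGS"         # stands in for a single linker unit
ANCHOR = "SSTSSAVS"      # stands in for a GPI anchoring domain

# Tandem repeat of three linker copies between the two domains.
fusion = assemble_fusion(CATALYTIC, LINKER, ANCHOR, linker_copies=3)
print(fusion)
```

    Setting linker_copies to zero models a fusion protein in which no linker is present.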

    [0101] In embodiments, a fusion protein comprises a Glu-Ala-Glu-Ala (EAEA; SEQ ID NO: 19) spacer dipeptide repeat. The EAEA (SEQ ID NO: 19) is a signal that promotes yields of an expressed protein in certain cell types.

    [0102] Other linkers are well-known in the art and can be substituted for the linkers of Table 3. For example, in embodiments, the linker may be derived from naturally-occurring multi-domain proteins or may be an empirical linker as described, for example, in Chichili et al., (2013), Protein Sci. 22(2):153-167 and Chen et al., (2013), Adv Drug Deliv Rev. 65(10):1357-1369, the entire contents of which are hereby incorporated by reference. In embodiments, the linker may be designed using linker designing databases and computer programs such as those described in Chen et al., (2013), Adv Drug Deliv Rev. 65(10):1357-1369 and Crasto et al., (2000), Protein Eng. 13(5):309-312, the entire contents of which are hereby incorporated by reference.

    [0103] In embodiments, the linker comprises a polypeptide. In embodiments, the polypeptide is less than about 500 amino acids long, about 450 amino acids long, about 400 amino acids long, about 350 amino acids long, about 300 amino acids long, about 250 amino acids long, about 200 amino acids long, about 150 amino acids long, or about 100 amino acids long. For example, the linker may be less than about 100, about 95, about 90, about 85, about 80, about 75, about 70, about 65, about 60, about 55, about 50, about 45, about 40, about 35, about 30, about 25, about 20, about 19, about 18, about 17, about 16, about 15, about 14, about 13, about 12, about 11, about 10, about 9, about 8, about 7, about 6, about 5, about 4, about 3, or about 2 amino acids long. In some cases, the linker is about 59 amino acids long.

    [0104] The length of a linker may be important to the effectiveness of a surface displayed enzyme's catalytic domain. For example, if a linker is too short, then the catalytic domain of the enzyme may not project far enough away from the cell surface such that it is incapable of interacting with its substrate, e.g., protein, lipid, carbohydrate, or another compound. In this case, the catalytic domain may be buried in the cell wall and/or among other cell surface proteins or sugars. On the other hand, the linker may be too long and/or too rigid to allow adequate contact between a substrate and the catalytic domain of the enzyme.

    [0105] The secondary structure of a linker may also be important to the effectiveness of a surface displayed enzyme's catalytic domain. More specifically, a linker designed to have a plurality of distinct regions may provide additional flexibility to the fusion protein. As examples, a linker having one or more alpha helices may be superior to a linker having no alpha helices.

    [0106] The longer linker comprises three subsections: an N-terminal flexible GS linker with higher S content, a rigid linker that forms four turns of an alpha helix, and a flexible GS linker with much higher G content on its C-terminus. Linkers containing only G's and S's in repetitive sequences are commonly used in fusion proteins as flexible spacers that do not introduce secondary structure. In some cases, the ratio of G to S determines the flexibility of the linker. Linkers with higher G content may be more flexible than linkers with higher S content. The structure of the linker of SEQ ID NO: 31 is designed to mimic multi-domain proteins in nature, which often use alpha helices (sometimes multiple) to separate as well as orient their domains spatially. In fusion proteins of the present disclosure, a complex linker, such as that of SEQ ID NO: 32, can be viewed as part of a multi-domain protein in which the catalytic domain of an enzyme and the anchoring domain of a GPI anchored protein are separate functional domains.

    [0107] In various embodiments, the fusion protein comprises a linker having an amino acid sequence that is at least 95%, 96%, 97%, 98%, 99%, or 100% identical to SEQ ID NO: 32.

    [0108] In embodiments, the linker is substantially comprised of glycine and serine residues (e.g. about 30%, or about 40%, or about 50%, or about 60%, or about 70%, or about 80%, or about 90%, or about 95%, or about 96%, or about 97%, or about 98%, or about 99%, or about 100% glycines and serines).
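
    The glycine and serine content of a candidate linker can be checked with a short calculation. The following Python sketch is illustrative only; the helper name and the example linker sequence are hypothetical and are not taken from the sequence listing.

```python
def residue_fraction(seq: str, residues: str) -> float:
    """Return the fraction of positions in seq whose residue is in residues."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(1 for aa in seq if aa in residues) / len(seq)


# Hypothetical linker: a GS-rich stretch followed by a short helical motif.
example_linker = "GGSGGSGGSEAAAK"

gs_fraction = residue_fraction(example_linker, "GS")
print(f"G+S content: {gs_fraction:.0%}")
```

    The same helper could be applied to an anchoring domain with residues set to "ST" to estimate its serine and threonine content.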

    [0109] In various embodiments, the engineered eukaryotic cell comprises a genomic modification that expresses the fusion protein and/or comprises an extrachromosomal modification that expresses the fusion protein.

    [0110] In embodiments, the fusion protein comprises a portion of the enzyme in addition to its catalytic domain.

    [0111] In some embodiments, the fusion protein comprises substantially the entire amino acid sequence of the enzyme.

    [0112] In some embodiments, upon translation, the fusion protein comprises a signal peptide and/or a secretory signal. In certain embodiments, the fusion protein comprises a signal peptide and a secretory signal.

    [0113] In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence selected from SEQ ID NOs: 315 and 332-335. In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence of SEQ ID NO: 315. In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence of SEQ ID NO: 332. In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence of SEQ ID NO: 333. In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence of SEQ ID NO: 334. In some embodiments, the fusion protein comprises an amino acid sequence having at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99% or 100% amino acid sequence identity to the amino acid sequence of SEQ ID NO: 335.

    [0114] In various embodiments, the engineered eukaryotic cell comprises two or more fusion proteins, three or more fusion proteins, or four fusion proteins.

    [0115] In some cases, the two or more fusion proteins comprise different enzyme types or the two or more fusion proteins comprise the same enzyme type.

    [0116] In various cases, the two of the three or more fusion proteins or two of the four or more fusion proteins comprise different enzyme types or two of the three or more fusion proteins or two of the four or more fusion proteins comprise the same enzyme type.

    [0117] In additional cases, the three of the three or more fusion proteins or three of the four or more fusion proteins comprise different enzyme types or three of the three or more fusion proteins or three of the four or more fusion proteins comprise the same enzyme type.

    [0118] In various cases, each of the two or more, three or more, or four fusion proteins comprise different enzyme types or each of the two or more, three or more, or four fusion proteins comprise the same enzyme type.

    [0119] In embodiments, the enzyme types are selected from an enzyme that catalyzes a post-translational modification of a protein secreted by the engineered eukaryotic cell and an enzyme that catalyzes a reaction which allows the engineered eukaryotic cell to rely on alternate carbon sources.

    [0120] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing fructose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0121] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing maltose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0122] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing high fructose corn syrup as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0123] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing molasses as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0124] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing a disaccharide as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0125] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing a mixture of glucose and a disaccharide as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.

    [0126] In some embodiments, an engineered host cell expressing a fusion protein may have a growth rate in a medium containing a carbon source that is not glucose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the fusion protein.
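
    One common way to compare the growth rate of an engineered host cell with that of a control host cell is to estimate a specific growth rate from two optical density readings taken during exponential growth. The Python sketch below assumes that approach; the disclosure does not specify how growth rate is measured, and the optical density values shown are hypothetical illustration values, not experimental results.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Estimate the specific growth rate (per hour) as ln(od_end / od_start) / hours.

    Assumes both readings fall within the exponential growth phase.
    """
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("optical densities and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


# Hypothetical readings over a 6 hour window on an alternate carbon source.
mu_engineered = specific_growth_rate(od_start=0.1, od_end=0.8, hours=6.0)
mu_control = specific_growth_rate(od_start=0.1, od_end=0.2, hours=6.0)

print(f"engineered: {mu_engineered:.3f} per hour")
print(f"control:    {mu_control:.3f} per hour")
print("engineered grows faster:", mu_engineered > mu_control)
```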

    Transporter Proteins

    [0127] In some cases, a heterologous transporter protein is produced in a host cell that has been engineered to express or overexpress one or more heterologous recombinant proteins such as the proteins of interest. A transporter protein may be a protein that allows the host cell to transport a carbon source into the host cell. The host cell then may be able to catalyze a reaction which allows the host cell to utilize a carbon source which it previously was unable to utilize or utilize efficiently. In some embodiments, the transporter protein may be a sucrose permease (such as encoded by the MAL11 or AGT1 genes) or a maltose permease (such as encoded by the MAL2 gene). Exemplary sequences for glycosyl hydrolases are provided in Table 10. A recombinant glycosyl hydrolase expressed in a host cell may comprise a sequence with at least 75%, at least 80%, at least 85%, at least 90%, at least 92%, at least 95%, at least 97% or at least 99% sequence identity to any of the sequences in Table 10. In certain embodiments, the sucrose permease is a CscB sucrose permease. Exemplary sequences of sucrose permeases can be found in PCT Application Publication No.: WO2022129470, which is hereby incorporated by reference in its entirety.

    [0128] An engineered host cell expressing a heterologous transporter protein may be cultured with a carbon source that is not naturally utilized by the host cell or not utilized as efficiently as glucose in the absence of the transporter protein.

    [0129] An engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing sucrose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0130] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing fructose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0131] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing maltose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0132] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing high fructose corn syrup as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0133] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing molasses as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0134] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing a disaccharide as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0135] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing a mixture of glucose and a disaccharide as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0136] In some embodiments, an engineered host cell expressing a heterologous transporter protein may have a growth rate in a medium containing a carbon source that is not glucose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the transporter protein.

    [0137] In some cases, the engineered host cell may endogenously express a glycosyl hydrolase which can utilize the alternate carbon source, but it is unable to do so efficiently. In such cases, a transporter protein may increase the uptake of the alternate carbon source and therefore increase the metabolization of the alternate carbon source.

    [0138] In some cases, the engineered host cell may not express a glycosyl hydrolase which is able to hydrolyze an alternate carbon source. In such examples, the host cell may be engineered to express a heterologous glycosyl hydrolase which is able to hydrolyze the alternate carbon source.

    Expression of Recombinant Proteins

    [0139] Expression of a recombinant protein can be provided by an expression vector, a plasmid, a nucleic acid integrated into the host genome or other means. For example, a vector for expression can include: (a) a promoter element, (b) a signal peptide, (c) a heterologous protein sequence, and (d) a terminator element.

    [0140] Expression vectors that can be used for expression of a recombinant protein include those containing an expression cassette with elements (a), (b), (c) and (d). In some embodiments, the signal peptide (b) need not be included in the vector. In general, the expression cassette is designed to mediate the transcription of the transgene when integrated into the genome of a cognate host microorganism.

    [0141] To aid in the amplification of the vector prior to transformation into the host microorganism, a replication origin (e) may be contained in the vector (such as pUC ORIC and pUC (DNA2.0)). To aid in the selection of microorganisms stably transformed with the expression vector, the vector may also include a selection marker (f) such as the URA3 gene or the Zeocin resistance gene (ZeoR). The expression vector may also contain a restriction enzyme site (g) that allows for linearization of the expression vector prior to transformation into the host microorganism to facilitate the expression vector's stable integration into the host genome. In some embodiments the expression vector may contain any subset of the elements (b), (e), (f), and (g), including none of elements (b), (e), (f), and (g). Other expression elements and vector elements known to one of skill in the art can be used in combination or substituted for the elements described herein.
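
    The vector elements (a) through (g) can be summarized as a simple record in which (a), (c), and (d) are required and (b), (e), (f), and (g) are optional. The Python sketch below is one possible illustrative representation; the class name, field names, and the "POI coding sequence" label are hypothetical, while the AOX1 promoter, alpha mating factor signal peptide, AOX1 terminator, and ZeoR marker are elements named in this disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExpressionVector:
    promoter: str                              # element (a)
    coding_sequence: str                       # element (c)
    terminator: str                            # element (d)
    signal_peptide: Optional[str] = None       # element (b), optional
    replication_origin: Optional[str] = None   # element (e), optional
    selection_marker: Optional[str] = None     # element (f), optional
    linearization_site: Optional[str] = None   # element (g), optional

    def cassette_order(self) -> List[str]:
        """List cassette elements 5' to 3', skipping an absent signal peptide."""
        parts = [self.promoter, self.signal_peptide,
                 self.coding_sequence, self.terminator]
        return [p for p in parts if p is not None]


vector = ExpressionVector(
    promoter="AOX1 promoter",
    signal_peptide="alpha mating factor",
    coding_sequence="POI coding sequence",     # hypothetical label
    terminator="AOX1 terminator",
    selection_marker="ZeoR",
)
print(" -> ".join(vector.cassette_order()))
```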

    [0142] Exemplary promoter elements (a) may include, but are not limited to, a constitutive promoter, inducible promoter, and hybrid promoter. Promoters include, but are not limited to, acu-5, adh1+, alcohol dehydrogenase (ADH1, ADH2, ADH4), AHSB4m, AINV, alcA, α-amylase, alternative oxidase (AOD), alcohol oxidase I (AOX1), alcohol oxidase 2 (AOX2), AXDH, B2, CaMV, cellobiohydrolase I (cbh1), ccg-1, cDNA1, cellular filament polypeptide (cfp), cpc-2, ctr4+, CUP1, dihydroxyacetone synthase (DAS), enolase (ENO, ENO1), formaldehyde dehydrogenase (FLD1), FMD, formate dehydrogenase (FMDH), G1, G6, GAA, GAL1, GAL2, GAL3, GAL4, GAL5, GAL6, GALT, GAL5, GAL5, GAL10, GCW14, gdhA, gla-1, glucoamylase (glaA), glyceraldehyde-3-phosphate dehydrogenase (gpdA, GAP, GAPDH), phosphoglycerate mutase (GPM1), glycerol kinase (GUT1), HSP82, inv1+, isocitrate lyase (ICL1), acetohydroxy acid isomeroreductase (ILV5), KAR2, KEX2, β-galactosidase (lac4), LEU2, melO, MET3, methanol oxidase (MOX), nmt1, NSP, pcbC, PETS, peroxin 8 (PEX8), phosphoglycerate kinase (PGK, PGK1), pho1, PHO5, PHO89, phosphatidylinositol synthase (PIS1), PYK1, pyruvate kinase (pki1), RPS7, sorbitol dehydrogenase (SDH), 3-phosphoserine aminotransferase (SER1), SSA4, SV40, TEF, translation elongation factor 1 alpha (TEF1), THI11, homoserine kinase (THR1), tpi, TPS1, triose phosphate isomerase (TPI1), XRP2, YPT1, and any combination thereof. Illustrative inducible promoters include methanol-induced promoters, e.g., DAS1 and PEX11. Exemplary promoter sequences are provided in Table 4.

    [0143] A signal peptide (b), also known as a signal sequence, targeting signal, localization signal, localization sequence, transit peptide, leader sequence, or leader peptide, may support secretion of a protein or polynucleotide. Extracellular secretion of a recombinant or heterologously expressed protein from a host cell may facilitate protein purification. A signal peptide may be derived from a precursor (e.g., prepropeptide, preprotein) of a protein. Signal peptides can be derived from a precursor of a protein other than the recombinant protein; that is, the signal peptide need not be the one native to the recombinant protein.

    [0144] Any nucleic acid sequence that encodes a recombinant protein can be used as (c). Preferably such sequence is codon optimized for the species/genus/kingdom of the host cell.

    [0145] Exemplary transcriptional terminator elements include, but are not limited to, acu-5, adh1+, alcohol dehydrogenase (ADH1, ADH2, ADH4), AHSB4m, AINV, alcA, α-amylase, alternative oxidase (AOD), alcohol oxidase I (AOX1), alcohol oxidase 2 (AOX2), AXDH, B2, CaMV, cellobiohydrolase I (cbh1), ccg-1, cDNA1, cellular filament polypeptide (cfp), cpc-2, ctr4+, CUP1, dihydroxyacetone synthase (DAS), enolase (ENO, ENO1), formaldehyde dehydrogenase (FLD1), FMD, formate dehydrogenase (FMDH), G1, G6, GAA, GAL1, GAL2, GAL3, GAL4, GAL5, GAL6, GALT, GAL5, GAL5, GAL10, GCW14, gdhA, gla-1, glucoamylase (glaA), glyceraldehyde-3-phosphate dehydrogenase (gpdA, GAP, GAPDH), phosphoglycerate mutase (GPM1), glycerol kinase (GUT1), HSP82, inv1+, isocitrate lyase (ICL1), acetohydroxy acid isomeroreductase (ILV5), KAR2, KEX2, β-galactosidase (lac4), LEU2, melO, MET3, methanol oxidase (MOX), nmt1, NSP, pcbC, PETS, peroxin 8 (PEX8), phosphoglycerate kinase (PGK, PGK1), pho1, PHO5, PHO89, phosphatidylinositol synthase (PIS1), PYK1, pyruvate kinase (pki1), RPS7, sorbitol dehydrogenase (SDH), 3-phosphoserine aminotransferase (SER1), SSA4, SV40, TEF, translation elongation factor 1 alpha (TEF1), THI11, homoserine kinase (THR1), tpi, TPS1, triose phosphate isomerase (TPI1), XRP2, YPT1, and any combination thereof. Exemplary promoter sequences are provided in Table 5.

    [0146] Exemplary selectable markers (f) may include but are not limited to: an antibiotic resistance gene (e.g. zeocin, ampicillin, blasticidin, kanamycin, nourseothricin, chloramphenicol, tetracycline, triclosan, ganciclovir, and any combination thereof) and an auxotrophic marker (e.g. ade1, arg4, his4, ura3, met2, and any combination thereof). Exemplary terminator sequences are provided in Table 8.

    [0147] In one example, a vector for expression in Pichia sp. can include an AOX1 promoter operably linked to a signal peptide (alpha mating factor) that is fused in frame with a nucleic acid sequence encoding a recombinant protein, and a terminator element (AOX1 terminator) immediately downstream of the nucleic acid sequence encoding a recombinant protein.

    [0148] In another example, a vector comprises a DAS1 promoter operably linked to a signal peptide (alpha mating factor) that is fused in frame with a nucleic acid sequence encoding a recombinant protein, and a terminator element (AOX1 terminator) immediately downstream of the nucleic acid sequence encoding the recombinant protein.

    [0149] A recombinant protein described herein may be secreted from the one or more host cells. In some embodiments, a recombinant POI is secreted from the host cell. The secreted recombinant POI may be isolated and purified by methods such as centrifugation, fractionation, filtration, affinity purification and other methods for separating protein from cells, liquid and solid media components and other cellular products and byproducts. In some embodiments, a recombinant POI is produced in a Pichia sp. and secreted from the host cells into the culture media. The secreted recombinant protein, such as the POI, is then separated from other media components for further use.

    [0150] In some cases, multiple vectors comprising the gene sequence of a protein may be transfected into one or more host cells. A host cell may comprise more than one copy of the gene encoding the recombinant protein. A single host cell may comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 copies of the recombinant POI or the fusion protein. A single host cell may comprise one or more vectors for the expression of the POI and/or the fusion protein. A single host cell may comprise 2, 3, 4, 5, 6, 7, 8, 9, or 10 vectors for the POI expression and/or the fusion protein expression. Each vector in the host cell may drive the expression of POI and/or the fusion protein using the same promoter. Alternatively, different promoters may be used in different vectors for POI and/or the fusion protein expression.

    [0151] A recombinant protein such as the POI or the fusion protein may be recombinantly expressed in one or more host cells. As used herein, a host or host cell denotes any protein production host selected or genetically modified to produce a desired product. Exemplary hosts include fungi, such as filamentous fungi, as well as bacteria, yeast, plant, insect, and mammalian cells. A host cell can be an organism that is designated as generally recognized as safe (GRAS) by the U.S. Food and Drug Administration.

    [0152] In some embodiments, a host cell may be transformed to include one or more expression cassettes. As examples, a host cell may be transformed to express one expression cassette, two expression cassettes, three expression cassettes or more expression cassettes. In one example, a host cell is transformed to express a first expression cassette that encodes a first POI and a second expression cassette that encodes a second POI.

    [0153] As used herein, a host cell refers to a cell which is capable of protein expression and optionally protein secretion. Such host cell is applied in the methods of the present invention. For that purpose, for the host cell to express a polypeptide, a nucleotide sequence encoding the polypeptide is present or introduced in the cell. Host cells provided by the present invention can be prokaryotes or eukaryotes. As will be appreciated by one of skill in the art, a prokaryotic cell lacks a membrane-bound nucleus, while a eukaryotic cell has a membrane-bound nucleus. Examples of eukaryotic cells include, but are not limited to, vertebrate cells, mammalian cells, human cells, animal cells, invertebrate cells, plant cells, nematodal cells, insect cells, stem cells, fungal cells or yeast cells.

    [0154] Examples of yeast cells that may be transformed to include one or more expression cassettes include but are not limited to the Saccharomyces genus (e.g. Saccharomyces cerevisiae, Saccharomyces kluyveri, Saccharomyces uvarum), the Komagataella genus (Komagataella pastoris, Komagataella pseudopastoris or Komagataella phaffii), the Kluyveromyces genus (e.g. Kluyveromyces lactis, Kluyveromyces marxianus), the Candida genus (e.g. Candida utilis, Candida cacaoi), the Geotrichum genus (e.g. Geotrichum fermentans), as well as Hansenula polymorpha and Yarrowia lipolytica. A host cell may also be a member of the following species: Arxula spp., Arxula adeninivorans, Kluyveromyces spp., Kluyveromyces lactis, Komagataella phaffii, Pichia spp., Pichia angusta, Pichia pastoris, Saccharomyces spp., Saccharomyces cerevisiae, Schizosaccharomyces spp., Schizosaccharomyces pombe, Yarrowia spp., Yarrowia lipolytica, Agaricus spp., Agaricus bisporus, Aspergillus spp., Aspergillus awamori, Aspergillus fumigatus, Aspergillus nidulans, Aspergillus niger, Aspergillus oryzae, Bacillus subtilis, Colletotrichum spp., Colletotrichum gloeosporioides, Endothia spp., Endothia parasitica, Escherichia coli, Fusarium spp., Fusarium graminearum, Fusarium solani, Mucor spp., Mucor miehei, Mucor pusillus, Myceliophthora spp., Myceliophthora thermophila, Neurospora spp., Neurospora crassa, Penicillium spp., Penicillium camemberti, Penicillium canescens, Penicillium chrysogenum, Penicillium (Talaromyces) emersonii, Penicillium funiculosum, Penicillium purpurogenum, Penicillium roqueforti, Pleurotus spp., Pleurotus ostreatus, Rhizomucor spp., Rhizomucor miehei, Rhizomucor pusillus, Rhizopus spp., Rhizopus arrhizus, Rhizopus oligosporus, Rhizopus oryzae, Trichoderma spp., Trichoderma atroviride, Trichoderma reesei, or Trichoderma virens.

    [0155] The genus Pichia is of particular interest. Pichia comprises a number of species, including the species Pichia pastoris, Pichia methanolica, Pichia kluyveri, and Pichia angusta. Most preferred is the species Pichia pastoris.

    [0156] The former species Pichia pastoris has been divided and renamed as Komagataella pastoris and Komagataella phaffii. Therefore, Pichia pastoris is synonymous with both Komagataella pastoris and Komagataella phaffii.

    [0157] In some embodiments, the host cell is selected from Pichia pastoris, Hansenula polymorpha, Trichoderma reesei, Saccharomyces cerevisiae, Kluyveromyces lactis, Yarrowia lipolytica, Pichia methanolica, Candida boidinii, Komagataella, and Schizosaccharomyces pombe.

    [0158] The term sequence identity as used herein in the context of amino acid sequences is defined as the percentage of amino acid residues in a candidate sequence that are identical with the amino acid residues in a selected sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, ALIGN-2 or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for measuring alignment, including any algorithms needed to achieve maximal alignment over the full-length of the sequences being compared.
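
    The definition of sequence identity above can be illustrated with a minimal Python sketch that operates on two sequences that have already been aligned, with "-" marking gaps. The sketch does not perform the alignment itself, which would be done with software such as BLAST; the function name and the two short example sequences are hypothetical. Consistent with the definition, only exact matches are counted, and the denominator is the number of residues in the candidate sequence.

```python
def percent_identity(aligned_candidate: str, aligned_selected: str) -> float:
    """Percent of candidate residues identical to the aligned selected residue.

    Both inputs must be pre-aligned and of equal length, with '-' as the gap
    character. Conservative substitutions are not counted as identical.
    """
    if len(aligned_candidate) != len(aligned_selected):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    candidate_residues = 0
    for cand, sel in zip(aligned_candidate.upper(), aligned_selected.upper()):
        if cand == "-":
            continue  # a gap in the candidate is not a candidate residue
        candidate_residues += 1
        if cand == sel:
            matches += 1
    if candidate_residues == 0:
        raise ValueError("candidate sequence contains no residues")
    return 100.0 * matches / candidate_residues


# Hypothetical aligned pair: one gap in the candidate, one mismatch at the end.
identity = percent_identity("MKT-AYL", "MKTGAYI")
print(f"{identity:.1f}% identity")
```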

    [0159] In some embodiments, an engineered host cell expressing a recombinant protein such as the POI or the fusion protein may have a growth rate in a media containing fructose as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the recombinant protein such as the POI or the fusion protein.

    [0160] In some embodiments, an engineered host cell expressing a recombinant protein such as the POI or the fusion protein may have a growth rate in a media containing maltose as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the recombinant protein such as the POI or the fusion protein.

    [0161] In some embodiments, an engineered host cell expressing a recombinant protein such as the POI or the fusion protein may have a growth rate in a media containing high fructose corn syrup as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the recombinant protein such as the POI or the fusion protein.

    [0162] In some embodiments, an engineered host cell expressing a recombinant protein such as the POI or the fusion protein may have a growth rate in a media containing molasses as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the recombinant protein such as the POI or the fusion protein.

    [0163] In some embodiments, an engineered host cell expressing a recombinant protein such as the POI or the fusion protein may have a growth rate in a media containing a disaccharide as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the recombinant protein such as the POI or the fusion protein.

    [0164] In some embodiments, an engineered host cell expressing a recombinant protein, such as the POI or the fusion protein, may have a growth rate in a medium containing a mixture of glucose and a disaccharide as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the recombinant protein, such as the POI or the fusion protein.

    [0165] In some embodiments, an engineered host cell expressing a recombinant protein, such as the POI or the fusion protein, may have a growth rate in a medium containing a carbon source that is not glucose as a primary carbon source that is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except that the control host cell does not express the recombinant protein, such as the POI or the fusion protein.
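
    The growth-rate comparisons described in the preceding embodiments may be illustrated, by way of a non-limiting sketch, as follows. The sketch assumes that growth rate is expressed as a specific growth rate estimated from two optical density readings taken during exponential growth; this convention, the function name `specific_growth_rate`, and all numerical readings are hypothetical and are not taken from the disclosure.

```python
import math


def specific_growth_rate(od_start: float, od_end: float, hours: float) -> float:
    """Estimate the specific growth rate (per hour) between two optical
    density readings, assuming exponential growth over the interval:
    mu = ln(OD_end / OD_start) / elapsed_hours."""
    if od_start <= 0 or od_end <= 0 or hours <= 0:
        raise ValueError("optical densities and elapsed time must be positive")
    return math.log(od_end / od_start) / hours


# Hypothetical readings (not data from the disclosure): both strains are grown
# in the same medium with a non-glucose primary carbon source for 6 hours.
engineered_rate = specific_growth_rate(od_start=0.10, od_end=0.80, hours=6.0)
control_rate = specific_growth_rate(od_start=0.10, od_end=0.20, hours=6.0)

# Fold difference between the engineered host cell and the control host cell.
fold_difference = engineered_rate / control_rate

print(f"engineered: {engineered_rate:.3f} per hour")
print(f"control:    {control_rate:.3f} per hour")
print(f"engineered grows faster: {engineered_rate > control_rate}")
```

    In this sketch the control host cell is assumed to differ from the engineered host cell only in lacking the recombinant protein, so that any difference in the estimated rates can be attributed to its expression.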

    TABLE-US-00001 TABLE1 Anchoringproteins Sequence SEQID Info NO: Aminoacidsequence Tir4from SEQIDNO: QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSE Saccharomyces 1 VDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSS cerevisiae EVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSS PVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTT VTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLKT TVTVCDSACQAKKSATVVSVQSKTTGIVEQTENGAAKAVIGMGAGALAAVAAMLL Tir4from SEQIDNO: QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSE Saccharomyces 320 VDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSS cerevisiae EVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSS PVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTT VTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLKT TVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN Tir4from SEQIDNO: MAYSKITLLAALAAIAYAQTQAQINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAG Saccharomyces 2 IMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSA cerevisiae ATSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSS (underlinedis STESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAP signalpeptide,may YNSTTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTK ormaynotbe VSTATICSTCKEGTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGIVEQTENGA utilizedindesign) AKAVIGMGAGALAAVAAMLL Tir4from SEQIDNO: QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSE Saccharomyces 320 VDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSS cerevisiae EVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSS PVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTT VTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLKT TVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN Tir4 SEQIDNO: QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSE (NP_014652.1) 3 VDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSS from 
EVVSSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVASSSVAPSSSEVVSSSVASSSSEVA Saccharomyces SSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVASSSVAPSSSEVVSSSVASSTSEATSSSA cerevisiae VTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSST AQTSISTIAPYNSTTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPAST TGYNNSTKVSTATICSTCKEGTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGI VEQTENGAAKAVIGMGAGALAAVAAMLL Tir4 SEQIDNO: MAYSKITLLAALAALAYAQTQAQINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAG (NP_014652.1) 4 IMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSA from ATSSSEVASSSIASSTSSSVAPSSSEVVSSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVA Saccharomyces SSSVAPSSSEVVSSSVASSSSEVASSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVASSSV cerevisiae APSSSEVVSSSVASSTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVV (underlinedis SSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTTVTETD signalpeptide,may NTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLKTTVTVC ormaynotbe DSACQAKKSATVVSVQSKTTGIVEQTENGAAKAVIGMGAGALAAVAAMLL utilizedindesign) Tir4 SEQIDNO: QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSE (NP_014652.1) 321 VDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSS from EVVSSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVASSSVAPSSSEVVSSSVASSSSEVA Saccharomyces SSSVAPSSSEVVSSSVAPSSSEVVSSSVASSSSEVASSSVAPSSSEVVSSSVASSTSEATSSSA cerevisiae VTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSST (withoutC- AQTSISTIAPYNSTTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPAST terminusofTir4 TGYNNSTKVSTATICSTCKEGTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGI GPIanchoror VEQTEN signalpeptideor signalpeptide) Dan1from SEQIDNO: ASVTTTLSPYDERVNLIELAVYVSDIGAHLSEYYAFQALHKTETYPPEIAKAVFAGGDFTT Saccharomyces 5 MLTGISGDEVTRMITGVPWYSTRLMGAISEALANEGIATAVPASTTEASSTSTSEASSAAT cerevisiae ESSSSSESSAETSSNAASTQATVSSESSSAASTIASSAESSVASSVASSVASSASFANTTAPV SSTSSISVTPVVQNGTDSTVTKTQASTVETTITSCSNNVCSTVTKPVSSKAQSTATSVTSSA SRVIDVTTNGANKFNNGVFGAAAIAGAAALLL Dan1from SEQIDNO: MSRISILAVAAALVASATAASVTTTLSPYDERVNLIELAVYVSDIGAHLSEYYAFQALHK Saccharomyces 6 
TETYPPEIAKAVFAGGDFTTMLTGISGDEVTRMITGVPWYSTRLMGAISEALANEGIATA cerevisiae VPASTTEASSTSTSEASSAATESSSSSESSAETSSNAASTQATVSSESSSAASTIASSAESSV (underlinedis ASSVASSVASSASFANTTAPVSSTSSISVTPVVQNGTDSTVTKTQASTVETTITSCSNNVCS signalpeptide,may TVTKPVSSKAQSTATSVTSSASRVIDVTTNGANKFNNGVFGAAAIAGAAALLL ormaynotbe utilizedindesign) Dan4from SEQIDNO: ITATTTLSPYDERVNLIELAVYVSDIRAHIFQYYSFRNHHKTETYPSEIAAAVFDYGDFTTR Saccharomyces 7 LTGISGDEVTRMITGVPWYSTRLKPAISSALSKDGIYTAIPTSTSTTTTKSSTSTTPTTTITST cerevisiae TSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTST TSTTPTTSTTSTTPTTSTTPTTSTTSTTSQTSTKSTTPTTSSTSTTPTTSTTPTTSTTSTAPTTS TTSTTSTTSTISTAPTTSTTSSTFSTSSASASSVISTTATTSTTFASLTTPATSTASTDHTTSSV STTNAFTTSATTTTTSDTYISSSSPSQVTSSAEPTTVSEVTSSVEPTRSSQVTSSAEPTTVSEF TSSVEPTRSSQVTSSAEPTTVSEFTSSVEPTRSSQVTSSAEPTTVSEFTSSVEPTRSSQVTSSA EPTTVSEFTSSVEPTRSSQVTSSAEPTTVSEFTSSVEPIRSSQVTSSAEPTTVSEVTSSVEPIRS SQVTTTEPVSSFGSTFSEITSSAEPLSFSKATTSAESISSNQITISSELIVSSVITSSSEIPSSIEVL TSSGISSSVEPTSLVGPSSDESISSTESLSATSTFTSAVVSSSKAADFFTRSTVSAKSDVSGNS STQSTTFFATPSTPLAVSSTVVTSSTDSVSPNIPFSEISSSPESSTAITSTSTSFIAERTSSLYLS SSNMSSFTLSTFTVSQSIVSSFSMEPTSSVASFASSSPLLVSSRSNCSDARSSNTISSGLFSTIE NVRNATSTFTNLSTDEIVITSCKSSCTNEDSVLTKTQVSTVETTITSCSGGICTTLMSPVTTI NAKANTLTTTETSTVETTITTCPGGVCSTLTVPVTTITSEATTTATISCEDNEEDITSTETEL LTLETTITSCSGGICTTLMSPVTTINAKANTLTTTETSTVETTITTCSGGVCSTLTVPVTTITS EATTTATISCEDNEEDVASTKTELLTMETTITSCSGGICTTLMSPVSSFNSKATTSNNAESTI PKAIKVSCSAGACTTLTTVDAGISMFTRTGLSITQTTVTNCSGGTCTMLTAPIATATSKVIS PIPKASSATSIAHSSASYTVSINTNGAYNFDKDNIFGTAIVAVVALLLL Dan4from SEQIDNO: MVNISIVAGIVALATSAAAITATTTLSPYDERVNLIELAVYVSDIRAHIFQYYSFRNHHKTE Saccharomyces 8 TYPSEIAAAVFDYGDFTTRLTGISGDEVTRMITGVPWYSTRLKPAISSALSKDGIYTAIPTS cerevisiae TSTTTTKSSTSTTPTTTITSTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPT (underlinedis TSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTSTTPTTSTTPTTSTTSTTSQTSTKSTTPTTSSTST signalpeptide,may TPTTSTTPTTSTTSTAPTTSTTSTTSTTSTISTAPTTSTTSSTFSTSSASASSVISTTATTSTTFA ormaynotbe SLTTPATSTASTDHTTSSVSTTNAFTTSATTTTTSDTYISSSSPSQVTSSAEPTTVSEVTSSV 
utilizedindesign) EPTRSSQVTSSAEPTTVSEFTSSVEPTRSSQVTSSAEPTTVSEFTSSVEPTRSSQVTSSAEPTT VSEFTSSVEPTRSSQVTSSAEPTTVSEFTSSVEPTRSSQVTSSAEPTTVSEFTSSVEPIRSSQV TSSAEPTTVSEVTSSVEPIRSSQVTTTEPVSSFGSTFSEITSSAEPLSFSKATTSAESISSNQITI SSELIVSSVITSSSEIPSSIEVLTSSGISSSVEPTSLVGPSSDESISSTESLSATSTFTSAVVSSSK AADFFTRSTVSAKSDVSGNSSTQSTTFFATPSTPLAVSSTVVTSSTDSVSPNIPFSEISSSPES STAITSTSTSFIAERTSSLYLSSSNMSSFTLSTFTVSQSIVSSFSMEPTSSVASFASSSPLLVSS RSNCSDARSSNTISSGLFSTIENVRNATSTFTNLSTDEIVITSCKSSCTNEDSVLTKTQVSTV ETTITSCSGGICTTLMSPVTTINAKANTLTTTETSTVETTITTCPGGVCSTLTVPVTTITSEA TTTATISCEDNEEDITSTETELLTLETTITSCSGGICTTLMSPVTTINAKANTLTTTETSTVET TITTCSGGVCSTLTVPVTTITSEATTTATISCEDNEEDVASTKTELLTMETTITSCSGGICTT LMSPVSSFNSKATTSNNAESTIPKAIKVSCSAGACTTLTTVDAGISMFTRTGLSITQTTVTN CSGGTCTMLTAPIATATSKVISPIPKASSATSIAHSSASYTVSINTNGAYNFDKDNIFGTAIV AVVALLLL Sag1from SEQIDNO: ININDITFSNLEITPLTANKQPDQGWTATFDFSIADASSIREGDEFTLSMPHVYRIKLLNSSQ Saccharomyces 9 TATISLADGTEAFKCYVSQQAAYLYENTTFTCTAQNDLSSYNTIDGSITFSLNFSDGGSSY cerevisiae EYELENAKFFKSGPMLVKLGNQMSDVVNFDPAAFTENVFHSGRSTGYGSFESYHLGMY CPNGYFLGGTEKIDYDSSNNNVDLDCSSVQVYSSNDFNDWWFPQSYNDTNADVTCFGS NLWITLDEKLYDGEMLWVNALQSLPANVNTIDHALEFQYTCLDTIANTTYATQFSTTREF IVYQGRNLGTASAKSSFISTTTTDLTSINTSAYSTGSISTVETGNRTTSEVISHVVTTSTKLS PTATTSLTIAQTSIYSTDSNITVGTDIHTTSEVISDVETISRETASTVVAAPTSTTGWTGAMN TYISQFTSSSFATINSTPIISSSAVFETSDASIVNVHTENITNTAAVPSEEPTFVNATRNSLNS FCSSKQPSSPSSYTSSPLVSSLSVSKTLLSTSFTPSVPTSNTYIKTKNTGYFEHTALTTSSVG LNSFSETAVSSQGTKIDTFLVSSLIAYPSSASGSQLSGIQQNFTSTSLMISTYEGKASIFFSAE LGSIIFLLLSYLLF Sag1from SEQIDNO: MFTFLKIILWLFSLALASAININDITFSNLEITPLTANKQPDQGWTATFDFSIADASSIREGD Saccharomyces 10 EFTLSMPHVYRIKLLNSSQTATISLADGTEAFKCYVSQQAAYLYENTTFTCTAQNDLSSY cerevisiae NTIDGSITFSLNFSDGGSSYEYELENAKFFKSGPMLVKLGNQMSDVVNFDPAAFTENVFH (underlinedis SGRSTGYGSFESYHLGMYCPNGYFLGGTEKIDYDSSNNNVDLDCSSVQVYSSNDFNDW signalpeptide,may WFPQSYNDTNADVTCFGSNLWITLDEKLYDGEMLWVNALQSLPANVNTIDHALEFQYT ormaynotbe CLDTIANTTYATQFSTTREFIVYQGRNLGTASAKSSFISTTTTDLTSINTSAYSTGSISTVET utilizedindesign) 
GNRTTSEVISHVVTTSTKLSPTATTSLTIAQTSIYSTDSNITVGTDIHTTSEVISDVETISRET ASTVVAAPTSTTGWTGAMNTYISQFTSSSFATINSTPIISSSAVFETSDASIVNVHTENITNT AAVPSEEPTFVNATRNSLNSFCSSKQPSSPSSYTSSPLVSSLSVSKTLLSTSFTPSVPTSNTYI KTKNTGYFEHTALTTSSVGLNSFSETAVSSQGTKIDTFLVSSLIAYPSSASGSQLSGIQQNF TSTSLMISTYEGKASIFFSAELGSIIFLLLSYLLF Fig2from SEQIDNO: QIVFYQNSSTSLPVPTLVSTSIADFHESSSTGEVQYSSSYSYVQPSIDSFTSSSFLTSFEAPTE Saccharomyces 11 TSSSYAVSSSLITSDTFSSYSDIFDEETSSLISTSAASSEKASSTLSSTAQPHRTSHSSSSFELP cerevisiae VTAPSSSSLPSSTSLTFTSVNPSQSWTSFNSEKSSALSSTIDFTSSEISGSTSPKSLESFDTTGT ITSSYSPSPSSKNSNQTSLLSPLEPLSSSSGDLILSSTIQATTNDQTSKTIPTLVDATSSLPPTL RSSSMAPTSGSDSISHNFTSPPSKTSGNYDVLTSNSIDPSLFTTTSEYSSTQLSSLNRASKSE TVNFTASIASTPFGTDSATSLIDPISSVGSTASSFVGISTANFSTQGNSNYVPESTASGSSQY QDWSSSSLPLSQTTWVVINTTNTQGSVTSTTSPAYVSTATKTVDGVITEYVTWCPLTQTK SQAIGVSSSISSVPQASSFSGSSILSSNSSTLAASNNVPESTASGSSQYQDWSSSSLPLSQTT WVVINTTNTQGSVTSTTSPAYVSTATKTVDGVITEYVTWCPLTQTKSQAIGISSSTISATQ TSKPSSILTLGISTLQLSDATFKGTETINTHLMTESTSITEPTYFSGTSDSFYLCTSEVNLASS LSSYPNFSSSEGSTATITNSTVTFGSTSKYPSTSVSNPTEASQHVSSSVNSLTDFTSNSTETI AVISNIHKTSSNKDYSLTTTQLKTSGMQTLVLSTVTTTVNGAATEYTTWCPASSIAYTTSI SYKTLVLTTEVCSHSECTPTVITSVTATSSTIPLLSTSSSTVLSSTVSEGAKNPAASEVTINT QVSATSEATSTSTQVSATSATATASESSTTSQVSTASETISTLGTQNFTTTGSLLFPALSTE MINTTVVSRKTLIISTEVCSHSKCVPTVITEVVTSKGTPSNGHSSQTLQTEAVEVTLSSHQT VTMSTEVCSNSICTPTVITSVQMRSTPFPYLTSSTSSSSLASTKKSSLEASSEMSTFSVSTQS LPLAFTSSEKRSTTSVSQWSNTVLTNTIMSSSSNVISTNEKPSSTTSPYNFSSGYSLPSSSTPS QYSLSTATTTINGIKTVYTTWCPLAEKSTVAASSQSSRSVDRFVSSSKPSSSLSQTSIQYTL STATTTISGLKTVYTTWCPLTSKSTLGATTQTSSTAKVRITSASSATSTSISLSTSTESESSSG YLSKGVCSGTECTQDVPTQSSSPASTLAYSPSVSTSSSSSFSTTTASTLTSTHTSVPLLPSSS SISASSPSSTSLLSTSLPSPAFTSSTLPTATAVSSSTFIASSLPLSSKSSLSLSPVSSSILMSQFSS SSSSSSSLASLPSLSISPTVDTVSVLQPTTSIATLTCTDSQCQQEVSTICNGSNCDDVTSTAT TPPSTVTDTMTCTGSECQKTTSSSCDGYSCKVSETYKSSATISACSGEGCQASATSELNSQ YVTMTSVITPSAITTTSVEVHSTESTISITTVKPVTYTSSDTNGELITITSSSQTVIPSVTTIITR TKVAITSAPKPTTTTYVEQRLSSSGIATSFVAAASSTWITTPIVSTYAGSASKFLCSKFFMI MVMVINFI Fig2from SEQIDNO: 
MNSFASLGLIYSVVNLLTRVEAQIVFYQNSSTSLPVPTLVSTSIADFHESSSTGEVQYSSSY Saccharomyces 12 SYVQPSIDSFTSSSFLTSFEAPTETSSSYAVSSSLITSDTFSSYSDIFDEETSSLISTSAASSEKA cerevisiae SSTLSSTAQPHRTSHSSSSFELPVTAPSSSSLPSSTSLTFTSVNPSQSWTSFNSEKSSALSSTI (underlinedis DFTSSEISGSTSPKSLESFDTTGTITSSYSPSPSSKNSNQTSLLSPLEPLSSSSGDLILSSTIQAT signalpeptide,may TNDQTSKTIPTLVDATSSLPPTLRSSSMAPTSGSDSISHNFTSPPSKTSGNYDVLTSNSIDPS ormaynotbe LFTTTSEYSSTQLSSLNRASKSETVNFTASIASTPFGTDSATSLIDPISSVGSTASSFVGISTA utilizedindesign) NFSTQGNSNYVPESTASGSSQYQDWSSSSLPLSQTTWVVINTTNTQGSVTSTTSPAYVST ATKTVDGVITEYVTWCPLTQTKSQAIGVSSSISSVPQASSFSGSSILSSNSSTLAASNNVPES TASGSSQYQDWSSSSLPLSQTTWVVINTTNTQGSVTSTTSPAYVSTATKTVDGVITEYVT WCPLTQTKSQAIGISSSTISATQTSKPSSILTLGISTLQLSDATFKGTETINTHLMTESTSITEP TYFSGTSDSFYLCTSEVNLASSLSSYPNFSSSEGSTATITNSTVTFGSTSKYPSTSVSNPTEA SQHVSSSVNSLTDFTSNSTETIAVISNIHKTSSNKDYSLTTTQLKTSGMQTLVLSTVTTTVN GAATEYTTWCPASSIAYTTSISYKTLVLTTEVCSHSECTPTVITSVTATSSTIPLLSTSSSTV LSSTVSEGAKNPAASEVTINTQVSATSEATSTSTQVSATSATATASESSTTSQVSTASETIS TLGTQNFTTTGSLLFPALSTEMINTTVVSRKTLIISTEVCSHSKCVPTVITEVVTSKGTPSNG HSSQTLQTEAVEVTLSSHQTVTMSTEVCSNSICTPTVITSVQMRSTPFPYLTSSTSSSSLAST KKSSLEASSEMSTFSVSTQSLPLAFTSSEKRSTTSVSQWSNTVLTNTIMSSSSNVISTNEKPS STTSPYNFSSGYSLPSSSTPSQYSLSTATTTINGIKTVYTTWCPLAEKSTVAASSQSSRSVD RFVSSSKPSSSLSQTSIQYTLSTATTTISGLKTVYTTWCPLTSKSTLGATTQTSSTAKVRITS ASSATSTSISLSTSTESESSSGYLSKGVCSGTECTQDVPTQSSSPASTLAYSPSVSTSSSSSFS TTTASTLTSTHTSVPLLPSSSSISASSPSSTSLLSTSLPSPAFTSSTLPTATAVSSSTFIASSLPL SSKSSLSLSPVSSSILMSQFSSSSSSSSSLASLPSLSISPTVDTVSVLQPTTSIATLTCTDSQCQ QEVSTICNGSNCDDVTSTATTPPSTVTDTMTCTGSECQKTTSSSCDGYSCKVSETYKSSAT ISACSGEGCQASATSELNSQYVTMTSVITPSAITTTSVEVHSTESTISITTVKPVTYTSSDTN GELITITSSSQTVIPSVTTIITRTKVAITSAPKPTTTTYVEQRLSSSGIATSFVAAASSTWITTP IVSTYAGSASKFLCSKFFMIMVMVINFI Sed1from SEQIDNO: QFSNSTSASSTDVTSSSSISTSSGSVTITSSEAPESDNGTSTAAPTETSTEAPTTAIPTNGTST Saccharomyces 13 EAPTTAIPTNGTSTEAPTDTTTEAPTTALPTNGTSTEAPTDTTTEAPTTGLPTNGTTSAFPPT cerevisiae TSLPPSNTTTTPPYNPSTDYTTDYTVVTEYTTYCPEPTTFTTNGKTYTVTEPTTLTITDCPC 
TIEKPTTTSTTEYTVVTEYTTYCPEPTTFTTNGKTYTVTEPTTLTITDCPCTIEKSEAPESSV PVTESKGTTTKETGVTTKQTTANPSLTVSTVVPVSSSASSHSVVINSNGANVVVPGALGL AGVAMLFL Sed1from SEQIDNO: MKLSTVLLSAGLASTTLAQFSNSTSASSTDVTSSSSISTSSGSVTITSSEAPESDNGTSTAAP Saccharomyces 14 TETSTEAPTTAIPTNGTSTEAPTTAIPTNGTSTEAPTDTTTEAPTTALPTNGTSTEAPTDTTT cerevisiae EAPTTGLPTNGTTSAFPPTTSLPPSNTTTTPPYNPSTDYTTDYTVVTEYTTYCPEPTTFTTN (underlinedis GKTYTVTEPTTLTITDCPCTIEKPTTTSTTEYTVVTEYTTYCPEPTTFTTNGKTYTVTEPTT signalpeptide,may LTITDCPCTIEKSEAPESSVPVTESKGTTTKETGVTTKQTTANPSLTVSTVVPVSSSASSHS ormaynotbe VVINSNGANVVVPGALGLAGVAMLFL utilizedindesign)

    TABLE-US-00002 TABLE2 Carbonutilizationproteins Sequence SEQID Info NO: Aminoacidsequences Saccharomyces SEQIDNO:15 SMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSD cerevisiae DLTNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISY SUC2 SLDGGYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLE (without SAFANEGFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFD peptides NQSRVVDFGKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTE thatare YQANPETELINLKAEPILNISNAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTI cleaved SKSVFADLSLWFKGLEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKS offpost- ENDLSYYKVYGLLDQNILELYFNDGDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVRE translationally) VK Saccharomyces SEQIDNO:16 MLLQAFLFLLAGFAAKISASMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYN cerevisiae PNDTVWGTPLFWGHATSDDLTNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQR SUC2 CVAIWTYNTPESEEQYISYSLDGGYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKS (including QDYKIEIYSSDDLKSWKLESAFANEGFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGS peptides FNQYFVGSFNGTHFEAFDNQSRVVDFGKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPT thatare NPWRSSMSLVRKFSLNTEYQANPETELINLKAEPILNISNAGPWSRFATNTTLTKANSYNVDLSN cleaved STGTLEFELVYAVNTTQTISKSVFADLSLWFKGLEDPEEYLRMGFEVSASSFFLDRGNSKVKFVK offpost- ENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNILELYFNDGDVVSTNTYFMTTGNALGSVN translationally) MTTGVDNLFYIDKFQVREVK UniProt KB- P00724 (INV2_ YEAST) Pichia SEQIDNO:17 MTIESQEPWWKSAVVYQVWPASFKDSNGDGIGDLNGITSELDHIKSLGTDVIWLSPHYASPLDD angusta MGYDISDYNAINPQFGTMEDMDRLLAEIKKRDMRLILDLVINHTSSEHAWFKESRSSRDNPKRD MAL1 WYIWKDNANNWLSFFSGSAWSYDEKTKQYYLRLFAETQPDLNWENPKTREAIYKSALEFWYE (including KGVSGFRIDTAGLYSKVQTFEDAPVTFPGEKYQPAGPLINSGPRIHEFHKEMYEKVTSRYDAMTV peptides GEVGHCSKADALKYVSAKEKEMNMMFLFDTVDVGSDKSDRFRYKGFTLTDFKDAIINQSNFIFD thatare DETGELNDAWSTVFIENHDQPRCVTRFGNTSNKLFWSRSAKMLALLQTTLTGTLFVYQGQEIGM cleaved TNVSPKWDISEYLDINTINYWNAFNETEHSDEEKAELLKIINLLARDNARTPVQWDSSENGGFGG offpost- KPWMRINDNYKDINVASQKEDPDSVLNFYRNAIKTRKHYSETLIFGRFEVQDYDNQEIFYYTKTS 
translationally) NKGQKKMAVVLNFTDREVEYPIPQGKLLLSNIANNITGKLQPYEGRLIEVN UniProt KB- Q9P8G8 (Q9P8G8_ PICAN) Saccharomyces SEQIDNO:322 MLLQAFLFLLAGFAAKISASMTNETSDRPLVHFTPNKGWMNDPNGLWYDAKEGKWHLYFQYN cerevisiae PNDTVWGLPLFWGHATSDDLTHWQDEPVAIAPKRKDSGAYSGSMVIDYNNTSGFFNDTIDPRQ SUC1 RCVAIWTYNTPESEEQYISYSLDGGYTFTEYQKNPVLAANSTQFRDPKVFWYEPSKKWIMTAAK (invertase1) SQDYKIEIYSSDDLKSWKLESAFANEGFLGYQYECPGLIEVPSEQDPSKSHWVMFISINPGAPAGG Unitprot SFNQYFVGSFNGHHFEAFDNQSRVVDFGKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPS Accession: NPWRSSMSLVRPFSLNTEYQANPETELINLKAEPILNISSAGPWSRFATNTTLTKANSYNVDLSNS P10594 TGTLEFELVYAVNTTQTISKSVFADLSLWFKGLEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKE NPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNILELYFNDGDVVSTNTYFMTTGNALGSVNM TTGVDNLFYIDKFQVREVK Kluyveromyces SEQIDNO:323 MLKLLSLMVPLASAAVIHRRDANISAIASEWNSTSNSSSSLSLNRPAVHYSPEEGWMNDPNGLW lactis YDAKEEDWHIYYQYYPDAPHWGLPLTWGHAVSKDLTVWDEQGVAFGPEFETAGAFSGSMVID INV1 YNNTSGFFNSSTDPRQRVVAIWTLDYSGSETQQLSYSHDGGYTFTEYSDNPVLDIDSDAFRDPKV (invertase) FWYQGEDSESEGNWVMTVAEADRFSVLIYSSPDLKNWTLESNFSREGYLGYNYECPGLVKVPY Unitprot VKNTTYASAPGSNITSSGPLHPNSTVSFSNSSSIAWNASSVPLNITLSNSTLVDETSQLEEVGYAW Accession: VMIVSFNPGSILGGSGTEYFIGDFNGTHFEPLDKQTRFLDLGKDYYALQTFFNTPNEVDVLGIAW Q9Y746 ASNWQYANQVPTDPWRSSMSLVRNFTITEYNINSNTTALVLNSQPVLDFTSLRKNGTSYTLENLT LNSSSHEVLEFEDPTGVFEFSLEYSVNFTGIHNWVFTDLSLYFQGDKDSDEYLRLGYEANSKQFF LDRGHSNIPFVQENPFFTQRLSVSNPPSSNSSTFDVYGIVDRNIIELYFNNGTVTSTNTFFFSTGNNI GSIIVKSGVDDVYEIESLKVNQFYVD Cyberlindnera SEQIDNO:324 MSLTKDASEDQEDIKSLTMNTSLVDSSIYRPLVHLTPPVGWMNDPNGLFYDSSESTYHVYYQYN jadinii PNDTIWGLPLYWGHATSDDLLTWDHHAPAIGPENDDEGIYSGSIVIDYDNTSGFFDDSTRPEQRI INV1 VAIYTNNLPDVETQDIAYSTDGGYTFEKYENNPVIDVNSTQFRDPKVIWYEETEQWVMTVAKSQ (invertase) EYKIQIYTSDNLKDWSLASNFSTKGYVGYQYECPGLFEATIENPKSGDPEKKWVMVLAINPGSPL Unitprot GGSINEYFVGDFNGTEFIPDDDATRFMDTGKDFYAFQAFFNAPENRSIGVAWSSNWQYSNQVPD Accession: PDGYRSSMSSIREYTLRYVSTNPESEQLILCQKPFFVNETDLKVVEEYKVSNSSLTVDHTFGSSFA O94224 NSNTTGLLDFNMTFTVNGTTDVTQKDSVTFELRIKSNQSDEAIALGYDYNNEQFYINRATESYFQ 
RTNQFFQERWSTYVQPLTITESGDKQYQLYGLVDNNILELYFNDGAFTSTNTFFLEKGKPSNVDI VASSSKEAYHRGPAD Oryza SEQIDNO:325 MELAVGAGGMRRSASHTSLSESDDFDLSRLLNKPRINVERQRSFDDRSLSDVSYSGGGHGGTRG sativa GFDGMYSPGGGLRSLVGTPASSALHSFEPHPIVGDAWEALRRSLVFFRGQPLGTIAAFDHASEEV japonica LNYDQVFVRDFVPSALAFLMNGEPEIVRHFLLKTLLLQGWEKKVDRFKLGEGAMPASFKVLHD (rice) SKKGVDTLHADFGESAIGRVAPVDSGFWWIILLRAYTKSTGDLTLAETPECQKGMRLILSLCLSE CINV1 GFDTFPTLLCADGCCMIDRRMGVYGYPIEIQALFFMALRCALQLLKHDNEGKEFVERIATRLHAL (invertase) SYHMRSYYWLDFQQLNDIYRYKTEEYSHTAVNKFNVIPDSIPDWLFDFMPCQGGFFIGNVSPAR Unitprot MDFRWFALGNMIAILSSLATPEQSTAIMDLIEERWEELIGEMPLKICYPAIENHEWRIVTGCDPKN Accession: TRWSYHNGGSWPVLLWLLTAACIKTGRPQIARRAIDLAERRLLKDGWPEYYDGKLGRYVGKQA Q69T31 RKFQTWSIAGYLVAKMMLEDPSHLGMISLEEDKAMKPVLKRSASWTN Arabidopsis SEQIDNO:326 MEGVGLRAVGSHCSLSEMDDLDLTRALDKPRLKIERKRSFDERSMSELSTGYSRHDGIHDSPRG thaliana RSVLDTPLSSARNSFEPHPMMAEAWEALRRSMVFFRGQPVGTLAAVDNTTDEVLNYDQVFVRD Alkaline/ FVPSALAFLMNGEPDIVKHFLLKTLQLQGWEKRVDRFKLGEGVMPASFKVLHDPIRETDNIVAD neutral FGESAIGRVAPVDSGFWWIILLRAYTKSTGDLTLSETPECQKGMKLILSLCLAEGFDTFPTLLCAD invertase GCSMIDRRMGVYGYPIEIQALFFMALRSALSMLKPDGDGREVIERIVKRLHALSFHMRNYFWLD CINV1 HQNLNDIYRFKTEEYSHTAVNKFNVMPDSIPEWVFDFMPLRGGYFVGNVGPAHMDFRWFALGN INVA CVSILSSLATPDQSMAIMDLLEHRWAELVGEMPLKICYPCLEGHEWRIVTGCDPKNTRWSYHNG UnitProt GSWPVLLWQLTAACIKTGRPQIARRAVDLIESRLHRDCWPEYYDGKLGRYVGKQARKYQTWSI Accession AGYLVAKMLLEDPSHIGMISLEEDKLMKPVIKRSASWPQL No.: Q9LQF2 Arabidopsis SEQIDNO:327 MSAIYLLRKISTKTPSRFHRSLFFSTFSKDSPPDLSRTTSIRHLSSSQRFVSSSIYCFPQSKILPNRFSE thaliana KTTGISVRQFSTSVETNLSDKSFERIHVQSDAILERIHKNEEEVETVSIGSEKVVREESEAEKEAWR Alkaline/ ILENAVVRYCGSPVGTVAANDPGDKMPLNYDQVFIRDFVPSALAFLLKGEGDIVRNFLLHTLQL neutral QSWEKTVDCYSPGQGLMPASFKVRTVALDENTTEEVLDPDFGESAIGRVAPVDSGLWWIILLRA invertase YGKITGDFSLQERIDVQTGIKLIMNLCLADGFDMFPTLLVTDGSCMIDRRMGIHGHPLEIQSLFYS A, ALRCSREMLSVNDSSKDLVRAINNRLSALSFHIREYYWVDIKKINEIYRYKTEEYSTDATNKFNIY mitochondrial PEQIPPWLMDWIPEQGGYLLGNLQPAHMDFRFFTLGNFWSIVSSLATPKQNEAILNLIEAKWDDII INVE 
GNMPLKICYPALEYDDWRIITGSDPKNTPWSYHNSGSWPTLLWQFTLACMKMGRPELAEKALA UnitProt VAEKRLLADRWPEYYDTRSGKFIGKQSRLYQTWTVAGFLTSKLLLANPEMASLLFWEEDYELL Accession DICACGLRKSDRKKCSRVAAKTQILVR No.: UnitProt Accession No.: Q9FXA8 Arabidopsis SEQIDNO:328 MAASETVLRVPLGSVSQSCYLASFFVNSTPNLSFKPVSRNRKTVRCTNSHEVSSVPKHSFHSSNS thaliana VLKGKKFVSTICKCQKHDVEESIRSTLLPSDGLSSELKSDLDEMPLPVNGSVSSNGNAQSVGTKSI Alkaline/ EDEAWDLLRQSVVFYCGSPIGTIAANDPNSTSVLNYDQVFIRDFIPSGIAFLLKGEYDIVRNFILYT neutral LQLQSWEKTMDCHSPGQGLMPCSFKVKTVPLDGDDSMTEEVLDPDFGEAAIGRVAPVDSGLW invertase WIILLRAYGKCTGDLSVQERVDVQTGIKMILKLCLADGFDMFPTLLVTDGSCMIDRRMGIHGHP E, LEIQALFYSALVCAREMLTPEDGSADLIRALNNRLVALNFHIREYYWLDLKKINEIYRYQTEEYS chloroplastic YDAVNKFNIYPDQIPSWLVDFMPNRGGYLIGNLQPAHMDFRFFTLGNLWSIVSSLASNDQSHAIL INVE DFIEAKWAELVADMPLKICYPAMEGEEWRIITGSDPKNTPWSYHNGGAWPTLLWQLTVASIKM UnitProt GRPELAEKAVELAERRISLDKWPEYYDTKRARFIGKQARLYQTWSIAGYLVAKLLLANPAAAKF Accession LTSEEDSDLRNAFSCMLSANPRRTRGPKKAQQPFIV No.: Q9FK88 Oryza SEQIDNO:329 MGVLGSRVAWAWLVQLLLLQQLAGASHVVYDDLELQAAATTADGVPPSIVDSELRTGYHFQPP sativa KNWINDPNAPMYYKGWYHLFYQYNPKGAVWGNIVWAHSVSRDLINWVALKPAIEPSIRADKY japonica GCWSGSATMMADGTPVIMYTGVNRPDVNYQVQNVALPRNGSDPLLREWVKPGHNPVIVPEGGI (rice) NATQFRDPTTAWRGADGHWRLLVGSLAGQSRGVAYVYRSRDFRRWTRAAQPLHSAPTGMWE Beta- CPDFYPVTADGRREGVDTSSAVVDAAASARVKYVLKNSLDLRRYDYYTVGTYDRKAERYVPD fructo- DPAGDEHHIRYDYGNFYASKTFYDPAKRRRILWGWANESDTAADDVAKGWAGIQAIPRKVWL furanosidase, DPSGKQLLQWPIEEVERLRGKWPVILKDRVVKPGEHVEVTGLQTAQADVEVSFEVGSLEAAERL insoluble DPAMAYDAQRLCSARGADARGGVGPFGLWVLASAGLEEKTAVFFRVFRPAARGGGAGKPVVL isoenzyme2 MCTDPTKSSRNPNMYQPTFAGFVDTDITNGKISLRSLIDRSVVESFGAGGKACILSRVYPSLAIGK CIN2 NARLYVFNNGKAEIKVSQLTAWEMKKPVMMNGA Unit Prot Accession No.: Q0JDC5 Rattus SEQIDNO:330 MAKKKFSALEISLIVLFIIVTAIAIALVTVLATKVPAVEEIKSPTPTSNSTPTSTPTSTSTPTSTSTPSP norvegicus GKCPPEQGEPINERINCIPEQHPTKAICEERGCCWRPWNNTVIPWCFFADNHGYNAESITNENAGL (rat) KATLNRIPSPTLFGEDIKSVILTTQTQTGNRFRFKITDPNNKRYEVPHQFVKEETGIPAADTLYDVQ Sucrase- 
VSENPFSIKVIRKSNNKVLCDTSVGPLLYSNQYLQISTRLPSEYIYGFGGHIHKRFRHDLYWKTWP isomaltase, IFTRDEIPGDNNHNLYGHQTFFMGIGDTSGKSYGVFLMNSNAMEVFIQPTPIITYRVTGGILDFYIF intestinal LGDTPEQVVQQYQEVHWRPAMPAYWNLGFQLSRWNYGSLDTVSEVVRRNREAGIPYDAQVTD SiGene IDYMEDHKEFTYDRVKFNGLPEFAQDLHNHGKYIIILDPAISINKRANGAEYQTYVRGNEKNVW UnitProt VNESDGTTPLIGEVWPGLTVYPDFTNPQTIEWWANECNLFHQQVEYDGLWIDMNEVSSFIQGSL Accession NLKGVLLIVLNYPPFTPGILDKVMYSKTLCMDAVQHWGKQYDVHSLYGYSMAIATEQAVERVF No.: PNKRSFILTRSTFGGSGRHANHWLGDNTASWEQMEWSITGMLEFGIFGMPLVGATSCGFLADTT P23739 EELCRRWMQLGAFYPFSRNHNAEGYMEQDPAYFGQDSSRHYLTIRYTLLPFLYTLFYRAHMFGE TVARPFLYEFYDDTNSWIEDTQFLWGPALLITPVLRPGVENVSAYIPNATWYDYETGIKRPWRKE RINMYLPGDKIGLHLRGGYIIPTQEPDVTTTASRKNPLGLIVALDDNQAAKGELFWDDGESKDSI EKKMYILYTFSVSNNELVLNCTHSSYAEGTSLAFKTIKVLGLREDVRSITVGENDQQMATHTNFT FDSANKILSITALNFNLAGSFIVRWCRTFSDNEKFTCYPDVGTATEGTCTQRGCLWQPVSGLSNV PPYYFPPENNPYTLTSIQPLPTGITAELQLNPPNARIKLPSNPISTLRVGVKYHPNDMLQFKIYDAQ HKRYEVPVPLNIPDTPTSSNERLYDVEIKENPFGIQVRRRSSGKLIWDSRLPGFGFNDQFIQISTRLP SNYLYGFGEVEHTAFKRDLNWHTWGMFTRDQPPGYKLNSYGFHPYYMALENEGNAHGVLLLN SNGMDVTFQPTPALTYRTIGGILDFYMFLGPTPEIATRQYHEVIGFPVMPPYWALGFQLCRYGYR NTSEIEQLYNDMVAANIPYDVQYTDINYMERQLDFTIGERFKTLPEFVDRIRKDGMKYIVILAPAI SGNETQPYPAFERGIQKDVFVKWPNTNDICWPKVWPDLPNVTIDETITEDEAVNASRAHVAFPDF FRNSTLEWWAREIYDFYNEKMKFDGLWIDMNEPSSFGIQMGGKVLNECRRMMTLNYPPVFSPE LRVKEGEGASISEAMCMETEHILIDGSSVLQYDVHNLYGWSQVKPTLDALQNTTGLRGIVISRST YPTTGRWGGHWLGDNYTTWDNLEKSLIGMLELNLFGIPYIGADICGVFHDSGYPSLYFVGIQVG AFYPYPRESPTINFTRSQDPVSWMKLLLQMSKKVLEIRYTLLPYFYTQMHEAHAHGGTVIRPLM HEFFDDKETWEIYKQFLWGPAFMVTPVVEPFRTSVTGYVPKARWFDYHTGADIKLKGILHTFSA PFDTINLHVRGGYILPCQEPARNTHLSRQNYMKLIVAADDNQMAQGTLFGDDGESIDTYERGQY TSIQFNLNQTTLTSTVLANGYKNKQEMRLGSIHIWGKGTLRISNANLVYGGRKHQPPFTQEEAKE TLIFDLKNMNVTLDEPIQITWS Oryctolagus SEQIDNO:331 MAKRKFSGLEITLIVLFVIVFILAIALIAVLATKTPAVEEVNPSSSTPTTTSTTTSTSGSVSCPSELNE cuniculus VVNERINCIPEQSPTQAICAQRNCCWRPWNNSDIPWCFFVDNHGYNVEGMTTTSTGLEARLNRK (Rabbit) STPTLFGNDINNVLLTTESQTANRLRFKLTDPNNKRYEVPHQFVTEFAGPAATETLYDVQVTENP Sucrase- 
FSIKVIRKSNNRILFDSSIGPLVYSDQYLQISTRLPSEYMYGFGEHVHKRFRHDLYWKTWPIFTRD isomaltase, QHTDDNNNNLYGHQTFFMCIEDTTGKSFGVFLMNSNAMEIFIQPTPIVTYRVIGGILDFYIFLGDT intestinal PEQVVQQYQELIGRPAMPAYWSLGFQLSRWNYNSLDVVKEVVRRNREALIPFDTQVSDIDYME SiGene DKKDFTYDRVAYNGLPDFVQDLHDHGQKYVIILDPAISINRRASGEAYESYDRGNAQNVWVNE UnitProt SDGTTPIVGEVWPGDTVYPDFTSPNCIEWWANECNIFHQEVNYDGLWIDMNEVSSFVQGSNKGC Accession NDNTLNYPPYIPDIVDKLMYSKTLCMDSVQYWGKQYDVHSLYGYSMAIATERAVERVFPNKRS No.: FILTRSTFAGSGRHAAHWLGDNTATWEQMEWSITGMLEFGLFGMPLVGADICGFLAETTEELCR P07768 RWMQLGAFYPFSRNHNADGFEHQDPAFFGQDSLLVKSSRHYLNIRYTLLPFLYTLFYKAHAFGE TVARPVLHEFYEDTNSWVEDREFLWGPALLITPVLTQGAETVSAYIPDAVWYDYETGAKRPWR KQRVEMSLPADKIGLHLRGGYIIPIQQPAVTTTASRMNPLGLIIALNDDNTAVGDFFWDDGETKD TVQNDNYILYTFAVSNNNLNITCTHELYSEGTTLAFQTIKILGVTETVTQVTVAENNQSMSTHSN FTYDPSNQVLLIENLNFNLGRNFRVQWDQTFLESEKITCYPDADIATQEKCTQRGCIWDTNTVNP RAPECYFPKTDNPYSVSSTQYSPTGITADLQLNPTRTRITLPSEPITNLRVEVKYHKNDMVQFKIF DPQNKRYEVPVPLDIPATPTSTQENRLYDVEIKENPFGIQIRRRSTGKVIWDSCLPGFAFNDQFIQI STRLPSEYIYGFGEAEHTAFKRDLNWHTWGMFTRDQPPGYKLNSYGFHPYYMALEDEGNAHGV LLLNSNAMDVTFMPTPALTYRVIGGILDFYMFLGPTPEVATQQYHEVIGHPVMPPYWSLGFQLC RYGYRNTSEIIELYEGMVAADIPYDVQYTDIDYMERQLDFTIDENFRELPQFVDRIRGEGMRYIIIL DPAISGNETRPYPAFDRGEAKDVFVKWPNTSDICWAKVWPDLPNITIDESLTEDEAVNASRAHA AFPDFFRNSTAEWWTREILDFYNNYMKFDGLWIDMNEPSSFVNGTTTNVCRNTELNYPPYFPEL TKRTDGLHFRTMCMETEHILSDGSSVLHYDVHNLYGWSQAKPTYDALQKTTGKRGIVISRSTYP TAGRWAGHWLGDNYARWDNMDKSIIGMMEFSLFGISYTGADICGFFNDSEYHLCTRWTQLGAF YPFARNHNIQFTRRQDPVSWNQTFVEMTRNVLNIRYTLLPYFYTQLHEIHAHGGTVIRPLMHEFF DDRTTWDIFLQFLWGPAFMVTPVLEPYTTVVRGYVPNARWFDYHTGEDIGIRGQVQDLTLLMN AINLHVRGGHILPCQEPARTTFLSRQKYMKLIVAADDNHMAQGSLFWDDGDTIDTYERDLYLSV QFNLNKTTLTSTLLKTGYINKTEIRLGYVHVWGIGNTLINEVNLMYNEINYPLIFNQTQAQEILNI DLTAHEVTLDDPIEISWS Homo SEQIDNO:343 MARKKFSGLEISLIVLFVIVTIIALALIVVLATKTPAVDEISDSTSTPATTRVTTNPSDSGKCPNVLN sapiens DPVNVRINCIPEQFPTEGICAQRGCCWRPWNDSLIPWCFFVDNHGYNVQDMTTTSIGVEAKLNRI Sucrase- PSPTLFGNDINSVLFTTQNQTPNRFRFKITDPNNRRYEVPHQYVKEFTGPTVSDTLYDVKVAQNP isomaltase, 
FSIQVIRKSNGKTLFDTSIGPLVYSDQYLQISTRLPSDYIYGIGEQVHKRFRHDLSWKTWPIFTRDQ intestinal LPGDNNNNLYGHQTFFMCIEDTSGKSFGVFLMNSNAMEIFIQPTPIVTYRVTGGILDFYILLGDTP SiGene EQVVQQYQQLVGLPAMPAYWNLGFQLSRWNYKSLDVVKEVVRRNREAGIPFDTQVTDIDYME UnitProt DKKDFTYDQVAFNGLPQFVQDLHDHGQKYVIILDPAISIGRRANGTTYATYERGNTQHVWINES Accession DGSTPIIGEVWPGLTVYPDFTNPNCIDWWANECSIFHQEVQYDGLWIDMNEVSSFIQGSTKGCNV No.: NKLNYPPFTPDILDKLMYSKTICMDAVQNWGKQYDVHSLYGYSMAIATEQAVQKVFPNKRSFIL P14410 TRSTFAGSGRHAAHWLGDNTASWEQMEWSITGMLEFSLFGIPLVGADICGFVAETTEELCRRW MQLGAFYPFSRNHNSDGYEHQDPAFFGQNSLLVKSSRQYLTIRYTLLPFLYTLFYKAHVFGETVA RPVLHEFYEDTNSWIEDTEFLWGPALLITPVLKQGADTVSAYIPDAIWYDYESGAKRPWRKQRV DMYLPADKIGLHLRGGYIIPIQEPDVTTTASRKNPLGLIVALGENNTAKGDFFWDDGETKDTIQN GNYILYTFSVSNNTLDIVCTHSSYQEGTTLAFQTVKILGLTDSVTEVRVAENNQPMNAHSNFTYD ASNQVLLIADLKLNLGRNFSVQWNQIFSENERFNCYPDADLATEQKCTQRGCVWRTGSSLSKAP ECYFPRQDNSYSVNSARYSSMGITADLQLNTANARIKLPSDPISTLRVEVKYHKNDMLQFKIYDP QKKRYEVPVPLNIPTTPISTYEDRLYDVEIKENPFGIQIRRRSSGRVIWDSWLPGFAFNDQFIQISTR LPSEYIYGFGEVEHTAFKRDLNWNTWGMFTRDQPPGYKLNSYGFHPYYMALEEEGNAHGVFLL NSNAMDVTFQPTPALTYRTVGGILDFYMFLGPTPEVATKQYHEVIGHPVMPAYWALGFQLCRY GYANTSEVRELYDAMVAANIPYDVQYTDIDYMERQLDFTIGEAFQDLPQFVDKIRGEGMRYIIIL DPAISGNETKTYPAFERGQQNDVFVKWPNTNDICWAKVWPDLPNITIDKTLTEDEAVNASRAHV AFPDFFRTSTAEWWAREIVDFYNEKMKFDGLWIDMNEPSSFVNGTTTNQCRNDELNYPPYFPEL TKRTDGLHFRTICMEAEQILSDGTSVLHYDVHNLYGWSQMKPTHDALQKTTGKRGIVISRSTYP TSGRWGGHWLGDNYARWDNMDKSIIGMMEFSLFGMSYTGADICGFFNNSEYHLCTRWMQLG AFYPYSRNHNIANTRRQDPASWNETFAEMSRNILNIRYTLLPYFYTQMHEIHANGGTVIRPLLHE FFDEKPTWDIFKQFLWGPAFMVTPVLEPYVQTVNAYVPNARWFDYHTGKDIGVRGQFQTFNAS YDTINLHVRGGHILPCQEPAQNTFYSRQKHMKLIVAADDNQMAQGSLFWDDGESIDTYERDLYL SVQFNLNQTTLTSTILKRGYINKSETRLGSLHVWGKGTTPVNAVTLTYNGNKNSLPFNEDTTNMI LRIDLTTHNVTLEEPIEINWS

    TABLE-US-00003 TABLE 3 Linkers
    Sequence Info | SEQ ID NO: | Amino acid sequence
    N-terminal addition EAEA | SEQ ID NO: 19 | EAEA
    GGGS linker | SEQ ID NO: 20 | GGGGS
    GSS linker | (not listed) | GSS
    A rigid linker that forms 4 turns of an alpha helix | SEQ ID NO: 22 | EAAAREAAAREAAAREAAAR
    Full linker | SEQ ID NO: 23 | GSSGSSGSSGSSGSSGSSGSSGSSEAAAREAAAREAAAREAAARGGGGSGGGGSGGGGS
    A flexible GS linker with higher S content | SEQ ID NO: 24 | GSSGSSGSSGSSGSSGSSGSSGSS
    A flexible GS linker with much higher G content | SEQ ID NO: 25 | GGGGSGGGGSGGGGS
    (flex linkers) | SEQ ID NO: 339 | Nucleotide sequence: GGTTCATCAGGGTCCTCAGGATCATCCGGTAGTAGTGGTTCATCCGGTTCATCCGGATCAAGTGGCTCCTCTGAAGCTGCAGCAAGGGAGGCTGCAGCCCGTGAGGCAGCCGCTAGAGAAGCCGCCGCTAGGGGTGGTGGCGGCTCTGGCGGAGGCGGTTCCGGTGGCGGAGGCTCT
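
    The relationship among the linker sequences listed in Table 3 may be checked with the following non-limiting sketch. It assumes the standard genetic code and observes that the full linker (SEQ ID NO: 23) reads as the flexible GSS-rich linker (SEQ ID NO: 24), followed by the rigid linker (SEQ ID NO: 22), followed by the flexible G-rich linker (SEQ ID NO: 25), and that the nucleotide sequence of SEQ ID NO: 339 appears to encode that full linker. The helper `translate` and the variable names are hypothetical and are introduced only for illustration.

```python
# Standard genetic code, with codons ordered by base in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, read in frame from its first base."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Linker building blocks as listed in Table 3.
FLEXIBLE_GSS = "GSS" * 8    # SEQ ID NO: 24
RIGID_HELIX = "EAAAR" * 4   # SEQ ID NO: 22
FLEXIBLE_G4S = "GGGGS" * 3  # SEQ ID NO: 25

# The full linker (SEQ ID NO: 23) read as a concatenation of the three blocks.
FULL_LINKER = FLEXIBLE_GSS + RIGID_HELIX + FLEXIBLE_G4S

# Nucleotide sequence listed as SEQ ID NO: 339.
FLEX_LINKER_DNA = (
    "GGTTCATCAGGGTCCTCAGGATCATCCGGTA"
    "GTAGTGGTTCATCCGGTTCATCCGGATCAAG"
    "TGGCTCCTCTGAAGCTGCAGCAAGGGAGGCT"
    "GCAGCCCGTGAGGCAGCCGCTAGAGAAGCCG"
    "CCGCTAGGGGTGGTGGCGGCTCTGGCGGAGG"
    "CGGTTCCGGTGGCGGAGGCTCT"
)

encoded_linker = translate(FLEX_LINKER_DNA)
print(encoded_linker == FULL_LINKER)
```

    Expressing the full linker as a concatenation of named blocks makes it straightforward to substitute a different flexible or rigid segment when designing a fusion protein.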

    TABLE-US-00004 TABLE4 Promoters Sequence SEQID Info NO: AminoAcidsequence AOX1 SEQIDNO:26 GATCTAACATCCAAAGACGAAAGGTTGAATGAAACCTTTTTGCCATCCGACATCCACA promoter GGTCCATTCTCACACATAAGTGCCAAACGCAACAGGAGGGGATACACTAGCAGCAGA CCGTTGCAAACGCAGGACCTCCACTCCTCTTCTCCTCAACACCCACTTTTGCCATCGAA AAACCAGCCCAGTTATTGGGCTTGATTGGAGCTCGCTCATTCCAATTCCTTCTATTAGG CTACTAACACCATGACTTTATTAGCCTGTCTATCCTGGCCCCCCTGGCGAGGTTCATGT TTGTTTATTTCCGAATGCAACAAGCTCCGCATTACACCCGAACATCACTCCAGATGAG GGCTTTCTGAGTGTGGGGTCAAATAGTTTCATGTTCCCCAAATGGCCCAAAACTGACA GTTTAAACGCTGTCTTGGAACCTAATATGACAAAAGCGTGATCTCATCCAAGATGAAC TAAGTTTGGTTCGTTGAAATGCTAACGGCCAGTTGGTCAAAAAGAAACTTCCAAAAGT CGGCATACCGTTTGTCTTGTTTGGTATTGATTGACGAATGCTCAAAAATAATCTCATTA ATGCTTAGCGCAGTCTCTCTATCGCTTCTGAACCCCGGTGCACCTGTGCCGAAACGCA AATGGGGAAACACCCGCTTTTTGGATGATTATGCATTGTCTCCACATTGTATGCTTCCA AGATTCTGGTGGGAATACTGCTGATAGCCTAACGTTCATGATCAAAATTTAACTGTTC TAACCCCTACTTGACAGCAATATATAAACAGAAGGAAGCTGCCCTGTCTTAAACCTTT TTTTTTATCATCATTATTAGCTTACTTTCATAATTGCGACTGGTTCCAATTGACAAGCT TTTGATTTTAACGACTTTTAACGACAACTTGAGAAGATCAAAAAACAACTAATTATTG GATCCCGA DAK2 SEQIDNO:27 AAATAAGCATGTTTGTTTCAGATCAAAGATTAGCGTTTCAAAGTTGTGGAAAAGTGAC promoter CATGCAACAATATGCAACACATTCGGATTATCTGATAAGTTTCAAAGCTACTAAGTAA GCCCGTTTCAAGTCTCCAGACCGACATCTGCCATCCAGTGATTTTCTTAGTCCTGAAA AATACGATGTGTAAACATAAACCACAAAGATCGGCCTCCGAGGTTGAACCCTTACGA AAGAGACATCTGGTAGCGCCAATGCCAAAAAAAAATCACACCAGAAGGACAATTCCC TTCCCCCCCAGCCCATTAAAGCTTACCATTTCCTATTCCAATACGTTCCATAGAGGGCA TCGCTCGGCTCATTTTCGCGTGGGTCATACTAGAGCGGCTAGCTAGTCGGCTGTTTGA GCTCTCTAATCGAGGGGTAAGGATGTCTAATATGTCATAATGGCTCACTATATAAAGA ACCCGCTTGCTCAACCTTCGACTCCTTTCCCGATCCTTTGCTTGTTGCTTCTTCTTTTAT AACAGGAAACAAAGGAATTTATACACTTTAAGAATT PEX11 SEQIDNO:28 CTTCCCCATTTCACTGACAGTTTGTAGAAATAGGGCAACAATTGATGCAAATCGATTT promoter TCAACGCATTGGTTTTGATAGCATTGATGATCTTGGAGCTGTAAAAGTCCGGCTGGAT AAGCTCAATGAAATAGGTTGGTTGATCTGGATCTTCTTTTGGGTCATTTTGTTCGCTCT GTATTTCACAAATTGCCAGAATCTCTGCCAACCACAGTGGTAGGTCCAACTTGGTGTT CTGAATCACAGGCTTCCCCGGGTTGTTCTCTAAATAACCGAGGCCCGGCACAGAAATC 
GTAAACCGACACGGTATCTTTTGTCCGTCCGCCAGTATCTCATCAAGGTCGTAGTAGC CCATGATGAGTATCAAAGGGGATTTGGTTATGCGATGCAACGAGAGATTGTTTATCCC AGATGCTGATGTAAAAACCTTAACCAGCGTGACAGTAGAAATAAGACACGTTAAAAT TACCCGCGCTTCCCTAACAATTGGCTCTGCCTTTCGGCAAGTTTCTAACTGCCCTCCCC TCTCACATGCACCACGAACTTACCGTTCGCTCCTAGCAGAACCACCCCAAAGTTTAAT CAGGACCGCATTTTAGCCTATTGCTGTAGAACCCCACAACATAACCTGGTCCAGAGCC AGCCCTTTATATATGGTAAATCCCGTTTGAACTTCGAAGTGGAATCGGAATTTTTACA TCAAAGAAACTGATACTGAAACTTTTGGCTTCGACTTGGACTTTCTCTTAATC FLD1 SEQIDNO:29 AAATCAGCCATTAATCTCACCTCAGTTTTTGAATCAGTAGAATTTTCAATGAAACAAA promoter CGGTTGGTATATTATTTGATAGGGTAGCCAAATTTCCAAAAATGAACTTTTCATCAGG TAATATCTTGAATACCGTAATGTAGTGACTATTGGAAGAAACTGCTATCAAATTATAT TTCGGATAGAAATCCAAACCCCAGACTGATCTCTTGAGTCTCAACTCTAAGTCAGCCG CGACTCTAATTATCTGTGGATTAGGAGTTAGTGTGGACAAAGCATCAGTATAGTATAA CTTTACGGTTCCATTATCAGACGCTATTGCAAGAACTTCCTTTCCATTGATCTCTCCAA TTCGACAGTAATTGATATCATAAGGTAGGTCTGGAAACACACTGGCGCTTGTATCCCA TTCTGCAGGAATTTCTGGAACGGTGGTAATGGTAGTTATCCAACGGAGTTGGGGTAGT TGGTATATCTGGATATGCCGCCTATAGGATAAAAACAGGAGAGAGTGAACCTTGCTT ACGGCTACTAGATTGTTCTTGTACTCGGAATTGTCGTTATCGGAAACTAGACTAATCT CATCTGTGTGTTGCAGTACTATTGAGTCGTTGTAGTATCTACCAGGAGGGCATTCCAT GAACTAGTGAGACAAATGAGTTGGATTTTCTCAATAGACATATGCAAGAATGCTACA CAACGGATGTCGCACTCTTTTTCTTAGTTGATAATATCATCCAATCAGAAGACACGGG CTAGAAGGACTTGCTCCCGAAGGATAATCCACTGCTACTATCTCCCTTCCTCACATAT AGTCTTGCAGGGCTCATGCCCCTTTCTCCTTCGAACTGCCCGATGAGGAAGTCTTTAG CCTATCAAGGAATTCGGGACCATCATCAATTTTTAGAGCCTTACCTGATCGCAATCAG GATTTCACTACTCATATAAATACATCACTCAAACTCCAACTTTGCTTGTTCATACAATT CTTGATATTCACAGGATC FGH1 SEQIDNO:30 GTGAATTTGTCACGGAATTGACCAAGAGGTCAGACGATCCTGTATCCCATTGAGCCGT promoter TATGCTTTGTGGGGGAAACCCTATTTCTATCGTACTAAGAAAACCAATGGTGAACTCA TATTCGGTATCAATGGCGACGATTCCAGCATAGCCTGTAGACAGTAACAACACTAGG GCAACAGCAACTAACATATCTTCATTGATGAAACGTTGTGATCGGTGTGACTTTTATA GTAAAAGCTACAACTGTTTGAAATACCAAGATATCATTGTGAATGGCTCAAAAGGGT AATACATCTGAAAAACCTGAAGTGTGGAAAATTCCGATGGAGCCAACTCATGATAAC GCAGAAGTCCCATTTTGCCATCTTCTCTTGGTATGAAACGGTAGAAAATGATCCGAGT 
ATGCCAATTGATACTCTTGATTCATGCCCTATAGTTTGCGTAGGGTTTAATTGATCTCC TGGTCTATCGATCTGGGACGCAATGTAGACCCCATTAGTGGAAACACTGAAAGGGAT CCAACACTCTAGGCGGACCCGCTCACAGTCATTTCAGGACAATCACCACAGGAATCA ACTACTTCTCCCAGTCTTCCTTGCGTGAAGCTTCAAGCCTACAACATAACACTTCTTAC TTAATCTTTGATTCTCGAATTGTTTACCCAATCTTGACAACTTAGCCTAAGCAATACTC TGGGGTTATATATAGCAATTGCTCTTCCTCGCTGTAGCGTTCATTCCATCTTTCTAGAA TTCGT DAS2 SEQIDNO:31 CCTGTTGATAAGACGCATTCTAGAGTTGTTTCATGAAAGGGTTACGGGTGTTGATTGG promoter TTTGAGATATGCCAGAGGACAGATCAATCTGTGGTTTGCTAAACTGGAAGTCTGGTAA GGACTCTAGCAAGTCCGTTACTCAAAAAGTCATACCAAGTAAGATTACGTAACACCTG GGCATGACTTTCTAAGTTAGCAAGTCACCAAGAGGGTCCTATTTAACGTTTGGCGGTA TCTGAAACACAAGACTTGCCTATCCCATAGTACATCATATTACCTGTCAAGCTATGCT ACCCCACAGAAATACCCCAAAAGTTGAAGTGAAAAAATGAAAATTACTGGTAACTTC ACCCCATAACAAACTTAATAATTTCTGTAGCCAATGAAAGTAAACCCCATTCAATGTT CCGAGATTTAGTATACTTGCCCCTATAAGAAACGAAGGATTTCAGCTTCCTTACCCCA TGAACAGAAATCTTCCATTTACCCCCCACTGGAGAGATCCGCCCAAACGAACAGATA ATAGAAAAAAGAAATTCGGACAAATAGAACACTTTCTCAGCCAATTAAAGTCATTCC ATGCACTCCCTTTAGCTGCCGTTCCATCCCTTTGTTGAGCAACACCATCGTTAGCCAGT ACGAAAGAGGAAACTTAACCGATACCTTGGAGAAATCTAAGGCGCGAATGAGTTTAG CCTAGATATCCTTAGTGAAGGGTTGTTCCGATACTTCTCCACATTCAGTCATAGATGG GCAGCTTTGTTATCATGAAGAGACGGAAACGGGCATTAAGGGTTAACCGCCAAATTA TATAAAGACAACATGTCCCCAGTTTAAAGTTTTTCTTTCCTATTCTTGTATCCTGAGTG ACCGTTGTGTTTAATATAACAAGTTCGTTTTAACTTAAGACCAAAACCAGTTACAACA AATTATAACCCCTCTAAACACTAAAGTTCACTCTTATCAAACTATCAAACATCAAAAG AATTCGCG CAT1 SEQIDNO:32 TAATCGAACTCCGAATGCGGTTCTCCTGTAACCTTAATTGTAGCATAGATCACTTAAA promoter TAAACTCATGGCCTGACATCTGTACACGTTCTTATTGGTCTTTTAGCAATCTTGAAGTC TTTCTATTGTTCCGGTCGGCATTACCTAATAAATTCGAATCGAGATTGCTAGTACCTGA TATCATATGAAGTAATCATCACATGCAAGTTCCATGATACCCTCTACTAATGGAATTG AACAAAGTTTAAGCTTCTCGCACGAGACCGAATCCATACTATGCACCCCTCAAAGTTG GGATTAGTCAGGAAAGCTGAGCAATTAACTTCCCTCGATTGGCCTGGACTTTTCGCTT AGCCTGCCGCAATCGGTAAGTTTCATTATCCCAGCGGGGTGATAGCCTCTGTTGCTCA TCAGGCCAAAATCATATATAAGCTGTAGACCCAGCACTTCAATTACTTGAAATTCACC ATAACACTTGCTCTAGTCAAGACTTACAATTAAA MDH3 SEQIDNO:33 
TAGCTTGGGTAGGACTTGACAAGTACGGCTTCCGTGGTCATACCAAACGCCTTTGTTA promoter CCGTTGGCTATACCTAATGACCAAGGCATTTGTGGATTATAACGGTATCGTAGTTGAA AAATATGACGTAACCACTGGTACTAGCCCCCACAAGGTTGATGCTGAATACGGGAAT CAAGGTGCCGATTTTAAAGGAGTAGCCACTGAAGGGTTTGGCTGGGTCAATGCCTCTT TTATTTTGGGATTAACCTACTTAGATGTCCAAGGCATCCGTGCGATAGGCGCCGTTAC GTCCCCTGATGTATTTTTCAGGAAGCTCAAACCTTGGGAACGCGCAAGTTATGGCCTA AGGCCATGTAACGAGATAGTCAAGTCAAACTAGAAGTATACGGTTTCCCCGCAGAAA TAGCAGAAATAGGCGACAAATACATACAACATTTTCATTGTGATAGGGGGCGGCGGT TCCTAGGAGGGACAACCCCCAGAAACCTTGTAGACTACGTTTTCACGACGATGGGTTA TTACTGTAAAGGAAGAATATACTACCCACCAGTTGAATGTTTGAACGGATCAAAGGTC GAAGGGAGTACACGGCCCAACCAACGTAGCTACCGGAGAAAGCAAGACTTTCCCAAA CCAAATAGCTCCGGGTTTCTTCTCCGGCAACCCGTCAGTTTTTGTGTGGCCGGACAAA AATTCGCACCCTCAGTCTAATTGAAAGGTCGGGCTCCGAGCTCTAGGCGTTTGCGCAT GTAATATTGCATCCCCTCCCATAGATAATACTGCGCGAACACAGGGTGCAAATTATGA TGACCACACATGCCAGTGACCAAAACAGTTTTTTAGTCTTTAAAAACCCTCGGAACTT CTGAGTATATAAAGGCTTCTCATTTCCTACAAGCAAACAAAGAAGAAACTTCCACTTT CTAACTTTTTATCTATAGACTTTAGAGTTACAACCAACGAACAATAACAAA HAC1 SEQIDNO:34 TGAAGCTTATCTGCTGAGCAAGTTGTTTGACCAAACTTGAGTCAACAGTGGTTAACTA promoter TATCCTCTATTATTTTAGATGGGAGCACATCAAGTGTACGGGAACAATGCAATCGACA ACCTGTAGCCTGACATACATAGCCATCTTGAATTGACAAAACTTAGAATGTCTTGAAT GTGATAGATATGAGTTCCCAAAAATCTCTTTTACGATTTCCCAGTTGCGGTGTACTATT ACACAGAGGATATCATAGCAGACTTACAATCCTCAGGCATAAAACGAGCTTTCTTATC AAAGTGTATTCAAATGGACCATTTGATTGCACCAAGGCATTAGCCCCAAACCATACCA CACAGTAACTTGATATTCTCAGCATGCATGGAAATTCCACTCATAACGCGCTATTCAC CGCGAATACTTATCTATGAAACTGGGTTCTTTAGTATTCTTTGCCAAATTTCACCGATT AGAAATTATTAGGTAATATAATTTCTTTGGGGAACCCCTTCCCGTTACGCCCGCTGCG GCTTTGTGGTTCTTTTCCAGTCTTGAGCAAATTACATCTGGTCTAGACAGTTCTTCCGT GCCCCAGTATGCGAGCGCAAACTTTCAATCAAACCTCGTAGCAAATTGGTACTTGAAC TTCGTATTTAACCGCTATTAAATGTACTGACTCTTACATTATGAAAAATTTTGATAAAG ATTTTATATTTCATCTCAGTTAATCTCCTAATAATAATAGTCTGCATAACTCAAACGGT ACTTCCTTTTCGGAACGCGAAGAGTAGTCTCTATGTCATTCTCACACTATCCGCAGCG CAATAGAGAACGAGCATGTTACCCGACTCATCCCTTGTCGATTCGGAAACGATTTATA AATACAATTAGATCGCCACCGATCTTCTTTTGTCAATATTATAAAAATAGTACAGATT 
TTCCTTAGTCGAATCAGATCGCAGAAA BiP SEQIDNO:35 AGATCTGAGGGTGTATACGATGTATCGTGCCGAACACATGCACTTGACGGCACAGCA promoter AATGGTATTCAAGAAGACCACTTTAGAATGGGAGTTAATAGGGATGGTTTCATGGAG GTTAAAACACTTCAAGGAGGCATCTGAAGCATTCAAGTATGCACTAGGTCTGAGGTTT TCGGTCAAGGCATGCAAGAAATTAATTGTATTCTATCTGAACGAACGCTCCAGAATGA ACCAGCCAGAAACCTCAATTGCCCTCAACAACTTAAATCAATCCACATTATCCATCCA AGAGATTCTCAAGTATCGTTCGTTCCTCGATATCAACCTAATTTCAAACTTGGTCAAA CTAGGAGTTTGGAATCACCGCTGGTATGCTGAGTTTTCTCCAAAACTCATAGAAAGCC TTGCGGTTGTTGTGGAGAACGGAGGGCTTATCAAGGTAGAAAACGAGGTTAAGGCTA CCTATTTCGATTCACAAGATGGAGTTTACGACTTGATGAACGAGGTATTCAAGTTCAT GAAGCATTACGATTATCCTGGGACTGACAACTAAGAGCTCCTAGTGAAGACTTGAGA TGGACATGATAAACAATTATAGTGAAAATAGAAACCATAATACAATATTCTAATAGA GGAACCGTTTACCTGTGGTTCCTATTGTGGCCTACTGTTACTAGCTAGTGTAATACACC CTTGCCTCAGCTTTGCAAGTTGACAACTCAGCCAAATGATCTTTGAATGCGCGAAACC TCAAGGTCCATCGAATTTTCTCGAATTTTCAGTGTTTTCATACAGCGTGTCATCTTCTT TCGCGTACTTATTAAAATCGTACCCAGATCCCTTCTTCTTCCTTAATTTCAATTCCAAC ACTCAAGA RAD30 SEQIDNO:36 AGATCTTGCAAAATACCTTTCCAGCTTTCCAGCTTCCTAGCACTCATCTTGAAGATATC promoter AAATATTCTCCATTCAAACCAACATCAAAAAATAGAATAATTATAATCAGTTTGAAGA GCAAGAGTAATTTTAAAGGAAACACATTCATGGTCAGCTAGAAGGTTGACTGAAGAG TCGCAAGATATCTGAGAATAAAAAAGAGCATAGCTAACAAGATGAGTAAACACGGCA AACAGATTTAGGAACAGGTGAAGGGTTTCTGGCTCTTCAATGTATATCCTGCTAGCCA CCCATTCAGAAATAACACAAAGTAGGACCCTACTGAAAAATAAATTTAATACATCTTC ATCCTCTCATTAAACCACCGACCACTCAAACCATACCAGCCTTGTCCAATTCCATGCA TCGTGCTATCCGTCAGAATTTTCAGTGTTAATCGAATCGGTCATTATAGCTCCGTCTGG GGCGACAACTTGTCATCACAGAATAGCACAATTATGCGTTGGAATCGTCAAAAAATC ACCTCCAGGTCTGTATACATACAGAACTGGTTGTAACGACAACCTTGTTTGATTGAGG TGACTGGAAGGTGGAAAGAAAGGGAGGAAATAAATATTGCAAGGAAAGAAAAAAAA ATTGTTCACAGTCACCTCTTCACCTTCGCGATTTCATGTTTCTTTCATGTGCTAACTGAT CCCAGGGCTTCTCCAGCGCCCTTATCTGTTAG RVS161-2 SEQIDNO:37 CTGCCCATCTATGACTGAATGTGGAGAAGTATCGGAACAACCCTTCACTAAGGATATC promoter TAGGCTAAACTCATTCGCGCCTTAGATTTCTCCAAGGTATCGGTTAAGTTTCCTCTTTC GTACTGGCTAACGATGGTGTTGCTCAACAAAGGGATGGAACGGCAGCTAAAGGGAGT GCATGGAATGACTTTAATTGGCTGAGAAAGTGTTCTATTTGTCCGAATTTCTTTTTTCT 
ATTATCTGTTCGTTTGGGCGGATCTCTCCAGTGGGGGGTAAATGGAAGATTTCTGTTC ATGGGGTAAGGAAGCTGAAATCCTTCGTTTCTTATAGGGGCAAGTATACTAAATCTCG GAACATTGAATGGGGTTTACTTTCATTGGCTACAGAAATTATTAAGTTTGTTATGGGG TGAAGTTACCAGTAATTTTCATTTTTTCACTTCAACTTTTGGGGTATTTCTGTGGGGTA GCATAGCTTGACAGGTAATATGATGTACTATGGGATAGGCAAGTCTTGTGTTTCAGAT ACCGCCAAACGTTAAATAGGACCCTCTTGGTGACTTGCTAACTTAGAAAGTCATGCCC AGGTGTTACGTAATCTTACTTGGTATGACTTTTTGAGTAACGGACTTGCTAGAGTCCTT ACCAGACTTCCAGTTTAGCAAACCACAGATTGATCTGTCCTCTGGCATATCTCAAACC AATCAACACCCGTAACCCTTTCATGAAACAACTCTAGAATGCGTCTTATCAACAGGAT TGCCCAAAACAGTAATTGGGGCGGTGGAATCTACATGGGAGTTCCATCGTTGTCTCGG TTTTTCTCCCTATAAGCTACTCTGGAGACGAAGTAACTAACACCCTCAAATATCATT MPP10 SEQIDNO:38 TCTGAATCCGACCTCCTCTAATCTACCACTGAAGAGAAGCAGTGTATTGTTCGTCTAC promoter GTAAATTTGAATGTGTAAATGGCAAACATGGCTTCGGGGATGATTTGGCATATATATT ATTGTAGCATCGTCTGTGGCTCTATGAGTTGTGTGGCGGATGATGAAAAGTTTCGTGC TGATCCCACAATGCGGCATTTACCAAATGGGGAAAGACCAGATTTCTTCGCTGCGCCA GCTAGGGACAGCATAATGTTCCAAGAAGAAGCGATTACAGGTGGATTACAAAGCGTT CGTCTGCAGTTGATGTTCTACGTGATGGGTATGAGTTGTAGTGCTACGCTCCATGAAT ACTTCTAATTTGTCGTTGACAATCCATGAATAATTTAAGTTTGCTTCCCAAGAGTCTAT TGCGAAGGGTGAGCCGAATCTCTTGGCGTATGCACCCGACTCGTCGGCTTTTGTGCGT TCCTTGCAAAGCTCGGTAGCAATCCGTTGGTGGGAGAAATTTGTCTCACGAATTTCAG TTGGGAGTAGCTGTTCCTGGTAGCAAGTTCGAGGGGATCTGTGCTCATAAAACGTGCT CACGCCAAAAATATTCTTACAAAATCTTCGCGGGGTGTTTGTCTTACATAATCGATTG GATATTTTCTTCAAATTTTTTTTTCTTACTGAAGTCCCCTATAGAG THP3 SEQIDNO:39 TCTTGCCAGTTGTCTCCTAAGATGTCATCGGAGTAGGCTCGGCTAAAGAGTAGTAATG promoter CATCAAGACCAACCAAAACACCTTCCACGAGTTCAGATGAACCTTTTAATAACTTCAG GTCACTTTGATGCCGGCACAACTGGGCGAGTTTCGTATAGTTAACTCTGATCTTGCAC TCCAGAACGGGAATAGGATTGACTTTTTGCTTCCGAGAAACGATTTGCTCTCTCTTCGT CTGGCTTTTCACTTTATATCGCACGGAATCAATGGATGGAACTCCTAAAGCTCCTAAC TTCGATGATTTGCTAGCCATGACTCTGTGGGACATTTTCTTGCATCTCGTTTGTAACCT GTCTGTTCCTACACTAAGTTTATGAGAGGCTACTTTGGATTCTAGCCTCGGTGGTAAA GTGGGAGATAACAACGGCATAAGGCAAGAACCAGAAGTACCATAACGGTCTGGTAA AGTTGGTGATAACTTAATTGGAAGAGTGTAAGTAAGACGTGGCTTGTAATAAGGCTTT CCATCAAAAAGGTTCTCCGGGTTGGAGTTTGTGAGGCTCACATCTTTGATCAGTCTTTC 
AATATAAATTGGTAACGTTGATGACAATGCCGGAGGTAATTTCTGTAGTTGTTGATAT ACGCAGATAACAGATTCAAATCTCCATTGGTTTTCATCATTGTGGCTTAAATTAGATC AGAACATGGTAGTATTTAAAAATGGATCTCTTTGCAGATTTACTCAATATAGCGAAAA AAGGAGACATTCGTTACAAAATATGAAGATAATTCGCCTCATAACTCGATTAATCAAA ACAGACGGTCCAGTTCTTCTTTTGGTAGT GBP2 SEQIDNO:40 ATCTGTACTGGTACTGACAAAGGTTATCCAGAATCCGAGACATTTCAACAACAGAGAT promoter TCCAGGCTTCAAAACATCCATTTTATCACCAATATCTAGTAATGCTTGCAACAATTCTG GATACTTCTTCTGTGTAACCAAATCTCTTATAAACTGAACAGCTTTCTGTACGTTGTCG TCAGTAGTTGGATCAACCTCAGTGGTGACCTGGCCTATCGGTTTTCCAAAAGACTTGT TTATCACGTCCGAAAGCTCCCATTTTTGCAGATGCGCAACTTTAAAAGGCCTGGCTTG AACATTTGCATCTCTTGTTGTGTGTTCTTTGAGAAAATATTCATCGATCTGGGTGCTTC CAACGACAGAAGATACTCTTCTGAGACCAGAAAGTCCCCAGCCATGCTTCCTAATTAC AAAATATTTGTAGGAAGATCCCTGATTAGGACAAAGTTGTCTTCTCATGAGTTCAACT GAAACTGGGGCTCAAACGGATTATGAAAGGGGTGATTAAAGGTTTTCCTAGCCTTACT TTCCAAATGTCGACCGAGACGAACATTTAAAATCCTAACATCAGAAATTTCTATCCTT AATCTCATTGATGGTTAGTACACTTCGCAGAGTCTCCACATTTGCAGACCCTCCTGGA TAACCAAAGCTTATCTAACAGCGGCATTGGACCTTTGAAAAGACCCTC DAS1 SEQIDNO:41 AAATCTGAACACGATGAAACCTCCCCGTAGATTCCACCGCCCCGTTACTTTTTTGGGC promoter AATCCCGTTGATAAGATCCATTTTAGAGTTGTTTCTGAAAGGATTACAGGCGTTGAAG GGTCAGAGAGATGCCAGAGAACAGACCAATTGGTAGTTTGCTAAAGTGGACGTCTGG CAGGTGCTCTATCGTGTTCTTTATTTAGGGCGTTACACTTAGTAGGATTACGTAACAAT TTGGCTTAACCTTCTAAGTTAGAAAGAAACCAAGAGGGGTCCTCTTTAACGTTCAGCA GTATCTAAAACACAAAACCTGCCCTCATAATACATCATTCTATCTGTCAAGCTGTGCT ACCCCACAGAAATACCCCCAAGAGTTAAAGTGAAAAGAAAAGCTAAATCTGTTAGAC TTCACCCCATAACAAACTTGATAGTTCCTGTAGCCAATGAAAGTTAACCCCATTCAAT GTTCCGAGATCTAGTATGCTTGCTCCTATAAGGAACGAAGGGTTCCAGCTTCCTTACC CCATCAATGGAAATCTCCTATTTACCCCCCACTGGAAAGATCCGTCCGAACGAACGGA TAATAGAAAAAAGAAATTCGGACAAAATAGAACACTTATTTAGCCAATGAAATCCAT TTCCAGCATCTCCTTCAACTGCCGTTCCATCCCCTTTGTTGAGCTACACCATCGTCAGC CAGTACCGAATAGGAAACTTAACCGATATCTTGGAGAATTCTAATGCGCGAATGAGTT TAGCCTAGATATCCTTAGTGAAGGGTTGTTCCGATACTTCTCCACATTCAGTCATTTCA GATGGGCAGCATTGTTATCATGAAGAAACGGAAACGGGCAGTAAGGGTTAACCGCCA AATTATATAAAGACAACATGTCCCCAGTTTAAAGTTTTTCTTTCCTATTCTTGTATCCT 
GAGTGACCGTTGTGTTTAAAATAACAAGTTCGTTTTAACTTAAGACCAAAACCAGTTA CAACAAATTATTCCCCAACTAAACACTAAAGTTCACTCTTATCAAACTATCAAACATC AAAG Methanol SEQIDNO:42 CTTCCCCATTTCACTGACAGTTTGTAGAAATAGGGCAACAATTGATGCAAATCGATTT inducible TCAACGCATTGGTTTTGATAGCATTGATGATCTTGGAGCTGTAAAAGTCCGGCTGGAT promoter AAGCTCAATGAAATAGGTTGGTTGATCTGGATCTTCTTTTGGGTCATTTTGTTCGCTCT GTATTTCACAAATTGCCAGAATCTCTGCCAACCACAGTGGTAGGTCCAACTTGGTGTT CTGAATCACAGGCTTCCCCGGGTTGTTCTCTAAATAACCGAGGCCCGGCACAGAAATC GTAAACCGACACGGTATCTTTTGTCCGTCCGCCAGTATCTCATCAAGGTCGTAGTAGC CCATGATGAGTATCAAAGGGGATTTGGTTATGCGATGCAACGAGAGATTGTTTATCCC AGATGCTGATGTAAAAACCTTAACCAGCGTGACAGTAGAAATAAGACACGTTAAAAT TACCCGCGCTTCCCTAACAATTGGCTCTGCCTTTCGGCAAGTTTCTAACTGCCCTCCCC TCTCACATGCACCACGAACTTACCGTTCGCTCCTAGCAGAACCACCCCAAAGTTTAAT CAGGACCGCATTTTAGCCTATTGCTGTAGAACCCCACAACATAACCTGGTCCAGAGCC AGCCCTTTATATATGGTAAATCCCGTTTGAACTTCGAAGTGGAATCGGAATTTTTACA TCAAAGAAACTGATACTGAAACTTTTGGCTTCGACTTGGACTTTCTCTTAATCGAATTC GT GCW14 SEQIDNO:43 CAGGTGAACCCACCTAACTATTTTTAACTGGCATCCAGTGAGCTCGCTGGGTGAAAGC promoter CAACCATCTTTTGTTTCGGGGAACCGTGCTCGCCCCGTAAAGTTAATTTTTTTTTCCCG CGCAGCTTTAATCTTTCGGCAGAGAAGGCGTTTTCATCGTAGCGTGGGAACAGAATAA TCAGTTCATGTGCTATACAGGCACATGGCAGCAGTCACTATTTTGCTTTTTAACCTTAA AGTCGTTCATCAATCATTAACTGACCAATCAGATTTTTTGCATTTGCCACTTATCTAAA AATACTTTTGTATCTCGCAGATACGTTCAGTGGTTTCCAGGACAACACCCAAAAAAAG GTATCAATGCCACTAGGCAGTCGGTTTTATTTTTGGTCACCCACGCAAAGAAGCACCC ACCTCTTTTAGGTTTTAAGTTGTGGGAACAGTAACACCGCCTAGAGCTTCAGGAAAAA CCAGTACCTGTGACCGCAATTCACCATGATGCAGAATGTTAATTTAAACGAGTGCCAA ATCAAGATTTCAACAGACAAATCAATCGATCCATAGTTACCCATTCCAGCCTTTTCGT CGTCGAGCCTGCTTCATTCCTGCCTCAGGTGCATAACTTTGCATGAAAAGTCCAGATT AGGGCAGATTTTGAGTTTAAAATAGGAAATATAAACAAATATACCGCGAAAAAGGTT TGTTTATAGCTTTTCGCCTGGTGCCGTACGGTATAAATACATACTCTCCTCCCCCCCCT GGTTCTCTTTTTCTTTTGTTACTTACATTTTACCGTTCCGT FDH1 SEQIDNO:44 AAATAAATGGCAGAAGGATCAGCCTGGACGAAGCAACCAGTTCCAACTGCTAAGTAA promoter AGAAGATGCTAGACGAAGGAGACTTCAGAGGTGAAAAGTTTGCAAGAAGAGAGCTG CGGGAAATAAATTTTCAATTTAAGGACTTGAGTGCGTCCATATTCGTGTACGTGTCCA 
ACTGTTTTCCATTACCTAAGAAAAACATAAAGATTAAAAAGATAAACCCAATCGGGA AACTTTAGCGTGCCGTTTCGGATTCCGAAAAACTTTTGGAGCGCCAGATGACTATGGA AAGAGGAGTGTACCAAAATGGCAAGTCGGGGGCTACTCACCGGATAGCCAATACATT CTCTAGGAACCAGGGATGAATCCAGGTTTTTGTTGTCACGGTAGGTCAAGCATTCACT TCTTAGGAATATCTCGTTGAAAGCTACTTGAAATCCCATTGGGTGCGGAACCAGCTTC TAATTAAATAGTTCGATGATGTTCTCTAAGTGGGACTCTACGGCTCAAACTTCTACAC AGCATCATCTTAGTAGTCCCTTCCCAAAACACCATTCTAGGTTTCGGAACGTAACGAA ACAATGTTCCTCTCTTCACATTGGGCCGTTACTCTAGCCTTCCGAAGAACCAATAAAA GGGACCGGCTGAAACGGGTGTGGAAACTCCTGTCCAGTTTATGGCAAAGGCTACAGA AATCCCAATCTTGTCGGGATGTTGCTCCTCCCAAACGCCATATTGTACTGCAGTTGGT GCGCATTTTAGGGAAAATTTACCCCAGATGTCCTGATTTTCGAGGGCTACCCCCAACT CCCTGTGCTTATACTTAGTCTAATTCTATTCAGTGTGCTGACCTACACGTAATGATGTC GTAACCCAGTTAAATGGCCGAAAAACTATTTAAGTAAGTTTATTTCTCCTCCAGATGA GACTCTCCTTCTTTTCTCCGCTAGTTATCAAACTATAAACCTATTTTACCTCAAATACC TCCAACATCACCCACTTAAACAGAATT FBA1 SEQIDNO:45 TGCTTAAGTAATTGAAAACAGTGTTGTGATTATATAAGCATGGTATTTGAATAGAACT promoter ACTGGGGTTAACTTATCTAGTAGGATGGAAGTTGAGGGAGATCAAGATGCTTAAAGA AAAGGATTGGCCAATATGAAAGCCATAATTAGCAATACTTATTTAATCAGATAATTGT GGGGCATTGTGACTTGACTTTTACCAGGACTTCAAACCTCAACCATTTAAACAGTTAT AGAAGACGTACCGTCACTTTTGCTTTTAATGTGATCTAAATGTGATCACATGAACTCA AACTAAAATGATATCTTTTACTGGACAAAAATGTTATCCTGCAAACAGAAAGCTTTCT TCTATTCTAAGAAGAACATTTACATTGGTGGGAAACCTGAAAACAGAAAATAAATAC TCCCCAGTGACCCTATGAGCAGGATTTTTGCATCCCTATTGTAGGCCTTTCAAACTCAC ACCTAATATTTCCCGCCACTCACACTATCAATGATCACTTCCCAGTTCTCTTCTTCCCC TATTCGTACCATGCAACCCTTACACGCCTTTTCCATTTCGGTTCGGATGCGACTTCCAG TCTGTGGGGTACGTAGCCTATTCTCTTAGCCGGTATTTAAACATACAAATTCACCCAA ATTCTACCTTGATAAGGTAATTGATTAATTTCATAAATGAATTCGCG GAP SEQIDNO:46 TTTTTGTAGAAATGTCTTGGTGTCCTCGTCCAATCAGGTAGCCATCTCTGAAATATCTG promoter GCTCCGTTGCAACTCCGAACGACCTGCTGGCAACGTAAAATTCTCCGGGGTAAAACTT AAATGTGGAGTAATGGAACCAGAAACGTCTCTTCCCTTCTCTCTCCTTCCACCGCCCG TTACCGTCCCTAGGAAATTTTACTCTGCTGGAGAGCTTCTTCTACGGCCCCCTTGCAGC AATGCTCTTCCCAGCATTACGTTGCGGGTAAAACGGAGGTCGTGTACCCGACCTAGCA GCCCAGGGATGGAAAAGTCCCGGCCGTCGCTGGCAATAATAGCGGGCGGACGCATGT 
CATGAGATTATTGGAAACCACCAGAATCGAATATAAAAGGCGAACACCTTTCCCAAT TTTGGTTTCTCCTGACCCAAAGACTTTAAATTTAATTTATTTGTCCCTATTTCAATCAAT TGAACAACTAT PGK SEQIDNO:47 AAATAGCAGTTTGCGGTTTCTTGATTTCATGGGGGGAACAAACAATAGTGTTGCCTTA promoter ATTCTAATTGGCATTGTTGCTTGGAATCGAAATTGGGGGATAACGTCATATCTGAAAA GTAAACAACTTCGGGAAATCAGGCTGTTTGAATGGCTTGGAAGCGAGATAGAAAGGG GATAGCGAGATAGAGGGGGCGGAGTAGACGAAGGGTGTTAAACTGCTGAAATCTCTC AATCTGGAAGAAACGGAATAAATTAACTCCTTGCGATAATAAAATCCGAGTCCGTTAT GACCCCACACCGTGTTGACCACGGCATACCCCATGGAATCTGGTACAAAGCGTCAGTC TTGAAGACACCATCACGTGTAGGAGACTGATTGTCTGACCGTCCAGCAAAAAGGGCA TTATAAATCTTGCTGTTAAAGGGGTGAGGGGAGATGCAGGTTGTTCTTTTATTCGCCTT GAACTTTTTAATTTTCCCGGGGTTGCGGAGCGTGAACAGTTAGCCCGATCTGATAGCT TGCAAGATTCAACAGTTTATCCACTACAGGTCAGAGAGATCGCCGCAGAAGAAATGC TCGTCTCGTGTTCCAGCACACATACTGGTGAAGTCGTTATTTTGCCGAAGGGGGGGTA ATAAGGTTATGCACCCCCTCTCCACACCCCAGAATCATTTTTTAGCTGGGTTCAAGGC ATTAGACTTTGCACATTTTTCCCTTAAACACCCTTGAAACGCGGATAAACAGTTGCAT GTGCATCCTAAAACTAGGTGAGATGCGTACTCCGTGCTCCGATAATAACAGTGGTGTT GGGGTTGCTGCTAGCTCACGCACTCCGTTCTTTTTTTTCAACCAGCAAAATTCGATGGG GAGAAACTTGGGGTACTTTGCCGACTCCTCCACCATGCTGGTATATAAATAATACTCG CCCACTTTTCGTTTGCTGCTTTTATATTTCATAGACTGAAAAAGACTCTTCTTCTACTTT TTCATAATATATCTCAGATATCACTACTATAG TEFg_ SEQIDNO:48 GCGATTTAAATTCGCGAAAGAACAGCCTAATAAACTCCGAAGCATGATGGCCTCTATC promoter CGGAAAACGTTAAGAGATGTGGCAACAGGAGGGCACATAGAATTTTTAAAGACGCTG AAGAATGCTATCATAGTCCGTAAAAATGTGATAGTACTTTGTTTAGTGCGTACGCCAC TTATTCGGGGCCAATAGCTAAACCCAGGTTTGCTGGCAGCAAATTCAACTGTAGATTG AATCTCTCTAACAATAATGGTGTTCAATCCCCTGGCTGGTCACGGGGAGGACTATCTT GCGTGATCCGCTTGGAAAATGTTGTGTATCCCTTTCTCAATTGCGGAAAGCATCTGCT ACTTCCCATAGGCACCAGTTACCCAATTGATATTTCCAAAAAAGATTACCATATGTTC ATCTAGAAGTATAAATACAAGTGGACATTCAATGAATATTTCATTCAATTAGTCATTG ACACTTTCATCAACTTACTACGTCTTATTCAACAATGAATTCGCG PMP20 SEQIDNO:49 ACACAGTTATTATTCATTTAAATGTCAAAACAGTAGTGATAAAAGGCTATGAAGGAG promoter GTTGTCTAGGGGCTCGCGGAGGAAAGTGATTCAAACAGACCTGCCAAAAAGAGAAAA AAGAGGGAATCCCTGTTCTTTCCAATGGAAATGACGTAACTTTAACTTGAAAAATACC CCAACCAGAAGGGTTCAAACTCAACAAGGATTGCGTAATTCCTACAAGTAGCTTAGA 
GCTGGGGGAGAGACAACTGAAGGCAGCTTAACGATAACGCGGGGGGATTGGTGCAC GACTCGAAAGGAGGTATCTTAGTCTTGTAACCTCTTTTTTCCAGAGGCTATTCAAGATT CATAGGCGATATCGATGTGGAGAAGGGTGAACAATATAAAAGGCTGGAGAGATGTCA ATGAAGCAGCTGGATAGATTTCAAATTTTCTAGATTTCAGAGTAATCGCACAAAACGA AGGAATCCCACCAAGACAAAAAAAAAAATTCTAAGGAATTCCGAAACG SHB17 SEQIDNO:50 AAATTCTTTTTACGTGGTGCGCATACTGGACAGAGGCAGAGTCTCAATTTCTTCTTTTG promoter AGACAGGCTACTACAGCCTGTGATTCCTCTTGGTACTTGGATTTGCTTTTATCTGGCTC CGTTGGGAACTGTGCCTGGGTTTTGAAGTATCTTGTGGATGTGTTTCTAACACTTTTTC AATCTTCTTGGAGTGAGAATGCAGGACTTTGAACATCGTCTAGCTCGTTGGTAGGTGA ACCGTTTTACCTTGCATGTGGTTAGGAGTTTTCTGGAGTAACCAAGACCGTCTTATCAT CGCCGTAAAATCGCTCTTACTGTCGCTAATAATCCCGCTGGAAGAGAAGTTCGAACAG AAGTAGCACGCAAAGCTCTTGTCAAATGAGAATTGTTAATCGTTTGACAGGTCACACT CGTGGGCTATGTACGATCAACTTGCCGGCTGTTGCTGGAGAGATGACACCAGTTGTGG CATGGCCAATTGGTATTCAGCCGTACCACTGTATGGAAAATGAGATTATCTTGTTCTT GATCTAGTTTCTTGCCATTTTAGAGTTGCCACATTCGTAGGTTTCAGTACCAATAATGG TAACTTCCAAACTTCCAACGCAGATACCAGAGATCTGCCGATCCTTCCCCAACAATAG GAGCTTACTACGCCATACATATAGCCTATCTATTTTCACTTTCGCGTGGGTGCTTCTAT ATAAACGGTTCCCCATCTTCCGTTTCATACTACTTGAATTTTAAGCACTAAAGAATT PEX8 SEQIDNO:51 AAATTAACCAGTGTTTTCTTATCTATTTGTCTTTTTACACTAAAGTGAAGTACGAATCC promoter ATGCGATTGATTCCTCCTCAGATATCAGCTGAATTCTTGCTTATGTAATACTTGCGCGA ACTACATGTGAACTTAGGATTCGATAAGGCTGGGGGGTCAACCAACCCCACTTCAAA GAGCCGACCCGTATAAATAGCCTCTGCGTCCTCAGATCAACAAGACGAAGCAATTTTT TTTTACCTATCTTCAGGTGCCTGTTAG PEX4 SEQIDNO:52 AGGGAGGCAATTAGTTGTCCTTGTGGAATCAAAAGAGCACAAGAAACCTGTGATTGA promoter AAGTCTGGGCTGTCTGGGGTTGGCAAGAAAATCATAAAGTTTATATAGTACATTTGTT AGTTGCTTCTTTGAATGACACCTTGATCTACATGTTGTTCTTCCCAGTTCCCACCGCGA AGTTTCTCTAACTCTCAATCTCTCTTTCCCCACTTGATAATCCAAAGAA TKL3 SEQIDNO:53 gtcgaggaaagggtcgtttcggggagttaaatatttttggctatgtagcagacatgtttcgacgctggcgtcgc Promoter gtcgatcggaaaatattaccccaggaacaagcacttgcttgggttagccaccaccctgcgcaagcctttttgcc ggctctacacagggccaatgaaatctgggcggaatctgaaaccgatgaaacggacgacactggcaacaagctca ctgcactattttttttttctagtgaaatagcctatcctcgtctcgctcccctcatacctgtaaaggggtgcaat 
ttagcctcgttccagccattcacgggccactcaacaacacgtcggctaccatggggtgcttgggcaccaaaagg cctataaataggcccccatccgtctgctacacagtcatctctgtcttttcttccc

    TABLE-US-00005 TABLE 5 Signal Peptides
Sequence Info | SEQ ID NO: | Amino Acid sequence
Signal Peptide | SEQ ID NO: 56 | MFTPVRRRVRTAALALSAAAALVLGSTAASGASATPSPAPAP
Signal Peptide | SEQ ID NO: 57 | MKLSTVLLSAGLASTTLA
Signal Peptide | SEQ ID NO: 58 | MRFPSIFTAVLFAASSALA
Signal Peptide | SEQ ID NO: 59 | MVSLRSIFTSSILAAGLTRAHG
Signal Peptide | SEQ ID NO: 60 | MKFPVPLLFLLQLFFIIATQG
Signal Peptide | SEQ ID NO: 61 | MQVKSIVNLLLACSLAVA
Signal Peptide | SEQ ID NO: 62 | MQFNWNIKTVASILSALTLAQA
Signal Peptide | SEQ ID NO: 63 | MYRNLIIATALTCGAYSAYVPSEPWSTLTPDASLESALKDYSQTFGIAIKSLDADKIKR
Signal Peptide | SEQ ID NO: 64 | MNLYLITLLFASLCSAITLPKR
Signal Peptide | SEQ ID NO: 65 | MFEKSKFVVSFLLLLQLFCVLGVHG
Signal Peptide | SEQ ID NO: 66 | MQFNSVVISQLLLTLASVSMG
Signal Peptide | SEQ ID NO: 67 | MKSQLIFMALASLVASAPLEHQQQHHKHEKR
Signal Peptide | SEQ ID NO: 68 | MKFAISTLLIILQAAAVFA
Signal Peptide | SEQ ID NO: 69 | MKLLNFLLSFVTLFGLLSGSVFA
Signal Peptide | SEQ ID NO: 70 | MIFNLKTLAAVAISISQVSA
Signal Peptide | SEQ ID NO: 71 | MKISALTACAVTLAGLAIAAPAPKPEDCTTTVQKRHQHKR
Signal Peptide | SEQ ID NO: 72 | MSYLKISALLSVLSVALA
Signal Peptide | SEQ ID NO: 73 | MLSTILNIFILLLFIQASLQ
Signal Peptide | SEQ ID NO: 74 | MKLSTNLILAIAAASAVVSAAPVAPAEEAANHLHKR
Signal Peptide | SEQ ID NO: 75 | MFKSLCMLIGSCLLSSVLA
Signal Peptide | SEQ ID NO: 76 | MKLAALSTIALTILPVALA
Signal Peptide | SEQ ID NO: 77 | MSFSSNVPQLFLLLVLLTNIVSG
Signal Peptide | SEQ ID NO: 78 | MQLQYLAVLCALLLNVQSKNVVDFSRFGDAKISPDDTDLESRERKR
Signal Peptide | SEQ ID NO: 79 | MKIHSLLLWNLFFIPSILG
Signal Peptide | SEQ ID NO: 80 | MSTLTLLAVLLSLQNSALA
Signal Peptide | SEQ ID NO: 81 | MINLNSFLILTVTLLSPALALPKNVLEEQQAKDDLAKR
Signal Peptide | SEQ ID NO: 82 | MFSLAVGALLLTQAFG
Signal Peptide | SEQ ID NO: 83 | MKILSALLLLFTLAFA
Signal Peptide | SEQ ID NO: 84 | MKVSTTKFLAVFLLVRLVCA
Signal Peptide | SEQ ID NO: 85 | MQFGKVLFAISALAVTALG
Signal Peptide | SEQ ID NO: 86 | MWSLFISGLLIFYPLVLG
Signal Peptide | SEQ ID NO: 87 | MRNHLNDLVVLFLLLTVAAQA
Signal Peptide | SEQ ID NO: 88 | MFLKSLLSFASILTLCKA
Signal Peptide | SEQ ID NO: 89 | MFVFEPVLLAVLVASTCVTA
Signal Peptide | SEQ ID NO: 90 | MFSPILSLEIILALATLQSVFA
Signal Peptide | SEQ ID NO: 91 | MIINHLVLTALSIALA
Signal Peptide | SEQ ID NO: 92 | MLALVRISTLLLLALTASA
Signal Peptide | SEQ ID NO: 93 | MRPVLSLLLLLASSVLA
Signal Peptide | SEQ ID NO: 94 | MVLIQNFLPLFAYTLFFNQRAALA
Signal Peptide | SEQ ID NO: 95 | MVSLTRLLITGIATALQVNA
Signal Peptide | SEQ ID NO: 96 | MIFDGTTMSIAIGLLSTLGIGAEA
Signal Peptide | SEQ ID NO: 97 | MVLVGLLTRLVPLVLLAGTVLLLVFVVLSGG
Signal Peptide | SEQ ID NO: 98 | MLSILSALTLLGLSCA
Signal Peptide | SEQ ID NO: 99 | MRLLHISLLSIISVLTKANA
Signal Peptide | SEQ ID NO: 100 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYLDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLDKREAEA
Signal Peptide | SEQ ID NO: 344 | MRFPSIFTAVLFAASSALAAPVQTTTEDELEGDFDVAVLPFSASIAAKEEGVSLEKREAEA
Signal Peptide | SEQ ID NO: 101 | MFKSVVYSILAASLANA
Signal Peptide | SEQ ID NO: 102 | MLLQAFLFLLAGFAAKISA
Signal Peptide | SEQ ID NO: 103 | MASSNLLSLALFLVLLTHANS
Signal Peptide | SEQ ID NO: 104 | MNIFYIFLFLLSFVQGLEHTHRRGSLVKR
Signal Peptide | SEQ ID NO: 105 | MLIIVLLFLATLANSLDCSGDVFFGYTRGDKTDVHKSQALTAVKNIKR
Signal Peptide | SEQ ID NO: 106 | MESVSSLFNIFSTIMVNYKSLVLALLSVSNLKYARGMPTSERQQGLEER
Signal Peptide | SEQ ID NO: 107 | MFAFYFLTACISLKGVFG
Signal Peptide | SEQ ID NO: 108 | MRFSTTLATAATALFFTASQVSA
Signal Peptide | SEQ ID NO: 109 | MKFAYSLLLPLAGVSASVINYKR
Signal Peptide | SEQ ID NO: 110 | MKFFAIAALFAAAAVAQPLEDR
Signal Peptide | SEQ ID NO: 111 | MQFFAVALFATSALA
Signal Peptide | SEQ ID NO: 112 | MKWVTFISLLFLFSSAYSRGVFRR
Signal Peptide | SEQ ID NO: 113 | MRSLLILVLCFLPLAALG
Signal Peptide | SEQ ID NO: 114 | MKVLILACLVALALA
Signal Peptide | SEQ ID NO: 115 | MFNLKTILISTLASIAVA
Signal Peptide | SEQ ID NO: 116 | MYRKLAVISAFLATARAQSA
WT | SEQ ID NO: 117 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYLDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVQLDKR
App3 | SEQ ID NO: 118 | MRFPPIFTAALFAASSALAAPANTTTEDETAQIPAEAVIGYLDSEGDSDVAVLPFSNSTNNGLSFINTTIASIAAKEEGVQLDKR
App8 | SEQ ID NO: 119 | MRFPSIFTAVLFAASSALAAPANTTTEDETAQIPAEAVISYSDLEGDFDAAALPLSNSTNNGLSSTNTTIASIAAKEEGVQLDKR
App9 | SEQ ID NO: 120 | MRPPSIFTAVLFAASSALAAPANTTTEDETTQIPAEAVATYLDLEGDVDVAVLPFSSSTNNGLSFINTTIASIAAKEEGVQLDKR
App10 | SEQ ID NO: 121 | MRFPSIFTAALFAASSALAAPANTTTEGETAQTPAEAVIGYRDLEGDFDVAVLPFPNSTNNGLLFTNTTTASIAAKEEGVQLDKR
appS1 | SEQ ID NO: 122 | MRFPSIFTAVLLAAPSALAAPANATTEDEAAQIPAEAVIGYLDLEGDFDAAVLPFSNSTNNGLLSINTTIASIAAKEEGVQLDKR
appS4 | SEQ ID NO: 123 | MRFPSIFTAVVFAASSALAAPANTTAEDETAQIPAEAVIGYLGLEGDSDVAALPLSDSTNNGSLSTNTTIASIAAKEEGVQLDKR
appS6 | SEQ ID NO: 124 | MRLPSIFTAAVFAASSALAAPANTTTEDETAQIPAEAAIGYLDLEGDSDVAVLPLSNSTNNGLLFINTTIASIAAKEEGVQLDKR
appS8 | SEQ ID NO: 125 | MRFPSIFTAVLFAASSALAAPANTTTEDETAQIPAEAVIGYLDLEGDFDVAVLPFSNSTNDGLSFINTTTASIAAKEEGVQLDKR
a-Factor | SEQ ID NO: 126 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPA
PpScw11p | SEQ ID NO: 127 | MLSTILNIFILLLFIQASLQAPIPVVTKYVTEGIAVV
PpDse4p | SEQ ID NO: 128 | MSFSSNVPQLFLLLVLLTNIVSGAVISVWSTSKVTK
PpExglp | SEQ ID NO: 129 | MNLYLITLLFASLCSAITLPKRDIIWDYSSEKIMG
a-EGFP | SEQ ID NO: 130 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPA
S-EGFP | SEQ ID NO: 131 | MLSTILNIFILLLFIQASLQEFDYKDDDDKMVSKG
D-EGFP | SEQ ID NO: 132 | MSFSSNVPQLFLLLVLLTNIVSGEFDYKDDDDKMV
E-EGFP | SEQ ID NO: 133 | MNLYLITLLFASLCSAEFDYKDDDDKMVSKGEELF
a-CALB | SEQ ID NO: 134 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPA
S-CALB | SEQ ID NO: 135 | MLSTILNIFILLLFIQASLQEFLPSGSDPAFSQPK
D-CALB | SEQ ID NO: 136 | MSFSSNVPQLFLLLVLLTNIVSGEFLPSGSDPAFS
E-CALB | SEQ ID NO: 137 | MNLYLITLLFASLCSAEFLPSGSDPAFSQPKSVLD
Amylase (AA) | SEQ ID NO: 138 | MVAWWSLFLYGLQVAAPALAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC
AlphaK (AK) | SEQ ID NO: 139 | MRFPSIFTAVLFAASSALAAPVNTTTEDELEGDFDVAVLPFSASIAAKEEGVSLEKRAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC
AlphaT (AT) | SEQ ID NO: 140 | MRFPSIFTAVLFAASSALAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC
Glucoamyl (GA) | SEQ ID NO: 141 | MSFRSLLALSGLVCSGLAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC
Ovomucoid signal peptide | SEQ ID NO: 144 | MAMAGVFVLFSFVLCGFLPDAAFG
Lysozyme signal peptide | SEQ ID NO: 145 | MRSLLILVLCFLPLAALG
Ovalbumin Signal Peptide | SEQ ID NO: 146 | MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLDKREAEA
Ovotransferrin Signal Peptide | SEQ ID NO: 147 | MKLILCTVLSLGIAAVCFA
Bovine Lactoferrin Signal Peptide | SEQ ID NO: 148 | MKLFVPALLSLGALGLCLA
Porcine Lactoferrin Signal Peptide | SEQ ID NO: 149 | MKLFIPALLFLGTLGLCLA
Kid Lipase Signal Peptide | SEQ ID NO: 150 | MESKALLLLALSVWLQSLTVSHG
Porcine Lipase Signal Peptide | SEQ ID NO: 151 | MLLIWTLSLLLGAVLG
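Several Table 5 entries begin with the same residues as a shorter entry in the same table. For example, SEQ ID NO: 126 begins with the full sequence of SEQ ID NO: 58. The following is a minimal illustrative sketch of how such exact N-terminal prefix relationships might be checked. The names `TABLE5` and `shared_leaders` are hypothetical and not part of the disclosure; only six entries are copied here, and the check is a plain string comparison that implies nothing about cleavage sites or function.

```python
# Illustrative subset of Table 5, keyed by SEQ ID NO (copied from the listing).
TABLE5 = {
    58: "MRFPSIFTAVLFAASSALA",
    73: "MLSTILNIFILLLFIQASLQ",
    77: "MSFSSNVPQLFLLLVLLTNIVSG",
    126: "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPA",    # a-Factor
    127: "MLSTILNIFILLLFIQASLQAPIPVVTKYVTEGIAVV",  # PpScw11p
    128: "MSFSSNVPQLFLLLVLLTNIVSGAVISVWSTSKVTK",   # PpDse4p
}


def shared_leaders(entries):
    """Return sorted (short_id, long_id) pairs in which the shorter
    sequence is an exact N-terminal prefix of the longer one."""
    pairs = []
    for short_id, short_seq in entries.items():
        for long_id, long_seq in entries.items():
            if (
                short_id != long_id
                and len(short_seq) < len(long_seq)
                and long_seq.startswith(short_seq)
            ):
                pairs.append((short_id, long_id))
    return sorted(pairs)


if __name__ == "__main__":
    for short_id, long_id in shared_leaders(TABLE5):
        print(f"SEQ ID NO: {short_id} is a prefix of SEQ ID NO: {long_id}")
```

The same comparison could be run over the full table to flag duplicated or nested entries when transcribing the listing.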

    TABLE-US-00006 TABLE6 ProteinsofInterest SEQID Sequence NO. Info Sequence Ovomucoid SEQIDNO: AEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHDGECK (canonical) 152 ETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRH DGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGK C* Ovomucoid SEQIDNO: AEVDCSRFPNATDMEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSVEFGTNISKEHDGEC 153 KETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKR HDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFG KC* Ovomucoid SEQIDNO: AEVDCSRFPNATDMEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSVEFGTNISKEHDGEC G162M 154 KETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKR F167A HDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYMNKCNACNAVVESNGTLTLSHF GKC* Ovomucoid SEQIDNO: MAMAGVFVLFSFVLCGFLPDAAFGAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYT isoform1 155 NDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGV precursorfull TYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTY length GNKCNFCNAVVESNGTLTLSHFGKC Ovomucoid SEQIDNO: MAMAGVFVLFSFVLCGFLPDAVFGAEVDCSRFPNATDMEGKDVLVCNKDLRPICGTDGVTY [Gallusgallus] 156 TNDCLLCAYSVEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDG VTYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKT YGNKCNFCNAVVESNGTLTLSHFGKC Ovomucoid SEQIDNO: MAMAGVFVLFSFVLCGFLPDAAFGAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYT isoform2 157 NDCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGV precursor TYDNECLLCAHKVEQGASVDKRHDGGCRKELAAVDCSEYPKPDCTAEDRPLCGSDNKTYGN [Gallusgallus] KCNFCNAVVESNGTLTLSHFGKC Ovomucoid SEQIDNO: AEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYNNECLLCAYSIEFGTNISKEHDGECK [Gallusgallus] 158 ETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVDKRH DGECRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGK C Ovomucoid SEQIDNO: MAMAGVFVLFSFALCGFLPDAAFGVEVDCSRFPNATNEEGKDVLVCTEDLRPICGTDGVTYS [Numida 159 NDCLLCAYNIEYGTNISKEHDGECREAVPVDCSRYPNMTSEEGKVLILCNKAFNPVCGTDGVT meleagris] YDNECLLCAHNVEQGTSVGKKHDGECRKELAAVDCSEYPKPACTMEYRPLCGSDNKTYDNK 
CNFCNAVVESNGTLTLSHFGKC PREDICTED: SEQIDNO: MQTITWRQPQGDHLRSRAPAATCRAGQYLTMAMAGIFVLFSFALCGFLPDAAFGVEVDCSRF Ovomucoid 160 PNTTNEEGKDVLVCTEDLRPICGTDGVTHSECLLCAYNIEYGTNISKEHDGECREAVPMDCSR isoformX1 YPNTTNEEGKVMILCNKALNPVCGTDGVTYDNECVLCAHNLEQGTSVGKKHDGGCRKELAA [Meleagris VSVDCSEYPKPACTLEYRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC gallopavo] Ovomucoid SEQIDNO: VEVDCSRFPNTTNEEGKDVLVCTEDLRPICGTDGVTHSECLLCAYNIEYGTNISKEHDGECRE [Meleagris 161 AVPMDCSRYPNTTSEEGKVMILCNKALNPVCGTDGVTYDNECVLCAHNLEQGTSVGKKHDG gallopavo] ECRKELAAVSVDCSEYPKPACTLEYRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC PREDICTED: SEQIDNO: MQTITWRQPQGDHLRSRAPAATCRAGQYLTMAMAGIFVLFSFALCGFLPDAAFGVEVDCSRF Ovomucoid 162 PNTTNEEGKDVLVCTEDLRPICGTDGVTHSECLLCAYNIEYGTNISKEHDGECREAVPMDCSR isoformX2 YPNTTNEEGKVMILCNKALNPVCGTDGVTYDNECVLCAHNLEQGTSVGKKHDGGCRKELAA [Meleagris VDCSEYPKPACTLEYRPLCGSDNKTYGNKCNFCNAVVESNGTLTLSHFGKC gallopavo] Ovomucoid SEQIDNO: EYGTNISIKHNGECKETVPMDCSRYANMTNEEGKVMMPCDRTYNPVCGTDGVTYDNECQLC [Bambusicola 163 AHNVEQGTSVDKKHDGVCGKELAAVSVDCSEYPKPECTAEERPICGSDNKTYGNKCNFCNA thoracicus] VVYVQP Ovomucoid SEQIDNO: VDCSRFPNTTNEEGKDVLACTKELHPICGTDGVTYSNECLLCYYNIEYGTNISKEHDGECTEA [Callipepla 164 VPVDCSRYPNTTSEEGKVLIPCNRDFNPVCGSDGVTYENECLLCAHNVEQGTSVGKKHDGGC squamata] RKEFAAVSVDCSEYPKPDCTLEYRPLCGSDNKTYASKCNFCNAVVIWEQEKNTRHHASHSVF FISARLVC Ovomucoid SEQIDNO: MLPLGLREYGTNTSKEHDGECTEAVPVDCSRYPNTTSEEGKVRILCKKDINPVCGTDGVTYD [Colinus 165 NECLLCSHSVGQGASIDKKHDGGCRKEFAAVSVDCSEYPKPACMSEYRPLCGSDNKTYVNKC virginianus] NFCNAVVYVQPWLHSRCRLPPTGTSFLGSEGRETSLLTSRATDLQVAGCTAISAMEATRAAAL LGLVLLSSFCELSHLCFSQASCDVYRLSGSRNLACPRIFQPVCGTDNVTYPNECSLCRQMLRSR AVYKKHDGRCVKVDCTGYMRATGGLGTACSQQYSPLYATNGVIYSNKCTFCSAVANGEDID LLAVKYPEEESWISVSPTPWRMLSAGA Ovomucoid- SEQIDNO: MSWWGIKPALERPSQEQSTSGQPVDSGSTSTTTMAGIFVLLSLVLCCFPDAAFGVEVDCSRFP likeisoform 166 NTTNEEGKEVLLCTKDLSPICGTDGVTYSNECLLCAYNIEYGTNISKDHDGECKEAVPVDCST X2 YPNMTNEEGKVMLVCNKMFSPVCGTDGVTYDNECMLCAHNVEQGTSVGKKYDGKCKKEV [Ansercygnoides ATVDCSDYPKPACTVEYMPLCGSDNKTYDNKCNFCNAVVDSNGTLTLSHFGKC domesticus] 
Ovomucoid- SEQIDNO: MSSQNQLHRRRRPLPGGQDLNKYYWPHCTSDRFSWLLHVTAEQFRHCVCIYLQPALERPSQE likeisoform 167 QSTSGQPVDSGSTSTTTMAGIFVLLSLVLCCFPDAAFGVEVDCSRFPNTTNEEGKEVLLCTKDL X1 SPICGTDGVTYSNECLLCAYNIEYGTNISKDHDGECKEAVPVDCSTYPNMTNEEGKVMLVCN [Ansercygnoides KMFSPVCGTDGVTYDNECMLCAHNVEQGTSVGKKYDGKCKKEVATVDCSDYPKPACTVEY domesticus] MPLCGSDNKTYDNKCNFCNAVVDSNGTLTLSHFGKC Ovomucoid SEQIDNO: VEVDCSRFPNTTNEEGKDEVVCPDELRLICGTDGVTYNHECMLCFYNKEYGTNISKEQDGEC [Coturnix 168 GETVPMDCSRYPNTTSEDGKVTILCTKDFSFVCGTDGVTYDNECMLCAHNVVQGTSVGKKH japonica] DGECRKELAAVSVDCSEYPKPACPKDYRPVCGSDNKTYSNKCNFCNAVVESNGTLTLNHFGK C Ovomucoid SEQIDNO: MAMAGVFLLFSFALCGFLPDAAFGVEVDCSRFPNTTNEEGKDEVVCPDELRLICGTDGVTYN [Coturnix 169 HECMLCFYNKEYGTNISKEQDGECGETVPMDCSRYPNTTSEDGKVTILCTKDFSFVCGTDGVT japonica] YDNECMLCAHNIVQGTSVGKKHDGECRKELAAVSVDCSEYPKPACPKDYRPVCGSDNKTYS NKCNFCNAVVESNGTLTLNHFGKC Ovomucoid SEQIDNO: MAGVFVLLSLVLCCFPDAAFGVEVDCSRFPNTTNEEGKDVLLCTKELSPVCGTDGVTYSNEC [Anas 170 LLCAYNIEYGTNISKDHDGECKEAVPADCSMYPNMTNEEGKMTLLCNKMFSPVCGTDGVTY platyrhynchos] DNECMLCAHNVEQGTSVGKKYDGKCKKEVATVDCSGYPKPACTMEYMPLCGSDNKTYGNK CNFCNAVVDSNGTLTLSHFGEC Ovomucoid, SEQIDNO: QVDCSRFPNTTNEEGKEVLLCTKELSPVCGTDGVTYSNECLLCAYNIEYGTNISKDHDGECKE partial[Anas 171 AVPADCSMYPNMTNEEGKMTLLCNKMFSPVCGTDGVTYDNECMLCAHNVEQGTSVGKKYD platyrhynchos] GKCKKEVATVSVDCSGYPKPACTMEYMPLCGSDNKTYGNKCNFCNAVV Ovomucoid- SEQIDNO: MTMPGAFVVLSFVLCCFPDATFGVEVDCSTYPNTTNEEGKEVLVCSKILSPICGTDGVTYSNE like[Tyto 172 CLLCANNIEYGTNISKYHDGECKEFVPVNCSRYPNTTNEEGKVMLICNKDLSPVCGTDGVTYD alba] NECLLCAHNLEPGTSVGKKYDGECKKEIATVDCSDYPKPVCSLESMPLCGSDNKTYSNKCNF CNAVVDSNETLTLSHFGKC Ovomucoid SEQIDNO: MTMAGVFVLLSFALCCFPDAAFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNE [Balearica 173 CLLCAYNIEYGTNVSKDHDGECKEVVPVDCSRYPNSTNEEGKVVMLCSKDLNPVCGTDGVT regulorum YDNECVLCAHNVESGTSVGKKYDGECKKETATVDCSDYPKPACTLEYMPFCGSDSKTYSNK gibbericeps] CNFCNAVVDSNGTLTLSHFGKC Turkey SEQIDNO: MTTAGVFVLLSFALCSFPDAAFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNEC vulture 174 LLCAYNIEYGTNVSKDHDGECKEFVPVDCSRYPNTTNEDGKVVLLCNKDLSPICGTDGVTYD 
[Cathartes NECLLCARNLEPGTSVGKKYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKCNF aura]OVD CNAVVDSNGTLTLSHFGKC (native sequence) Ovomucoid- SEQIDNO: MTTAGVFVLLSFTLCSFPDAAFGVEVDCSPYPNTTNEEGKEVLVCNKILSPICGTDGVTYSNEC like[Cuculus 175 LLCAYNLEYGTNISKDYDGECKEVAPVDCSRHPNTTNEEGKVELLCNKDLNPICGTNGVTYD canorus] NECLLCARNLESGTSIGKKYDGECKKEIATVDCSDYPKPVCTLEEMPLCGSDNKTYGNKCNFC NAVVDSNGTLTLSHFGKC Ovomucoid SEQIDNO: MTTAVVFVLLSFALCCFPDAAFGVEVDCSTYPNSTNEEGKDVLVCPKILGPICGTDGVTYSNE [Antrostomus 176 CLLCAYNIQYGTNVSKDHDGECKEIVPVDCSRYPNTTNEEGKVVFLCNKNFDPVCGTDGDTY carolinensis] DNECMLCARSLEPGTTVGKKHDGECKREIATVDCSDYPKPTCSAEDMPLCGSDSKTYSNKCN FCNAVVDSNGTLTLSRFGKC Ovomucoid SEQIDNO: MTMTGVFVLLSFAICCFPDAAFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNEC [Cariama 177 LLCAYNIEYGTNVSKDHDGECKEVVPVDCSKYPNTTNEEGKVVLLCSKDLSPVCGTDGVTYD cristata] NECLLCARNLEPGSSVGKKYDGECKKEIATIDCSDYPKPVCSLEYMPLCGSDSKTYDNKCNFC NAVVDSNGTLTLSHFGKC Ovomucoid- SEQIDNO: MTTAGVFVLLSFVLCCFPDAVFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNE likeisoform 178 CLLCAYNIEYGTNVSKDHDGECKEVVPVNCSRYPNTTNEEGKVVLRCSKDLSPVCGTDGVTY X2 DNECLMCARNLEPGAVVGKNYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKC [Pygoscelis NFCNAVVDSNGTLTLSHFGKC adeliae] Ovomucoid- SEQIDNO: MTTAGVFVLLSIALCCFPDAAFGVEVDCSAYSNTTSEEGKEVLSCTKILSPICGTDGVTYSNEC like[Nipponia 179 LLCAYNIEYGTNISKDHDGECKEVVSVDCSRYPNTTNEEGKAVLLCNKDLSPVCGTDGVTYD nippon] NECLLCAHNLEPGTSVGKKYDGACKKEIATVDCSDYPKPVCTLEYLPLCGSDSKTYSNKCDF CNAVVDSNGTLTLSHFGKC Ovomucoid- SEQIDNO: MTTAGVFVLLSFALCCFPDAAFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGTTYSNEC like[Phaethon 180 LLCAYNIEYGTNVSKDHDGECKVVPVDCSKYPNTTNEDGKVVLLCNKALSPICGTDRVTYDN lepturus] ECLMCAHNLEPGTSVGKKHDGECQKEVATVDCSDYPKPVCSLEYMPLCGSDGKTYSNKCNF CNAVVNSNGTLTLSHFEKC Ovomucoid- SEQIDNO: MTTAGVFVLLSFVLCCFFPDAAFGVEVDCSTYPNTTNEEGKEVLVCAKILSPVCGTDGVTYSN likeisoform 181 ECLLCAHNIENGTNVGKDHDGKCKEAVPVDCSRYPNTTDEEGKVVLLCNKDVSPVCGTDGV X1 TYDNECLLCAHNLEAGTSVDKKNDSECKTEDTTLAAVSVDCSDYPKPVCTLEYLPLCGSDNK [Melopsittacus TYSNKCRFCNAVVDSNGTLTLSRFGKC undulatus] Ovomucoid SEQIDNO: 
MTTAGVFVLLSFALCCSPDAAFGVEVDCSTYPNTTNEEGKEVLACTKILSPICGTDGVTYSNE [Podiceps 182 CLLCAYNMEYGTNVSKDHDGKCKEVVPVDCSRYPNTTNEEGKVVLLCNKDLSPVCGTDGVT cristatus] YDNECLLCARNLEPGASVGKKYDGECKKEIATVDCSDYPKPVCSLEHMPLCGSDSKTYSNKC TFCNAVVDSNGTLTLSHFGKC Ovomucoid- SEQIDNO: MTTAGVFVLLSFALCCFPDAAFGVEVDCSTYPNTTNEEGREVLVCTKILSPICGTDGVTYSNEC like[Fulmarus 183 LLCAYNIEYGTNVSKDHDGECKEVAPVGCSRYPNTTNEEGKVVLLCNKDLSPVCGTDGVTYD glacialis] NECLLCARHLEPGTSVGKKYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKCNF CNAVLDSNGTLTLSHFGKC Ovomucoid SEQIDNO: MTTAGVFVLLSFALCCFPDAVFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNE [Aptenodytes 184 CLLCAYNIEYGTNVSKDHDGECKEVVPVDCSRYPNTTNEEGKVVLRCNKDLSPVCGTDGVTY forsteri] DNECLMCARNLEPGAIVGKKYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKCN FCNAVVDSNGTLILSHFGKC Ovomucoid- SEQIDNO: MTTAGVFVLLSFVLCCFPDAVFGVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNE likeisoform 185 CLLCAYNIEYGTNVSKDHDGECKEVVPVDCSRYPNTTNEEGKVVLRCSKDLSPVCGTDGVTY X1 DNECLMCARNLEPGAVVGKNYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKC [Pygoscelis NFCNAVVDSNGTLTLSHFGKC adeliae] Ovomucoid SEQIDNO: MSSQNQLPSRCRPLPGSQDLNKYYQPHCTGDRFCWLFYVTVEQFRHCICIYLQLALERPSHEQ isoformX1 186 SGQPADSRNTSTMTTAGVFVLLSFALCCFPDAVFGVEVDCSTYPNTTNEEGKEVLVCTKILSPI [Aptenodytes CGTDGVTYSNECLLCAYNIEYGTNVSKDHDGECKEVVPVDCSRYPNTTNEEGKVVLRCNKD forsteri] LSPVCGTDGVTYDNECLMCARNLEPGAIVGKKYDGECKKEIATVDCSDYPKPVCSLEYMPLC GSDSKTYSNKCNFCNAVVDSNGTLILSHFGKC Ovomucoid, SEQIDNO: MTTAVVFVLLSFALCCFPDAAFGVEVDCSTYPNSTNEEGKDVLVCPKILGPICGTDGVTYSNE partial 187 CLLCAYNIQYGTNVSKDHDGECKEIVPVDCSRYPNTTNEEGKVVFLCNKNFDPVCGTDGDTY [Antrostomus DNECMLCARSLEPGTTVGKKHDGECKREIATVDCSDYPKPTCSAEDMPLCGSDSKTYSNKCN carolinensis] FCNAVV rOVDas SEQIDNO: EAEAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEFGTNISKEHD expressedin 188 GECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAHKVEQGASVD pichia KRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVVESNGTLTLS secretedform HFGKC 1 rOVDas SEQIDNO: EEGVSLEKREAEAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTNDCLLCAYSIEF expressedin 189 
GTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVTYDNECLLCAH pichia KVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYGNKCNFCNAVV secretedform ESNGTLTLSHFGKC 2 rOVD[gallus] SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL coding 190 FINTTIASIAAKEEGVSLEKREAEAAEVDCSRFPNATDKEGKDVLVCNKDLRPICGTDGVTYTN sequence DCLLCAYSIEFGTNISKEHDGECKETVPMNCSSYANTTSEDGKVMVLCNRAFNPVCGTDGVT containingan YDNECLLCAHKVEQGASVDKRHDGGCRKELAAVSVDCSEYPKPDCTAEDRPLCGSDNKTYG alphamating NKCNFCNAVVESNGTLTLSHFGKC factorsignal sequence (bolded)as expressedin pichia Turkey SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL vultureOVD 191 FINTTIASIAAKEEGVSLEKREAEAVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNE coding CLLCAYNIEYGTNVSKDHDGECKEFVPVDCSRYPNTTNEDGKVVLLCNKDLSPICGTDGVTY sequence DNECLLCARNLEPGTSVGKKYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKCN containing FCNAVVDSNGTLTLSHFGKC secretion signalsas expressedin pichiabolded isanalpha matingfactor signal sequence Turkey SEQIDNO: EAEAVEVDCSTYPNTTNEEGKEVLVCTKILSPICGTDGVTYSNECLLCAYNIEYGTNVSKDHD vultureOVD 192 GECKEFVPVDCSRYPNTTNEDGKVVLLCNKDLSPICGTDGVTYDNECLLCARNLEPGTSVGK insecreted KYDGECKKEIATVDCSDYPKPVCSLEYMPLCGSDSKTYSNKCNFCNAVVDSNGTLTLSHFGK form C expressedin Pichia Hummingbird SEQIDNO: MTMAGVFVLLSFILCCFPDTAFGVEVDCSIYPNTTSEEGKEVLVCTETLSPICGSDGVTYNNEC OVD(native 193 QLCAYNVEYGTNVSKDHDGECKEIVPVDCSRYPNTTEEGRVVMLCNKALSPVCGTDGVTYD sequence) NECLLCARNLESGTSVGKKFDGECKKEIATVDCTDYPKPVCSLDYMPLCGSDSKTYSNKCNF CNAVMDSNGTLTLNHFGKC Hummingbird SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL OVDcoding 194 FINTTIASIAAKEEGVSLDKREAEAVEVDCSIYPNTTSEEGKEVLVCTETLSPICGSDGVTYNNE sequenceas CQLCAYNVEYGTNVSKDHDGECKEIVPVDCSRYPNTTEEGRVVMLCNKALSPVCGTDGVTY expressedin DNECLLCARNLESGTSVGKKFDGECKKEIATVDCTDYPKPVCSLDYMPLCGSDSKTYSNKCN Pichia FCNAVMDSNGTLTLNHFGKC Hummingbird SEQIDNO: EAEAVEVDCSIYPNTTSEEGKEVLVCTETLSPICGSDGVTYNNECQLCAYNVEYGTNVSKDHD OVDin 195 GECKEIVPVDCSRYPNTTEEGRVVMLCNKALSPVCGTDGVTYDNECLLCARNLESGTSVGKK 
secretedform FDGECKKEIATVDCTDYPKPVCSLDYMPLCGSDSKTYSNKCNFCNAVMDSNGTLTLNHFGKC fromPichia Ovalbumin SEQIDNO: MFFYNTDFRMGSISAANAEFCFDVFNELKVQHTNENILYSPLSIIVALAMVYMGARGNTEYQ relatedprotein 196 MEKALHFDSIAGLGGSTQTKVQKPKCGKSVNIHLLFKELLSDITASKANYSLRIANRLYAEKSR X PILPIYLKCVKKLYRAGLETVNFKTASDQARQLINSWVEKQTEGQIKDLLVSSSTDLDTTLVLV NAIYFKGMWKTAFNAEDTREMPFHVTKEESKPVQMMCMNNSFNVATLPAEKMKILELPFAS GDLSMLVLLPDEVSGLERIEKTINFEKLTEWTNPNTMEKRRVKVYLPQMKIEEKYNLTSVLM ALGMTDLFIPSANLTGISSAESLKISQAVHGAFMELSEDGIEMAGSTGVIEDIKHSPELEQFRAD HPFLFLIKHNPTNTIVYFGRYWSP* Ovalbumin SEQIDNO: MDSISVTNAKFCFDVFNEMKVHHVNENILYCPLSILTALAMVYLGARGNTESQMKKVLHFDS relatedprotein 197 ITGAGSTTDSQCGSSEYVHNLFKELLSEITRPNATYSLEIADKLYVDKTFSVLPEYLSCARKFYT Y GGVEEVNFKTAAEEARQLINSWVEKETNGQIKDLLVSSSIDFGTTMVFINTIYFKGIWKIAFNT EDTREMPFSMTKEESKPVQMMCMNNSFNVATLPAEKMKILELPYASGDLSMLVLLPDEVSGL ERIEKTINFDKLREWTSTNAMAKKSMKVYLPRMKIEEKYNLTSILMALGMTDLFSRSANLTGI SSVDNLMISDAVHGVFMEVNEEGTEATGSTGAIGNIKHSLELEEFRADHPFLFFIRYNPTNAILF FGRYWSP* Ovalbumin SEQIDNO: MGSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAMVYLGAKDSTRTQINKVVRFDKL 198 PGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEERYPILPEYLQCVKELYRG GLEPINFQTAADQARELINSWVESQINGIIRNVLQPSSVDSQTAMVLVNAIVFKGLWEKAFKD EDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKILELPFASGTMSMLVLLPDEVSGL EQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYNLTSVLMAMGITDVFSSSANLSGIS SAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVSEEFRADHPFLFCIKHIATNAVLFFG RCVSP* Chicken SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL Ovalbumin 199 FINTTIASIAAKEEGVSLDKREAEAGSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAM withbolded VYLGAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASR signal LYAEERYPILPEYLQCVKELYRGGLEPINFQTAADQARELINSWVESQINGIIRNVLQPSSVDS sequence QTAMVLVNAIVFKGLWEKAFKDEDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKI LELPFASGTMSMLVLLPDEVSGLEQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYN LTSVLMAMGITDVFSSSANLSGISSAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVS EEFRADHPFLFCIKHIATNAVLFFGRCVSP ChickenOVA SEQIDNO: 
EAEAGSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAMVYLGAKDSTRTQINKVVRF sequenceas 200 DKLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEERYPILPEYLQCVKEL secretedfrom YRGGLEPINFQTAADQARELINSWVESQTNGIIRNVLQPSSVDSQTAMVLVNAIVFKGLWEKA pichia FKDEDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKILELPFASGTMSMLVLLPDEV SGLEQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYNLTSVLMAMGITDVFSSSANL SGISSAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVSEEFRADHPFLFCIKHIATNAV LFFGRCVSP Predicted SEQIDNO: MRVPAQLLGLLLLWLPGARCGSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAMVY Ovalbumin 201 LGAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLY [Achromobacter AEERYPILPEYLQCVKELYRGGLEPINFQTAADQARELINSWVESQTNGIIRNVLQPSSVDSQT denitrificans] AMVLVNAIVFKGLWEKAFKDEDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKILE LPFASGTMSMLVLLPDEVSGLEQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYNLT SVLMAMGITDVFSSSANLSGISSAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVSEEF RADHPFLFCIKHIATNAVLFFGRCVSPLEIKRAAAHHHHHH OLLAS SEQIDNO: MTSGFANELGPRLMGKLTMGSIGAASMEFCFDVFKELKVHHANENIFYCPIAIMSALAMVYL epitope- 202 GAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYA tagged EERYPILPEYLQCVKELYRGGLEPINFQTAADQARELINSWVESQTNGIIRNVLQPSSVDSQTA ovalbumin MVLVNAIVFKGLWEKTFKDEDTQAMPFRVTEQESKPVQMMYQIGLFRVASMASEKMKILEL PFASGTMSMLVLLPDEVSGLEQLESIINFEKLTEWTSSNVMEERKIKVYLPRMKMEEKYNLTS VLMAMGITDVFSSSANLSGISSAESLKISQAVHAAHAEINEAGREVVGSAEAGVDAASVSEEF RADHPFLFCIKHIATNAVLFFGRCVSPSR Serpinfamily SEQIDNO: MGGRRVRWEVYISRAGYVNRQIAWRRHHRSLTMRVPAQLLGLLLLWLPGARCGSIGAASME protein 203 FCFDVFKELKVHHANENIFYCPIAIMSALAMVYLGAKDSTRTQINKVVRFDKLPGFGDSIEAQ [Achromobacter CGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEERYPILPEYLQCVKELYRGGLEPINFQTA denitrificans] ADQARELINSWVESQTNGIIRNVLQPSSVDSQTAMVLVNAIVFKGLWEKAFKDEDTQAMPFR VTEQESKPVQMMYQIGLFRVASMASEKMKILELPFASGTMSMLVLLPDEVSGLEQLESIINFE KLTEWTSSNVMEERKIKVYLPRMKMEEKYNLTSVLMAMGITDVFSSSANLSGISSAESLKISQ AVHAAHAEINEAGREVVGSAEAGVDAASVSEEFRADHPFLFCIKHIATNAVLFFGRCVSPLEIK RAAAHHHHHH PREDICTED: SEQIDNO: MGSIGAVSMEFCFDVFKELKVHHANENIFYSPFTIISALAMVYLGAKDSTRTQINKVVRFDKLP ovalbumin 204 
GFGDSVEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEETYPILPEYLQCVKELYRG isoformX1 GLESINFQTAADQARGLINSWVESQTNGMIKNVLQPSSVDSQTAMVLVNAIVFKGLWEKAFK [Meleagris DEDTQAIPFRVTEQESKPVQMMYQIGLFKVASMASEKMKILELPFASGTMSMWVLLPDEVSG gallopavo] LEQLETTISFEKMTEWISSNIMEERRIKVYLPRMKMEEKYNLTSVLMAMGITDLFSSSANLSGI SSAGSLKISQAVHAAYAEIYEAGREVIGSAEAGADATSVSEEFRVDHPFLYCIKHNLTNSILFFG RCISP Ovalbumin SEQIDNO: MGSIGAVSMEFCFDVFKELKVHHANENIFYSPFTIISALAMVYLGAKDSTRTQINKVVRFDKLP precursor 205 GFGDSVEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEETYPILPEYLQCVKELYRG [Meleagris GLESINFQTAADQARGLINSWVESQTNGMIKNVLQPSSVDSQTAMVLVNAIVFKGLWEKAFK gallopavo] DEDTQAIPFRVTEQESKPVQMMYQIGLFKVASMASEKMKILELPFASGTMSMWVLLPDEVSG LEQLETTISFEKMTEWISSNIMEERRIKVYLPRMKMEEKYNLTSVLMAMGITDLFSSSANLSGI SSAGSLKISQAAHAAYAEIYEAGREVIGSAEAGADATSVSEEFRVDHPFLYCIKHNLTNSILFFG RCISP Hypothetical SEQIDNO: YYRVPCMVLCTAFHPYIFIVLLFALDNSEFTMGSIGAVSMEFCFDVFKELRVHHPNENIFFCPF protein 206 AIMSAMAMVYLGAKDSTRTQINKVIRFDKLPGFGDSTEAQCGKSANVHSSLKDILNQITKPND [Bambusicola VYSFSLASRLYADETYSIQSEYLQCVNELYRGGLESINFQTAADQARELINSWVESQINGIIRN thoracicus] VLQPSSVDSQTAMVLVNAIVFRGLWEKAFKDEDTQTMPFRVTEQESKPVQMMYQIGSFKVAS MASEKMKILELPLASGTMSMLVLLPDEVSGLEQLETTISFEKLTEWTSSNVMEERKIKVYLPR MKMEEKYNLTSVLMAMGITDLFRSSANLSGISLAGNLKISQAVHAAHAEINEAGRKAVSSAE AGVDATSVSEEFRADRPFLFCIKHIATKVVFFFGRYTSP Eggalbumin SEQIDNO: MGSIGAASMEFCFDVFKELKVHHANDNMLYSPFAILSTLAMVFLGAKDSTRTQINKVVHFDK 207 LPGFGDSIEAQCGTSVNVHSSLRDILNQITKQNDAYSFSLASRLYAQETYTVVPEYLQCVKELY RGGLESVNFQTAADQARGLINAWVESQINGIIRNILQPSSVDSQTAMVLVNAIAFKGLWEKAF KAEDTQTIPFRVTEQESKPVQMMYQIGSFKVASMASEKMKILELPFASGTMSMLVLLPDDVSG LEQLESIISFEKLTEWTSSSIMEERKVKVYLPRMKMEEKYNLTSLLMAMGITDLFSSSANLSGIS SVGSLKISQAVHAAHAEINEAGRDVVGSAEAGVDATEEFRADHPFLFCVKHIETNAILLFGRC VSP Ovalbumin SEQIDNO: MASIGAVSTEFCVDVYKELRVHHANENIFYSPFTIISTLAMVYLGAKDSTRTQINKVVRFDKLP isoformX2 208 GFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLASRLYAEETYPILPEYLQCVKELYRG [Numida GLESINFQTAADQARELINSWVESQTSGIIKNVLQPSSVNSQTAMVLVNAIYFKGLWERAFKD meleagris] 
EDTQAIPFRVTEQESKPVQMMSQIGSFKVASVASEKVKILELPFVSGTMSMLVLLPDEVSGLEQ LESTISTEKLTEWTSSSIMEERKIKVFLPRMRMEEKYNLTSVLMAMGMTDLFSSSANLSGISSA ESLKISQAVHAAYAEIYEAGREVVSSAEAGVDATSVSEEFRVDHPFLLCIKHNPTNSILFFGRCI SP Ovalbumin SEQIDNO: MALCKAFHPYIFIVLLFDVDNSAFTMASIGAVSTEFCVDVYKELRVHHANENIFYSPFTIISTLA isoformX1 209 MVYLGAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSVNVHSSLRDILNQITKPNDVYSFSLAS [Numida RLYAEETYPILPEYLQCVKELYRGGLESINFQTAADQARELINSWVESQTSGIIKNVLQPSSVNS meleagris] QTAMVLVNAIYFKGLWERAFKDEDTQAIPFRVTEQESKPVQMMSQIGSFKVASVASEKVKILE LPFVSGTMSMLVLLPDEVSGLEQLESTISTEKLTEWTSSSIMEERKIKVFLPRMRMEEKYNLTS VLMAMGMTDLFSSSANLSGISSAESLKISQAVHAAYAEIYEAGREVVSSAEAGVDATSVSEEF RVDHPFLLCIKHNPTNSILFFGRCISP PREDICTED: SEQIDNO: MGSIGAASMEFCFDVFKELKVHHANDNMLYSPFAILSTLAMVFLGAKDSTRTQINKVVHFDK Ovalbumin 210 LPGFGDSIEAQCGTSANVHSSLRDILNQITKQNDAYSFSLASRLYAQETYTVVPEYLQCVKELY isoformX2 RGGLESVNFQTAADQARGLINAWVESQINGIIRNILQPSSVDSQTAMVLVNAIAFKGLWEKAF [Coturnix KAEDTQTIPFRVTEQESKPVQMMHQIGSFKVASMASEKMKILELPFASGTMSMLVLLPDDVSG japonica] LEQLESTISFEKLTEWTSSSIMEERKVKVYLPRMKMEEKYNLTSLLMAMGITDLFSSSANLSGI SSVGSLKISQAVHAAYAEINEAGRDVVGSAEAGVDATEEFRADHPFLFCVKHIETNAILLFGRC VSP PREDICTED: SEQIDNO: MGLCTAFHPYIFIVLLFALDNSEFTMGSIGAASMEFCFDVFKELKVHHANDNMLYSPFAILSTL ovalbumin 211 AMVFLGAKDSTRTQINKVVHFDKLPGFGDSIEAQCGTSANVHSSLRDILNQITKQNDAYSFSL isoformX1 ASRLYAQETYTVVPEYLQCVKELYRGGLESVNFQTAADQARGLINAWVESQINGIIRNILQPS [Coturnix SVDSQTAMVLVNAIAFKGLWEKAFKAEDTQTIPFRVTEQESKPVQMMHQIGSFKVASMASEK japonica] MKILELPFASGTMSMLVLLPDDVSGLEQLESTISFEKLTEWTSSSIMEERKVKVYLPRMKMEE KYNLTSLLMAMGITDLFSSSANLSGISSVGSLKISQAVHAAYAEINEAGRDVVGSAEAGVDAT EEFRADHPFLFCVKHIETNAILLFGRCVSP Eggalbumin SEQIDNO: MGSIGAASMEFCFDVFKELKVHHANDNMLYSPFAILSTLAMVFLGAKDSTRTQINKVVHFDK 212 LPGFGDSIEAQCGTSANVHSSLRDILNQITKQNDAYSFSLASRLYAQETYTVVPEYLQCVKELY RGGLESVNFQTAADQARGLINAWVESQINGIIRNILQPSSVDSQTAMVLVNAIAFKGLWEKAF KAEDTQTIPFRVTEQESKPVQMMHQIGSFKVASMASEKMKILELPFASGTMSMLVLLPDDVSG LEQLESTISFEKLTEWTSSSIMEERKVKVYLPRMKMEEKYNLTSLLMAMGITDLFSSSANLSGI 
SSVGSLKIPQAVHAAYAEINEAGRDVVGSAEAGVDATEEFRADHPFLFCVKHIETNAILLFGRC VSP ovalbumin SEQIDNO: MGSIGAASTEFCFDVFRELRVQHVNENIFYSPFSIISALAMVYLGARDNTRTQIDKVVHFDKLP [Anas 213 GFGESMEAQCGTSVSVHSSLRDILTQITKPSDNFSLSFASRLYAEETYAILPEYLQCVKELYKG platyrhynchos] GLESISFQTAADQARELINSWVESQINGIIKNILQPSSVDSQTTMVLVNAIYFKGMWEKAFKDE DTQAMPFRMTEQESKPVQMMYQVGSFKVAMVTSEKMKILELPFASGMMSMFVLLPDEVSGL EQLESTISFEKLTEWTSSTMMEERRMKVYLPRMKMEEKYNLTSVFMALGMTDLFSSSANMSG ISSTVSLKMSEAVHAACVEIFEAGRDVVGSAEAGMDVTSVSEEFRADHPFLFFIKHNPTNSILFF GRWMSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFRELKVQHVNENIFYSPLSIISALAMVYLGARDNTRTQIDQVVHFDKIP ovalbumin- 214 GFGESMEAQCGTSVSVHSSLRDILTEITKPSDNFSLSFASRLYAEETYTILPEYLQCVKELYKGG like LESISFQTAADQARELINSWVESQINGIIKNILQPSSVDSQTTMVLVNAIYFKGMWEKAFKDED [Ansercygnoides TQTMPFRMTEQESKPVQMMYQVGSFKLATVTSEKVKILELPFASGMMSMCVLLPDEVSGLEQ domesticus] LETTISFEKLTEWTSSTMMEERRMKVYLPRMKMEEKYNLTSVFMALGMTDLFSSSANMSGIS STVSLKMSEAVHAACVEIFEAGRDVVGSAEAGMDVTSVSEEFRADHPFLFFIKHNPSNSILFFG RWISP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLTIISALSMVYLGARENTRAQIDKVLHFDKMP Ovalbumin- 215 GFGDTIESQCGTSVSIHTSLKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKGG like[Aquila LETISFQTAAEQARELINSWVESQTNGMIKNILQPSSVDPQTKMVLVNAIYFKGVWEKAFKDE chrysaetos DTQEVPFRVTEQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGQLSMLVLLPDDVSGLE canadensis] QLESAITFEKLMAWTSSTTMEERKMKVYLPRMKIEEKYNLTSVLMALGVTDLFSSSANLSGIS SAESLKISKAVHEAFVEIYEAGSEVVGSTEAGMEVTSVSEEFRADHPFLFLIKHNPTNSILFFGR CFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLTIISALSMVYLGARENTRTQIDKVLHFDKMT Ovalbumin- 216 GFGDTVESQCGTSVSIHTSLKDIFTQITKPSDNYSLSLASRLYAEETYPILPEYLQCVKELYKGG like LETVSFQTAAEQARELINSWVESQTNGMIKNILQPSSVDPQTKMVLVNAIYFKGVWEKAFKD [Haliaeetus EDTQEVPFRVTEQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGQLSMLVLLPDDVSGL albicilla] EQLESAITSEKLMEWTSSTTMEERKMKVYLPRMKIEEKYNLTSVLMALGVTDLFSSSADLSGI SSAESLKISKAVHEAFVEIYEAGSEVVGSTEGGMEVTSVSEEFRADHPFLFLIKHKPTNSILFFG RCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLTIISALSMVYLGARENTRTQIDKVLHFDKMT Ovalbumin- 217 
GFGDTVESQCGTSVSIHTSLKDIFTQITKPSDNYSLSLASRLYAEETYPILPEYLQCVKELYKGG like LETVSFQTAAEQARELINSWVESQTNGMIKNILQPSSVDPQTKMVLVNAIYFKGVWEKAFKD [Haliaeetus EDTQEVPFRVTEQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGQLSMLVLLPDDVSGL leucocephalus] EQLESAITSEKLMEWTSSTTMEERKMKVYLPRMKIEEKYNLTSVLMALGVTDLFSSSADLSGI SSAESLKISKAVHEAFVEIYEAGSEVVGSTEGGMEVTSFSEEFRADHPFLFLIKHKPTNSILFFGR CFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin 218 GFGETIESQCGTSVSVHTSLKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKG [Fulmarus GLETTSFQTAADQARELINSWVESQTNGMIKNILQPGSVDPQTEMVLVNAIYFKGMWEKAFK glacialis] DEDTQAVPFRMTEQESKTVQMMYQIGSFKVAVMASEKMKILELPYASGELSMLVMLPDDVS GLEQLETAITFEKLMEWTSSNMMEERKMKVYLPRMKMEEKYNLTSVLMALGVTDLFSSSAN LSGISSAESLKMSEAVHEAFVEIYEAGSEVVGSTGAGMEVTSVSEEFRADHPFLFLIKHNPTNSI LFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELRVQHVNENVCYSPLIIISALSLVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 219 GFGESIESQCGTSVSVHTSLKDMFNQITKPSDNYSLSVASRLYAEERYPILPEYLQCVKELYKG like GLESISFQTAADQAREAINSWVESQTNGMIKNILQPSSVDPQTEMVLVNAIYFKGMWQKAFK [Chlamydotis DEDTQAVPFRISEQESKPVQMMYQIGSFKVAVMAAEKMKILELPYASGELSMLVLLPDEVSG macqueenii] LEQLENAITVEKLMEWTSSSPMEERIMKVYLPRMKIEEKYNLTSVLMALGITDLFSSSANLSGI SAEESLKMSEAVHQAFAEISEAGSEVVGSSEAGIDATSVSEEFRADHPFLFLIKHNATNSILFFG RCFSP PREDICTED: SEQIDNO: MGSISAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIEKVVHFDKITG Ovalbumin 220 FGESIESQCSTSVSVHTSLKDMFTQITKPSDNYSLSFASRFYAEETYPILPEYLQCVKELYKGGL like[Nipponia ETINFRTAADQARELINSWVESQTNGMIKNILQPGSVDPQTDMVLVNAIYFKGMWEKAFKDE nippon] DTQALPFRVTEQESKPVQMMYQIGSFKVAVLASEKVKILELPYASGQLSMLVLLPDDVSGLEQ LETAITVEKLMEWTSSNNMEERKIKVYLPRIKIEEKYNLTSVLMALGITDLFSSSANLSGISSAE SLKVSEAIHEAFVEIYEAGSEVAGSTEAGIEVTSVSEEFRADHPFLFLIKHNATNSILFFGRCFSP PREDICTED: SEQIDNO: MVSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 221 GFEETIESQCSTSVSVHTSLKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKGG likeisoform LETISFQTAADQARELINSWVESQTDGMIKNILQPGSVDPQTEMVLVNAIYFKGMWEKAFKDE X2[Gavia 
DTQAVPFRMTEQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGGMSMLVMLPDDVSGL stellata] EQLETAITFEKLMEWTSSNMMEERKMKVYLPRMKMEEKYNLTSVLMALGMTDLFSSSANLS GISSAESLKMSEAVHEAFVEIYEAGSEAVGSTGAGMEVTSVSEEFRADHPFLFLIKHNPTNSILF FGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin 222 GFGEPIESQCGISVSVHTSLKDMITQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKGG [Pelecanus LETISFQTAADQARELINSWVENQTNGMIKNILQPGSVDPQTEMVLVNAVYFKGMWEKAFKD crispus] EDTQAVPFRMTEQESKPVQMMYQIGSFKVAVMASEKIKILELPYASGELSMLVLLPDDVSGLE QLETAITLDKLTEWTSSNAMEERKMKVYLPRMKIEKKYNLTSVLIALGMTDLFSSSANLSGISS AESLKMSEAIHEAFLEIYEAGSEVVGSTEAGMEVTSVSEEFRADHPFLFLIKHNPTNSILFFGRC LSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLTIISALSMVYLGARENTRAQIDKVVHFDKIP Ovalbumin- 223 GFGDTTESQCGTSVSVHTSLKDMFTQITKPSDNYSVSFASRLYAEETYPILPEFLECVKELYKG like GLESISFQTAADQARELINSWVESQTNGMIKNILQPGSVDSQTEMVLVNAIYFKGMWEKAFK [Charadrius DEDTQTVPFRMTEQETKPVQMMYQIGTFKVAVMPSEKMKILELPYASGELCMLVMLPDDVS vociferus] GLEELESSITVEKLMEWTSSNMMEERKMKVFLPRMKIEEKYNLTSVLMALGMTDLFSSSANL SGISSAEPLKMSEAVHEAFIEIYEAGSEVVGSTGAGMEITSVSEEFRADHPFLFLIKHNPTNSILF FGRCVSP PREDICTED: SEQIDNO: MGSIGAVSTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 224 GSGETIEAQCGTSVSVHTSLKDMFTQITKPSENYSVGFASRLYADETYPIIPEYLQCVKELYKG like GLEMISFQTAADQARELINSWVESQTNGMIKNILQPGSVDPQTEMILVNAIYFKGVWEKAFKD [Eurypyga EDTQAVPFRMTEQESKPVQMMYQFGSFKVAAMAAEKMKILELPYASGALSMLVLLPDDVSG helias] LEQLESAITFEKLMEWTSSNMMEEKKIKVYLPRMKMEEKYNFTSVLMALGMTDLFSSSANLS GISSADSLKMSEVVHEAFVEIYEAGSEVVGSTGSGMEAASVSEEFRADHPFLFLIKHNPTNSILF FGRCFSP PREDICTED: SEQIDNO: MVSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 225 GFEETIESQVQKKQCSTSVSVHTSLKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKE likeisoform LYKGGLETISFQTAADQARELINSWVESQTDGMIKNILQPGSVDPQTEMVLVNAIYFKGMWE X1[Gavia KAFKDEDTQAVPFRMTEQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGGMSMLVMLP stellata] DDVSGLEQLETAITFEKLMEWTSSNMMEERKMKVYLPRMKMEEKYNLTSVLMALGMTDLF 
SSSANLSGISSAESLKMSEAVHEAFVEIYEAGSEAVGSTGAGMEVTSVSEEFRADHPFLFLIKH NPTNSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASGEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIIG Ovalbumin- 226 FGESIESQCGTSVSVHTSLKDMFAQITKPSDNYSLSFASRLYAEETFPILPEYLQCVKELYKGGL like[Egretta ETLSFQTAADQARELINSWVESQTNGMIKDILQPGSVDPQTEMVLVNAIYFKGVWEKAFKDE garzetta] DTQTVPFRMTEQESKPVQMMYQIGSFKVAVVAAEKIKILELPYASGALSMLVLLPDDVSSLEQ LETAITFEKLTEWTSSNIMEERKIKVYLPRMKIEEKYNLTSVLMDLGITDLFSSSANLSGISSAES LKVSEAIHEAIVDIYEAGSEVVGSSGAGLEGTSVSEEFRADHPFLFLIKHNPTSSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 227 GSGEAIESQCGTSVSVHISLKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKEG like[Balearica LATISFQTAADQAREFINSWVESQTNGMIKNILQPGSVDPQTQMVLVNAIYFKGVWEKAFKDE regulorum DTQAVPFRMTKQESKPVQMMYQIGSFKVAVMASEKMKILELPYASGQLSMLVMLPDDVSGL gibbericeps] EQIENAITFEKLMEWTNPNMMEERKMKVYLPRMKMEEKYNLTSVLMALGMTDLFSSSANLS GISSAESLKMSEAVHEAFVEIYEAGSEVVGSTGAGIEVTSVSEEFRADHPFLFLIKHNPTNSILFF GRCFSP PREDICTED: SEQIDNO: MGSIGEASTEFCIDVFRELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDQVVHFDKITG Ovalbumin- 228 FGDTVESQCGSSLSVHSSLKDIFAQITQPKDNYSLNFASRLYAEETYPILPEYLQCVKELYKGG like[Nestor LETISFQTAADQARELINSWVESQTNGMIKNILQPSSVDPQTEMVLVNAIYFKGVWEKAFKDE notabilis] ETQAVPFRITEQENRPVQIMYQFGSFKVAVVASEKIKILELPYASGQLSMLVLLPDEVSGLEQL ENAITFEKLTEWTSSDIMEEKKIKVFLPRMKIEEKYNLTSVLVALGIADLFSSSANLSGISSAESL KMSEAVHEAFVEIYEAGSEVVGSSGAGIEAASDSEEFRADHPFLFLIKHKPTNSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDIFNELKVQHVNENIFYSPLSIISALSMVYLGARENTKAQIDKVVHFDKITG Ovalbumin- 229 FGESIESQCSTSASVHTSFKDMFTQITKPSDNYSLSFASRLYAEETYPILPEYSQCVKELYKGGL like ESISFQTAADQARELINSWVESQTNGMIKNILQPGSVDPQTELVLVNAIYFKGTWEKAFKDKD [Pygoscelis TQAVPFRVTEQESKPVQMMYQIGSYKVAVIASEKMKILELPYASGELSMLVLLPDDVSGLEQL adeliae] ETAITFEKLMEWTSSNMMEERKVKVYLPRMKIEEKYNLTSVLMALGMTDLFSPSANLSGISSA ESLKMSEAIHEAFVEIYEAGSEVVGSTEAGMEVTSVSEEFRADHPFLFLIKCNLTNSILFFGRCF SP Ovalbumin- SEQIDNO: MGSISTASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIEKVVHFDKITG like[Athene 
230 FGESIESQCGTSVSVHTSLKDMLIQISKPSDNYSLSFASKLYAEETYPILPEYLQCVKELYKGGL cunicularia] ESINFQTAADQARQLINSWVESQTNGMIKDILQPSSVDPQTEMVLVNAIYFKGIWEKAFKDED TQEVPFRITEQESKPVQMMYQIGSFKVAVIASEKIKILELPYASGELSMLIVLPDDVSGLEQLET AITFEKLIEWTSPSIMEERKTKVYLPRMKIEEKYNLTSVLMALGMTDLFSPSANLSGISSAESLK MSEAIHEAFVEIYEAGSEVVGSAEAGMEATSVSEFRVDHPFLFLIKHNPANIILFFGRCVSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLTIISALSLVYLGARENTRAQIDKVFHFDKISG Ovalbumin- 231 FGETTESQCGTSVSVHTSLKEMFTQITKPSDNYSVSFASRLYAEDTYPILPEYLQCVKELYKGG like[Calidris LETISFQTAADQAREVINSWVESQTNGMIKNILQPGSVDSQTEMVLVNAIYFKGMWEKAFKD pugnax] EDTQTMPFRITEQERKPVQMMYQAGSFKVAVMASEKMKILELPYASGEFCMLIMLPDDVSGL EQLENSFSFEKLMEWTTSNMMEERKMKVYIPRMKMEEKYNLTSVLMALGMTDLFSSSANLS GISSAETLKMSEAVHEAFMEIYEAGSEVVGSTGSGAEVTGVYEEFRADHPFLFLVKHKPTNSIL FFGRCVSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDIFNELKVQHVNENIFYSPLSIISALSMVYLGARENTKAQIDKVVHFDKITG Ovalbumin 232 FGETIESQCSTSVSVHTSLKDTFTQITKPSDNYSLSFASRLYAEETYPILPEYSQCVKELYKGGL [Aptenodytes ETISFQTAADQARELINSWVESQTNGMIKNILQPGSVDPQTELVLVNAIYFKGTWEKAFKDKD forsteri] TQAVPFRVTEQESKPVQMMYQIGSYKVAVIASEKMKILELPYASRELSMLVLLPDDVSGLEQL ETAITFEKLMEWTSSNMMEERKVKVYLPRMKIEEKYNLTSVLMALGMTDLFSPSANLSGISSA ESLKMSEAVHEAFVEIYEAGSEVVGSTGAGMEVTSVSEEFRADHPFLFLIKCNPTNSILFFGRC FSP PREDICTED: SEQIDNO: MGSISAASAEFCLDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT Ovalbumin- 233 GSGETIEFQCGTSANIHPSLKDMFTQITRLSDNYSLSFASRLYAEERYPILPEYLQCVKELYKGG like[Pterocles LETISFQTAADQARELINSWVESQTNGMIKNILQPGSVNPQTEMVLVNAIYFKGLWEKAFKDE gutturalis] DTQTVPFRMTEQESKPVQMMYQVGSFKVAVMASDKIKILELPYASGELSMLVLLPDDVTGLE QLETSITFEKLMEWTSSNVMEERTMKVYLPHMRMEEKYNLTSVLMALGVTDLFSSSANLSGI SSAESLKMSEAVHEAFVEIYESGSQVVGSTGAGTEVTSVSEEFRVDHPFLFLIKHNPTNSILFFG RCFSP Ovalbumin- SEQIDNO: MGSIGAASVEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTKAQIDKVVHFDKIA like[Falco 234 GFGEAIESQCVTSASIHSLKDMFTQITKPSDNYSLSFASRLYAEEAYSILPEYLQCVKELYKGGL peregrinus] ETISFQTAADQARDLINSWVESQTNGMIKNILQPGAVDLETEMVLVNAIYFKGMWEKAFKDE DTQTVPFRMTEQESKPVQMMYQVGSFKVAVMASDKIKILELPYASGQLSMVVVLPDDVSGL 
EQLEASITSEKLMEWTSSSIMEEKKIKVYFPHMKIEEKYNLTSVLMALGMTDLFSSSANLSGIS SAEKLKVSEAVHEAFVEISEAGSEVVGSTEAGTEVTSVSEEFKADHPFLFLIKHNPTNSILFFGR CFSP PREDICTED: SEQIDNO: MGSIGAASSEFCFDIFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVPFDKITA Ovalbumin- 235 SGESIESQCSTSVSVHTSLKDIFTQITKSSDNHSLSFASRLYAEETYPILPEYLQCVKELYEGGLE likeisoform TISFQTAADQARELINSWIESQTNGRIKNILQPGSVDPQTEMVLVNAIYFKGMWEKAFKDEDT X2 QAVPFRMTEQESKPVQVMHQIGSFKVAVLASEKIKILELPYASGELSMLVLLPDDVSGLEQLE [Phalacrocorax TAITFEKLMEWTSPNIMEERKIKVFLPRMKIEEKYNLTSVLMALGITDLFSPLANLSGISSAESL carbo] KMSEAIHEAFVEISEAGSEVIGSTEAEVEVINDPEEFRADHPFLFLIKHNPTNSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKAQYVNENIFYSPMTIITALSMVYLGSKENTRAQIAKVAHFDKIT Ovalbumin- 236 GFGESIESQCGASASIQFSLKDLFTQITKPSGNHSLSVASRIYAEETYPILPEYLECMKELYKGGL like[Merops ETINFQTAANQARELINSWVERQTSGMIKNILQPSSVDSQTEMVLVNAIYFRGLWEKAFKVED nubicus] TQATPFRITEQESKPVQMMHQIGSFKVAVVASEKIKILELPYASGRLTMLVVLPDDVSGLKQL ETTITFEKLMEWTTSNIMEERKIKVYLPRMKIEEKYNLTSVLMALGLTDLFSSSANLSGISSAES LKMSEAVHEAFVEIYEAGSEVVASAEAGMDATSVSEEFRADHPFLFLIKDNTSNSILFFGRCFS P PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKGQHVNENIFFCPLSIVSALSMVYLGARENTRAQIVKVAHFDKIA Ovalbumin- 237 GFAESIESQCGTSVSIHTSLKDMFTQITKPSDNYSLNFASRLYAEETYPIIPEYLQCVKELYKGG like[Tauraco LETISFQTAADQAREIINSWVESQTNGMIKNILRPSSVHPQTELVLVNAVYFKGTWEKAFKDE erythrolophus] DTQAVPFRITEQESKPVQMMYQIGSFKVAAVTSEKMKILEVPYASGELSMLVLLPDDVSGLEQ LETAITAEKLIEWTSSTVMEERKLKVYLPRMKIEEKYNLTTVLTALGVTDLFSSSANLSGISSA QGLKMSNAVHEAFVEIYEAGSEVVGSKGEGTEVSSVSDEFKADHPFLFLIKHNPTNSIVFFGRC FSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVHHVNENILYSPLAIISALSMVYLGAKENTRDQIDKVVHFDKIT Ovalbumin- 238 GIGESIESQCSTAVSVHTSLKDVFDQITRPSDNYSLAFASRLYAEKTYPILPEYLQCVKELYKGG like[Cuculus LETIDFQTAADQARQLINSWVEDETNGMIKNILRPSSVNPQTKIILVNAIYFKGMWEKAFKDED canorus] TQEVPFRITEQETKSVQMMYQIGSFKVAEVVSDKMKILELPYASGKLSMLVLLPDDVYGLEQL ETVITVEKLKEWTSSIVMEERITKVYLPRMKIMEKYNLTSVLTAFGITDLFSPSANLSGISSTESL KVSEAVHEAFVEIHEAGSEVVGSAGAGIEATSVSEEFKADHPFLFLIKHNPTNSILFFGRCFSP Ovalbumin SEQIDNO: 
MGSIGAASTEFCLDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIT [Antrostomus 239 GFEDSIESQCGTSVSVHTSLKDMFTQITKPSDNYSVGFASRLYAAETYQILPEYSQCVKELYKG carolinensis] GLETINFQKAADQATELINSWVESQTNGMIKNILQPSSVDPQTQIFLVNAIYFKGMWQRAFKE EDTQAVPFRISEKESKPVQMMYQIGSFKVAVIPSEKIKILELPYASGLLSMLVILPDDVSGLEQL ENAITLEKLMQWTSSNMMEERKIKVYLPRMRMEEKYNLTSVFMALGITDLFSSSANLSGISSA ESLKMSDAVHEASVEIHEAGSEVVGSTGSGTEASSVSEEFRADHPYLFLIKHNPTDSIVFFGRCF SP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKFQHVDENIFYSPLTIISALSMVYLGARENTRAQIDKVVHFDKIA Ovalbumin- 240 GFEETVESQCGTSVSVHTSLKDMFAQITKPSDNYSLSFASRLYAEETYPILPEYLQCVKELYKG like GLETISFQTAADQARDLINSWVESQTNGMIKNILQPSSVGPQTELILVNAIYFKGMWQKAFKD [Opisthocomus EDTQEVPFRMTEQQSKPVQMMYQTGSFKVAVVASEKMKILALPYASGQLSLLVMLPDDVSG hoazin] LKQLESAITSEKLIEWTSPSMMEERKIKVYLPRMKIEEKYNLTSVLMALGITDLFSPSANLSGIS SAESLKMSQAVHEAFVEIYEAGSEVVGSTGAGMEDSSDSEEFRVDHPFLFFIKHNPTNSILFFG RCFSP PREDICTED: SEQIDNO: MGSIGPLSVEFCCDVFKELRIQHPRENIFYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPG Ovalbumin- 241 FGESIESQCGTSLSIHTSLKDIFTQITKPSDNYTVGIASRLYAEEKYPILPEYLQCIKELYKGGLEP like INFQTAAEQARELINSWVESQTNGMIKNILQPSSVNPETDMVLVNAIYFKGLWEKAFKDEDIQ [Lepidothrix TVPFRITEQESKPVQMMFQIGSFRVAEITSEKIRILELPYASGQLSLWVLLPDDISGLEQLETAIT coronata] FENLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAESLKVSS AFHEASVEIYEAGSKVVGSTGAEVEDTSVSEEFRADHPFLFLIKHNPSNSIFFFGRCFSP PREDICTED: SEQIDNO: MGSIGTASAEFCFDVFKELKVHHVNENIFYSPLSIISALSMVYLGARENTKTQMEKVIHFDKIT Ovalbumin 242 GLGESMESQCGTGVSIHTALKDMLSEITKPSDNYSLSLASRLYAEQTYAILPEYLQCIKELYKE [Struthio SLETVSFQTAADQARELINSWIESQTNGVIKNFLQPGSVDSQTELVLVNAIYFKGMWEKAFKD camelus EDTQEVPFRITEQESRPVQMMYQAGSFKVATVAAEKIKILELPYASGELSMLVLLPDDISGLEQ australis] LETTISFEKLTEWTSSNMMEDRNMKVYLPRMKIEEKYNLTSVLIALGMTDLFSPAANLSGISA AESLKMSEAIHAAYVEIYEADSEIVSSAGVQVEVTSDSEEFRVDHPFLFLIKHNPTNSVLFFGRC ISP PREDICTED: SEQIDNO: MGSIGAVSTEFSCDVFKELRIHHVQENIFYSPVTIISALSMIYLGARDSTKAQIEKAVHFDKIPGF Ovalbumin- 243 GESIESQCGTSLSIHTSIKDMFTKITKASDNYSIGIASRLYAEEKYPILPEYLQCVKELYKGGLESI like 
SFQTAAEQAREIINSWVESQTNGMIKNILQPSSVDPQTDIVLVNAIYFKGLWEKAFRDEDTQTV [Acanthisitta PFKITEQESKPVQMMYQIGSFKVAEITSEKIKILEVPYASGQLSLWVLLPDDISGLEKLETAITFE chloris] NLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTALGITDLFSSSANLSGISSAESLKVSEAF HEAIVEISEAGSKVVGSVGAGVDDTSVSEEFRADHPFLFLIKHNPTSSIFFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCFDVFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVHFDKIA Ovalbumin- 244 GFGESTESQCGTSVSAHTSLKDMSNQITKLSDNYSLSFASRLYAEETYPILPEYSQCVKELYKG like[Tyto GLESISFQTAAYQARELINAWVESQTNGMIKDILQPGSVDSQTKMVLVNAIYFKGIWEKAFKD alba] EDTQEVPFRMTEQETKPVQMMYQIGSFKVAVIAAEKIKILELPYASGQLSMLVILPDDVSGLE QLETAITFEKLTEWTSASVMEERKIKVYLPRMSIEEKYNLTSVLIALGVTDLFSSSANLSGISSA ESLRMSEAIHEAFVETYEAGSTESGTEVTSASEEFRVDHPFLFLIKHKPTNSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASSEFCFDIFKELKVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDKVVPFDKITA Ovalbumin- 245 SGESIESQVQKIQCSTSVSVHTSLKDIFTQITKSSDNHSLSFASRLYAEETYPILPEYLQCVKELY likeisoform EGGLETISFQTAADQARELINSWIESQTNGRIKNILQPGSVDPQTEMVLVNAIYFKGMWEKAF X1 KDEDTQAVPFRMTEQESKPVQVMHQIGSFKVAVLASEKIKILELPYASGELSMLVLLPDDVSG [Phalacrocorax LEQLETAITFEKLMEWTSPNIMEERKIKVFLPRMKIEEKYNLTSVLMALGITDLFSPLANLSGIS carbo] SAESLKMSEAIHEAFVEISEAGSEVIGSTEAEVEVTNDPEEFRADHPFLFLIKHNPTNSILFFGRC FSP Ovalbumin- SEQIDNO: MGSIGPLSVEFCCDVFKELRIQHARENIFYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPG like[Pipra 246 FGESIESQCGTSLSIHTSLKDIFTQITKPSDNYTVGIASRLYAEEKYPILPEYLQCIKELYKGGLEP filicauda] ISFQTAAEQARELINSWVESQINGIIKNILQPSSVNPETDMVLVNAIYFKGLWEKAFKDEGTQT VPFRITEQESKPVQMMFQIGSFRVAEIASEKIRILELPYASGQLSLWVLLPDDISGLEQLETAITF ENLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAERLKVSSA FHEASMEINEAGSKVVGAGVDDTSVSEEFRVDRPFLFLIKHNPSNSIFFFGRCFSP Ovalbumin SEQIDNO: MGSIGAASTEFCFDMFKELKVHHVNENIIYSPLSIISILSMVFLGARENTKTQMEKVIHFDKITG [Dromaius 247 FGESLESQCGTSVSVHASLKDILSEITKPSDNYSLSLASKLYAEETYPVLPEYLQCIKELYKGSL novaehollandiae] ETVSFQTAADQARELINSWVETQTNGVIKNFLQPGSVDPQTEMVLVDAIYFKGTWEKAFKDE DTQEVPFRITEQESKPVQMMYQAGSFKVATVAAEKMKILELPYASGELSMFVLLPDDISGLEQ LETTISIEKLSEWTSSNMMEDRKMKVYLPHMKIEEKYNLTSVLVALGMTDLFSPSANLSGIST 
AQTLKMSEAIHGAYVEIYEAGSEMATSTGVLVEAASVSEEFRVDHPFLFLIKHNPSNSILFFGR CIFP ChainA, SEQIDNO: MGSIGAASTEFCFDMFKELKVHHVNENIIYSPLSIISILSMVFLGARENTKTQMEKVIHFDKITG Ovalbumin 248 FGESLESQCGTSVSVHASLKDILSEITKPSDNYSLSLASKLYAEETYPVLPEYLQCIKELYKGSL ETVSFQTAADQARELINSWVETQTNGVIKNFLQPGSVDPQTEMVLVDAIYFKGTWEKAFKDE DTQEVPFRITEQESKPVQMMYQAGSFKVATVAAEKMKILELPYASGELSMFVLLPDDISGLEQ LETTISIEKLSEWTSSNMMEDRKMKVYLPHMKIEEKYNLTSVLVALGMTDLFSPSANLSGIST AQTLKMSEAIHGAYVEIYEAGSEMATSTGVLVEAASVSEEFRVDHPFLFLIKHNPSNSILFFGR CIFPHHHHHH Ovalbumin- SEQIDNO: MGSIGPLSVEFCCDVFKELRIQHARENIFYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPG like[Corapipo 249 FGESIESQCGTSLSIHTSLKDIFTQITKPSDNYTVGIASRLYAEEKYPILPEYLQCIKELYKGGLEP altera] ISFQTAAEQARELINSWVESQTNGMIKNILQPSAVNPETDMVLVNAIYFKGLWEKAFKDEGTQ TVPFRITEQESKPVQMMFQIGSFRVAEITSEKIRILELPYASGQLSLWVLLPDDISGLEQLETAIT FENLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAERLKVSS AFHEASMEIYEAGSKVVGSTGAGVDDTSVSEEFRVDRPFLFLIKHNPSNSIFFFGRCFSP Ovalbumin- SEQIDNO: MEDQRGNTGFTMGSIGAASTEFCIDVFRELRVQHVNENIFYSPLTIISALSMVYLGARENTRAQ likeprotein 250 IDQVVHFDKIAGFGDTVESQCGSSPSVHNSLKTVXAQITQPRDNYSLNLASRLYAEESYPILPE [Amazona YLQCVKELYNGGLETVSFQTAADQARELINSWVESQINGIIKNILQPSSVDPQTEMVLVNAIYF aestiva] KGLWEKAFKDEETQAVPFRITEQENRPVQMMYQFGSFKVAXVASEKIKILELPYASGQLSML VLLPDEVSGLEQNAITFEKLTEWTSSDLMEERKIKVFFPRVKIEEKYNLTAVLVSLGITDLFSSS ANLSGISSAENLKMSEAVHEAXVEIYEAGSEVAGSSGAGIEVASDSEEFRVDHPFLFLIXHNPT NSILFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASTEFCIDVFRELRVQHVNENIFYSPLSIISALSMVYLGARENTRAQIDEVFHFDKIAG Ovalbumin- 251 FGDTVDPQCGASLSVHKSLQNVFAQITQPKDNYSLNLASRLYAEESYPILPEYLQCVKELYNE like GLETVSFQTGADQARELINSWVENQTNGVIKNILQPSSVDPQTEMVLVNAIYFKGLWQKAFK [Melopsittacus DEETQAVPFRITEQENRPVQMMYQFGSFKVAVVASEKVKILELPYASGQLSMWVLLPDEVSG undulatus] LEQLENAITFEKLTEWTSSDLTEERKIKVFLPRVKIEEKYNLTAVLMALGVTDLFSSSANFSGIS AAENLKMSEAVHEAFVEIYEAGSEVVGSSGAGIEAPSDSEEFRADHPFLFLIKHNPTNSILFFGR CFSP Ovalbumin- SEQIDNO: MGSIGPLSVEFCCDVFKELRIQHARDNIFYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPG like 252 
FGESIESQCGTSLSVHTSLKDIFTQITKPRENYTVGIASRLYAEEKYPILPEYLQCIKELYKGGLE [Neopelma PISFQTAAEQARELINSWVESQTNGMIKNILQPSSVNPETDMVLVNAIYFKGLWKKAFKDEGT chrysocephalum] QTVPFRITEQESKPVQMMFQIGSFRVAEITSEKIRILELPYASGQLSLWVLLPDDISGLEQLESAI TFENLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAEKLKVS SAFHEASMEIYEAGNKVVGSTGAGVDDTSVSEEFRVDRPFLFLIKHNPSNSIFFFGRCFSP PREDICTED: SEQIDNO: MGSIGAASAEFCVDVFKELKDQHVNNIVFSPLMIISALSMVNIGAREDTRAQIDKVVHFDKITG Ovalbumin- 253 YGESIESQCGTSIGIYFSLKDAFTQITKPSDNYSLSFASKLYAEETYPILPEYLKCVKELYKGGLE like[Buceros TISFQTAADQARELINSWVESQTNGMIKNILQPSSVDPQTEMVLVNAIYFKGLWEKAFKDEDT rhinoceros QAVPFRITEQESKPVQMMYQIGSFKVAVIASEKIKILELPYASGQLSLLVLLPDDVSGLEQLESA silvestris] ITSEKLLEWTNPNIMEERKTKVYLPRMKIEEKYNLTSVLVALGITDLFSSSANLSGISSAEGLKL SDAVHEAFVEIYEAGREVVGSSEAGVEDSSVSEEFKADRPFIFLIKHNPTNGILYFGRYISP PREDICTED: SEQIDNO: MGSIGAANTDFCFDVFKELKVHHANENIFYSPLSIVSALAMVYLGARENTRAQIDKALHFDKI Ovalbumin- 254 LGFGETVESQCDTSVSVHTSLKDMLIQITKPSDNYSFSFASKIYTEETYPILPEYLQCVKELYKG like[Cariama GVETISFQTAADQAREVINSWVESHTNGMIKNILQPGSVDPQTKMVLVNAVYFKGIWEKAFK cristata] EEDTQEMPFRINEQESKPVQMMYQIGSFKLTVAASENLKILEFPYASGQLSMMVILPDEVSGL KQLETSITSEKLIKWTSSNTMEERKIRVYLPRMKIEEKYNLKSVLMALGITDLFSSSANLSGISS AESLKMSEAVHEAFVEIYEAGSEVTSSTGTEMEAENVSEEFKADHPFLFLIKHNPTDSIVFFGR CMSP Ovalbumin SEQIDNO: MGSIGPLSVEFCCDVFKELRIQHARENIFYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPG [Manacus 255 FGESIESQCGTSLSIHTSLKDIFTQITKPSDNYTVGIASRLYAEEKYPILPEYLQCIKELYKGGLEP vitellinus] ISFQTAAEQARELINSWVESQTNGMIKNILQPSSVNPETDMVLVNAIYFKGLWEKAFKDESTQ TVPFRITEQESKPVQMMFQIGSFRVAEIASEKIRILELPYASGQLSLWVLLPDDISGLEQLETAIT FENLKEWTSSTKMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAERLKVSS AFHEASMEIYEAGSRVVEAGVDDTSVSEEFRVDRPFLFLIKHNPSNSIFFFGRCFSP Ovalbumin- SEQIDNO: MGSIGPVSTEFCCDIFKELRIQHARENIIYSPVTIISALSMVYLGARDNTKAQIEKAVHFDKIPGF like 256 GESIESQCGTSLSIHTSLKDILTQITKPSDNYTVGIASRLYAEEKYPILSEYLQCIKELYKGGLEPI [Empidonax SFQTAAEQARELINSWVESQTNGMIKNILQPSSVNPETDMVLVNAIYFKGLWEKAFKDEGTQT traillii] 
VPFRITEQESKPVQMMFQIGSFKVAEITSEKIRILELPYASGKLSLWVLLPDDISGLEQLETAITF ENLKEWTSSTRMEERKIKVYLPRMKIEEKYNLTSVLTSLGITDLFSSSANLSGISSAERLKVSSA FHEVFVEIYEAGSKVEGSTGAGVDDTSVSEEFRADHPFLFLVKHNPSNSIIFFGRCYLP PREDICTED: SEQIDNO: MGSTGAASMEFCFALFRELKVQHVNENIFFSPVTIISALSMVYLGARENTRAQLDKVAPFDKIT Ovalbumin- 257 GFGETIGSQCSTSASSHTSLKDVFTQITKASDNYSLSFASRLYAEETYPILPEYLQCVKELYKGG like LESISFQTAADQARELINSWVESQTNGMIKDILRPSSVDPQTKIILITAIYFKGMWEKAFKEEDT [Leptosomus QAVPFRMTEQESKPVQMMYQIGSFKVAVIPSEKLKILELPYASGQLSMLVILPDDVSGLEQLET discolor] AITTEKLKEWTSPSMMKERKMKVYFPRMRIEEKYNLTSVLMALGITDLFSPSANLSGISSAESL KVSEAVHEASVDIDEAGSEVIGSTGVGTEVTSVSEEIRADHPFLFLIKHKPTNSILFFGRCFSP Hypothetical SEQIDNO: MEHAQLTQLVNSNMTSNTCHEADEFENIDFRMDSISVTNTKFCFDVFNEMKVHHVNENILYS protein 258 PLSILTALAMVYLGARGNTESQMKKALHFDSITGAGSTTDSQCGSSEYIHNLFKEFLTEITRTN H355_008077 ATYSLEIADKLYVDKTFTVLPEYINCARKFYTGGVEEVNFKTAAEEARQLINSWVEKETNGQI [Colinus KDLLVPSSVDFGTMMVFINTIYFKGIWKTAFNTEDTREMPFSMTKQESKPVQMMCLNDTFNM virginianus] ATLPAEKMRILELPYASGELSMLVLLPDEVSGLEQIEKAINFEKLREWTSTNAMEKKSMKVYL PRMKIEEKYNLTSTLMALGMTDLFSRSANLTGISSVENLMISDAVHGAFMEVNEEGTEAAGST GAIGNIKHSVEFEEFRADHPFLFLIRYNPTNVILFFDNSEFTMGSIGAVSTEFCFDVFKELRVHH ANENIFYSPFTVISALAMVYLGAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSANVHSSLRDI LNQITKPNDIYSFSLASRLYADETYTILPEYLQCVKELYRGGLESINFQTAADQARELINSWVES QTSGIIRNVLQPSSVDSQTAMVLVNAIYFKGLWEKGFKDEDTQAMPFRVTEQENKSVQMMY QIGTFKVASVASEKMKILELPFASGTMSMWVLLPDEVSGLEQLETTISIEKLTEWTSSSVMEER KIKVFLPRMKMEEKYNLTSVLMAMGMTDLFSSSANLSGISSTLQKKGFRSQELGDKYAKPML ESPALTPQVTAWDNSWIVAHPAAIEPDLCYQIMEQKWKPFDWPDFRLPMRVSCRFRTMEALN KANTSFALDFFKHECQEDDDENILFSPFSISSALATVYLGAKGNTADQMAKTEIGKSGNIHAGF KALDLEINQPTKNYLLNSVNQLYGEKSLPFSKEYLQLAKKYYSAEPQSVDFLGKANEIRREINS RVEHQTEGKIKNLLPPGSIDSLTRLVLVNALYFKGNWATKFEAEDTRHRPFRINMHTTKQVPM MYLRDKFNWTYVESVQTDVLELPYVNNDLSMFILLPRDITGLQKLINELTFEKLSAWTSPELM EKMKMEVYLPRFTVEKKYDMKSTLSKMGIEDAFTKVDSCGVTNVDEITTHIVSSKCLELKHIQ INKKLKCNKAVAMEQVSASIGNFTIDLFNKLNETSRDKNIFFSPWSVSSALALTSLAAKGNTAR EMAEDPENEQAENIHSGFKELMTALNKPRNTYSLKSANRIYVEKNYPLLPTYIQLSKKYYKAE 
PYKVNFKTAPEQSRKEINNWVEKQTERKIKNFLSSDDVKNSTKSILVNAIYFKAEWEEKFQAG NTDMQPFRMSKNKSKLVKMMYMRHTFPVLIMEKLNFKMIELPYVKRELSMFILLPDDIKDST TGLEQLERELTYEKLSEWADSKKMSVTLVDLHLPKFSMEDRYDLKDALKSMGMASAFNSNA DFSGMTGFQAVPMESLSASTNSFTLDLYKKLDETSKGQNIFFASWSIATALAMVHLGAKGDT ATQVAKGPEYEETENIHSGFKELLSAINKPRNTYLMKSANRLFGDKTYPLLPKFLELVARYYQ AKPQAVNFKTDAEQARAQINSWVENETESKIQNLLPAGSIDSHTVLVLVNAIYFKGNWEKRFL EKDTSKMPFRLSKTETKPVQMMFLKDTFLIHHERTMKFKIIELPYVGNELSAFVLLPDDISDNT TGLELVERELTYEKLAEWSNSASMMKAKVELYLPKLKMEENYDLKSVLSDMGIRSAFDPAQ ADFTRMSEKKDLFISKVIHKAFVEVNEEDRIVQLASGRLTGRCRTLANKELSEKNRTKNLFFSP FSISSALSMILLGSKGNTEAQIAKVLSLSKAEDAHNGYQSLLSEINNPDTKYILRTANRLYGEKT FEFLSSFIDSSQKFYHAGLEQTDFKNASEDSRKQINGWVEEKTEGKIQKLLSEGIINSMTKLVLV NAIYFKGNWQEKFDKETTKEMPFKINKNETKPVQMMFRKGKYNMTYIGDLETTVLEIPYVDN ELSMIILLPDSIQDESTGLEKLERELTYEKLMDWINPNMMDSTEVRVSLPRFKLEENYELKPTL STMGMPDAFDLRTADFSGISSGNELVLSEVVHKSFVEVNEEGTEAAAATAGIMLLRCAMIVA NFTADHPFLFFIRHNKTNSILFCGRFCSP PREDICTED: SEQIDNO: MGSIGTASTEFCFDMFKEMKVQHANQNIIFSPLTIISALSMVYLGARDNTKAQMEKVIHFDKIT Ovalbumin 259 GFGESVESQCGTSVSIHTSLKDMLSEITKPSDNYSLSLASRLYAEETYPILPEYLQCMKELYKG isoformX2 GLETVSFQTAADQARELINSWVESQTNGVIKNFLQPGSVDPQTEMVLVNAIYFKGMWEKAFK [Apteryx DEDTQEVPFRITEQESKPVQMMYQVGSFKVATVAAEKMKILEIPYTHRELSMFVLLPDDISGL australis EQLETTISFEKLTEWTSSNMMEERKVKVYLPHMKIEEKYNLTSVLMALGMTDLFSPSANLSGI mantelli] STAQTLMMSEAIHGAYVEIYEAGREMASSTGVQVEVTSVLEEVRADKPFLFFIRHNPTNSMVV FGRYMSP Hypothetical SEQIDNO: MTSNTCHEADEFENIDFRMDSISVTNTKFCFDVFNEMKVHHVNENILYSPLSILTALAMVYLG protein 260 ARGNTESQMKKALHFDSITGGGSTTDSQCGSSEYIHNLFKEFLTEITRTNATYSLEIADKLYVD ASZ78_006007 KTFTVLPEYINCARKFYTGGVEEVNFKTAAEEARQLMNSWVEKETNGQIKDLLVPSSVDFGT [Callipepla MMVFINTIYFKGIWKTAFNTEDTREMPFSMTKQESKPVQMMCLNDTFNMVTLPAEKMRILEL squamata] PYASGELSMLVLLPDEVSGLERIEKAINFEKLREWTSTNAMEKKSMKVYLPRMKIEEKYNLTS TLMALGMTDLFSRSANLTGISSVDNLMISDAVHGAFMEVNEEGTEAAGSTGAIGNIKHSVEFE EFRADHPFLFLIRYNPTNVILFFDNSEFTMGSIGAVSTEFCFDVFKELRVHHANENIFYSPFTIISA LAMVYLGAKDSTRTQINKVVRFDKLPGFGDSIEAQCGTSANVHSSLRDILNQITKPNDIYSFSL 
ASRLYADETYTILPEYLQCVKELYRGGLESINFQTAADQARELINSWVESQTSGIIRNVLQPSSV DSQTAMVLVNAIYFKGLWEKGFKDEDTQAIPFRVTEQENKSVQMMYQIGTFKVASVASEKM KILELPFASGTMSMWVLLPDEVSGLEQLETTISIEKLTEWTSSSVMEERKIKVFLPRMKMEEKY NLTSVLMAMGMTDLFSSSANLSGISSTLQKKGFRSQELGDKYAKPMLESPALTPQATAWDNS WIVAHPPAIEPDLYYQIMEQKWKPFDWPDFRLPMRVSCRFRTMEALNKANTSFALDFFKHEC QEDDSENILFSPFSISSALATVYLGAKGNTADQMAKVLHFNEAEGARNVTTTIRMQVYSRTDQ QRLNRRACFQKTEIGKSGNIHAGFKGLNLEINQPTKNYLLNSVNQLYGEKSLPFSKEYLQLAK KYYSAEPQSVDFVGTANEIRREINSRVEHQTEGKIKNLLPPGSIDSLTRLVLVNALYFKGNWAT KFEAEDTRHRPFRINTHTTKQVPMMYLSDKFNWTYVESVQTDVLELPYVNNDLSMFILLPRDI TGLQKLINELTFEKLSAWTSPELMEKMKMEVYLPRFTVEKKYDMKSTLSKMGIEDAFTKVDN CGVTNVDEITIHVVPSKCLELKHIQINKELKCNKAVAMEQVSASIGNFTIDLFNKLNETSRDKN IFFSPWSVSSALALTSLAAKGNTAREMAEDPENEQAENIHSGFNELLTALNKPRNTYSLKSAN RIYVEKNYPLLPTYIQLSKKYYKAEPHKVNFKTAPEQSRKEINNWVEKQTERKIKNFLSSDDV KNSTKLILVNAIYFKAEWEEKFQAGNTDMQPFRMSKNKSKLVKMMYMRHTFPVLIMEKLNF KMIELPYVKRELSMFILLPDDIKDSTTGLEQLERELTYEKLSEWADSKKMSVTLVDLHLPKFS MEDRYDLKDALRSMGMASAFNSNADFSGMTGERDLVISKVCHQSFVAVDEKGTEAAAATA VIAEAVPMESLSASTNSFTLDLYKKLDETSKGQNIFFASWSIATALTMVHLGAKGDTATQVAK GPEYEETENIHSGFKELLSALNKPRNTYSMKSANRLFGDKTYPLLPTKTKPVQMMFLKDTFLI HHERTMKFKIIELPYMGNELSAFVLLPDDISDNTTGLELVERELTYEKLAEWSNSASMMKVKV ELYLPKLKMEENYDLKSALSDMGIRSAFDPAQADFTRMSEKKDLFISKVIHKAFVEVNEEDRI VQLASGRLTGNTEAQIAKVLSLSKAEDAHNGYQSLLSEINNPDTKYILRTANRLYGEKTFEFLS SFIDSSQKFYHAGLEQTDFKNASEDSRKQINGWVEEKTEGKIQKLLSEGIINSMTKLVLVNAIY FKGNWQEKFDKETTKEMPFKINKNETKPVQMMFRKGKYNMTYIGDLETTVLEIPYVDNELS MIILLPDSIQDESTGLEKLERELTYEKLMDWINPNMMDSTEVRVSLPRFKLEENYELKPTLSTM GMPDAFDLRTADFSGISSGNELVLSEVVHKSFVEVNEEGTEAAAATAGIMLLRCAMIVANFTA DHPFLFFIRHNKTNSILFCGRFCSP PREDICTED: SEQIDNO: MASIGAASTEFCFDVFKELKTQHVKENIFYSPMAIISALSMVYIGARENTRAEIDKVVHFDKIT Ovalbumin- 261 GFGNAVESQCGPSVSVHSSLKDLITQISKRSDNYSLSYASRIYAEETYPILPEYLQCVKEVYKG like GLESISFQTAADQARENINAWVESQTNGMIKNILQPSSVNPQTEMVLVNAIYLKGMWEKAFK [Mesitornis DEDTQTMPFRVTQQESKPVQMMYQIGSFKVAVIASEKMKILELPYTSGQLSMLVLLPDDVSG unicolor] 
LEQVESAITAEKLMEWTSPSIMEERTMKVYLPRMKMVEKYNLTSVLMALGMTDLFTSVANL SGISSAQGLKMSQAIHEAFVEIYEAGSEAVGSTGVGMEITSVSEEFKADLSFLFLIRHNPTNSIIF FGRCISP Ovalbumin, SEQIDNO: MGSIGAASTEFCFDVFRELRVQHVNENIFYSPFSIISALAMVYLGARDNTRTQIDKISQFQALSD partial[Anas 262 EHLVLCIQQLGEFFVCTNRERREVTRYSEQTEDKTQDQNTGQIHKIVDTCMLRQDILTQITKPS platyrhynchos] DNFSLSFASRLYAEETYAILPEYLQCVKELYKGGLESISFQTAADQARELINSWVESQINGIIKN ILQPSSVDSQTTMVLVNAIYFKGMWEKAFKDEDTQAMPFRMTEQESKPVQMMYQVGSFKVA MVTSEKMKILELPFASGMMSMFVLLPDEVSGLEQLESTISFEKLTEWTSSTMMEERRMKVYLP RMKMEEKYNLTSVFMALGMTDLFSSSANMSGISSTVSLKMSEAVHAACVEIFEAGRDVVGSA EAGMDVTSVSEEFRADHPFLFFIKHNPTNSILFFGRWMSP PREDICTED: SEQIDNO: MGSIGAASAEFCLDIFKELKVQHVNENIIFSPMTIISALSLVYLGAKEDTRAQIEKVVPFDKIPGF Ovalbumin- 263 GEIVESQCPKSASVHSSIQDIFNQIIKRSDNYSLSLASRLYAEESYPIRPEYLQCVKELDKEGLETI like[Chaetura SFQTAADQARQLINSWVESQTNGMIKNILQPSSVNSQTEMVLVNAIYFRGLWQKAFKDEDTQ pelagica] AVPFRITEQESKPVQMMQQIGSFKVAEIASEKMKILELPYASGQLSMLVLLPDDVSGLEKLESS ITVEKLIEWTSSNLTEERNVKVYLPRLKIEEKYNLTSVLAALGITDLFSSSANLSGISTAESLKLS RAVHESFVEIQEAGHEVEGPKEAGIEVTSALDEFRVDRPFLFVTKHNPTNSILFLGRCLSP PREDICTED: SEQIDNO: MGSISAASGEFCLDIFKELKVQHVNENIFYSPMVIVSALSLVYLGARENTRAQIDKVIPFDKITG Ovalbumin- 264 SSEAVESQCGTPVGAHISLKDVFAQIAKRSDNYSLSFVNRLYAEETYPILPEYLQCVKELYKGG like LETISFQTAADQAREIINSWVESQTDGKIKNILQPSSVDPQTKMVLVSAIYFKGLWEKSFKDED [Apaloderma TQAVPFRVTEQESKPVQMMYQIGSFKVAALAAEKIKILELPYASEQLSMLVLLPDDVSGLEQLE vittatum] KKISYEKLTEWTSSSVMEEKKIKVYLPRMKIEEKYNLTSILMSLGITDLFSSSANLSGISSTKSLK MSEAVHEASVEIYEAGSEASGITGDGMEATSVFGEFKVDHPFLFMIKHKPTNSILFFGRCISP Ovalbumin- SEQIDNO: MGSIGPVSTEVCCDIFRELRSQSVQENVCYSPLLIISTLSMVYIGAKDNTKAQIEKAIHFDKIPGF like[Corvus 265 GESTESQCGTSVSIHTSLKDIFTQITKPSDNYSISIARRLYAEEKYPILPEYIQCVKELYKGGLESI cornixcornix] SFQTAAEKSRELINSWVESQTNGTIKNILQPSSVSSQTDMVLVSAIYFKGLWEKAFKEEDTQTI PFRITEQESKPVQMMSQIGTFKVAEIPSEKCRILELPYASGRLSLWVLLPDDISGLEQLETAITFE NLKEWTSSSKMEERKIRVYLPRMKIEEKYNLTSVLKSLGITDLFSSSANLSGISSAESLKVSAAF HEASVEIYEAGSKGVGSSEAGVDGTSVSEEIRADHPFLFLIKHNPSDSILFFGRCFSP PREDICTED: SEQIDNO: 
MGSIGAASTEFCFDVFKELKVQHVNENIIISPLSIISALSMVYLGAREDTRAQIDKVVHFDKITG Ovalbumin- 266 FGEAIESQCPTSESVHASLKETFSQLTKPSDNYSLAFASRLYAEETYPILPEYLQCVKELYKGGL like[Calypte ETINFQTAAEQARQVINSWVESQTDGMIKSLLQPSSVDPQTEMILVNAIYFRGLWERAFKDED anna] TQELPFRITEQESKPVQMMSQIGSFKVAVVASEKVKILELPYASGQLSMLVLLPDDVSGLEQLE SSITVEKLIEWISSNTKEERNIKVYLPRMKIEEKYNLTSVLVALGITDLFSSSANLSGISSAESLKI SEAVHEAFVEIQEAGSEVVGSPGPEVEVTSVSEEWKADRPFLFLIKHNPTNSILFFGRYISP PREDICTED: SEQIDNO: MGSIGPVSTEVCCDIFRELRSQSVQENVCYSPLLIISTLSMVYIGAKDNTKAQIEKAIHFDKIPGF Ovalbumin 267 GESTESQCGTSVSIHTSLKDIFTQITKPSDNYSISIARRLYAEEKYPILQEYIQCVKELYKGGLESI [Corvus SFQTAAEKSRELINSWVESQTNGTIKNILQPSSVSSQTDMVLVSAIYFKGLWEKAFKEEDTQTI brachyrhynchos] PFRITEQESKPVQMMSQIGTFKVAEIPSEKCRILELPYASGRLSLWVLLPDDISGLEQLETSITFE NLKEWTSSSKMEERKIRVYLPRMKIEEKYNLTSVLKSLGITDLFSSSANLSGISSAESLKVSAVF HEASVEIYEAGSKGVGSSEAGVDGTSVSEEIRADHPFLFLIKHNPSDSILFFGRCFSP Hypothetical SEQIDNO: MLNLMHPKQFCCTMGSIGPVSTEVCCDIFRELRSQSVQENVCYSPLLIISTLSMVYIGAKDNTK protein 268 AQIEKAIHFDKIPGFGESTESQCGTSVSIHTSLKDIFTQITKPSDNYSISIASRLYAEEKYPILPEYI DUI87_08270 QCVKELYKGGLESISFQTAAEKSRELINSWVESQTNGTIKNILQPSSVSSQTDMVLVSAIYFKG [Hirundo LWEKAFKEEDTQTVPFRITEQESKPVQMMSQIGTFKVAEIPSEKCRILELPYASGRLSLWVLLP rusticarustica] DDISGLEQLETAITSENLKEWTSSSKMEERKIKVYLPRMKIEEKYNLTSVLKSLGITDLFSSSAN LSGISSAESLKVSGAFHEAFVEIYEAGSKAVGSSGAGVEDTSVSEEIRADHPFLFFIKHNPSDSIL FFGRCFSP OstrichOVA SEQIDNO: EAEAGSIGTASAEFCFDVFKELKVHHVNENIFYSPLSIISALSMVYLGARENTKTQMEKVIHFD sequenceas 269 KITGLGESMESQCGTGVSIHTALKDMLSEITKPSDNYSLSLASRLYAEQTYAILPEYLQCIKELY secretedfrom KESLETVSFQTAADQARELINSWIESQTNGVIKNFLQPGSVDSQTELVLVNAIYFKGMWEKAF pichia KDEDTQEVPFRITEQESRPVQMMYQAGSFKVATVAAEKIKILELPYASGELSMLVLLPDDISGL EQLETTISFEKLTEWTSSNMMEDRNMKVYLPRMKIEEKYNLTSVLIALGMTDLFSPAANLSGI SAAESLKMSEAIHAAYVEIYEADSEIVSSAGVQVEVTSDSEEFRVDHPFLFLIKHNPTNSVLFFG RCISP Ostrich SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL construct 270 FINTTIASIAAKEEGVSLEKREAEAGSIGTASAEFCFDVFKELKVHHVNENIFYSPLSIISALSMV (secretion 
YLGARENTKTQMEKVIHFDKITGLGESMESQCGTGVSIHTALKDMLSEITKPSDNYSLSLASRL signal+ YAEQTYAILPEYLQCIKELYKESLETVSFQTAADQARELINSWIESQTNGVIKNFLQPGSVDSQ mature TELVLVNAIYFKGMWEKAFKDEDTQEVPFRITEQESRPVQMMYQAGSFKVATVAAEKIKILE protein) LPYASGELSMLVLLPDDISGLEQLETTISFEKLTEWTSSNMMEDRNMKVYLPRMKIEEKYNLT SVLIALGMTDLFSPAANLSGISAAESLKMSEAIHAAYVEIYEADSEIVSSAGVQVEVTSDSEEFR VDHPFLFLIKHNPTNSVLFFGRCISP DuckOVA SEQIDNO: EAEAGSIGAASTEFCFDVFRELRVQHVNENIFYSPFSIISALAMVYLGARDNTRTQIDKVVHFD sequenceas 271 KLPGFGESMEAQCGTSVSVHSSLRDILTQITKPSDNFSLSFASRLYAEETYAILPEYLQCVKELY secretedfrom KGGLESISFQTAADQARELINSWVESQINGIIKNILQPSSVDSQTTMVLVNAIYFKGMWEKAF pichia KDEDTQAMPFRMTEQESKPVQMMYQVGSFKVAMVTSEKMKILELPFASGMMSMFVLLPDE VSGLEQLESTISFEKLTEWTSSTMMEERRMKVYLPRMKMEEKYNLTSVFMALGMTDLFSSSA NMSGISSTVSLKMSEAVHAACVEIFEAGRDVVGSAEAGMDVTSVSEEFRADHPFLFFIKHNPT NSILFFGRWMSP Duck SEQIDNO: MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLL construct 272 FINTTIASIAAKEEGVSLEKREAEAGSIGAASTEFCFDVFRELRVQHVNENIFYSPFSIISALAMV (secretion YLGARDNTRTQIDKVVHFDKLPGFGESMEAQCGTSVSVHSSLRDILTQITKPSDNFSLSFASRL signal+ YAEETYAILPEYLQCVKELYKGGLESISFQTAADQARELINSWVESQINGIIKNILQPSSVDSQT mature TMVLVNAIYFKGMWEKAFKDEDTQAMPFRMTEQESKPVQMMYQVGSFKVAMVTSEKMKIL protein) ELPFASGMMSMFVLLPDEVSGLEQLESTISFEKLTEWTSSTMMEERRMKVYLPRMKMEEKYN LTSVFMALGMTDLFSSSANMSGISSTVSLKMSEAVHAACVEIFEAGRDVVGSAEAGMDVTSV SEEFRADHPFLFFIKHNPTNSILFFGRWMSP Ovoglobulin SEQIDNO: TRAPDCGGILTPLGLSYLAEVSKPHAEVVLRQDLMAQRASDLFLGSMEPSRNRITSVKVADL G2 273 WLSVIPEAGLRLGIEVELRIAPLHAVPMPVRISIRADLHVDMGPDGNLQLLTSACRPTVQAQST REAESKSSRSILDKVVDVDKLCLDVSKLLLFPNEQLMSLTALFPVTPNCQLQYLPLAAPVFSKQ GIALSLQTTFQVAGAVVPVPVSPVPFSMPELASTSTSHLILALSEHFYTSLYFTLERAGAFNMTI PSMLTTATLAQKITQVGSLYHEDLPITLSAALRSSPRVVLEEGRAALKLFLTVHIGAGSPDFQSF LSVSADVTAGLQLSVSDTRMMISTAVIEDAELSLAASNVGLVRAALLEELFLAPVCQQVPAW MDDVLREGVHLPHLSHFTYTDVNVVVHKDYVLVPCKLKLRSTMA* Ovoglobulin SEQIDNO: MDSISVTNAKFCFDVFNEMKVHHVNENILYCPLSILTALAMVYLGARGNTESQMKKVLHFDS G3 274 ITGAGSTTDSQCGSSEYVHNLFKELLSEITRPNATYSLEIADKLYVDKTFSVLPEYLSCARKFYT 
GGVEEVNFKTAAEEARQLINSWVEKETNGQIKDLLVSSSIDFGTTMVFINTIYFKGIWKIAFNT EDTREMPFSMTKEESKPVQMMCMNNSFNVATLPAEKMKILELPYASGDLSMLVLLPDEVSGL ERIEKTINFDKLREWTSTNAMAKKSMKVYLPRMKIEEKYNLTSILMALGMTDLFSRSANLTGI SSVDNLMISDAVHGVFMEVNEEGTEATGSTGAIGNIKHSLELEEFRADHPFLFFIRYNPTNAILF FGRYWSP* -ovomucin SEQIDNO: CSTWGGGHFSTFDKYQYDFTGTCNYIFATVCDESSPDFNIQFRRGLDKKIARIIIELGPSVIIVEK 275 DSISVRSVGVIKLPYASNGIQIAPYGRSVRLVAKLMEMELVVMWNNEDYLMVLTEKKYMGK TCGMCGNYDGYELNDFVSEGKLLDTYKFAALQKMDDPSEICLSEEISIPAIPHKKYAVICSQLL NLVSPTCSVPKDGFVTRCQLDMQDCSEPGQKNCTCSTLSEYSRQCAMSHQVVFNWRTENFCS VGKCSANQIYEECGSPCIKTCSNPEYSCSSHCTYGCFCPEGTVLDDISKNRTCVHLEQCPCTLN GETYAPGDTMKAACRTCKCTMGQWNCKELPCPGRCSLEGGSFVTTFDSRSYRFHGVCTYILM KSSSLPHNGTLMAIYEKSGYSHSETSLSAIIYLSTKDKIVISQNELLTDDDELKRLPYKSGDITIF KQSSMFIQMHTEFGLELVVQTSPVFQAYVKVSAQFQGRTLGLCGNYNGDTTDDFMTSMDITE GTASLFVDSWRAGNCLPAMERETDPCALSQLNKISAETHCSILTKKGTVFETCHAVVNPTPFY KRCVYQACNYEETFPYICSALGSYARTCSSMGLILENWRNSMDNCTITCTGNQTFSYNTQACE RTCLSLSNPTLECHPTDIPIEGCNCPKGMYLNHKNECVRKSHCPCYLEDRKYILPDQSTMTGGI TCYCVNGRLSCTGKLQNPAESCKAPKKYISCSDSLENKYGATCAPTCQMLATGIECIPTKCES GCVCADGLYENLDGRCVPPEECPCEYGGLSYGKGEQIQTECEICTCRKGKWKCVQKSRCSST CNLYGEGHITTFDGQRFVFDGNCEYILAMDGCNVNRPLSSFKIVTENVICGKSGVTCSRSISIYL GNLTIILRDETYSISGKNLQVKYNVKKNALHLMFDIIIPGKYNMTLIWNKHMNFFIKISRETQET ICGLCGNYNGNMKDDFETRSKYVASNELEFVNSWKENPLCGDVYFVVDPCSKNPYRKAWAE KTCSIINSQVFSACHNKVNRMPYYEACVRDSCGCDIGGDCECMCDAIAVYAMACLDKGICID WRTPEFCPVYCEYYNSHRKTGSGGAYSYGSSVNCTWHYRPCNCPNQYYKYVNIEGCYNCSH DEYFDYEKEKCMPCAMQPTSVTLPTATQPTSPSTSSASTVLTETTNPPV* Lysozyme SEQIDNO: KVFGRCELAAAMKRHGLDNYRGYSLGNWVCAAKFESNFNTQATNRNTDGSTDYGILQINSR 276 WWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMNAWVAWRNRCKGTDVQ AWIRGCRL* Lysozyme SEQIDNO: KVFGRCELAAAMKRHGLDNYRGYSLGNWVCVAKFESNFNTQATNRNTDGSTDYGILQINSR 277 WWCNDGRTPGSRNLCNIPCSALLSSDITASVNCAKKIVSDGNGMSAWVAWRNRCKGTDVQA WIRGCRL* LysozymeC SEQIDNO: KVFERCELARTLKRLGMDGYRGISLANWMCLAKWESGYNTRATNYNAGDRSTDYGIFQINS (Human) 278 RYWCNDGKTPGAVNACHLSCSALLQDNIADAVACAKRVVRDPQGIRAWVAWRNRCQNRD VRQYVQGCGV* LysozymeC SEQIDNO: 
KVFERCELARTLKKLGLDGYKGVSLANWLCLTKWESSYNTKATNYNPSSESTDYGIFQINSK (Bostaurus) 279 WWCNDGKTPNAVDGCHVSCRELMENDIAKAVACAKHIVSEQGITAWVAWKSHCRDHDVSS YVEGCTL* Ovoinhibitor SEQIDNO: IEVNCSLYASGIGKDGTSWVACPRNLKPVCGTDGSTYSNECGICLYNREHGANVEKEYDGEC 280 RPKHVMIDCSPYLQVVRDGNTMVACPRILKPVCGSDSFTYDNECGICAYNAEHHTNISKLHD GECKLEIGSVDCSKYPSTVSKDGRTLVACPRILSPVCGTDGFTYDNECGICAHNAEQRTHVSK KHDGKCRQEIPEIDCDQYPTRKTTGGKLLVRCPRILLPVCGTDGFTYDNECGICAHNAQHGTE VKKSHDGRCKERSTPLDCTQYLSNTQNGEAITACPFILQEVCGTDGVTYSNDCSLCAHNIELG TSVAKKHDGRCREEVPELDCSKYKTSTLKDGRQVVACTMIYDPVCATNGVTYASECTLCAH NLEQRTNLGKRKNGRCEEDITKEHCREFQKVSPICTMEYVPHCGSDGVTYSNRCFFCNAYVQ SNRTLNLVSMAAC* Cystatin SEQIDNO: MAGARGCVVLLAAALMLVGAVLGSEDRSRLLGAPVPVDENDEGLQRALQFAMAEYNRASN 281 DKYSSRVVRVISAKRQLVSGIKYILQVEIGRTTCPKSSGDLQSCEFHDEPEMAKYTTCTFVVYS IPWLNQIKLLESKCQ* Porcine SEQIDNO: SEVCFPRLGCFSDDAPWAGIVQRPLKILPWSPKDVDTRFLLYTNQNQNNYQELVADPSTITNS Lipase 282 NFRMDRKTRFIIHGFIDKGEEDWLSNICKNLFKVESVNCICVDWKGGSRTGYTQASQNIRIVG AEVAYFVEVLKSSLGYSPSNVHVIGHSLGSHAAGEAGRRTNGTIERITGLDPAEPCFQGTPELV RLDPSDAKFVDVIHTDAAPIIPNLGFGMSQTVGHLDFFPNGGKQMPGCQKNILSQIVDIDGIWE GTRDFVACNHLRSYKYYADSILNPDGFAGFPCDSYNVFTANKCFPCPSEGCPQMGHYADRFP GKTNGVSQVFYLNTGDASNFARWRYKVSVTLSGKKVTGHILVSLFGNEGNSRQYEIYKGTLQ PDNTHSDEFDSDVEVGDLQKVKFIWYNNNVINPTLPRVGASKITVERNDGKVYDFCSQETVR EEVLLTLNPC* KidLipase SEQIDNO: GLVAADRITGGKDFRDIESKFALRTPEDTAEDTCHLIPGVTESVANCHFNHSSKTFVVIHGWTV 283 TGMYESWVPKLVAALYKREPDSNVIVVDWLSRAQQHYPVSAGYTKLVGQDVAKFMNWMA DEFNYPLGNVHLLGYSLGAHAAGIAGSLTSKKVNRITGLDPAGPNFEYAEAPSRLSPDDADFV DVLHTFTRGSPGRSIGIQKPVGHVDIYPNGGTFQPGCNIGEALRVIAERGLGDVDQLVKCSHER SVHLFIDSLLNEENPSKAYRCNSKEAFEKGLCLSCRKNRCNNMGYEINKVRAKRSSKMYLKT RSQMPYKVFHYQVKIHFSGTESNTYTNQAFEISLYGTVAESENIPFTLPEVSTNKTYSFLLYTEV DIGELLMLKLKWISDSYFSWSNWWSSPGFDIGKIRVKAGETQKKVIFCSREKMSYLQKGKSPV IFVKCHDKSLNRKSG* Porcine SEQIDNO: APKKGVRWCVISTAEYSKCRQWQSKIRRTNPMFCIRRASPTDCIRAIAAKRADAVTLDGGLVF Lactoferrin 284 EADQYKLRPVAAEIYGTEENPQTYYYAVAVVKKGFNFQLNQLQGRKSCHTGLGRSAGWNIPI GLLRRFLDWAGPPEPLQKAVAKFFSQSCVPCADGNAYPNLCQLCIGKGKDKCACSSQEPYFG 
YSGAFNCLHKGIGDVAFVKESTVFENLPQKADRDKYELLCPDNTRKPVEAFRECHLARVPSH AVVARSVNGKENSIWELLYQSQKKFGKSNPQEFQLFGSPGQQKDLLFRDATIGFLKIPSKIDSK LYLGLPYLTAIQGLRETAAEVEARQAKVVWCAVGPEELRKCRQWSSQSSQNLNCSLASTTED CIVQVLKGEADAMSLDGGFIYTAGKCGLVPVLAENQKSRQSSSSDCVHRPTQGYFAVAVVRK ANGGITWNSVRGTKSCHTAVDRTAGWNIPMGLLVNQTGSCKFDEFFSQSCAPGSQPGSNLCA LCVGNDQGVDKCVPNSNERYYGYTGAFRCLAENAGDVAFVKDVTVLDNINGQNTEEWARE LRSDDFELLCLDGTRKPVTEAQNCHLAVAPSHAVVSRKEKAAQVEQVLLTEQAQFGRYGKD CPDKFCLFRSETKNLLFNDNTEVLAQLQGKTTYEKYLGSEYVTAIANLKQCSVSPLLEACAFM MR* Bovine SEQIDNO: APRKNVRWCTISQPEWFKCRRWQWRMKKLGAPSITCVRRAFALECIRAIAEKKADAVTLDG Lactoferrin 285 GMVFEAGRDPYKLRPVAAEIYGTKESPQTHYYAVAVVKKGSNFQLDQLQGRKSCHTGLGRS AGWIIPMGILRPYLSWTESLEPLQGAVAKFFSASCVPCIDRQAYPNLCQLCKGEGENQCACSSR EPYFGYSGAFKCLQDGAGDVAFVKETTVFENLPEKADRDQYELLCLNNSRAPVDAFKECHLA QVPSHAVVARSVDGKEDLIWKLLSKAQEKFGKNKSRSFQLFGSPPGQRDLLFKDSALGFLRIP SKVDSALYLGSRYLTTLKNLRETAEEVKARYTRVVWCAVGPEEQKKCQQWSQQSGQNVTC ATASTTDDCIVLVLKGEADALNLDGGYIYTAGKCGLVPVLAENRKSSKHSSLDCVLRPTEGYL AVAVVKKANEGLTWNSLKDKKSCHTAVDRTAGWNIPMGLIVNQTGSCAFDEFFSQSCAPGA DPKSRLCALCAGDDQGLDKCVPNSKEKYYGYTGAFRCLAEDVGDVAFVKNDTVWENTNGE STADWAKNLNREDFRLLCLDGTRKPVTEAQSCHLAVAPNHAVVSRSDRAAHVKQVLLHQQA LFGKNGKNCPDKFCLFKSETKNLLFNDNTECLAKLGGRPTYEEYLGTEYVTALANLKKCSTSP LLEACAFLTR*

    TABLE-US-00007 TABLE 7 Miscellaneous
    Sequence Info; SEQ ID NO:; Amino acid sequence
    CCW12 homolog GQ68_01574 (chr1); SEQ ID NO: 286; MFEKSKFVVSFLLLLQLFCVLGVHGQESGNGTTSDTAYACDIGATPFDGFNATIYQYQAS DDNSIQDPVFMSTGYLQRNQLHSTTGVTNPGFNIFTAGVATTTLYGIPNVNYQNMLLELK GYFRADASGNYGLSLRNIDDSAILFFGRETAFECCNENLIPLDEAPTDYSLFTIKEGEASTN PDSYTYTQYLEAGRYYPVRTFFANIRTRAVFNFTMTLPDGSELTDFQNYIFQFGALNQQQ CQAEIVTRENYTTTTEPWTGTFEATTTVIPSGTEPGTVIVQTPYSTIDSTSTWTGTFTTFTTD ADGSTIAVVPSSTIDDHFASTETVLTDTAISTTVITVTSCGTSKCTKTTALTGVTQRTLTIDD RTTVVTTYCPLPTDVATIKTASVSGSEVVQTIYTAKHSQAVSYVHPSTVTITREVCDAQTC TQATIVTGEILQTTVVDSGSTTVVPKYVPVETHEPTFELSTL
    CCW14 homolog GQ68_01658 (PAS_chr1-4_0510); SEQ ID NO: 287; MQFTFASTSVVVSLIAALAKPAVATPPACLLACAAEVVKESSDCDALNNIQCICENEGSAI HACLESTCPDGLSSTALQSFEDVCESVGTEANLDESSSSQSSSSSSSSESSSSSVSSSSSSASS SSETSSSVTSSSVTSSSTAVSSSTESSSSVEPSTSHSSSHSSSEVSSTVAPTTSVAPTTSSITT SSTSLTSATTSSVTISIEPTSDAADKVIIPGLAGLVGALAVGLI
    CCW22 homologs GQ68_02511 (chr1); SEQ ID NO: 288; MQYRSLFLGSALLAAANAAVYNTTVTDVVSELETTVLTITSCAEDKCITSKSTGLITTSTL TKHGVVTVVTTVCDLPSTTKSYVPPAKTTTIPPPEKTTTTVPPPAKTTTTVPPPAKTTSTVP PPAKTSSHHESTITVTVPSSTSTKKIETESTTYHFVTQTTTARNITPPAITTQSHGAAGMNA ANFVGLGAAAVAAAALVL
    CCW22 homolog GQ68_03003 (chr3); SEQ ID NO: 289; MSLLLFLVLGAFLLSSVKAADIGAFRLRVYTPGRFTNGALNFNNWGYQYLDASSSNGQL FAGYATVTSVTTFLAPDDEGFVWGSSLGGYPGFLGIGAGATAFHLTGIPGDALSWYIEDN ILKTSSPTYVCSRNDGDVVVGIEANTRWLAMHDTSQLPPNYYCFQADYEIVALWYIPDTT STWTGTETSTTTDDDGSVIELVPTPLPDTTSTWTGTFTTFTTDDDGSVIELVPTPLPDSTST WTGTYTTFTTDEDGSTIAVVPSSTIDSTSTWTGTYTTFTTDEDGSTIAVVPSSTIDSTSTWT GTYTTFTTDEDGSTIAVYHHLLSTPHPPGLVLTPRSLPMRMEVLLLWYHHLLSTLHPPGL VLTPRSLPMRMEVLLLWYHRLLSTPHPGLVLTPRSLPMRMEVLLLYHHLLSTPHPPGLVL TPRSLPMRMEVLLLWY
    FLO5 homolog GQ68_04296 (chr4); SEQ ID NO: 290; MKLQLQSFVFFLLSAVNVLADDSYGCSIATSPRSTGFVANLYEFPNMAISNAELKTYVRY RYKEGRLYDTISNIISPYFYYQGQGANSAYGTLYGRPNVYLYNFSMELKGYFRPPITGQY TIDFNGANVDDAAMVFFGKAGAFDCCNSDYILPEQSAEYSLYSVYPHTATDQILSATIYL EAGKYYPLRVTYTNIGNIGSLDLRVVLPSGASITSLGAFVYQFPNNLSPGTCTPDVEYFTT TTQAWTGTYETTYTVPPSGTQPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVII
ETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSG TEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGIVIIETPESYVTTTQPWTGTYETT YTVPPSGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRRRDVCDCETFCCPGDTNCETYVT TTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGIVIIE TPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGT EPGIVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTY TVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGIVIIETPESYVTTTQPWT GTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYV TTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTV VIETPEITDCEAVCCGAVPTSDPLRRRDVCDCETFCCPGDTNCETYVTTTQPWTGTYETT YTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPW TGTYETTYTVPPSGTQPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESY VTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTQPGT VIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVP PSGTEPGIVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTY ETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTT QPWTGTYETTYTVPPSGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRRRDVCDCETFCCP GDTNCETYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETTYTVP PTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTQPGTVIIETPESYVTTTQPWTGT YETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTT TQPWTGTYETTYTVPPSGTQPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIE TPESYVTTTQPWTGTYETTYTVPPSGTEPGIVIIETPESYVTTTQPWTGTYETTYTVPPTGT EPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIIETPESYVTTTQPWTGTYETT YTVPPSGTQPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTEPGTVIVETPDVPGSYVTT TQPWTGTYETTHTVPPTGTEPGTVVVETPDVPGSYVTTTQPWTGTYETTHTVPPTGTEPG TVVVETPDVPGSYVTTTQPWTGTYETTYTVPPSGTEPGTVIVETPDVPGSYVTTTQPWTG TYETTHTVPPTGTEPGTVVVETPDVPGSYVTTTQPWTGVYKTTYTVPPSGTIPGTVIIETPF GYFNTSSISTKTDKRTITSVVPCSQCSESKTQYITPTGPGDVTVIISQPPSKITLSSPEDKTKT DFITSTGSIGGGSPPSHPNDKPGIITTPTQPIGGGNPSDIPSAISSVSSGGNSRASVPSFSTSS AISVQVSSLYDENSGSTFEVSLLFSVVSGFFLTLMV FLO5homolog SEQIDNO: MKFPVPLLFLLQLFFIIATQGDESGNGDESDTAYGCDITSNAFDGFDATIYEYNANDLKLI GQ68_03011 291 
RDPVFMSTGYLGRNVLNKISGVTVPGFNIWNPRSRTATVYGVQNVNYYNMVLELKGYF (PAS_chr3_1145) KAAVSGDYKLTLSNIDDSSMLFFGKNTAFQCCDTGSIPVDQAPTDYSLFTIKPSNQVNSEV ISSTQYLEAGKYYPVRIVFVNALERALFNFKLTIPSGTVLDDFQDYIYQFGALDENSCYET TVSKITEWTTYTTPWTGTFETTRTITPTGTEGTVVIETPESYVTTTQPWTGTYETTYTVPPT GTEPGTVIIETPEIIDCEAVCCGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTG TYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVT TTQPWTGTYETTYTVPPSGTEPGTVVIETPEIVDCEAYCCASVAIKKRELCQCENFCCSW DQSCQTYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPP TGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPEIIDCEAVCCGPFLT AFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYV TTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVI IETPEIINCEAVCCGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTV PPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPSTGTEPGTVIIETPESYVTTTQPWTGT YETTFTVPPTGTEPGTVVIETPESYVTTTQPWTGTYETTYSVPPSGTEPGTVVIETPEASTA RTKFTTVTSSWTGVFTTTKTLPASGTEPATIVIQTPTGYFNTSSLVSTRTKTNVDTVTRVIP CPICTAPKTITVVPEEPNESVSVIISQPQSSSTDTTLSKPDSVRVISQPETASQMDTSLSKTDS AVISTETAGNNIIPLAGSHSYNTIVTTVTDSPQVAQSTTATSSSNVHLTISTQTTTPSLVYSS SLSTVHQVSPSNGGFRSSITVHPLLSVIGAIFGALFM FLO5homolog SEQIDNO: MTKFTILLLVLLKFYSILAIEVDGSANGQPLAHPIVVEVHEATKWITHTSPWTGTPEAIRT GQ68_03079 292 VTGETPYEQKIARYDEFNPRLANREIIDCVAFCCGDATSSPSITEPESTATELPESYVTINRP (chr3) WSLSWIPDVPPGSPYWSTSTIPPSGTEPGTVIIYFYLYDDARKRREINFGSTQPYHGRPKLL GSIEKRELCQCDAVCCLGDLSCEVYVTTTQPWTGTYETTYTITPTGSEPGTVIIETPELYVT TTQPWTGTYETTYTITPTGSEPGTVIIETPESYVTTTQPWTGTYETTYTITPTGSEPGTVIIET PESYVTTTQPWTGTYETTYTITPTGSEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGTE PGAVIIETPELYVTTTQPWTGTYETTYTITPTGSEPGTVIIETPESYVTTTQPWTGTYETTYT VPPSGTEPGTVIIETPELYVTTTQPWTGTYETTYTITPTGSEPGTVIVEIPVSYVNSTQISTST YDTTDTVLSSGVEPGTIAIETPIVYLNTSVSAFSRPWTKIDTVTQFSSCAVCSKPETITVTPE NPIDTVTIIISQPQSTSQSNTPTSFKANSTSAFSRFDEDSIPVFGSYSYEITVNIDVNTEDDTT TNLNADTTIIIGSLSAIRTVAGSSSNYHASNISPTINSQKTASSVVVHSDSSATVYQFSPSNG APWLSVQISTLLSVVGTLLAAVLL FLO5homolog SEQIDNO: MNFRYLLILPIYASIVLGQVGDFQLLLNAKEPIRNSPSLLSSNYGNLTLPAMANGALESHF GQ68_04277 293 
DYGNAYVGDDQITVVYHLPDEHGQINAYRQDTDEYIGYLGLVTDDYGEYTYLSVIMPG (chr4) VQYDQTTSVNWYIENEELKSTSINVQPLLGCYYKNPPQYSWYWASIDEPGNIASSNFVCE PCKVYVDFVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTT DDDGNVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTD ADGTVIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTD ADGTVIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTD ADGTVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTD DDGNVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDA DGTVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDD DGNVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDAD GTVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDAD GTVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDAD GTVIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTDAD GTVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDD GNVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDD GNVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADG TVIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDDG NVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADATSVWTGDHTTWTTDDDGNVIEQIPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDDG NVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDDG NVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADATSVWTGDHTTWTTDDDG NVIEQIPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADITSMWTGSETSWTTDADGT VIELVPTPSADITSMWTGSETSWTTDADGTVIELVPTPSADTTSVWTGSYTTWTTDEDGT VIEQVPTPSADTPSADTTSVWTGSYTTWTTDEDGTVIEQVPTPSADTTSVWTGSYTTWTT DEDGTVIEQVPTPSADTPSADTTSVWTGSYTTWTTEVGDGGSSTVVELVPTESSTSTNVM 
QTPVPSSGVSDGVSVFNGFNVEVFHYPADNYELANEISFLSYGYENLGLVTTVTGVSDIN FDTDSNWPYYIDRDALGNTGSYVNATIEYEGFFRAPVDGEYVFSFSSTDYNSILFVGSPAA ADQALQKREVQFLKPETSPDYVLLFNNTRDLGKTVSTTQYLLADQYYPLRVVIAAISQH ALLDFQIKLPNGASLTQYQGYVYNFALEGSESTTVIGDKTSTWTGSYTTWTTDSDGSTIV VVPPATITADKTSTWTGSYTTWTTDSDGSTVVICPSITSDHNDKPSESTLTDSSISTTVVTV TSCDIEKCTKTTALTGVRETTLTTGGTTTVVTTYCPLPTDIVTVKTTSIDGSEVLQTIYTAK PNHVVPDVQTSTVTITREVCDAFTCTHATIVTGEILKTTTLADTHYTTVVPVYVPLETYQP AVELSTLETVLKSSDLASGPVVTAGSVQPSYQSGGVAESSLTVSEFEAHSTSDTVSQPSTIS LQTGEANALKWSSFFGAALVPLVNVFFV FLO5homolog SEQIDNO: MQNTNDKLIIRTFYSISTIHGLLSINIFSDTRVYKFAIYSTDAVSLEPRTKNNMSLVTVLACF GQ68_01371 294 IIFAAHAFGQDTFYMLKVRTLTPNGYPLADSLSNPMQYWDLYYVPGGPRRLESSFVNWQ (chr1) PTTAAPINQFYCRLGTDGHMTGYNRVTGSVIGKLSFGTNAATALAFGSYDGDPSYPPQAF SISSSVSGTMTYLNVHYVNARSITWYSTTTATGETNVYINVASTGYTGDRTTYQAELWV EPFVPNIPVDTTTSIWTGSQTSYTTEVGENGGSTVIELIPTPPADATSTWTGTYTTRTTDAD GSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDAD GSVIEQIPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTT DVGEDGSSTVVELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWT GTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPSA DTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIE LVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGE DGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETS YTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTAT WTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTP SADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSST VIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPTPSADTTATWTGTETSYTT DVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTG TETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPTPSA DTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIE LVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPTPSADTTATWTGTETSYTTDV GEDGSSTVVELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGT ETSYTTDVGEDGSSTVVELVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPTAD TTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVVE 
LVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPSADTTATWTGTETSYTTDVGE DGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPSADTTATWTGTETS YTTDVGEDGSSTVVELVPTPTADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTA TWTGTETSYTTDVGEDGSSTVVELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVP TPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGS STVIELVPTPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSY TTDVGEDGSSTVIELVPTPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVPTPSADTTA TWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVVELVP TPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGS STVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTT DVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTG TETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSAD TTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGEDGSSTVIEL VPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPTPSADTTATWTGTETSYTTDVGED GSSTVVELVPTPTPSADTTATWTGTETSYTTDVGEDGSSTVIELVPSDTETATNIVETPVPS SGVSDGVSVFDGFNVEVFHYPADNYELANEIGFLSYGYENLGLVTNATGVSDINFDTDSN WPYYIDRDALGNTGSYVNATIEYEGFFRAPVDGEYVFSFSNTDYNSILFVGSPAAAGQAL QKRRVQFLKPETSPDHVLLFNNTRDLGQTISTTQYLLADQYYPLRVVIAAISQHALLDFQI KLPNGALLTQYQGYVYNFALEGSESTTVIGDKTSTWTGSYTTWTTDSDGSTVVVVPSATI TADKTSTWTGSYTTWTTDSDGSTIVICPSITSDHNDKPSESTLTDGSISTTVVTVTSCDIEK CTKTTALTGVTETTLTTGGTTTVVTTYCPLPTDIVTVKTTSISGSEVLQTIYTAKPSHVVPN VHTLTVTITREVCDAFTCTQATIVTGEILKTTTLADTHSTTVVPVYVPLESYQSAVELSTL ETVLKSSDFASGSAVTAGSAQPSYQSGGVAESSLTGSELEAHSTSDTVSQPSTISPQTGEA NALRWSSFFGAALVPLVNVFFV FLO5homolog SEQIDNO: MTKLTILLSVLLQLFSVLAEVPKKTEWSSHTTYWTSTLEALRTVTPTGTERAVIGEAPYE GQ68_04678 295 YKLIGNDQFDPGLNAKREIIDCEAVCCGAVPTSDPLKRRDVCECENVCCPGDDCETYVTT (PAS_chr4_0363) TQPWTGTYETTYTVPPSGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRRRDVCECENVCC PGDDCETYVTTTQPWTGTYETTYTVPPSGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRR RDVCECENVCCPGDDCETYVTTTQPWTGTYETTYTVPPTGTEPGTVVIETPVTYVTTTQP WTGTYETTYTVPPTGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRRRDVCECENVCCPG DDCETYVTTTQPWTGTYETTYTVPPTGTEPGTVVIETPVTYVTTTQPWTGTYETTYTVPP TGTEPGTVVIETPVTYVTTTQPWTGTYETTYTVPPTGTEPGTVVIETPVTYVTTTQPWTGT 
YETTYTIPPTGTEPGTVVIETPEITDCEAVCCGAVPTSDPLRRRDVCECENVCCPGDDCET YVTTTQPWTGTYETTYTVPPTGTEPGTVVIETPVTYVTTTQPWTGTYETTYTVPPTGTEP GTVVIETPVTYVTTTQPWTGTYETTYTVPPTGTEPGTVVIETPVTYVTTTKPWTGTYETT HTVPASGTEPGTVIIETPIKYLNTSISASTSTWTKINTVTQFISCPVCTIPKTITVTPKISNE TVTIIISQPHGTSSRTTTVVKTDGASVSSHSYKTALTTDVKPEEKTSTKLGTVTTVSGSHSAID TVTGSLSDYHASSIPHTVKSEEKASSTVTHTISSSTVYQVSPSNGASWLSVRLNTALSIIGT LFAAVFI FLO5homolog SEQIDNO: MSKTKNGGSEFVHIAYVFHIEASTPSDYINMIQIVLFPHQAQITKRMNLVTLLVCNLLCVS GQ68_04282 296 LTLGQGVYRLKFPALVVTGRESVGTTVVNYDFLVGNTGQYGDLGEFFYDGEPYYCWNS (chr4) TDSQPLSCSSSSSLLISTQNVTISHPDEDGTVYAYAERDGGLLGRFTVGSVSADWPQWAVI VYSTSSSAHPSSWYVDDNKLKLTSGLGPNNSTTLQACYFTQSSGRDRYAISLEGSPAYTG QVSCQATEFDLEFIPPSADTTSIWDGSYTTWTTDSNGIVVEQIPTPSADTTSIWTGSETSWT TDSDGTVIELVPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADTTSIWTGSETSWTT DSDGTVIELVPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADATSIWTGDHTTWTTD REGNVIEQIPTPSADTTSIWTGSETSWTTDSDGTVIELVPTPSADATSIWTGDHTTWTTDSE GNVIEQIPTPSADATSIWTGSETSWTTDSDGTVIELVPTPSADATSIWTGDHTTWTTDSEG NVIEQIPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADTTSIWTGSETSWTTDSDGTV IELVPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADATSIWTGDHTTWTTDSEGNVI EQIPTPSADTTSIWTGSETSWTTDSDGTVIELVPTPSADATSIWTGDHTTWTTDSEGNVIE QIPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADTTSIWTGSETSWTTDSDGTVIELV PTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADATSIWTGSETSWTTDSDGTVIELVPT PSADATSIWTGDHTTWTTDSEGNVIEQIPTPSADATSIWTGDHTTWTTDSEGNVIEQIPTPS ADTTSIWTGDHTTWTTEVGGDGSSIVVELVPSETGTATNVVQTPVPSSGISDGVSALDGF NVEVFHYPADNYELANEISFLSYGYENLGLVTTATGVSDINFDTDSNWPSYIDRNALGNT GSYVNATIKYEGFFRAPVDGDYEFSFSNIDYNSILFVGSAAADQALRKREAQFLKPETSPN HILFFNNSRDVGQTISTTQYLSADSYYPLRVVIAAVSQHALLDFQIKLPNGVSLTQFQGYV YNFALEGAESTTVIGDKTSTWTGTYTTWTTDSEGSTIVLCPSIISDHNGKPADTTLTDGSIS TTVVTVTSCDIKKCTKTTALTGVTQKTLTVKGTTTVVTAYCPLPTDVATVKTISVGGSEV LQTVYTAKPSHIVPDVQTLTVTITREVCDALTCIPATIVTGEILKTTTLADTHSTTVIPVYV PLETHQPALDLITLETVLKSSDFANGPAITSVSVESLSHQSGVVVSEFDSDSTSGAVSQPSS AVSLQTGKASALKWSPFLGAAVISLFNVFFV FLO5homolog SEQIDNO: MNLFTILAWGFLYVPLVLGEGYYSLNFDARVPIALGILGSSYQKYTIMADRSLLGGSNIDL GQ68_03013 297 
DVTFSGIIELLTNRVHIVVSLPDADGRVSVYDMYSGTSLGYLSFVCSLTTCEVHAVSSSSG (PAS_chr3_0015) ATTWTLDGNQLIPTSPSTVYACYRSLVGLLAQYTLNDRTSITAQCEQTNLYVELAIPAFPE TTAVWTGTYTTWTTDESGSVIEQMPTPSADTTTTWTGTYTTWTTDADGSVIEQIPTPPAD TTSVWTGTYTTRTTDADGSVIEQIPTPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTT SVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQIPTPSTDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPT PSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTP STDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGS VIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGS VIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGS VIEQIPTPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSV IEQIPTPSADTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSIWTGTYTTWTT DADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTT DADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTT DADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTT DADGSVIEQIPTPSTDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGT YTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGT YTTWTTDADGSVIEQIPTPSTDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTT SIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQSPTPSAYTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTS VWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQSPTPSAYTTS VWTGTYTTWTTDADGSVIEQIPTPSADTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPT PSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTP SADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPS TDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSV IEQIPTPSTDTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTT DADGSVIEQIPTPSADTTSVWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTGTYTTWTT DADGSVIEQIPTPSADTTLAPSADTTSIWTGTYTTWTTDADGSVIEQIPTPSADTTSVWTG TYTTWTTDAAGTVIEVIPSGTSISSDVIPTPLPTSGVDIDTIPYDAFNVAVYHYPADNYELA 
NNLGFLTSGYEGLGQVTTATSVGNINFDTSSGWPYYIESNALGNTGSYVNATIEYVGFFQ APANGNYELSFSNIDYNAILFLGSPATDSSLAKREVQFLKPETSSEYVLFFDHGKDAGQTV STTQYLSAGLYYPLRIVLAAVSERAQLDFQITLPDGRVLDQYQGYVYNFAHEGIESATSS AHETSWSRFTNSTIYSHSSTIGIITSSTDAPHSVINPTAIETTSTDTSISTVAVTTSICDTKD CVKTTVITPNSPLPTQTVSLTTTTIDRSEVVQTAHSAVPSQFAPDAHPSAVTITREQCDAYSCS QATIVSGKVLQTTTVSDSTTVVPLDTPQLSVEASTLETRLKSTQSSRAPTVTVQTSQSSRH SEDVTESSVHVSEFDAQSTSATSASALQAPSSISLQTGGANTLRLSAFLGTALLPMLNVLFI SED1homolog SEQIDNO: MQFSIVATLALAGSALAAYSNVTYTYETTITDVVTELTTYCPEPTTFVHKNKTITVTAPTT (GQ68_01572) 298 LTITDCPCTISKTTKITTDVPPTTHSTPHTTTTHVPSTSTPAPTHSVSTISHGGAAKAGVAGL AGVAAAAAYFL Erp1 SEQIDNO: MLLTSLLQVFACCLVLPAQVTAFYYYTSGAERKCFHKELSKGTLFQATYKAQIYDDQLQ 299 NYRDAGAQDFGVLIDIEETFDDNHLVVHQKGSASGDLTFLASDSGEHKICIQPEAGGWLI KAKTKIDVEFQVGSDEKLDSKGKATIDILHAKVNVLNSKIGEIRREQKLMRDREATFRDA SEAVNSRAMWWIVIQLIVLAVTCGWQMKHLGKFFVKQKIL Erp2 SEQIDNO: MIKSTIALPSFFIVLILALVNSVAASSSYAPVAISLPAFSKECLYYDMVTEDDSLAVGYQVL 300 TGGNFEIDFDITAPDGSVITSEKQKKYSDFLLKSFGVGKYTFCFSNNYGTALKKVEITLEK EKTLTDEHEADVNNDDIIANNAVEEIDRNLNKITKTLNYLRAREWRNMSTVNSTESRLT WLSILIIIIIAVISIAQVLLIQFLFTGRQKNYV Emp24 SEQIDNO: MASFATKFVIACFLFFSASAHNVLLPAYGRRCFFEDLSKGDELSISFQFGDRNPQSSSQLT 301 GDFIIYGPERHEVLKTVRDTSHGEITLSAPYKGHFQYCFLNENTGIETKDVTFNIHGVVYV DLDDPNTNTLDSAVRKLSKLTREVKDEQSYIVIRERTHRNTAESTNDRVKWWSIFQLGV VIANSLFQIYYLRRFFEVTSLV Erv25 SEQIDNO: MQVLQLWLTTLISLVVAVQGLHFDIAASTDPEQVCIRDFVTEGQLVVADIHSDGSVGDG 302 QKLNLFVRDSVGNEYRRKRDFAGDVRVAFTAPSSTAFDVCFENQAQYRGRSLSRAIELDI ESGAEARDWNKISANEKLKPIEVELRRVEEITDEIVDELTYLKNREERLRDTNESTNRRVR NFSILVIIVLSSLGVWQVNYLKNYFKTKHII Erp3 SEQIDNO: MSNLCVLFFQFFFLAQFFAEASPLTFELNKGRKECLYTLTPEIDCTISYYFAVQQGESNDF 303 DVNYEIFAPDDKNKPIIERSGERQGEWSFIGQHKGEYAICFYGGKAHDKIVDLDFKYNCE RQDDIRNERRKARKAQRNLRDSKTDPLQDSVENSIDTIERQLHVLERNIQYYKSRNTRNH HTVCSTEHRIVMFSIYGILLIIGMSCAQIAILEFIFRESRKHNV* Erp5 SEQIDNO: MKYNIVHGICLLFAITQAVGAVHFYAKSGETKCFYEHLSRGNLLIGDLDLYVEKDGLFEE 304 DPESSLTITVDETFDNDHRVLNQKNSHTGDVTFTALDTGEHRFCFTPFYSKKSATLRVFIE 
LEIGNVEALDSKKKEDMNSLKGRVGQLTQRLSSIRKEQDAIREKEAEFRNQSESANSKIM TWSVFQLLILLGTCAFQLRYLKNFFVKQKVV Flo5-2from SEQIDNO: DESGNGDESDTAYGCDITSNAFDGFDATIYEYNANDLKLIRDPVFMSTGYLGRNVLNKIS Komagataella 305 GVTVPGFNIWNPRSRTATVYGVQNVNYYNMVLELKGYFKAAVSGDYKLTLSNIDDSSM phaffii LFFGKNTAFQCCDTGSIPVDQAPTDYSLFTIKPSNQVNSEVISSTQYLEAGKYYPVRIVFV NALERALFNFKLTIPSGTVLDDFQDYIYQFGALDENSCYETTVSKITEWTTYTTPWTGTFE TTRTITPTGTEGTVVIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPEIIDCEAVC CGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTVPPTGTEPGTVIIE TPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPSGT EPGTVVIETPEIVDCEAYCCASVAIKKRELCQCENFCCSWDQSCQTYVTTTQPWTGTYET TYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQP WTGTYETTYTVPPTGTEPGTVIIETPEIIDCEAVCCGPFLTAFSFRKREECQCENICCPGDT NCETYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTG TEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPEIINCEAVCCGPFLTAFS FRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTT QPWTGTYETTYTVPSTGTEPGTVIIETPESYVTTTQPWTGTYETTFTVPPTGTEPGTVVIET PESYVTTTQPWTGTYETTYSVPPSGTEPGTVVIETPESYVTTTQPWTGTYETTYSVPPSGT EPGTVVIETPEASTARTKFTTVTSSWTGVFTTTKTLPASGTEPATIVIQTPTGYFNTSSLVST RTKTNVDTVTRVIPCPICTAPKTITVVPEEPNESVSVIISQPQSSSTDTTLSKPDSVRVISQPE TASQMDTSLSKTDSAVISTETAGNNIIPLAGSHSYNTIVTTVTDSPQVAQSTTATSSSNVHL TISTQTTTPSLVYSSSLSTVHQVSPSNGGFRSSITVHPLLSVIGAIFGALFM Flo5-2from SEQIDNO: MKFPVPLLFLLQLFFIIATQGDESGNGDESDTAYGCDITSNAFDGFDATIYEYNANDLKLI Komagataella 306 RDPVFMSTGYLGRNVLNKISGVTVPGFNIWNPRSRTATVYGVQNVNYYNMVLELKGYF phaffii KAAVSGDYKLTLSNIDDSSMLFFGKNTAFQCCDTGSIPVDQAPTDYSLFTIKPSNQVNSEV (underlinedis ISSTQYLEAGKYYPVRIVFVNALERALFNFKLTIPSGTVLDDFQDYIYQFGALDENSCYET signalpeptide, TVSKITEWTTYTTPWTGTFETTRTITPTGTEGTVVIETPESYVTTTQPWTGTYETTYTVPPT usedinsome GTEPGTVIIETPEIIDCEAVCCGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTG versionsandnot TYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVT others) TTQPWTGTYETTYTVPPSGTEPGTVVIETPEIVDCEAYCCASVAIKKRELCQCENFCCSW DQSCQTYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPP 
TGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPEIIDCEAVCCGPFLT AFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYV TTTQPWTGTYETTYTVPPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPPTGTEPGTVI IETPEIINCEAVCCGPFLTAFSFRKREECQCENICCPGDTNCETYVTTTQPWTGTYETTYTV PPTGTEPGTVIIETPESYVTTTQPWTGTYETTYTVPSTGTEPGTVIIETPESYVTTTQPWTGT YETTFTVPPTGTEPGTVVIETPESYVTTTQPWTGTYETTYSVPPSGTEPGTVVIETPESYVT TTQPWTGTYETTYSVPPSGTEPGTVVIETPEASTARTKFTTVTSSWTGVFTTTKTLPASGT EPATIVIQTPTGYFNTSSLVSTRTKTNVDTVTRVIPCPICTAPKTITVVPEEPNESVSVIISQP QSSSTDTTLSKPDSVRVISQPETASQMDTSLSKTDSAVISTETAGNNIIPLAGSHSYNTIVTT VTDSPQVAQSTTATSSSNVHLTISTQTTTPSLVYSSSLSTVHQVSPSNGGFRSSITVHPLLS VIGAIFGALFM Flo11from SEQIDNO: SSGKTCPTSEVSPACYANQWETTFPPSDIKITGATWVQDNIYDVTLSYEAESLELENLTEL Komagataella 307 KIIGLNSPTGGTKLVWSLNSKVYDIDNPAKWTTTLRVYTKSSADDCYVEMYPFQIQVDW phaffii CEAGASTDGCSAWKWPKSYDYDIGCDNMQDGVSRKHHPVYKWPKKCSSNCGVEPTTS DEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEP EEPTTSEEPTTSEEPEEPTSSDEEPTTSDEPEEPTTSDEPEEPTTSEEPTTSEEPEEPTTSSEE PTPSEEPEGPTCPTSEVSPACYADQWETTFPPSDIKITGATWVEDNIYDVTLSYEAESLELENL TELKIIGLNSPTGGTKVVWSLNSGIYDIDNPAKWTTTLRVYTKSSADDCYVEMYPFQIQVD WCEAGASTDGCSAWKWPKSYDYDIGCDNMQDGVSRKHHPVYKWPKKCSSDCGVEPTT SDEPEEPTTSEEPVEPTSSDEEPTTSEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEPEE PTTSEEPTTSEEPEEPTSSDEEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSDE PEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEE PTSSDEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTSD EPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTSDEPE EPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSDEEPGTTEEPLVPTTKTETDVSTTLLTVTDCG TKTCTKSLVITGVTKETVTTHGKTTVITTYCPLPTETVTPTPVTVTSTIYADESVTKTTVYTTG AVEKTVTVGGSSTVVVVHTPLTTAVVQSQSTDEIKTVVTARPSTTTIVRDVCYNSVCSVATIV TGVTEKTITFSTGSITVVPTYVPLVESEEHQRTASTSETRATSVVVPTVVGQSSSASATSSIF PSVTIHEGVANTVKNSMISGAVALLFNALFL Flo11from SEQIDNO: MVSLRSIFTSSILAAGLTRAHGSSGKTCPTSEVSPACYANQWETTFPPSDIKITGATWVQD Komagataella 308 NIYDVTLSYEAESLELENLTELKIIGLNSPTGGTKLVWSLNSKVYDIDNPAKWTTTLRVYT phaffii 
KSSADDCYVEMYPFQIQVDWCEAGASTDGCSAWKWPKSYDYDIGCDNMQDGVSRKHH PVYKWPKKCSSNCGVEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTS DEPEEPTTSEEPEEPTTSEEPEEPTTSEEPTTSEEPEEPTSSDEEPTTSDEPEEPTTSDEPEEP TTSEEPTTSEEPEEPTTSSEEPTPSEEPEGPTCPTSEVSPACYADQWETTFPPSDIKITGATWV EDNIYDVTLSYEAESLELENLTELKIIGLNSPTGGTKVVWSLNSGIYDIDNPAKWTTTLRV YTKSSADDCYVEMYPFQIQVDWCEAGASTDGCSAWKWPKSYDYDIGCDNMQDGVSRK HHPVYKWPKKCSSDCGVEPTTSDEPEEPTTSEEPVEPTSSDEEPTTSEEPTTSEEPEEPTTS DEPEEPTTSEEPEEPTTSEEPEEPTTSEEPTTSEEPEEPTSSDEEPTTSDEPEEPTTSEEPEEP TTSEEPEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTS EEPEEPTTSEEPEEPTTSEEPEEPTSSDEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEP EEPTSSDEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEP TSSDEEPTTSEEPEEPTTSDEPEEPTTSEEPEEPTTSEEPEEPTTSEEPEEPTTSDEEPGTTEE PLVPTTKTETDVSTTLLTVTDCGTKTCTKSLVITGVTKETVTTHGKTTVITTYCPLPTETVTPT PVTVTSTIYADESVTKTTVYTTGAVEKTVTVGGSSTVVVVHTPLTTAVVQSQSTDEIKTVVTAR PSTTTIVRDVCYNSVCSVATIVTGVTEKTITFSTGSITVVPTYVPLVESEEHQRTASTSETR ATSVVVPTVVGQSSSASATSSIFPSVTIHEGVANTVKNSMISGAVALLFNALFL Adhesindomain SEQIDNO: DESGNGDESDTAYGCDITSNAFDGFDATIYEYNANDLKLIRDPVFMSTGYLGRNVLNKIS onlyofFlo5-2 309 GVTVPGFNIWNPRSRTATVYGVQNVNYYNMVLELKGYFKAAVSGDYKLTLSNIDDSSM fromKomagataella LFFGKNTAFQCCDTGSIPVDQAPTDYSLFTIKPSNQVNSEVISSTQYLEAGKYYPVRIVFV phaffii(without NALERALFNFKLTIPSGTVLDDFQDYIYQFGALDENSC signalpeptideor extension+anchor domains) (secretionsignal SEQIDNO: Nucleotidesequence (mini-alpha,alpha 337 ATGAGATTCCCATCTATTTTCACCGCTGTCTTGTTCGCTGCCTCCTCTGCATTGGCTGC factorsecretion CCCTGTTCAGACTACCACTGAAGACGAGCTTGAGGGTGATTTCGACGTCGCTGTTTTG signalwith CCTTTCTCTGCTTCCATTGCTGCTAAGGAAGAGGGTGTCTCTCTCGAGAAAAGAGAG deletionand GCCGAAGCT mutations toeliminate EPSproduction)) (SUC2sequence) SEQIDNO: Nucleotidesequence 338 TCTATGACGAATGAGACGTCAGACAGGCCACTGGTACATTTCACACCAAATAAAGGA TGGATGAATGATCCCAACGGTCTGTGGTACGACGAGAAAGACGCTAAGTGGCATCTG TACTTTCAATATAACCCCAATGACACTGTTTGGGGTACGCCCCTTTTCTGGGGCCATG CAACCTCAGATGATCTGACTAACTGGGAAGACCAACCTATTGCCATCGCCCCCAAGC 
GTAATGATTCAGGCGCCTTCTCTGGAAGTATGGTAGTCGACTACAACAACACGAGTG GTTTTTTCAACGACACAATTGACCCAAGACAGAGATGTGTTGCCATTTGGACATATA ATACACCTGAGTCAGAAGAACAATATATATCCTACTCTCTGGATGGAGGTTATACTTT TACGGAGTATCAGAAAAACCCTGTTCTGGCAGCTAATTCCACCCAATTTCGTGACCC AAAGGTGTTTTGGTATGAGCCATCACAGAAATGGATCATGACCGCCGCAAAGTCACA GGACTATAAAATTGAGATATATTCATCAGACGACTTGAAATCCTGGAAGCTGGAGAG TGCTTTCGCAAATGAAGGATTTTTGGGATACCAATACGAATGCCCTGGCCTGATCGA AGTGCCCACTGAACAAGACCCATCAAAGTCTTACTGGGTGATGTTTATCTCTATAAAC CCCGGCGCTCCCGCTGGAGGCTCCTTCAACCAATATTTCGTAGGTTCTTTTAACGGAA CCCACTTCGAGGCATTCGATAACCAATCTAGGGTTGTCGATTTCGGAAAAGATTATTA TGCACTACAAACCTTTTTTAATACGGACCCTACTTATGGATCAGCTTTAGGCATAGCC TGGGCTTCTAACTGGGAGTACAGTGCCTTTGTTCCTACAAACCCATGGCGTTCCTCAA TGAGTCTTGTCAGAAAATTCTCTCTGAATACGGAATATCAGGCTAACCCCGAGACAG AACTAATAAACTTGAAAGCAGAGCCTATCTTGAATATAAGTAACGCTGGACCTTGGA GTCGTTTTGCCACCAACACCACATTAACAAAAGCCAATTCCTATAACGTGGACCTTTC CAACTCTACGGGAACACTGGAATTTGAACTGGTGTACGCCGTAAATACTACGCAAAC AATTTCAAAGTCAGTCTTTGCTGACCTTAGTCTATGGTTCAAGGGTTTAGAGGACCCC GAGGAGTACTTACGTATGGGTTTTGAGGTATCAGCATCTTCCTTTTTCCTGGATCGTG GAAACTCCAAGGTGAAGTTTGTCAAGGAAAATCCTTATTTCACTAACAGGATGTCCG TGAACAACCAGCCTTTCAAATCTGAGAACGATCTTTCCTATTACAAGGTCTACGGACT TCTAGACCAAAACATATTGGAACTATACTTCAACGATGGAGATGTAGTTTCCACCAA CACCTATTTTATGACCACAGGCAACGCCCTTGGATCAGTAAACATGACAACAGGTGT TGATAACCTTTTTTACATAGACAAATTCCAAGTTAGAGAGGTAAAG (flexlinkers) SEQIDNO: Nucleotidesequence 339 GGTTCATCAGGGTCCTCAGGATCATCCGGTAGTAGTGGTTCATCCGGTTCATCCGGAT CAAGTGGCTCCTCTGAAGCTGCAGCAAGGGAGGCTGCAGCCCGTGAGGCAGCCGCTA GAGAAGCCGCCGCTAGGGGTGGTGGCGGCTCTGGCGGAGGCGGTTCCGGTGGCGGA GGCTCT (Tir4anchors) SEQIDNO: Nucleotidesequence 340 CAAATCAACGAATTGAACGTTGTTTTAGATGATGTTAAGACCAACATTGCCGACTAC ATCACCCTATCCTACACTCCAAATTCAGGTTTTTCCTTGGACCAAATGCCAGCTGGTA TTATGGATATTGCTGCGCAATTGGTTGCAAATCCAAGTGATGACTCCTACACCACTTT GTACTCTGAAGTGGACTTTTCTGCTGTTGAGCATATGTTGACTATGGTCCCATGGTAC TCTTCTAGACTGCTTCCAGAATTAGAAGCAATGGATGCTTCTCTAACTACCTCAAGTT CTGCTGCCACATCTTCAAGTGAAGTTGCTAGCTCTTCTATTGCTTCATCCACTAGCTCT 
TCTGTTGCACCATCCTCAAGTGAAGTTGTCAGCTCTTCCGTTGCTTCATCCTCAAGTG AAGTTGCCAGCTCCTCTGTTGCGTCTACAAGCGAAGCTACTAGTTCTTCTGCTGTCAC ATCTTCCTCCGCTGTTTCCTCTTCGACCGAGTCTGTTAGCTCTTCCTCTGTCAGTTCTT CCTCAGCCGTTTCCTCTTCTGAAGCTGTCAGTTCCTCTCCAGTTTCCTCAGTTGTTTCA TCTTCGGCCGGACCTGCTAGCTCAAGCGTTGCTCCTTACAACTCAACCATTGCTAGCT CTTCTTCCACTGCCCAGACTTCTATCTCGACCATTGCTCCTTACAACTCCACAACCAC CACCACCCCAGCTAGTTCTGCTTCCAGCGTTATTATCTCAACCAGAAACGGTACCACT GTTACTGAAACTGACAACACTCTTGTCACCAAAGAAACCACTGTCTGTGACTACTCTT CAACATCTGCCGTTCCAGCTTCCACCACCGGTTACAACAATTCTACTAAGGTTTCAAC CGCTACTATCTGCAGTACATGCAAAGAAGGTACCTCTACTGCAACTGACTTCTCTACA CTAAAGACTACAGTTACCGTATGTGACTCCGCCTGTCAAGCTAAGAAGTCTGCTACC GTTGTTAGCGTTCAATCTAAAACTACCGGTATCGTTGAACAAACCGAAAACGGTGCT GCCAAGGCTGTTATCGGTATGGGTGCCGGTGCTTTAGCTGCTGTTGCCGCCATGCTAC TATGA

    TABLE-US-00008 TABLE8 TerminatorSequences Sequence Sequence Info Info SequenceInfo AOX1 SEQID TCAAGAGGATGTCAGAATGCCATTTGCCTGAGAGATGCAGGCTTCATTTTTGATACTTTTTTATT terminator NO:310 TGTAACCTATATAGTATAGGATTTTTTTTGTCATTTTGTTTCTTCTCGTACGAGCTTGCTCCTGAT CAGCCTATCTCGCAGCAGATGAATATCTTGTGGTAGGGGTTTGGGAAAATCATTCGAGTTTGAT GTTTTTCTTGGTATTTCCCACTCCTCTTCAGAGTACAGAAGATTAAGTGAAACCTTCGTTTGTGC G TDH3 SEQID TCGATTTGTATGTGAAATAGCTGAAATTCGAAAATTTCATTATGGCTGTATCTACTTTAGCGTAT terminator NO:311 TAGGCATTTGAGCATTGGCTTGAACAATGCGGGCTGTAGTGTGTCACCAAAGAAACCATTCGGG TTCGGATCTGGAAGTCCTCATCACGTGATGCCGATCTCGTGTATTTTATTTTCAGATAACACCTG AAGACTTT RPS25A SEQID ATTAGTGTACATCTGATAATATAGTACTACCACGTATGATAATGTAGAGAATAGTCTTCCTTGTC terminator NO:312 GAGTGTGTTTGCAGTTTTCTTGAGTTTCAAGGTTTAAATGCTGGTATATTAGTTCATCGAAGGTT TCAGCCAATAGCACCTTAAATCAATCAAACTAATTCGACTCTTACGAAAGAGCCTACTGTGTTTA GTATCGAAGTCGTTTACCTTTCATGTTGAATAGCTTCCTCTCTGACCCTAACATTTCAAGATCCTC CTAAAGTTACCCGGATTGTGAAATTCTAATGATCCACCTGCCCAATGCATTTTTTCTTTATTCAGT TTACCTTTTTTACCTAATATACGAGCTTGTTAAAGTAAGTGGCACTGCAATACTAGGCTTATTGT TGATATTATGATGAATCGTTTTCACAAACTTGATTTCCTGTGAACTCACCATGTACTAAGGAAAA AAACATGCATCACCATCTGAATATTTGAC RPL2A SEQID ACTATGTAACTAACGAAACAGCATGTACTAATAGAACCGTATCGAGAATATTTATTTAGGTGAG terminator NO:313 TAGTAGGAGTGAACCAGACAGTCAATTTAGTGAGCTGTCCCAGCTTTTGTGCATTCCAGAATTG CCGGTCAAATTGGTTATGGGTTATGGGGCTTTTCCGATTGAGGTTCAGTTTCTGCGGTTATCTCTT TCTTGACCTGGTCTTTTACAGGCTGTTCTTTCTCCCCATGATTATTCTTTAGCTGAAGATACCGCT TAGCCTGATAATGTCGTCGTTTTGTAATCAAAATCTTTAGTTGGGCATCGTCTGAGGTTTCCTTTG GCTTCTGGGGTTGTTAGTAGGAACGTAGGAACCATAGTAACTTTTACACATACATTCTTATGATT GCGAAGTAAGCTGAGTCTGCTGCTTGGCTCCCGAAGTACTTTCTCTTTCTCTACCGGTTGATTCT CCTTCTGGTGCTCCTAAACGATTGTGTTAGAAGGGATTGAC (TDH3 SEQID GCGGCCGCTCGATTTGTATGTGAAATAGCTGAAATTCGAAAATTTCATTATGGCTGTATCTACTT trans- NO:341 TAGCGTATTAGGCATTTGAGCATTGGCTTGAACAATGCGGGCTGTAGTGTGTCACCAAAGAAAC criptional CATTCGGGTTCGGATCTGGAAGTCCTCATCACGTGATGCCGATCTCGTGTATTTTATTTTCAGAT terminator) AACACCTGAAGACTTT

    TABLE-US-00009 TABLE9 ExemplarySUCsurfacedisplaymolecules(surfacedisplayproteins) Sequence Sequence Info Info SequenceInfo Exemplary SEQID CAGGTGAACCCACCTAACTATTTTTAACTGGCATCCAGTGAGCTCGCTGGGTGAAAGCCAACCA SUC2 NO:314 TCTTTTGTTTCGGGGAACCGTGCTCGCCCCGTAAAGTTAATTTTTTTTTCCCGCGCAGCTTTAATC surface TTTCGGCAGAGAAGGCGTTTTCATCGTAGCGTGGGAACAGAATAATCAGTTCATGTGCTATACA display GGCACATGGCAGCAGTCACTATTTTGCTTTTTAACCTTAAAGTCGTTCATCAATCATTAACTGAC nucleotide CAATCAGATTTTTTGCATTTGCCACTTATCTAAAAATACTTTTGTATCTCGCAGATACGTTCAGTG sequence GTTTCCAGGACAACACCCAAAAAAAGGTATCAATGCCACTAGGCAGTCGGTTTTATTTTTGGTC ACCCACGCAAAGAAGCACCCACCTCTTTTAGGTTTTAAGTTGTGGGAACAGTAACACCGCCTAG AGCTTCAGGAAAAACCAGTACCTGTGACCGCAATTCACCATGATGCAGAATGTTAATTTAAACG AGTGCCAAATCAAGATTTCAACAGACAAATCAATCGATCCATAGTTACCCATTCCAGCCTTTTCG TCGTCGAGCCTGCTTCATTCCTGCCTCAGGTGCATAACTTTGCATGAAAAGTCCAGATTAGGGCA GATTTTGAGTTTAAAATAGGAAATATAAACAAATATACCGCGAAAAAGGTTTGTTTATAGCTTT TCGCCTGGTGCCGTACGGTATAAATACATACTCTCCTCCCCCCCCTGGTTCTCTTTTTCTTTTGTT ACTTACATTTTACCGTTCCGTCACTCGCTTCACTCAACAACAAAAGAATTCCGAAACG(GCW14 promoter) ATGAGATTCCCATCTATTTTCACCGCTGTCTTGTTCGCTGCCTCCTCTGCATTGGCTGCCCCTGTT CAGACTACCACTGAAGACGAGCTTGAGGGTGATTTCGACGTCGCTGTTTTGCCTTTCTCTGCTTC CATTGCTGCTAAGGAAGAGGGTGTCTCTCTCGAGAAAAGAGAGGCCGAAGCT(secretionsignal (mini-alpha,alphafactorsecretionsignalwithdeletionandmutationsto eliminateEPSproduction)) TCTATGACGAATGAGACGTCAGACAGGCCACTGGTACATTTCACACCAAATAAAGGATGGATGA ATGATCCCAACGGTCTGTGGTACGACGAGAAAGACGCTAAGTGGCATCTGTACTTTCAATATAA CCCCAATGACACTGTTTGGGGTACGCCCCTTTTCTGGGGCCATGCAACCTCAGATGATCTGACTA ACTGGGAAGACCAACCTATTGCCATCGCCCCCAAGCGTAATGATTCAGGCGCCTTCTCTGGAAG TATGGTAGTCGACTACAACAACACGAGTGGTTTTTTCAACGACACAATTGACCCAAGACAGAGA TGTGTTGCCATTTGGACATATAATACACCTGAGTCAGAAGAACAATATATATCCTACTCTCTGGA TGGAGGTTATACTTTTACGGAGTATCAGAAAAACCCTGTTCTGGCAGCTAATTCCACCCAATTTC GTGACCCAAAGGTGTTTTGGTATGAGCCATCACAGAAATGGATCATGACCGCCGCAAAGTCACA GGACTATAAAATTGAGATATATTCATCAGACGACTTGAAATCCTGGAAGCTGGAGAGTGCTTTC GCAAATGAAGGATTTTTGGGATACCAATACGAATGCCCTGGCCTGATCGAAGTGCCCACTGAAC 
AAGACCCATCAAAGTCTTACTGGGTGATGTTTATCTCTATAAACCCCGGCGCTCCCGCTGGAGG CTCCTTCAACCAATATTTCGTAGGTTCTTTTAACGGAACCCACTTCGAGGCATTCGATAACCAAT CTAGGGTTGTCGATTTCGGAAAAGATTATTATGCACTACAAACCTTTTTTAATACGGACCCTACT TATGGATCAGCTTTAGGCATAGCCTGGGCTTCTAACTGGGAGTACAGTGCCTTTGTTCCTACAAA CCCATGGCGTTCCTCAATGAGTCTTGTCAGAAAATTCTCTCTGAATACGGAATATCAGGCTAACC CCGAGACAGAACTAATAAACTTGAAAGCAGAGCCTATCTTGAATATAAGTAACGCTGGACCTTG GAGTCGTTTTGCCACCAACACCACATTAACAAAAGCCAATTCCTATAACGTGGACCTTTCCAACT CTACGGGAACACTGGAATTTGAACTGGTGTACGCCGTAAATACTACGCAAACAATTTCAAAGTC AGTCTTTGCTGACCTTAGTCTATGGTTCAAGGGTTTAGAGGACCCCGAGGAGTACTTACGTATGG GTTTTGAGGTATCAGCATCTTCCTTTTTCCTGGATCGTGGAAACTCCAAGGTGAAGTTTGTCAAG GAAAATCCTTATTTCACTAACAGGATGTCCGTGAACAACCAGCCTTTCAAATCTGAGAACGATC TTTCCTATTACAAGGTCTACGGACTTCTAGACCAAAACATATTGGAACTATACTTCAACGATGGA GATGTAGTTTCCACCAACACCTATTTTATGACCACAGGCAACGCCCTTGGATCAGTAAACATGA CAACAGGTGTTGATAACCTTTTTTACATAGACAAATTCCAAGTTAGAGAGGTAAAG(SUC2 sequence) GGTTCATCAGGGTCCTCAGGATCATCCGGTAGTAGTGGTTCATCCGGTTCATCCGGATCAAGTG GCTCCTCTGAAGCTGCAGCAAGGGAGGCTGCAGCCCGTGAGGCAGCCGCTAGAGAAGCCGCCG CTAGGGGTGGTGGCGGCTCTGGCGGAGGCGGTTCCGGTGGCGGAGGCTCT(flexlinkers) CAAATCAACGAATTGAACGTTGTTTTAGATGATGTTAAGACCAACATTGCCGACTACATCACCC TATCCTACACTCCAAATTCAGGTTTTTCCTTGGACCAAATGCCAGCTGGTATTATGGATATTGCT GCGCAATTGGTTGCAAATCCAAGTGATGACTCCTACACCACTTTGTACTCTGAAGTGGACTTTTC TGCTGTTGAGCATATGTTGACTATGGTCCCATGGTACTCTTCTAGACTGCTTCCAGAATTAGAAG CAATGGATGCTTCTCTAACTACCTCAAGTTCTGCTGCCACATCTTCAAGTGAAGTTGCTAGCTCT TCTATTGCTTCATCCACTAGCTCTTCTGTTGCACCATCCTCAAGTGAAGTTGTCAGCTCTTCCGTT GCTTCATCCTCAAGTGAAGTTGCCAGCTCCTCTGTTGCGTCTACAAGCGAAGCTACTAGTTCTTC TGCTGTCACATCTTCCTCCGCTGTTTCCTCTTCGACCGAGTCTGTTAGCTCTTCCTCTGTCAGTTC TTCCTCAGCCGTTTCCTCTTCTGAAGCTGTCAGTTCCTCTCCAGTTTCCTCAGTTGTTTCATCTTCG GCCGGACCTGCTAGCTCAAGCGTTGCTCCTTACAACTCAACCATTGCTAGCTCTTCTTCCACTGC CCAGACTTCTATCTCGACCATTGCTCCTTACAACTCCACAACCACCACCACCCCAGCTAGTTCTG CTTCCAGCGTTATTATCTCAACCAGAAACGGTACCACTGTTACTGAAACTGACAACACTCTTGTC ACCAAAGAAACCACTGTCTGTGACTACTCTTCAACATCTGCCGTTCCAGCTTCCACCACCGGTTA 
CAACAATTCTACTAAGGTTTCAACCGCTACTATCTGCAGTACATGCAAAGAAGGTACCTCTACT GCAACTGACTTCTCTACACTAAAGACTACAGTTACCGTATGTGACTCCGCCTGTCAAGCTAAGA AGTCTGCTACCGTTGTTAGCGTTCAATCTAAAACTACCGGTATCGTTGAACAAACCGAAAACGG TGCTGCCAAGGCTGTTATCGGTATGGGTGCCGGTGCTTTAGCTGCTGTTGCCGCCATGCTACTAT GA(Tir4anchors) GCGGCCGCTCGATTTGTATGTGAAATAGCTGAAATTCGAAAATTTCATTATGGCTGTATCTACTT TAGCGTATTAGGCATTTGAGCATTGGCTTGAACAATGCGGGCTGTAGTGTGTCACCAAAGAAAC CATTCGGGTTCGGATCTGGAAGTCCTCATCACGTGATGCCGATCTCGTGTATTTTATTTTCAGAT AACACCTGAAGACTTT(TDH3transcriptionalterminator) Exemplary SEQID SMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSDDL SUC2 NO:315 TNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYSLDG surface GYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAFANE display GFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRVVDF protein GKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETELINL sequence KAEPILNISNAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSLWFKG LEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNI LELYFNDGDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVK(SUC2sequence) GSSGSSGSSGSSGSSGSSGSSGSSEAAAREAAAREAAAREAAARGGGGSGGGGSGGGGS(linker sequence) QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVE HMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEV ASSSVASTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYN STIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPAST TGYNNSTKVSTATICSTCKEGTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGIVEQTENG AAKAVIGMGAGALAAVAAMLL(Tir4anchor) Exemplary SEQID SMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSDDL SUC2 NO:332 TNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYSLDG surface GYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAFANE display GFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRVVDF protein GKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETELINL sequence 
KAEPILNISNAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSLWFKG (without LEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNI C- LELYFNDGDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVK(SUC2sequence) terminus GSSGSSGSSGSSGSSGSSGSSGSSEAAAREAAAREAAAREAAARGGGGSGGGGSGGGGS(linker ofTir4 sequence) GPI QINELNVVLDDVKTNIADYITLSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVE anchoror HMLTMVPWYSSRLLPELEAMDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEV signal ASSSVASTSEATSSSAVTSSSAVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYN peptide) STIASSSSTAQTSISTIAPYNSTTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPAST TGYNNSTKVSTATICSTCKEGTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN Exemplary SEQID MRFPSIFTAVLFAASSALAAPVQTTTEDELEGDFDVAVLPFSASIAAKEEGVSLEKREAEASMTNETS SUC2 NO:333 DRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSDDLTNWEDQP surface LAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYSLDGGYTFTEYQ display KNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAFANEGFLGYQYE protein CPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRVVDFGKDYYALQ sequence TFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETELINLKAEPILNIS NAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSLWFKGLEDPEEYL RMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNILELYFND GDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVKGSSGSSGSSGSSGSSGSSGSSGSSEA AAREAAAREAAAREAAARGGGGSGGGGSGGGGSQINELNVVLDDVKTNIADYITLSYTPNSGFSLD QMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAA TSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSSTESVSSSS VSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSV IISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLK TTVTVCDSACQAKKSATVVSVQSKTTGIVEQTENGAAKAVIGMGAGALAAVAAMLL* Exemplary SEQID SMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSDDL SUC2 NO:334 TNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYSLDG surface GYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAFANE 
display GFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRVVDF protein GKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETELINL sequence KAEPILNISNAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSLWFKG (without LEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNI extreme LELYFNDGDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVKGSSGSSGSSGSSGSSGSSG C- SSGSSEAAAREAAAREAAAREAAARGGGGSGGGGSGGGGSQINELNVVLDDVKTNIADYITLSYTP terminus NSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEAMDASL oftheTir4 TTSSSAATSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSST GPI ESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTP anchoror ASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTA signal TDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN peptide) Exemplary SEQID MRFPSIFTAVLFAASSALAAPVQTTTEDELEGDFDVAVLPFSASIAAKEEGVSLEKREAEASMTNETS SUC2 NO:335 DRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHATSDDLTNWEDQP surface LAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYSLDGGYTFTEYQ display KNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAFANEGFLGYQYE protein CPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRVVDFGKDYYALQ sequence TFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETELINLKAEPILNIS (without NAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSLWFKGLEDPEEYL extreme RMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGLLDQNILELYFND C- GDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVKGSSGSSGSSGSSGSSGSSGSSGSSEA terminus AAREAAAREAAAREAAARGGGGSGGGGSGGGGSQINELNVVLDDVKTNIADYITLSYTPNSGFSLD oftheTir4 QMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEAMDASLTTSSSAA GPI TSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEVASSSVASTSEATSSSAVTSSSAVSSSTESVSSSS anchor) VSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNSTTTTTPASSASSV IISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKEGTSTATDFSTLK TTVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN Exemplary SEQID 
EAEASMTNETSDRPLVHFTPNKGWMNDPNGLWYDEKDAKWHLYFQYNPNDTVWGTPLFWGHAT SUC2 NO:342 SDDLTNWEDQPIAIAPKRNDSGAFSGSMVVDYNNTSGFFNDTIDPRQRCVAIWTYNTPESEEQYISYS surface LDGGYTFTEYQKNPVLAANSTQFRDPKVFWYEPSQKWIMTAAKSQDYKIEIYSSDDLKSWKLESAF display ANEGFLGYQYECPGLIEVPTEQDPSKSYWVMFISINPGAPAGGSFNQYFVGSFNGTHFEAFDNQSRV protein VDFGKDYYALQTFFNTDPTYGSALGIAWASNWEYSAFVPTNPWRSSMSLVRKFSLNTEYQANPETE sequence LINLKAEPILNISNAGPWSRFATNTTLTKANSYNVDLSNSTGTLEFELVYAVNTTQTISKSVFADLSL (Post- WFKGLEDPEEYLRMGFEVSASSFFLDRGNSKVKFVKENPYFTNRMSVNNQPFKSENDLSYYKVYGL processing LDQNILELYFNDGDVVSTNTYFMTTGNALGSVNMTTGVDNLFYIDKFQVREVKGSSGSSGSSGSSGS mature SGSSGSSGSSEAAAREAAAREAAAREAAARGGGGSGGGGSGGGGSQINELNVVLDDVKTNIADYIT sequence LSYTPNSGFSLDQMPAGIMDIAAQLVANPSDDSYTTLYSEVDFSAVEHMLTMVPWYSSRLLPELEA (with MDASLTTSSSAATSSSEVASSSIASSTSSSVAPSSSEVVSSSVASSSSEVASSSVASTSEATSSSAVTSSS secretion AVSSSTESVSSSSVSSSSAVSSSEAVSSSPVSSVVSSSAGPASSSVAPYNSTIASSSSTAQTSISTIAPYNS signal TTTTTPASSASSVIISTRNGTTVTETDNTLVTKETTVCDYSSTSAVPASTTGYNNSTKVSTATICSTCKE cleaved GTSTATDFSTLKTTVTVCDSACQAKKSATVVSVQSKTTGIVEQTEN offtheN- termand propeptide ofTir4 cleaved offtheC- term))

    TABLE-US-00010 TABLE 10 Exemplary transporter proteins

    Sequence Info: Saccharomyces cerevisiae MAL11 or AGT1 - sucrose permease, UniProtKB - P53048 (MAL11_YEAST)
    SEQ ID NO: 316
    Sequence: MKNIISLVSKKKAASKNEDKNISESSRDIVNQQEVENTEDFEEGKKDSAFELDHLEFTTNSAQLGDS DEDNENVINEMNATDDANEANSEEKSMTLKQALLKYPKAALWSILVSTTLVMEGYDTALLSALY ALPVFQRKFGTLNGEGSYEITSQWQIGLNMCVLCGEMIGLQITTYMVEFMGNRYTMITALGLLTAY IFILYYCKSLAMIAVGQILSAIPWGCFQSLAVTYASEVCPLALRYYMTSYSNICWLFGQIFASGIMKN SQENLGNSDLGYKLPFALQWIWPAPLMIGIFFAPESPWWLVRKDRVAEARKSLSRILSGKGAEKDI QVDLTLKQIELTIEKERLLASKSGSFFNCFKGVNGRRTRLACLTWVAQNSSGAVLLGYSTYFFERA GMATDKAFTFSLIQYCLGLAGTLCSWVISGRVGRWTILTYGLAFQMVCLFIIGGMGFGSGSSASNG AGGLLLALSFFYNAGIGAVVYCIVAEIPSAELRTKTIVLARICYNLMAVINAILTPYMLNVSDWNWG AKTGLYWGGFTAVTLAWVIIDLPETTGRTFSEINELFNQGVPARKFASTVVDPFGKGKTQHDSLAD ESISQSSSIKQRELNAADKC

    Sequence Info: Pichia angusta MAL2 - maltose transporter, UniProtKB - Q32SL4 (Q32SL4_PICAN)
    SEQ ID NO: 317
    Sequence: MPEFVENIEKPEEAEVIPDITKKINTLSDSDDGSGAFNDYIARFVEISTNAQNNEHQEKHMSLKEGLK TFPKAACWSIVLSTAIIMEGYDTTLLNSLYSMQSFAKKYGKYYPEIDQYQVPAKWQTSLSMSTYVG EIVGLYIAGLVAEKWGYRRTLISFMAAVVGLIFILFFAVDVQMLLAGELLCGIVWGAFQTLTVSYA SEVCPVVLRIYLTTYVNACWVIGQLIAACLLRGTMTLTSEWSYKIPFAVQWIWPVPIMIGIYLAPESP WWLVKKNRDAEAKKSITRLLSPNTEVPDVAPLAEAMLNKMQLTIKEESARTSNVSYFDCFKHGNF RRTRIAAMIWLIQNITGSVLMGYSTYFYIQAGLDSSMSFTFSIIQYALGLLGTLASWLLSQKLGRFDI YFLGLSINTCILIIVGGLGFSSSTSASWAIGSLLLVFTFVYDSSIGPITYCTVAEIPSSTVRAKTVALAR NWYNLSQIPLSIVTPYMLNPTAWNWKAKAALLWAGLSICSLIYIWFEFPETKGRTYAELDILFKNGT SARKFRSTQVETFNPQEMLKKMNNEDIIQVVDGDLDAGAATAKV
    Expression or Secretion of a Protein of Interest in Host Cells with an Alternative Carbon Source

    [0166] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about the same amount of a protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In these embodiments, "about the same amount" includes production of from about 1% to about 10% more or less of the protein of interest.
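
    As a rough illustration of the comparison described in paragraph [0166], the following sketch computes the percent difference in protein of interest (POI) production between an engineered cell and a comparison cell, then checks whether that difference lies within a 10% band in either direction. The function names, the default tolerance argument, and the example amounts are assumptions introduced here for illustration and do not appear in the specification; the units of the amounts are arbitrary so long as both values use the same units.

```python
def percent_difference(engineered: float, control: float) -> float:
    """Percent change of the engineered cell's POI amount relative to the control cell."""
    if control <= 0:
        raise ValueError("control amount must be positive")
    return (engineered - control) / control * 100.0


def is_about_the_same(engineered: float, control: float,
                      tolerance_pct: float = 10.0) -> bool:
    """True when the POI amounts differ by no more than tolerance_pct, more or less.

    The 10% default is an illustrative reading of "about the same amount".
    """
    return abs(percent_difference(engineered, control)) <= tolerance_pct


if __name__ == "__main__":
    # Hypothetical POI amounts (same arbitrary units for both cells).
    engineered_amount = 1.05
    control_amount = 1.00
    print(round(percent_difference(engineered_amount, control_amount), 2))
    print(is_about_the_same(engineered_amount, control_amount))
```

    The same calculation applies to secreted amounts if the inputs are measured in the culture supernatant rather than as total production.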

    [0167] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 15% to about 150%, about 15% to about 200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% to about 100%, about 20% to about 
200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, or about 150% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at most about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide.
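
    The ranges recited in paragraph [0167] can be read against a measured result as in the sketch below, which computes how much more POI an engineered cell produced and lists the lower and upper bound pairs that bracket that increase. This is an illustrative sketch only: the names BOUNDS_PCT, percent_more, and matching_ranges are hypothetical, the code enumerates every pair of the recited endpoint values (the paragraph itself does not recite every such pair), and the word "about" is not modeled, so the bounds are treated as exact.

```python
# Endpoint values (in percent) that appear in the recited ranges.
BOUNDS_PCT = (1, 2, 5, 10, 15, 20, 30, 40, 50, 75, 100, 150, 200)


def percent_more(engineered: float, control: float) -> float:
    """Percent increase of the engineered cell's POI amount over the control cell."""
    if control <= 0:
        raise ValueError("control amount must be positive")
    return (engineered - control) / control * 100.0


def matching_ranges(increase_pct: float, bounds=BOUNDS_PCT):
    """Return every (low, high) pair of endpoints with low <= increase_pct <= high.

    Assumption: all endpoint pairs are enumerated, and bounds are exact.
    """
    return [
        (low, high)
        for i, low in enumerate(bounds)
        for high in bounds[i + 1:]
        if low <= increase_pct <= high
    ]


if __name__ == "__main__":
    # Hypothetical amounts: engineered cell makes 1.6 units, control makes 1.0.
    increase = percent_more(1.6, 1.0)
    for low, high in matching_ranges(increase):
        print(f"about {low}% to about {high}%")
```

    Swapping in secreted amounts for the two inputs gives the corresponding check for secretion.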

    [0168] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about the same amount of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In these embodiments, "about the same amount" includes secretion of from about 1% to about 10% more or less of the protein of interest.

    [0169] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 15% to about 150%, about 15% to about 200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% to about 100%, about 20% to about 
200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, or about 150% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at most about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide.

    [0170] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about the same amount of a protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In these embodiments, about the same amount includes from about 1% to about 10% more or less protein of interest production.
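    The percentage comparisons used in this section can be illustrated with a short calculation. The following is a minimal sketch only, assuming that "X% more" means the relative difference between a measured amount for the engineered host cell and the amount for the similar (control) cell, and that "about the same amount" is read as a difference of no more than 10% in either direction. The function names, the tolerance parameter, and the example values are hypothetical and do not come from the disclosure.

```python
def percent_change(engineered: float, control: float) -> float:
    """Percent more (+) or less (-) for the engineered cell relative to the control cell.

    Both arguments are amounts in the same units (e.g., a titer); this is an
    illustrative assumption, not a measurement method from the disclosure.
    """
    if control <= 0:
        raise ValueError("control amount must be positive")
    return (engineered - control) * 100.0 / control


def is_about_same(engineered: float, control: float, tolerance_pct: float = 10.0) -> bool:
    """True when the two amounts differ by no more than tolerance_pct percent.

    The 10% default mirrors the "about 1% to about 10% more or less" wording;
    treating it as a hard cutoff is an assumption made for illustration.
    """
    return abs(percent_change(engineered, control)) <= tolerance_pct


if __name__ == "__main__":
    # Hypothetical amounts in arbitrary units.
    print(percent_change(105.0, 100.0))   # 5% more than the control
    print(is_about_same(105.0, 100.0))    # within the 10% band
    print(percent_change(2100.0, 100.0))  # 2000% more, i.e. 21 times the control
```

    Under this reading, "about 2000% more" corresponds to roughly 21 times the control amount, and "about 100% more" to roughly twice the control amount.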

    [0171] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 15% to about 150%, about 15% to about 
200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% to about 100%, about 20% to about 200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 150%, or about 100% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at most about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150% or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source.

    [0172] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about the same amount of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In these embodiments, about the same amount includes from about 1% to about 10% more or less protein of interest secretion.

    [0173] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 15% to about 150%, about 15% to about 
200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% to about 100%, about 20% to about 200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 150%, or about 100% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at most about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150% or about 200% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source.

    [0174] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to about 300%, about 100% to about 400%, about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 200%, about 150% to about 300%, about 
150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, or at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, at least about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source.

    [0175] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to about 300%, about 100% to about 400%, about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 200%, about 150% to about 300%, about 
150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, or about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, or at least about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source.

    [0176] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to about 300%, about 100% to about 400%, 
about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 200%, about 150% to about 300%, about 150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, or at least about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide produces at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose.

    [0177] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to about 300%, about 100% to about 400%, 
about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 200%, about 150% to about 300%, about 150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, or about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, or at least about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide secretes at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more of the protein of interest compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose.

    Cell Growth in Host Cells with an Alternative Carbon Source

    [0178] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about the same amount of cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In these embodiments, about the same amount includes from about 1% to about 10% more or less cellular proliferation and/or cellular growth.

    [0179] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1% to about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 15% to about 150%, about 15% to about 200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% 
to about 100%, about 20% to about 200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 150%, or about 100% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, or about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide.

    [0180] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about the same amount of cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each is fed a growth medium comprising glucose as its carbon source. In these embodiments, "about the same amount" includes from about 1% to about 10% more or less cellular proliferation and/or cellular growth.

    [0181] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1% to about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1% to about 2%, about 1% to about 5%, about 1% to about 10%, about 1% to about 15%, about 1% to about 20%, about 1% to about 30%, about 1% to about 40%, about 1% to about 50%, about 1% to about 75%, about 1% to about 100%, about 1% to about 150%, about 1% to about 200%, about 2% to about 5%, about 2% to about 10%, about 2% to about 15%, about 2% to about 20%, about 2% to about 30%, about 2% to about 40%, about 2% to about 50%, about 2% to about 75%, about 2% to about 100%, about 2% to about 150%, about 2% to about 200%, about 5% to about 10%, about 5% to about 15%, about 5% to about 20%, about 5% to about 30%, about 5% to about 40%, about 5% to about 50%, about 5% to about 75%, about 5% to about 100%, about 5% to about 150%, about 5% to about 200%, about 10% to about 15%, about 10% to about 20%, about 10% to about 30%, about 10% to about 40%, about 10% to about 50%, about 10% to about 75%, about 10% to about 100%, about 10% to about 200%, about 15% to about 20%, about 15% to about 30%, about 15% to about 40%, about 15% to about 50%, about 15% to about 75%, about 15% to about 100%, about 
15% to about 150%, about 15% to about 200%, about 20% to about 30%, about 20% to about 40%, about 20% to about 50%, about 20% to about 75%, about 20% to about 100%, about 20% to about 200%, about 30% to about 40%, about 30% to about 50%, about 30% to about 75%, about 30% to about 100%, about 30% to about 150%, about 30% to about 200%, about 40% to about 50%, about 40% to about 75%, about 40% to about 100%, about 40% to about 150%, about 40% to about 200%, about 50% to about 75%, about 50% to about 100%, about 50% to about 200%, about 75% to about 100%, about 75% to about 200%, or about 100% to about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 100%, about 150%, or about 200% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at least about 1%, about 2%, about 5%, about 10%, about 15%, about 20%, about 30%, about 40%, about 50%, about 75%, about 150%, or about 100% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising glucose as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each is fed a growth medium comprising glucose as its carbon source.

    [0182] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10% to about 2000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to about 300%, about 100% to about 400%, about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 
200%, about 150% to about 300%, about 150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, or at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, at least about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each are fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when each is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source.

    [0183] In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10% to about 2000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10% to about 20%, about 10% to about 50%, about 10% to about 100%, about 10% to about 150%, about 10% to about 200%, about 10% to about 300%, about 10% to about 400%, about 10% to about 500%, about 10% to about 750%, about 10% to about 1000%, about 10% to about 1500%, about 10% to about 2000%, about 20% to about 50%, about 20% to about 100%, about 20% to about 150%, about 20% to about 200%, about 20% to about 300%, about 20% to about 400%, about 20% to about 500%, about 20% to about 750%, about 20% to about 1000%, about 20% to about 1500%, about 20% to about 2000%, about 50% to about 100%, about 50% to about 150%, about 50% to about 200%, about 50% to about 300%, about 50% to about 400%, about 50% to about 500%, about 50% to about 750%, about 50% to about 1000%, about 50% to about 1500%, about 50% to about 2000%, about 100% to about 150%, about 100% to about 200%, about 100% to 
about 300%, about 100% to about 400%, about 100% to about 500%, about 100% to about 750%, about 100% to about 1000%, about 100% to about 2000%, about 150% to about 200%, about 150% to about 300%, about 150% to about 400%, about 150% to about 500%, about 150% to about 750%, about 150% to about 1000%, about 150% to about 1500%, about 150% to about 2000%, about 200% to about 300%, about 200% to about 400%, about 200% to about 500%, about 200% to about 750%, about 200% to about 1000%, about 200% to about 2000%, about 300% to about 400%, about 300% to about 500%, about 300% to about 750%, about 300% to about 1000%, about 300% to about 1500%, about 300% to about 2000%, about 400% to about 500%, about 400% to about 750%, about 400% to about 1000%, about 400% to about 1500%, about 400% to about 2000%, about 500% to about 750%, about 500% to about 1000%, about 500% to about 2000%, about 750% to about 1000%, about 750% to about 2000%, or about 1000% to about 2000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at least about 10%, at least about 20%, at least about 50%, at least about 100%, at least about 150%, at least about 200%, at least about 300%, at least about 400%, at least about 500%, at least about 750%, at least about 1000%, at least about 1500%, at least about 2000%, at least about 3000%, at least about 4000%, at least about 5000%, at least about 7500%, or at least about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides about 10%, about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose. 
In some embodiments, the engineered host cell which expresses a surface-displayed enzyme that hydrolyses a disaccharide provides at most about 20%, about 50%, about 100%, about 150%, about 200%, about 300%, about 400%, about 500%, about 750%, about 1000%, about 1500%, about 2000%, about 3000%, about 4000%, about 5000%, about 7500%, or about 10000% more cellular proliferation and/or cellular growth compared to a similar cell that does not express a surface-displayed enzyme that hydrolyses a disaccharide when the engineered host cell is fed a growth medium comprising a disaccharide, e.g., sucrose, as its carbon source and the similar cell is fed a growth medium comprising glucose.

    [0184] Any aspect or embodiment described herein can be combined with any other aspect or embodiment as disclosed herein.

    Definitions

    [0185] Unless defined otherwise, all terms of art, notations and other technical and scientific terms or terminology used herein are intended to have the same meaning as is commonly understood by one of ordinary skill in the art to which the claimed subject matter pertains. In some cases, terms with commonly understood meanings are defined herein for clarity and/or for ready reference, and the inclusion of such definitions herein should not necessarily be construed to represent a substantial difference over what is generally understood in the art.

    [0186] As used in the specification and claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise.

    [0187] As used herein, the phrases "at least one," "one or more," and "and/or" are open-ended expressions that are both conjunctive and disjunctive in operation. For example, each of the expressions "at least one of A, B and C," "at least one of A, B, or C," "one or more of A, B, and C," "one or more of A, B, or C," and "A, B, and/or C" means A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B and C together.
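
    The seven readings listed for "at least one of A, B, and C" are the non-empty combinations of the three items. A minimal illustrative sketch follows; the function name at_least_one_of is hypothetical and is not part of the disclosure.

```python
from itertools import combinations


def at_least_one_of(items):
    """Return every non-empty combination of the given items, smallest first."""
    items = list(items)
    return [
        combo
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


# The three placeholder items A, B, and C from the definition above.
combos = at_least_one_of(["A", "B", "C"])
for combo in combos:
    print(" and ".join(combo))
```

    For three items the sketch yields seven combinations, matching the seven alternatives enumerated in the definition.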

    [0188] As used herein, "or" may refer to "and," "or," or "and/or" and may be used both exclusively and inclusively. For example, the term "A or B" may refer to "A or B," "A but not B," "B but not A," and "A and B." In some cases, context may dictate a particular meaning.

    [0189] The term "about" or "approximately" means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, e.g., the limitations of the measurement system. For example, "about" can mean within 1 or more than 1 standard deviation, per the practice in the art. Alternatively, "about" can mean a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 1% of a given value. Alternatively, particularly with respect to biological systems or processes, the term can mean within an order of magnitude, preferably within 5-fold, and more preferably within 2-fold, of a value. Where particular values are described in the application and claims, unless otherwise stated the term "about" meaning within an acceptable error range for the particular value should be assumed.
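
    Two of the readings of "about" given above, a percentage band around a stated value and a fold band around a stated value, can be expressed as simple checks. This is a sketch only: the function names, the default tolerance of 10%, and the default of 2-fold are illustrative assumptions chosen from the options listed, and the standard-deviation reading is not modeled.

```python
def is_about(measured, stated, tolerance=0.10):
    """True if measured lies within +/- tolerance (a fraction) of the stated value.

    The 10% default is an assumption; the text lists 20%, 15%, 10%, 5%, and 1%.
    """
    return abs(measured - stated) <= tolerance * abs(stated)


def is_within_fold(measured, stated, fold=2.0):
    """True if measured lies between stated/fold and stated*fold.

    Assumes both values are positive, as for most biological measurements.
    """
    if measured <= 0 or stated <= 0 or fold < 1:
        raise ValueError("values must be positive and fold must be >= 1")
    return stated / fold <= measured <= stated * fold


print(is_about(108, 100))                  # 8% away, inside a 10% band
print(is_about(125, 100, tolerance=0.20))  # 25% away, outside a 20% band
print(is_within_fold(150, 100))            # inside a 2-fold band
print(is_within_fold(600, 100, fold=5))    # outside a 5-fold band
```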

    [0190] The term "substantially" means to a significant extent, for the most part, or essentially. In other words, the term "substantially" may mean nearly exact to the desired attribute or slightly different from the exact attribute. "Substantially" may be indistinguishable from the desired attribute. "Substantially" may be distinguishable from the desired attribute but the difference is unimportant or negligible.

    [0191] The terms "comprise," "comprising," "contain," "containing," "including," "includes," "having," "has," "with," or variants thereof, as used in the present disclosure and/or in the claims, are intended to be inclusive in a manner similar to the term "comprising."

    [0192] Throughout this application, various embodiments may be presented in a range format. It should be understood that the description in range format is merely for convenience and brevity and should not be construed as an inflexible limitation on the scope of the disclosure. Accordingly, the description of a range should be considered to have specifically disclosed all the possible subranges as well as individual numerical values within that range. For example, description of a range such as from 1 to 6 should be considered to have specifically disclosed subranges such as from 1 to 3, from 1 to 4, from 1 to 5, from 2 to 4, from 2 to 6, from 3 to 6 etc., as well as individual numbers within that range, for example, 1, 2, 3, 4, 5, and 6. This applies regardless of the breadth of the range.
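
    The range example above (from 1 to 6) can be enumerated directly. The sketch below is illustrative only and restricts itself to integer endpoints, which is a simplifying assumption; the paragraph itself covers all possible subranges and individual values, not only integer ones. The function name integer_subranges is hypothetical.

```python
def integer_subranges(low, high):
    """List every (start, end) pair with low <= start < end <= high."""
    return [
        (start, end)
        for start in range(low, high + 1)
        for end in range(start + 1, high + 1)
    ]


subranges = integer_subranges(1, 6)
individual_values = list(range(1, 7))

print(len(subranges))      # number of integer subranges of 1 to 6
print((1, 3) in subranges, (2, 6) in subranges)
print(individual_values)
```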

    [0193] The terms "increased," "increasing," or "increase" are used herein to generally mean an increase by a statistically significant amount relative to a reference level. In some aspects, the terms "increased" or "increase" mean an increase of at least 10% as compared to a reference level, for example an increase of at least about 10%, at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% increase or any increase between 10-100% as compared to a reference level. Other examples of "increase" include an increase of at least 2-fold, at least 5-fold, at least 10-fold, at least 20-fold, at least 50-fold, at least 100-fold, at least 1000-fold or more as compared to a reference level.

    [0194] The terms "decreased," "decreasing," or "decrease" are used herein generally to mean a decrease in a value relative to a reference level. In some aspects, "decreased" or "decrease" means a reduction by at least 10% as compared to a reference level, for example a decrease by at least about 20%, or at least about 30%, or at least about 40%, or at least about 50%, or at least about 60%, or at least about 70%, or at least about 80%, or at least about 90% or up to and including a 100% decrease (e.g., absent level or non-detectable level as compared to a reference level), or any decrease between 10-100% as compared to a reference level.
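
    The percentage and fold thresholds in the definitions of "increase" and "decrease" reduce to simple arithmetic against a reference level. The following sketch is illustrative: the function names are hypothetical, fold change is taken here to mean the ratio of the value to the reference (an assumption), and the statistical-significance aspect of "increase" is not modeled.

```python
def percent_change(value, reference):
    """Signed change relative to a positive reference level, in percent."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return (value - reference) * 100 / reference


def fold_change(value, reference):
    """Ratio of value to a positive reference level (assumed meaning of 'fold')."""
    if reference <= 0:
        raise ValueError("reference level must be positive")
    return value / reference


def is_increased(value, reference, min_percent=10):
    """True if value exceeds the reference by at least min_percent."""
    return percent_change(value, reference) >= min_percent


def is_decreased(value, reference, min_percent=10):
    """True if value falls below the reference by at least min_percent."""
    return percent_change(value, reference) <= -min_percent


print(percent_change(120, 100), is_increased(120, 100))
print(percent_change(0, 100), is_decreased(0, 100))  # a 100% decrease (absent level)
print(fold_change(500, 100))
```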

    [0195] As used herein, "engineered host cells" are host cells which have been manipulated using genetic engineering, i.e., by human intervention. When a host cell is engineered to "underexpress" a given protein, the host cell is manipulated such that it no longer has the capability to express the protein described, or a functional homologue thereof, as a non-engineered host cell does.

    [0196] "Prior to engineering," when used in the context of host cells of the present invention, means that such host cells have not been engineered, such that a polynucleotide encoding a recombinant protein or functional homologue thereof is not expressed.

    [0197] A nucleic acid is "operably linked" when it is placed into a functional relationship with another nucleic acid sequence on the same nucleic acid molecule. For example, a promoter is operably linked with a coding sequence of a recombinant gene when it is capable of effecting the expression of that coding sequence.

    [0198] For the purpose of the present invention the term "protein" is also meant to encompass functional homologues of the proteins described.

    [0199] Sequence identity, such as for the purpose of assessing percent complementarity, may be measured by any suitable alignment algorithm, including but not limited to the Needleman-Wunsch algorithm (see e.g., the EMBOSS Needle aligner available at the World Wide Web at ebi.ac.uk/Tools/psa/emboss_needle/nucleotide.html, optionally with default settings), the BLAST algorithm (see e.g., the BLAST alignment tool available at blast.ncbi.nlm.nih.gov/Blast.cgi, optionally with default settings), and the Smith-Waterman algorithm (see e.g., the EMBOSS Water aligner available at the World Wide Web at ebi.ac.uk/Tools/psa/emboss_water/nucleotide.html, optionally with default settings). Optimal alignment may be assessed using any suitable parameters of a chosen algorithm, including default parameters.
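
    To make the identity calculation concrete, the sketch below implements a basic Needleman-Wunsch global alignment and reports percent identity. It is a teaching example under stated assumptions, not the EMBOSS or BLAST implementation: the scoring values (match +1, mismatch -1, linear gap -1) are arbitrary, and identity is computed as identical columns divided by alignment length, which is only one of several conventions in use.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aligned_a, aligned_b = needleman_wunsch(a, b)
    if not aligned_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Short made-up sequences, not sequences from the disclosure.
print(needleman_wunsch("ACGT", "AGT"))
print(percent_identity("ACGT", "AGT"))
```

    With these toy inputs, three of the four alignment columns are identical, so the reported identity is 75%. Production tools use substitution matrices and affine gap penalties, so their results on real sequences will generally differ from this sketch.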

    [0200] The term "bird" includes both domesticated birds and non-domesticated birds such as wildlife and the like. Birds include, but are not limited to, poultry, fowl, waterfowl, game bird, ratite (e.g., flightless bird), chicken (Gallus gallus, Gallus domesticus, or Gallus gallus domesticus), quail, turkey, duck, ostrich (Struthio camelus), Somali ostrich (Struthio molybdophanes), goose, gull, guineafowl, pheasant, emu (Dromaius novaehollandiae), American rhea (Rhea americana), Darwin's rhea (Rhea pennata), and kiwi. Tissues, cells, and their progeny of a biological entity obtained in vivo or cultured in vitro are also encompassed. A bird may lay eggs.

    ADDITIONAL EMBODIMENTS

    [0201] Embodiment 1: An engineered host cell comprising: an integrated coding sequence of a fusion protein comprising a catalytic domain of a heterologous glycosyl hydrolase; and an integrated coding sequence of a heterologous protein of interest (POI). In this embodiment, the engineered host cell does not endogenously express the glycosyl hydrolase and the POI; and the glycosyl hydrolase is anchored on the surface of the engineered host cell.

    [0202] Embodiment 2: A method of growing/culturing the engineered host cell of Embodiment 1, wherein the method comprises culturing the engineered host cell with a carbon source that is not naturally utilized by the host cell in the absence of the glycosyl hydrolase.

    [0203] Embodiment 3: A method for growing/culturing a host cell with a carbon source that is not naturally utilized by the host cell, the method comprising: (a) recombinantly producing in the host cell a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose; optionally, wherein the glycosyl hydrolase capable of digesting sucrose is an invertase; and (b) recombinantly producing in the host cell a heterologous protein of interest (POI). In this embodiment, the host cell does not express the glycosyl hydrolase endogenously, the engineered host cell prior to step (a) does not utilize sucrose as a carbon source as efficiently as glucose, and the glycosyl hydrolase is expressed on the surface of the engineered host cell.

    [0204] Embodiment 4: A method for manufacturing a host cell capable of utilizing a carbon source that is not naturally utilized by the host cell, the method comprising: (a) obtaining a host cell that recombinantly expresses a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose; optionally, wherein the glycosyl hydrolase capable of digesting sucrose is an invertase; and (b) genetically modifying the host cell to express a heterologous protein of interest (POI). In this embodiment, the host cell does not utilize sucrose as a carbon source as efficiently as glucose in the absence of the glycosyl hydrolase.

    [0205] Embodiment 5: A method for manufacturing a host cell capable of utilizing a carbon source that is not naturally utilized by the host cell, the method comprising: (a) obtaining a host cell that recombinantly expresses a heterologous protein of interest (POI); and (b) genetically modifying the host cell to express a fusion protein comprising a catalytic domain of a glycosyl hydrolase capable of digesting sucrose, optionally, the glycosyl hydrolase capable of digesting sucrose is an invertase. In this embodiment, the host cell prior to step (b) does not utilize sucrose as a carbon source as efficiently as glucose.

    [0206] Embodiment 6: The engineered host cell of Embodiment 1 or the method of Embodiment 2, wherein the glycosyl hydrolase is an invertase from S. cerevisiae.

    [0207] Embodiment 7: The engineered host cell or the method of Embodiment 3, wherein the invertase is encoded by the SUC2 gene.

    [0208] Embodiment 8: The engineered host cell or the method of Embodiment 3, wherein the invertase is encoded by the MAL1 gene.

    [0209] Embodiment 9: The engineered host cell or the method of any one of the previous claims, wherein the fusion protein is surface-displayed on the engineered host cell; wherein the surface-displayed fusion protein comprises a catalytic domain of the glycosyl hydrolase and an anchoring domain of a glycosylphosphatidylinositol (GPI)-anchored protein, wherein the anchoring domain comprises at least about 200 amino acids and/or at least about 30% of the residues in the anchoring domain are serines or threonines.

    [0210] Embodiment 10: The engineered host cell or the method of Embodiment 9, wherein the anchoring domain comprises at least about 225 amino acids, at least about 250 amino acids, at least about 275 amino acids, at least about 300 amino acids, at least about 325 amino acids, at least about 350 amino acids, at least about 375 amino acids, or at least about 400 amino acids.

    [0211] Embodiment 11: The engineered host cell or the method of Embodiment 9 or Embodiment 10, wherein at least about 35% of the residues in the anchoring domain are serines or threonines, at least about 40% of the residues in the anchoring domain are serines or threonines, at least about 45% of the residues in the anchoring domain are serines or threonines, or at least about 50% of the residues in the anchoring domain are serines or threonines.

    [0212] Embodiment 12: The engineered host cell or the method of Embodiment 11, wherein the serines or threonines in the anchoring domain are capable of being O-mannosylated.

    [0213] Embodiment 13: The engineered host cell or the method of any one of the preceding claims, wherein a fusion protein having an anchoring domain comprising at least about 325 amino acids provides greater glycosyl hydrolase activity relative to a fusion protein having an anchoring domain comprising less than about 300 amino acids.

    [0214] Embodiment 14: The engineered host cell or the method of any one of the preceding claims, wherein a fusion protein having an anchoring domain comprising at least about 300 amino acids provides greater glycosyl hydrolase activity relative to a fusion protein having an anchoring domain comprising less than about 250 amino acids.

    [0215] Embodiment 15: The engineered host cell or the method of any one of the preceding claims, wherein the fusion protein comprises the anchoring domain of the GPI anchored protein.

    [0216] Embodiment 16: The engineered host cell or the method of any one of the preceding claims, wherein the fusion protein comprises the GPI anchored protein without its native signal peptide or native secretory signal.

    [0217] Embodiment 17: The engineered host cell or the method of any one of the preceding claims, wherein the GPI anchored protein is not native to the engineered host cell.

    [0218] Embodiment 18: The engineered host cell or the method of any one of the preceding claims, wherein the GPI anchored protein is naturally expressed by a S. cerevisiae cell and the engineered host cell is not a S. cerevisiae cell.

    [0219] Embodiment 19: The engineered host cell or the method of any one of the preceding claims, wherein the GPI anchored protein is selected from Tir4, Dan1, or Sed1.

    [0220] Embodiment 20: The engineered host cell or the method of Embodiment 19, wherein an anchoring domain of the GPI anchored protein comprises an amino acid sequence that is at least 70% identical, at least 75% identical, at least 80% identical, at least 85% identical, at least 90% identical, or at least 95% identical, to one of SEQ ID NO: 1 to SEQ ID NO: 14.

    [0221] Embodiment 21: The engineered host cell or the method of Embodiment 19 or Embodiment 20, wherein the anchoring domain of the GPI anchored protein comprises an amino acid sequence of one of SEQ ID NO: 1 to SEQ ID NO: 14.

    [0222] Embodiment 22: The engineered host cell or the method of any one of the preceding claims, wherein the engineered host cell is a yeast cell.

    [0223] Embodiment 23: The engineered host cell or the method of any one of the preceding claims, wherein the engineered host cell is a Pichia species.

    [0224] Embodiment 24: The engineered host cell or the method of Embodiment 23, wherein the Pichia species is Pichia pastoris.

    [0225] Embodiment 25: The engineered host cell or the method of any one of the preceding claims, wherein the engineered host cell comprises a genomic modification that expresses the fusion protein.

    [0226] Embodiment 26: The engineered host cell or the method of any one of the preceding claims, wherein the fusion protein comprises a portion of the glycosyl hydrolase in addition to its catalytic domain.

    [0227] Embodiment 27: The engineered host cell or the method of any one of the preceding claims, wherein the fusion protein comprises substantially the entire amino acid sequence of the glycosyl hydrolase.

    [0228] Embodiment 28: The engineered host cell or the method of any one of Embodiments 20-27, wherein in the fusion protein, the catalytic domain is N-terminal to the anchoring domain.

    [0229] Embodiment 29: The engineered host cell or the method of any one of Embodiments 20-27, wherein in the fusion protein, the catalytic domain is C-terminal to the anchoring domain.

    [0230] Embodiment 30: The engineered host cell or the method of any one of the preceding claims, wherein the fusion protein comprises a linker between the catalytic domain and the anchoring domain.

    [0231] Embodiment 31: The engineered host cell or the method of any one of the preceding claims, wherein, upon translation, the fusion protein comprises a signal peptide and/or a secretory signal.

    [0232] Embodiment 32: The engineered host cell or the method of any one of the preceding claims, wherein a growth rate of the engineered host cell in a media containing sucrose as a primary carbon source is higher than a growth rate of a control host cell, wherein the control host cell is identical to the engineered host cell, except the control cell does not express the glycosyl hydrolase.

    [0233] Embodiment 33: The engineered eukaryotic cell of any one of the preceding claims, wherein the engineered eukaryotic cell comprises a genomic modification that overexpresses a secreted recombinant protein and/or comprises an extrachromosomal modification that overexpresses a secreted recombinant protein.

    [0234] Embodiment 34: The engineered eukaryotic cell of Embodiment 33, wherein the secreted recombinant protein is an animal protein.

    [0235] Embodiment 35: The engineered eukaryotic cell of Embodiment 34, wherein the animal protein is an egg protein.

    [0236] Embodiment 36: The engineered eukaryotic cell of Embodiment 35, wherein the egg protein is selected from the group consisting of ovalbumin, ovomucoid, lysozyme, ovoglobulin G2, ovoglobulin G3, α-ovomucin, β-ovomucin, ovotransferrin, ovoinhibitor, ovoglycoprotein, flavoprotein, ovomacroglobulin, ovostatin, cystatin, avidin, ovalbumin related protein X, and ovalbumin related protein Y.

    [0237] Embodiment 37: The engineered eukaryotic cell of any one of Embodiments 33 to 36, wherein the genomic modification and/or the extrachromosomal modification that overexpresses the secreted recombinant protein comprises an inducible promoter.

    [0238] Embodiment 38: The engineered eukaryotic cell of Embodiment 37, wherein the inducible promoter is an AOX1, DAK2, PEX11, FLD1, FGH1, DAS1, DAS2, CAT1, MDH3, HAC1, BiP, RAD30, RVS161-2, MPP10, THP3, TLR, GBP2, PMP20, SHB17, PEX8, PEX4, or TKL3 promoter.

    [0239] Embodiment 39: The engineered eukaryotic cell of any one of Embodiments 33 to 38, wherein the genomic modification and/or the extrachromosomal modification that overexpresses a secreted recombinant protein comprises an AOX1, TDH3, MOX, RPS25A, or RPL2A terminator.

    [0240] Embodiment 40: The engineered eukaryotic cell of any one of Embodiments 33 to 39, wherein the genomic modification and/or the extrachromosomal modification that overexpresses a secreted recombinant protein encodes a signal peptide and/or a secretory signal.

    [0241] Embodiment 41: The engineered eukaryotic cell of any one of Embodiments 33 to 40, wherein the genomic modification and/or the extrachromosomal modification that overexpresses a secreted recombinant protein comprises codons that are optimized for the species of the engineered eukaryotic cell.

    [0242] Embodiment 42: The engineered eukaryotic cell of any one of Embodiments 33 to 41, wherein the secreted recombinant protein is designed to be secreted from the cell and/or is capable of being secreted from the cell.

    INCORPORATION BY REFERENCE

    [0243] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference.

    [0244] Additional aspects and advantages of the present disclosure will become readily apparent to those skilled in this art from the following detailed description, wherein only illustrative embodiments of the present disclosure are shown and described. As will be realized, the present disclosure is capable of other and different embodiments, and its several details are capable of modifications in various obvious respects, all without departing from the disclosure. Accordingly, the drawings and description are to be regarded as illustrative in nature, and not as restrictive.

    EXAMPLES

    [0245] The following examples are given for the purpose of illustrating various embodiments of the invention and are not meant to limit the present invention in any fashion. The present examples, along with the methods described herein are presently representative of preferred embodiments, are exemplary, and are not intended as limitations on the scope of the invention. Changes therein and other uses which are encompassed within the spirit of the invention as defined by the scope of the claims will occur to those skilled in the art.

    Example 1: Growth of P. pastoris on Carbon Sources Prior to Engineering

    [0246] A background strain (strain 1) was used as a test strain. The genetic modifications present in strain 1 are deletions of AOX1 and AOX2. No target protein cassettes were present in this strain. Strain 1 was plated on minimal nutrient plates containing glucose, fructose, or sucrose.

    [0247] As shown in FIG. 1, the strain grew on glucose and fructose at similar rates and produced colonies of similar size. On sucrose, the strain grew to pinprick-sized colonies and then stopped growing. Without wishing to be bound by theory, it appears that the sucrose source may naturally contain a small amount of hydrolyzed material, i.e., free glucose and fructose molecules, which would account for this limited growth.

    Example 2: Expression Constructs, Transformation, and Processing

    [0248] A surface displayed invertase (suc2) from Saccharomyces cerevisiae was transformed into a high performing strain (strain 2; parent strain) previously transformed to express recombinant ovalbumin (rOVA). Strain 3 and strain 4 are each considered a high-performing strain. Expression of the fusion protein was driven by PGCW14, a highly expressed constitutive promoter. The DNA sequence for the expression cassette and the amino acid sequence for the fusion protein are disclosed herein as SEQ ID NO: 314 and SEQ ID NO: 315, respectively. The DNA sequence encoded a secretion signal between the promoter and the SUC2 sequence, thereby permitting the invertase to be displayed on the outer surface of the cell.

    [0249] In high throughput screening, those transformants which successfully expressed rOVA protein when fed sucrose, i.e., those transformants that expressed rOVA and the surface displayed invertase, were able to achieve a 50% or more increase in productivity when compared to the same strains when fed glucose alone. Candidate strains were picked into sucrose-containing media and grown for 24 hours. The starter cultures were divided equally and inoculated into either sucrose-containing media or glucose-containing media for high throughput screening. Data from eight high performing candidate strains, showing growth and productivity comparisons when fed different carbon sources, are shown below in Table 11. The parent strain (strain 2) is unable to grow and express recombinant protein when fed sucrose; therefore, all strain 2 comparisons below are made relative to its performance in glucose.

    TABLE 11

                                                            Supernatant    Supernatant
                              Supernatant                   protein        protein
                              protein                       concentration  concentration  Productivity   Productivity
                              concentration  Productivity   in sucrose vs  in glucose vs  in sucrose vs  in glucose vs
            OD* in   OD in    in sucrose vs  in sucrose vs  strain 2 in    strain 2 in    strain 2 in    strain 2 in
    Strain  sucrose  glucose  glucose        glucose        glucose        glucose        glucose        glucose
    1       16.76    14.02    0.81           0.68           1.09           1.34           0.77           1.13
    2       17.16    14.2     0.92           0.76           1.04           1.13           0.71           0.93
    3       15.8     13.37    0.79           0.67           0.99           1.25           0.74           1.10
    4       16.41    14.29    1.15           1.00           0.98           0.85           0.71           0.70
    5       19.29    17.66    1.15           1.05           0.87           0.76           0.53           0.50
    6       16.66    14.59    0.76           0.66           0.87           1.14           0.61           0.92
    7       17.04    13.67    0.67           0.54           0.75           1.12           0.52           0.96
    8       16.14    14.45    0.61           0.55           0.68           1.11           0.49           0.90

    [0250] In Table 11, above, optical density (OD) is an indirect measure of cell density in culture, thus reflecting cell growth. For reference, strain 2 achieved ODs of 1.14 in sucrose (practically no growth) and 11.76 in glucose. The columns of Table 11 reciting "vs strain 2" show a relative comparison of protein production of a candidate strain using sucrose or glucose as a food source compared to strain 2 using glucose as a food source. Numbers shown in columns 3-8 are relative ratios of protein production. The ratios shown in Table 11 are described below:

    [0251] The column entitled "Supernatant protein concentration in sucrose vs glucose" in Table 11 shows ratios of the concentration of recombinantly-expressed protein measured in the culture supernatant when comparing sucrose-fed cultures to glucose-fed cultures.

    [0252] The column entitled "Productivity in sucrose vs glucose" in Table 11 shows ratios of per-cell productivity comparing sucrose-fed cultures to glucose-fed cultures. Productivity was measured as the protein concentration in the supernatant divided by OD; dividing by the culture's OD gives a per-cell protein productivity.

    [0253] The column entitled "Supernatant protein concentration in sucrose vs strain 2 in glucose" in Table 11 shows ratios of protein concentration measured in the culture supernatant when comparing sucrose-fed cultures of each candidate strain to glucose-fed cultures of the parent strain (strain 2).

    [0254] The column entitled "Supernatant protein concentration in glucose vs strain 2 in glucose" in Table 11 shows ratios of protein concentration measured in the culture supernatant when comparing glucose-fed cultures of each candidate strain to glucose-fed cultures of the parent strain (strain 2).

    [0255] The column entitled "Productivity in sucrose vs strain 2 in glucose" in Table 11 shows ratios of per-cell productivity comparing sucrose-fed cultures of each candidate strain to glucose-fed cultures of the parent strain (strain 2).

    [0256] The column entitled "Productivity in glucose vs strain 2 in glucose" in Table 11 shows ratios of per-cell productivity comparing glucose-fed cultures of each candidate strain to glucose-fed cultures of the parent strain (strain 2).
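
    By way of illustration only, the relationship between the concentration ratios, the OD values, and the productivity ratios of Table 11 can be sketched in Python as follows. The function names are hypothetical and are not part of the disclosure; the numeric inputs are the values reported for candidate strain 1 in Table 11 and the strain 2 glucose OD of 11.76. Because the tabulated ratios are themselves rounded, the recomputed values are expected to agree with the table only to within about 0.01.

```python
# Illustrative sketch only. Function names are hypothetical and not taken
# from the disclosure; inputs are values reported in Table 11.


def productivity(protein_conc: float, od: float) -> float:
    """Per-cell productivity: supernatant protein concentration divided by OD."""
    if od <= 0:
        raise ValueError("OD must be positive")
    return protein_conc / od


def productivity_ratio(conc_ratio: float, od_test: float, od_ref: float) -> float:
    """Ratio of per-cell productivities of a test culture to a reference culture.

    (c_test / od_test) / (c_ref / od_ref) = (c_test / c_ref) * (od_ref / od_test),
    so only the concentration ratio and the two ODs are needed.
    """
    if od_test <= 0 or od_ref <= 0:
        raise ValueError("OD values must be positive")
    return conc_ratio * od_ref / od_test


# Candidate strain 1 (Table 11) and parent strain 2 in glucose.
od_sucrose = 16.76
od_glucose = 14.02
od_parent_glucose = 11.76

sucrose_vs_glucose = productivity_ratio(0.81, od_sucrose, od_glucose)
sucrose_vs_parent = productivity_ratio(1.09, od_sucrose, od_parent_glucose)
glucose_vs_parent = productivity_ratio(1.34, od_glucose, od_parent_glucose)

print(f"Productivity in sucrose vs glucose: {sucrose_vs_glucose:.3f}")
print(f"Productivity in sucrose vs strain 2 in glucose: {sucrose_vs_parent:.3f}")
print(f"Productivity in glucose vs strain 2 in glucose: {glucose_vs_parent:.3f}")
```

    The same calculation can be applied to any other row of Table 11 by substituting that row's OD values and concentration ratios.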

    [0257] All candidate strains produced more cell mass when fed sucrose than when fed glucose. When the protein concentration and productivity of the candidate strains fed sucrose are compared to those of the strain 2 parent fed glucose, candidate strains 1 to 4 each performed well, with supernatant protein concentrations similar to the parent and productivities of about 71% to 77% of the parent. The data herein show that candidate strains fed sucrose were as efficient at making protein as the strain 2 parent strain fed glucose.

    [0258] FIG. 4 illustrates the comparison of growth on glucose (G) (shown as _D in FIG. 4) vs sucrose (S) (shown as _S in FIG. 4) of various background strains and the candidate strains which were engineered to display invertase. Strain 2, strain 1, and strain 11 are background strains which express rOVA, strain 12 is a wild-type P. pastoris strain, and strain 3 and strain 4 were engineered to express the Suc2 construct (strain 2+Suc2-Tir4, i.e., the surface displayed invertase fusion protein). Although each strain achieved OD600 values of 10 or higher when grown in glucose-containing media, only the strains which were engineered to express the surface displayed invertase fusion protein achieved such levels when sucrose was the main carbon source in the media. All other media components were the same; the final concentration of sugar (either sucrose or glucose) in the media was 0.5%. OD600 measures the turbidity of a culture, which is related to the number of cells present in the culture and is an indicator of cell proliferation/cell growth.

    Example 3: Growth of Engineered P. pastoris Using Sucrose as a Carbon Source

    [0259] A surface displayed invertase (suc2) from Saccharomyces cerevisiae was transformed into a P2 strain (strain 5) which was previously transformed to express recombinant ovalbumin (rOVA). Performance of the suc2-expressing strain, referred to herein as strain 6, was evaluated in a 250 mL bioreactor. Strain 6 produced rOVA at a similar titer and quality as strain 5 when fed either glucose or sucrose, as measured qualitatively by SDS-PAGE (FIG. 5) and quantitatively by HPLC (Table 12). Strain 6 and the control strain 5 (which expressed rOVA but did not express suc2) were run in bioreactors in parallel and underwent similar fermentation processes. The choice of glucose or sucrose as the carbon source in the culture media was the only variable. Strain 6 was further evaluated in a 50:50 glucose:fructose feed (not shown). The strain performed similarly in the 50:50 feed compared to the sucrose feed, suggesting that its metabolism when fed sucrose is not rate limited by the sucrose hydrolysis step carried out by SUC2.

    [0260] In FIG. 5 and Table 12: 194 and 195 are data for the parent strain (strain 5) grown on glucose; 196 and 197 are data for the surface displayed suc2-expressing strain (strain 6) grown on glucose; and 198 and 199 are data for the suc2-expressing strain 6 grown on sucrose. P2.1-P2.3 are data for the standard strain 5 sample loaded as a reference. P2.1-P2.3 are a protein standard (not generated by strain 5) of known concentration loaded for reference. The standard sample was generated using an in-house strain expressing P2, and the protein was column purified for use as an internal protein standard.

    [0261] The performance measured by HPLC (Table 12) represents the broth titer of fermentation normalized to the average of the control (strain 5 that lacks suc2, fed glucose as the carbon source, run on Bay 194 and Bay 195).

    TABLE 12

    Sample   Strain            Carbon   Performance* normalized
                               source   average of control
    Bay 194  strain 5 control  Glucose  1.03
    Bay 195  strain 5 control  Glucose  0.97
    Bay 196  strain 6          Glucose  1.02
    Bay 197  strain 6          Glucose  1.01
    Bay 198  strain 6          Sucrose  0.99
    Bay 199  strain 6          Sucrose  0.99

    *Broth titer of fermentation
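
    By way of illustration only, the normalized values of Table 12 can be summarized per condition with a short Python sketch. The names TABLE_12 and mean_by_condition are hypothetical and are not part of the disclosure; the values are those reported in Table 12. Because the performance values are normalized to the average of the two control runs, the two control bays are expected to average to 1.00.

```python
# Illustrative sketch only. TABLE_12 and mean_by_condition are hypothetical
# names; the numeric values are those reported in Table 12.
from collections import defaultdict
from statistics import mean

# (sample, strain, carbon source, performance normalized to control average)
TABLE_12 = [
    ("Bay 194", "strain 5 control", "Glucose", 1.03),
    ("Bay 195", "strain 5 control", "Glucose", 0.97),
    ("Bay 196", "strain 6", "Glucose", 1.02),
    ("Bay 197", "strain 6", "Glucose", 1.01),
    ("Bay 198", "strain 6", "Sucrose", 0.99),
    ("Bay 199", "strain 6", "Sucrose", 0.99),
]


def mean_by_condition(rows):
    """Average the normalized performance over replicate bays per condition."""
    groups = defaultdict(list)
    for _sample, strain, carbon, performance in rows:
        groups[(strain, carbon)].append(performance)
    return {condition: mean(values) for condition, values in groups.items()}


means = mean_by_condition(TABLE_12)
for (strain, carbon), value in means.items():
    print(f"{strain} on {carbon}: {value:.3f}")
```

    With only two bays per condition, such averages are descriptive and do not by themselves support a statistical comparison.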

    [0262] To determine whether hydrolysis of sucrose into glucose and fructose by the surface displayed invertase fusion protein affects cell growth and/or recombinant protein expression, strain 6 was fed a medium comprising equal parts of glucose and fructose and compared to strain 6 fed a medium comprising an equivalent amount of sucrose. Strain 6 performed similarly under the two conditions, as shown in Table 12, suggesting that the extra step of hydrolyzing sucrose is not rate limiting to the cell growth and protein expression processes.