Alpha (1,2) fucosyltransferase syngenes for use in the production of fucosylated oligosaccharides
11643675 · 2023-05-09
Assignee
Inventors
- John M. McCoy (Reading, MA)
- Matthew Ian Heidtman (Brighton, MA, US)
- Massimo Merighi (Somerville, MA, US)
Cpc classification
C12P19/04
CHEMISTRY; METALLURGY
C12Y204/01149
CHEMISTRY; METALLURGY
C12P19/18
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
C07H13/04
CHEMISTRY; METALLURGY
C12P19/00
CHEMISTRY; METALLURGY
C12Y204/01069
CHEMISTRY; METALLURGY
Y02A50/30
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
C12Y204/01086
CHEMISTRY; METALLURGY
International classification
C12P19/18
CHEMISTRY; METALLURGY
C12P19/00
CHEMISTRY; METALLURGY
C07H13/04
CHEMISTRY; METALLURGY
C12N15/70
CHEMISTRY; METALLURGY
C12P19/04
CHEMISTRY; METALLURGY
Abstract
The invention provides compositions and methods for engineering E. coli or other host production bacterial strains to produce fucosylated oligosaccharides, and the use thereof in the prevention or treatment of infection.
Claims
1. A method for producing a fucosylated oligosaccharide in a bacterium comprising providing a bacterium comprising an exogenous lactose-utilizing α(1,2) fucosyltransferase enzyme, wherein said α(1,2) fucosyltransferase enzyme has at least 90% sequence identity to amino acid sequence SEQ ID NO: 17; and culturing said bacterium in the presence of lactose.
2. The method of claim 1, wherein said α(1,2) fucosyltransferase enzyme comprises Prevotella sp. FutW, or a functional variant or fragment thereof having at least 90% sequence identity to SEQ ID NO: 17.
3. The method of claim 1, further comprising retrieving the fucosylated oligosaccharide from said bacterium or from a culture supernatant of said bacterium.
4. The method of claim 1, wherein said fucosylated oligosaccharide comprises 2′-fucosyllactose (2′-FL), lactodifucotetraose (LDFT), or lacto-N-difucohexaose I (LDFH I).
5. The method of claim 1, wherein the bacterium further comprises an exogenous lactose-utilizing α(1,3) fucosyltransferase enzyme and/or an exogenous lactose-utilizing α(1,4) fucosyltransferase enzyme, or wherein said bacterium further comprises a reduced level of β-galactosidase activity, a defective colanic acid synthesis pathway, an inactivated adenosine-5′-triphosphate (ATP)-dependent intracellular protease, or an inactivated endogenous lacA gene, or any combination thereof.
6. The method of claim 5, wherein the exogenous lactose-utilizing α(1,3) fucosyltransferase enzyme comprises a Helicobacter pylori 26695 futA gene.
7. The method of claim 5, wherein the exogenous lactose-utilizing α(1,4) fucosyltransferase enzyme comprises a Helicobacter pylori UA948 FucTa gene or a Helicobacter pylori strain DMS6709 FucT III gene.
8. The method of claim 5, wherein said method further comprises culturing said bacterium in the presence of tryptophan and in the absence of thymidine.
9. The method of claim 5, wherein said reduced level of β-galactosidase activity comprises a deleted or inactivated endogenous lacZ gene and/or a deleted or inactivated endogenous lacI gene of said bacterium.
10. The method of claim 9, wherein said reduced level of β-galactosidase activity further comprises an exogenous lacZ gene or variant thereof, wherein said exogenous lacZ gene or variant thereof comprises a β-galactosidase activity level less than a corresponding wild-type bacterium.
11. The method of claim 5, wherein said reduced level of β-galactosidase activity comprises an activity level less than wild-type bacterium.
12. The method of claim 11, wherein said reduced level of β-galactosidase activity comprises less than 6,000 units of β-galactosidase activity.
13. The method of claim 11, wherein said reduced level of β-galactosidase activity comprises less than 1,000 units of β-galactosidase activity.
14. The method of claim 5, wherein said bacterium comprises a lacIq gene promoter immediately upstream of a lacY gene, or wherein said bacterium further comprises a functional lactose permease gene, or wherein said bacterium comprises E. coli lacY, or wherein said bacterium further comprises an exogenous E. coli rcsA or E. coli rcsB gene, or wherein said bacterium further comprises a mutation in a thyA gene, or wherein said bacterium accumulates intracellular lactose in the presence of exogenous lactose, or wherein said bacterium accumulates intracellular GDP-fucose.
15. The method of claim 5, wherein said defective colanic acid synthesis pathway comprises an inactivation of a wcaJ gene of said bacterium.
16. The method of claim 5, wherein said inactivated ATP-dependent intracellular protease is a null mutation, inactivating mutation, or deletion of an endogenous lon gene.
17. The method of claim 16, wherein said inactivating mutation of an endogenous lon gene comprises the insertion of a functional E. coli lacZ⁺ gene.
18. The method of claim 1, wherein said bacterium is E. coli.
19. The method of claim 1, wherein said bacterium of claim 1 is a member of the Bacillus, Pantoea, Lactobacillus, Lactococcus, Streptococcus, Propionibacterium, Enterococcus, Bifidobacterium, Sporolactobacillus, Micromonospora, Micrococcus, Rhodococcus, or Pseudomonas genus.
20. The method of claim 1, wherein said bacterium of claim 1 is selected from the group consisting of Bacillus licheniformis, Bacillus subtilis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, Bacillus circulans, Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, Xanthomonas campestris, Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, Lactococcus lactis, Streptococcus thermophilus, Propionibacterium freudenreichii, Enterococcus faecium, Enterococcus thermophilus, Bifidobacterium longum, Bifidobacterium infantis, Bifidobacterium bifidum, Pseudomonas fluorescens and Pseudomonas aeruginosa.
21. The method of claim 1, wherein said bacterium comprises a nucleic acid construct comprising an isolated nucleic acid encoding said α(1,2) fucosyltransferase enzyme.
22. The method of claim 21, wherein said nucleic acid is operably linked to one or more heterologous control sequences that direct the production of the enzyme in the bacterium.
23. The method of claim 22, wherein said heterologous control sequence comprises a bacterial promoter and operator, a bacterial ribosome binding site, a bacterial transcriptional terminator, or a plasmid selectable marker.
24. The method of claim 1, wherein the amino acid sequence of said enzyme comprises the amino acid sequence of SEQ ID NO:17.
Description
DETAILED DESCRIPTION OF THE INVENTION
(15) While some studies suggest that human milk glycans could be used as antimicrobial anti-adhesion agents, the difficulty and expense of producing adequate quantities of these agents of a quality suitable for human consumption have limited their full-scale testing and perceived utility. What has been needed is a suitable method for producing the appropriate glycans in sufficient quantities at reasonable cost. Prior to the invention described herein, there were attempts to use several distinct synthetic approaches for glycan synthesis. Some chemical approaches can synthesize oligosaccharides (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003)), but reactants for these methods are expensive and potentially toxic (Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). Enzymes expressed from engineered organisms (Albermann, C., Piepersberg, W. & Wehmeier, U. F. Carbohydr Res 334, 97-103 (2001); Bettler, E., Samain, E., Chazalet, V., Bosso, C., et al. Glycoconj J 16, 205-212 (1999); Johnson, K. F. Glycoconj J 16, 141-146 (1999); Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Wymer, N. & Toone, E. J. Curr Opin Chem Biol 4, 110-119 (2000)) provide a precise and efficient synthesis (Palcic, M. M. Curr Opin Biotechnol 10, 616-624 (1999); Crout, D. H. & Vic, G. Curr Opin Chem Biol 2, 98-111 (1998)), but the high cost of the reactants, especially the sugar nucleotides, limits their utility for low-cost, large-scale production. Microbes have been genetically engineered to express the glycosyltransferases needed to synthesize oligosaccharides from the bacteria's innate pool of nucleotide sugars (Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 330, 439-443 (2001); Endo, T., Koizumi, S., Tabata, K. & Ozaki, A. Appl Microbiol Biotechnol 53, 257-261 (2000); Endo, T. & Koizumi, S. Curr Opin Struct Biol 10, 536-541 (2000); Endo, T., Koizumi, S., Tabata, K., Kakita, S. & Ozaki, A. Carbohydr Res 316, 179-183 (1999); Koizumi, S., Endo, T., Tabata, K. & Ozaki, A. Nat Biotechnol 16, 847-850 (1998)). However, prior to the invention described herein, there was a growing need to identify and characterize additional glycosyltransferases that are useful for the synthesis of HMOS in metabolically engineered bacterial hosts.
(16) Human Milk Glycans
(17) Human milk contains a diverse and abundant set of neutral and acidic oligosaccharides (Kunz, C., Rudloff, S., Baier, W., Klein, N., and Strobel, S. (2000). Annu Rev Nutr 20, 699-722; Bode, L. (2006). J Nutr 136, 2127-130). More than 130 different complex oligosaccharides have been identified in human milk, and their structural diversity and abundance are unique to humans. Although these molecules may not be utilized directly by infants for nutrition, they nevertheless serve critical roles in the establishment of a healthy gut microbiome (Marcobal, A., Barboza, M., Froehlich, J. W., Block, D. E., et al. J Agric Food Chem 58, 5334-5340 (2010)), in the prevention of disease (Newburg, D. S., Ruiz-Palacios, G. M. & Morrow, A. L. Annu Rev Nutr 25, 37-58 (2005)), and in immune function (Newburg, D. S. & Walker, W. A. Pediatr Res 61, 2-8 (2007)). Despite millions of years of exposure to human milk oligosaccharides (HMOS), pathogens have yet to develop ways to circumvent the ability of HMOS to prevent adhesion to target cells and to inhibit infection. The ability to utilize HMOS as pathogen adherence inhibitors promises to address the current crisis of burgeoning antibiotic resistance. Human milk oligosaccharides produced by biosynthesis represent the lead compounds of a novel class of therapeutics against some of the most intractable scourges of society.
(18) One alternative strategy for efficient, industrial-scale synthesis of HMOS is the metabolic engineering of bacteria. This approach involves the construction of microbial strains overexpressing heterologous glycosyltransferases, membrane transporters for the import of precursor sugars into the bacterial cytosol, and possessing enhanced pools of regenerating nucleotide sugars for use as biosynthetic precursors (Dumon, C., Samain, E., and Priem, B. (2004). Biotechnol Prog 20, 412-19; Ruffing, A., and Chen, R. R. (2006). Microb Cell Fact 5, 25). A key aspect of this approach is the heterologous glycosyltransferase selected for overexpression in the microbial host. The choice of glycosyltransferase can significantly affect the final yield of the desired synthesized oligosaccharide, given that enzymes can vary greatly in terms of kinetics, substrate specificity, affinity for donor and acceptor molecules, stability and solubility. A few glycosyltransferases derived from different bacterial species have been identified and characterized in terms of their ability to catalyze the biosynthesis of HMOS in E. coli host strains (Dumon, C., Bosso, C., Utille, J. P., Heyraud, A., and Samain, E. (2006). Chembiochem 7, 359-365; Dumon, C., Samain, E., and Priem, B. (2004). Biotechnol Prog 20, 412-19; Li, M., Liu, X. W., Shao, J., Shen, J., Jia, Q., Yi, W., Song, J. K., Woodward, R., Chow, C. S., and Wang, P. G. (2008). Biochemistry 47, 378-387). The identification of additional glycosyltransferases with faster kinetics, greater affinity for nucleotide sugar donors and/or acceptor molecules, or greater stability within the bacterial host significantly improves the yields of therapeutically useful HMOS. Prior to the invention described herein, chemical syntheses of HMOS were possible, but were limited by stereo-specificity issues, precursor availability, product impurities, and high overall cost (Flowers, H. M. Methods Enzymol 50, 93-121 (1978); Seeberger, P. H. Chem Commun (Camb) 1115-1121 (2003); Koeller, K. M. & Wong, C. H. Chem Rev 100, 4465-4494 (2000)). The invention overcomes the shortcomings of these previous attempts by providing new strategies to inexpensively manufacture large quantities of human milk oligosaccharides (HMOS) for use as dietary supplements. Advantages include efficient expression of the enzyme, improved stability and/or solubility of the fucosylated oligosaccharide product (2′-FL, LDFT, LNF I, and LDFH I) and reduced toxicity to the host organism. The present invention features novel α(1,2) FTs suitable for expression in production strains for increased efficacy and yield of fucosylated HMOS compared to α(1,2) FTs currently utilized in the field.
(19) As described in detail below, E. coli (or other bacteria) is engineered to produce selected fucosylated oligosaccharides (e.g., 2′-FL, LDFT, LDFH I, or LNF I) at commercially viable levels. For example, yields are >5 grams/liter in a bacterial fermentation process. In other embodiments, the yields are greater than 10 grams/liter, greater than 15 grams/liter, greater than 20 grams/liter, greater than 25 grams/liter, greater than 30 grams/liter, greater than 35 grams/liter, greater than 40 grams/liter, greater than 45 grams/liter, greater than 50 grams/liter, greater than 55 grams/liter, greater than 60 grams/liter, greater than 65 grams/liter, greater than 70 grams/liter, or greater than 75 grams/liter of fucosylated oligosaccharide products, such as 2′-FL, LDFT, LDFH I, and LNF I.
(20) Role of Human Milk Glycans in Infectious Disease
(21) Human milk glycans, which comprise both unbound oligosaccharides and their glycoconjugates, play a significant role in the protection and development of the infant gastrointestinal (GI) tract. Neutral fucosylated oligosaccharides, including 2′-fucosyllactose (2′-FL), protect infants against several important pathogens. Milk oligosaccharides found in various mammals differ greatly, and the composition in humans is unique (Hamosh M., 2001 Pediatr Clin North Am, 48:69-86; Newburg D. S., 2001 Adv Exp Med Biol, 501:3-10). Moreover, glycan levels in human milk change throughout lactation and also vary widely among individuals (Morrow A. L. et al., 2004 J Pediatr, 145:297-303; Chaturvedi P et al., 2001 Glycobiology, 11:365-372). Approximately 200 distinct human milk oligosaccharides have been identified and combinations of simple epitopes are responsible for this diversity (Newburg D. S., 1999 Curr Med Chem, 6:117-127; Ninonuevo M. et al., 2006 J Agric Food Chem, 54:7471-7480).
(22) Human milk oligosaccharides are composed of 5 monosaccharides: D-glucose (Glc), D-galactose (Gal), N-acetylglucosamine (GlcNAc), L-fucose (Fuc), and sialic acid (N-acetyl neuraminic acid, Neu5Ac, NANA). Human milk oligosaccharides are usually divided into two groups according to their chemical structures: neutral compounds containing Glc, Gal, GlcNAc, and Fuc, linked to a lactose (Galβ1-4Glc) core, and acidic compounds including the same sugars, and often the same core structures, plus NANA (Charlwood J. et al., 1999 Anal Biochem, 273:261-277; Martin-Sosa et al., 2003 J Dairy Sci, 86:52-59; Parkkinen J. and Finne J., 1987 Methods Enzymol, 138:289-300; Shen Z. et al., 2001 J Chromatogr A, 921:315-321).
(23) Approximately 70-80% of oligosaccharides in human milk are fucosylated, and their synthetic pathways are believed to proceed as shown in
(24) Human Milk Glycans Inhibit Binding of Enteropathogens to their Receptors
(25) Human milk glycans have structural homology to cell receptors for enteropathogens and function as receptor decoys. For example, pathogenic strains of Campylobacter bind specifically to glycans containing H-2, i.e., 2′-fucosyl-N-acetyllactosamine or 2′-fucosyllactose (2′-FL); Campylobacter binding and infectivity are inhibited by 2′-FL and other glycans containing this H-2 epitope. Similarly, some diarrheagenic E. coli pathogens are strongly inhibited in vivo by human milk oligosaccharides containing 2-linked fucose moieties. Several major strains of human caliciviruses, especially the noroviruses, also bind to 2-linked fucosylated glycans, and this binding is inhibited by human milk 2-linked fucosylated glycans. Consumption of human milk that has high levels of these 2-linked fucosyloligosaccharides was associated with lower risk of norovirus, Campylobacter, ST of E. coli-associated diarrhea, and moderate-to-severe diarrhea of all causes in a Mexican cohort of breastfeeding children (Newburg D. S. et al., 2004 Glycobiology, 14:253-263; Newburg D. S. et al., 1998 Lancet, 351:1160-1164). Several pathogens utilize sialylated glycans as their host receptors, such as influenza (Couceiro, J. N., Paulson, J. C. & Baum, L. G. Virus Res 29, 155-165 (1993)), parainfluenza (Amonsen, M., Smith, D. F., Cummings, R. D. & Air, G. M. J Virol 81, 8341-8345 (2007)), and rotaviruses (Kuhlenschmidt, T. B., Hanafin, W. P., Gelberg, H. B. & Kuhlenschmidt, M. S. Adv Exp Med Biol 473, 309-317 (1999)). The sialyl-Lewis X epitope is used by Helicobacter pylori (Mahdavi, J., Sondén, B., Hurtig, M., Olfat, F. O., et al. Science 297, 573-578 (2002)), Pseudomonas aeruginosa (Scharfman, A., Delmotte, P., Beau, J., Lamblin, G., et al. Glycoconj J 17, 735-740 (2000)), and some strains of noroviruses (Rydell, G. E., Nilsson, J., Rodriguez-Diaz, J., Ruvoen-Clouet, N., et al. Glycobiology 19, 309-320 (2009)).
(26) Identification of Novel α(1,2) Fucosyltransferases
(27) The present invention provides novel α(1,2) fucosyltransferase enzymes. The present invention also provides nucleic acid constructs (e.g., a plasmid or vector) carrying the nucleic acid sequence of a novel α(1,2) fucosyltransferase for the expression of the novel α(1,2) fucosyltransferase in a host bacterium. The present invention also provides methods for producing fucosylated oligosaccharides by expressing the novel α(1,2) fucosyltransferases in a suitable host production bacterium, as further described herein.
(28) Not all α(1,2)fucosyltransferases can utilize lactose as an acceptor substrate. An acceptor substrate includes, for example, a carbohydrate, an oligosaccharide, a protein or glycoprotein, a lipid or glycolipid, e.g., N-acetylglucosamine, N-acetyllactosamine, galactose, fucose, sialic acid, glucose, lactose, or any combination thereof. A preferred alpha (1,2) fucosyltransferase of the present invention utilizes GDP-fucose as a donor, and lactose is the acceptor for that donor.
(29) A method of identifying novel α(1,2)fucosyltransferase enzymes capable of utilizing lactose as an acceptor was previously carried out (as described in PCT/US2013/051777, hereby incorporated by reference in its entirety) using the following steps: 1) performing a computational search of sequence databases to define a broad group of simple sequence homologs of any known, lactose-utilizing α(1,2)fucosyltransferase; 2) using the list of homologs from step 1 to derive a search profile containing common sequence and/or structural motifs shared by the members of the broad group, e.g., by using computer programs such as MEME (Multiple Em for Motif Elicitation, available at http://meme.sdsc.edu/meme/cgi-bin/meme.cgi) or PSI-BLAST (Position-Specific Iterated BLAST, available at ncbi.nlm.nih.gov/blast with additional information at cnx.org/content/m11040/latest/); 3) searching sequence databases (e.g., using computer programs such as PSI-BLAST or MAST (Motif Alignment Search Tool, available at http://meme.sdsc.edu/meme/cgi-bin/mast.cgi)) using this derived search profile as query, and identifying “candidate sequences” whose simple sequence homology to the original lactose-accepting α(1,2)fucosyltransferase is 40% or less; 4) scanning the scientific literature and developing a list of “candidate organisms” known to express α(1,2)fucosyl-glycans; 5) selecting only those “candidate sequences” that are derived from “candidate organisms” to generate a list of “candidate lactose-utilizing enzymes”; and 6) expressing each “candidate lactose-utilizing enzyme” and testing for lactose-utilizing α(1,2)fucosyltransferase activity.
(30) The MEME suite of sequence analysis tools (meme.sdsc.edu/meme/cgi-bin/meme.cgi) can also be used as an alternative to PSI-BLAST. Sequence motifs are discovered using the program “MEME”. These motifs can then be used to search sequence databases using the program “MAST”. The BLAST and PSI-BLAST search algorithms are other well-known alternatives.
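The selection logic of steps 3 and 5 above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Candidate` record, its field names, the function `select_candidates`, and the toy hits and organism names are all hypothetical and do not come from the cited application; only the 40% identity ceiling and the "candidate organism" requirement are taken from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One database hit (hypothetical record layout)."""
    accession: str            # placeholder identifier, not a real accession
    organism: str             # source organism of the hit
    identity_to_query: float  # percent identity to the known lactose-utilizing enzyme


def select_candidates(hits, candidate_organisms, max_identity=40.0):
    """Keep distant homologs (step 3) that come from listed organisms (step 5)."""
    return [
        h for h in hits
        if h.identity_to_query <= max_identity
        and h.organism in candidate_organisms
    ]


# Toy data for illustration only.
hits = [
    Candidate("hit_A", "Organism one", 27.5),
    Candidate("hit_B", "Organism two", 62.0),    # too similar to the query enzyme
    Candidate("hit_C", "Organism three", 31.0),  # organism not on the literature list
]
organisms = {"Organism one", "Organism two"}
selected = select_candidates(hits, organisms)
print([c.accession for c in selected])
```

The surviving entries correspond to the "candidate lactose-utilizing enzymes" that would then be expressed and assayed in step 6; that experimental step is not modeled here.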
(31) To identify additional novel α(1,2)fucosyltransferases, a multiple sequence alignment query was generated using four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences: H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). The sequence alignment and percentage of sequence identity are shown in
(32) This PSI-BLAST search resulted in an initial 2515 hits. There were 787 hits with greater than 22% sequence identity to FutC. 396 hits were of greater than 275 amino acids in length. Additional analysis of the hits was performed, including sorting by percentage identity to FutC, comparing the sequences by BLAST to the existing inventory of known α(1,2) fucosyltransferases, and manual annotation of hit sequences to identify those originating from bacteria that naturally exist in the gastrointestinal tract. An annotated list of the novel α(1,2) fucosyltransferases identified by this screen is provided in Table 1. Table 1 provides the bacterial species from which the candidate enzyme is derived, the GenBank Accession Number, GI Identification Number, amino acid sequence, and % sequence identity to FutC.
(33) Of the identified hits, 12 novel α(1,2) fucosyltransferases were further analyzed for their functional capacity: Prevotella melaninogenica FutO, Clostridium bolteae FutP, Clostridium bolteae+13 FutP, Lachnospiraceae sp. FutQ, Methanosphaerula palustris FutR, Tannerella sp. FutS, Bacteroides caccae FutU, Butyrivibrio FutV, Prevotella sp. FutW, Parabacteroides johnsonii FutX, Akkermansia muciniphila FutY, Salmonella enterica FutZ, and Bacteroides sp. FutZA. For Clostridium bolteae FutP, the annotation named the wrong initiation methionine codon. Thus, the present invention includes FutP with an additional 13 amino acids at the N-terminus of the annotated FutP (derived in-frame from the natural upstream DNA sequence), which is designated herein as Clostridium bolteae+13 FutP. The sequence identity between the 12 novel α(1,2) fucosyltransferases identified and the 4 previously identified α(1,2) fucosyltransferases is shown in Table 2 below.
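The triage described in paragraph (32) can be pictured as a short filter-and-sort routine. The sketch below is a hedged illustration, not the procedure actually used: the `Hit` record, the `triage` function, and the toy hits are hypothetical, it assumes the identity and length cut-offs are applied together, and it reduces the BLAST comparison against the known-enzyme inventory to a simple name lookup. Only the two thresholds (greater than 22% identity to FutC, greater than 275 amino acids) are taken from the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One PSI-BLAST hit (hypothetical record layout)."""
    name: str                # placeholder label
    identity_to_futc: float  # percent identity to FutC
    length_aa: int           # protein length in amino acids


def triage(hits, inventory, min_identity=22.0, min_length=275):
    """Apply both cut-offs, drop hits already in the inventory, sort by identity."""
    kept = [
        h for h in hits
        if h.identity_to_futc > min_identity
        and h.length_aa > min_length
        and h.name not in inventory
    ]
    return sorted(kept, key=lambda h: h.identity_to_futc, reverse=True)


# Toy hits for illustration only.
toy_hits = [
    Hit("toy_1", 24.1, 290),
    Hit("toy_2", 21.0, 310),  # fails the identity cut-off
    Hit("toy_3", 30.5, 180),  # fails the length cut-off
    Hit("toy_4", 28.7, 301),
    Hit("toy_5", 70.1, 300),  # treated here as an already-known enzyme
]
ranked = triage(toy_hits, inventory={"toy_5"})
print([h.name for h in ranked])
```

The manual annotation step (restricting to gastrointestinal bacteria) is a curation task and is deliberately left out of the sketch.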
(34) TABLE-US-00001 TABLE 2 Sequence Identity (percent identity; column numbers correspond to row numbers)
#       1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16  Enzyme (accession)
1       -  70.10  21.99  20.82  27.68  27.36  23.56  23.28  23.62  25.75  23.72  24.05  22.29  24.19  22.92  22.29  H. pylori futC
2   70.10      -  23.87  19.88  26.38  28.21  24.30  23.38  24.62  25.31  25.31  24.47  23.56  25.15  23.55  23.26  H. mustelae futL
3   21.99  23.87      -  25.16  32.05  28.71  28.94  25.79  37.46  32.27  26.11  61.27  71.63  27.67  25.15  84.75  Bacteroides vulgatus futN
4   20.82  19.88  25.16      -  24.25  22.73  22.32  26.04  25.45  24.77  21.49  23.29  26.71  24.63  21.45  25.16  E. coli O126 wbgL
5   27.68  26.38  32.05  24.25      -  36.96  31.63  35.74  35.16  55.74  30.28  30.03  32.80  30.09  26.28  31.83  Prevotella melaninogenica FutO YP_003814512.1
6   27.36  28.21  28.71  22.73  36.96      -  37.87  35.10  33.77  36.91  35.74  29.58  31.39  27.67  26.33  29.13  Clostridium bolteae + 13 FutP WP_002570768.1
7   23.56  24.30  28.94  22.32  31.63  37.87      -  29.87  29.17  32.90  51.02  28.53  30.00  27.69  24.00  27.74  Lachnospiraceae sp. FutQ WP_009251343.1
8   23.28  23.38  25.79  26.04  35.74  35.10  29.87      -  28.71  38.24  31.41  25.39  28.08  30.65  23.93  25.55  Methanosphaerula palustris FutR YP_002467213.1
9   23.62  24.62  37.46  25.45  35.16  33.77  29.17  28.71      -  34.41  30.03  35.71  36.27  26.48  21.75  36.60  Tannerella sp. FutS WP_021929367.1
10  25.75  25.31  32.27  24.77  55.74  36.91  32.90  38.24  34.41      -  31.21  29.94  33.33  29.28  24.46  33.01  Bacteroides caccae FutU WP_005675707.1
11  23.72  25.31  26.11  21.49  30.28  35.74  51.02  31.41  30.03  31.21      -  27.62  26.20  26.46  22.15  26.52  Butyrivibrio FutV WP_022772718.1
12  24.05  24.47  61.27  23.29  30.03  29.58  28.53  25.39  35.71  29.94  27.62      -  57.60  25.79  22.15  59.01  Prevotella sp. FutW WP_022481266.1
13  22.29  23.56  71.63  26.71  32.80  31.39  30.00  28.08  36.27  33.33  26.20  57.60      -  28.71  24.00  74.02  Parabacteroides johnsonii FutX WP_008155883.1
14  24.19  25.15  27.67  24.63  30.09  27.67  27.69  30.65  26.48  29.28  26.46  25.79  28.71      -  21.45  28.08  Akkermansia muciniphila FutY YP_001877555
15  22.92  23.55  25.15  21.45  26.28  26.33  24.00  23.93  21.75  24.46  22.15  22.15  24.00  21.45      -  24.62  Salmonella enterica FutZ WP_023214330
16  22.29  23.26  84.75  25.16  31.83  29.13  27.74  25.55  36.60  33.01  26.52  59.01  74.02  28.08  24.62      -  Bacteroides sp. FutZA WP_022161880.1
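One way to read Table 2 is to ask, for each newly identified enzyme, which of the four previously known enzymes it most resembles. The sketch below transcribes rows 5 to 16, columns 1 to 4 of Table 2 and picks the highest value per row. The short labels (for example "FutP+13") and the helper `closest_known` are introduced here for illustration and are not part of the source.

```python
# Column order matches Table 2, columns 1-4.
KNOWN = (
    "H. pylori futC",
    "H. mustelae futL",
    "Bacteroides vulgatus futN",
    "E. coli O126 wbgL",
)

# Percent identity of each new enzyme to the four known enzymes (from Table 2).
IDENTITY_TO_KNOWN = {
    "FutO": (27.68, 26.38, 32.05, 24.25),
    "FutP+13": (27.36, 28.21, 28.71, 22.73),
    "FutQ": (23.56, 24.30, 28.94, 22.32),
    "FutR": (23.28, 23.38, 25.79, 26.04),
    "FutS": (23.62, 24.62, 37.46, 25.45),
    "FutU": (25.75, 25.31, 32.27, 24.77),
    "FutV": (23.72, 25.31, 26.11, 21.49),
    "FutW": (24.05, 24.47, 61.27, 23.29),
    "FutX": (22.29, 23.56, 71.63, 26.71),
    "FutY": (24.19, 25.15, 27.67, 24.63),
    "FutZ": (22.92, 23.55, 25.15, 21.45),
    "FutZA": (22.29, 23.26, 84.75, 25.16),
}


def closest_known(enzyme):
    """Return (known enzyme name, percent identity) for the best-matching column."""
    scores = IDENTITY_TO_KNOWN[enzyme]
    best = max(range(len(KNOWN)), key=lambda i: scores[i])
    return KNOWN[best], scores[best]


for name in IDENTITY_TO_KNOWN:
    print(name, closest_known(name))
```

On these values, FutW is closest to Bacteroides vulgatus futN (61.27%), while its identity to H. pylori futC is only 24.05%, which is consistent with the screen targeting distant homologs of FutC.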
(35) Based on the amino acid sequences of the identified α(1,2) fucosyltransferases (i.e., those listed in Table 1), syngenes can be readily designed and constructed by the skilled artisan using standard methods known in the art. For example, the syngenes include a ribosomal binding site, are codon-optimized for expression in a host bacterial production strain (e.g., E. coli), and have common 6-cutter restriction sites or sites recognized by endogenous restriction enzymes present in the host strain (e.g., EcoK restriction sites) removed to ease cloning and expression in the E. coli host strain. In a preferred embodiment, the syngenes are constructed with the following configuration: EcoRI site-T7g10 RBS-α(1,2) FT syngene-XhoI site. The nucleic acid sequences of sample syngenes for the 12 identified α(1,2) fucosyltransferases are shown in Table 3 (the initiating methionine ATG codon is bolded).
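The EcoRI site-T7g10 RBS-syngene-XhoI site layout can be checked programmatically. The following is a minimal sketch under stated assumptions: the leader string `AAGAAGGAGATATACAT` is simply the run of bases found between the EcoRI site and the ATG in the Table 3 entries, the set of screened 6-cutter sites is illustrative (the text does not say which sites were removed), the function names are hypothetical, and the toy open reading frame is the first seven codons of the FutO syngene followed by a stop codon rather than a full gene.

```python
ECORI = "GAATTC"
XHOI = "CTCGAG"
RBS_LEADER = "AAGAAGGAGATATACAT"  # bases between EcoRI and ATG in the Table 3 entries
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Illustrative list only; the sites actually screened are not stated in the text.
SCREENED_SITES = {
    "EcoRI": ECORI,
    "XhoI": XHOI,
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def extract_orf(construct):
    """Return the ORF lying between the RBS leader and the last XhoI site."""
    start = construct.find(ECORI + RBS_LEADER + "ATG")
    if start < 0:
        raise ValueError("EcoRI site / RBS leader / ATG not found")
    orf_start = start + len(ECORI) + len(RBS_LEADER)
    end = construct.rfind(XHOI)
    if end < orf_start:
        raise ValueError("XhoI site not found downstream of the ORF")
    orf = construct[orf_start:end]
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("ORF is not a whole number of codons ending in a stop codon")
    return orf


def internal_sites(orf):
    """Names of screened restriction sites that still occur inside the ORF."""
    return sorted(name for name, site in SCREENED_SITES.items() if site in orf)


# Toy construct: flanking bases as in Table 3, with a shortened ORF.
toy_orf = "ATGAAAATCGTCAAAATCCTGTAG"
construct = "CAGTCAGTCA" + ECORI + RBS_LEADER + toy_orf + XHOI + "TGACTGACTG"
orf = extract_orf(construct)
print(orf == toy_orf, internal_sites(orf))
```

A check of this kind does not replace codon optimization; it only confirms that a designed sequence has the expected flanking elements and that none of the screened sites remain inside the coding region.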
(36) TABLE-US-00002 TABLE 3 Nucleic acid sequences of 12 novel α(1,2) fucosyltransferase syngenes Bacteria/ SEQ Gene ID name Sequence NO: FutO CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGAAAATCGTCAAAATCCTGGGCGGT 276 CTGGGCAATCAGATGTTCCAGTATGCTCTGTACCTGAGCCTGCAAGAAAGTTTTCCAAAA GAACGTGTGGCCCTGGACCTGTCCTCCTTCCACGGCTATCACCTGCATAATGGCTTTGAG CTGGAGAACATTTTCTCCGTTACCGCTCAGAAAGCATCCGCCGCAGATATCATGCGTATT GCTTATTACTACCCGAACTATCTGCTGTGGCGCATTGGCAAACGTTTTCTGCCGCGTCGT AAAGGTATGTGCCTGGAATCTAGCTCCCTGCGTTTCGATGAAAGCGTTCTGCGTCAGGAA GGTAACCGTTATTTTGACGGTTACTGGCAAGACGAACGCTACTTCGCAGCCTATCGTGAA AAAGTGCTGAAGGCTTTCACCTTTCCTGCATTCAAACGCGCAGAAAACCTGAGCCTGCTG GAAAAACTGGACGAAAACAGCATTGCTCTGCATGTTCGTCGCGGTGATTACGTAGGTAAT AACCTGTACCAAGGCATCTGTGACCTGGACTACTACCGTACCGCTATCGAGAAAATGTGT GCACACGTTACTCCGTCTCTGTTTTGTATCTTTTCCAACGACATCACGTGGTGCCAGCAG CACCTGCAACCGTACCTGAAGGCCCCTGTGGTGTACGTTACTTGGAACACCGGTGTTGAA TCCTACCGCGATATGCAGCTGATGTCCTGCTGCGCACATAACATCATCGCGAATAGCTCC TTCTCTTGGTGGGGTGCTTGGCTGAATCAGAACCGTGAAAAAGTTGTTATCGCCCCGAAA AAATGGCTGAACATGGAAGAATGTCACTTCACGCTGCCGGCAAGCTGGATCAAAATTTAG CTCGAGTGACTGACTG FutP CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTGATTATCAAAATGATGGGTGGT 277 CTGGGCAACCAGATGTTCCAGTACGCACTGTACAAAGCATTCGAGCAGAAGCACATCGAT GTGTATGCAGACCTGGCATGGTACAAAAACAAATCCGTGAAATTTGAACTGTACAACTTC GGCATTAAAATCAACGTAGCATCCGAGAAAGACATCAACCGTCTGAGCGATTGCCAGGCG GACTTTGTTTCCCGCATCCGCCGTAAAATCTTTGGTAAAAAAAAGAGCTTCGTATCTGAA AAAAATGACTCCTGCTATGAAAACGACATCCTGCGTATGGACAACGTTTATCTGAGCGGT TATTGGCAGACCGAAAAATACTTCTCTAACACGCGTGAGAAGCTGCTGGAGGATTATTCC TTCGCTCTGGTAAACTCTCAGGTGTCCGAATGGGAAGACTCCATTCGCAACAAAAACAGC GTTAGCATCCATATCCGTCGTGGTGATTATCTACAGGGCGAACTGTATGGTGGTATTTGC ACCTCTCTGTACTACGCCGAAGCAATCGAGTACATTAAAATGCGTGTTCCGAACGCAAAA TTCTTCGTTTTCTCTGATGACGTTGAATGGGTTAAACAGCAAGAAGACTTCAAAGGCTTC GTAATCGTTGATCGCAACGAGTATTCTAGCGCTCTGTCCGATATGTACCTGATGTCCCTG TGCAAGCATAACATTATTGCTAACTCCTCTTTCAGCTGGTGGGCAGCTTGGCTGAACCGT AACGAAGAAAAAATTGTAATCGCGCCGCGCCGTTGGCTGAACGGCAAGTGCACCCCAGAT ATCTGGTGTAAAAAATGGATTCGTATCTAGCTCGAGTGACTGACTG FutQ 
CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTGATCGTACAGCTGAGCGGCGGT 278 CTGGGCAACCAGATGTTCGAATACGCGCTGTACCTGAGCCTGAAAGCAAAAGGCAAAGAA GTGAAAATTGACGATGTTACGTGTTACGAGGGCCCTGGCACCCGTCCGCGTCAACTGGAT GTTTTTGGTATCACGTACGATCGCGCGTCTCGTGAGGAGCTGACTGAGATGACGGACGCG AGCATGGATGCGCTGTCTCGTGTTCGTCGCAAACTGACCGGTCGCCGCACTAAAGCGTAC CGCGAACGCGACATCAACTTCGATCCACTGGTTATGGAAAAAGACCCGGCACTGCTGGAA GGCTGTTTCCAGTCTGACAAATACTTTCGTGATTGCGAAGGCCGCGTGCGCGAAGCGTAT CGTTTCCGCGGCATTGAATCCGGCGCGTTCCCGCTGCCGGAAGACTATCTGCGCCTGGAA AAGCAGATCGAAGATTGTCAGTCCGTATCCGTACACATCCGTCGTGGCGACTACCTGGAC GAATCTCATGGTGGTCTGTACACCGGCATTTGTACTGAGGCGTACTATAAAGAGGCTTTT GCTCGCATGGAACGTCTGGTTCCGGGCGCACGTTTCTTCCTGTTCTCTAACGATCCAGAA TGGACTCGTGAGCACTTTGAGAGCAAGAACTGCGTTCTGGTTGAAGGTAGCACCGAAGAC ACGGGTTACATGGACCTGTACCTGATGAGCCGCTGCCGCCACAATATTATTGCCAACTCT TCTTTCAGCTGGTGGGGCGCTTGGCTGAATGAGAACCCTGAGAAAAAAGTCATCGCACCG GCTAAATGGCTGAACGGTCGTGAGTGCCGTGATATCTATACCGAACGCATGATTCGTCTG TAGCTCGAGTGACTGACTG FutR CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGATCATTGTTCGTCTGAAAGGCGGT 279 CTGGGCAACCAACTGTCTCAGTATGCACTGGGCCGTAAGATCGCGCATCTGCACAATACC GAACTGAAACTGGACACCACTTGGTTCACCACTATCTCCTCCGACACTCCACGTACCTAC CGTCTGAACAATTATAACATCATCGGCACTATTGCATCCGCAAAGGAAATCCAGCTGATC GAACGTGGTCGCGCGCAAGGCCGTGGCTACCTGCTGTCTAAAATTTCTGATCTGCTGACT CCGATGTACCGTCGTACCTACGTCCGTGAACGTATGCATACCTTCGATAAAGCTATCCTG ACCGTTCCGGACAACGTGTACCTGGATGGTTACTGGCAGACCGAAAAGTACTTCAAAGAC ATCGAAGAAATCCTGCGCCGTGAGGTTACGCTGAAAGATGAACCGGATAGCATCAACCTG GAAATGGCTGAACGTATTCAGGCTTGCCACAGCGTTTCCCTGCACGTGCGTCGTGGCGAC TACGTTTCCAACCCGACCACTCAACAATTCCACGGCTGTTGCTCCATTGACTACTACAAC CGCGCTATCTCTCTGATTGAAGAAAAAGTGGATGACCCGTCTTTCTTTATTTTTTCTGAC GATCTGCCGTGGGCTAAAGAAAACCTGGACATCCCTGGCGAAAAAACCTTCGTTGCGCAT AACGGCCCGGAAAAAGAGTATTGCGATCTGTGGCTGATGTCTCTGTGCCAGCACCATATC ATCGCAAACTCTTCTTTCAGCTGGTGGGGTGCCTGGCTGGGTCAAGACGCCGAAAAGATG GTGATCGCGCCGCGTCGCTGGGCCCTGTCCGAGAGCTTTGACACTTCTGACATCATTCCG GACTCTTGGATTACTATCTAGCTCGAGTGACTGACTG FutS CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGGTACGCATTGTGGAAATCATCGGC 280 
GGTCTGGGTAACCAGATGTTCCAGTACGCATTCTCCCTGTACCTGAAAAACAAATCTCAC ATCTGGGACCGTCTGTATGTGGACATCGAGGCGATGAAAACCTACGATCGTCACTATGGT CTGGAACTGGAGAAAGTTTTCAATCTGAGCCTGTGTCCAATCTCTAACCGTCTGCACCGC AACCTGCAAAAACGCTCCTTCGCAAAACACTTTGTAAAGAGCCTGTACGAGCACTCTGAA TGCGAGTTCGACGAACCGGTGTACCGTGGCCTGCGTCCTTATCGCTATTATCGCGGCTAC TGGCAAAACGAAGGTTACTTCGTTGATATTGAACCGATGATCCGTGAGGCTTTTCAGTTC AACGTTAACATCCTGAGCAAAAAGACTAAAGCGATCGCATCCAAAATGCGCCGTGAACTG TCCGTATCTATCCATGTTCGCCGTGGTGATTACGAAAACCTGCCGGAAGCGAAAGCGATG CATGGCGGTATTTGTTCTCTGGACTATTACCACAAAGCGATCGACTTCATCCGCCAGCGT CTGGATAATAACATCTGTTTCTATCTGTTCTCCGACGATATCAATTGGGTAGAAGAAAAC CTGCAACTGGAAAACCGTTGCATCATCGACTGGAACCAGGGCGAAGATAGCTGGCAGGAC ATGTACCTGATGAGCTGCTGCCGCCACCACATTATCGCAAACAGCTCTTTCTCCTGGTGG GCGGCATGGCTGAATCCAAACAAGAACAAAATCGTACTGACCCCGAACAAATGGTTCAAC CATACTGACGCAGTGGGTATCGTCCCAAAGTCCTGGATTAAAATTCCTGTGTTTTAGCTC GAGTGACTGACTG FutU CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGAAAATCGTTAAAATCCTGGGCGGC 281 CTGGGTAACCAGATGTTTCAGTACGCCCTGTTCCTGTCTCTGAAAGAACGCTTCCCGCAT GAACAGGTGATGATTGACACCAGCTGCTTCCGCAATTACCCACTGCACAACGGTTTCGAA GTGGATCGTATCTTCGCCCAGAAAGCACCGGTTGCCTCTTGGCGTAACATCCTGAAGGTT GCCTACCCGTACCCGAACTACCGTTTCTGGAAAATCGGTAAATACATCCTGCCTAAACGT AAAACCATGTGTGTAGAGCGTAAAAACTTCAGCTTTGACGCCGCAGTCCTGACCCGTAAA GGCGATTGCTACTATGATGGCTACTGGCAGCATGAGGAATATTTCTGTGATATGAAAGAA ACGATTTGGGAGGCTTTCTCCTTCCCTGAGCCGGTTGATGGTCGTAACAAGGAGATCGGT GCCCTGCTACAGGCATCTGATAGCGCTTCCCTGCACGTTCGTCGCGGTGACTACGTGAAC CACCCACTGTTTCGTGGTATTTGTGACCTGGACTATTATAAACGTGCCATCCACTACATG GAAGAACGCGTCAACCCACAGCTGTACTGCGTTTTCAGCAACGATATGGCCTGGTGCGAG TCCCACCTGCGTGCACTGCTGCCAGGCAAAGAAGTAGTTTATGTTGACTGGAACAAGGGT GCGGAATCTTACGTTGATATGCGTCTGATGAGCCTGTGCCGTCACAACATCATCGCTAAC TCTTCTTTCAGCTGGTGGGGCGCATGGCTGAACCGTAACCCGCAGAAAGTGGTGGTAGCG CCGGAACGTTGGATGAACAGCCCGATTGAAGACCCAGTGAGCGACAAATGGATTAAACTG TAGCTCGAGTGACTGACTG FutV CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGATCATCATCCAGCTGAAAGGTGGC 282 CTGGGCAACCAAATGTTCCAGTACGCGCTGTACAAATCCCTGAAAAAACGTGGTAAAGAA 
GTTAAAATTGATGACAAAACTGGCTTCGTGAACGACAAACTGCGTATCCCGGTACTGTCC CGTTGGGGTGTTGAGTACGATCGTGCAACCGACGAAGAGATTATTAACCTGACCGACTCC AAAATGGACCTGTTCTCTCGCATCCGCCGTAAACTGACTGGCCGCAAAACGTTCCGTATC GACGAAGAATCCGGTAAATTCAACCCGGAAATCCTGGAAAAAGAGAACGCTTATCTGGTG GGTTATTGGCAGTGCGACAAGTACTTCGACGACAAAGATGTGGTTCGCGAAATTCGTGAA GCGTTCGAGAAAAAACCGCAGGAGCTGATGACCGACGCCAGCTCTTGGTCTACTCTACAG CAGATTGAATGCTGCGAGTCCGTATCCCTGCACGTACGTCGTACTGATTACGTGGACGAG GAACATATTCATATCCATAACATCTGTACGGAAAAATACTATAAAAACGCCATTGATCGT GTGCGTAAACAGTACCCGAGCGCAGTGTTCTTCATCTTCACCGATGATAAAGAATGGTGC CGCGACCACTTTAAAGGTCCGAACTTCATCGTAGTCGAACTGGAAGAAGGCGACGGTACC GACATCGCTGAAATGACTCTGATGTCCCGCTGTAAACATCACATCATCGCTAATTCTAGC TTTAGCTGGTGGGCGGCGTGGCTGAACGACTCCCCGGAAAAAATCGTGATCGCTCCTCAG AAATGGATTAACAACCGCGACATGGACGATATTTACACCGAGCGTATGACTAAAATCGCA CTGTAGCTCGAGTGACTGACTG FutW CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGCCTGGTTAAAATGATCGGCGGT 283 CTGGGTAATCAGATGTTCATCTACGCGTTTTACCTACAGATGCGTAAGCGTTTCTCCAAC GTTCGTATCGACCTGACCGATATGATGCACTACAACGTACACTATGGCTACGAACTGCAC AAAGTTTTCGGTCTGCCGCGCACCGAGTTCTGTATGAACCAGCCTCTGAAAAAGGTTCTG GAGTTCCTGTTCTTCCGTACCATTGTTGAACGTAAACAGCACGGTCGTATGGAGCCGTAT ACTTGCCAGTATGTTTGGCCGCTGGTTTACTTTAAGGGCTTCTATCAGTCCGAACGTTAC TTCTCCGAAGTTAAGGACGAAGTTCGTGAGTGTTTCACCTTCAATCCGGCACTGGCGAAT CGTTCTTCCCAACAGATGATGGAACAGATCCAGAATGATCCTCAGGCTGTCTCTATCCAC ATCCGTCGTGGCGACTATCTGAATCCGAAGCACTACGACACTATCGGTTGTATCTGTCAG CTGCCGTATTACAAGCACGCCGTTTCCGAAATTAAAAAGTACGTTTCTAACCCTCACTTT TACGTTTTCTCCGAAGACCTGGATTGGGTCAAAGCAAACCTGCCGCTGGAAAACGCACAG TACATCGATTGGAACAAAGGCGCAGATAGCTGGCAGGATATGATGCTGATGAGCTGTTGC AAACACCACATTATCTGTAACTCCACCTTTAGCTGGTGGGCGGCGTGGCTGAACCCATCT GTCGAAAAAACCGTGATCATGCCGGAACAGTGGACGTCTCGTCAAGATTCCGTGGACTTT GTGGCTAGCTGTGGCCGTTGGGTCCGTGTTAAAACGGAGTAGCTCGAGTGACTGACTG FutX CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGTCTGATCAAGATGATCGGCGGC 284 CTGGGTAACCAGATGTTTATCTACGCGTTCTACCTGAAAATGAAACACCATTACCCGGAT ACGAACATCGATCTGTCTGACATGGTTCATTATAAAGTTCACAACGGTTATGAGATGAAC CGTATCTTTGACCTGAGCCAGACTGAATTTTGCATCAACCGTACCCTGAAAAAAATCCTG 
GAGTTCCTGTTCTTCAAAAAAATCTACGAACGTCGCCAGGACCCGTCTACTCTGTATCCA TACGAAAAACGTTATTTTTGGCCGCTGCTGTACTTTAAAGGTTTCTACCAGTCTGAACGC TTCTTCTTCGATATCAAAGACGACGTTCGTAAAGCCTTCTCTTTTAACCTGAACATCGCT AACCCGGAAAGCCTGGAACTGCTGAAACAGATCGAAGTTGACGACCAAGCTGTTTCTATC CACATCCGCCGTGGTGACTACCTGCTGCCGCGTCACTGGGCAAACACGGGTTCCGTGTGC CAGCTGCCGTATTACAAGAACGCGATCGCGGAAATGGAGAACCGTATTACTGGCCCGAGC TACTACGTGTTCTCTGATGATATCTCTTGGGTTAAAGAAAACATCCCGCTGAAGAAAGCG GTCTACGTGACGTGGAACAAGGGCGAAGACAGCTGGCAGGATATGATGCTGATGAGCCAC TGTCGTCACCACATTATCTGTAATTCTACGTTCTCCTGGTGGGGTGCTTGGCTGAACCCA CGTAAAGAGAAAATCGTCATCGCGCCGTGTCGCTGGTTCCAGCATAAAGAAACCCCGGAC ATGTACCCGAAAGAATGGATCAAAGTACCGATTAACTAGCTCGAGTGACTGACTG FutZ CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGTATTCTTGCCTGTCTGGTGGCCTG 285 GGTAACCAAATGTTTCAATACGCAGCAGCGTATATCCTGAAGCAGTATTTTCAGTCTACC ACTCTGGTCCTGGATGATAGCTATTACTATTCCCAGCCGAAACGTGATACCGTTCGTAGC CTGGAACTGAATCAGTTCAACATCTCTTATGATCGTTTTAGCTTCGCGGATGAAAAAGAG AAGATCAAACTGCTGCGCAAATTCAAACGTAACCCGTTCCCTAAACAGATTTCCGAGATC CTGTCTATTGCGCTGTTCGGCAAATACGCGCTGTCCGACCGTGCATTTTACACCTTCGAA ACTATCAAAAACATCGACAAAGCGTGCCTGTTCTCCTTTTACCAGGACGCCGATCTGCTG AATAAATATAAGCAGCTGATCCTGCCGCTGTTCGAACTGCGCGATGACCTGCTGGATATC TGCAAGAACCTGGAACTGTATTCCCTGATCCAACGCAGCAACAATACCACTGCACTGCAT ATCCGCCGTGGCGACTACGTGACCAACCAGCACGCCGCGAAATACCACGGCGTGCTGGAC ATCAGCTACTATAACCACGCAATGGAATACGTGGAACGTGAACGCGGCAAACAGAACTTC ATTATCTTCAGCGATGATGTACGTTGGGCACAGAAAGCGTTTCTGGAGAACGATAATTGC TACGTGATTAACAACTCCGACTACGATTTCTCTGCGATCGATATGTATCTGATGTCTCTG TGCAAAAACAACATCATCGCAAATTCCACCTACTCCTGGTGGGGTGCGTGGCTGAACAAA TACGAGGACAAACTGGTTATCTCTCCGAAACAATGGTTTCTGGGTAACAACGAAACCTCT CTGCGTAACGCGTCTTGGATCACCCTGTAGCTCGAGTGACTGACTG FutZA CAGTCAGTCAGAATTCAAGAAGGAGATATACATATGCGTCTGATCAAGATGACCGGTGGC 286 CTGGGTAACCAGATGTTCATCTACGCGTTTTATCTGCGTATGAAAAAACGTTATCCGAAA GTTCGTATTGATCTGTCTGATATGGTTCATTATCACGTTCACCACGGCTATGAAATGCAC CGTGTTTTCAATCTGCCGCACACCGAATTTTGCATCAACCAGCCGCTGAAAAAAGTGATC GAGTTCCTGTTTTTCAAAAAGATTTACGAACGTAAACAGGACCCTAATTCTCTGCGTGCA 
TTCGAGAAGAAGTATCTGTGGCCGCTGCTGTACTTCAAAGGTTTCTATCAATCTGAGCGC TTCTTTGCTGACATCAAAGACGAGGTTCGTAAAGCATTCACCTTTGACTCTTCTAAAGTG AACGCTCGCTCTGCCGAACTGCTGCGTCGCCTGGATGCCGATGCTAACGCGGTTAGCCTG CACATTCGTCGCGGTGACTATCTACAGCCGCAGCATTGGGCTACCACTGGTTCTGTCTGC CAGCTGCCGTACTACCAGAACGCGATCGCTGAAATGAACCGTCGCGTTGCTGCCCCGAGC TACTACGTTTTCAGCGATGACATCGCGTGGGTGAAGGAAAACATCCCTCTACAGAACGCA GTGTACATCGACTGGAATAAAGGCGAAGAAAGCTGGCAGGATATGATGCTGATGAGCCAC TGCCGCCACCACATTATCTGTAACAGCACCTTCTCTTGGTGGGGCGCGTGGCTGGACCCG CACGAGGACAAAATTGTAATCGTTCCGAATCGTTGGTTCCAGCATTGCGAAACTCCTAAC ATCTATCCGGCAGGCTGGGTGAAAGTTGCGATTAATTAGCTCGAGTGACTGACTG
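The syngene cassettes listed in this portion of the table appear to share common flanking sequences: each begins with CAGTCAGTCAGAATTCAAGAAGGAGATATACAT immediately before the ATG start codon and ends with CTCGAGTGACTGACTG immediately after the stop codon. As a minimal sketch, assuming those flanks (they are read off the listed sequences and are not stated in the text) and using a hypothetical helper name `extract_orf`, the coding region of a cassette could be recovered as follows:

```python
# Flanks observed in the listed cassettes; treating them as fixed is an assumption.
FIVE_PRIME_FLANK = "CAGTCAGTCAGAATTCAAGAAGGAGATATACAT"
THREE_PRIME_FLANK = "CTCGAGTGACTGACTG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_orf(cassette: str) -> str:
    """Return the coding region (ATG through stop codon) of a syngene cassette.

    Whitespace from the printed listing is removed before parsing.
    """
    seq = "".join(cassette.split()).upper()
    if not (seq.startswith(FIVE_PRIME_FLANK) and seq.endswith(THREE_PRIME_FLANK)):
        raise ValueError("cassette does not carry the expected flanking sequences")
    orf = seq[len(FIVE_PRIME_FLANK):len(seq) - len(THREE_PRIME_FLANK)]
    if not orf.startswith("ATG"):
        raise ValueError("coding region does not start with ATG")
    if len(orf) % 3 != 0 or orf[-3:] not in STOP_CODONS:
        raise ValueError("coding region is not a whole number of codons ending in a stop")
    return orf
```

The sequence-number labels and gene names interleaved in the listing would need to be removed before a cassette is passed to this function.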
(37) In any of the methods described herein, the α(1,2) fucosyltransferase genes or gene products may be variants or functional fragments thereof. A variant of any of the genes or gene products disclosed herein may have 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% sequence identity to the nucleic acid or amino acid sequences described herein.
(38) Variants as disclosed herein also include homologs, orthologs, or paralogs of the genes or gene products described herein that retain the same biological function as the genes or gene products specified herein. These variants can be used interchangeably with the genes recited in these methods. Such variants may demonstrate a percentage of homology or identity, for example, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identity, in conserved domains important for biological function, preferably in a functional domain, e.g. catalytic domain.
(39) The term “% identity,” in the context of two or more nucleic acid or polypeptide sequences, refers to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. For example, % identity is relative to the entire length of the coding regions of the sequences being compared, or the length of a particular fragment or functional domain thereof.
(40) For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
(41) Percent identity is determined using search algorithms such as BLAST and PSI-BLAST (Altschul et al., 1990, J Mol Biol 215:3, 403-410; Altschul et al., 1997, Nucleic Acids Res 25:17, 3389-402). For the PSI-BLAST search, the following exemplary parameters are employed: (1) the Expect threshold is 10; (2) the Gap cost is Existence: 11 and Extension: 1; (3) the Matrix employed is BLOSUM62; (4) the filter for low complexity regions is “on”.
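To make the counting step concrete, the following is a minimal Python sketch of a percent identity calculation. It assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings in which "-" marks a gap. The function name `percent_identity` and the choice of the reference sequence length as the denominator are illustrative assumptions, not parameters taken from the text; alignment tools may report identity over the aligned length instead.

```python
def percent_identity(ref_aligned: str, test_aligned: str) -> float:
    """Percent identity of a test sequence relative to a reference sequence.

    Both inputs are rows of one pairwise alignment ('-' marks a gap).
    The denominator is the number of reference residues (an assumption).
    """
    if len(ref_aligned) != len(test_aligned):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    ref_length = 0
    for ref_res, test_res in zip(ref_aligned.upper(), test_aligned.upper()):
        if ref_res == "-":
            # Insertion in the test sequence: no reference residue at this column.
            continue
        ref_length += 1
        if ref_res == test_res:
            matches += 1
    if ref_length == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / ref_length
```

Under this convention, a test sequence differing from a five-residue reference at one position scores 80.0, which could then be compared against a chosen threshold such as 90%.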
(42) Changes can be introduced by mutation into the nucleic acid sequence or amino acid sequence of any of the genes or gene products described herein, leading to changes in the amino acid sequence of the encoded protein or enzyme, without altering the functional ability of the protein or enzyme. For example, nucleotide substitutions leading to amino acid substitutions at “non-essential” amino acid residues can be made in the sequence of any of the sequences expressly disclosed herein. A “non-essential” amino acid residue is a residue at a position in the sequence that can be altered from the wild-type sequence of the polypeptide without altering the biological activity, whereas an “essential” amino acid residue is a residue at a position that is required for biological activity. For example, amino acid residues that are conserved among members of a family of proteins are not likely to be amenable to mutation. Other amino acid residues, however, (e.g., those that are poorly conserved among members of the protein family) may not be as essential for activity and thus are more likely to be amenable to alteration. Thus, another aspect of the invention pertains to nucleic acid molecules encoding the proteins or enzymes disclosed herein that contain changes in amino acid residues relative to the amino acid sequences disclosed herein that are not essential for activity (i.e., fucosyltransferase activity).
(43) An isolated nucleic acid molecule encoding a protein essentially retaining the functional capability compared to any of the genes described herein can be created by introducing one or more nucleotide substitutions, additions or deletions into the corresponding nucleotide sequence, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein.
(44) Mutations that alter the encoded amino acid sequence can be introduced into a nucleic acid sequence by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Preferably, conservative amino acid substitutions are made at one or more predicted non-essential amino acid residues. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, tryptophan, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tyrosine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Certain amino acids have side chains with more than one classifiable characteristic and therefore appear in more than one family. Thus, a predicted nonessential amino acid residue in a given polypeptide is replaced with another amino acid residue from the same side chain family. Alternatively, in another embodiment, mutations can be introduced randomly along all or part of a given coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for given polypeptide biological activity to identify mutants that retain activity. Conversely, the invention also provides for variants with mutations that enhance or increase the endogenous biological activity. Following mutagenesis of the nucleic acid sequence, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. An increase, decrease, or elimination of a given biological activity of the variants disclosed herein can be readily measured by the ordinary person skilled in the art, i.e., by measuring the capability for mediating oligosaccharide modification, synthesis, or degradation (via detection of the products).
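The side chain families named in the preceding paragraph can be written down as a lookup table. The Python sketch below encodes exactly the family memberships listed there, using one-letter amino acid codes, and treats a substitution as conservative when the two residues share at least one family. The names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative` are hypothetical, and the "shares at least one family" rule is one reasonable reading of the text, offered as an illustration only.

```python
# Family memberships as listed in the text, in one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),
    "acidic": set("DE"),
    "uncharged_polar": set("GNQSTYWC"),
    "nonpolar": set("AVLIPFMYW"),
    "beta_branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(original: str, replacement: str) -> list:
    """Return the sorted names of the families containing both residues."""
    a, b = original.upper(), replacement.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative(original: str, replacement: str) -> bool:
    """True if the residues differ and share at least one side chain family."""
    if original.upper() == replacement.upper():
        return False  # no substitution took place
    return bool(shared_families(original, replacement))
```

Because some residues belong to several families, a single substitution (for example tyrosine to tryptophan) can be conservative with respect to more than one characteristic.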
(45) The present invention also provides for functional fragments of the genes or gene products described herein. A fragment, in the case of these sequences and all others provided herein, is defined as a part of the whole that is less than the whole. Moreover, a fragment ranges in size from a single nucleotide or amino acid within a polynucleotide or polypeptide sequence to one fewer nucleotide or amino acid than the entire polynucleotide or polypeptide sequence. Finally, a fragment is defined as any portion of a complete polynucleotide or polypeptide sequence that is intermediate between the extremes defined above.
(46) For example, fragments of any of the proteins or enzymes disclosed herein or encoded by any of the genes disclosed herein can be 10 to 20 amino acids, 10 to 30 amino acids, 10 to 40 amino acids, 10 to 50 amino acids, 10 to 60 amino acids, 10 to 70 amino acids, 10 to 80 amino acids, 10 to 90 amino acids, 10 to 100 amino acids, 50 to 100 amino acids, 75 to 125 amino acids, 100 to 150 amino acids, 150 to 200 amino acids, 200 to 250 amino acids, 250 to 300 amino acids, 300 to 350 amino acids, 350 to 400 amino acids, 400 to 450 amino acids, or 450 to 500 amino acids. The fragments encompassed in the present invention comprise fragments that retain functional activity. As such, the fragments preferably retain the catalytic domains that are required or are important for functional activity. Fragments can be determined or generated by using the sequence information herein, and the fragments can be tested for functional activity using standard methods known in the art. For example, the encoded protein can be expressed by any recombinant technology known in the art and the activity of the protein can be determined. The biological function of said fragment can be measured by measuring its ability to synthesize or modify a substrate oligosaccharide, or conversely, to catabolize an oligosaccharide substrate.
(47) Within the context of the invention, “functionally equivalent”, as used herein, refers to a gene or the resulting encoded protein variant or fragment thereof capable of exhibiting a substantially similar activity as the wild-type fucosyltransferase. Specifically, the fucosyltransferase activity refers to the ability to transfer a fucose sugar to an acceptor substrate via an alpha-(1,2)-linkage. As used herein, “substantially similar activity” refers to an activity level within 5%, 10%, 20%, 30%, 40%, or 50% of the wild-type fucosyltransferase.
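As a small illustration of the "substantially similar activity" criterion, the Python sketch below checks whether a measured variant activity lies within a chosen percentage of the wild-type activity. It assumes that "within" means a symmetric relative deviation (above or below wild type) and that both activities are expressed in the same arbitrary units; the function name `is_substantially_similar` and its parameters are hypothetical.

```python
def is_substantially_similar(
    variant_activity: float,
    wild_type_activity: float,
    tolerance_pct: float = 50.0,
) -> bool:
    """True if the variant activity deviates from wild type by at most tolerance_pct.

    The deviation is measured relative to the wild-type activity, in either
    direction (an assumption about how "within X%" is to be read).
    """
    if wild_type_activity <= 0:
        raise ValueError("wild-type activity must be positive")
    deviation_pct = abs(variant_activity - wild_type_activity) * 100.0 / wild_type_activity
    return deviation_pct <= tolerance_pct
```

The tolerance would be set to whichever of the listed levels (5%, 10%, 20%, 30%, 40%, or 50%) applies.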
(48) To test for lactose-utilizing fucosyltransferase activity, the production of fucosylated oligosaccharides (i.e., 2′-FL) is evaluated in a host organism that expresses the candidate enzyme (or syngene) and which contains both cytoplasmic GDP-fucose and lactose pools. The production of fucosylated oligosaccharides indicates that the candidate enzyme-encoding sequence functions as a lactose-utilizing α(1,2)fucosyltransferase.
(49) Engineering of E. coli to Produce Human Milk Oligosaccharide 2′-FL
(50) Described herein is a gene screening approach, which was used to validate the novel α (1,2) fucosyltransferases (α (1,2) FTs) for the synthesis of fucosyl-linked oligosaccharides in metabolically engineered E. coli. Of particular interest are α (1,2) FTs that are capable of the synthesis of the HMOS 2′-fucosyllactose (2′-FL). 2′-FL is the most abundant fucosylated oligosaccharide present in human milk, and this oligosaccharide provides protection to newborn infants against infectious diarrhea caused by bacterial pathogens such as Campylobacter jejuni (Ruiz-Palacios, G. M., et al. (2003). J Biol Chem 278, 14112-120; Morrow, A. L. et al. (2004). J Pediatr 145, 297-303; Newburg, D. S. et al. (2004). Glycobiology 14, 253-263). Other α (1,2) FTs of interest are those capable of synthesis of HMOS lactodifucotetraose (LDFT), lacto-N-fucopentaose I (LNFI), or lacto-N-difucohexaose I (LDFH I).
(51) The synthetic pathway of fucosyl oligosaccharides of human milk is illustrated in
(52) Candidate α(1,2) FTs (i.e., syngenes) were cloned by standard molecular biological techniques into an expression plasmid. This plasmid utilizes the strong leftwards promoter of bacteriophage λ (termed P.sub.L) to direct expression of the candidate genes (Sanger, F. et al. (1982). J Mol Biol 162, 729-773). The promoter is controllable, e.g., a trp-cI construct is stably integrated into the E. coli host's genome (at the ampC locus), and control is implemented by adding tryptophan to the growth media. Gradual induction of protein expression is accomplished using a temperature sensitive cI repressor. Another similar control strategy (temperature independent expression system) has been described (Mieschendahl et al., 1986, Bio/Technology 4:802-808). The plasmid also carries the E. coli rcsA gene to up-regulate synthesis of GDP-fucose, a critical precursor for the synthesis of fucosyl-linked oligosaccharides. In addition, the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains by ampicillin selection (for convenience in the laboratory) and a native thyA (thymidylate synthase) gene as an alternative means of selection in thyA.sup.− hosts. Alternative selectable markers include the proBA genes to complement proline auxotrophy (Stein et al. (1984), J Bacteriol 158:2, 696-700) or purA to complement adenine auxotrophy (S. A. Wolfe, J. M. Smith, J Biol Chem 263, 19147-53 (1988)). To act as plasmid selectable markers, each of these genes is first inactivated in the host cell chromosome, and wild-type copies of the genes are then provided on the plasmid. Alternatively, a drug resistance gene may be used on the plasmid, e.g. beta-lactamase (this gene is already on the expression plasmid described above, thereby permitting selection with ampicillin). Ampicillin selection is well known in the art and described in standard manuals such as Maniatis et al., (1982) Molecular cloning, a laboratory manual. Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y.
(53) The nucleic acid sequence of such an expression plasmid, pEC2-(T7)FutX-rcsA-thyA (pG401) is provided below. The underlined sequence represents the FutX syngene, which can be readily replaced with any of the novel α(1,2) FTs described herein using standard recombinant DNA techniques.
(54) TABLE-US-00003 (SEQ ID NO: 287) TCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCACAGCTTGT CTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGG CTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACC GCACAGATGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAACCTGTATATTCGTAAACCACGCC CAATGGGAGCTGTCTCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTC CGCCAGCCCGACGCGCAGTTTACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATTCTTTCCAT CCCGATGATTGTCGCGGGTGTGATCATGATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTT CCTGAGGAACCATGAAACAGTATTTAGAACTGATGCAAAAAGTGCTCGACGAAGGCACACAGAAAAAC GACCGTACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAGATGCGTTTTAACCTGCAAGATGGATT CCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAACTGCTGTGGTTTCTGCAGG GCGACACTAACATTGCTTATCTACACGAAAACAATGTCACCATCTGGGACGAATGGGCCGATGAAAAC GGCGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCATATTGA CCAGATCACTACGGTACTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGT GGAACGTAGGCGAACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCA GACGGCAAACTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACAT TGCCAGCTACGCGTTATTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCT GGACCGGTGGCGACACGCATCTGTACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCGCGAA CCGCGTCCGCTGCCGAAGTTGATTATCAAACGTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGA CTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATTAAAGCGCCGGTGGCTATCTAATTACGAAACA TCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTCCGTTAAATTCTTCGAGACGCCTTC CCGAAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATCGGTGCGGGCCTCTTCGC TATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCCAGGGTTTTCC CAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTTCTTTAATGAAGCAGGGCATCAGGACGGT ATCTTTGTGGAGAAAGCAGAGTAATCTTATTCAGCCTGACTGGTGGGAAACCACCAGTCAGAATGTGT TAGCGCATGTTGACAAAAATACCATTAGTCACATTATCCGTCAGTCGGACGACATGGTAGATAACCTG TTTATTATGCGTTTTGATCTTACGTTTAATATTACCTTTATGCGATGAAACGGTCTTGGCTTTGATAT TCATTTGGTCAGAGATTTGAATGGTTCCCTGACCTGCCATCCACATTCGCAACATACTCGATTCGGTT CGGCTCAATGATAACGTCGGCATATTTAAAAACGAGGTTATCGTTGTCTCTTTTTTCAGAATATCGCC 
AAGGATATCGTCGAGAGATTCCGGTTTAATCGATTTAGAACTGATCAATAAATTTTTTCTGACCAATA GATATTCATCAAAATGAACATTGGCAATTGCCATAAAAACGATAAATAACGTATTGGGATGTTGATTA ATGATGAGCTTGATACGCTGACTGTTAGAAGCATCGTGGATGAAACAGTCCTCATTAATAAACACCAC TGAAGGGCGCTGTGAATCACAAGCTATGGCAAGGTCATCAACGGTTTCAATGTCGTTGATTTCTCTTT TTTTAACCCCTCTACTCAACAGATACCCGGTTAAACCTAGTCGGGTGTAACTACATAAATCCATAATA ATCGTTGACATGGCATACCCTCACTCAATGCGTAACGATAATTCCCCTTACCTGAATATTTCATCATG ACTAAACGGAACAACATGGGTCACCTAATGCGCCACTCTCGCGATTTTTCAGGCGGACTTACTATCCC GTAAAGTGTTGTATAATTTGCCTGGAATTGTCTTAAAGTAAAGTAAATGTTGCGATATGTGAGTGAGC TTAAAACAAATATTTCGCTGCAGGAGTATCCTGGAAGATGTTCGTAGAAGCTTACTGCTCACAAGAAA AAAGGCACGTCATCTGACGTGCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGCTCGAGCTAG TTAATCGGTACTTTGATCCATTCTTTCGGGTACATGTCCGGGGTTTCTTTATGCTGGAACCAGCGACA CGGCGCGATGACGATTTTCTCTTTACGTGGGTTCAGCCAAGCACCCCACCAGGAGAACGTAGAATTAC AGATAATGTGGTGACGACAGTGGCTCATCAGCATCATATCCTGCCAGCTGTCTTCGCCCTTGTTCCAC GTCACGTAGACCGCTTTCTTCAGCGGGATGTTTTCTTTAACCCAAGAGATATCATCAGAGAACACGTA GTAGCTCGGGCCAGTAATACGGTTCTCCATTTCCGCGATCGCGTTCTTGTAATACGGCAGCTGGCACA CGGAACCCGTGTTTGCCCAGTGACGCGGCAGCAGGTAGTCACCACGGCGGATGTGGATAGAAACAGCT TGGTCGTCAACTTCGATCTGTTTCAGCAGTTCCAGGCTTTCCGGGTTAGCGATGTTCAGGTTAAAAGA GAAGGCTTTACGAACGTCGTCTTTGATATCGAAGAAGAAGCGTTCAGACTGGTAGAAACCTTTAAAGT ACAGCAGCGGCCAAAAATAACGTTTTTCGTATGGATACAGAGTAGACGGGTCCTGGCGACGTTCGTAG ATTTTTTTGAAGAACAGGAACTCCAGGATTTTTTTCAGGGTACGGTTGATGCAAAATTCAGTCTGGCT CAGGTCAAAGATACGGTTCATCTCATAACCGTTGTGAACTTTATAATGAACCATGTCAGACAGATCGA TGTTCGTATCCGGGTAATGGTGTTTCATTTTCAGGTAGAACGCGTAGATAAACATCTGGTTACCCAGG CCGCCGATCATCTTGATCAGACGCATATGTATATCTCCTTCTTGAATTCTAAAAATTGATTGAATGTA TGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCTTTTTCAGGGCTGGAATGTGTAA GAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCTCTTCCGCATAAACGC TTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACCAACAGGGG ATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATCT GGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGAT 
TGGCACATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAA AGCCTTCTGCTTTGAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTC AGTGCGTCCTGCTGATGTGCTCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATT TATCACCGCAGATGGTTATCTGTATGTTTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGG TAGGTGAGAGATCAATTCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGG CGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCGGCGAGCGGTATCAGC TCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAA AAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCATAGGCTCCGCCCC CCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATA CCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACC TGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCG GTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTT ATCCGGTAACTATCGTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTG GTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTAC GGCTACACTAGAAGAACAGTATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGT TGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGA TTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGG AACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTT AAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAAT GCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACTCCCC GTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGA CCCACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTG GTCCTGCAACTTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCG CCAGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGG TATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAA AAGCGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATG GTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGA GTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCCCGGCGTCAATAC 
GGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGA AAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACCCAACTGATC TTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAA AGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATT TATCAGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGT TCCGCGCACATTTCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCT ATAAAAATAGGCGTATCACGAGGCCCTTTCGTC
(55) The expression constructs were transformed into a host strain useful for the production of 2′-FL. Biosynthesis of 2′-FL requires the generation of an enhanced cellular pool of both lactose and GDP-fucose (
(56) First, the ability of the E. coli host strain to accumulate intracellular lactose was engineered by simultaneous deletion of the endogenous β-galactosidase gene (lacZ) and the lactose operon repressor gene (lacI). During construction of this deletion, the lacIq promoter was placed immediately upstream of the lactose permease gene, lacY. The modified strain maintains its ability to transport lactose from the culture medium (via LacY), but is deleted for the wild-type copy of the lacZ (β-galactosidase) gene responsible for lactose catabolism. Therefore, an intracellular lactose pool is created when the modified strain is cultured in the presence of exogenous lactose. A schematic of the P.sub.lacIq lacY.sup.+ chromosomal construct is shown in
(57) Genomic DNA sequence of the P.sub.lacIq lacY.sup.+ chromosomal construct is set forth below (SEQ ID NO: 288):
(58) TABLE-US-00004 CACCATCGAATGGCGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAGTCAAGTGTAGGCTGGAGC TGCTTCGAAGTTCCTATACTTTCTAGAGAATAGGAACTTCGGAATAGGAACTTCGGAATAGGAACTAAGGAGGAT ATTCATATGTACTATTTAAAAAACACAAACTTTTGGATGTTCGGTTTATTCTTTTTCTTTTACTTTTTTATCATG GGAGCCTACTTCCCGTTTTTCCCGATTTGGCTACATGACATCAACCATATCAGCAAAAGTGATACGGGTATTATT TTTGCCGCTATTTCTCTGTTCTCGCTATTATTCCAACCGCTGTTTGGTCTGCTTTCTGACAAACTCGGGCTGCGC AAATACCTGCTGTGGATTATTACCGGCATGTTAGTGATGTTTGCGCCGTTCTTTATTTTTATCTTCGGGCCACTG TTACAATACAACATTTTAGTAGGATCGATTGTTGGTGGTATTTATCTAGGCTTTTGTTTTAACGCCGGTGCGCCA GCAGTAGAGGCATTTATTGAGAAAGTCAGCCGTCGCAGTAATTTCGAATTTGGTCGCGCGCGGATGTTTGGCTGT GTTGGCTGGGCGCTGTGTGCCTCGATTGTCGGCATCATGTTCACCATCAATAATCAGTTTGTTTTCTGGCTGGGC TCTGGCTGTGCACTCATCCTCGCCGTTTTACTCTTTTTCGCCAAAACGGATGCGCCCTCTTCTGCCACGGTTGCC AATGCGGTAGGTGCCAACCATTCGGCATTTAGCCTTAAGCTGGCACTGGAACTGTTCAGACAGCCAAAACTGTGG TTTTTGTCACTGTATGTTATTGGCGTTTCCTGCACCTACGATGTTTTTGACCAACAGTTTGCTAATTTCTTTACT TCGTTCTTTGCTACCGGTGAACAGGGTACGCGGGTATTTGGCTACGTAACGACAATGGGCGAATTACTTAACGCC TCGATTATGTTCTTTGCGCCACTGATCATTAATCGCATCGGTGGGAAAAACGCCCTGCTGCTGGCTGGCACTATT ATGTCTGTACGTATTATTGGCTCATCGTTCGCCACCTCAGCGCTGGAAGTGGTTATTCTGAAAACGCTGCATATG TTTGAAGTACCGTTCCTGCTGGTGGGCTGCTTTAAATATATTACCAGCCAGTTTGAAGTGCGTTTTTCAGCGACG ATTTATCTGGTCTGTTTCTGCTTCTTTAAGCAACTGGCGATGATTTTTATGTCTGTACTGGCGGGCAATATGTAT GAAAGCATCGGTTTCCAGGGCGCTTATCTGGTGCTGGGTCTGGTGGCGCTGGGCTTCACCTTAATTTCCGTGTTC ACGCTTAGCGGCCCCGGCCCGCTTTCCCTGCTGCGTCGTCAGGTGAATGAAGTCGCTTAAGCAATCAATGTCGGA TGCGGCGCGAGCGCCTTATCCGACCAACATATCATAACGGAGTGATCGCATTGTAAATTATAAAAATTGCCTGAT ACGCTGCGCTTATCAGGCCTACAAGTTCAGCGATCTACATTAGCCGCATCCGGCATGAACAAAGCGCAGGAACAA GCGTCGCA
(59) Second, the ability of the host E. coli strain to synthesize colanic acid, an extracellular capsular polysaccharide, was eliminated by the deletion of the wcaJ gene, encoding the UDP-glucose lipid carrier transferase (Stevenson, G. et al. (1996). J Bacteriol 178, 4885-893). In a wcaJ null background GDP-fucose accumulates in the E. coli cytoplasm (Dumon, C. et al. (2001). Glycoconj J 18, 465-474). A schematic of the chromosomal deletion of wcaJ is shown in
(60) The sequence of the chromosomal region of E. coli bearing the ΔwcaJ::FRT mutation is set forth below (SEQ ID NO: 289):
(61) TABLE-US-00005 GTTCGGTTATATCAATGTCAAAAACCTCACGCCGCTCAAGCTGGTGATCAACTCCGGGAACGGCGCAGCGGGTCC GGTGGTGGACGCCATTGAAGCCCGCTTTAAAGCCCTCGGCGCGCCCGTGGAATTAATCAAAGTGCACAACACGCC GGACGGCAATTTCCCCAACGGTATTCCTAACCCACTACTGCCGGAATGCCGCGACGACACCCGCAATGCGGTCAT CAAACACGGCGCGGATATGGGCATTGCTTTTGATGGCGATTTTGACCGCTGTTTCCTGTTTGACGAAAAAGGGCA GTTTATTGAGGGCTACTACATTGTCGGCCTGTTGGCAGAAGCATTCCTCGAAAAAAATCCCGGCGCGAAGATCAT CCACGATCCACGTCTCTCCTGGAACACCGTTGATGTGGTGACTGCCGCAGGTGGCACGCCGGTAATGTCGAAAAC CGGACACGCCTTTATTAAAGAACGTATGCGCAAGGAAGACGCCATCTATGGTGGCGAAATGAGCGCCCACCATTA CTTCCGTGATTTCGCTTACTGCGACAGCGGCATGATCCCGTGGCTGCTGGTCGCCGAACTGGTGTGCCTGAAAGA TAAAACGCTGGGCGAACTGGTACGCGACCGGATGGCGGCGTTTCCGGCAAGCGGTGAGATCAACAGCAAACTGGC GCAACCCGTTGAGGCGATTAACCGCGTGGAACAGCATTTTAGCCGTGAGGCGCTGGCGGTGGATCGCACCGATGG CATCAGCATGACCTTTGCCGACTGGCGCTTTAACCTGCGCACCTCCAATACCGAACCGGTGGTGCGCCTGAATGT GGAATCGCGCGGTGATGTGCCGCTGATGGAAGCGCGAACGCGAACTCTGCTGACGTTGCTGAACGAGTAATGTCG GATCTTCCCTTACCCCACTGCGGGTAAGGGGCTAATAACAGGAACAACGATGATTCCGGGGATCCGTCGACCTGC AGTTCGAAGTTCCTATTCTCTAGAAAGTATAGGAACTTCGAAGCAGCTCCAGCCTACAGTTAACAAAGCGGCATA TTGATATGAGCTTACGTGAAAAAACCATCAGCGGCGCGAAGTGGTCGGCGATTGCCACGGTGATCATCATCGGCC TCGGGCTGGTGCAGATGACCGTGCTGGCGCGGATTATCGACAACCACCAGTTCGGCCTGCTTACCGTGTCGCTGG TGATTATCGCGCTGGCAGATACGCTTTCTGACTTCGGTATCGCTAACTCGATTATTCAGCGAAAAGAAATCAGTC ACCTTGAACTCACCACGTTGTACTGGCTGAACGTCGGGCTGGGGATCGTGGTGTGCGTGGCGGTGTTTTTGTTGA GTGATCTCATCGGCGACGTGCTGAATAACCCGGACCTGGCACCGTTGATTAAAACATTATCGCTGGCGTTTGTGG TAATCCCCCACGGGCAACAGTTCCGCGCGTTGATGCAAAAAGAGCTGGAGTTCAACAAAATCGGCATGATCGAAA CCAGCGCGGTGCTGGCGGGCTTCACTTGTACGGTGGTTAGCGCCCATTTCTGGCCGCTGGCGATGACCGCGATCC TCGGTTATCTGGTCAATAGTGCGGTGAGAACGCTGCTGTTTGGCTACTTTGGCCGCAAAATTTATCGCCCCGGTC TGCATTTCTCGCTGGCGTCGGTGGCACCGAACTTACGCTTTGGTGCCTGGCTGACGGCGGACAGCATCATCAACT ATCTCAATACCAACCTTTCAACGCTCGTGCTGGCGCGTATTCTCGGCGCGGGCGTGGCAGGGGGATACAACCTGG CGTACAACGTGGCCGTTGTGCCACCGATGAAGCTGAACCCAATCATCACCCGCGTGTTGTTTCCGGCATTCGCCA AAATTCAGGACGATACCGAAAAGCTGCGTGTTAACTTCTACAAGCTGCTGTCGGTAGTGGGGATTATCAACTTTC 
CGGCGCTGCTCGGGCTAATGGTGGTGTCGAATAACTTTGTACCGCTGGTCTTTGGTGAGAAGTGGAACAGCATTA TTCCGGTGCTGCAATTGCTGTGTGTGGTGGGTCTGCTGCGCTCCG
(62) Third, the magnitude of the cytoplasmic GDP-fucose pool was enhanced by the introduction of a null mutation into the lon gene. Lon is an ATP-dependent intracellular protease that is responsible for degrading RcsA, which is a positive transcriptional regulator of colanic acid biosynthesis in E. coli (Gottesman, S. & Stout, V. Mol Microbiol 5, 1599-1606 (1991)). In a lon null background, RcsA is stabilized, RcsA levels increase, the genes responsible for GDP-fucose synthesis in E. coli are up-regulated, and intracellular GDP-fucose concentrations are enhanced. The lon gene was almost entirely deleted and replaced by an inserted functional, wild-type, but promoter-less E. coli lacZ.sup.+ gene (Δlon::(kan, lacZ.sup.+)). λ Red recombineering was used to perform the construction. A schematic of the kan, lacZ.sup.+ insertion into the lon locus is shown in
(63) Genomic DNA sequence surrounding the lacZ+ insertion into the lon region in the E. coli strain is set forth below (SEQ ID NO: 290):
(64) TABLE-US-00006 GTGGATGGAAGAGGTGGAAAAAGTGGTTATGGAGGAGTGGGTAATTGATGGTGAAAGGAAAGGGTTGGTGATTTA TGGGAAGGGGGAAGGGGAAGAGGGATGTGGTGAATAATTAAGGATTGGGATAGAATTAGTTAAGGAAAAAGGGGG GATTTTATGTGGGGTTTAATTTTTGGTGTATTGTGGGGGTTGAATGTGGGGGAAAGATGGGGATATAGTGAGGTA GATGTTAATAGATGGGGTGAAGGAGAGTGGTGTGATGTGATTAGGTGGGGGAAATTAAAGTAAGAGAGAGGTGTA TGATTGGGGGGATGGGTGGAGGTGGAGTTGGAAGTTGGTATTGTGTAGAAAGTATAGGAAGTTGAGAGGGGTTTT GAAGGTGAGGGTGGGGGAAGGAGTGAGGGGGGAAGGGGTGGTAAAGGAAGGGGAAGAGGTAGAAAGGGAGTGGGG AGAAAGGGTGGTGAGGGGGGATGAATGTGAGGTAGTGGGGTATGTGGAGAAGGGAAAAGGGAAGGGGAAAGAGAA AGGAGGTAGGTTGGAGTGGGGTTAGATGGGGATAGGTAGAGTGGGGGGTTTTATGGAGAGGAAGGGAAGGGGAAT TGGGAGGTGGGGGGGGGTGTGGTAAGGTTGGGAAGGGGTGGAAAGTAAAGTGGATGGGTTTGTTGGGGGGAAGGA TGTGATGGGGGAGGGGATGAAGATGTGATGAAGAGAGAGGATGAGGATGGTTTGGGATGATTGAAGAAGATGGAT TGGAGGGAGGTTGTGGGGGGGGTTGGGTGGAGAGGGTATTGGGGTATGAGTGGGGAGAAGAGAGAATGGGGTGGT GTGATGGGGGGGTGTTGGGGGTGTGAGGGGAGGGGGGGGGGGTTGTTTTTGTGAAGAGGGAGGTGTGGGGTGGGG TGAATGAAGTGGAGGAGGAGGGAGGGGGGGTATGGTGGGTGGGGAGGAGGGGGGTTGGTTGGGGAGGTGTGGTGG AGGTTGTGAGTGAAGGGGGAAGGGAGTGGGTGGTATTGGGGGAAGTGGGGGGGGAGGATGTGGTGTGATGTGAGG TTGGTGGTGGGGAGAAAGTATGGATGATGGGTGATGGAATGGGGGGGGTGGATAGGGTTGATGGGGGTAGGTGGG GATTGGAGGAGGAAGGGAAAGATGGGATGGAGGGAGGAGGTAGTGGGATGGAAGGGGGTGTTGTGGATGAGGATG ATGTGGAGGAAGAGGATGAGGGGGTGGGGGGAGGGGAAGTGTTGGGGAGGGTGAAGGGGGGATGGGGGAGGGGGA GGATGTGGTGGTGAGGGATGGGGATGGGTGGTTGGGGAATATGATGGTGGAAAATGGGGGGTTTTGTGGATTGAT GGAGTGTGGGGGGGTGGGTGTGGGGGAGGGGTATGAGGAGATAGGGTTGGGTAGGGGTGATATTGGTGAAGAGGT TGGGGGGGAATGGGGTGAGGGGTTGGTGGTGGTTTAGGGTATGGGGGGTGGGGATTGGGAGGGGATGGGGTTGTA TGGGGTTGTTGAGGAGTTGTTGTAATAAGGGGATGTTGAAGTTGGTATTGGGAAGTTGGTATTGTGTAGAAAGTA TAGGAAGTTGGAAGGAGGTGGAGGGTAGATAAAGGGGGGGGTTATTTTTGAGAGGAGAGGAAGTGGTAATGGTAG GGAGGGGGGGTGAGGTGGAATTGGGGGGATAGTGAGGGGGTGGAGGAGTGGTGGGGAGGAATGGGGATATGGAAA GGGTGGATATTGAGGGATGTGGGTTGTTGGGGGTGGAGGAGATGGGGATGGGTGGTTTGGATGAGTTGGTGTTGA GTGTAGGGGGTGATGTTGAAGTGGAAGTGGGGGGGGGAGTGGTGTGGGGGATAATTGAATTGGGGGGTGGGGGAG GGGAGAGGGTTTTGGGTGGGGAAGAGGTAGGGGGTATAGATGTTGAGAATGGGAGATGGGAGGGGTGAAAAGAGG 
GGGGAGTAAGGGGGTGGGGATAGTTTTGTTGGGGGGGTAATGGGAGGGAGTTTAGGGGGTGTGGTAGGTGGGGGA GGTGGGAGTTGAGGGGAATGGGGGGGGGATGGGGTGTATGGGTGGGGAGTTGAAGATGAAGGGTAATGGGGATTT GAGGAGTAGGATGAATGGGGTAGGTTTTGGGGGTGATAAATAAGGTTTTGGGGTGATGGTGGGAGGGGTGAGGGG TGGTAATGAGGAGGGGATGAGGAAGTGTATGTGGGGTGGAGTGGAAGAAGGGTGGTTGGGGGTGGTAATGGGGGG GGGGGTTGGAGGGTTGGAGGGAGGGGTTAGGGTGAATGGGGGTGGGTTGAGTTAGGGGAATGTGGTTATGGAGGG GTGGAGGGGTGAAGTGATGGGGGAGGGGGGTGAGGAGTTGTTTTTTATGGGGAATGGAGATGTGTGAAAGAAAGG GTGAGTGGGGGTTAAATTGGGAAGGGTTATTAGGGAGGTGGATGGAAAAATGGATTTGGGTGGTGGTGAGATGGG GGATGGGGTGGGAGGGGGGGGGGAGGGTGAGAGTGAGGTTTTGGGGGAGAGGGGAGTGGTGGGAGGGGGTGATGT GGGGGGGTTGTGAGGATGGGGTGGGGTTGGGTTGGAGTAGGGGTAGTGTGAGGGAGAGTTGGGGGGGGGTGTGGG GGTGGGGTAGTTGAGGGAGTTGAATGAAGTGTTTAGGTTGTGGAGGGAGATGGAGAGGGAGTTGAGGGGTTGGGA GGGGGTTAGGATGGAGGGGGAGGATGGAGTGGAGGAGGTGGTTATGGGTATGAGGGAAGAGGTATTGGGTGGTGA GTTGGATGGTTTGGGGGGATAAAGGGAAGTGGAAAAAGTGGTGGTGGTGTTTTGGTTGGGTGAGGGGTGGATGGG GGGTGGGGTGGGGAAAGAGGAGAGGGTTGATAGAGAAGTGGGGATGGTTGGGGGTATGGGGAAAATGAGGGGGGT AAGGGGAGGAGGGGTTGGGGTTTTGATGATATTTAATGAGGGAGTGATGGAGGGAGTGGGAGAGGAAGGGGGGGT GTAAAGGGGGATAGTGAGGAAAGGGGTGGGAGTATTTAGGGAAAGGGGGAAGAGTGTTAGGGATGGGGTGGGGGT ATTGGGAAAGGATGAGGGGGGGGGTGTGTGGAGGTAGGGAAAGGGATTTTTTGATGGAGGATTTGGGGAGAGGGG GGAAGGGGTGGTGTTGATGGAGGGGGGGGTAGATGGGGGAAATAATATGGGTGGGGGTGGTGTGGGGTGGGGGGG GTTGATAGTGGAGGGGGGGGGAAGGATGGAGAGATTTGATGGAGGGATAGAGGGGGTGGTGATTAGGGGGGTGGG GTGATTGATTGGGGAGGGAGGAGATGATGAGAGTGGGGTGATTAGGATGGGGGTGGAGGATTGGGGTTAGGGGTT GGGTGATGGGGGGTAGGGAGGGGGGATGATGGGTGAGAGGATTGATTGGGAGGATGGGGTGGGTTTGAATATTGG GTTGATGGAGGAGATAGAGGGGGTAGGGGTGGGAGAGGGTGTAGGAGAGGGGATGGTTGGGATAATGGGAAGAGG GGAGGGGGTTAAAGTTGTTGTGGTTGATGAGGAGGATATGGTGGAGGATGGTGTGGTGATGGATGAGGTGAGGAT GGAGAGGATGATGGTGGTGAGGGTTAAGGGGTGGAATGAGGAAGGGGTTGGGGTTGAGGAGGAGGAGAGGATTTT GAATGGGGAGGTGGGGGAAAGGGAGATGGGAGGGTTGTGGTTGAATGAGGGTGGGGTGGGGGGTGTGGAGTTGAA GGAGGGGAGGATAGAGATTGGGGATTTGGGGGGTGGAGAGTTTGGGGTTTTGGAGGTTGAGAGGTAGTGTGAGGG GATGGGGATAAGGAGGAGGGTGATGGATAATTTGAGGGGGGAAAGGGGGGGTGGGGGTGGGGAGGTGGGTTTGAG 
GGTGGGATAAAGAAAGTGTTAGGGGTAGGTAGTGAGGGAAGTGGGGGGAGATGTGAAGTTGAGGGTGGAGTAGAG GGGGGGTGAAATGATGATTAAAGGGAGTGGGAAGATGGAAATGGGTGATTTGTGTAGTGGGTTTATGGAGGAAGG AGAGGTGAGGGAAAATGGGGGTGATGGGGGAGATATGGTGATGTTGGAGATAAGTGGGGTGAGTGGAGGGGAGGA GGATGAGGGGGAGGGGGTTTTGTGGGGGGGGTAAAAATGGGGTGAGGTGAAATTGAGAGGGGAAAGGAGTGTGGT GGGGGTAAGGGAGGGAGGGGGGGTTGGAGGAGAGATGAAAGGGGGAGTTAAGGGGATGAAAAATAATTGGGGTGT GGGGTTGGTGTAGGGAGGTTTGATGAAGATTAAATGTGAGGGAGTAAGAAGGGGTGGGATTGTGGGTGGGAAGAA AGGGGGGATTGAGGGTAATGGGATAGGTGAGGTTGGTGTAGATGGGGGGATGGTAAGGGTGGATGTGGGAGTTTG AGGGGAGGAGGAGAGTATGGGGGTGAGGAAGATGGGAGGGAGGGAGGTTTGGGGGAGGGGTTGTGGTGGGGGAAA GGAGGGAAAGGGGGATTGGGGATTGAGGGTGGGGAAGTGTTGGGAAGGGGGATGGGTGGGGGGGTGTTGGGTATT AGGGGAGGTGGGGAAAGGGGGATGTGGTGGAAGGGGATTAAGTTGGGTAAGGGGAGGGTTTTGGGAGTGAGGAGG TTGTAAAAGGAGGGGGAGTGAATGGGTAATGATGGTGATAGTAGGTTTGGTGAGGTTGTGAGTGGAAAATAGTGA GGTGGGGGAAAATGGAGTAATAAAAAGAGGGGTGGGAGGGTAATTGGGGGTTGGGAGGGTTTTTTTGTGTGGGTA AGTTAGATGGGGGATGGGGGTTGGGGTTATTAAGGGGTGTTGTAAGGGGATGGGTGGGGTGATATAAGTGGTGGG GGTTGGTAGGTTGAAGGATTGAAGTGGGATATAAATTATAAAGAGGAAGAGAAGAGTGAATAAATGTGAATTGAT GGAGAAGATTGGTGGAGGGGGTGATATGTGTAAAGGTGGGGGTGGGGGTGGGTTAGATGGTATTATTGGTTGGGT AAGTGAATGTGTGAAAGAAGG
(65) Fourth, a thyA (thymidylate synthase) mutation was introduced into the strain by P1 transduction. In the absence of exogenous thymidine, thyA strains are unable to make DNA and die. The defect can be complemented in trans by supplying a wild-type thyA gene on a multicopy plasmid (Belfort, M., Maley, G. F., and Maley, F. (1983). Proc Natl Acad Sci USA 80, 1858-1861). This complementation was used here as a means of plasmid maintenance.
(66) An additional modification that is useful for increasing the cytoplasmic pool of free lactose (and hence the final yield of 2′-FL) is the incorporation of a lacA mutation. LacA is a lactose acetyltransferase that is only active when high levels of lactose accumulate in the E. coli cytoplasm. High intracellular osmolarity (e.g., caused by a high intracellular lactose pool) can inhibit bacterial growth, and E. coli has evolved a mechanism for protecting itself from high intracellular osmolarity caused by lactose by “tagging” excess intracellular lactose with an acetyl group using LacA, and then actively expelling the acetyl-lactose from the cell (Danchin, A. Bioessays 31, 769-773 (2009)). Production of acetyl-lactose in E. coli engineered to produce 2′-FL or other human milk oligosaccharides is therefore undesirable: it reduces overall yield. Moreover, acetyl-lactose is a side product that complicates oligosaccharide purification schemes. The incorporation of a lacA mutation resolves these problems. Sub-optimal production of fucosylated oligosaccharides occurs in strains lacking either or both of the mutations in the colanic acid pathway and the lon protease. Diversion of lactose into a side product (acetyl-lactose) occurs in strains that do not contain the lacA mutation. A schematic of the lacA deletion and corresponding genomic sequence is provided above (SEQ ID NO: 288).
(67) The strain used to test the different α(1,2) FT candidates incorporates all the above genetic modifications and has the following genotype: ΔampC::P.sub.trp.sup.BcI, Δ(lacI-lacZ)::FRT, P.sub.lacIqlacY.sup.+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ.sup.+), ΔlacA.
(68) The E. coli strains harboring the different α(1,2) FT candidate expression plasmids were analyzed. Strains were grown in selective media (lacking thymidine) to early exponential phase. Lactose was then added to a final concentration of 0.5%, and tryptophan (200 μM) was added to induce expression of each candidate α(1,2) FT from the P.sub.L promoter. At the end of the induction period (˜24 h), equivalent OD.sub.600 units of each strain and the corresponding culture supernatants were harvested. Lysates were prepared and analyzed for the presence of 2′-FL by thin layer chromatography (TLC).
(69) A map of plasmid pG217 is shown in
(70) TABLE-US-00007 TCTAGAATTCTAAAAATTGATTGAATGTATGCAAATAAATGCATACACCATAGGTGTGGTTTAATTTGATGCCCT TTTTCAGGGCTGGAATGTGTAAGAGCGGGGTTATTTATGCTGTTGTTTTTTTGTTACTCGGGAAGGGCTTTACCT CTTCCGCATAAACGCTTCCATCAGCGTTTATAGTTAAAAAAATCTTTCGGAACTGGTTTTGCGCTTACCCCAACC AACAGGGGATTTGCTGCTTTCCATTGAGCCTGTTTCTCTGCGCGACGTTCGCGGCGGCGTGTTTGTGCATCCATC TGGATTCTCCTGTCAGTTAGCTTTGGTGGTGTGTGGCAGTTGTAGTCCTGAACGAAAACCCCCCGCGATTGGCAC ATTGGCAGCTAATCCGGAATCGCACTTACGGCCAATGCTTCGTTTCGTATCACACACCCCAAAGCCTTCTGCTTT GAATGCTGCCCTTCTTCAGGGCTTAATTTTTAAGAGCGTCACCTTCATGGTGGTCAGTGCGTCCTGCTGATGTGC TCAGTATCACCGCCAGTGGTATTTATGTCAACACCGCCAGAGATAATTTATCACCGCAGATGGTTATCTGTATGT TTTTTATATGAATTTATTTTTTGCAGGGGGGCATTGTTTGGTAGGTGAGAGATCAATTCTGCATTAATGAATCGG CCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGT CGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCACAGAATCAGGGGATAACGC AGGAAAGAACATGTGAGCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCCA TAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATA AAGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACCT GTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAGGT CGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCG TCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAGGATTAGCAGAGCGAG GTATGTAGGCGGTGCTACAGAGTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGTATTTGGTAT CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGG TAGCGGTGGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCTT TTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGAT CTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGA CAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATAGTTGCCTGACT CCCCGTCGTGTAGATAACTACGATACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCC ACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGGGCCGAGCGCAGAAGTGGTCCTGCAAC TTTATCCGCCTCCATCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGCG 
CAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTC CCAACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCGT TGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGTCATGCC ATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAG TTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAAAAGTGCTCATCATTGGAAA ACGTTCTTCGGGGCGAAAACTCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCACC CAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAA AAAGGGAATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATCA GGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACATT TCCCCGAAAAGTGCCACCTGACGTCTAAGAAACCATTATTATCATGACATTAACCTATAAAAATAGGCGTATCAC GAGGCCCTTTCGTCTCGCGCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCCGGAGACGGTCAC AGCTTGTCTGTAAGCGGATGCCGGGAGCAGACAAGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGG CTGGCTTAACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATATGCGGTGTGAAATACCGCACAGA TGCGTAAGGAGAAAATACCGCATCAGGCGCCTCCTCAACCTGTATATTCGTAAACCACGCCCAATGGGAGCTGTC TCAGGTTTGTTCCTGATTGGTTACGGCGCGTTTCGCATCATTGTTGAGTTTTTCCGCCAGCCCGACGCGCAGTTT ACCGGTGCCTGGGTGCAGTACATCAGCATGGGGCAAATTCTTTCCATCCCGATGATTGTCGCGGGTGTGATCATG ATGGTCTGGGCATATCGTCGCAGCCCACAGCAACACGTTTCCTGAGGAACCATGAAACAGTATTTAGAACTGATG CAAAAAGTGCTCGACGAAGGCACACAGAAAAACGACCGTACCGGAACCGGAACGCTTTCCATTTTTGGTCATCAG ATGCGTTTTAACCTGCAAGATGGATTCCCGCTGGTGACAACTAAACGTTGCCACCTGCGTTCCATCATCCATGAA CTGCTGTGGTTTCTGCAGGGCGACACTAACATTGCTTATCTACACGAAAACAATGTCACCATCTGGGACGAATGG GCCGATGAAAACGGCGACCTCGGGCCAGTGTATGGTAAACAGTGGCGCGCCTGGCCAACGCCAGATGGTCGTCAT ATTGACCAGATCACTACGGTACTGAACCAGCTGAAAAACGACCCGGATTCGCGCCGCATTATTGTTTCAGCGTGG AACGTAGGCGAACTGGATAAAATGGCGCTGGCACCGTGCCATGCATTCTTCCAGTTCTATGTGGCAGACGGCAAA CTCTCTTGCCAGCTTTATCAGCGCTCCTGTGACGTCTTCCTCGGCCTGCCGTTCAACATTGCCAGCTACGCGTTA TTGGTGCATATGATGGCGCAGCAGTGCGATCTGGAAGTGGGTGATTTTGTCTGGACCGGTGGCGACACGCATCTG TACAGCAACCATATGGATCAAACTCATCTGCAATTAAGCCGCGAACCGCGTCCGCTGCCGAAGTTGATTATCAAA 
CGTAAACCCGAATCCATCTTCGACTACCGTTTCGAAGACTTTGAGATTGAAGGCTACGATCCGCATCCGGGCATT AAAGCGCCGGTGGCTATCTAATTACGAAACATCCTGCCAGAGCCGACGCCAGTGTGCGTCGGTTTTTTTACCCTC CGTTAAATTCTTCGAGACGCCTTCCCGAAGGCGCCATTCGCCATTCAGGCTGCGCAACTGTTGGGAAGGGCGATC GGTGCGGGCCTCTTCGCTATTACGCCAGCTGGCGAAAGGGGGATGTGCTGCAAGGCGATTAAGTTGGGTAACGCC AGGGTTTTCCCAGTCACGACGTTGTAAAACGACGGCCAGTGCCAAGCTTTCTTTAATGAAGCAGGGCATCAGGAC GGTATCTTTGTGGAGAAAGCAGAGTAATCTTATTCAGCCTGACTGGTGGGAAACCACCAGTCAGAATGTGTTAGC GCATGTTGACAAAAATACCATTAGTCACATTATCCGTCAGTCGGACGACATGGTAGATAACCTGTTTATTATGCG TTTTGATCTTACGTTTAATATTACCTTTATGCGATGAAACGGTCTTGGCTTTGATATTCATTTGGTCAGAGATTT GAATGGTTCCCTGACCTGCCATCCACATTCGCAACATACTCGATTCGGTTCGGCTCAATGATAACGTCGGCATAT TTAAAAACGAGGTTATCGTTGTCTCTTTTTTCAGAATATCGCCAAGGATATCGTCGAGAGATTCCGGTTTAATCG ATTTAGAACTGATCAATAAATTTTTTCTGACCAATAGATATTCATCAAAATGAACATTGGCAATTGCCATAAAAA CGATAAATAACGTATTGGGATGTTGATTAATGATGAGCTTGATACGCTGACTGTTAGAAGCATCGTGGATGAAAC AGTCCTCATTAATAAACACCACTGAAGGGCGCTGTGAATCACAAGCTATGGCAAGGTCATCAACGGTTTCAATGT CGTTGATTTCTCTTTTTTTAACCCCTCTACTCAACAGATACCCGGTTAAACCTAGTCGGGTGTAACTACATAAAT CCATAATAATCGTTGACATGGCATACCCTCACTCAATGCGTAACGATAATTCCCCTTACCTGAATATTTCATCAT GACTAAACGGAACAACATGGGTCACCTAATGCGCCACTCTCGCGATTTTTCAGGCGGACTTACTATCCCGTAAAG TGTTGTATAATTTGCCTGGAATTGTCTTAAAGTAAAGTAAATGTTGCGATATGTGAGTGAGCTTAAAACAAATAT TTCGCTGCAGGAGTATCCTGGAAGATGTTCGTAGAAGCTTACTGCTCACAAGAAAAAAGGCACGTCATCTGACGT GCCTTTTTTATTTGTACTACCCTGTACGATTACTGCAGCTCGAGTTAGGATACCGGCACTTTGATCCAACCAGTC GGGTAGATATCCGGTGCTTCGGAGTGCTGGAACCAACGGCTCGGCACAATAACAGTCTTATCCATATTAGGGTTC AGCCAGGCACCCCACCAAGAAAACGTGCTGTTACAAATGATGTGATGTTTGCAATGAGACATCAGCATCATATCC TGCCAGGAGTCTTCATCAGTGTTCCAGTCAATATAAACCGCATTCTGCAGTGGCAGATTTTCTTTAACCCACGCG ATATCGTCGGAGAAGATATAGTAAGATGGGCTAGCAACACGACGGGACATTTCCGCGATAGCATTCTGGTAATAC GGCAGCTGGCACACGGAACCGGTAGTAGCCCAGTGTTTCGGCTGCAGATAGTCACCACGACGAATGTGCAGGGAA ACCGCGTTTTCATCTTTGTCCAGGATTTCCAGCATGTTCAGGCTGCGGGAATTTGCTTTGTTCTTATCAAAGGTG AAGGATTCACGCACTTCGTCTTTGATATCAGCGAAGAAACGCTCGCTCTGATAGAAACCTTTAAAGTACAGCAGC 
GGCCAGAAATACTTCTTCTCGAACGCACGCAGAGAGTTCGGCGCCTGCTTGCGTTCGTAGATTTTTTTAAAAAAC AGGAATTCGATAACTTTTTTCAGCGGTTGGTTGATGCAGAATTCGGTGTGCGGCAGGTTGAACACGCGGTGCATT TCGTAACCGTAATGGACTTTGTAATGCATCATGTCGCTCAGGTCGATACGGACCTTCGGGTAATACTTTTTCATA CGCAGATAGAAAGCATAGATAAACATCTGGTTGCCCAGACCGCCAGTCACTTTGATCAGACGCATTATATCTCCT TCTTG
(71) Fucosylated oligosaccharides produced by metabolically engineered E. coli cells are purified from culture broth post-fermentation. An exemplary procedure comprises five steps. (1) Clarification: Fermentation broth is harvested and cells are removed by sedimentation in a preparative centrifuge at 6000×g for 30 min. Each bioreactor run yields about 5-7 L of partially clarified supernatant. (2) Product capture on coarse carbon: A column packed with coarse carbon (Calgon 12×40 TR) of ˜4000 ml volume (dimension 5 cm diameter×60 cm length) is equilibrated with 1 column volume (CV) of water and loaded with clarified culture supernatant at a flow rate of 40 ml/min. This column has a total capacity of about 120 g of sugar. Following loading and sugar capture, the column is washed with 1.5 CV of water, then eluted with 2.5 CV of 50% ethanol or 25% isopropanol (lower concentrations of ethanol at this step (25-30%) may be sufficient for product elution). This solvent elution step releases about 95% of the total bound sugars on the column and a small portion of the color bodies. In this first step, capture of the maximal amount of sugar is the primary objective. Resolution of contaminants is not an objective. (3) Evaporation: A volume of 2.5 L of ethanol or isopropanol eluate from the capture column is rotary-evaporated at 56° C. and a sugar syrup in water is generated. Alternative methods that could be used for this step include lyophilization or spray-drying. (4) Flash chromatography on fine carbon and ion exchange media: A column (GE Healthcare HiScale50/40, 5×40 cm, max pressure 20 bar) connected to a Biotage Isolera One FLASH Chromatography System is packed with 750 ml of a Darco Activated Carbon G60 (100-mesh): Celite 535 (coarse) 1:1 mixture (both column packings were obtained from Sigma). 
The column is equilibrated with 5 CV of water and loaded with sugar from step 3 (10-50 g, depending on the ratio of 2′-FL to contaminating lactose), using either a celite loading cartridge or direct injection. The column is connected to an evaporative light scattering (ELSD) detector to detect peaks of eluting sugars during the chromatography. A four-step gradient of isopropanol, ethanol or methanol is run in order to separate 2′-FL from monosaccharides (if present), lactose and color bodies. Fractions corresponding to sugar peaks are collected automatically in 120-ml bottles, pooled and directed to step 5. In certain purification runs from longer-than-normal fermentations, passage of the 2′-FL-containing fraction through anion-exchange and cation-exchange columns can remove excess protein/DNA/caramel body contaminants. A resin tested successfully for this purpose is Dowex 22.
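The column-volume arithmetic in the purification procedure can be sketched as follows. This is a minimal illustration, assuming the columns are plain cylinders with the stated dimensions; the function names are hypothetical and not part of the described procedure. The geometric volume it computes is only a cross-check against the approximate bed volumes quoted in the text, which may have been measured differently.

```python
import math


def bed_volume_ml(diameter_cm: float, length_cm: float) -> float:
    """Empty-cylinder volume of a column in ml (1 cm^3 equals 1 ml)."""
    radius_cm = diameter_cm / 2.0
    return math.pi * radius_cm ** 2 * length_cm


def capture_step_volumes_ml(cv_ml: float) -> dict:
    """Liquid volumes for the capture column, using the CV multiples in the text."""
    return {
        "equilibration_water": 1.0 * cv_ml,  # 1 CV of water
        "wash_water": 1.5 * cv_ml,           # 1.5 CV of water
        "solvent_elution": 2.5 * cv_ml,      # 2.5 CV of ethanol or isopropanol
    }


def loading_time_min(load_volume_ml: float, flow_ml_per_min: float = 40.0) -> float:
    """Time needed to load a supernatant volume at the stated 40 ml/min."""
    return load_volume_ml / flow_ml_per_min


capture_cv = bed_volume_ml(5.0, 60.0)  # capture column: 5 cm diameter x 60 cm
flash_cv = bed_volume_ml(5.0, 40.0)    # flash column housing: 5 cm x 40 cm

print(f"capture column geometric volume: {capture_cv:.0f} ml")
print(f"flash column geometric volume:   {flash_cv:.0f} ml")
print(capture_step_volumes_ml(capture_cv))
print(f"loading 5-7 L takes {loading_time_min(5000):.0f} to "
      f"{loading_time_min(7000):.0f} min")
```

For a 5 cm × 60 cm cylinder the geometric volume is about 1.18 L, and loading 5-7 L of supernatant at 40 ml/min takes roughly two to three hours.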
(72) The gene screening approach described herein was successfully utilized to identify new α(1,2) FTs for the efficient biosynthesis of 2′-FL in metabolically engineered E. coli host strains. The results of the screen are summarized in Table 1.
(73) Production Host Strains
(74) E. coli K-12 is a well-studied bacterium which has been the subject of extensive research in microbial physiology and genetics and has been commercially exploited for a variety of industrial uses. The natural habitat of the parent species, E. coli, is the large bowel of mammals. E. coli K-12 has a history of safe use, and its derivatives are used in a large number of industrial applications, including the production of chemicals and drugs for human administration and consumption. E. coli K-12 was originally isolated from a convalescent diphtheria patient in 1922. Because it lacks virulence characteristics, grows readily on common laboratory media, and has been used extensively for microbial physiology and genetics research, it has become the standard bacteriological strain used in microbiological research, teaching, and production of products for industry and medicine. E. coli K-12 is now considered an enfeebled organism as a result of being maintained in the laboratory environment for over 70 years. As a result, K-12 strains are unable to colonize the intestines of humans and other animals under normal conditions. Additional information on this well known strain is available at http://epa.gov/oppt/biotech/pubs/fra/fra004.htm. In addition to E. coli K-12, other bacterial strains are used as production host strains; for example, a variety of bacterial species may be used in the oligosaccharide biosynthesis methods, e.g., Erwinia herbicola (Pantoea agglomerans), Citrobacter freundii, Pantoea citrea, Pectobacterium carotovorum, or Xanthomonas campestris. Bacteria of the genus Bacillus may also be used, including Bacillus subtilis, Bacillus licheniformis, Bacillus coagulans, Bacillus thermophilus, Bacillus laterosporus, Bacillus megaterium, Bacillus mycoides, Bacillus pumilus, Bacillus lentus, Bacillus cereus, and Bacillus circulans. 
Similarly, bacteria of the genera Lactobacillus and Lactococcus may be modified using the methods of this invention, including but not limited to Lactobacillus acidophilus, Lactobacillus salivarius, Lactobacillus plantarum, Lactobacillus helveticus, Lactobacillus delbrueckii, Lactobacillus rhamnosus, Lactobacillus bulgaricus, Lactobacillus crispatus, Lactobacillus gasseri, Lactobacillus casei, Lactobacillus reuteri, Lactobacillus jensenii, and Lactococcus lactis. Streptococcus thermophilus and Propionibacterium freudenreichii are also suitable bacterial species for the invention described herein. Also included as part of this invention are strains, modified as described here, from the genera Enterococcus (e.g., Enterococcus faecium and Enterococcus thermophilus), Bifidobacterium (e.g., Bifidobacterium longum, Bifidobacterium infantis, and Bifidobacterium bifidum), Sporolactobacillus spp., Micromonospora spp., Micrococcus spp., Rhodococcus spp., and Pseudomonas (e.g., Pseudomonas fluorescens and Pseudomonas aeruginosa).
(75) Suitable host strains are amenable to genetic manipulation, e.g., they maintain expression constructs, accumulate precursors of the desired end product (e.g., they maintain pools of lactose and GDP-fucose), and accumulate end product, e.g., 2′-FL. Such strains grow well on defined minimal media that contain simple salts and generally a single carbon source. The strains engineered as described above to produce the desired fucosylated oligosaccharide(s) are grown in a minimal medium. An exemplary minimal medium used in a bioreactor, minimal “FERM” medium, is detailed below.
(76) Ferm (10 liters): Minimal medium comprising:
(77) 40 g (NH.sub.4).sub.2HPO.sub.4
(78) 100 g KH.sub.2PO.sub.4
(79) 10 g MgSO.sub.4.7H.sub.2O
(80) 40 g NaOH
(81) 1× Trace elements:
(82) 1.3 g NTA (nitrilotriacetic acid)
(83) 0.5 g FeSO.sub.4.7H.sub.2O
(84) 0.09 g MnCl.sub.2.4H.sub.2O
(85) 0.09 g ZnSO.sub.4.7H.sub.2O
(86) 0.01 g CoCl.sub.2.6H.sub.2O
(87) 0.01 g CuCl.sub.2.2H.sub.2O
(88) 0.02 g H.sub.3BO.sub.3
(89) 0.01 g Na.sub.2MoO.sub.4.2H.sub.2O (pH 6.8)
(90) Water to 10 liters
(91) DF204 antifoam (0.1 ml/L)
(92) 150 g glycerol (initial batch growth), followed by fed batch mode with a 90% glycerol-1% MgSO.sub.4-1× trace elements feed, at various rates for various times.
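The recipe above is stated for a 10-liter batch, so scaling it to another working volume is simple proportion. The sketch below assumes that every listed mass, including the trace elements, is the amount for the full 10 liters; the text does not say so explicitly for the trace elements, so that reading is an assumption. The dictionary keys and function names are illustrative only.

```python
# Masses in grams per 10-liter batch, as listed in the recipe above.
# Assumption: the trace-element masses are also per 10 liters.
FERM_GRAMS_PER_10_L = {
    "(NH4)2HPO4": 40.0,
    "KH2PO4": 100.0,
    "MgSO4.7H2O": 10.0,
    "NaOH": 40.0,
    "NTA": 1.3,
    "FeSO4.7H2O": 0.5,
    "MnCl2.4H2O": 0.09,
    "ZnSO4.7H2O": 0.09,
    "CoCl2.6H2O": 0.01,
    "CuCl2.2H2O": 0.01,
    "H3BO3": 0.02,
    "Na2MoO4.2H2O": 0.01,
    "glycerol (initial batch)": 150.0,
}

ANTIFOAM_ML_PER_L = 0.1  # DF204 antifoam, stated per liter


def scale_recipe(volume_l: float) -> dict:
    """Return grams of each component for a batch of the given volume."""
    return {name: grams * volume_l / 10.0
            for name, grams in FERM_GRAMS_PER_10_L.items()}


def antifoam_ml(volume_l: float) -> float:
    """Return ml of antifoam for a batch of the given volume."""
    return ANTIFOAM_ML_PER_L * volume_l


for name, grams in scale_recipe(2.0).items():
    print(f"{name:26s} {grams:8.3f} g")
print(f"{'DF204 antifoam':26s} {antifoam_ml(2.0):8.3f} ml")
```

The fed-batch feed (90% glycerol, 1% MgSO.sub.4, 1× trace elements) is not included because its rates and durations are not specified.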
(93) A suitable production host strain is one that is not the same bacterial strain as the source bacterial strain from which the fucosyltransferase-encoding nucleic acid sequence was identified.
(94) Bacteria comprising the characteristics described herein are cultured in the presence of lactose, and a fucosylated oligosaccharide is retrieved, either from the bacterium itself or from a culture supernatant of the bacterium. The fucosylated oligosaccharide is purified for use in therapeutic or nutritional products, or the bacteria are used directly in such products.
EXAMPLES
Example 1: Identification of Novel α(1,2) Fucosyltransferases
(95) To identify additional novel α(1,2)fucosyltransferases, a multiple sequence alignment query was generated using the alignment algorithm of the CLCbio Main Workbench package, version 6.9 (CLCbio, 10 Rogers Street #101, Cambridge, Mass. 02142, USA) from four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences: H. pylori futC (SEQ ID NO: 1), H. mustelae FutL (SEQ ID NO: 2), Bacteroides vulgatus futN (SEQ ID NO: 3), and E. coli O126 wbgL (SEQ ID NO: 4). This sequence alignment and the percentages of sequence identity between the four previously identified lactose-utilizing α(1,2)fucosyltransferase protein sequences are shown in
(96) A portion of the initial position-specific scoring matrix file used is shown below:
(97) TABLE-US-00008 Last position-specific scoring matrix computed
      A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V
 1 M −1 −1 −2 −3 −2  0 −2 −3 −2  1  2 −1  6  0 −3 −2 −1 −2 −1  1
 2 A  2 −2  0  4 −2 −1  1 −1 −1 −2 −3 −1 −2 −3 −1  1 −1 −3 −3 −1
 3 F −2 −3 −3 −4 −3 −3 −3 −3 −1  0  0 −3  0  7 −4 −3 −2  1  1 −1
 4 K  0  3  0 −3 −2  1  0 −1 −1 −3 −3  3 −2 −3 −1  2  0 −3 −2 −2
 5 V −1 −3 −3 −4 −1 −3 −3 −4 −3  4  2 −3  1  0 −3 −2 −1 −3 −1  3
 6 V −1 −3 −3 −3 −1 −3 −3 −4 −3  4  1 −3  1 −1 −3 −2  0 −3 −1  3
 7 Q −1  4  0 −1 −3  4  1 −2  0 −3 −2  3 −1 −3 −2  0 −1 −3 −2 −3
 8 I −1 −3 −3 −4 −1 −2 −3 −4 −3  3  2 −3  1  0 −3 −2 −1 −3 −1  3
 9 C −1 −1  0 −1  5  3  6 −2  4 −2 −2  0 −1 −2 −2  0  2 −2 −1 −1
10 G  0 −3 −1 −1 −3 −2 −2  6 −2 −4 −4 −2 −3 −3 −2  0 −2 −3 −3 −3
11 G  0 −3 −1 −1 −3 −2 −2  6 −2 −4 −4 −2 −3 −3 −2  0 −2 −3 −3 −3
12 L −2 −2 −4 −4 −1 −2 −3 −4 −3  2  4 −3  2  0 −3 −3 −1 −2 −1  1
13 G  0 −3 −1 −1 −3 −2 −2  6 −2 −4 −4 −2 −3 −3 −2  0 −2 −3 −3 −3
14 N −2 −1  6  1 −3  0  0 −1  1 −4 −4  0 −2 −3 −2  1  0 −4 −2 −3
15 Q −1  1  0  0 −3  6  2 −2  0 −3 −2  1 −1 −3 −1  0 −1 −2 −2 −2
16 N −1 −2 −3 −4 −2 −1 −2 −3 −2  1  3 −2  5  0 −3 −2 −1 −2 −1  1
17 F −2 −3 −3 −4 −3 −3 −4 −3 −1  0  0 −3  0  7 −4 −3 −2  1  3 −1
18 Q −1  0 −1 −1 −3  5  1 −2  0  1 −1  1  0 −2 −2 −1 −1 −2 −2  0
19 Y −2 −2 −3 −3 −3 −2 −3 −3  1 −1 −1 −2 −1  5 −3 −2 −2  2  6 −1
20 A  4 −1 −1 −1 −1 −1 −1  0 −2 −2 −2 −1 −1 −2 −1  2  0 −3 −2 −1
21 F −2 −3 −3 −4 −3 −3 −4 −3 −1  0  0 −3  0  2 −4 −3 −2  1  3 −1
22 A  3 −2 −1 −2 −1 −1 −1  4 −2 −2 −2 −1 −2 −3 −1  1 −1 −3 −2 −1
23 K −1  0 −1 −2 −3  0 −1 −3 −1 −2 −2  3 −1  2 −2 −1 −1  1  5 −2
24 S  2 −1 −1 −2 −1 −1 −1 −1 −2 −1  1 −1  0 −1 −1  3  0 −3 −2  0
25 L −2  3 −2 −3 −2 −1 −2 −3 −2  1  3  0  1  0 −3 −2 −1 −2 −1  0
26 Q  0  0  0 −1 −2  4  1 −2 −1 −1  0  0  3 −2 −2  2  0 −2 −2 −1
27 K −1  2  0 −1 −3  1  0 −2 −1 −2 −2  4 −1 −3 −1  0  2 −3 −2 −2
28 H −1  0  0 −2 −3  0  0 −2  6  1 −1  2 −1 −1 −2 −1 −1 −3  0  0
29 S −1 −1  3 −1 −2 −1 −1 −2  0 −1  1 −1  0  1 −2  1  0  0  4 −1
30 N −1 −1  4  0 −3 −1 −1  3  0 −3 −3 −1 −2  0 −3  0 −1 −1  4 −3
31 T −1 −2 −1 −2 −2 −1 −2 −2 −2  1 −1 −1 −1 −2  5  0  3 −3 −2  0
32 P −1  0 −2 −1 −3  0 −1 −2 −2 −3 −3  2 −2 −4  7 −1 −1 −4 −3 −3
33 V −1 −3 −3 −4 −1 −2 −3 −4 −3  2  2 −3  1 −1 −3 −2  0 −3 −1  4
34 L −2  3 −2 −3 −2 −1 −2 −3  0  0  2  0  1  1 −3 −2 −1  0  4 −1
35 L −2 −3 −4 −4 −2 −3 −3 −4 −3  3  3 −3  1  3 −3 −3 −1 −1  1  1
36 D −2 −2  1  6 −4  0  1 −2 −1 −4 −4 −1 −3 −4 −2  0 −1 −5 −3 −4
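Each row of the scoring matrix excerpt holds a position number, a residue letter, and twenty integer scores in the column order A R N D C Q E G H I L K M F P S T W Y V. A minimal sketch of reading such rows is given below; the helper names are hypothetical, and the three sample rows are copied from positions 1, 10, and 32 of the excerpt, with the typographic minus sign replaced by an ASCII hyphen.

```python
AMINO_ACIDS = "A R N D C Q E G H I L K M F P S T W Y V".split()


def parse_pssm_row(line: str):
    """Split one ASCII PSSM row into (position, residue, {amino acid: score})."""
    # Patent text uses U+2212 for minus; normalise it before converting to int.
    tokens = line.replace("\u2212", "-").split()
    position = int(tokens[0])
    residue = tokens[1]
    scores = [int(tok) for tok in tokens[2:22]]
    if len(scores) != len(AMINO_ACIDS):
        raise ValueError(f"expected 20 scores at position {position}")
    return position, residue, dict(zip(AMINO_ACIDS, scores))


def top_residue(scores: dict) -> str:
    """Return the amino acid with the highest score in one row."""
    return max(scores, key=scores.get)


ROWS = [
    "1 M -1 -1 -2 -3 -2 0 -2 -3 -2 1 2 -1 6 0 -3 -2 -1 -2 -1 1",
    "10 G 0 -3 -1 -1 -3 -2 -2 6 -2 -4 -4 -2 -3 -3 -2 0 -2 -3 -3 -3",
    "32 P -1 0 -2 -1 -3 0 -1 -2 -2 -3 -3 2 -2 -4 7 -1 -1 -4 -3 -3",
]

parsed = [parse_pssm_row(row) for row in ROWS]
best = [top_residue(scores) for _, _, scores in parsed]
for (position, residue, scores), winner in zip(parsed, best):
    print(position, residue, winner, scores[winner])
```

For these three rows the highest-scoring residue matches the residue listed for the position, which is the expected pattern at strongly conserved positions.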
(98) The command line of PSI-BLAST that was used is as follows: psiblast -db <LOCAL NR database name> -max_target_seqs 2500 -in_msa <MSA file in FASTA format> -out <results output file> -outfmt "7 sskingdoms sscinames scomnames sseqid stitle evalue length pident" -out_pssm <PSSM file output> -out_ascii_pssm <PSSM (ascii) output> -num_iterations 6 -num_threads 8
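When the search is scripted, the same options can be assembled as an argument list so that the quoted output-format string stays a single argument. The sketch below only builds and prints the command; the file and database names passed in are hypothetical placeholders for the values left unspecified in the text.

```python
import shlex

OUTFMT = "7 sskingdoms sscinames scomnames sseqid stitle evalue length pident"


def build_psiblast_command(db, msa_fasta, out_table, out_pssm, out_ascii_pssm):
    """Assemble the PSI-BLAST argument list with the options given in the text."""
    return [
        "psiblast",
        "-db", db,
        "-max_target_seqs", "2500",
        "-in_msa", msa_fasta,
        "-out", out_table,
        "-outfmt", OUTFMT,
        "-out_pssm", out_pssm,
        "-out_ascii_pssm", out_ascii_pssm,
        "-num_iterations", "6",
        "-num_threads", "8",
    ]


# All five names below are made-up placeholders.
cmd = build_psiblast_command(
    db="local_nr",
    msa_fasta="known_alpha12_fts.fasta",
    out_table="psiblast_hits.tsv",
    out_pssm="query.pssm",
    out_ascii_pssm="query_ascii.pssm",
)
print(shlex.join(cmd))
```

Passing `cmd` to `subprocess.run(cmd, check=True)` would execute the search on a machine where BLAST+ and a local database are installed; that step is left out here because it depends on the local setup.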
(99) This PSI-BLAST search resulted in an initial 2515 hits. There were 787 hits with greater than 22% sequence identity to FutC. Of these, 396 hits were greater than 275 amino acids in length. Additional analysis of the hits was performed, including sorting by percentage identity to FutC, comparing the sequences by BLAST to an existing inventory of known α(1,2) fucosyltransferases (to eliminate known lactose-utilizing enzymes and duplicate hits), and manual annotation of hits to identify those originating from bacteria that naturally exist in the gastrointestinal tract. An annotated list of the novel α(1,2) fucosyltransferases identified by this screen is provided in Table 1. Table 1 provides the bacterial species in which the enzyme is found, the GenBank Accession Number, GI Identification Number, amino acid sequence, and % sequence identity to FutC.
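The two numeric cut-offs described above (more than 22% identity, more than 275 amino acids) can be applied to the tabular PSI-BLAST output as sketched below. Several points are assumptions rather than facts from the text: the output is taken to be tab-separated with the eight columns named in the command line, the `pident` column is used as the identity to the query, and the `length` column (which BLAST reports as alignment length) stands in for the length criterion. The sample rows are invented purely to exercise the filter.

```python
COLUMNS = ["sskingdoms", "sscinames", "scomnames", "sseqid",
           "stitle", "evalue", "length", "pident"]


def parse_hits(lines):
    """Parse tab-separated hit rows, skipping '#' comment lines and blanks."""
    hits = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != len(COLUMNS):
            continue  # ignore malformed rows
        hit = dict(zip(COLUMNS, fields))
        hit["evalue"] = float(hit["evalue"])
        hit["length"] = int(hit["length"])
        hit["pident"] = float(hit["pident"])
        hits.append(hit)
    return hits


def filter_hits(hits, min_pident=22.0, min_length=275):
    """Keep hits strictly above both thresholds, sorted by identity (descending)."""
    kept = [h for h in hits
            if h["pident"] > min_pident and h["length"] > min_length]
    return sorted(kept, key=lambda h: h["pident"], reverse=True)


# Invented example rows (not real search results).
SAMPLE = [
    "# PSIBLAST (hypothetical header line)",
    "Bacteria\tExample species A\tN/A\tref|HYPOTHETICAL_1|\texample protein A\t1e-40\t290\t31.5",
    "Bacteria\tExample species B\tN/A\tref|HYPOTHETICAL_2|\texample protein B\t2e-12\t150\t40.0",
    "Bacteria\tExample species C\tN/A\tref|HYPOTHETICAL_3|\texample protein C\t5e-30\t300\t18.2",
]

kept = filter_hits(parse_hits(SAMPLE))
for hit in kept:
    print(hit["sseqid"], hit["pident"], hit["length"])
```

The later steps in the paragraph (comparison against the existing inventory and manual annotation of gut-associated species) involve judgment and are not captured by this filter.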
(100) Multiple sequence alignment of the 4 known α(1,2) FTs used for the PSI-BLAST query and 12 newly identified α(1,2) FTs is shown in
Example 2: Validation of Novel α(1,2) FTs
(101) To test for lactose-utilizing fucosyltransferase activity, the production of fucosylated oligosaccharides (i.e., 2′-FL) is evaluated in a host organism that expresses the candidate enzyme (i.e., syngene) and which contains both cytoplasmic GDP-fucose and lactose pools. The production of fucosylated oligosaccharides indicates that the candidate enzyme-encoding sequence functions as a lactose-utilizing α(1,2)fucosyltransferase. Of the identified hits, 12 novel α(1,2) fucosyltransferases were further analyzed for their functional capacity to produce 2′-fucosyllactose: Prevotella melaninogenica FutO, Clostridium bolteae FutP, Clostridium bolteae+13 FutP, Lachnospiraceae sp. FutQ, Methanosphaerula palustris FutR, Tannerella sp. FutS, Bacteroides caccae FutU, Butyrivibrio FutV, Prevotella sp. FutW, Parabacteroides johnsonii FutX, Akkermansia muciniphila FutY, Salmonella enterica FutZ, and Bacteroides sp. FutZA.
(102) Syngenes were constructed comprising the 12 novel α(1,2) FTs in the configuration as follows: EcoRI-T7g10 RBS-syngene-XhoI.
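The EcoRI-RBS-syngene-XhoI layout can be illustrated with a short string-assembly sketch. Only the two restriction site sequences (EcoRI, GAATTC; XhoI, CTCGAG) are standard facts. The ribosome binding site and the open reading frame below are hypothetical stand-ins, not the T7 gene 10 RBS or any syngene from this document, and the function names are invented for illustration.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
XHOI_SITE = "CTCGAG"   # XhoI recognition sequence

# Hypothetical placeholder, NOT the actual T7 gene 10 RBS sequence.
PLACEHOLDER_RBS = "AAGGAG"


def has_internal_site(orf: str) -> bool:
    """True if the coding sequence itself contains an EcoRI or XhoI site."""
    orf = orf.upper()
    # Both sites are palindromic, so scanning one strand is sufficient.
    return ECORI_SITE in orf or XHOI_SITE in orf


def assemble_syngene(orf: str, rbs: str = PLACEHOLDER_RBS) -> str:
    """Concatenate EcoRI site, RBS, coding sequence and XhoI site, in that order."""
    orf = orf.upper()
    if has_internal_site(orf):
        raise ValueError("coding sequence contains an EcoRI or XhoI site")
    return ECORI_SITE + rbs + orf + XHOI_SITE


# Toy open reading frame (Met-Ala-Phe-Lys-stop), made up for this example.
toy_orf = "ATGGCTTTCAAATAA"
construct = assemble_syngene(toy_orf)
print(construct)
```

Checking the coding sequence for internal sites matters because a syngene cut internally by either enzyme could not be cloned intact with this layout; a fuller check would also scan the junctions between the parts.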
(103) The candidate α(1,2) FTs (i.e., syngenes) were cloned by standard molecular biological techniques into an exemplary expression plasmid pEC2-(T7)-Fut syngene-rcsA-thyA. This plasmid utilizes the strong leftwards promoter of bacteriophage λ (termed P.sub.L) to direct expression of the candidate genes (Sanger, F. et al. (1982). J Mol Biol 162, 729-773). The promoter is controllable, e.g., a trp-cI construct is stably integrated into the E. coli host's genome (at the ampC locus), and control is implemented by adding tryptophan to the growth media. Gradual induction of protein expression is accomplished using a temperature sensitive cI repressor. Another similar control strategy (temperature independent expression system) has been described (Mieschendahl et al., 1986, Bio/Technology 4:802-808). The plasmid also carries the E. coli rcsA gene to up-regulate synthesis of GDP-fucose, a critical precursor for the synthesis of fucosyl-linked oligosaccharides. In addition, the plasmid carries a β-lactamase (bla) gene for maintaining the plasmid in host strains by ampicillin selection (for convenience in the laboratory) and a native thyA (thymidylate synthase) gene as an alternative means of selection in thyA.sup.− hosts.
(104) The expression constructs were transformed into a host strain useful for the production of 2′-FL. The host strain used to test the different α(1,2) FT candidates incorporates all the genetic modifications described above and has the following genotype: ΔampC::P.sub.trp.sup.BcI, Δ(lacI-lacZ)::FRT, P.sub.lacIqlacY.sup.+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ.sup.+), ΔlacA.
(105) The E. coli strains harboring the different α(1,2) FT candidate expression plasmids were analyzed. Strains were grown in selective media (lacking thymidine) to early exponential phase. Lactose was then added to a final concentration of 0.5%, and tryptophan (200 μM) was added to induce expression of each candidate α(1,2) FT from the P.sub.L promoter. At the end of the induction period (˜24 h), the culture supernatants and cells were harvested. Heat extracts were prepared from whole cells, and the equivalent of 0.2 OD.sub.600 units of each strain was analyzed for the presence of 2′-FL by thin layer chromatography (TLC), along with 2 μl of the corresponding clarified culture supernatant for each strain.
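The induction additions can be converted into weighed amounts for a given culture volume. The sketch below assumes that 0.5% means weight per volume (0.5 g per 100 ml), which the text does not state, and uses the molar mass of L-tryptophan, 204.23 g/mol. Function names are illustrative.

```python
TRYPTOPHAN_G_PER_MOL = 204.23  # molar mass of L-tryptophan


def lactose_grams(culture_ml: float, percent_wv: float = 0.5) -> float:
    """Grams of lactose for a final concentration given in % w/v (assumed)."""
    return culture_ml * percent_wv / 100.0


def tryptophan_mg(culture_ml: float, conc_micromolar: float = 200.0) -> float:
    """Milligrams of tryptophan for a final concentration given in micromolar."""
    moles = conc_micromolar * 1e-6 * (culture_ml / 1000.0)
    return moles * TRYPTOPHAN_G_PER_MOL * 1000.0


for volume_ml in (50.0, 1000.0):
    print(f"{volume_ml:7.0f} ml culture: "
          f"{lactose_grams(volume_ml):.3f} g lactose, "
          f"{tryptophan_mg(volume_ml):.2f} mg tryptophan")
```

In practice both additions would normally be made from sterile concentrated stocks, so these figures indicate the amount of solute delivered rather than a weighing protocol.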
(106)
(107) Table 4 summarizes the fucosyltransferase activity for each candidate syngene as determined by the 2′FL synthesis screen described above. 11 of the 12 candidate α(1,2) FTs were found to have lactose-utilizing fucosyltransferase activity.
(108) TABLE-US-00009 TABLE 4 2′FL synthesis screen results
Syngene                           24 h OD (induced)  2′FL culture medium  2′FL cell extract
Escherichia coli WbgL             9.58               5                    5                  pG204  pEC2-WbgL-rcsA-thyA        E640
Prevotella melaninogenica FutO    12.2               3                    2                  pG393  pEC2-(T7)FutO-rcsA-thyA    E985
Clostridium bolteae FutP          10.4               1                    2                  pG394  pEC2-(T7)FutP-rcsA-thyA    E986
Lachnospiraceae sp. FutQ          10.6               3                    4                  pG395  pEC2-(T7)FutQ-rcsA-thyA    E987
Methanosphaerula palustris FutR   11.9               0                    1                  pG396  pEC2-(T7)FutR-rcsA-thyA    E988
Tannerella sp. FutS               11.3               2                    3                  pG397  pEC2-(T7)FutS-rcsA-thyA    E989
Bacteroides caccae FutU           12.1               0                    2                  pG398  pEC2-(T7)FutU-rcsA-thyA    E990
Butyrivibrio FutV                 11.3               0                    1                  pG399  pEC2-(T7)FutV-rcsA-thyA    E991
Prevotella sp. FutW               10.5               3                    3                  pG400  pEC2-(T7)FutW-rcsA-thyA    E992
Parabacteroides johnsonii FutX    10.7               3                    5                  pG401  pEC2-(T7)FutX-rcsA-thyA    E993
Akkermansia muciniphila FutY      9.1                0                    0                  pG402  pEC2-(T7)FutY-rcsA-thyA    E994
Salmonella enterica FutZ          11.0               0                    3                  pG403  pEC2-(T7)FutZ-rcsA-thyA    E995
Bacteroides sp. FutZA             9.9                3                    3                  pG404  pEC2-(T7)FutZA-rcsA-thyA   E996
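The statement that 11 of the 12 candidates were active can be checked directly against the Table 4 scores. The sketch below treats a candidate as active if its 2′FL score is above zero in either the culture medium or the cell extract; that criterion, and the use of the summed scores as a rough ranking, are assumptions, since the text does not define the scoring scale.

```python
# (enzyme, 24 h OD induced, 2'FL score in culture medium, 2'FL score in cell extract)
TABLE_4 = [
    ("WbgL", 9.58, 5, 5),   # previously known enzyme, included as a reference
    ("FutO", 12.2, 3, 2),
    ("FutP", 10.4, 1, 2),
    ("FutQ", 10.6, 3, 4),
    ("FutR", 11.9, 0, 1),
    ("FutS", 11.3, 2, 3),
    ("FutU", 12.1, 0, 2),
    ("FutV", 11.3, 0, 1),
    ("FutW", 10.5, 3, 3),
    ("FutX", 10.7, 3, 5),
    ("FutY", 9.1, 0, 0),
    ("FutZ", 11.0, 0, 3),
    ("FutZA", 9.9, 3, 3),
]

REFERENCE_ENZYMES = {"WbgL"}

candidates = [row for row in TABLE_4 if row[0] not in REFERENCE_ENZYMES]
active = [name for name, _, medium, extract in candidates
          if medium > 0 or extract > 0]
inactive = [name for name, _, medium, extract in candidates
            if medium == 0 and extract == 0]
# Rough ranking by combined score (an illustrative choice, not from the text).
ranked = sorted(candidates, key=lambda row: row[2] + row[3], reverse=True)

print(f"{len(active)} of {len(candidates)} candidates active")
print("inactive:", inactive)
print("highest combined score:", ranked[0][0])
```

Under this criterion the only inactive candidate is FutY, in agreement with the 11-of-12 figure stated above.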
Example 3: Characterization of Cultures Expressing Novel α(1,2) FTs
(109) Further characterization of bacteria expressing the novel α(1,2) FTs FutO, FutQ, and FutX was performed. Specifically, proliferation rate and exogenous α(1,2) FT expression were examined.
(110) Expression plasmids containing fucosyltransferases WbgL (plasmid pG204), FutN (plasmid pG217), and novel α(1,2) FTs FutO (plasmid pG393), FutQ (plasmid pG395), and FutX (plasmid pG401) were introduced into host bacterial strains. For example, the host strain utilized has the following genotype: ΔampC::P.sub.trp.sup.BcI, Δ(lacI-lacZ)::FRT, P.sub.lacIqlacY.sup.+, ΔwcaJ::FRT, thyA::Tn10, Δlon:(npt3, lacZ.sup.+), ΔlacA.
(111) Bacterial cultures expressing each exogenous fucosyltransferase were induced by addition of tryptophan (to induce expression of the exogenous fucosyltransferases) in the presence of lactose. Growth of the cultures was monitored by spectrophotometric readings at A600 at the following timepoints: 4 hours and 1 hour before induction, at the time of induction (time 0), and 3 hours, 7 hours, and 24 hours after induction. The results are shown in
(112) Protein expression was also assessed for the bacterial cultures expressing each fucosyltransferase after induction. Cultures were induced as described previously, and protein lysates were prepared from the bacterial cultures at the time of induction (0 hours), 3 hours, 7 hours, and 24 hours after induction. The protein lysates were run on an SDS-PAGE gel and stained to examine the distribution of proteins at each time point. As shown in
(113) Finally, additional TLC analysis was performed to assess the efficiency and yield of 2′FL production in bacterial cultures expressing novel α(1,2) FTs FutO, FutQ, and FutX compared to known fucosyltransferases WbgL and FutN. Cultures were induced at 7 hours and 24 hours, and samples were run out on TLC.
Example 4: FutN Exhibits Increased Efficiency for Production of 2′FL
(114) Fucosylated oligosaccharides produced by E. coli cells metabolically engineered to express B. vulgatus FutN were purified from culture broth post-fermentation.
(115) Fermentation broth was harvested and cells were removed by sedimentation in a preparative centrifuge at 6000×g for 30 min. Each bioreactor run yielded about 5-7 L of partially clarified supernatant. A column packed with coarse carbon (Calgon 12×40 TR) of ˜1000 ml volume (dimension 5 cm diameter×60 cm length) was equilibrated with 1 column volume (CV) of water and loaded with clarified culture supernatant at a flow rate of 40 ml/min. This column had a total capacity of about 120 g of sugar. Following loading and sugar capture, the column was washed with 1.5 CV of water, then eluted with 2.5 CV of 50% ethanol or 25% isopropanol (lower concentrations of ethanol at this step (25-30%) may be sufficient for product elution). This solvent elution step released about 95% of the total bound sugars on the column and a small portion of color bodies (caramelized sugars). A volume of 2.5 L of ethanol or isopropanol eluate from the capture column was rotary-evaporated at 56° C. and a sugar syrup in water was generated. A column (GE Healthcare HiScale50/40, 5×40 cm, max pressure 20 bar) connected to a Biotage Isolera One FLASH Chromatography System was packed with 750 ml of a Darco Activated Carbon G60 (100-mesh): Celite 535 (coarse) 1:1 mixture (both column packings were obtained from Sigma). The column was equilibrated with 5 CV of water and loaded with the sugar syrup from the evaporation step (10-50 g, depending on the ratio of 2′-FL to contaminating lactose), using either a celite loading cartridge or direct injection. The column was connected to an evaporative light scattering (ELSD) detector to detect peaks of eluting sugars during the chromatography. A four-step gradient of isopropanol, ethanol or methanol was run in order to separate 2′-FL from monosaccharides (if present), lactose and color bodies. Fractions corresponding to sugar peaks were collected automatically in 120-ml bottles and pooled.
(116) The results from two fermentation runs are shown in
(117) TABLE-US-00010 TABLE 1 Hits from PSI-BLAST multiple sequence alignment query for novel α(1,2)fucosyltransferases % iden- SEQ Accession Gene name tity ID Bacterium names No. GI No. [bacterium] FutC Alias SEQUENCE NO Helicobacter AAD29869.1 4808599 alpha-1,2- 98 FutC MAFKVVQICGGLGNQMFQYAFAKSLQKHSNTPVLLDITSFDWSDRKMQLELFPINLPYASAKEIAIAKMQ 1 pylori fucosyl- HLPKLVRDALKCMGFDRVSQEIVFEYEPELLKPSRLTYFYGYFQDPRYFDAISPLIKQTFTLPPPPENNKNNN transferase KKEEEYHRKLSLILAAKNSVFVHIRRGDYVGIGCQLGIDYQKKALEYMAKRVPNMELFVFCEDLEFTQNLDLG [Helicobacter YPFMDMTTRNKEEEAYWDMLLMQSCQHGIIANSTYSWWAAYLIENPEKIIIGPKHWLFGHENILCKEWVK pylori] IESHFEVKSQKYNA Helicobacter YP_003517185.1 291277413 alpha-1,2- 70.85 FutL MDFKIVQVHGGLGNQMFQYAFAKSLQTHLNIPVLLDTTWFDYGNRELGLHLFPIDLQCASAQQIAAAHM 2 mustelae; fucosyl- QNLPRLVRGALRRMGLGRVSKEIVFEYMPELFEPSRIAYFHGYFQDPRYFEDISPLIKQTFTLPHPTEHAEQY Helicobacter transferase SRKLSQILAAKNSVFVHIRRGDYMRLGWQLDISYQLRAIAYMAKRVQNLELFLFCEDLEFVQNLDLGYPFVD mustelae 12198 [Helicobacter MTTRDGAAHWDMMLMQSCKHGIITNSTYSWWAAYLIKNPEKIIIGPSHWIYGNENILCKDWVKIESQFET mustelae KS 12198] Bacteroides; YP_001300461.1 150005717 glycosyl 24.83 FutN MRLIKVTGGLGNQMFIYAFYLRMKKYYPKVRIDLSDMMHYKVHYGYEMHRVFNLPHTEFCINQPLKKVIEF 3 Bacteroides transferase LFFKKIYERKQAPNSLRAFEKKYFWPLLYFKGFYQSERFFADIKDEVRESFTFDKNKANSRSLNMLEILDKD vulgatus family protein ENAVSLHIRRGDYLQPKHWATTGSVCQLPYYQNAIAEMSRRVASPSYYIFSDDIAWVKENLPLQNAVYIDWN ATCC 8482; [Bacteroides TDEDSWQDMMLMSHCKHHIICNSTFSWWGAWLNPNMDKTVIVPSRWFQHSEAPDIYPTGWIKVPVS Bacteroides vulgatus ATCC sp. 4_3_47FAA; 8482] Bacteroides sp. 
3_1_40A; Bacteroides vulgatus PC510; Bacteroides vulgatus CL09T03C04; Bacteroides vulgatus dnLKV7; Bacteroides vulgatus CAG: 6 Escherichia WP_021554465.1 545259828 protein 23.13 WbgL MSIIRLQGGLGNQLFQFSFGYALSKINGTPLYFDISHYAENDDHGGYRLNNLQIPEEYLQYYTPKINNIYKLLV 4 coli; [Escherichia RGSRLYPDIFLFLGFCNEFHAYGYDFEYIAQKWKSKKYIGYWQSEHFFHKHILDLKEFFIPKNVSEQANLLAAK Escherichia coli] ILESQSSLSIHIRRGDYIKNKTATLTHGVCSLEYYKKALNKIRDLAMIRDVFIFSDDIFWCKENIETLLSKKYN coli IYYSEDLSQEEDLWLMSLANHHIIANSSFSWWGAYLGSSASQIVIYPTPWYDITPKNTYIPIVNHWINVDKHS UMEA 3065-1 SC Helicobacter WP_005219731.1 491361813 predicted 36.79 FutD MGDYKIVELTCGLGNQMFQYAFAKALQKHLQVPVLLDKTWYDTQDNSTQFSLDIFNVDLEYATNTQIEKA 5 bilis; protein KARVSKLPGLLRKMFGLKKHNIAYSQSFDFHDEYLLPNDFTYFSGFFQNAKYLKGLEQELKSIFYYDSNNFSN Helicobacter [Helicobacter FGKQRLELILQAKNSIFIHIRRGDYCKIGWELGMDYYKRAIQYIMDRVEEPKFFIFGATDMSFTEQFQKNLGL bilis bilis] NENNSANLSEKTITQDNQHEDMFLMCYCKHAILANSSYSFWSAYLNNDANNIVIAPTPWLLDNDNIICDD ATCC 43879 WIKISSK Escherichia AAO37698.1 37788088 fucosyl- 25.94 WbsJ MEVKIIGGLGNQMFQYATAFAIAKRTHQNLTVDISDAVKYKTHPLRLVELSCSSEFVKKAWPFEKYLFSEKIP 6 coli transferase HFMKKGMFRKHYVEKSLEYDPDIDTKSINKKIVGYFQTEKYFKEFRHELIKEFQPKTKFNSYQNELLNLIKEND [Escherichia TCSLHIRRGDYVSSKIANETHGTCSEKYFERAIDYLMNKGVINKKTLLFIFSDDIKWCRENIFFNNQICFVQGD coli] AYHVELDMLLMSKCKNNIISNSSFSWWAAWLNENKNKTVIAPSKWFKKDIKHDIIPESWVKL Vibrio BAA33632.1 3721682 probable beta- 25.94 WblA MIVMKISGGLGNQLFQYAVGRAIAIQYGVPLKLDVSAYKNYKLHNGYRLDQFNINADIANEDEIFHLKGSSN 7 cholerae D-galactoside RLSRILRRLGWLKKNTYYAEKQRTIYDVSVFMQAPRYLDGYWQNEQYFSQIRAVLLQELWPNQPLSINAQA 2-alpha-L- HQIKIQQTHAVSIHVRRGDYLNHPEIGVLDIDYYKRAVDYIKEKIEAPVFFVFSNDVAWCKDNFNFIDSPVFI fucosyl EDTQTEIDDLMLMCQCQHNIVANSSFSWWAAWLNSNVDKIVIAPKTWMAENPKGYKWVPDSWREI transferase [Vibrio cholerae] Bacteroides YP_099118.1 53713126 alpha-1,2- 24.58 Bft2 MIVSSLRGGLGNQMFIYAMVKAMALRNNVPFAFNLTTDFANDEVYKRKLLLSYFALDLPENKKLTFDFSYG 8 fragilis; fucosyl- NYYRRLSRNLGCHILHPSYRYICEERPPHFESRLISSKITNAFLEGYWQSEKYFLDYKQEIKEDFVIQKKLEY Bacteroides transferase 
TSYLELEEIKLLDKNAIMIGVRRYQESDVAPGGVLEDDYYKCAMDIMASKVTSPVFFCFSQDLEWVEKHLAGK fragilis [Bacteroides YPVRLISKKEDDSGTIDDMFLMMHFRNYIISNSSFYWWGAWLSKYDDKLVIAPGNFINKDSVPESWFKLNVR NCTC 9343; fragilis Bacteroides YCH46] fragilis YCH46; Bacteroides fragilis HMW 615 Escherichia WP_001592236.1 486356116 protein 24.25 WbgN MSIVVARLAGGLGNQMFQYAKGYAESVERNSYLKLDLRGYKNYTLHGGFRLDKLNIDNTFVMSKKEMCIF 9 coli; [Escherichia PNFIVRAINKFPKLSLCSKRFESEQYSKKINGSMKGSVEFIGFWQNERYFLEHKEKLREIFTPININLDAKE Escherichia coli] LSDVIRCTNSVSVHIRRGDYVSNVEALKIHGLCTERYYIDSIRYLKERFNNLVFFVFSDDIEWCKKYKNEIF coli SRSDDVKFIEGNTQEVDMWLMSNAKYHIIANSSFSWWGAWLKNYDLGITIAPTPWFEREELNSFDPCPEKWV KTE84 RIEK Prevotella YP_003814512.1 302346214 glycosyl- 31.1 FutO MKIVKILGGLGNQMFQYALYLSLQESFPKERVALDLSSFHGYHLHNGFELENIFSVTAQKASAADIMRIAYYY 10 melaninogenica; transferase PNYLLWRIGKRFLPRRKGMCLESSSLRFDESVLRQEGNRYFDGYWQDERYFAAYREKVLKAFTFPAFKRAE Prevotella family 11 NLSLLEKLDENSIALHVRRGDYVGNNLYQGICDLDYYRTAIEKMCAHVTPSLFCIFSNDITWCQQHLQPYLK melaninogenica [Prevotella APVVYVTWNTGVESYRDMQLMSCCAHNIIANSSFSWWGAWLNQNREKVVIAPKKWLNMEECHFTLPA ATCC 25845 melaninogenica SWIKI ATCC 25845] Clostridium WP_002570768.1 488634090 protein 29.86 FutP MFQYALYKAFEQKHIDVYADLAWYKNKSVKFELYNFGIKINVASEKDINRLSDCQADFVSRIRRKIFGKKKSF 11 bolteae; [Clostridium VSEKNDSCYENDILRMDNVYLSGYWQTEKYFSNTREKLLEDYSFALVNSQVSEWEDSIRNKNSVSIHIRRGD Clostridium bolteae] YLQGELYGGICTSLYYAEAIEYIKMRVPNAKFFVFSDDVEWVKQQEDFKGFVIVDRNEYSSALSDMYLMSLC bolteae KHNIIANSSFSWWAAWLNRNEEKIVIAPRRWLNGKCTPDIWCKKWIRI 90A9; Clostridium bolteae 90B3; Clostridium bolteae 90B8 Lachnospiraceae WP_009251343.1 496545268 protein 29.25 FutQ MVIVQLSGGLGNQMFEYALYLSLKAKGKEVKIDDVTCYEGPGTRPRQLDVFGITYDRASREELTEMTDASM 12 bacterium [Lachnospiraceae DALSRVRRKLTGRRTKAYRERDINFDPLVMEKDPALLEGCFQSDKYFRDCEGRVREAYRFRGIESGAFPLPE 3_1_57FAA_CT1 bacterium DYLRLEKQIEDCQSVSVHIRRGDYLDESHGGLYTGICTEAYYKEAFARMERLVPGARFFLFSNDPEWTREHF 3_1_57FAA_CT1] ESKNCVLVEGSTEDTGYMDLYLMSRCRHNIIANSSFSWWGAWLNENPEKKVIAPAKWLNGRECRDIYTER MIRL Methanosphaerula 
YP_002467213.1 219852781 glycosyl 28.52 FutR MIIVRLKGGLGNQLSQYALGRKIAHLHNTELKLDTTWFTTISSDTPRTYRLNNYNIIGTIASAKEIQLIERG 13 palustris; transferase RAQGRGYLLSKISDLLTPMYRRTYVRERMHTFDKAILTVPDNVYLDGYWQTEKYFKDIEEILRREVTLKDEP Methanosphaerula family protein DSINLEMAERIQACHSVSLHVRRGDYVSNPTTQQFHGCCSIDYYNRAISLIEEKVDDPSFFIFSDDLPWAKE palustris E1-9c [Methanosphaerula NLDIPGEKTFVAHNGPEKEYCDLWLMSLCQHHIIANSSFSWWGAWLGQDAEKMVIAPRRWALSESFDTSDII palustris E1- PDSWITI 9c] Tannerella sp. WP_021929367.1 547187521 glycosyl 28.38 FutS MVRIVEIIGGLGNQMFQYAFSLYLKNKSHIWDRLYVDIEAMKTYDRHYGLELEKVFNLSLCPISNRLHRNLQ 14 CAG: 118 transferase KRSFAKHFVKSLYEHSECEFDEPVYRGLRPYRYYRGYWQNEGYFVDIEPMIREAFQFNVNILSKKTKAIASK family 11 MRRELSVSIHVRRGDYENLPEAKAMHGGICSLDYYHKAIDFIRQRLDNNICFYLFSDDINWVEENLQLENRC [Tannerella sp. IIDWNQGEDSWQDMYLMSCCRHHIIANSSFSWWAAWLNPNKNKIVLTPNKWFNHTDAVGIVPKSWIKI CAG: 118] PVF Bacteroides WP_005675707.1 491925845 protein 28.09 FutU MKIVKILGGLGNQMFQYALFLSLKERFPHEQVMIDTSCFRNYPLHNGFEVDRIFAQKAPVASWRNILKVAY 15 caccae; [Bacteroides PYPNYRFWKIGKYILPKRKTMCVERKNFSFDAAVLTRKGDCYYDGYWQHEEYFCDMKETIWEAFSFPEPV Bacteroides caccae] DGRNKEIGALLQASDSASLHVRRGDYVNHPLFRGICDLDYYKRAIHYMEERVNPQLYCVFSNDMAWCESH caccae LRALLPGKEVVYVDWNKGAESYVDMRLMSLCRHNIIANSSFSWWGAWLNRNPQKVVVAPERWMNSPI ATCC43185 EDPVSDKWIKL Butyrivibrio sp. WP_022772718.1 551028636 protein 27.8 FutV MIIIQLKGGLGNQMFQYALYKSLKKRGKEVKIDDKTGFVNDKLRIPVLSRWGVEYDRATDEEIINLTDSKMD 16 AE2015 [Butyrivibrio LFSRIRRKLTGRKTFRIDEESGKFNPEILEKENAYLVGYWQCDKYFDDKDVVREIREAFEKKPQELMTDASS sp. AE2015] WSTLQQIECCESVSLHVRRTDYVDEEHIHIHNICTEKYYKNAIDRVRKQYPSAVFFIFTDDKEWCRDHFKGP NFIWELEEGDGTDIAEMTLMSRCKHHIIANSSFSWWAAWLNDSPEKIVIAPQKWINNRDMDDIYTERMTKIAL Prevotella sp. WP_022481266.1 548264264 uncharacterized 27.4 FutW MRLVKMIGGLGNQMFIYAFYLQMRKRFSNVRIDLTDMMHYNVHYGYELHKVFGLPRTEFCMNQPLKKVL 17 CAG: 891 protein EFLFFRTIVERKQHGRMEPYTCQYVWPLVYFKGFYQSERYFSEVKDEVRECFTFNPALANRSSQQMMEQI [Prevotella sp. 
QNDPQAVSIHIRRGDYLNPKHYDTIGCICQLPYYKHAVSEIKKYVSNPHFYVFSEDLDWVKANLPLENAQYI CAG: 891] DWNKGADSWQDMMLMSCCKHHIICNSTFSWWAAWLNPSVEKTVIMPEQWTSRQDSVDFVASCGRW VRVKTE Parabacteroides WP_008155883.1 495431188 glycosyl 26.69 FutX MRLIKMIGGLGNQMFIYAFYLKMKHHYPDTNIDLSDMVHYKVHNGYEMNRIFDLSQTEFCINRTLKKILEFL 18 johnsonii; transferase FFKKIYERRQDPSTLYPYEKRYFWPLLYFKGFYQSERFFFDIKDDVRKAFSFNLNIANPESLELLKQIEVDD Parabacteroides [Parabacteroides QAVSIHIRRGDYLLPRHWANTGSVCQLPYYKNAIAEMENRITGPSYYVFSDDISWVKENIPLKKAVYVTWNK johnsonii johnsonii] GEDSWQDMMLMSHCRHHIICNSTFSWWGAWLNPRKEKIVIAPCRWFQHKETPDMYPKEWIKVPIN CL02T12C29 Akkermansia YP_001877555.1 187735443 glycosyl 25.67 FutY MRLFGGLGNQLFQYAFLFALSRQGGKARLETSSYEHDDKRVCELHHFRVSLPIEGGPPPWAFRKSRIPACLR 19 muciniphila; transferase SLFAAPKYPHFREEKRHGFDPGLAAPPRRHTYFKGYFQTEQYFLHCREQLCREFRLKTPLTPENARILEDIRSC Akkermansia family protein CSISLHIRRTDYLSNPYLSPPPLEYYLRSMAEMEGRLRAAGAPQESLRYFIFSDDIEWARQNLRPALPHVHVD muciniphila [Akkermansia INDGGTGYFDLELMRNCRHHIIANSTFSWWAAWLNEHAEKIVIAPRIWFNREEGDRYHTDDALIPGSWLRI ATCCBAA-835 muciniphila ATCCBAA-835] Salmonella WP_023214330.1 555221695 fucosyl- 25.99 FutZ MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFADEKEKIKL 20 enterica; transferase LRKFKRNPFPKQISEILSIALFGKYALSDRAFYTFETIKNIDKACLFSFYQDADLLNKYKQLILPLFELRDDL Salmonella [Salmonella LDICKNLELYSLIQRSNNTTALHIRRGDYVTNQHAAKYHGVLDISYYNHAMEYVERERGKQNFIIFSDDVRWA enterica subsp. enterica] QKAFLENDNCYVINNSDYDFSAIDMYLMSLCKNNIIANSTYSWWGAWLNKYEDKLVISPKQWFLGNNETSLRN enterica ASWITL serovar Poona str. ATCCBAA- 1673 Bacteroides sp. WP_022161880.1 547748823 glycosyl- 26.01 FutZA MRLIKMTGGLGNQMFIYAFYLRMKKRYPKVRIDLSDMVHYHVHHGYEMHRVFNLPHTEFCINQPLKKVIE 21 CAG: 633 transferase FLFFKKIYERKQDPNSLRAFEKKYLWPLLYFKGFYQSERFFADIKDEVRKAFTFDSSKVNARSAELLRRLDA family 11 DANAVSLHIRRGDYLQPQHWATTGSVCQLPYYQNAIAEMNRRVAAPSYYVFSDDIAWVKENIPLQNAVYID [Bacteroides WNKGEESWQDMMLMSHCRHHIICNSTFSWWGAWLDPHEDKIVIVPNRWFQHCETPNIYPAGWVKVAIN sp. CAG: 633] Clostridium sp. 
WP_022247142.1 547839506 alpha-1 2- 34.28 MEKIKIVKLQGGMGNQMFQYAFGKGLESKFGCKVLFDKINYDELQKTIINNTGKNAEGICVRKYELGIFNLN 22 CAG: 306 fucosyl- IDFATAEQIQECIGEKLNKACYLPGFIRKIFNLSKNKTVSNRIFEKKYGEYDEEILKDYSLAYYDGYFQNPKY transferase FEDISDKIKKEFTLPEIKNHDIYNKKLLEKITQFENSVFIHVRRDDYLNINCEIDLDYYQKAVKYILKHIENP [Clostridium KFFVFCAEDPDYIKNHFDIGYDFELVGENNKTQDTYYENMRLMMACKHAIIANSSYSWWAAWLSDYDNKIVIA sp. CAG: 306] PTPWLPGISNEIICKNWIQIKRGISNE Prevotella sp. WP_009434595.1 497004957 protein 32.11 MKIVKILGGLGNQMFQYALYLSLQESFPKERVALDLSCFNGYHLHNGFELERIFSLTAQKASAATIMRIAYYY 23 oral taxon 306; [Prevotella sp. PNYLLWRIGKRLLPRRKTMCLESSTFRYDESVLTREGNRYFDGYWQDERYFVACREKVLKAFTFPAFKRTEN Prevotella oral taxon 306] LSLLRKLDKNSVAIHVRRGDYIGNQLYQGICDLDYYRAAIDKISTYVTPSVFCIFSNDIAWCQTHLQPYLKAP sp. oral VVYVTWNTGTESYRDMQLMSCCAHNIIANSSFSWWGAWLNQNNEKVVIAPKRWLNMDDCQFPLPASW taxon 306 str. VKI F0472 Brachyspira sp. WP_021917109.1 547139308 glycosyl 30.14 MQLVKLMGGLGNQMFQYAFAKALGDKNILFYGDYKKHSLRKVELNRFKCKAVYIPRELFKYLKFVFTKFDKI 24 CAG: 484 transferase EYMRSGIYVPEYLNRDGNHIYIGFWQTEKYFKQIRPRLLKDFTPRKKLDRENAGIISKMQQINSVSVHIRRTD family 11 YVDESHIYGDTNLDYYKRAIEYISSKIENPEFFFFSDDMAYVKEKFAGLKFPHSFIDINSGNNSYKDLILMKN [Brachyspira sp. CKHNIIANSTFSWWGAWLNENEEKIVIAPAKWFVTGENDKDIVPDEWIKL CAG: 484] Thalassospira WP_008889330.1 496164823 glycosyl 30 MVIVKLLGGLGNQMFQYATGRAVASRLDVELLLDVSAFAHYDLRRYELDDWNITARLATSEELARSGVTAA 25 profundimaris; transferase PPSFFDRIARFLRIDLPVNCFREASFTYDPRILEVSSPVYLDGYWQSERYFLDIEKKLRQEFQLKASIDANNH Thalassospira family 11 SFKKKIDGLGKQAVSLHVRRGDYVTNPQTASYHGVCSLDYYRAAVDYIAEHVSDPCFFVFSDDLEWVQTNLNI profundimaris [Thalassospira KQPIVLVDANGPDNGAADMALMMACRHHIIANSSFSWWGSWLNPLNDKIIVAPKKWFGRANHDTTDL WP0211 profundimaris] VPDSWVRL Acetobacter sp. 
WP_022078656.1 547459369 alpha-1 2- 29.9 MAVSPQESKYSAHVSPDKPLRIVRLGGGLGNQMFQYAFGLAAGDVLWDNTSFLTNHYRSFDLGLYNISGD 26 CAG: 267 fucosyl- FASNEQIKKCKNEIRFKNILPRSIRKKFNLGKFIYLKTNRVCERQINRYEPELLSKDGDVYYDGVFQTEKYFK transferase PLRERLLHDFTLTKPLDAANLDMLAKIRAADAVAVHIRRGDYLNPRSPFTYLDKDYFLNAMDYIGKRVDKPHF [Acetobacter FIFSSDTDWVRTNIQTAYPQTIVEINDEKHGYFDLELMRNCRHNIIANSTFSWWGAWLNTNPDKIVVAPKQ sp. CAG: 267] WFRPDAAEYSGDIVPNDWIKL Dysgonomonas WP_006842165.1 493896281 protein 29.9 MVTVLLSGGLGNQMFQYAAAKSLAIRLNTALSVDLYTFSKKTQATVRPYELGIFNIEDVVETSSLKAKAVIKA 27 mossii; [Dysgonomonas RPFIQRHRSFFQRFGVFTDTYAILYQPTFEALTGGVIMSGYFQNESYFKNISELLRKDFSFKYPLIGENKDVA Dysgonomonas mossii] GQISENQSVAVHIRRGDYLNKNSQSNFAILEKDYYEKAINYISAHVKNPEFYVFSEDFDWIKDNLNFKEFPVT mossii FIDWNKGKDSYIDMQLMSLCKHNIIANSSFSWWSAWLNNSEERKIVAPERWFVDEQKNELLDCFYPQGWI DSM 22836 KI Clostridium sp. WP_021636924.1 545396671 glycosyl- 29.83 >gi|545396671|ref|WP_021636924.1|glycosyltransferase,family 11 28 KLE 1755 transferase, [Clostridium sp. KLE 1755] family 11 MIIIEISGGLGNQMFQYALGQKFISMGKEVKYDLSFYNDRVQTLRQFELDIFDLDCPVASNSELSRFG [Clostridium KGNSLKSRLKQKLGWDKEKIYEENLDLGYQPRIFELDDIYLSGYWQSELYFKDIREQILRLYTFPIQLDYMN sp. 
KLE 1755] GVFLRKIENSNSVSIHIRRGDYLNENNLKIYGNICTLNYYNKALQIIAKKITNPIIFVFTNDIEWVRKELEI PNMVIVDCNSGKLSYWDMYLMSKCKANIVANSSFSWWGAWLNKNENRIIISPKRWLNNHEQTSTLCDNWIRC GDD Gillisia WP_006988068.1 494045950 alpha-1,2- 29.28 MFISKNTVIIKLVGGLGNQMFQFAIAKIIAEKEKSEVLVDITFYTELTENTKKFPRHFSLGIFNSSFAIASKK 29 limnaea; fucosyl- EIDYFTKLSNFNKFKKKLGLNYPTIFHESSFNFKAQVLELKAPIYLNGYFQSFRYFLGKEYVIRKIFKFPDEA Gillisia transferase LDKDNDNIKRKIIGKTSVSLHIRRGDYVNNKKTQQFHGNCTIDYYQSAIAYLSSKLTDFNLIFFSDDIHWVRQ limnaea [Gillisia QFKNISNQKIYVSGNLNHNSWKDMYLMSLCDHNIIANSSFSWWGAWLNKNPEKIIIAPKRWFADTEQDKNSID DSM 15749 limnaea] LIPSEWYRI Methylotenera YP_003048467.1 253996403 glycosyl 29.19 MLVSRIIGGLGNQMFEYAAARAASLRISVQLKLDLSGFETYDLHAYGLNNFNIVEDVAKKDDYFIGAPESLLK 30 mobilis; transferase KIKKYLRGLIQLESFRESDLSFDSKVLELNDNTYLDGYWQCERYFIDFDKQIRQDFSFKFAPDALNQRYLELI Methylotenera family protein DSVNAVSVHIRRGDYVSNSTTNEIHGVCDLDYYQRAAEFMRARIGPENLHFFVFSDDTDWVKENISFGSDTTF mobilis JLW8 [Methylotenera ISHNDAAKNYEDMRLMSACKHHIIANSSFSWWAAWLNPSKQKVVIAPRQWFKSTLLNSDDIVPASWVRL mobilis JLW8] Runella YP_004658567.1 338214504 glycosyl 29.14 MIIVKLSGGLGNQLFQYAFGRHLATVNQKELKLDTSALTKTSDWTNRSYALDAFNIRAQEATPEEIKALAGK 31 slithyformis; transferase PNRLLQRVGRKVGITPIQYFQEPHFHFYSSALSIKSSHYLEGYWQSEKYFEAITPILREEFAFTISPSTHAQTI Runella family protein KEKISNGTSVSIHLRRGDYVKTSKANRYLRPLTMDYYQKAIDYINQRVKNPNFFLFSDDIKWAKSQVTFPPTTH slithyformis [Runella FSTGTSAHEDLWLMTHCRHHIIANSTFSWWGAWLNQQPDKIVIAPQKWFSTERFDTKDLLPEPWIQL DSM 19594 slithyformis DSM 19594] Pseudo- WP_002958454.1 489048235 alpha-1,2- 29.1 MIKVKAIGGLGNQLFQYATARAIAEKRGDGVVVDMSDFSSYKTHPFCLNKFRCKATYESKPKLINKLLSNEKI 32 alteromonas fucosyl- RNLLQKLGFIKKYYFETQLPFNEDVLLNNSINYLTGYFQSEKYFLSIRECLLDELTLIEDLNIAETAVSKAIK haloplanktis; transferase NAKNSISIHIRRGDYVSNEGANKTHGVCDSDYFKKALNYFSERKLLDEHTELFIFSDDIEWCRNNLSFDYKMN Pseudo- [Pseudo- FVDGSSERPEVDMVLMSQCKHQVISNSTFSWWGAWLNKNDEKVVVAPKEWFKSTDLDSTDIVPNQWIKL alteromonas alteromonas haloplanktis haloplanktis] ANT/505 uncultured EKE06679.1 406985989 glycosyl 28.67 
MLTLKLKGGLGNQMFQYAASHNLAKNKKTKINFDLSFFSDIEVRDIKRDYLLDKFNISADISFDQKNSISGFR 33 bacterium transferase KFLVKVISKFFGEVFYYRLKFLSSKYLDGYFQSEKYFKNVEEDIRKDFTLKDEMGVEAKKIEQQIVNSKNSVSL family 11 HIRRGDYVDDLKTNIYHGVCNLDYYKRSIKYLKENFGEINIFVFSDDIAWVKENLAFENLQFVSRPDIKDYEEL [uncultured MLMSKCEHNIIANSSFSWWGAWLNENKNKIIIAPKEWFQKFNINEKHIVPKSWIRL bacterium] Clostridium sp. WP_021636949.1 545396696 glycosyl- 28.57 MVIVQLSGGLGNQMFEYALYLSLKAKGKVVKIDDITCYEGPGTRPKQLDVFGVSYERATKQELTEMTDSSL 34 KLE 1755 transferase, DPVSRIRRKLTGRKTKAYREKDINFDPQVMERDPALLEGCFQSEKYFQDCREQVREAYRFRGIESGAYPLPE family 11 AYRRLEKEIADCKSVSVHIRRGDYLEESHGGLYTGICTEQYYQEAFARMEKEVPGAKFFLFSNDPDWTREHF [Clostridium KGENRILVEGSTEDTGYLDLYLMSKCKHNIIANSSFSWWGAWLNDNPEKKVTAPARWLNGRECRDIYTER sp. KLE 1755] MIRI Francisella WP_004287502.1 490414974 alpha-1,2- 28.57 MKIIKIQGGLGNQMFQYAFYKSLKNNCIDCYVDIKNYDTYKLHYGFELNRIFKNIDLSFARKYHKKEVLGKLFS 35 philomiragia; fucosyl- IIPSKFIVKFNKNYILQKNFAFDKAYFEIDNCYLDGYWQSEKYFKKITKDIYDAFTFEPLDSINFEFLKNIQDY Francisella transferase NLVSIHVRRGDYVNHPLHGGICDLEYYNKAISFIRSKVANVHFLVFSNDILWCKDNLKLDRVTYIDHNRWMDS philomiragia [Francisella YKDMHLMSLCKHNIIANSSFSWWGAWLNQNDDKIVIAPSKWFNDDKINQKDICPNSWVRI subsp. philomiragia] philomiragia ATCC 25015 Pseudomonas WP_017337316.1 515906733 protein 28.52 MVIAHLIGGLGNQMFQYAAARALSSAKKEPLLLDTSSFESYTLHQGFELSKLFAGEMCIARDKDINHVLSW 36 fluorescens; [Pseudomonas QAFPRIRNFLHRPKLAFLRKASLIIEPSFHYWNGIQKAPADCYLMGYWQSERYFQDAAEEIRKDFTFKLNMS Pseudomonas fluorescens] PQNIATADQILNTNAISLHVRRGDYVNNSVYAACTVEYYQAAIQLLSKRVDAPTFFVFSDDIDWVKNNLNIG fluorescens FPHCYVNHNKGSESYNDMRLMSMCQHNIIANSSFSWWGAWLNSNADKIVVAPKQWFINNTNVNDLFP NCIMB 11764 PAWVTL Herbaspirillum WP_008117381.1 495392680 glycosyl 28.48 MIATRLIGGLGNQMFQYAAGRALALRVGSPLLLDVSGFANYELRRYELDGFRIDATAASAQQLARLGVNAT 37 sp. 
YR522 transferase PGTSLLARVLRKVWPQPADRILREASFTYDARIEQASAPVYLDGYWQSERYFARIRQHLLDEFTLKGDWGS family 11 DNAAMAAQIATAGAGAVSLHVRRGDYVSNAHTAQYHGVCSLDYYRDAVAHIGGRVEAPHFFVFSDDHE [Herbaspirillum WVRENLQIGHPATFVQINSADHGIYDMMLMKSCRHHIIANSSFSWWGAWLNPAEDKIVVAPQRWFKD sp. YR522] ATNDTRDLIPAAWVRL Prevotella WP_008822166.1 496097659 rotein 28.43 MKIVKILGGLGNQMFQYALYLSLKETFPQENVTVDLSCFHGYHLHNGFEIARIFSLHPDKATVMEILRIAYYY 38 histicola; [Prevotella PNYFFWQIGKRVLPQRKTMCTESTKLLFDKSVLQREGDRYFDGYWQDERYFIDCRRTILNTFKFPPFTDDN Prevotella histicola] NLALLKKMDTNSVSIHVRRGDYVGNKLYQGICDLNYYREAIMKISSYISPSMFCVFSNDIEWCRDNLESFIKA histicola PIYYVDWNSGTESYRDMQLMSCCGHNIIANSSFSWWGAWLNQNSSKIVIAPKRWINLKNCGFMLPSRW F0411 VKI Flavobacterium WP_017494954.1 516064371 protein 28.42 MIVVQLIGGLGNQLFQYAAAKALALQTKQKFSLDVSQFESYKLHNYALNHFNVISKNYKKPNRYLRKIKSFY 39 sp. WG21 [Flavobacterium QKNVFYKEVDFGYNPDLIHLKGGIIFLEGYFQSEKYFIKYEKEIREDFELRTPLKKETKAAIAKIESVNSVSI sp. WG21] HIRRGDYINNPLHNTSKEEYYNKALEIVENKINNPVYFVFSDDMEWVKANFSTKQETIFIDFNDASTNFEDLK LMTSCKHNIIANSSFSWWGGWLNKNPDKIVIAPKRWFNDDSINTNDIIPTNWVKI Polaribacter WP_018944517.1 517774309 protein 28.42 MIIVRIVGGLGNQMFQYAYAKALQQKGYQVKIDITKFKKYNLHGGYQLDQFKIDLETSSPIANVLCRIGLRRS 40 franzmannii [Polaribacter VKEKSLLFDEKFLEIPQREYIKGYFQTEKYFSSITPILRKQFIVQKELCNTTLRYLKEITIQKNACSLHIRRG franzmannii] DYISDEKANSVHGTCDLPYYKKSIKRIQDEYKDAHFFIFSDDISWAKKNLTNKNTTFIEHIVMPHEDMHLMSL CKHNITANSSFSWWGAWLNQHENKTVIAPKNWFVNRENEVACANWIQL Polaribacter YP_007670847.1 472321325 glycosyl 28.42 MVVVRILGGLGNQMFQYAYAKSLAEKGYEVQIDISKFKSYKLHGGYHLDKFRIDLETANSSSAFLSKIGLKKT 41 sp. MED152 transferase IKEPNLLFHKDLLKVNNNAFIKGYFQAEQYFSDIREILINQFKIKKELAKSTLAIKNQIELLKTTCSLHVRRG family 11 DYISDKKANKVHGTCDLDYYSSAIEHISKQNSNVHFFVFSDDIAWVKDNLNITNATYIDHNVIPHEDMYLMTL [Polaribacter CNHNITANSSFSWWGAWLNQNPDKIVIAPKNWFVDKENEVACKSWITL sp. 
MED152] Methanococcus YP_001329558.1 150402264 glycosyl 28.19 MKIIQLKGGLGNQMFQYALYKSLKKRGQEVLLDISWYLKNNAHNGYELEWVFGLSPEYASIRQCFKLGDIPI 42 maripaludis; transferase NLIYNVKRKVFPKKTHFFEKSNFNYDNNVFEVTNGYFEGYWQNENYFKNFRSEILNDFSFKNIDKRNAEFSE Methanococcus family protein YLKSINSVSVHVRRGDYVTNQKALNVHGNICNLEYYNKAINLANNNLKNPKFVIFSDDITWCKSNLGIDDPV maripaludis [Methanococcus YVDWNTGPYSYQDMYLMSNCKNNIIANSSFSWWGAWLNQNTEKKVFSPKKWVNDRNNVNIVPNGWI C7 maripaludis C7] KIK Gallionella WP_018293379.1 517104561 protein 28.15 MIIAHIIGGLGNQMFQYAAGRALSLARGVPFKLDISGFEGYDLHQGFELQRVFNCAAGIASEAEVRDSLGW 43 sp. SCGC [Gallionella QFSSPIRRIVARPSLAVLRRSTFVVEPHFHYWAGIKQVPDNCYLAGYWQSEQYFQSHAAVIRTDFAFKPPLS AAA018-N21 sp. SCGC GQNSKLAMQIAQGNAVSLHIRRGDYANNPKTTATHGLCSLDYYRAAIQHIAERVQSPHFFIFSDDIAWVKS AAA018-N21] NLAINFPHQYVDHNQGTESYNDMRLMSLCQHNIIANSSFSWWGAWLNTNAHKIVIAPKQWFANTTHVA DLIPSSWERL Azospira YP_005026324.1 372486759 Glycosyl 28.04 MQSPACIAGARAWWVGYGMAEAMQPVVVGLSGGLGNQMFQYAAGRALAHRLGHPLSLDLSWFQGR 44 oryzae; transferase GDRHFALAPFHIAASLERAWPRLPPAMQAQLSRLSRRWAPRIMGAPVFREPHFHYVPAFAALAAPVFLEG Dechlorosoma family 11 YWQSERYFRELREPLLQDFSLRQPLPASCQPILAAIGNSDAICVHVRRGDYLSNPVAAKVHGVCPVDYYQQ suillum PS [Dechlorosoma GVAELSASLARPHCFVFSDDPEWVRGSLAFPCPMTVVDVNGPAEAHFDLALMAACQHFVIANSSLSWW suillum PS] GAWLGQAAGKRVIAPSRWFLTSDKDARDLLPPSWERR Prevotella WP_018463017.1 517274199 protein 28 MKIVKIIGGLGNQMFQYALAMALNKNFTDEEVKLDIHCFNGYTKHQGFEIDRVFGNEFELASYRDVAKVAY 45 paludivivens [Prevotella PYFNFQLWRIGSRIFPDRRHMISEDTSFKIMPEVITSHNYKYYDGYWQHEEYFKNIHDEILDAFKFPKFQDER paludivivens] NKALAERLSDSNSISIHIRRGDYLNDELFKGTCGIEYYKKAIEEINERTVPTLFCVFSNDIHWCKENIEPLLN GKETIYVDWNTGSDNYRDMQLMTKCKHNIIANSSFSWWGAWLNNTKDKIVIAPRIWYNTKEKVSPVANSWIKL Gramella YP_860609.1 120434923 alpha-1,2- 27.96 MSNKNPVIVEIMGGLGNQMFQFAVAKLLAEKNSSVLLVDTNFYKEISQNLKDFPRYFSLGIFDISYKMGTEN 46 forsetii; fucosyl- GMVNFKNLSFKNRVSRKLGLNYPKIFKEKSYRFDADLFNKKTPIYLKGYFQSYKYFIGVESKIRQWFEFPYE Gramella transferase 
NLGVGNEEIKSKILEKTSVSVHIRRGDYVENKKTKEFHGNCSLEYYKNAITYFLDIVKEFNIVFFSDDISWV forsetii [Gramella RDEFKDLPNEKVFVTGNLHENSWKDMYLMSLCDHNIIANSSFSWWAAWLNNNSEKNVIAPKKWFADIDQEQK KT0803 forsetii SLDLLPPSWIRM KT0803] Mariprofundus WP_009849029.1 497534831 alpha-1,2- 27.92 MIIVQFTGGLGNQMFQYALGRRLSLLHDVELKFDLSFYQHDILRDFMLDRFQVNGQVATEKEIEAYTNTPIF 47 ferrooxydans; fucosyl- ALDRPLLDRLVRWGLYRGIVSVSDEPPGKQALMVYNSRVLQAPRNTYVQGYWQSEKYFMPIRQKLLDDFS Mariprofundus transferase LVDKADQANGAMLEKIRQCHSVSLHVRRGDYVSNPLTNHSHGTCGLEYYEKAIALIGSKVDDPHFFVFSDD ferrooxydans [Mariprofundus PEWTRDHLKCRFPMTYVTCNSADSCEWDMELMRHCRDHIIANSSFSWWGAWLNMNPDKVVVAPAA PV-1 ferrooxydans] WFNNFSADTSDLIPDSWVRI Bacillus WP_002174293.1 488102896 protein 27.91 MKIIQVSSGLGNQMFQYALYKKISLNDNDVFLDSSTSYMMYKNQHNGYELERIFHIKPRHAGKEIIDNLSDL 48 cereus; [Bacillus DSELISRIRRKLFGAKKSMYVELKEFEYDPIIFEKKETYFKGYWQNYNYFKDIEQELRKDFVFTEKLDKRNEK Bacillus cereus] LANEIRNKNSVSIHIRRGDYYLNKVYEEKFGNIANLEYYLKAINLVKKKIEDPKFYIFSDDIDWAQKNINLTN cereus VD107 DVVYISHNQGNESYKDMQLMSLCKHNIIANSTFSWWGAFLNNNDDKIVVAPKKWINIKGLEKVELFPENWITY Firmicutes WP_022352106.1 547951299 protein 27.81 MIIIRMTGGLGNQMFQYALYLKLRAMGKEVKMDDFTEYEGREARPLSLWAFGIEYDRASREELCRMTDGF 49 bacterium [Firmicutes LDPVSRIRRKLFGRKSLEYMEKDCNFDPEILNRDPAYLTGYFQSEKYFADIEEEVRQAFRFSERIWEGIPSQL CAG: 534 bacterium LERIRSYEQQIKTTMAVSVHIRRGDYLQNEEAYGGICFERYYKTAIEYVKKRQQDASFFVFTNDPDYAGEWIL CAG: 534] KNFGQEKERFVLIEGTQEENGYLDLYLMSLCRHHILANSSFSWWGAYLNPSREKMVIVPHKWFGNQECRD IYMENMIRIAKEQS Sideroxydans YP_003525501.1 291615344 glycosyl- 27.81 MVISNIIGGLGNQMFQYAAARALSLKLEVPLKLDISGFTNYALHQGFELDRIFGCKIEIASEADVHEILGWQS 50 lithotrophicus; transferase ASGIRRVVSRPGMSIFRRKGFVVEPHFSYWNGIRKITGDCYLAGYWQSEKYFLDAAVEIRKDFSFKLPLDSH Sideroxydans family 11 NAELAEKIDQENAVSLHIRRGDYANNPLTAATHGLCSLDYYRKSIKHIAGQVRNPYFFVFSDDIAWVKDNLEI lithotrophicus [Sideroxydans EFPSQYVDYNHGSMSFNDMRLMSLCKHHIIANSSFSWWGAWLNPNPEKVVIAPERWFANRTDVQDLLP ES-1 lithotrophicus PGWVKL ES-1] zeta WP_018281578.1 517092760 protein [zeta 27.81 
MIVSQIIGGLGNQMFQYATGRALSHRLHDTFFLDLDGFSGYQLHQGFELSNVFQCEVNVATRSQMQALLG 51 proteobacterium proteo- WRSFSSVRRLLMKRSLKWARGHRVMIEPHFHYWSRFAEINEGCYLSGYWQSERYFKPIENIIRQDFKFNHL SCGC bacterium LKGVNLDLAQQMTEVNSVSLHVRRGDYASDANTNHTHGLCPLDYYRDAILYIAQNTVAPSFFIFSDDIEWC AB-137-C09 SCGC REHLKLSFPATYIDHNKGSNSYCDMQLMSLCHHHIIANSSFSWWGAWLNTRLDKIVIAPKQWFANGNRT AB-137-C09] DDLIPAEWLVM Pedobacter YP_003090434.1 255530062 glycosyl 27.8 MKIIRFLGGLGNQMFQYAFYKSLQHRFPHVKADLQGYQEYTLHNGFELEHIFNIKVNSVSSFTSDLFYNKK 52 heparinus; transferase WLYRKLRRILNLRNTYIEEKKLFSFDPSLLNNPKSAYYWGYWQNFQYFEHIADDLRKDFQFRAPLSAQNQEV Pedobacter family protein LDQTKLSNSISLHIRRGDYIKDPLLGGLCGPEYYQTAINYITSKVNAARFFIFSDDIDWCIANLKLQDCSFIS heparinus [Pedobacter WNKGTSSYIDMQLMSSCKHHIVANSSFSWWAAWLNPNPDKIVIAPEKWTNDKDINVRMSFPQGWISL DSM 2366 heparinus DSM 2366] Methylophilus WP_018985060.1 517814852 protein 27.78 MFQYAMGLSLAENNQTPLKLDLSQFTDYKLHNGFELSKVFNCSAETASVTQIETLLGICKYSFIRRILKNTYL 53 methylotrophus [Methylophilus KNLRPAQYVVEPFFGYWDGVNFLGDNVYLEGYWQSQKYFIDYESTIRTHFTFKNILSGENLKLSDRIKGSNSV methylotrophus] SLHIRRGDYVTNKNNAFIGTCSLIYYQNAIEYFSTKIADPIFFIFSDDITWAKSNLRLANEHYFVGHNQGEDS HFDMQLMSLCKHHIIANSSFSWWGAWLNPSKDKIIIAPKKWFASGLNDQDLVPKDWLRI Rhodobacterales WP_008033953.1 495309205 alpha-1,2- 27.7 MIYTRIRGGLGNQLFQYSAARSLADYLNVSLGLDTREFDENSPYKMSLNHFNIRADLNPPDLIKHKKDGKIA 54 bacterium fucosyl- YIIDHIKGNQKKVYKEPFLSFDKNLFSNVDGTYLKGYWQSEKYFLRNRKNILSDINLIKKTDKFNTINLKEIK HTCC2255 transferase KSTSISLHIRRGDYLSNESYNETHGICSLSYYTDAVEYIKNRLGENIKVFAFSDDPDWVLENLKLSVDIKIIN [Rhodobacterales NNTSANSFEDLRLMLNCDHNIIANSSFSWWGAWLNQNPEKIVISPKKWYNKKQLQNADIVPSSWLKY bacterium HTCC2255] Spirulina WP_017302658.1 515872075 protein 27.69 MAKIIARIRGGIGNQIFIYAAARRLELINNAELVLDSVSGFVHDLQYRQHYQLDHFHIPCRKATPAERFEPFSR 55 subsalsa [Spirulina VRRYLKRQLNQRLPFEQRRYVIQESIDFDPRLIEFKPRGTVHLEGYWQSEDYFKDIEATIRQDLQIQPPTDPT subsalsa] NLAIVQHIHQHTSVAVHIRFFDQPNADTMNNAPSDYYHRAVEAMETFVPGAHYYLFSDQPEAAKSRIPLP DERVTLVNHNRGNKLAYADLWLMTQCQHFIIANSTFSWWGAWLAENQKKQVIAPGFEKREGVSWWGF KGLLPKQWIKL 
Vibrio WP_010433911.1 498119755 glycosyl 27.67 MVIVKITGGLGNQLFQYATGSALANKLSCELVLDLSFYPTQTLRKYELAKFNINARVATDREIFLAGGGNDFF 56 cyclitrophicus transferase SKALKKLGLTSIIFPEYIKEQESIKYVGKIDLCKSGAYLDGYWQNPLYFSQNKIELTREFLPRAQLSPSALAW family 11 KDHISQASNSVSLHVRRGDYVENAHTNNIHGTCSLEYYQHAIEKIRSEVHNPVFFVFSDDIEWCKLNLSSLAE [Vibrio VEFVDNTTSAIDDLMLMRQCKHSIIANSTFSWWGAWLKLDGLVIAPRNWFSSASRNLKGIYPKEWHIL cyclitrophicus] Lachnospiraceae WP_022783177.1 551039510 protein 27.65 MRSVVDIKGGYGNQLFCYSFGYAVSKETGSELIIDTSMLDMNNVKDRNYQLGVLGITYDSHISYKYGKDFLS 57 bacterium [Lachnospiraceae RKTGLNRLRKKSAIGFGTVVFKEKEQYVYDPSVFEIKRDTYFDGFWQSSRYFEKYSDDLRKMLKPKKISNAAE NK4A179 bacterium KLAEDARDCLSVSVHIRRGDYVSLGWTLKDDYYIKALDIIKERYGSEPVFFVFSDNKKYADDFFSAAGLKYRL NK4A179] MDYETDDAVRDDMFLMSRCSHNIMANSSYSWWGAFLNDNKDKTVICPETGVWGGDFYPEGWMKVTA SSGK uncultured EKE02186.1 406980610 glycosyl 27.57 MIIVNLYGGLGNQMFQYALGRHLAEKNNTELKLDISAFESYKLRKYELGNLNIIEKFALPEEISRLSTLPTGK 58 bacterium transferase IERFIRKTLRKPVKKPESYIKENITGGFNPKILDLQNNIYLEGYWQSEKYFIEIEDIIRKEFSFKFPATGKNK family protein EILENILNINSVSLHIRRGDYVTNPEVNQVHGVCSLDYYKSCVDFIEKKLESPYFYIFSDDIEWVKNNLQIQS [uncultured QVYYVDHNTVDNAIEDMRLMFSCKHNILANSSFSWWGAWLNSNPDKMVITPRKWFNTTYDSNDLIPERWIKL bacterium] Bacteroides WP_005822375.1 492366053 protein 27.46 MKIGIIYIVTGPYIKFWNEFYSSSQLYFCVEAEKNYEVFTDSSELASQRLPNVHMHLIEDKGWIVNVSSKSKFI 59 fragilis; [Bacteroides CEIRNQLTSYDYIFYLNGNFKFISPIYCDEILPQAEHNYLTALSFSHYLTIHPDHYPYDRNKNCNAFIPYGQGK Bacteroides fragilis] YYFQGGFYGGRTQEVLSLSEWCRDAIEADFNKKVIARFHDESYINRYLLTQHPKVLNDKYAFQDIWPYEGEYKA fragilis IVLNKEEVPEDNNLQEMKQNYIDPSLSFLLNDELKFIPISIVQLYGGLGNQMFGYAFYLYIRHISTQERKLLID HMW 616 PAPCKRYGNHNGYELPSIFSKICQDIHISDETKNNIRKLRKGTSLSIEEVRASMPQSFKEKKQPIIFYSGCWQC VTYVETVKDEIKKDFIFDESKLNEPSAQMLRIIRRSNSVSVHIRRNDYLIGNNEFLYGGICFKSYYEKAISQMY TLLKDEPIFIYFTDDPEWVRSNFALDKSYLVDWNKNKDNWQDMYLMSACRHHIIANSSFSWWAAWLGGF PEKKVIAPSTWLNGMQTPDILPTEWIKIPITPDKKILDRICNHLILHSSYMKQLGLNSGKMGVVIFFFHYARYT QNPLYENYAGDLFDELYEEIHKGISFSFLDGLCGIAWAVEYLVHEQFIEGNTDDSLAEIDFKVMQIDPRRFTD 
YSFETGLEGIACYVLSRLLSPRVCSSSLTLDSVYLKDLTEACRKVPVDKANYTRLFLNYIESKEVGYSFKDVLM QVLNHSEKAFGSDGLTWQTGLTMIMR Butyrivibrio WP_022778576.1 551034739 protein 27.46 MIIIQLKGGLGNQMFQYALYKELRSRGKEVKIDDVTGFVDDELRTPVLQRFGIEYDRATREEVVKLTDSKMD 60 sp. AE3009 [Butyrivibrio IFSRIRRKLTGRKTCRIDEESGTFNPDILELDEAYLVGYWQSDKYFRNEDVIAQLRQEFQKRPQEIMTDSASW sp. AE3009] ATLQQIECCQSVSLHIRRTDYIDEEHNHIHNLCFEKYYKGAIDRIRSQYPSAVFFIFTDDKEWCRNHFRGPNF FVVELAEKENTDIAEMLLMSSCKHHICANSSFSWWSAWLNDSPEKMVIVPNKWINNRDMDDIYTDRMT KMAI Bacteroides WP_004317929.1 490447027 protein 27.43 MKQTIILSGGLGNQMFQYAFFLSMKAKGKSCSLDTTLFQTNKMHNGFELKSVFDIPDSPNQASALHSLLIK 61 ovatus; [Bacteroides MLRRYKPKSILTIDEPYTFCPDALESKKSFLMGDWLSPKYFESIKDVVVNAYRFHNIGNKNVDTANEMHGN Bacteroides ovatus] NSVSIHIRRGDYLKLPYYCVCNENYYRQAIEQIKDRVDNPIFYVFSNEPSWCDSFMKEFRVNFKIVNWNQG ovatus KDSYQDMYLMTQCKHNIIANSTFSWWGAWLNNNTDKIIVAPSKWFKNSEHNINCKEWLLIDTSK CL02T12C04 Desulfospira WP_022664368.1 550911345 protein 27.42 MGKKYVETVVNGGLGNQIFQFSAGFALSKRLNLDLVLNISTFDSCQKRNFELYTFPKIKNSFACIKDDDPGVF 62 joergensenii [Desulfospira SRLRIPFLNFKEKIKQFHESHFFFDPAFFDIREPVRIEGYFQSYKYFEKYSDQLKDILLDIPLTSRLKTVLKV joergensenii] ISSKKESVSVHIRRGDYISDQGINEVHGTLNEAYYLNSIKLMEKMFPESFFFLFTDDPHYVEENFKFLEDTSC IISDNDCLPYEDMYLMANCHHNIIANSSFSWWGAWLNQNPEKIVIAPRKWFSRKILMEKPVMDLLPDDWILL Lachnospiraceae EOS74299.1 507817890 protein 27.39 MNIIRMTGGLGNQMFQYALFLRLKAQGKEVKFDDRTEYKGEEARPILLWAFGIDYPAAGEEEVNELTDGV 63 bacterium 10-1 C819_03052 MKFSHRLRRKLFGRKSKEYREKSCNFDQQILEKEPAYFTGYFQSERYFEEVKEQVRKAFQFSGKIWGSVSKEL [Lachnospiraceae EERIREYQTKIENKSQMPVSVHIRRGDYLENDEAYGGICFDAYYRKAIEMMEEKFPNTVFYIFSNDTGWAK bacterium 10-1] QWIDHFYKEKSRFIVIEGTTEDTGYLDLFLMSKCRAHIIANSSFSWWGAWLDPDQEKIVIAPSKWVNNQD MKDIYTREMIKISPKGEVR Bacteroides WP_007832461.1 495107639 protein 27.33 MVVVYVGAGLANRMFQYAFALSLREKGLDVFIDEDSFIPRFDFERTKLDSVFVNVNIQRCDKNSFPLVLRED 64 dorei; [Bacteroides RFYKLLKRISEYMSDNRYIERWNLDYLPYIHKKASTNCIFIGFWISYKYFQSSEDAVRKAFTFKPLDSIRNVE Bacteroides dorei] 
LATKLVTENSVAVHFRKNIDYLKNLPNTCPPSYYYEAINYIKKYVPNPKFYFFSDNWDWVRENIRGVEFTAVD dorei WNPSSGIHSHCDMQLMSLCKHNIIANSTYSWWSAYLNENNNKIVVCPKDWYGGMVKKLDTIIPESWIIING DSM 17855 Firmicutes WP_021916201.1 547127421 protein 27.33 MVIVKMSGGLGNQMFQYALYRKIQQTGKDVKLDLFSFQDKNAFRRFSLDIFPIEYQTANLEECRKLGECSY 65 bacterium [Firmicutes RPVDKIRRKMFGLKESYYQEDLDKGYQPEILEMNPVYLDGYWQCERYFQDIREKILEDYTFPKKISIESSRLQE CAG: 24 bacterium RIKNTESVSIHIRRGDYLDAANYKIYGNICTIEYYQSAISRMRKLCEKPNFYLFSNDPEWAKEIFGDTEDITIV CAG: 24] EEDKERPDYEDMFLMSRCKHNIIANSSFSWWAAWLNQNENKRVIAPVKWFNNHSVTDVICDDWIRIDGDH KGA Clostridium WP_022031822.1 547299420 epsH 27.3 MIYVNIRGRLGNQLFIYAFARALQKSTNQQITLNYTSFRKHYNNTAMDLEQFNIPEDIMFENSKELPWFAN 66 hathewayi [Clostridium TDGKVIRILRHYFPKLIRSILQKMNVLMWLGDEYVEVKVNKRRDIYIDGFWQSSRYFKSVYKELKNELIPKME CAG: 224 hathewayi MSKEIKTMGDLINQKESVCVSVRRGDYVTVKKNRDVYYICDEKYLNTSIMRMVELVPNVTWFIFSDDADW CAG: 224] VKDNIVFPGEVFYQPPRVTPLETLYLMKACKHFIISNSSFSWWGQYLSNNDNKIVIGPAKWYVDGRKTDIIE EEWIKIEV Syntrophus YP_462663.1 85860461 alpha-1,2- 27.3 MVIVRLTGGIGNQMFQYAAARRVSLVNNAPLFLDLGWFQETGSWTPRKYELDAFRIAGESASVGDIKDFK 67 aciditrophicus fucosyl- SRRQNAFFRRLPLFLKKRIFHTRQTHIIEKSYNFDPEILNLQGNVYLDGYWQSEKYFSDVDSEIRREFSFQTDP SB; Syntrophus transferase AERNRKILERIASCESVSIHIRRGDYVTLPDANAFHGLCFPAYYRLAVEQISRKVVEPVFFVFSDDIAWARGNL aciditrophicus [Syntrophus KLGFETCFMDQNGPDRGDEDLRLMIACRHHIIANSSFSWWGAWLCSNPEKIVYAPRKWFNNGLDTPDNI aciditrophicus PASWIRI SB] Bacteroides WP_005678148.1 491931393 protein 27.27 MKIVKIIGGLGNQMFQYALYLSLKKKYPKEKIKIDISMFETYGLHNGFELKRIFDIDAEYASREEIRELSFYIK 68 caccae; [Bacteroides IYKLQRIFRKIFPVRKTECVEKYDFKFMSEVWSNCDRYYEGYWQNWEYFIEAQTEVRSTFTFKKELVGRNAKVI Bacteroides caccae] REIQYAKMPVSLHIRRGDYLHHKLFGGLCDLNYYKKAIDYVLNNYDTPQFYLFSNDIEWCKTYILPLVQGYPFI caccae LVDWNSGVESYIDMQLMSCCRINIIANSSFSWWAAWLNDSSEKIVIAPKLWAHSPYGKEIQLKSWLLF ATCC 43185 Butyrivibrio WP_022756304.1 551011888 protein 27.24 MIIIEMSGGLGNQMFQYALYKSMLHKGLDVTIDKSIYRDVDHKEQVDLDRFPNVSYIEADRKLSSTLRGYGY 69 fibrisolvens [Butyrivibrio 
NDSIIDKIRNKLNKSKRNLYHEDLDKGYQPEIFEFDNVYLNGYWQCERYFKDIKNEIKKDFIFPCFQSGDDKIK fibrisolvens] ALTIEMESCNSVSLHVRRGDYLKPGLIEIYGNICTEEYYKKSIEYIKERVDNPVFYIFSNDMAWVRDNFKSDDF RYVNEDGAFDGMTDMYLMTRCRHNIVANSSFSWWGAWLNKHDDNIVICPNRWVNTHTVTDIICEDWI RIDV
Parabacteroides distasonis; Parabacteroides distasonis CL03T12C09 | WP_005857874.1 | 492476819 | protein [Parabacteroides distasonis] | 27.24 | | MIVGGNDYCKVKVVNIIGGLGNQMFQYAFALSLKEHFPKEEIRIDISHFNYLFVNKVGAANLHNGYELDKIFFNIELKKANAWQLMKLTWFIPNYLISRIARKILPVRNSEYIQNSSDCFFYDPMVYNKQGSCYYEGYWQAIGYYESMRDKLCKIFQHPSPEGKNKQYIENMESSNSVGIHIRRGDYLLSDNFRGICEVDYYKRAIDKILQDGEKHVFYLFSNDQKWCEEYILPLLGNYEIIFVTGNIGRDSCWDMFLMTHCKDLIIANSSFSWWGAFLNKRGGRVVTPKRWMNRNIRYDLWMPEWIRI | 70
Geobacter uraniireducens; Geobacter uraniireducens Rf4 | YP_001230447.1 | 148263741 | glycosyl transferase family protein [Geobacter uraniireducens Rf4] | 27.21 | | MIIARLQGGLGNQMFQYAVGLHLALTHNVELKIDITMFSDYKWHTYSLRPFNIRESIATEEEIKALTDVKMDRPYKKIDNFLCRLLRKSQKISATHVKEKHFHYDPDILKLPDNVYLDGYWQSEKYFKEIENIIRQTFIIKNPQLGRDKELACKILSTESVCLHIRRGNYVTDKTTNSVLGPCDLSYYSNCIKSLAGNNKDPHFFVFSNDHEWVSKNLKLDYPTIYVDHNNEDKDYEDLRLMSQCKHHIIANSTFSWWSAWLCSNPDKVIYAPQKWFRVDEYNTKDLLPSNWLIL | 71
Lachnospiraceae bacterium A4 | WP_016280341.1 | 511026085 | protein [Lachnospiraceae bacterium A4] | 27.21 | | MIIVKIYEGLGNQLFQYAFARSIQVNGKKVFLDTSGYTDQLFPLCRTSTRRRYQLNCFNIRIKEVEKKNIEKYSFLIQEDMFGKLISKLAKLHLWMYKVTIQQNAQEYKESYLNTRGNVYYKGWFQNPKYFSSIRRLLLKEITPKYKIRIPAELRELLQEDNIVAVHCRRGDYQYIRNCLPVNYYKKAMAYMEKKLGVPRYLFFSDDLSWVKRQFGNKDNNYYIEDYGKFEDYQELMIMSRCRNFIIANSTFSWWAAWLCSYENKVVIMPRVWTYVGGQGVEMSDFPADWIRI | 72
Colwellia psychrerythraea; Colwellia psychrerythraea 34H | YP_270849.1 | 71282201 | alpha-1,2-fucosyltransferase [Colwellia psychrerythraea 34H] | 27.15 | | MKVVRVCGGFGNQLFQYAFYLAVKHKFNETTKLDIHDMASYELHNGYELERIFNLNENYCSAEEKLAVQSTKNIFTKLLKEIKKYTPFIPRTYIKEKKHLHFSYQEVDLGTKDTSIYYRGSWQNPQYFNSIASEIREKLTFPEFTEPKSLALHQEISEHETVAVHIRRGDYLKHKALGGICDLPYYQNAIKEIEGLVEKPLFVIFSDDITWCRANINVEKVRFVDWNSGEQSFQDMHLMSLCFHNIIANSSFSWWGAWLNANPNKIVISPNKWIHYTDSMGIVPSEWIKVETSI | 73
Roseobacter sp. MED193 | WP_009810150.1 | 497495952 | alpha-1,2-fucosyltransferase [Roseobacter sp. MED193] | 26.96 | | MITSRLHGRLGNQMFQYAAARALAHRLGCGVALDGRGAELRGEGVLTRVFDLPLSAAPKLPPLKQHAPLRYGLWRGLGLAPRFRRERGLGYNTAFETWEDGCYLHGYWQSERYFEEISDLIRADFTFPDFSNRQNAEMAARIMEDNAISLHVRRGDYVALSAHVLCDQAYYEAALTRLLEGLSQDAPTVYVFSDDPDWAKANLPLPCKKVVVDFNGPETDFEDMRLMSLCKHNIIGNSSFSWWAAWLNANPQKRVAGPANWFGDPKLSNPDILPSQWLKVAP | 74
Cesiribacter andamanensis; Cesiribacter andamanensis AMV16 | WP_009197396.1 | 496488826 | Glycosyl transferase family 11 [Cesiribacter andamanensis] | 26.89 | | MMIVRLCGGLGNQLFQYAVGKQLSVKNNIPLKIDDSWLRLPDARKYRLQFFQIEEPLASPQEVERFVGPYESQSLYARLYRKVQNMLPRHRRRYFQESGFWAYEPELMRIRSQVFLEGFWQHHAYFTRLHPQVLEALQLREEYRQEPYAVLDQIREDAASVSLHIRRGDYVSDPYNLQFFGVMPLSYYQQAVAYMQEQLHAPTFYIFSDDLDWARAHLKLQAPMVFVDIEGGRKEYLELEAMRLCRHNILANSSFSWWGAYLNTNPHKRVIAPRQWVADPELKDKVQIQMPDWILL | 75
Rhodopirellula sallentina; Rhodopirellula sallentina SM41 | WP_008679055.1 | 495954476 | glycosyl transferase family 11 [Rhodopirellula sallentina] | 26.89 | | MIATRLIGGLGNQMFQYAYGFSLARRRSERLVLDVSAFESYDLHALAIDQFDISAARMTQAEFARIPGRYRGKSRWAERVANFAGGLQSCDKRPLRLRREKPFGFAEKYLAEGSDLYLDGYWQSERYFPGLQAELKKEFQLKRGLSDESSRVLDEIQSSMSVAMHVRRGDYVTNAETLRIYRRLDAEYYRKCLNDLRQRFSNLNVFVFSNDIQWCQDHLDVGLKQRPVTHNDATTAIEDMFLMSQCDHSIIANSSFSWWAAFLGRSDAQRRVYYPDPWFNPGTLNGDSLGCANWVSESSISVSRPSRAA | 76
Butyrivibrio sp. AD3002 | WP_022762282.1 | 551018054 | protein [Butyrivibrio sp. AD3002] | 26.85 | | MIIIRMMGGLGNQMFQYALYLQLKALGKEVKIDDVYGFRDDPQRDPVLEKMYGITYTKASDAEVVDITDSHLDIFSRIRRKLFGRKSHEYIEETGLFDPKVFEFETAYLNGYFQSDKYFPDKEVLAQLRREFVIKPDDVFTSADSWELYRQIRETESVSIHVRRGDYLLPGTVETFGGICDNDYYKRAIDRMVSEHPDAIFFVFTSDKEWCEQNVSGKKFRIVDTKEENDDAADLLLMSLCKHHILANSSYSWWSAWMNDSPEKTVIVPSKWLNTKPMDDIYTSRMTKI | 77
Segetibacter koreensis | WP_018611017.1 | 517440157 | protein [Segetibacter koreensis] | 26.78 | | MVVVKLIGGMGNQMFQYAIGRHLAIKNKCPLYFDHIELENKNTANTPRNYELDIFNVQYQKNPFLQSNRFVAKVYHKLFSVQRIKEPDFTFHPHILNVQGNIHLNGYWQNENYFKEIEEIIRQDFTFKTPANEKIESILQQIAATNSVSLHVRRGDYITLTEANQFHGVCSDTYYQKAIAKIKEAIPAPHLFVFSDDIHWVKQNMPFTEEHTFVDGNTGKNSFEDLRLMAACRHNILANSSFSWWAGWLNKNPEKMVIAPEKWFRAVHTDIVPPSWIKM | 78
Amphritea japonica | WP_019621022.1 | 518450815 | protein [Amphritea japonica] | 26.76 | | MVIVRLIGGLGNQLFQYAYALSLLEQGYDVKLDASAFESYTLHGGFGLGEYAERLEVATTEEVDMVSRVGRISTLLRKLQGKKSRRVIKESNFSYDEKMLTPEDSHYLVGYFQSELYFNKIRGELLSALDLKHKLSPYTEASYLAIADASVSVSMHIRRGDYVSDKAAHNTHGVCSLDYYYAAVTFFEERYPDVDFYIFSDDIEWVKENLNVQRAHYISSEEKRFAGEDIYLMSQCDHNIVANSSFSWWGAWLNANEDKIVVAPRQWYADSNMQRLSKTLVPDTWIRL | 79
Desulfovibrio YP_002437106.1 218887785 glycosyl 26.76 MRPWVDIFGGLGNQMFQYAAAKSLAERLGVRLELDVSMFSGDPLRAFSLGEFAITDHVRGKSRSSLLVRF 80 vulgaris; transferase ARSLGFGSSSKCVEPFFHYWEGINEIEAPVHMHGYWQSEKYFKAYEDLIRRTFSFSACEGVASSGKYAGVSS Desulfovibrio family protein PMSVSVHLRRGDYKEQKNVVVHGILGREYYDAAYSIIKQGCPSACFFVFTDAINEAVDFFSHWNDVLFVDG vulgaris str. [Desulfovibrio NNQYQDMYLMSQCRHHIIANSSYSWWGAWLGAFSDGMTVAPKMWFAYDVLKEKSIKDLFPEDWIVL ′Miyazaki F′ vulgaris str. 
′Miyazaki F′]
Spirosoma spitsbergense | WP_020606886.1 | 522095677 | protein [Spirosoma spitsbergense] | 26.76 | | MIISRITSGLGNQLFQYAVARHLSLKNKTSLYVDLSYYLYQYHDDTSRNFKLGNFSVPYHTLQQSPVEYVSKATKLLPNRSLRPFFLFQKERQFHFDEQILQSRAGCVILEGFWQSEAYFRDNADTIRRDLQLSGTPSPEFNQYRELIRETPMSVSIHVRRSDYVNHPEFSQTFGFVGIDYYKRAIELARKELANPRFFVFSDDKEWSKTNLPLGEDSVFVQNTGLNGDVADLVLMSHCQHHIIANSSFSWWGAWLNPNAGKLVITPKNWYKNKPAWNTKDLLPPTWLSI | 81
Lachnospiraceae bacterium 28-4 | WP_016292012.1 | 511037988 | protein [Lachnospiraceae bacterium 28-4] | 26.73 | | MNIIRMSGGIGNQMFQYALYLKLVSLGKEVKFDDVTEYELDNARPIMLSVFGIDYPKASREELVELTDASMDFLSRVRRKIFGRKSGEYHEASADYDETVLEKEHAYLCGCFQSERYFKDIEYEVREAYRFRNVVVPEEIRGGIETYERQIGESLSVSIHIRRGDYLDAADVYGGICFDAYYNQAIRYMIKKYENPSFFVFTNDTFWAEKWCEVRERETGKRFTVIKGTDEETGYIDLMLMSRCKAHIIANSSFSWWGAWLDASPDKCVVAPVKWINTRECRDIYTEDMVRIGSNGKISFSNCSSL | 82
Lachnospiraceae bacterium COE1 | WP_016302211.1 | 511048325 | protein [Lachnospiraceae bacterium COE1] | 26.71 | | MVVVRIWEGLGNQLFQYAYARALSLRTKDRVYLDISEYEMSPKPVRKYELCHFKIKQPVINCGRIFPFVNKDSFYTKNNQYLRYFPAGLIKEEDCYFKRDFCELKGLLYLKGWFQSEKYFKEFESHIREEIYPRNKIKITRGLRKILNSDNTVSVHIRRGDFGKDHNILPIEYYENSKRVILERVDNPYFIIFSDDILWVKENMNFGLNCFYMDKEYSYKDYEELMIMSRCKHNIIANSTFSWWGAWLNPSKDKIVIAPKKWFLYNPKKDFDIVPNDWIRV | 83
Parabacteroides; Parabacteroides sp. 20_3; Parabacteroides distasonis CL09T03C24 | WP_005867692.1 | 492502331 | alpha-1,2-fucosyltransferase [Parabacteroides] | 26.69 | FutZB | MKIVNIIGGLGNQMFQYAFAVALKAKYPNEEVFIDTQHYKNAFIKVYHGNNFYHNGYEIDKVFPNATLEPARPKDLMKVSFYIPNQVLARAVRRIFPKRKTEFVTDQQPYVFIPEALSVIDDCYFDGYWMTPLYFDKYRDRILKEFTFRPFDTKENLELEPLLKQDNSVTVHIRRGDYVGSSSFGGICTLDYYRNAIREAYNLITSPEFFIFSNDQKWCMENMRNEFGDAKVHFIAHNRGADSYRDMQLLSIARCNILANSSFSWWGAYLNQRKNCFIICPHKWHNTLEYSDLYLPTWIKI | 84
Bacteroides sp. HPS0048 | WP_002561428.1 | 488624717 | protein [Bacteroides sp. HPS0048] | 26.62 | | MFVIRLIGGVGNQLFQYTFGQFLRHKFGVEVCYDIVAFDTVDKGRNLELQLLDESLPLFETSNFFFSKYKSWKKRLFLYGFLLKKNNKYYTKYAPEEISLFTEKGLSYFDGWWQYPALLRDTINNMEDFFIPKQPIPVQIQKYYNEILLNNFAVALHVRRGDYFTSKYAKTYAVCNVEYYTSAVNLMCEKLRSCKFYVFSDDLDWVKSNLILPSNTVYVKNYDINSYWYIYLMSLCRHIIISNSSFSWWGATLNRNFHKIVIAPKYWSTKKNNTLCDNSWIKI | 85
Bacteroides thetaiotaomicron; Bacteroides thetaiotaomicron dnLKV9 | WP_016267863.1 | 511013468 | protein [Bacteroides thetaiotaomicron] | 26.58 | | MKIINILGGLGNQMFEYAMYLALKNAHSEEEILCSTRSFCGYGLHNGYELGRIFGIQVKEASLLQLTKLAYPFFNYKSWQVMRHWLPVRKTMTRGAINIPFDYSQVMREDSVYYDGYWQNEKNFLHIREEILTAYTFPKFDDEKNQELADIIVKSNAVSCHIRRGDYLKEINMCVCTSSYYAHAISYMNEEINPNLYCVFSDDIEWCRNNICELMGEDKKIIFIDWNKGEKSFRDMQLMSLCKHNIIANSSFSWWGAWLNRNDKKIVVAPTRWIASEVKNDPLCDSWKRIE | 86
Desulfovibrio alaskensis; Desulfovibrio alaskensis G20 | YP_389367.1 | 78357918 | glycosyl transferase [Desulfovibrio alaskensis G20] | 26.56 | | MKFVGVWILGGLGNQMFQFAAAYALAKRMGGELRLDLSGFKKYPLRSYSLDLFTVDTPLWHGLPMSQRRFRIPMDAWTRGSRLPLVPSPPFVMAKEKNFAFSPIVYELQQSCYLYGYWQSYRYFQDVEDDIRTLFSLSRFATLELAPVVAQLNEVESVAVHLRRGDYITDAASNAVHGVCGIDYYQRSMSLVRRSTTKPIFYIFSDEPEVAKKLFATEDDVVVMPSRRQEEDLLLMSRCKHHIIANSSFSWWAAWLGKRASGLCIAPRYWFARPKLESTYLFDLIPDEWLLL | 87
Prevotella oralis CC98A | ETD21592.1 | 564721540 | protein HMPREF1199_00667 [Prevotella oralis CC98A] | 26.56 | | MDIVVIFNGLGNQMSQYAFYLAKRKSGSRCHCIFHNVSTGFHNGSELDKVFGIKYEKGIFSKLLSKIYDIFDGIPKLRKKLNSLGIHIIREPRNYDYTASLLPRVSRWGLNYFVGGWHSEKYYTEILQEIKNTFSFKIDDEIKDIDFYEFYSLIHNDINSVSLHIRRGDYVGANEYSYFQFGGVATLEYYHKAIDEIYQRIENPTFYVFSDDIGWCKTTFLKNNFIFVDCNCGEKSWRDMFLISQCKHHIIANSTFSWWGAWLSIFHNSITICPKEFIKGVVIRDVYPDTWIKLSS | 88
Comamonadaceae bacterium CR | YP_008680725.1 | 550990115 | glycosyltransferase [Comamonadaceae bacterium CR] | 26.54 | | MASKISKIIPRIFGGLGNQLFIYAAARRLALVNGAELALDDVSGFVRDHEYNRHYQLDHFNIPCRKATAAERLEPFARVRRYLKRKWNQRLPFEQRKYLVQESVDFDERLLTFKPRGTVYLEGYWQSEDYFKDIEPQIRADLRIHPPTDTVNQQMAERIRATNAVAVHVRFFDAPAQSALGVGGNNAPGDYYQRAIKVMQEQAPDAQYYIFSDQPQAARARIPLRDDHVTLVNHNQCDAVAYADLWLISQCQHFIIANSTFSWWGAWLGKTPESIVIAPGFEKREGAMFWGFRGLLPDRWVKL | 89
Vibrio nigripulchritudo; Vibrio nigripulchritudo AM115; Vibrio nigripulchritudo FTn2; Vibrio nigripulchritudo Pon4; Vibrio nigripulchritudo SO65 | WP_022596860.1 | 550250577 | WblA protein [Vibrio nigripulchritudo] | 26.51 | | MKDSRIVKLNGGLGNQMFQFALAFALKKKLNVAVKFDTELLDTNRTEFKLSLERFGLIVDKLTITEKFKYKGLESCKYRKICNWISNFTTINIHKGYYKEKERGVYDRGIFDSNVKYIDGYWQNQEYFNDFRSELLNKFNLNGKVSNHAIQYLKEITSVQNSVSIHVRRGDYLLLDVYRNLTLDYYSEAIKLVRITNPDSKFFIFSNDINWCKSNFKSVDNAIFVDSTVDEFDDMFLMSKCKTNIIANSTFSWWAAWLNNNSGKIVYCPKKWRNDTTEVHKGLPEGWNIIDK | 90
Sulfurospirillum deleyianum; Sulfurospirillum deleyianum DSM 6946 | YP_003304837.1 | 268680406 | glycosyl transferase family protein [Sulfurospirillum deleyianum DSM 6946] | 26.48 | | MIIIKIMGGLTSQMHKYALGRVLSLKYNVPLKLDLTWFDNPKSDTPWEYQLDYFNINATIATVSEIKKLKGNNLFNRIARKIEKFFSIRIYKKSYINKSFISISDFHKLKSDIYLDGEWNGFKYFEDYQDTIKNELTLKRGSSINIQNTIKELKSSDNSVFLHIRRGDYLSNKNAAAFHAKCSLDYYYKAIQIVKEKIDNPIFYIFSDDILWVKKNFVINESCRFMEKNQNFEDLLLMSYCKHGITANSGFSLMAGWLNQNKDKMIIVPQTWVNDDRININILNSLEQDNFTIIR | 91
Escherichia coli; Escherichia coli Jurua 18/ll; Escherichia coli 180600; Escherichia coli P0304777.1; Escherichia coli P0304777.2; Escherichia coli P0304777.3; Escherichia coli P0304777.4; Escherichia coli P0304777.7; Escherichia coli P0304777.9; Escherichia coli P0304777.10; Escherichia coli P0304777.11; Escherichia coli P0304777.12; Escherichia coli P0304777.13; Escherichia coli P0304777.14; Escherichia coli P0304777.15 | WP_001581194.1 | 486318742 | glycosyl transferase 11 family protein [Escherichia coli] | 26.47 | | MTFIVRLTGGLGNQMFQYALARSLAKKYNARLKLDISYYHNQPHKDTPRTFELNQLCIVDNILNSSSFSEKFLYIYDKLRVKLSKKISLPYFRNIVTPVNFNCIDFAEDKDYYFLGHFQELSNIYSIDESLRSEFKPNQEIMNLAHQSKIYELIKQSRGSVALHIRRGDYVTNKNAAEHHGVIGLSYYVNALSYLENVSEFFDVFVFSDDPEWARKNIKNSRNLFFCDEGNCRYSKKYSTIDMYLMSQCDHFIIANSTYSWWAAWLGNYPSKHVVAPARWNANNSPYPILQNWKAIHE | 92
Firmicutes bacterium CAG: 24 | WP_021914998.1 | 547109632 | protein [Firmicutes bacterium CAG: 24] | 26.44 | | MVGVQLSGGLGNQMFEYALYLKLKSMGKDVRIDDVTCYGAQEKQRVNQLSVFGVSYEHMTKQEYEQITDSSMSPLHRARRLLCGRKDLSYREASCNYDPEILRREPALLLGYFQTERYFADIKDQVREAFTFRNLTLTKESAAMEQQMKECESVSVHIRRGDYLTPANQALFGGICDLDYYHRAVAEIRKRKPDVKFFLFSNDMEWTKEHFCGSEFVPVEGNSEQAGEQDLYLMSCCKNHILANSSFSWWGAWLDNGKDKLVIAPEKWMNGRGCCDIYTDEMIRV | 93
Amphritea japonica | WP_019622926.1 | 518452719 | protein [Amphritea japonica] | 26.42 | | MVKIKIIGGLGNQMFQYAAAKSLAVLNNTRVSANVSVFSNYKTHPLRLNKLNCDCEFDFTRDFRLVLSGFPLLGSAFSKKSMLLNHYVEKDLLFDSSFFDLDDNVLLSGYFQSEKYFSNIRELLIQEFSLDDRLTEAELAINNKIESCNSIAIHIRRGDYITDLSANNIHGICSEEYFEKALNYLDSINVLSDPTTTLFIFSDDILWCKDNLAFKYRTVFVEGSVDRPEVDIHLMSKCKHQVISNSTFSWWGAWLNTNLDKCVIAPLKWFNSLHDSTDIVPKQWMRL | 94
Bacteroides salyersiae; Bacteroides salyersiae WAL 10018 = DSM 18765 = JCM 12988 | WP_005923045.1 | 492689153 | protein [Bacteroides salyersiae] | 26.41 | | MKQTIIMSGGLGNQMFQYALYCSMREKGIRVKIDISLYEFNRMHNGYMLDYAFGLNISHNKINKYSVLWTRLIRSNRAPFLLFREDESRFCDDVFTTYKPYIDGCWIDERYFFNIKKKIISQFSFHNIDQKNLMVANMMKVCNSVSLHIRRGDYLSQSMYNICNESYYKSAIEYIISRVEDSKFFIFSDDPEWCKYFMEKFNVDYEIIQHNFGKDSYKDMYLMTQCKHNIIANSTFSWWGAWLNNNAGKNVVCPSVWINGRDFNPCLEEWYHI | 95
Bacteroides fragilis; Bacteroides fragilis CL03T00C08; Bacteroides fragilis CL03T12C07 | WP_005786334.1 | 492241663 | protein [Bacteroides fragilis] | 26.38 | | MDIILLHNGLGNQMSQYAFYLSKKKNGIHTSYICLSNDHNGIELDKVFGVECQMGCKKIFLLFILRLLMSNRTGFLIRKVNLLFSKIKIKLITENLDYSFHPSFLSASPYCLAFWVGGWHHPQYYSEISSQIKEAFTFKRSLLDERNICIEKRMREPNSVCLHIRRGDYLTGINYELFGKVCNEQYYQKAIDYIEGKLSDICYYVFSNDMEWAKKILLGKNAVFVDWNRGEESWKDMYLMSKCSNLIIPNSTFSWWAAWLCEHPVNIVCPKLFVYGDEQSDIYLDNWHKIE | 96
Bacteroides nordii; Bacteroides nordii CL02T12C05 | WP_007486843.1 | 494751435 | protein [Bacteroides nordii] | 26.37 | | MMGIEKTNMVIVRLWGGIGNQLFQYSFGEFLREKYQVDVIYDIASFGKSDKLRKLELSVVVPGIPVTTDISFSKYVGTKNRLLRFIYGLKNSFIEEKYFSDEQLFKYLSKRGDVYLQGYWQKTIYAETLRRKGSFFLSQEEPIVLHTIKAKIQEAEGAIALHVRRGDYFSSKHINTFGVCDAHYYEKAVDIMRGRVSNAMIFVFSDDLDWVRRYVNLPTNVIYVPNYDIPQYWYIYLMSLCRHNIISNSSFSWWGAFLNMNTNKIVVSPSKWTLNSDKTIALDEWFKI | 97
Butyrivibrio proteoclasticus; Butyrivibrio proteoclasticus B316 | YP_003829743.1 | 302669783 | glycosyl transferase 11 [Butyrivibrio proteoclasticus B316] | 26.37 | | MECSMIIIKFCGALGNQLFQYALYEKMRILGKDVKADISAFGDGNEKRFFYLDELGIEFNIASADEIAEYLNRKTIRFVPGFLQHRHYYFEKKPYVYNKKILSYDDCYLEGYWQNYRYFDDIKDELLKHMKFPCLPLEQKKLAEKMENENSVAVHVRMGDYLNLQDLYGGICDADYYDRAFSYIEGNISNPVYYGFSDDVDKASALLAKHKINWIDYNSEKGAIYDLILMSKCKNNIIANSSFSWWGAYLEYNNGKVVVSPNRWMNCFENSNIAYWGWISL | 98
Prevotella ruminicola; Prevotella ruminicola 23 | YP_003574648.1 | 294674032 | family 11 glycosyl transferase [Prevotella ruminicola 23] | 26.33 | | MRIVKVLGGLGNQMFQFALYKALQKQYPEERVLLDLHCFNGYHKHRGFEIDSVFGVTYEKATLKEVASLAYPYPNYQCWRIGSRILPVRKTMLKEEPNFTLEPSALSLPDSTYYDGYWQHEEYFMHIREEILSTYAFPAFDDERNKTTAQLAASTNSCSIHIRRGDYLTDPLRKGTTNGNYVIAAIKEMQQEVKPEKWLVFSDDIAWCQQHLASTLDATNTIYIDWNTGANSIHDMHLMALCRHHIIANSSFSWWGAWLSQQDGITIAPSNWMNLKDVCSPVPDNWIKI | 99
Prevotella salivae; Prevotella salivae DSM 15606 | WP_007135533.1 | 494223898 | protein [Prevotella salivae] | 26.33 | | MKIIKIIGGLGNQMFQYALAIALQQQYKDEEIRLDLNCFRGYNKHQGYLLDEIFGRRFRAASLQEVARLAWPYPHYQLWRVGSRVLPRRQTMVCEPADGSFSPDVLTLEGNRYYDGYWQDERYFKAYRKEIIEAFKFSPFVGDGNRHVENMLRNERFASLHVRRGDYLNDALYQNTCGIDYYQRAISQMNAMANPSCYFIFSDDIAWCKTHIEPLCEGHRPYYIDWNKGKEAYRDMQLMALCKYHIIANSSFSWWGAWLNDAEDGITIAPQQWYSHGNKPSPASESWIKV | 100
Lachnospiraceae bacterium COE1 | WP_016299568.1 | 511045640 | protein [Lachnospiraceae bacterium COE1] | 26.3 | | MNIVRISDGLGNQMFQYAYARKISILSRQRTYLDIRFINNEDLVKKGNHVQFRKKLGHRKYGLSHFNVSLQIADLKMLSHWEYLIQSNCMQQLIYSLSMQDKWIWRYRHEEVNYDGMLSKVELLFPTYYQGYFFALKYYDDIKHILQHDFSLKDKMKLLPELRDALYNRNTISLHVRRGDFLEINRDISGSEYYEKAVQMIGSKVESPIFLIFSDDIEWVKEHIRIPNDKIYVSGIGYEDYEELTIMKHCKHNIIANSTFSYWAAYLNSNKDKIVICPKHWRERIIPKDWICI | 101
Bacteroides dorei; Bacteroides dorei 5_1_36/D4 | WP_007842931.1 | 495118115 | alpha-1,2-fucosyltransferase [Bacteroides dorei] | 26.28 | | MIVVNVNAGLANQMFHYAFGRGLEAKGWNIYFDQTNFKPRKEWSFENVQLQDAFPNLGLKMMPEGKFKWICVNNTNKLSKGLHLAMINLHNLIGDEKYIFETTYGYDPDIEKEITKNCILKGFWQSEKYFAHCKDDIRKQFSFLPFDEEKNIVIMNKMVKENSVAIHLRKGADYLKSELMGKGLCGVEYYIKAIEYIKKNIDNPVFYVFTDNPVWVKNNLPKFDYILVDWNEVAGKKNFRDMQLMSCAKHNIIANSTYSWWGAWLNPNPNKIVIGPAKFFNPINNFFSSSDIMCEDWVKI | 102
Roseobacter sp. SK209-2-6 | WP_008210047.1 | 495485361 | alpha-1,2-fucosyltransferase [Roseobacter sp. SK209-2-6] | 26.28 | | MLSKDPGMITTRLHGRLGNQMFQYAAGRALAARLGVPLALDSRGAKLRGEGVLTRVFDLPLAQPLSLPPLKQDAPLRYAAWRLTGRTPRFRREQGLGYNPAFETWGDDSYLHGYWQSEAYFDSIADQIRQDFTFPEFSNSQNREMAQRIAGSTAISLHVRRGDYVALAAHVLCDQAYYEAALTRILEGVEGSPTVYVFSDDPNWAKENLPLPCEKVVVDFNGPDTDFEDMRLMSLCQHNIIGNSSFSWWAAWLNTHNEKRVAGPAHWFGNPKLQNPDILPESWLKISV | 103
alpha proteobacterium SCGC AAA076-C03 | WP_020056701.1 | 518900826 | protein [alpha proteobacterium SCGC AAA076-C03] | 26.26 | | MIYSRIRGGLGNQLFQYCVARSLADNLGTSLGLDVRDFNENSPYLMGLKHFNIRADFNPPGMIEHKKNGYFRYLIDVVNGKQKFVYKEPHLNFDKNIFSLPNSSYLKGYWQTEKYFIKNKVNILNDLKIISHQSDKNKTISSKIANNTSVSLHIRRGDYISNSAYNSTHGTCSLAYYTNAVNFLVNKIGGNFKVFAFSDDPEWVSSNLKLPVDICFVKNNSSEYNYEDLRLMSECNHNIIANSSFSWWGAWLNTNHNKTVITPCKWYADNSTKNADITPSNWIKI | 104
Helicobacter bilis; Helicobacter bilis WiWa | WP_004087499.1 | 490188900 | protein [Helicobacter bilis] | 26.26 | | MGGGGQDLRLFELMLYNISLPLCFDYKTLVKYFYSNDKSLKYNFPLQYIRYATRSKYHKLYWLALKHYKYFYDEDPQGDNIVKMYLNNSLEKHAYPFGYFQNLIYFDEIDSIIREEFCLKIPLKPHNQALKEKIEKTENSVFLHVRLGDYLKMEATDGGYVRLGKTYYQSALEILKTRLGQPHIFIFSNDIEWCEKNLCNLLDFTGCHIEFVKANGEGNAAEEMELMRACKHAVIANSTFSWWASYLIDNPDKQIIMPTQVFNDTRRIPKSNMLAKKGYILIDPFWGMHSIV | 105
Ralstonia sp. GA3-3 | WP_010813809.1 | 498513378 | glycosyl transferase family protein [Ralstonia sp. GA3-3] | 26.26 | | MIVTRVIGGLGNQMFQYAAGRALARRLGVPLKIDSSGFADYPLHNYGLHHFALKAVQAGDREIPSGRAENRWAKALRRFGLGTELRVFRERGFAVDPEVMKLPDGTYLDGYWQSESYFAEMTQELRRDFQIATPPTSENAEWLARIGGDEGAVSIHVRRGDYVTNASANAVHGICSLDYYMRAARYVAENIGVKPTFYVFSDDPDWVAGNLHLGHETRYVRHNDSARNYEDLRLMSACRHHIIANSTFSWWGAWLNASEKKVVIAPAQWFRDEKYDTRDLLPPTWTKL | 106
Bacteroides ovatus; Bacteroides ovatus 3_8_47FAA | WP_004303999.1 | 490431888 | protein [Bacteroides ovatus] | 26.25 | | MVVVYIAAGLANKMFQYAFSRGLMSHGLDVFLDQTSFQPEWSFEDIALEEVFPNIEIKNAPNNMFSLAYKKDLLSRIYRRMSAFFPNNRYLMERPFIYDELIYKKATNNCIFCGLWQTELYFNFCERDVRRNFVFTPFQDDQNIKLAEKMKNENSVAIHIRKGADYLKRNIWDGTCSVEYYNQAINYLKEHVSNPVFYLFTDNPEWVEENLKNIDYKLVDWNPVSGKQSYRDMQLMSCAKHNIIANSTYSWWGAWLNNNPQKIVVAPKIWFNPKIEKAPYIIPDRWIRL | 107
Loktanella vestfoldensis | WP_019955906.1 | 518799952 | protein [Loktanella vestfoldensis] | 26.23 | | MIITKLIGGLGNQMFQYAAGRSLAMRHGVPLLLDITELRSYPKHQGYQFEDVFAGRFEIAGLIPLIRVLGRKARKVPKTVAVVSPKWPPMGDHVWVRQRTHDYDAAFESIGADCYLSGFWQSEKYFATIAPQIRESFRFKEALTGANAAIASRMKEAPSAAIHIRRGDYVTDKGAHAFHGLCAWDYYDAAIDHISRHEPDARFFVFSDDVVAAQERFANRQRAEVVAVNSGRHSYRDMMLMAQCKHQIIANSTFSWWAAWLNQNPDKIVVAPGTWFSGNDGQIKDIYCKDWIVI | 108
Flavobacterium sp. ACAM 123 | WP_016991189.1 | 515558304 | protein [Flavobacterium sp. ACAM 123] | 26.14 | | MDVVIIFNGLGNQMSQYAFYSQKKKINNSTYFVPFCKDHNGLELETVFSLNTKETLIQKSLYILFRILLTDRLKIVSDPLKWILNLFKCKIVKESFNYNYNPEYLKPSKGITFYYGGWHAEKYFAKENQQIKSVFEFTGDLGKINKEHVKDIASTNAVSLHVRRGDFMNEANIGLFGGVSTKAYFEGAIKLIATKVDHPHFFVFSNDMDWVKENLSMDTVTYVTCNSGKDSWKDMCLMSLCQHNIIPNSTFSWWGAWLNKNPHKIVVCPSRFLNNDTYTDIYPDSWVKISDY | 109
Bacteroides fragilis; Bacteroides fragilis 3_1_12 | WP_005779407.1 | 492219620 | glycosyl transferase family 11 [Bacteroides fragilis] | 26.1 | | MMKLVRMTGGLGNQMFIYAFYIQMKTIFPELRIDMSEMKKYKLHNGYELEDVFSIRPQTISAHKWLKRVIVYAFFSIIREKSEEELSIHKYTQHKRWPLVYYKGFFQSELFFKESSDTIRDIFSFNTENANFRTKEWAKIIKEQRSSVSIHIRRGDYTSAKNKIKYGNICTEEYYQKAISIILKKEPKAFFHIFSDDVEWTKAHLKIHHLPHQYISWNKGPDSWQDMMLMSLCRHNIIANSSFSWWGAWLNAYKDKTVIAPSRWSNVKKTPHILPESWISIDI | 110
Spirosoma panaciterrae | WP_020598002.1 | 522086793 | protein [Spirosoma panaciterrae] | 26.09 | | MIISRVTSGLGNQLFQYAAARSLSLRNKTAFYVDLSYYLYEYPDDTSRSFKLGFFSVPYRILQESPVEYLSKSTKLFPNRSLRPFFLFLKEKQFHFDPTILQAHAGCVIMEGFWQSECYFRDHAEIIRRELQLSKSPSSEFEGYHQQIQATPVPVSVHVRRGDYVNHPEFSKTFGFIGLDYYKTAIRHLTKTIKNPHFYVFSDDKEWARANLPLPTDSVFVTNTGPSGDVADLVLMSTCHHHIIANSSFSWWGAWLNPNPDKLVITPKLWYKNQPTWNTKDLLPPTWVSL | 111
uncultured bacterium | EKE06672.1 | 406985982 | glycosyl transferase family 11 [uncultured bacterium] | 26.09 | | MIITKLTGGLGNQLFQYAIGRNLIYINGSDLKLDVSEYDVSNKGNFRHYALDKFNTIQNFASKKETNNFKFGVFKKWLYKSGIVKNKNYFLEKKFNFDKEILKIKDNAFLQGYWQSEKYFIGIRDILLQEFSLKENIELKFGEILKEINESNSVSIHVRRGDYVKNPKNLSFHGVCSPKYYSESTSKIASLIEKPVFFVFSDDIEWVKENLNITFPVVYLSGIKNIKSYEELVLMSKCKHNIIANSSFSWWGAWLNTNQKKIVIAPKRWFNDVKLDTTDLIPENWIRI | 112
Thermosynechococcus elongatus; Thermosynechococcus elongatus BP-1 | NP_681784.1 | 22298537 | alpha-1,2-fucosyltransferase [Thermosynechococcus elongatus BP-1] | 26.07 | | MIIVHLCGGLGNQMFQYAAGLAAAHRIGSEVKFDTHWFDATCLHQGLELRRVFGLELPEPSSKDLRKVLGACVHPAVRRLLAGHFLHGLRPKSLVIQPHFHYWTGFEHLPDNVYLEGYWQSERYFSNIADIIRQQFRFVEPLDPHNAALMDEMQSGVSVSLHIRRGDYFNNPQMRRVHGVDLSEYYPAAVATMIEKTNAERFYVFSDDPQWVLEHLKLPVSYTVVDHNRGAASYRDMQLMSACRHHIIANSTFSWWGAWLNPRPDKVVIAPRHWFNVDVFDTRDLYCPGWIVL | 113
Colwellia piezophila | WP_019028421.1 | 517858213 | protein [Colwellia piezophila] | 26.03 | | MKIVKIAGGFGNQLFQYAFYLALDKKYAEQVCLDSLDMAKYRLHNGYELEGIFKLDARYCTEEQRIIVRKDNNIFTKLLSSLKKKLGNNKNYILEPKQEHFTFHEKSFGQANTPTYYKGYWQDVKYLENIEEELKSSLVFPEFELGKNIELANFISSNSSVSLHVRRGDYVQHKAFGGICDLSYYQRAVEQINTLVKDPIFIVFSDDIQWCKDNLNLEKAKFVDWNIGENSFRDMQLMTLCKHNIIANSSFSWWGAWLNANDDKNVICPDKWVHYTSATGVLPSEWIKIKASV | 114
Prevotella maculosa | WP_019966794.1 | 518810840 | protein [Prevotella maculosa] | 26 | | MKIVKIIGGLGNQMFQYALAIALQERWKDEEIKLDLHGFNGYHKHQGYQLDMLFGHRFEAATLTDVAQLAWPYPHYQLWRVGSRLLPKRRSMLCEPSKGLLPSDVLKQKGSLYYDGYWQDERYFRAIRPQIMAAFKFPDFTDRRNLETEKRLKASEAVSIHVRRGDYLDDVLFQGTCNIAYYQRAIARLCQLKTPVFCIFSNDMAWCKVHIEPLLHGKEILYVDWNRGKESYRDLQLMTLCRHHIIANSSFSWWGAWLSKAEDGITIAPRHWYAHDAKPSPAAERWIKV | 115
Salmonella enterica; Salmonella enterica subsp. enterica serovar Worthington str. ATCC 9607; Salmonella enterica subsp. enterica serovar Cubana str. CFSAN001083; Salmonella enterica subsp. enterica serovar Cubana str. CFSAN002050; Salmonella enterica subsp. enterica serovar Cubana str. CVM42234 | YP_008261369.1 | 525860034 | fucosyltransferase [Salmonella enterica subsp. enterica serovar Cubana str. CFSAN002050] | 25.99 | | MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFADEKEKIKLLRKFKRNPFPKQISEILSIALFGKYALSDRAFYTFETIKNIDKACLFSFYQDADLLNKHKQLILPLFELRDDLLDICKNLELYSLIQRSNNTTALHIRRGDYVTNQHAAKYHGVLDISYYNHAMEYVERERGKQNFIIFSDDVRWAQKAFLENDNCYVINNSDYDFSAIDMYLMSLCKNNIIANSTYSWWGAWLNKYEDKLVISPKQWFLGNNETSLRNASWITL | 116
Bacteroides sp. 3_2_5 | WP_008659600.1 | 495935021 | protein [Bacteroides sp. 3_2_5] | 25.94 | | MKKVIFSGGLGNQMFQYAFYLFLKKKGIKAVIDNSLYSEFKMHNGFELIKVFDIKESIYRTYFLKVHLIFIKLLMKIPPVRKLSCKDDVIPIGDHEFDPPYARFYLGYWQSKKIVNYVIEELRAQFIFRNIPQMTIEKGDFLSSINSVSIHIRRGDYMGIPAYQGICNEIYYERAISFMKEHFLNPRFYVFSNDSIWAKLFLEKFDIDMEIIVTPPIYSYWDMYLMSRCRNHIIANSTFSWWAAVLNINKDKIVISPTIFKKDECIDIIFDDWVKISNI | 117
Clostridium sp. WP_022124550.1 547662453 protein 25.86 MIMLQMTGGMGNQMFTYALYRSLRQKGKEVCIEDFTHYDTPEKNCLQTVFHLDYRKADREVYQRLTDSE 118 CAG: 510 [Clostridium sp. 
PDFLHKVKRKLTGRKEKIYQEKDAIIFEPEVFQTDDVYMIGYFQSGRYFEKAVFDLRKDFTFAWNTFPEKAKK CAG: 510] LREQMQAESSVSLHIRRGDYMNGKFASIYGNICTDAYYEAARRYMKEHFGDCRFYLFTDDAEWGRQQESE DTVYVDASEGAGAYVDMALMSCCRHHIIANSSFSWWGAWLDENPDKTVIAPAKWLNISEGKDIYAGLCN CLIDANGSVQGE Rhodopirellula WP_008665459.1 495940880 glycosyl 25.86 MIVTRLIGGLGNQLFQYAFGHSLARSTYQTLLIDDSAFIDYRLHPLAIDHFTISASRLSDADRSRVPGKFLRTP 119 europaea; transferase VGRALDKVSRFVPGYQGVLPVRREKPFGFRESLLARESDLYLDGYWQSEKFFPGLRGSLREEFQLREQPSETT Rhodopirellula family protein RRLSAQMKSENSVAIHVRRGDYVTSAKAKQIYRTLDADYYRRCLLDLAAHETDLKLYLFSNDVPWCESNLDV europaea SH398 [Rhodopirellula GIPFTPVQHTDGATAHEDLHLIAQCRHVVIANSTFSWWGAYLGQLHPTRRVYYPEPWFHPGTLDGSAMG europaea] CDDWISEASLEEQSSLKSSRRAA uncultured EKD23702.1 406873590 glycosyl 25.82 MIIVKLKGGMGNQMFQYAIGRNLATKLGTQLRLDLTFLLDRSPRKDFVFRDYDLDIFALDVAFAGPTDLKPF 120 bacterium transferase TQFRISHLTKIYNIFPRLLGRPYVISEPHFHFSEAILKSSDNVYLDGYWQSEKYFKEIENSIRDDFKFRQPLE family protein GRAAEMAAQIKNEDRAVCLNVRRADFVTSKKAQEFHGFIGLDYYQKAVDLLVSKVGPLHLFIFSDDVDWCAAN [uncultured LKFNYPTTFVTKDYSGKKYEAYLQLMTLCRHYIIPNSTFAWWGAWLNSDPNKIVIAPKQWFKEASIDTTDIIP bacterium] STWIRL Bacillus WP_000587678.1 446510160 protein 25.74 MIIVKLKGGLGNQMFQYALGKSLALYYDKPLKIDADYIKNNEGYVPRDFSLSKFNIELDLYQEADKERVGFILK 121 cereus; Bacillus [Bacillus NNFLAKKLRNYFLKKGKYKGKYIIENPDNLGLFKKELFENHNESMYIDGYWQSYLYFNNIRECLIKEFNLKPEY cereus AH1271 cereus] TKEMTEIMQRINETNSVAVHIRRGDYVKLGWTLDTTYYKKAIAEIVKNVDNPKFYVFSDDTDWVRSNLQEL DNAVFIGECNLFDYQELWLMSTCKHNIISNSTFSWWGAWLNQNDHQVVVSPSAWINGMSVETTSLIPDS WKRV Firmicutes WP_022499937.1 548309386 protein 25.74 MDIIRMEGGLGNQLFQYALYRQLQFMGRTVKMDVTTEYGREHDRQQMLWAFDVHYEEATQEEINRLTD 122 bacterium [Firmicutes GFMDLPSRIRRKLTGRRTKKYAEADSNFDPQVLLKTPVYLTGYFQSEKYFKDVEGILHTELGFSDRIYDGISEV CAG: 95 bacterium FADQIRNYQKQIRETESVSLHVRRGDYLEHPEIYGMSCFMEYYQAGVRYIRERHPDAEIFVFTNDPVFTEKW CAG: 95] LQENFLGDFTLIQGTSEETGYLDLMLMSQCKHQIMANSSFSWWGAWLNPNKDKIVVAPEPWFGDRNFH DIYTEEMIRISPRGEVKKHG Prevotella oris; WP_004374901.1 490508875 alpha-1,2- 25.74 
MIAATLFGGLGNQMFIYATVKALSLHYQVPMAFNLNHGFANDYKYHRKLELCKFNCQLPTAKWITFDYRG 123 Prevotella oris fucosyl- ELNIKRISRRIGRNLLCPNYQFVIEEEPFHYEKRLFEFTNKNIFLEGYWQSPCYFENYSKEIRADFQLKVPLSK F0302 transferase EMLEEIYALKATGKTLVMLGIRRYQEVEGRDICTYKLCDKEYYIKAITYIQERIPNALFVVFTQDKEWATTHLP [Prevotella KGAEFYFVKDKQDEYATVADMFLMTQCTHAIISNSTFYWWGAWLQCFTKNHIVIAPDSFINSDCVCKEWIIL oris] KRNSLC Escherichia AAO37719.1 37528734 fucosyl- 25.73 MYSCLSGGLGNQMFQYAAAYILQRKLKQRSLVLDDSYFLDCSNRDTRRRFELNQFNICYDRLTTSKEKKEISII 124 coli transferase RHVNRYRLPLFVTNSIFGVLLKKNYLPEAKFYEFLNNCKLQVKNGYCLFSYFQDATLIDSHRDMILPLFQINED [Escherichia LLHLCNDLHIYKKVICENANTTSLHIRRGDYITNPHASKFHGVLPMDYYEKAIRYIEDVQGEQVIIVFSDDVK coli] WAENTFANQPNYYVVNNSECEYSAIDMFLMSKCKNNIIANSTYSWWGAWLNTFEDKIVVSPRKWFAGN NKSKLTMDSWINL Leeia oryzae WP_018150480.1 516890767 protein 25.71 MIIVKIIGGLGNQMMQYAFAHACAKRLGVPFKLDITAFESYKLWPYGLHNFEITAPIASLEEIEHAKSMGVIT 125 [Leeia ETSFRFDDSLVSAVKDGMYLDGYWADYRYSESVWGELKPVFTLMDPLTPEQQALAMNLSAPNAVALHVR oryzae] RGDYVTNPNCFLLPQQYYRDAIKLVLDQQPDAVFYCFSDDPDWVEAHLDIPAPKVVVRGQGIDNGFVDMI LMSKARHRIVANSTFSIWASRLADQDGLTIVPSQFFRKDDPWLLQVYGEVLQPCYPPQWRVVDVTGDGK KEAENTSTALLQIAGGDVRGRKLRIGVWGFYEEFYQNNYIFLNKNAPIGHELLKPFNQLYQYGQAHNLEFVT LDLVADLSTLDAVLFFDAPNMRSPLVSSVMQLDIKKYLCLLECELIKPDNWQQSLHELFTRIFTWHDGLVDN HRYIKVNYVTDLMPWIESAQSLTAPFEETARKGYLQKKLICNISGNKLVSHPFELYSKRIEVIRWFESHHPEH FDLYGMGWSASDYPSYKGKIDDKLEVLKGYRFSLCYENAKELPGYITEKIIDCFKAGVVPVYSGAPNIADWIP DNCFIDSGKFPDTDALYTYLISMTEEVHADYLENIRQFFLGGKAYPFSADAFINTITRTIVQDCLFPHERTDV SVVVPNYNHGNFVVSAITSALNQNVSVELLVLDNASTDDSWSQLQFFADYPQVRLIRNRWNIGVQHNWNH ATWLATGRYVVMLSADDLLLPGHLEQAVKRLDENPASSLYYTPCLWINEHDQPLGTLNHPGHLESDYVGG RDEISDLLKFDSYITPSAAVIRRETLNRIGSMNLHLKGAIDWDLWIRIAEISPAFIFRKQPGVCYRQHSGNNSV DFYASTAPLEDHIRIVESIIDRKVAVKYLLKAKEEIIAHLDNRASSYPENQIQHLLSRINNIKDYLRKGAGPVI SVIIPTKNRPGLLANALESLTYQTFKDFEVVIHNDGGCDIGGIVDFFSDQLQISYVRSSQSGGAAASRNRALKL AKGRIIAYLDDDDVYLDSHLEKLVDAYKGRSEKFIYTNCEYLIQERKEGRLIELGRERRYAGISYSRAQLLVSN 
FIPTPTWSHTKELIDTIGDFDESLEILEDWDFLLRASKVTEFYQVNATTVEVRSDRSRDDHTLRANADKLLAYH QKIYAKHPVENESILANRQSLINSLSNRQDVTPKNENSYQGWVNARQPNELAVQILAERMMLQWSKQYQFMI VMWKQSQQNLLANTIDSFCQQLYSGWKLIVISDFEAPDESFINNEVLGWLTLETVEDENLLTQAFNGVLA EVPSDWVTILPVGTRLTSTALLKVGDRLLLNGGACVIYTDHDYVSDDGMIKDPVLKPAFNLDMLRSQDYIGS SIFFRTDSLAAVGGFASFPGARTYEACFRMLDNYGPQTIEHLPEPVMTFPENQPENSLRVAAMQLALEEHL HRNNISASIEEGYVTGTFLVQYHHSEQPFVSIIIPNKDKHEFLAPCIETLMKVTQYPAFEVIIVDNQSTDPDTL SYYEEIESRFANNVKVIQYDNPFNFSAQCNLGAESARGDFILFLNNDTEIVQANWLERMMQHAQRNDVGV VGARLVFPETVTIQHAGIVLGGKYPDEVFQFPYMNFPVDKDVSLNRTKVVQNYSAVTGACLLVRKSLYQQV GGMNEQNLAVLYGDVDLCLRIRQLHKSVVWTPFSTLVHHTGKTLNSNSDHEKHLMMVIQTRQEREYMLS HWLDIIANDPYYHRLLDKSECNGTIDCTHTPLWDDIPSARPRLQGMALVGGSGEYRVNMPFRTLERSALAE IVLSNMTSKARLPSITELARNAPDVFVVQNALADEFIRMLEMYKKYLPSVFRIQMLDDLLTEIPDASSFKRHF QKNWRDAKARLRKSLKFCDRLIVSTEPLRTFAEDMIDDIIVVPNMLERSVWGDLVSKRRAGKKPRVGWVG AQQHAGDLALMTDVVKATGHEVDWVFQGMCPDDIRPYVAEVNTEWLTYDKYPQGIAALNLDLAIAPLEI NAFNEAKSNLRLLEYGALGWPVICTDIYPYQTNNAPVCRVPNDASAWIEAIRSHIADLDATAQKGDQLRQ WVHDHYMIEDHAQEWLSALTRPAGK Desulfovibrio WP_005984173.1 492830219 Glycosyl 25.68 MFQYAAARALSLRHSASLAADLTWFSQQFDVQTTPREYALPAFRLNLPEADKRIVATFRLNPTELRIVSFLR 126 africanus; transferase HRICFPSRFLPRHITELSFDYWDGFRDILPPAYLDGYWQSERYFSDYPDIIRADFSMLSISEQAAWMSAKIAS Desulfovibrio family 11 VQDSISLHIRRGDYVNSLATRKAHGIDTERYYAKALEWIADRIGAATIFAFSDDPRWVRANFDFGKHKGIVV africanus PCS [Desulfovibrio DGSWTAHEDMHLMSLCSHHIIANSSFSWWGAWLSTSQGITIAPKSWFSNPHIWTPDVCPATWERIPC africanus] Akkermansia WP_022196965.1 547786341 glycosyl 25.66 MAKGKIIVMRLFGGLGNQLFQYAFLFALSRQGGKARLETSSYEHDDKRVCELHHFRVSLPIEGGPPPWAFR 127 muciniphila transferase KSRIPACLRSLFAAPKYPHFREEKRHGFDPGLAAPPRRHTYFKGYFQTEQYFLHCREQLCREFRLKTPLTPEN CAG: 154 family 11 ARILEDIRSCCSISLHIRRTDYLSNPYLSPPPLEYYLRSMAEMEGRLRAADAPQESLRYFIFSDDIEWARQNLR [Akkermansia PALPHVHVDINDGGTGYFDLELMRNCRHHIIANSTFSWWAAWLNEHAEKIVIAPRIWFNREEGDRYHTDD muciniphila ALIPGSWLRI CAG: 154] Dysgonomonas WP_006843524.1 493897667 protein 25.66 
MKIVKLQGGLGNQMFQYAIARTLETNKKKDIFLDLSFLRMNNVSTDCFTARDFELSIFPHLRAKKLNSLQEK 128 mossii; [Dysgonomonas FLLSDRVRYKFIRKIANINFHKINQLENEIVGIPFGIKNVYLDGFFQSESYFKHIRFDLIKDFEFPELDTRNEA Dysgonomonas mossii] LKKTIVNNNSVSIHIRRGDYVHLKNANTYHGVLSLEYYLNCIKRIGEETKEQLSFFIFSDDPEYASKSLSFLPN mossii MQIVDWNLGKNSWKDMALMLACKHHIIANSSFSWWGAWLSERNGITYAPVKWFNNESQYNINNIIPSDWVII DSM 22836 Prevotella WP_004372410.1 490506359 glycosyl 25.66 MDIVLIFNGLGNQMSQYAFYMSKKKFVPQSKCMYYKGASNNHNGSELDKLFDIKYSETFFCKLILLLFKLYE 129 oris; transferase, NIPRLRKYFHILGINIVSEPQNYDYNESILKKKTRFGITLYKGGWHSEKYFLANKQDVLNTFSFKIAKEDKNFI Prevotella family 11 DLAKSIEEDTNSVSLHVRRGDYLNISPTDHYQFGGVATTNYYKNAVSYMLKRNKQAHFYIFSDDITWCKAEYK oris [Prevotella DLMPTFIECNKKNKSWRDMLLMSLCTNHINANSTFSWWGAWLSTKNGITICPTEFIHNVVTRDIYPETWV F0302 oris] QL Pseudogul- WP_008952440.1 496239055 glycosyl 25.66 MIIVRLMGGMGNQLFQYATAFALSKRKSEPLVLDTRFFDHYTLHGGYKLDHFNISARILSKEEESLYPNWQA 130 benkiania transferase NLLLRYPIIDRAFKKWHVERQFTYQDRIYRMKRGQALLGYWQSELYFQEYRKEISAEFTLKEQSSVTAQQISV ferrooxidans family 11 AMQGGNSVAVHIRRGDYLSNPSALRTHGICSLGYYNHAMSLLNERINDAQFYIFSDDIAWAKENIKIGKTSK 2002; [Pseudogul- NLIFIEGESVETDFWLMTQSKHHIIANSTFSWWGAWLANNTDEQLVICPSPWFDDKNLSETDLIPKSWIRL Pseudogul- benkiania NKDLPV benkiania ferrooxidans] ferrooxidans Salmonella WP_000286641.1 446208786 protein 25.66 MYSCLSGGLGNQMFQYAAAYILKQYFQSTTLVLDDSYYYSQPKRDTVRSLELNQFNISYDRFSFTDEKEKIKL 131 enterica [Salmonella LRKFKRNPFPKKISEILSIALFGKYALSDSAFYAVETIKNIDKACLFSFYQDADLLNKHKQLILPLFELRDDLL enterica] DICKNLDVYPLILRNNNTTALHIRRGDYLTNQHAAKYHGVLDTSYYNNAMEYVERERGKQNFIIFSDDVKWAQ KAFLGNENCYIVNNGDYDYSAIDMYLMSLCKNNIIANSTYSWWGAWLNKSEDKLVISPKQWFLGNNETSL RNASWIIL Carnobacterium YP_008718688.1 554649642 glycosyl 25.59 MLIVKVYGGIGNQMFQYSFYKYLQKNNDDVFLDISDYKVHNHHNGFELIDVFNIEVKQADMSKFKGHVSS 132 sp. 
WN1359 transferase KNSIFYRLTSKLFKRNILGYSEFMDSNGISIVRNEKILTDHYFIGFWQDVLYLQSVEEEIKEAFNFKNVAIGK family 11 QNLELISLSESVESVSVHIRKGDYANNSDLSDICDLEYYEEAMKIIDSKVSEPLYFIFSDDIEWCKQKFGKRDN [Carnobacterium LIYVDWNIAKKSYIDMLLMSKCKHNIIANSTFSWWGAWLNNNSKKIVICPKTWDRKKNENHLLLNDWIAI sp. WN1359] Prevotella sp. WP_021964668.1 547227670 protein 25.58 MMKIIVNMACGLANRMFQYSYYLFLMHKGYNVKVDFYNSAKLAHEKVAWNDIFPKARIEQASFSDILKSG 133 CAG: 1185 [Prevotella sp. GGSDVISKIRRKYLPFLSSVVNMPTAFDANLPVENKKLQYIIGVFQNANMVEAVEEDVKRCFKFQPFTDERN CAG: 1185] LKLQNEMQSCESVAIHVRKGKDYAQRIWYQNTCPIEYYQNAIRLISEKVNNPKLYVFTDNPEWVKEHFKDF PYTLVEGNPASGWGSHFDMQLMSVCKHNIISNSTYSWWSAFLNVHNEKIVIGPKVWFNPDSCSEFTSERI LCKDWIAV Selenomonas sp. WP_009645343.1 497331130 glycosyl- 25.58 MFQYAMASSVARRAGEILKLDLSWIRQMEKKLSADDIYGLGIFSFDEKFSTSNEVQKFLPSGKFSAKIYRAVN 134 CM52 transferase, RRMPFSWRRVLEEGGMGWHPQIMEIRRSVYFYMGYWQSEKYFSDFIQEIRKDFTFREEVRQSIEERRPIVE family 11 KIRKSDAVSLHIRRGDYAQNPALGEIFLSFTMQYYIDAARYISERVKTPVFFIFSDDIPWAKENLPLPYEVCYI [Selenomonas DDNIQTNEREIGHKSKGYEDMYLMTQCQHNIIANSSFSWWGAWLNHNPNKIVVAPKKWCNGSFNYADIV sp.CM52] PEQWVKL Bacteroides WP_007486621.1 494751213 protein 25.57 MEIVFIFNGLGNQMSQYALYLSKRNLGCKVRYAYNIRSLSDHNGFELDRVFGITYPNNLFNKCINIIYRLLFAN 135 nordii; [Bacteroides KYLFLVQKMIYVLRQMNVYSIKEKDNYDYDYKILTRHKGIVLYYGGWHSEKYFLSNADIIKDKFRFNISKLNSE Bacteroides nordii] SLVLYHRLSSLNAVALHVRRGDYMAPEHYNVFGCVCGIEYYKAAIQYIQSQILNPVFIVFSNDIEWVKENITGI nordii QMIFVDFNKKENSWMDMCLMSCCEHNIISNSTFSWWGAWLNNNKNKIVVCPKYFMSNIDTKDIYPESW CL02T12C05 IKI Parabacteroides WP_005635503.1 491855386 protein 25.54 MKKKDIILRVWGGVGNQLFIYAFAKVLSLITDCKVTLDIRTGFANDGYKRVYRLGDFSISLLPALRFYTLLSFA 136 merdae; [Parabacteroides QRKMPYIRHLLAYKFDFFEEDQKYPLETLDSFFKIYSDKNLYLQGYWQYFDFSSYRDVLLKDLRFEVEINNTYL Parabacteroides merdae] YYSDLIEKSNAVAIHFRRIQYEPVISIDYYKKAIKYISENVENPTFFIFSDDINWCRENLSINGICFFVENFKD merdae ATCC ELYELKLMSQCNHFIIANSTFSWWGAWLSVNADKKVIMPDGYTDVSMNGSIVHI 43184; Parabacteroides merdae CL09T00C40 Butyrivibrio sp. 
WP_022768139.1 551024004 protein 25.51 MIIIQLKGGLGNQMFQYALYKELKHRGRDVKIDDESGFIGDKLRVPVLDRFGVEYDRATKDEVIALTDSKMD 137 NC2007 [Butyrivibrio IFSRIRRKLTGRKTFRIDEMEGIFDPKILETENAYLVGYWQSEKYFTSPEVIEQIQEAFGKRPQEIMHDSVSWS sp. NC2007] TLQQIECCESVSIHVRRTDYMDAEHIKIHNLCSEKYYKNAISKIREEHPNAVFFIFTDDKEWCKEHFKGPKFIT VELQEGEFTDVADMLLMSRCKHHIIANSSFSWWSAWLNDSPEKIVIAPSKWINNKKMDDIYTERMTKVAI Bacteroides WP_004302233.1 490430100 protein 25.5 MIVVYSNAGLANRMFHYALYKALEVKGIDVYFDEKSYVPEWSFETTTLMDVFPNIQYRESLQFKRASKKTFL 138 ovatus; [Bacteroides DKIVIHCSNLFGGRYYVNYRFKYDDKLFTKLETNQDLCLIGLWQSEKYFMDVRQEIQKCFQYRSFVDDKNVK Bacteroides ovatus] TAQQMLSENSVAIHVRKGADYQQNRIWKNTCTIDYYRLAIDYIRMHVQNPVFYVFTDNKDWVIENFTDLD ovatus YTLCDWNPTSGKQNYLDMQLMSCAKHNVIANSTYSWWGAWLNENSDKIVIAPKRWFNKIVTPDILPEQ ATCC 8483; WIKI Bacteroides ovatus CL02T12C04 Mesotoga prima YP_006346113.1 389844033 glycosyl 25.5 MRVVWFGGGLGNQMFQYGLYCFLKKNNQEVKADCTQYSTTPMNNGFELERLFNLDIAHANLDVISKLTG 139 MesG1.Ag.4.2; transferase GNRLSPRKVIWKLFRKPKVYFEEKIPFSFDPDVLKGNNRYLKGYWQNMNYLEPCAKELRDVFTFPAFSSDN Mesotoga prima family protein NKRLADEIAKVEAVGVHFRRGDFLKSSNLGLFGGICSDQYYLRAIQTMENTVVEPVFYVFCDDPQWAKNSF [Mesotoga SDARFTVIDWNIGSNSYRDMQLMSLCKHNIIANSTFSWWAAWLNRNPNRTVIAPERMVNRDLDFSGIFP prima NDWIRLQG MesG1.Ag.4.2] Clostridium sp. WP_021639228.1 545399562 glycosyl- 25.49 MIVLKLQGGLGNQMFEYAFARTIQEQKKDKKLILDTSDFQYDKQREYSLGHFILNENIEIDSSGKFNLWYDQ 140 KLE 1755 transferase, RKNPLLKVGFKFWPKFQFQTLKLFGIYVWDYAKYIPVDVSKKHKNILLHGLWQSDKYFSQISEIIRKEFAVKD family 11 EPSQGNKAWLERISSANAVCVHIRRGDFLAKGSVLLTCSNSYYLKAMEIISKKVNEPEFFIFSDDIEDVKKIFE [Clostridium sp. 
FPGYQITLVNQSNPDYEELRLMSKCKHFIIANSTFSWWSSLLSENEDKVIVAPRLWYSDGRDTSALMRDEWII KLE 1755] IDNE Bacteroides WP_022052991.1 547321746 glycosyl- 25.42 MDLVTLSGGLGNQMFQFAFYWALKKRGKKVFLYKNKLAAKEHNGYELQTLFGVEEKCVDGLWMTRLLGC 141 plebeius transferase PLLGKILKHILFPHKIRERVLYNYSIYLPLFERNGLHWVGYWQSEKYFQDVADDIRRIFCFDHLSLNPATSAA CAG: 211 family 11 LKCMSEQVAVSVHIRRGDYYLPCNVATYGGLCTVEYYENAIRYVKERYPQAVFYVFSDDLDWVRENIPSAGK [Bacteroides MVFVDWNRGKDSWQDMFLMSKCHHNILANSSFSWWGAWLNTHPEKLVIAPERWANCPAPDALPDG plebeius WVRIEGVSRR CAG: 211] Treponema WP_021686002.1 545448980 glycosyl- 25.4 MAIKIVKISGGLGNQMFCYAFACALQKCGHKVYVDTSLYRKATVHSGIDFCHNGLETERLFGIKFDEADTAD 142 lecithinolyticum; transferase, VRRLSTSAEGLLNRIRRKYFTKKTHYIDTVFKYTPELLSDKNDCYLEGYWQTEKYFLPIEKDIRRLFTFRPTL Treponema family 11 SEKSAAVQSALQAQQAAVLSASIHVRRGDFLNTKTLNVCTETYYNNAIKYAVKKHAVSRFYIFSDDIPWCREH lecithinolyticum [Treponema LCFCNAHAVFIDWNTGNDSWQDMALMSMCRCNIIANSSFSWWAAWLNNASDKTVLAPAIWNRRQLEYV ATCC 700332 lecithinolyticum] DRYYGYDYSDIVPESWIRIPID Bacteroides WP_004291980.1 490419682 glycosyl 25.34 MRLIKMTGGLGNQMFIYAFYLRMKKRHTNTRIDLSDMMHYNVHHGYEMHRVFNLPKTEFCINQPLKKVI 143 eggerthii; transferase EFLFFKKIYERKQDPSSLLPFDKKYLWPLLYFKGFYQSERFFADMENDIRIAFTFNSDLFNEKTQAMLTQIKH Bacteroides [Bacteroides NEHAVSLHIRRGDYLEPKHWKTTGSVCQLPYYLNAITEMNKRIEQPSYYVFSDDIAWVKENLPLPQAVFIDW eggerthii eggerthii] NKGAESWQDMMLMSHCRHHIICNSTFSWWGAWLNPRENKTVIMPERWFQHCDTPNIYPDGWIKVPVN DSM 20697 Bacteroides WP_005656005.1 491891563 glycosyl 25.34 MRFIKMTGGLGNQMFIYAFYMRMKKHYSNTRIDLSDMVHYKAHNGYEMHRVFNLPPIEFRINQPLKKVIE 144 stercoris; transferase FLFFKKIYERKQVPSSLVPYDKKYFWPLLYFKGFYQSERFFADMADDIRKAFTFNPRLSNRKTKEMSEQIDHD Bacteroides [Bacteroides ENAVSIHVRRGDYLEPKYWKTTGCVCQLPYYLNAIAEMNKRISQPSYYVFSDDIAWVKENLPLPKAFFIDW stercoris stercoris] NKGAESWQDMMLMSRCRHHIICNSTFSWWGAWLNPRENKTVIMPERWFRHCETPDICPDKWIKVPIN ATCC 43183 QPDSIQ Butyrivibrio YP_003831842.1 302671882 glycosyl 25.34 MIIIQLKGGMGNQMFQYALYRQLKKLGREVKIDDETGFVDDELRIPVLQRFGISYDKATREEIVKLTDSKMD 145 proteoclasticus; transferase 11 
IFSRIRRKLTGRKTFRIDEESGIFDPRILEVEDAYLVGYWQSDKYFANEEVEKEIREAFEKRPQEVMQDSVSW Butyrivibrio [Butyrivibrio TILQQIECCESVSLHIRRTDYIDEEHIHIHNICTEKYYKSAIDEVRNQYPSAVFFIFTDDKDWCRQHFRGPNF proteoclasticus proteoclasticus FVVDLDEDTNTDIAEMTLMSRCKHHILANSSFSWWAAWLNDNPGKIVIAPSKWINNRKMDDIYTARMKKIAI B316 B316] Roseobacter sp. WP_008228724.1 495504071 alpha-1,2- 25.34 MSPIVHFPSDRLLRYEHLNSLWKTAMIYTRLLARLGNQMFQYAAGRGLAARLGVDFTVDSRRAVHKGDGV 146 GAI101 fucosyl- LTRVFDLDWAAPENMPPAQHERPLAYYAWRGLRRDPKIYRENGLGYNAAFETLPDNTYLHGYWQCERYF transferase AHIADDIRAAFVPRHPMSAQNADMARRIASGPSVSLHVRRGDYLTVGAHGICDQTYYDAALAAVMQGLP [Roseobacter SPTVYVFSDDPQWAKDNLPLTFEKVVVDFNGPDSDYEDMRLMSLCQHNVIANSSFSWWGAWLNANPQ sp. GAI101] KRVAGPANWFSNPKLSNPDILPSRWIRI Thalassobacter WP_021099615.1 544666256 alpha-1,2- 25.34 MGQDMIYSRIFGGLGNQLFQYATARAVSLRQGVELVLDTRLAPPGSHWAFGLDHFNISARIAEPSELPPSK 147 arenae; fucosyl- DNFFKYVMWRAFGHDPAFMRERGLGYQSRIAQAPDGTYLHGYFQSERYFADVLDHLENELRIVTPPDTRN Thalassobacter transferase, AEYADRIASAGHTVSLHVRRGDYVETSKSNSTHATCDEAYYLRALARLSEGKSDLKVFVFSDDPEWVRDNLK arenae [Thalassobacter LPYDTTPVGHNGPDKPHEDLRLMSCCSDHVIANSTFSWWGGWLDRRPEARVVGPAKWFNNPKLVNPDI DSM 19593 arenae] LPERWIAI Prevotella WP_004377401.1 490511493 protein 25.33 MKIIKIIGGLGNQMFQYALAVALQKKWKDEEIKLDLHGFNGYHKHQGYQLDEIFGHRFKAASLKEVAQLA 148 oris; Prevotella [Prevotella WPYPHYQLWRVGSRLLPKRKTMVCESADCRFQSDLLNLEGSLYYDGYWQDERYFKAFRTEIIEAFKFTPLV oris C735 oris] GDSNRKVENMLKEGRFASLHVRRGDYLKEPLFQSTCDIAYYQRAISRLNQMADPYCYLIFSNDIAWCKTHIE PLCDGRRTHYVDWNHGKESYRDMQLMTFCKHHIIANSSFSWWGAWLSTANDGITIAPHQWYANDRKP SPAAEAWLKL Prevotella WP_004380180.1 490514606 protein 25.33 MKIVRIIGGLGNQMFQYALALALKQQQENEEVKLDLSAFRGYKKHGGFQLVQCFGTTLPAATWQEVAQL 149 oulorum; [Prevotella AWYYPHYQLWRLGHRVLPVCKTMLKEPDNGAFLPEVLQRKGDAYYEGCWQDERYFSHYRPAILQAFTFP Prevotella oulorum] TFTNPRNLAMQQQINTTESVAIHVRRGDYLHDALFRNTCGLAYFQRAITCILQHVAHPVFYVFSDDMAWC oulorum F0390 RQHIQPLLQTNEAVFVDWNHGKASICDLHLMTLCRHHIIANSSFSWWGAWLSPHQAGWIIAPKQWYAH EEKMSPAAERWLKL Spirosoma WP_020596174.1 522084965 protein 
25.33 MNRRVAVQLKGGLGNQLFQYALGRRLSLQLEAELLFDCSVLENRIPVTNFTFRSFDLDMFRIAGRVATPSDL 150 panaciterrae [Spirosoma PLFPKSASIRSPWPHLVQLARLWKQGYSYVYERGFAYNPKMLRQLSDRVYLNGYWQSYRYFEDIAATLRAD panaciterrae] CSFPDPLPDSAVGLAGQINATNSICLHIRRTDFLQVPLHQVSNADYVGRAIAYMAERVNDPHFFVFSDDIA WCQTNLRLSYPVVFVPNELAGPKNSLHFRLMRYCKHFITANSTFSWWAAWLSEPSDGKVIVTPQTWFSDS RSIDDLIPANWIRL Butyrivibrio YP_003829826.1 302669866 glycosyl 25.26 MNYVEVKGGLGNQLFQYTFYKYLEKKSGHKVLLHTDFFKNIDSFEEATKRKLGLDRFDCDFVAVSGFISCEKL 151 proteoclasticus; transferase 11 VKESDYKDSMLSQDEVFYSGYWQNKRFFLEVMDDIRKDLLLKDENIQDEVKELAKELRAVDSVAIHFRRGD Butyrivibrio [Butyrivibrio YLSEQNKKIFTSLSVDYYQKAIAQLAERNGADLKGYIFTDEPEYVSGIIDQLGSIDIKLMPVREDYEDLYLMSC proteoclasticus proteoclasticus ARHHIIANSSFSWWGAALGDTESGITIAPAKWYVDGRTPDLYLRNWISI B316 B316] Butyrivibrio sp. WP_022765786.1 551021623 protein 25.26 MIIIQLKGGLGNQMFQYALYKELKHRGREVKIDDVSGFVNDKLRVPVLDRFGVEYERATREEVVELTDSRM 152 XPD2006 [Butyrivibrio DIFSRIRRKLTGRKTYRIDEMEGIFDPAILETENAYLVGYWQSEKYFTSPEVIEQIQEAFGKRPQEIMHDSVS sp. XPD2006] WSTLQQIECCESVSIHVRRTDYVDAEHIKIHNLCSEKYYKNAIGKIREKHPNAVFFIFTDDKEWCKDHFKGPN FITVELQEGEFTDVADMLLMSRCKHHIIANSSFSWWSAWLNDSPEKMVIAPSKWINNKKMDDIYTERMT RVAI Bacteroides sp. WP_008766093.1 496041586 protein 25.24 MKIVNITGGLGNQMFQYAFAMALKYRNPQEEVFVDIQHYNTIFFKKFKGINLHNGYEIDKVFPKAKLPVAG 153 1_1_6 [Bacteroides sp. 
VRQLMKFSYWIPNYILSRLGRKFLPIRKKEYIPPYSMNYSYDEKALNWKGDGYFEGYWQSYNHFGDIKEELQ 1_1_6] KVYAHPKPNQYNAALISNLESCNSVGIHVRRGDYLAEPEFRGICGLDYYEKGIKEILSDEKKYVFFIFSNDMQ WCQENIAPLVGDNRIVFISGNKGKDSCWDMFLMTHCKDLIIANSSFSWWGAFLNKKVDRVICPKPWLNR DCNIDIYNPSWILVPCYSEDW Bacteroides YP_099857.1 53713865 alpha-1,2- 25.17 MKIVTFQGGLGNQLFQYVFYLWLDMRCDKDNIYGYYPKKGLRAHNGLEIEKVFEVKLPNSSLSTDLIVKSIKL 154 fragilis; fucosyl- INKIFKNRQYISTDGRLDVNGVLFEGFWQDKYFWEDVDIVLNFRWPLKLDVTNSFIMTKIQANNSISIHIRR Bacteroides transferase GDYLLPKYRNIYGDICNEEYYQKAIEYILKCVDDPFFFVFSDDIDWAKSIINVSNVTFVNNNKGKDSYIDMFL fragilis [Bacteroides MSLCHHNIIANSTFSWWAAQLNKHSDKIMIAPIRWFKSLFKDPNIFTESWIRI YCH46 fragilis YCH46] Bacteroides sp. WP_008671843.1 495947264 alpha-1,2- 25.17 MIKIVSFSGGLGNQLFQYLLVVYLRECGHQVYGYYNRKWLIGHNGLEVNNVFDIYLPKTNFIVNALVKVIRV 155 9_1_42FAA fucosyl- LRCLGFKKYVATDTYNNPIAIFYDGYWQDQKYFNIIDSKLSFKKFDLSAENKSILSKIKSNISVALHIRCGDYL transferase SSSNVEIYGGVCTKEYYEKALELVCKIKNVMFFVFSDDIEYAKLLLNLPNAIYVNANVGNSSFIDMYLMANCKV [Bacteroides NVIANSTFSYWAARLNQDNILTIYPKKWYNSKYAVPDIFPSEWVGV sp. 9_1_42FAA] Coraliomargarita WP_022477844.1 548260617 glycosyl 25.17 MIIVKVQGGLGNQMFQYAFGRALSEKHSQDLYLDCSEYLRPSCKREYGLDHFNIRAKKASCGDVKSMVTP 156 sp. CAG: 312 transferase HFALRKKLKKIFAVPYSLSPTHILERNFNFQPSILEFNCGYFDGFWQTQKYFSGISDIVRKDLTFKDAVKYSG family 11 GETFAKITSLNSVSLHIRRGDYVKVKRTRKRFSVIRAGYFKRAVEYMRSKLDTPHFFIFTDDPKWVSENFPAG [Coraliomargarita EDYTLVSSSGMYEDLFLMAQCRHNIIFNSSFSWWGAWLNGNPGKIVVAPDMWFTPHYKLDYSDVVPEEWI sp. 
CAG: 312] KLNTGYFESKEF Pseudorhodobacter WP_022705649.1 550957292 alpha-1,2- 25.17 MIVMQIKGGLGNQMFQYAAGRALSLQTGMPLHLDLRYYRREREHGYGLGAFNIEASPLDESLLPPLPRESP 157 ferrugineus fucosyl- LAWLIWRLGRRGPNLVRENGMGFNPTLSNVTKPAWITGYFQSERYFAAHAATIRAELTPVAAPDLVNAR transferase WLAEIAAEPRAVSLHVRRGDYVRDAKAAAKHGSCTPAYYERALAHITARMGTAPVVYAFSDDPAWVRENL [Pseudo- RLPAEIRVPGHNDTAGNVEDLRLMSACRHHIVANSSFSWWGAWLNPRADKIVASPARWFADPAFTNPDI rhodobacter WPEAWARIEG ferrugineus] Escherichia coli; YP_002329683.1 215487252 fucosyl- 25.16 MMYCCLSGGLGNQMFQYAAAYILKQHFPDTILVLDDSYYFNQPQKDTIRHLELDQFKIIFDRFSSKDEKVKI 158 Escherichia coli transferase NRLRKHKKIPLLNSFLQFTAIKLCNKYSLNDASYYNPESIKNIDVACLFSFYQDSKLLNEHRDLILPLFEIRD O127: H6 str. [Escherichia DLRVLCHNLQIYSLITDSKNITSIHVRRGDYVNNKHAAKFHGTLSMDYYISAMEYIESECGSQTFIIFTDDVI E2348/69 coli O127: WAKEKFSKYSNCLVADADENKFSVIDMYLMSLCNNNIIANSTYSWWGAWLNRSEDKLVIAPKQWYISGNECSL H6 str. KNENWIAM E2348/69] Lachnospiraceae WP_016359991.1 511537894 protein 25.16 MIIIKVMGGLGNQMQQYALYEKFKSIGKNVKLDISWFEDSSVQEKVFARRSLELRQFKDLQFDTCSAEEKEA 159 bacterium [Lachnospiraceae LLGKSGILGKLERKLIPARNKHFYESDIYHSEVFNMSDAYLEGHWACEKYYHDIMPLLQEKIQFPESANSQNI 3_1_57FAA_CT1 bacterium TVKKRMKAENSVSIHIRRGDYLDPENEAMFGGICTNSYYKAAEEYIKSRVPDTHFYLFSDDTAYLRENYHGD 3_1_57FAA_CT1] EYTIVDWNKGEDSFYDMELMSCCRHNICANSTFSFWGARLNRTPDKIVIRPAKHKNSQEIEPQLLHELWD NWVIIDGDGRIV Butyrivibrio WP_022755397.1 551010878 glycosyl 25.09 MKPLVSLIVPVLNVEKYLEQCLTSISSQTYDNFEVILVVGKCIDNSENICKKWCEKDHRFRIEPQLKSCLGYA 160 fibrisolvens transferase RNVGIDAAKGEYIAFCDSDDCITSDFLSCFVDTALKNSSDIVETQFTLCDQNLSPIYDYDRNILGHILGHGFL [Butyrivibrio EYTSAPSVWKYFVKRDIFTSNNLHYPEIRFGEDISMYSLLFSYCNKIDYVEKPTYLYRQVPSSLMNNPQGKRK fibrisolvens] RYESLFDHDFVTNEFKTRLLFQKSWLKLLFQLEMHSASIISDSATSDDEAISMRQEISGYLKKVFPVKNTIFE VTALGWGGEIVSSIASKFNTLHGVSSSNMFNRYFFELLEDSTRKKLEEMIINFSPDIFLIDLISEADYLSSYK GNLGTFVKNWKIGFSIFMKMIQTHSNNSSIFLLENYMQQAPDHVDNTNEILKMLYDDIKINHPDIICISPAPD ILNRSSEPELPCIYQLKLVSDKLHTMYSPVINCVETKGGLGNQIFQYVFSKYIEKMTGYRPLLHIGFFDYVKA 
IPGGTKRIFSLDKLFPDIETTSGKIPCSHVVEEKSFISNPGSDIFYRGYWQDIRYFSDVKDEVLESFNVDTSS MSKDVIDFADTIRNANSIAMHIRRQDYLNENNVSLFEQLSIDYYKSAVDMIRKEYADDLVLFIFSDDPEYANS IADSFDIEGFVMPLHKDYEDLYLITLAHHHIIANSTFSLWGALLSARKDGIRIAPRNWFKGTPATNLYPDKWL IL Anaeromusa WP_018702959.1 517532751 protein 25.08 MFCVRIYGGLGNQMFQYALGRAMAKHYSETAAFDLSWYEQKIKPGFEASVCQYNIELSRKDRPKAWYEPI 161 acidaminophila [Anaeromusa LKRISRHTDKLEMWFGLFFEKKYHYDSTVFERGLCKKNITLDGYWQSYKYFSAIEDDLRRELTIPKEREELI acidaminophila] AISRSLPENSVSIHVRRGDYVSNPKANAMHGTCSWEYYQAAIEKMTGLVKEPQYVVFSDDITWTKENLPLPN AMYIGRELGLFDYEELILMSRCKHNIMANSTFSWWGAWLNSNPNKVVIAPRKWFRHKKIKVNDLFPSSWV VL Bacteroides sp. WP_008768986.1 496044479 glycosyl 25.08 MDIVVIFNGLGNQMSQYAFYLAKKKDNLNCHVIFDPKSTNVHNGAELKRVFGIELNRNYLDKIISYFYGYIFN 162 2_1_16 transferase KRIVNKLFSLVGIRMIYEPKNYDYREELLKPSSNFISFYWGGWHSEKYFKDIELEVKKVFKFPEVTNSPYFTEW family 11 FNKIFLDNNSVSIHIRRGDYLDKPSDPYYQFNGVCTIDYYEKAILYLKERILEPNFYIFSNDINWCMKTFGTEN [Bacteroides sp. MYYVDCNKGKDSWRDMYLMSECRHHINANSTFSWWAAWLSPYSNGIVLHPKYFIKDIETKDYYPQKWI 2_1_16] MIE Chlorobium YP_001960319.1 189500849 glycosyl 25.08 MDKVVVHLTGGLGNQMFQYALGRSISINRNCPLLLNTSFYDTYDKFSCGLSRYNVKAEFIKKNSYYNNKYYR 163 phaeobacteroides; transferase YVIRLLSRYGVACYFGSYYEKKIFSYDEKVYKRSCVSYYGTWQSYGYFDSIRDILLRDYEMVGCLEEEVEKYVS Chlorobium family protein DIKRVDSVSLHIRRGDYFDNKRLQSIHGILTMEYYYKAMSLFPDSSVFYVFSDDIEWVRENLITNTNIVYVVLE phaeobacteroides [Chlorobium SDNPENEIYLMSLCKNNIISNSTFSWWGAWLNKNKYKKVIAPRMWYKDNQSSSDLMPSDWCLI BS1 phaeobacteroides BS1] Treponema WP_022932606.1 551312724 protein 25.08 MIVISMGGGLGNQMFEYAFYTQLKHLYPKSEIKVDTKYAFPYSHNGIEVFKIFGLNPPEANWKEVHSLVKTY 164 bryantii [Treponema PIEGNKAHFIKFFLYRILRKANLVEREPTSFCKQKDFTEFYNSFFELPQNKSFYLYGPFVNYNYFAAIHNEIMD bryantii] LYTFPEITDVTNIEYKRKIESSHSISIHIRRGDYITEGVPLVPDAYYREALVYINKKIEDPHFFVFTDDKDYCK SLFSDNQNFTIVEGNTGANSFRDMQLMSLCKHNIIANSTFSFWGAFLNKNSEKIVIAPNIAFKDCSCPYICPDW Bacteroides YP_005110943.1 375358171 LPS 25 IILMVIAKLFGGLGNQMFIYAAAKGIAQISNQKLTFDIYTGFEDDSRFRRVYELKQFNLSVQESRRWMSFRYPL 165 fragilis; 
biosynthesis GRILRKISRKIGFCIPLVNFKFIVEKKPYHFQNEIMRIASFSSIYLEGYFQSYKYFSKIEAQIREDFKFTKEVI Bacteroides alpha-1,2- GSVEKEASFITNSRYTPVAIGVRRYSEMKGEFGELAVVEHDYYDAAIKYIANKVPNLIFIVFSEDIDWVKKNLK fragilis 638R fucosyl- LDYPVYFVTSKKGELAAIQDMYLMSLCNHHIISNSSFYWWGAYLASTNNHIVIAPSVFLNKDCTPIDWVII transferase [Bacteroides fragilis 638R] Firmicutes WP_022352105.1 547951298 protein 25 MSGGLGNQMFQYALYMKLTAMGREVKFDDINEYRGEKAWPIMLAVFGIEYPRATWDEIVAFTDGSMDFSK 166 bacterium [Firmicutes RLKRLFRGRHPIEYVEQGFYDPKVLSFENMYLKGSFQSQRYFEDILEEVQETFRFPELKDMNLPAPLYETT CAG: 534 bacterium EKYLLRIEGCNAVGLHMYRGDSRSNEELYDGICTEKYYEGAVRFIQDKCPDAKFFIFSNEPKWVKGWVISLM CAG: 534] KSQIREDMSREEIRALEDHFVLIENNTEYTGYLDMFLMSRCRHNIISNSSFSWWAAFINENPDKLVTAPSRW VNGVPSEDVYVKGMTLIDEKGRVERTIKE Firmicutes WP_022368748.1 547971670 glycosyl 25 MVIVKIGDGLGNQMFNYVCGYSVAKHDNDTLLLDTSDVDNSTLRTYDLDKFNIDFTDRESFTNKGFFHKVY 167 bacterium transferase KRLRRSLKYNVIYESRTENCPCVLDVYRRKFIRDKYLHGYFQNLCYFKTCKEDIMRQFTPKEPFSAKADELIHR CAG: 882 family 11 FATENTCSVHVRGGDIKPLSIKYYKDALDKIGEAKKDMRFIVFSNVRNLAEEYIKELGVDAEFIWDLGEFTDIE [Firmicutes ELFLMKACRRHILSDSTFSRWAALLDEKSEEVFVPFSPDADKIYMPEWIMEEYDGNEEKR bacterium CAG: 882] Vibrio WP_005496882.1 491639353 glycosyl 25 MVIVKVSGGLGNQLFQYAIGCAISNRLSCELLLDTSFYPKQSLRKYELDKFNIKAKVATQKEVFSCGGGDDLL 168 parahaemolyticus; transferase SRFLRKLNLSSLFFPNYIKEKESLVYLAEISHCKSGSFLDGYWQNPQYFSDIKDELVKQIVPIMPLSSPALEWQ Vibrio family 11 NIIINTKNCVSLHVRRGDYVNNAHTNSVHGVCDLSYYREAITNIHETVEKPKFFVFSDDISWCKDNLGSLGHF parahaemolyticus [Vibrio TYVDNTLSAIDDLMLMSFCEHHIIANSTFSWWGAWLNDHGITIAPKRWFSSVERNNKDLFPEKWLIL 10329; Vibrio parahaemo- parahaemolyticus lyticus] 10296; Vibrio parahaemolyticus 12310; Vibrio parahaemolyticus 10290 Herbaspirillum WP_006463714.1 493509348 glycosyl 24.92 MIVSRLIGGLGNQMFQYAAGRALALRRGVPFAIDSRAFADYKTHAFGMQCFCADQTEAPSRLLPNPPAEG 169 frisingense; transferase RLQRLLRRFLPNPLRVYTEKTFTFDEAVLSLPDGIYLDGYWQSEKYFADFADDIRKDFAVKAAPSAPNQAWL Herbaspirillum family protein 
ELIGRTHSVSLHIRRGDYVSNAAAAAVHGTCDLGYYERAVAHLHQVTGQAPELFVFSDDLDWVATNLQLP frisingense [Herbaspirillum YTMHLVRDNDAATNFEDLRLMTACRHHIVANSSFSWWGAWLDGRSESITIAPARWFVADTPDARDLVP GSF30 frisingense] QRWVRL Rhizobium sp. WP_007759661.1 495034125 Glycosyl 24.92 MIITRILGGLGNQMFQYAAGRALAIANEAELKLDLIEMGAYKLRPFALDQFNIKAAIAQPDEVPAKPKRGLL 170 CF080 transferase RKFTSAFKPDRSSCERIVENGLTFDSRVPALRGSLHLSGYWQSEQYFASSADAIRSDFSLKSPLGPARQDVLA family 11 RIGAATTPVSIHVRRGDYVTNPSANAVHGTCEPPWYHEAMRRMLDRAGDASFFVFSDEPQWARDNLQS [Rhizobium sp. SRPMVFIEPQNNGRDGEDMHLMAACHAHIIANSSFSWWGAWLNPRPNKHVIAPRQWFRAPDKDDRDI CF080] VPATWERL Verrucomicrobium WP_009959380.1 497645196 glycosyl 24.85 MVISHISEGLGNQMFQYAAGRRLSYHLGTTLKLDDYHYRLHPFRSFQLDRFLITSPIATDAEISHLCPLEGLAR 171 spinosum transferase, AIRARLPGKLRGATLRLLGNLGLGSPYQPRLHSFKEETPKQPLLIGKVVSERHFHFDPDVLECPDNVCLVGY family 11 WQDERYFGEIRDILLRELTLKSPPAGATKAVLERIQRSSSVSLHVRRGDKTKSSSYHCFSLEYCLAAMSEMRA [Verrucomicrobium RLQAPTFFVFSDDWDWVREQIPCSSSVIHVDHNRAEDVSEDFRLMKSCDHHIIASSSLSWWAAWLGTNE spinosum] NSFVFSPPADRWLNFSNHFTADVLPPHWIQLDGSSLLPAQ Fibrella YP_007319049.1 436833833 glycosyl 24.83 MTANRVLVNSPMVIAKITSGLGNQLFQYALGRHLALQGNTSLWFDLRYFHQEYATDTPRKFKLDRFNVRY 172 aestuarina; transferase NLLDSSPWLYASKATRLLPGRSLRPLIDTRFEADFHFDPTVIRPAAPLTILWGFWQSEKYFAQSTPQIRQELTF Fibrella family 11 NRPLSDTFVGYQQQIEQAEVPISVHVRRGDYVTHPEFSQSFGFVGLAYYQKALAHLQDLFPNATLFFFSDDP aestuarina [Fibrella DWVRANIVTEQPHVFVQNSGPDADVDDLQLMSLCHHHVIANSSFSWWGAWLNPRPDKVVIGPQRWF BUZ 2 aestuarina ANKPWDTKDLLPSGWLRL BUZ 2] Rhodobacter sp. WP_023665745.1 563380195 alpha-1,2- 24.83 MIHMRLVGGLGNQLFQYACGRAVALRHGTELVLDTRELSRGAAHAVFGLDHFAIRARMGASADLPPPRS 173 CACIA14H1 fucosyl- RVLAYGLWRAGFMAPRFLRERGLGVNPAVLAAGDGTYLHGYFQSEAYFRDVVPQIRPELEIVTPPSDDNLR transferase WASRIAGDDRAVSLHVRRGDYVASAKGQQVHGTCDADYYARAVAAIRARAGIDPRLYVFSDDPHWARD [Rhodobacter NLALDAETVVLDHNPPGAAVEDMRLMGVCRHHIIANSSFSWWGAWRNPSAGKVVVAPVRWFADPKLH sp. 
CACIA14H1] NPDICPPEWLRV Rhodopirellula NP_868779.1 32475785 fucosyl 24.83 MATSAHLHLSDEKQTLDSKASDRDCATTEASASDKTCTISISGGLGNQMLQYAAGRALSIHHDCSLQLDLKF 174 baltica transferase YSSKRHRSYELDAFPIQAHRSIKPSFFSQILSKIQSESKHVPTYQEQSKRFDPAFFNTEPPVKIRGYFFSEKYF SH 1; [Rhodopirellula SPYADQIRTELTPPIPPDQPARDMAIRLKECVSTSLHVRRGDYVTNANARQRFWCCTSEYFEAAIERLPTDSTV Rhodopirellula baltica SH 1] FVFSDDIEWAKQNIRSSRTTVYVNDELKKAGSPETGLRDLWLMTHAKSHIIANSSFSWWGAWLANSEANL baltica TIAPKKWFNDPEIDDSDIVPSSWHRI Spirosoma WP_020604054.1 522092845 protein 24.83 MVVVELMGGLGNQMFQYAFGMQLAHQRQDTLTVSTFLLSNKLLANLRNYTYRPFELCIFGIDKPKASPFN 175 spitsbergense [Spirosoma LLRALLPFDLNTSLLRETDDPEAVIPAASARIVCVGYWQSEHYFEEVTVHVREKFIFRQPFNSFTSRLANNLN spitsbergense] GIPNSVFVHIRRGDYVTNKGANAHHGLCDRTYYERAVTFMREHLENPLFFIFSDDLEWVSQELGPILEPATY VGGNQKNDSWQDMYLMSLCRHAIVANSSFSWWGAWLSPHASKIVVAPKEWFGKPLLPVKTNDLIPNS WIRI uncultured EKD71402.1 406938106 protein 24.76 MNAIIPRLTGGIGNQLFIYAAARRMAIANSMNLVIDDTSGFKYDVLYKRFYQLEKFNITSRMATPTERLEPFS 176 bacterium ACD_46C00193 KIRRYLKRKINKTYPFAQRAYITQEKSGFDPRLLVFRPKGNVYLDGYWQSENYFKDIEGIIRQDLIIKSPSDS G0003 LNIATAERIKNTLAIAVHVRFFDMVDISDSSNCQSNYYHTAIAKMEEKIPNAHYFIFSDKPVLARLAMPLPDD [uncultured RITIIDHNIGDMNAYADLWLMSLCKHFVIANSTFSWWGAWLSDNKEKIVIAPDIKITSGVTQWGFDGLIPDEW bacterium] IKL Prevotella WP_006950883.1 494008437 protein 24.75 MDVIVIFNGLGNQMSQYAFYLEKRLRNRQTTYFVLNPRSTYELERLFGIPYRSNLMCRMIYKLLDKAYFSNHI 177 micans; [Prevotella RLKKILRTALNAVGIRLIVEPITRNYSLSNFTHHPGLTFYRGGWHSELNFTSVVTELRRKFIFPPSDDEEFKRI Prevotella micans] SALIIRTQSISLHIRRGDYLDYSEYQGVCTEEYYERAIEYIRSHVENPVFFVFSDDKEYAINKFSGDDSFRIVD micans F0438 FNTGENSWRDMQLMSLCRHHILANSTFSWWGAWLDSAPEKIVLHPIYHMRDVPTRDFYPHNWIGISGE Thermo- AHB87954.1 564737556 alpha-1,2- 24.75 MIIVRLYGGLGNQMFQYAAGLALSLRHAVPLRFDLDWFDGVRLHQGLELHRVFDLDLPRAAPSEMRQVL 178 synechococcus fucosyl- GSFSHPLVRRLLVRRRLRWLLPQGYALEPHFHYWPGFEALGPKAYLDGYWQSERYFSEYQDAVRAAFRFA sp. 
NK55 transferase QPLDERNRQIVEEMAACESVSLHVRRGDFVQDPVVRRVHGVDLSAYYPRAVALLMERMREPRFYVFSDD [Thermo- PDWVRANLKLPAPMIVIDHNRGEHSFRDMQLMSACRHHILANSSFSWWGAWLNSQPHKLVIAPKRWF synechococcus NVDDFDTRDLYCSGWTVL sp. NK55] Coleofasciculus WP_006100814.1 493031416 Glycosyl 24.73 MLSLNKNFLFVHIPKSCILKEVYIYMISFPNLGKGVRLGNQMFQYAFLRSTARRLGVKFYCPAWSGDSLFTLN 179 chthonoplastes; transferase DQEERVSQPEGITKQYRQGLNPGFSENALSIQDGTEISGYFQSDKYYDNPDLVRQWFSLKEEKIASIRDRFSR Coleofasciculus family 11 LNFANSVGMHLRFGDVVGQLKRPPMRRSYYKKALSYIPNQELILVFSDEPERTKKMLDGLSGNFLFLSGHK chthonoplastes [Coleofasciculus NYEDLYLMTKCQHFICSYSTFSWWGAWLGGERERTVIYPKEGQYRPGYGRKAEGVSCESWIEVQSLRGFL PCC 7420 chthonoplastes] DDYRLVSRLEKRLPKSLMNFFY Bacteroides WP_018666797.1 517496220 glycosyl 24.66 MRLIKMTGGLGNQMFIYAFYLRMKKRHTNTRIDLSDMMHYNVHHGYEMHHVFNLPKTEFCINQPLKKVI 180 gallinarum transferase EFLFFKKIYERKQDSSNLLPFDKKYFWPLLYFKGFYQSERFFADMENDIRKAFTFNSGLFNEKTQTMLKQIEH [Bacteroides NEHAVSLHVRRGDYLEPKHWKTTGSVCQLPYYINAIAEMNRRIEQPFYYVFSDDIAWVKENLPLPQAVFID gallinarum] WNKGVESWQDMMLMSHCRHHIICNSTFSWWGAWLNPKENKTVIMPERWFQHCETPNIYPAGWIKVP IN Firmicutes WP_022367483.1 547967507 glycosyl 24.66 MNNVEIMGGLGKQLFQYAFSRYLQKLGVKNVVLRKDFFTIQFPENNGITKREFVLDKYNTRYVAAAGEKTY 181 bacterium transferase RDYCDENDYRDDYAIGSDEVLYEGYWQNIDFYNVVRKEMQEELKLKPEFIDNSMAAVEKDMSSCNSVALH CAG: 882 GT11 family IRRSDYLTQVNAQIFEQLTQDYYASAVSIIEQYTHEKPVLYIFSDDPEYAAENMKDFMGCRTVIMPPCEPYQ [Firmicutes DMYLMTRAKHNIIANSTFSWWGATLNANPDNITVAPSRWMKGRTVNLYHKDWITL bacterium CAG: 882] Bacteroides WP_008021494.1 495296741 protein 24.6 MIAVNVNAGLANQMFHYAFGRGLMAKGLDVCFDQSNFKPRSQWAFELVRLQDAFPSIDIKVMPEGHFK 182 xylanisolvens; [Bacteroides WVFPSLPRNGLERRFQEFMKKWHNFIGDEVYIDEPMYGYVPDMEKCATRNCIYKGFWQSEKYFRHCEDD Bacteroides xylanisolvens] IRKQFTFLPFDELKNIEVAAKMSQENSVAIHLRKGDDYMQSELMGKGLCTVDYYMKAIDYMRKHINNPHF xylanisolvens YVFTDNPCWVKDNLPEFEYILVDWNEVSGKRNFRDMQLMSCAKHNIIGNSTYSWWAAWLNANQDKIVV CL03T12C04 GPKRFFNPINSFFSTCDIMCEDWISL Geobacter sp. 
YP_004197726.1 322418503 glycosyl 24.58 MIGMVIFRAYNGLGNQMFQYALGRHLALLNEAELKIDTTAFADDPLREYELHRLKVQGSIATPDEIAFFRE 183 M18 transferase MENTHPQAYLRLTQKSRLFDPAILSARGNIYLHGFWQTEKYFADIREILLDEFEPIVPAGEDSIKVLSHMK family protein ATNAVALHVRRSDYVSNPMTLRHHGVLPLDYYREAVRRIAGMVPDPVFFIFSDDPQWAKDNIRLEYPAFCV [Geobacter sp. DAHDASNGHEDLRLMRNCKHFIIANSSFSWWGAWLSQNTGKKVVAPLKWFAKPEIDTRDIVPLQWIRI M18] Ruegeria YP_168587.1 56698215 alpha-1,2- 24.57 MITTRLHGRLGNQMFQYAAARGLAARLGTQVALDTRLAESRGEGVLTRVFDLDLAQPDQLPPLKGDGLLR 184 pomeroyi fucosyl- HGAWRLLGLAPRFRREHGLGYNAAIETWDDGTYLHGYWQSERYFAHIAARIRADFAFPAFSNSQNAEMA DSS-3 transferase, ARIGDTDAISLHVRRGDYVALAAHTLCDQRYYAAALTRLLEGVAGDPVVYLFSDDPAWARDNLALPVQKV [Ruegeria VVDFNGPETDFEDMRLMSLCRHNIIGNSSFSWWAAWLNAHPGKRIAAPASWFGDAKLHNPDLLPPDWL pomeroyi KIEV DSS-3] Lachnospiraceae WP_016291997.1 511037973 protein 24.52 MIIIQLAGGLGNQMQQYAMYQKLLSLGKKVKLDISWFEEKNRQKNVYARRELELNYFKKAEYEACFEEERK 185 bacterium 28-4 [Lachnospiraceae ALVGEGGFAGKIKGKLFPGTRKIFRETEMYHPEIFDFEDRYLYGYFACEKYYADIMEILQEQFVFPPSGNPEN bacterium 28- QKMAERIADGESVSLHIRRGDYLDAENMAMFGNICFEEYYAGAIREMKKIYPSAHFFVFSDDIPYAKETYSG 4] EEFTVVDINRGKDSFFDIWLMSGCRHNICANSTFSFWGARLNRNKGKVVMRPFIHKNSQKFEPELMHEL WKGWVFIDNRGNIC Prevotella sp. WP_021989703.1 547254188 glycosyl 24.49 MRILVFTGGLGNQMFEYAFYKHLKSCFPKESFYGHYGVKLKEHYGLEINKWFDVTLPPAKWWTLPVVGLFY 186 CAG: 1092 transferase LYKKLVPNSKWLDLFQREWKHKDAKVFFPFKFTKQYFPKENGWLKWKVDEASLCEKNKKLLQVIHDEETCF family 11 VHVRRGDYLASNFKSIFEGCCTLDYYKRALEYMNKNNPKVRFICFSDDLEWMRKNLPMDDSAIYVDWNTG [Prevotella sp. 
TDSPLDMYMMSQCDNGIIANSSFSYWGAYLGGKKTTVIYPQKWWNMEGGNPNIFMDEWLGM CAG: 1092] Spirosoma luteum WP_018618567.1 517447743 protein 24.41 MVISVLSGGLGNQLFQYAFGLKLAAQLQTELRLERHLLESKAIARLRQYTPRTYELDTFGVEAPAASLMDTVS 187 [Spirosoma CLSRVALSDKTALLLRESTLTPNAINNLNNRVRDVVCLGYWQSEEYFRPATEQLRKHLVFRKNPAQSRSMA luteum] DTILSCQNAAFVHIRRGDYVTNTHANQHHGLCDVSYYRRACEYVKECIPDVQFFVFSDDPDWAKRELGIHL QPARFIDHNRGADSWQDMYLMSLCRHAIVANSSFSWWGAWLNPVAERLVVAPGQWFVNQPVLSQQII PPHWHCL Marinomonas YP_004480472.1 333906886 glycosyl 24.34 MIIVDLSGGLGNQMFQYACARSLSIELNLPLKVVYGSLASQTVHNGYELNRVFGLDLEFATENDMQKNLGF 188 posidonica transferase FLSKPILRKIFSKKPLNNLKFQNFFPENSFNYNSSLFSYIKDSGFLQGYWQTEKYFLNHKSQILKDFCFVNMD IVIA-Po- family protein DETNISIANDIQSGHSISIHVRRGDYLTNLKAKAIHGHCSLDYYLKAIEFLQEKIGESRLFIFSDDPEWVSEN 181; [Marinomonas IATRFSDVSVIQHNRGVKSFNDMRLMSMCDHHIIANSSFSWWGAWLNPSQNKKIIAPKNWFVTDKMNTIDLIP Marinomonas posidonica SSWILK posidonica IVIA-Po-181] Bacteroides; WP_005839979.1 492425792 glycosyl 24.32 MKIVVFKGGLGNQLFQYAFYKYLSRKDETFYFYNDAWYNVSHNGFELDKYFKTDDLKKCSRFWIILFKTILSK 189 Bacteroides sp. transferase LYHWKIYVVGSVEYQYPNHLFQAGYFLDKKYYDENTIDFKHLLLSEKNQSLLKDIQNSNSVGVHIRRGDYMT 4_3_47FAA; family 11 KQNLVIFGNICFQKYYHDAIRIITEKVNDAVFYVFSDDISWVQTHLDIPNAVYVNWNTGESSIYDMYLMSSC Bacteroides sp. [Bacteroides] KYNIIANSTFSYWAARLNKKTNMVIYPSKWYNTFTPDIFPESWCGI 3_l_40A; Bacteroides dorei 5136/D4; Bacteroides vulgatus PC510; Bacteroides dorei CL03T12C01; Bacteroides vulgatus dnLKV7 Candidatus WP_020169431.1 519013556 protein 24.32 MTIRIKLTGGLGNQMFQFATGFAIAKKKNVRLSLDLKYINKRKLFNGFELQKIFNIYSKVSFLNKTLSFKSI 190 Pelagibacter [Candidatus NFTEILNRIDTTFYNFKEPHFHYTSNILNLPKHSFLDGYWQSELYFNEFATEIKRIFNFSGKLDKSNLLVAD ubique Pelagibacter DINRNNSISIHIRRGDFLLKQNNNHHTDLKEYYLKAINETSKIFKNPKYFIFSDDTSWTVDNFVIDHPYIIV ubique] DINFGARSFLDMYLMSLCKSNIIANSSFSWWSAWLNNNKDKIIYAPKNWFNDKSICTDDLIPESWNIIL Bacteroides sp. 
WP_022353174.1 547952428 uncharacterized 24.29 MSVIINMACGLANRMFQYAFYLYLQKEGYDAYVDYFTRADLVHENVDWLRIFPEATFRRATARDIRKMGG 191 CAG: 875 protein GHDCFSRLRRKLLPMTTKVLETSGAFEIILPPKNRDSYLLGAFQSAKMVESVDAEVRRIFTFPEFESGKNQY [Bacteroides sp. FQTRLAQENSVGLHIRKGKDYQERIWYKNTCGVEYYRKAVDLMKEKVDSPSFYVFTDNPAWVKENLSWLEY CAG: 875] KLVDGNPGSGWGSHCDMQLMSLCKHNIISNSSYSWWGAYLNNTLNKIVVCPRIWFNPESTKDFSSNPLLA EGWISL Butyrivibrio WP_022756327.1 551011911 protein 24.29 MIIIKLQGGLGNQLFLYGLYKNLKHLKRDVKMDIESGFEGDELRKPCLDCMNLEYAIATRDEVTDIRDSYMDI 192 fibrisolvens [Butyrivibrio FSRIRRKITGRKTFDYYEPEDGNYDPKVLEMTKAYLNGYFQSEKYFGDEESVKALKDELTKGKEDILTSTDLIT fibrisolvens] KIYHDIKNSESVSLHIRRGDYLTPGIIETYGGICTDEYYDKAIAMIRETFPEARFFIFSNDIEWCKEKFAGDKN ILFVNTIGINLDSEDNIKIGKSDKDISEYRDLAELYLMSACKHHILANSSFSWWGAWLSDHEGMTIAPSKWLNN KNMTDIYTKDMLLI Roseburia YP_004839455.1 347532692 glycosyl 24.22 MVTVKIGDGMGNQMYNYACGYAAAKRSGEKLRLDISECDNSTLRDYELDHFRVVYDEKESFPNRTFWQK 193 hominis; transferase LYKRLRRDIRYHVIRERDMYAVDARVFVPARRGRYLHGYWQCLGYFEEYLDDLREMFTPAYEQTDAVREL Roseburia family protein MQQFTQTPTCALHVRGGDLGGPNRAYFQQAIARMQKEKPDVTFIVFTNDLPKAKECLDDGEARMRYIAE hominis [Roseburia FGEALSDIDEFFLMSACQNQIISNSTYSTWAAYLNTLPGRIVIVPKFHGVEQMALPDWIVLDGGACQKGEID A2-183 hominis A2-183] AV Rhodopirellula WP_008659200.1 495934621 alpha-1,2- 24.16 MATSVHPHLSDGKQALDSKAAQQVCSTQAASASDRACTISISGGLGNQMLQYAAGRALSIHHDCPLQLDL 194 europaea; fucosyl- KFYSSKRHRSYELDAFPIQAQRWIKPSFFSQVLDKIQGESKSAPTYEEQSKRFDRAFFDIELPARIRGYFFSE Rhodopirellula transferase KYFLPYADQIRTELTPPVPLDQPARDMAQRLSEGMSTSLHVRRGDYVSNANARQRFWSCTSEYFEAAIEQMP europaea 6C [Rhodopirellula ADSTVFVFSDDIEWAKQNIRSSRPTVYVNDELKLAGSPETGLRDLWLMTHAKSHIIANSSFSWWGAWLSG europaea] SEANLTIAPKKWFNDPEIDDSDIVPTSWRRI Rudanella lutea WP_019988573.1 518832653 protein 24.16 MVIAKITSGLGNQLFQYALGRHLAIQNQTRLWFDLRYYHRTYETDTPRQFKLDRFSIDYDLLDYSPWLYVSK 195 [Rudanella ATRLLPGRSLRPLFDTRKEPHFHLDPAVPNAKGAFITLDGFWQSEGYFASNAATIRRELTFTRQPGPMYARY lutea] RQQIEQTQTPVSVHIRRGDYVSHPEFSQSFGALDDTYYQTALAQINGQFPDATLLVFSDDPEWVRQHMRF 
ERPHVLVENTGPDADVDDLQLMSLCHHHIIANSSFSWWGAWLNPRPDKRVIAPKQWFRNKPWNTADLI PAGWVRL Bacteroidetes; WP_008618094.1 495893515 glycosyl 24.15 MRLIKMTGGLGNQMFIYAMYLKMKTIFPDVRIDLSDMVHYQVHYGYEMNKVFHLPRTEFCINRSLKKIIEF 196 Capnocytophaga transferase LLFKTILERKQGGSLVPYTRKYHWPWIYFKGFYQSEKYFAGIEKEVREAFVFDIRRASRRSLRAMQEIKADPH sp. oral taxon [Bacteroidetes] AVSIHVRRGDYLLEKHWKALGCICQSSYYLNALAELEKRVKHPHYYVFSEDLNWVRQNLPLIKAEFIDWNKG 329 str. F0087; EDSWQDMMLMSHCRHHIICNSTFSWWGAWLNPLPDKIVIAPERWTQTTDSADVVPESWLKVSIG Paraprevotella clara YIT 11840 Smaragdicoccus WP_018159152.1 516906936 protein 24.08 MADVVVTLAGGLGNQLFQTAYAKNLEARGHRVTLDGTVVRWTRGLHIDPQICGLKILNATPPAPVPGRLA 197 niigatensis [Smaragdicoccus ATVLRRALATRLRFGPDGRIVRTQRTLEFDEQYLNLNSPGRYRVEGYWQCERYFSDVGQTVRKVFLDMLGR niigatensis] HVSYNGLSRLPAMADPSSISLHVRRGDYVTANFIDPLALEYYERALEELAVPSPRIFVFSDDLDWATRELGR ICDVIPVEPDWTSHPGGEIFLMSQCSHHIIANSSFSWWGAWLDGRTSSRVVAPRQWFSLETYSARDIVPDR WTKV Bacteroides WP_022012576.1 547279005 family 11 24.05 MIHLILGGGLGNQMFQYAFARSLALQYNENISFNTILYKELKNEERSFSLGHLNINTMCIVETPDENKRIWEL 198 fragilis glycosyl- FNKQIFHQKIARKILPASIRWWWMSNRNIYANVCGPYKYYHPRHRSQNTTIIHGGFQSWKYFKEHQSMIK CAG: 558 transferase AELKVITPISEPNKKILKEIQNSNSICVHIRRGDFLSAQFSPHLEVCNKDYYEKAIKMISSQIENPTFFIFSNT [Bacteroides HEDLVWIRKNYNIPQNSVYVDLNNPDYEELRLMYNCKHFILSNSSFSWWAQYLSESKNKIIIAPKIWDKRKGID fragilis FSDIYMPEWIIIK CAG: 558] Desulfovibrio WP_022657592.1 550904402 protein 24.05 MSFSIDVAAIQRMALVKVDGGLGSQMWQYALSLAVGKSSSFTVKHDLSWFRHYAKDIRGIENRFFILNSVF 199 desulfuricans [Desulfovibrio TNINLRLASENERLFFHIALNRYPDSICNFDPDILALKQPTYLGGYYVNAQYVTSAEKEIREAYVFAPAVEES desulfuricans] NQAMLQTIHAAPMPVAVHVRRGDYIGSMHEVLTPRYFERAFKILAAALQPKPTFFVFSNGMEWTKKAFAGL PYDFVYVDANDNDNVAGDLFLMTQCKHFTISNSSLSWWGAWLSQRAENKTVIMPSKWRGGKSPIPGEC MRVEGWHMCPVE Hoeflea WP_007199917.1 494373839 alpha-1,2- 24.05 MHGGLGNQLFQYAVGRAVALRTGSELLLDTREFTSSNPFQYDLGHFSIQAKVANSSELPPGKNRPLAYAW 200 phototrophica; fucosyl- WRKFGRSPRFVREQDLGYNARIETIEADCYLHGYFQSQKYFEDIASILWKDLSFRQAISGENASMAERIQSA Hoeflea transferase, 
PSVSMHIRRGDYLTSAKARSTHGAPDLGYYGRALGEIRARSGSDPVVYLFSDDPDWVRNNMRMDANLVT phototrophica [Hoeflea VAINDGKTAFEDLRLMSLCDHNIIVNSTFSWWGAWLNPSLDKIVVAPKRWFADPKLSNPDITPPGWLRLGD DFL-43 phototrophica] Vibrio cholerae; WP_002030616.1 487957217 glycosyl 24.04 MKIISFSGGLGNQLFQYAFYLYLKDNSDFGNIFLDFSFYESQNKRDAVIRNFYGVDSLDIIKQSSYVRGKFLI 201 Vibrio cholerae transferase, LKLINKFRFFNNLLEFVDKENGLDETLLSTNKVFFDGYWQSYRYVKDYKSNIKELFSFYDFKGNILEVRKKIC O1 str. 87395 family 11 QSNSVCMHVRRGDYVAEKNTKLVHGVCSLQYYRDALNNIKNVDNSIDHIFIFSDDIDWVKNNISFDIPVTVVD [Vibrio FVGQSVPDYAEMLLFSCGKHKVIANSTFSWWGAFLSDRNGVIVSPKKWFAKEEKNYDEIFIEGSLRL cholerae] Lachnospiraceae WP_022784718.1 551041074 protein 24.03 MIIVRFRGGMGNQMFQYAFLRYLEMKGATLKADLSEFKCMKTHAGYELDKAFDLHPAEASYKEIRAVADYI 202 bacterium [Lachnospiraceae PVMHRFPFSRKVFEILYKKETKRVEAEGPKKSHISEEKYFDMSEDERLHLASSSEDLYMDGFWIKPDMYDDE NK4A179 bacterium VLKCFTFSKTLDEKYKGTIEDEHSCSVHVRCGDYTGTGLDILGKEYYEKAAEKILSEDADVKFYVFSDDREKA NK4A179] EKLLSPFMKKMVFCDTPASHAYDDMYLMSRCRHHIIANSTFSFWGARLSADKSGITICPKYEDKNNTANRLV HEGWQML Cecembia WP_009185692.1 496476931 Glycosyl 24.01 MIIMKFMGGLGNQIYQYALGRKLSELHNSFLASDIHIYKNDPDREFVLDKFNIKVKHLPWKVIKLLNSDYALK 203 lonarensis; transferase FDKVFHTEFYHELVLEKALESKDIPRKNNLYLRGSWGNRKYYEDYIDKISDEITLKEKFKTKDFNTVNKKVKNS Cecembia family 11 DSVGIHIRRGDYEKVAHFKNFYGLLPPSYYSAAVDFIGNRIEKSNFFIFSDDTDWVKENLPFLKDSFFVSDIIG lonarensis [Cecembia SVDYLEFELLKNCKHQIIANSTFSWWAARLNSNPAKIVIKPKRWFADDRQQAVYEIEDSYYIKEAIKL LW9 lonarensis] Bacteroides WP_004295547.1 490423336 protein 24 MKIVNILGGLGNQMFVYAMYLALKEAHPEEEILLCRRSYKGYPLHNGYELERIFGVEAPEAALSQLARVAYP 204 ovatus; [Bacteroides FFNYKSWQLMRHFLPLRKSMASGTTQIPFDYSEVTRNDNVYYDGYWQNEKNFLSIRDKVIKAFTFPEFRDE Bacteroides ovatus] KNKALSDKLKSVKTASCHIRRGDYLKDPIYGVCNSDYYTRAITELNQSVNPDMYCIFSDDIGWCKENFKFLIG ovatus DKEVVFVDWNKGQESFYDMQLMSLCHYNIIANSSFSWWGAWLNNNDDKVVVAPERWMNKTLENDPI ATCC 8483 CDNWKRIKVE Bacteroides WP_022125287.1 547668508 glycosyl- 23.99 MRLIKMTGGLGNQMFIYAFYLKMKKLFPHTKIDLSDMMHYHVHHGYEMNRVFALPHTEFCINRTLKKLM 205 coprocola 
transferase EFLLCKVVYERKQKNGSMEAFEKKYAWPLIYFKGFYQSERFFADIEDDVRKTFCFNMELINSRSREMMKIID CAG: 162 family 11 ADEHAVSIHIRRGDYLLPKFWANAGCVCQLPYYKNAITELEKHESTPSFYVFSDDIEWVKQNLSLPNAHYID [Bacteroides WNQGNDSWQDMMLMSHCRNHIICNSTFSWWGAWLNPRKNKTVIVPSRWFMKEETPYIYPVSWIKVPIN coprocola CAG: 162] Bacteroides WP_007835585.1 495110765 glycosyl 23.99 MRLIKVTGGLGNQMFIYAFYLRMKKYYPKVRIDLSDMMHYKVHYGYEMHRVFKLPHTEFCINQPLKKIIEFL 206 dorei; transferase FFKKIYERKQAPNSLRAFEKKYFWPLLYFKGFYQSERFFADIKDEVREAFTFDRSKANSRSLDMLDILDKDEN Bacteroides [Bacteroides AVSLHIRRGDYLQPKHWATTGSVCQLPYYQNAIAEMSKRVTSPSYYIFSDDIVWVRENLPLQNAVYIDWNT dorei DSM dorei] GEDSWQDMMLMSHCKHHIICNSTFSWWGAWLNPSIDKTVIVPSRWFQYSETPDIYPTGWIKVPVD 17855; Bacteroides dorei CL03T12C01 Bacteroides; WP_007662951.1 494936920 protein 23.97 MIIVRLWGGLGNQLFQYSFGQYLEIETDKKVFYDVASFGTSDQLRKLELCSFIPDIPLYNAYFTRYTGVKNRL 207 Bacteroides [Bacteroides] FKALFQWSNTYLSESMFDICLLEKARGKIFLQGYWQEEKYATYFPMQKVLSEWKNPNVLSEIEENIRSAKISV intestinalis SLHVRRGDYFSPKNINVYGVCTEKYYEQAIDRANSEIEEDKQFFVFSDDILWVKNHVSLPESTVFVPNHEISQ DSM 17393; FAYIYLMSLCKVNIISNSTFSWWGAYLNQHKNQLVIAPSRWTFTSNKTLALDSWTKI Bacteroides intestinalis CAG: 564 Lachnospiraceae WP_016283022.1 511028838 protein 23.95 MIVIHVMGGLGNQLYQYALYEKLRALGREVKLDVYAYRQAEGAEREWRALELEWLEGIRYEVCTAAERQQ 208 bacterium A4 [Lachnospiraceae LLDNSMRLADRVRRRLTGRRDKTVRECAAYMPEIFEMDDVYLYGFWGCEKYYEDIIPLLQEKIVFPESSNPK bacterium A4] NADVLRAMAGENAVSVHIRRKDYLTVADGKRYMGICTDAYYKGAFRYITERVERPVFYIFSDDPAFAKTQF CEENMHVVDWNTGRESLQDMALMSRCRHNICANSTFSIWGARLNRHPDKIMIRPLHHDNYEALDARTV HEYWKGWVLIDADGKV Phaeobacter YP_006574665.1 399994425 protein 23.91 MIITRLHGRLGNQMFQYAAGRALADRLGVSVALDSRGAELRGEGVLTRVFDLDLATPDILPPLRQRAPLGY 209 gallaeciensis; PGA1_c33070 ALWRGLGQHLGTGPKLRREVGLGYNPDFVDWSDNSYLHGYWQSERYFAQSAERIRRDFTFPEYSNQQNA Phaeobacter [Phaeobacter EMAARIGETNAISLHVRRGDYLTLAAHVLCDQAYYEAALAQVLDGLEGQPTVYVFSDDPQWAKENLPLPC gallaeciensis gallaeciensis DKVVVDFNGADTDYEDMRLMSLCKHNIIGNSSFSWWAAWLNQTPDRRVAGPTKWFGDPKLNNPDILPP DSM 17395 = CIP DSM 17395 = DWLRISV 
105210 CIP 105210] Firmicutes WP_021849028.1 546362318 protein 23.88 MSGGLGNQMFQYALYLKLRSLGREVCFDDKSQYDEETFRNSSQKRRPKHLDIFGITYPSAGKEELEKLTDGA 210 bacterium [Firmicutes MDLPSRIRRKILGRKSLEKNDRDFMFDPSFLEETEGYFCGGFQSPRYFAGAEEEVRKAFTFPEELLCPKEGCS CAG: 791 bacterium RQEQKMLEQSASYAERIRKANCEAADRGVPGGGSASIHLRFGDYVDKGDIYGGICTDAYYDTAIRCLKERD CAG: 791] PGMIFFVFSNDEEKAGEWIRYQAERSENLGRGHFVLVKGCDEDHGYLDLYLMTLCRNHVIANSSFSWWAS FMCDAPDKMVFAPSIWNNQKDGSELARTDIYADFMQRISPRGTRLSDRPLISVIVTAYNVAPYIGRALDSV CGQTWKNLEIIAVDDGSSDETGAILDRYAAGDSRIQVVHTENRGVSAARNEGIAHARGEYIGFVDGDDRA HPAMYEAMIRGILSSGADMAVVRYREVSAEETLTDAEEQVASFDPVLRASVLLQQRDAVQCFIRAGMAEE EGKIVLRSAVWNKLFHRRLLRDNRFPEGTSAEDIPFTTRALCLSKKVLCVPEILYDYVVNRQESIMNTGRAER TLTQEIPAWRTHLELLKESGLSDLAEESEYWFYRRMLSYEEEYRRCSETAKEAKELQERILKHRDRILELAEE HSFGRRGDRERLKLYVNSPRQYFLLSDLYEKTVVNWKNRPDKT Butyrivibrio YP_003829733.1 302669773 glycosyl 23.84 MRKRIIALNGGLGNQMFQYAFARMLEDRKHCLIEFDTGFYSTVNDRKLAIQNYNIHKYDFCNHEYYNKIRLL 211 proteoclasticus; transferase 11 FQKIPFVAWLAGTYKEYSEYQLDPRVFLFNYRFYYGYWQNKQYFENISNDIRNELSYIGNVSEKENALLNML Butyrivibrio [Butyrivibrio EAHNAIAIHVRRGDYTQEGYNKIYISLSKEYYKRAVSIACKELGDNNIPLYVFSDDIDWCKANLADIGNVTFV proteoclasticus proteoclasticus DNTISSSADIDMLMMKKSRCLITANSTFSWWSAWLSDRDDKIVLVPDKWLQDEEKNTKLMKAFICDKWKI B316 B316] VPV Bacteroides sp. WP_008768245.1 496043738 protein 23.81 212 2_1_16 [Bacteroides MQVVARIIGGLGNQMFIYATARALALRIDADLILDTQSGYKNDLFKRNFLLDSFCISYRKANCFQK sp.
2_1_16] YDYYLGEKVKSLGKKTHFSVIPFMKYISENTSCDFVDGLLKKHILSVYLDGYWQNEAYFKDYASIIKKDFQFCQ VNDLRTLSEAEIIKKSITPVAIGVRRYQELNSHQNTKVTDLDFYQKAINYIESKVDNPTFFIFSEDQEWVKNNL EQKSNFIMISPKEGNYSALNDMYLISLCKHHIVSNSSFYWWGAWLANNKNKIVVASDCFLNPQSIPDSWIKF Desulfomicrobium YP_003159045.1 256830317 glycosyl 23.76 213 baculatum; transferase Desulfomicrobium family protein MAKIVTRIMGGIGNQLFCYAAARRLALVNHAELVIDDVTGFSRDRVYRRRYMLDHFNISARKATNYE baculatum [Desulfomicrobium RMEPFERYRRGLAKYISKKLPFFEREYIEQERIEFDPRFLEYRTYNNIYIDGLWQSENYFKDVEDIIRDDLKII DSM 4028 baculatum PPTDLENINIAKKIKNIQNTIAMHVRWFDLPGINLGNNVSTYYYHRAIAMMEQRINAPHYFLFSDNLEAVHSKL DSM 4028] DLPEGRVTFVSNNDGDDNAYADLWLMSQCKHFITANSTFSWWGAWLGESRDSVVLVPRFSPDGGVTS WCFTGLIPERWEQVSSIR Prevotella WP_021584236.1 545304945 galactoside 23.76 MDIVLIFNGLGNQMSQYAFYLAKRQRNNHTVYCVFGPRTQYSLDKLFDIPYRHNAVLVLLYRALDKAHFSN 214 pleuritidis; 2-alpha-L- HRWLRRLLRPTLQLLGVKMIVEPLSRDFDMRHFTHQKGIVFYRGGWHSELNFTAVADAVKRRFRFPEIQD Prevotella fucosyl- AAVLAVIDRIKSCQSVSLHLRRGDYLGLSEFQGVCTEAYYEHAIAYFESQIESPEYFVFSDDPTYAREQFGAD pleuritidis transferase PNFHIIDLNHGEDAWCDLLMMTQCRYNIIANSTFSWWGAWLNDNPSKIVVHPRYHLNGVETRDFYPRNW F0068 [Prevotella ICIE pleuritidis] Bacteroides sp. WP_008763191.1 496038684 glycosyl 23.75 MKVIWFNGNLGNQVFYCKYKEFLHNKYPNETIKYYSNSRSPKICVEQYFRLSLPDRIDSFKVRFVFEFLGKFFR 215 1_1_14 transferase, RIPLKFVPKWYCFRKSLNYEASYFEHYLQDKSFFEKEDSSWLKAKKPDNFSEKYLIFENLICNTNSVAVHIRRG family 11 DYIKPGSDYEDLSATDYYEQAIKKATEVYLDSQFFFFSDDLEFVKNNFKGDNIYYVDCNRGADSYLDILLMSQ [Bacteroides AKINIIANSTFSYWGAYMNHEKKKVMYSDLWFRNESGRQMPNIMLDSWICIETKRK sp.
1_1_14] Agromyces WP_022893737.1 551273588 protein 23.65 MVGRVGIARRQAADVSCTDGEGLVAWRIRTGEIVLGLQGGIGNQLFEWAFAMALRSIGRRVLFDAVRCR 216 subbeticus [Agromyces GDRPLMIGPLLPASDWLAAPVGLALAGATKAGLLSDRSWPRLVRQRRSGYDPSVLERLGGTSYLLGTFQSA subbeticus] RYFDGVEHEVRAAVRALLEGMLTPSGRRFADELRADPHRVAVHVRRGDYVSDPNAAVRHGVLGAGYYDQ ALEHAAALGHVRRVWFSDDLDWVREHLARDDDLLCPADATRHDGGEIALIASCATRIIANSSFSWWGGW LGAPSSPAHPVIAPSTWFADGHSDAAELVPRDWVRL Prevotella WP_007133870.1 494220705 alpha-1,2- 23.59 MIATTLFGGLGNQMFIYATAKALSLHYRTPMAFNLRQGFEQDYKYQRHLELNHFKCQLPTAKWITFNYKG 217 salivae; fucosyl- ELNIKRISRRIGRNLLCPHYQFIKEKEPFHYEKRLFEFTNKNIFLEGYWQSPRYFENYSDEIRRDFQLKSILP Prevotella transferase HTITDELQMLKGTGKPLVMLGIRRYQEVKDKKDSPYPLCNKDYYAKAISHVQEQLPAPLFVVFTQEQAWAMNN salivae [Prevotella LPTNANLYFVKEKDNAWATIADMYLMTQCQHAIISNSTFYWWGAWLQHPIENHIVVAPNNFINRDCVCD DSM 15606 salivae] NWIILD Carnobacterium YP_008718687.1 554649641 glycosyl 23.57 MIFVDLSEGLGNQMFQYAYSRYLQELYGGTLYLNTSSFKRKNSTRSYSLNNFYLYENVKLPSKFRRVIYNFYS 218 sp. WN1359 transferase KTIRMFIKKVIRMNPYSDKYYFSMIPYGFYVSSQVFKYLTVPTTKRHNIFVMGTWQTNKYFQSINDKIKDELK [Carnobacterium VKTEPNELNKKLITEINSNQSVCVHIRLGDYTNPEFDYLHVCTSDYYLKGMDYIVSKVKEPNFYIFSNSSSDIE sp. WN1359] WIKNNYNFKYKVKYIDLNNPDFEDFRLMYNCKHFIISNSTFSWWAQFLSNNDKKIIVAPSKWQKSNENEAK DIYLDHWKLIEIE Butyrivibrio sp. WP_022762290.1 551018062 glycosyl 23.55 MLIIQIAGGLGNQMQQYAMYRKLLKAGADRNIKLDTKWFDEDKQSGVLAKRKLELEYFTGLPLPVCSESER 219 AD3002 transferase ARFTDRSVARKVVEKLVPGMGSRFTESCMYHPEIFELKDKYIEGYFACQKYYDDIMGELQELFVFPTHPDEEI [Butyrivibrio NIKNMNLMNEMEMVPSVSVHIRRGDYLDPENAALFGNIATDAYYDSAMEYFKAIDPDTHFYIFTNDPEYA sp. AD3002] REKYADPGRYTIVDHNTGKYSLLDIQLMSHCRGNICANSTFSFWGARLNRRKDKIPVRTLVMRNNQPVTPE LMHEYWPGWVLVDKDGKVR Clostridium sp. 
WP_021636935.1 545396682 glycosyl- 23.55 MIVIRVMGGLGNQMQQYALYEKFKALGKETRLDTSWFDNASMQENVLARRSLELRFFDNLTYEACTPQE 220 KLE 1755 transferase, REALLGKEGFFNKLERKLFPSKNKHFYESEMFHPEIFKLDNVYLEGHWACEKYYHDIMPLLQSKIIFPKTDNI family 11 QNNMLKNKMNSENSVSIHIRRGDYLDPENAAMFGGICTDSYYKSAEGYIRNRVTNPHFYLFSDDPAYLREHY [Clostridium KGEEYTVVDWNHGADSFYDMELMSCCKHNVCANSTFSFWGARLNRTEKKIVIRPAKHKNSQQAEPERM sp. KLE 1755] HELWENWVIIDEEGRIV Bacteroides; YP_001300694.1 150005950 glycosyl 23.47 MKFFVFGGGLGNQLFQYSYYRYLKKKYPSERILGIYPDSLKAHNGIEIDKWFDIELPPTSYLYNKLGILLYRV 221 Bacteroides transferase NRFLYNHGYRLLFCNRVYPQSMKHFFQWGDWQDYSIIKQINIFEFRSELPIGKENMEFLKKMETCNSISVHIR vulgatus family protein RGDYLKTDLIHIYGGICTSKYYREAIKFMEQEVEEPFFFFFSDDCLYVETEFADIRNKIIISHNRDDRSFFDM ATCC 8482; [Bacteroides YLMAHAKNMILANSTFSCWAAYLNRTAKIIITPDRWVNTDFSKLEALPNEWIKIRV Bacteroides vulgatus ATCC dorei DSM 8482] 17855; Bacteroides massiliensis dnLKV3 Paraprevotella WP_008626629.1 495902050 glycosyl 23.47 MRLIKMTGGLGNQMFIYAMYLKMRAVFPDTRIDLSDMVHYRVHYGYEMNKVFNLPRTEFRINRSLKKIIEF 222 xylaniphila; transferase LLFKTILERKQGGSLVPYIRKYHWPWIYFKGFYQSEEYFAGVEKEVREAFVFDVRRVNRKSLCAMQEIMADP Paraprevotella [Paraprevotella DAVSIHVRRGDYLQGKHWKSLGCICQRSYYLNALSELEKRIVHPHYYVFSEDLDWVRQYLPLENAVFIDWN xylaniphila xylaniphila] KGEDSWQDMMLMSHCRHHIICNSTFSWWGAWLNPSPDKIVIAPERWTQTTNSADVVPESWLKVSIG YIT 11841 Thauera sp. 28 WP_002930798.1 489020296 glycosyl 23.47 MTDRALIAIVKGGLGNQLFIYAAARAMALRTGRQLYLDAVRGYLADDYGRSFRLNRFPIEAELMPEQWRV 223 transferase ASTLRHPRAKLVRALNKYLPEAWRFYVAERGDTRPGALWNHGRNVKRVTLMGYWQDEAYFLDYAELLRR family protein ELGPPMPDAPEVRARGERFAGTESVFLHVRRCRYSPLLDAGYYQKAVDLACAELNKPVFMIFGDDIEWVV [Thauera sp. 
28] NNIDFRGAGYERQDYDESDELADLWLMTRCRHAIIANSSFSWWAAWLGGAAGSGRHVWAPGQSGLAL KCAKSWEAVDAQPE Subdoligranulum WP_007048308.1 494107522 alpha-1,2- 23.44 MIYAELAGGLGNQMFIYAFARALGLRCGEAVTLLDRQDWRDGAPAHTACALEGLNLVPEVKILAEPGFAK 224 variabile; fucosyl- RHLPRQNTAKALMIKYEQRQGLMARDWHDWERRCAPVLNLLGLHFATDGYTPVRRGPARDFLAWGYF Subdoligranulum transferase QSEAYFADFAPTIRAELRAKQAPAGVWAEKIRAAACPVALHLRRGDYCRPENEILQVCSPAYYARAAAAAA variabile [Subdoligranulum AAYPEATLFVFSDDIDWAKEHLDTAGLPAVWMPRGDAVGDLNLMALCRGFILSNSTYSWWAQYLAGEG DSM 15176 variabile] RTVWAPDRWFAHTKQTALYQPGWHLIETR Firmicutes WP_021916223.1 547127527 protein 23.4 MIIVEVMGGLGNQMQQYALYRKLESLGKDARLDVSWFLDKERQTKVLASRKLELSWFENLPAKYCTQEEK 225 bacterium [Firmicutes QAILGKNNLIGKLKKKLLGGSNRHFTESDMYHPEIFDLEDAYLSGFWACEAYYADILPMLRSQIHFPDPEKGE CAG: 24 bacterium GWDLEAAAKNKETMERMKQETSVSIHIRRGDYLDAKNAEMFGGICTDAYYEAAISYIKEQTPDAHFYVFSD CAG: 24] DSAYVKNAYPGKEFTVVDWNTGKNSLFDMQLMSCCNHNICANSTFSFWGARLNPSPDKVMIRPSKHKN SQNIVPEEMKRLWDGWVLIDGKGRII Prevotella sp. WP_022310139.1 547906803 glycosyl 23.39 MIITKLNGGLGNQLFEYACARNLQLKYNDVLYLDIEGFKRSPRHYSLEKFKLSSDVRMLPEKDSKSLILLQA 226 CAG: 474 transferase ISKLNRNLAFKLGPLFGTYIWKSSNYRPLKIKNTRGKKLYLYGYWQSYEYFKENEAIIKQELNVKTEIPIECS family 11 ELLKEINKPHSICVHVRRGDYVSCGFLHCDEAYYNRGINHIFDKHPDSNVVVFSDDIKWVKANMNFDHPVAYV [Prevotella EVDVPDYETLRLMYMCKHFVMSNSSFSWWASYLSDNKEKIVVAPSYWLPANKDNKSMYLDNWTIL sp. 
CAG: 474] Roseburia WP_006855899.1 493910390 glycosyl 23.38 MRGNRGMIAVKIGDGMGNQLFNYACGYAQARRDGDSLVLDISECDNSTLRDFELDKFHL 227 intestinalis; transferase KYDKKESFPNRNLGQKIYKNLRRALKYHVIKEREVYHNRDHRYDVNDIDPRVYKKKGLRNKYLYGYWQHLAY Roseburia family 11 FEDYLDEITAMMTPAYEQSETVKKLQEEFKKTPTCAVHVRGGDIMGPAGAYFKHAMERMEQEKPGVRYIVF intestinalis [Roseburia TNDMERAEEALAPVLESQKKDAVGQAENRLEFVSEMGEFSDVDEFFLMAACQNQILSNSTFSTWAAYLN L1-82 intestinalis] QNPDKTVIMPDDLLSERMRQKNWIILK Bacteroides WP_004296622.1 490424433 protein 23.29 MKIVLFTPGLGNQMFQYLFYLYLRDNYPNQNIYGYYNRNILNKHNGLEVDKVFDIQLPPHTVISDASAFFIR 228 ovatus; [Bacteroides ALGGLGLKYFIGKDQLSPWKVYFDGYWQNKEYFQNNVDKMRFREGFLNKKNDDILSLIRNTNSVSVHVRR Bacteroides ovatus] GDYCDSCRKDLFLQSCTPQYYESAISVMKEKFQKPVFFVFSDDIPWVKVNLNIPNAYYIDWNKKENSYLDM ovatus YLMSLCTASIIANSTFSFWGAMLGNKKELVIKPKKWIGDEIPEIFPPSWLSL ATCC 8483 Butyrivibrio sp. WP_022779599.1 551035785 glycosyl 23.25 MLIIQIAGGLGNQMQQYALYRKLLKYHPDGVRLDLSWFDSEVQKNMLAKREFELALFKGLPYIECKPEERA 229 AE3009 transferase AFLDRNAAQKLSGKVLKKLGLRDNANPNVFEESRMFHPEIFELDNKYIIGYFACQKYYDDIMGDLCNLFEFP [Butyrivibrio EHLDPELEKKNLELISKMEKENSVSVHIRRGDYLDPENFKILGNIATDEYYESAMKYFEDRYEKVHFYIFTS sp. 
AE3009] DHEYAREHFADESKYTIVDWNTGKDSLQDVRLMNHCLGNICANSTFSFWGARLNQRQDKVMIRTYKMRNN QPVDPDTMHDYWKGWILIDETGREV Butyrivibrio YP_003829712.1 302669752 glycosyl 23.23 MTKNEKKLIVKFQGGLGNQLYEYAFCEWLRQQYSDYEVLADLSYYKIRSAHGELGIWNIFPNINIEVASNW 230 proteoclasticus; transferase 11 DIIKYSDQIPIMYGGKGADRLNSVRTNVNDRFFSKRKHSYYTEISNTDVSEVINALNNGIRYFDGYWQNIDYF Butyrivibrio [Butyrivibrio KGNIEDLRNKLKFSEKCDKYITDEMLRDNAVSLHVRRGDYVGSEYEKEVGLSYYKKAVEYVLDRVDQAKFFIF proteoclasticus proteoclasticus SDDKYYAETAFEWIDNKTVVAGYDNELAHVDMLLMSRMKNNIIANSTFSLWAAYLNDSMNPLIVYPDVES B316 B316] LDKKTFSDWNGIK Prevotella WP_018362656.1 517173838 protein 23.23 MDSQFLKHIKLSGGFGNQLFQYFFGEYLKEKYNCSISFFSEPALDINQLQIHRFFPALRISHNTELRPYHYSFT 231 nanceiensis [Prevotella QQLAYRCMRKLLLLFPFLNRKVKIENGSNYQNQSFNDTYCFDGYWQSYRYLSAFTPSLQFEDQLINDISADY nanceiensis] INAIEQSEAVFLHIRRGDYLNKENQKVFAECPLNYFENAANRIKEDIKNVHFFVFSNDIQWVKSHLKLNDNE VTFIQNEGNSCDLKDFYLMTRCKHAIISNSTFSWWAAYLINNSDKKVIAPKHWYNDISMNNATKDLIPPTW IRL Ruegeria sp. R11 WP_008562971.1 495838392 alpha-1,2- 23.23 MIITRLHGRLGNQMFQYAAGRALADRAGVPLALDSRGAILRGEGVLTRVFDLELADPVHLPPLKQTNPLRY 232 fucosyl- AIWRGIGQKVGAKPYFRRERGLGYNPAFEDWGDNSYLHGYWQSQKYFQNSAERIRSDFTFPAFSNQQNA transferase EMAARIAESTAISLHVRRGDYLTFAAHVLCDQAYYDAALAKVLDGLQGDPIVYVFSDDPQWAKDNLSLPCE [Ruegeria sp. 
KVVVDFNGPETDFEDMRLMSLCQHNIIGNSSFSWWAAWLNQTPGRRVAGPAKWFGDPKLSNPDIFPHD R11] WLRISV Winogradskyella WP_020895733.1 527072096 alpha-1,2- 23.21 MGNQLYEYATAKAMAVALNKKLVIDPRPILKEAPQRHYDLGLFNIQDEDFGSPFVQWLVRWVASVRLGKF 233 psychrotolerans fucosyl- FKTIMPFAWSYQMIRDKEEGFDESLLQQKSRNIVIEGYWQSFKYFESIRPTLLKELSFKDKPNAINQKYLDE RS-3; transferase IESVNAVAVHIRRGDYVANPVANAVHGLCDMDYYKKAIAIIKDKVENPYFFIFTDDPDWAEDNFKISEHQKI Winogradskyella [Winograd- IKHNIGKQDHEDFRLLTNCKYFIIANSSFSWWGAWLSDYKNKIVISPNKWFNVDAVPITERIPESWIRV psychrotolerans skyella psychro- tolerans] Lachnospiraceae WP_022785342.1 551041720 protein 23.2 MITVRIDGGFGNQMFQYAFFLHLKKTITDNKISVDLNCYNPHGSGDIFTRFKLAPEQAAPSEIKRFHRNSIYH 234 bacterium [Lachnospiraceae LLRPLDSAGITTNPYYREEDIDDLNSVLNKKRVYLRGYWQDKRYPFSVKDQLIDCFDLGKMDMTGASAENN NK4A179 bacterium VILEQIASEESRSVGVHLRGGDYIGDPVYSGICTPEYYEAAFKHVSEKIKDPVFHIFTNDISMIEKCGLSGKYD NK4A179] LKITDINDEAHGWADLKLMSACRHHIISNSSFSWWAAFLGEATTEASADVINVIPEYMRQGVSAETLRCPC WTTVTSDGRVYPS Prevotella sp. WP_009230832.1 496522549 alpha-1,2- 23.13 MKIVCIKGGLGNQLFEYCRYRSLHRHDNRGVYLHYDRRRTKQHGGVWLDKAFHITLPNEPLRVKLLVMVLK 235 oral taxon fucosyl- TLRRLHLFKRLYREEDPRAVLIDDYSQHKQYITNAAEILNFRPFEQLDYAEEIQTTPFAVSVHVRRGDYLLLA 317 str. transferase NKSNFGVCSVHYYLSAAVAVRERHPESRFFVFSDDMEWAKENLNLPNCVFVEHAQAQPDHADLYLMSLCK F0108; [Prevotella GHIIANSTFSFWGAYLSKGSSAIAIYPKQWFAEPTWNVPDIFPAHWMAL Prevotella sp. oral sp. oral taxon 317] taxon 317 Butyrivibrio sp. WP_022765796.1 551021633 glycosyl 23.1 MLIIQIAGGLGNQMQQYAVYTKLRGMGKDVRLDLSWFDPSVQKNMLAPREFELSMFEGVDYTECTAEER 236 XPD2006 transferase DSFLKQGMIANVTGKMLKKLGLRDEANPKVFSEKEMYHPEIFELEDRYIKGYFACQKYYDDIMGELWEKYT [Butyrivibrio FPAHSDPDLHTRNMALVERMEKETSVSVHIRRGDYLDPSNVEILGNIATEEYYQGAMDYFSVKDPDTHFYI sp. 
XPD2006] FTSDHEYAREKFSDESKYTIVDWNSGRNSVQDLMLMSHCKGNICANSTFSFWGARLNRRPDKTVIRTYKM RNNQPVNPDIMHDYWKGWILMDEKGSII Butyrivibrio WP_022752717.1 551008140 protein 23.08 MIIIKLQGGLGNQLFLYGLYKNLKHLKRDVKMDIESGFEEDKLRVPCLKSMGLDYEVATRDEIVAIRDSYMDI 237 fibrisolvens [Butyrivibrio FSRIRRKITGRKTFDYYEPEDGNFDPRVLEQTHAYLDGYFQSEKYFGDSDDRKKLKDELLKEKIRVLDSSDTL fibrisolvens] KDLYNMMSSGSSVSLHIRRGDYLTPGIMETYGGICTDEYYDIAMNRIKNEYPDSKFFIFSNDIDWCKEKYGSR DDVIFVDSCDEHEGLTNVSGDQDDIQVQGDIKEHGNNSLRDAAELYLMSACKHHILANSSFSWWGAWLS DHEGMTIAPSKWLNNKNMTDIYTKDMLLI Cylindro- WP_006278973.1 493321658 Glycosyl 23.05 MKKTVVLLKGGLGNQMFQYAFARSISLKNSSKLVIDNWSGFTFDYKYHRQYELGTFSIVGRPANLTEKFPF 238 spermopsis transferase WFYELKSKFFPRLPKVFQQQFYGLLINEVGGEYIPEIEETKISQNCWLNGYWQSPLYFQKHSDSITRELMPPE raciborskii; family 11 PMEKHFLELGKLLRETESVALGIRLYEESKNPGSHSSSGELKSHFEINQAILKLRELCNGAKFFVFCTHRSPL Cylindro- [Cylindro- LQELALPENTIFVTHDDGYVGSMERMWLLTQCKHHIFTNSTFYWWGAWLSQKFYIQGSQIVFAADNFINSDA spermopsis spermopsis IPKHWKPF raciborskii raciborskii] CS-505 Prevotella WP_007368154.1 494609908 alpha-1,2- 23.05 MKIVNFQGGLGNQMFIYAFSRYLSRLYPQEKIYGSYWSRSLYVHSAFQLDRIFSLQLPPHNLFTDCISKLAR 239 multiformis; fucosyl- FFERLRLVPVEETPGSMFYNGYWLDKKYWEGIDLSEMFCFRNPDLSAEAGAVLSMIERSNAVSVHIRRGDYQ Prevotella transferase SEEHIEKFGRFCPPDYYRIATERIRQREDDPLFFVFSDDMMWVKSNMDVPNAVYVDCHHGDDSWKDMF multiformis [Prevotella LMAKCRHNIIANSTFSFWAAMLNANPDKVVVYPQRWFCWPSPDIFPEMWLPVTEKEIKSSF DSM 16608 multiformis] Bacteroides sp. WP_022384635.1 548151455 protein 23 MIIVNMACGLANRMFQYAFYLSLKERGYNVKVDFYKSATLPHENVPWNDIFPYAEIDQVSNFRVLILGGGA 240 CAG: 462 [Bacteroides NLLSKLRRKYLPSLTNVITMSTAFDTDLQIDDDRKDKYIIGVFQSAAMVEGVCKKVKQCFSFLPFTDLRHLQL sp. 
CAG: 462] EKEMQECESVAIHVRKGNDYQQRIWYQNTCFMDYYRKAIAEIKGKVKDPRFYVFTDNADWVRRNFTDFD YKMVEGNPVYGWGSHFDMQLMSRCKYNIISNSTYSWWGAYLNANRNKIVICPNIWFNPESCNEYTSCKL LCKGWIAL Desulfovibrio WP_005984176.1 492830222 Glycosyl 23 MRIGILYICTGKYTVFWNHFFTSCEQHFLREHEKHYYIFTDGEIAHLNCNRVHRIEQQHLGWPDSTLKRFHM 241 africanus; transferase FERIADTLRQNSDFIVFFNANMVFLRDVGKEFLPTREQALVFHRHPGLFRRPAWLLPYERRPESTAYIPYGS Desulfovibrio family 11/ GSIYVCGGVNGGYTQPYLDFVAMLRRNIDIDVERGIIARWHDESHINRFVIGRHYKIGHPGYVYPDRRNLPF africanus PCS Glycosyl- PRIIRVIDKASVGGHTFLRGQTPEPAPEEQSKTVAKKLRSQLKRPCMPRAAQDEPIILARMMGGLGNQMFI transferase YAAARVLAERQGAQLHLDTGKLSGDSIRQYDLPAFSIDAPLWHIPCGCDRIVQAWFALRHVAAGCGMPKP family 6 TMQVLRSGFHLDQRFFSIRHSAYLIGYWQSPHYWRGHEDRVRSSFDLTRFERPHLREALAAVSQPNTISVH [Desulfovibrio LRRGDFRAPKNSDKHLLIDGSYYERARKLLLEMTPQSHFYIFSDEPEEAQRLFAHWENTSFQPRRSQEEDLLL africanus] MSRCSASIIANSSFSWWGAWLGRPKQHVIAPRMWFTRDVLMHTYTLDLFPEKWILL Roseburia sp. WP_022518697.1 548374190 protein 22.98 MILIHVMGGLGNQLYQYALYEKMKSLGKKVKLDTYAYNDAAGEDKEWRSLELDRFPAIEYDKATSEDRTKL 242 CAG: 100 [Roseburia sp. 
LDNSGLLTAKIRRKLLGRKDKTIRESKEYMPEIFHMDDVYLYGFWNCERYYEDIIPLLQDKLQFPISNNPRNQ CAG: 100] QCIEQMQKENAVSIHIRRTDYLTVADGARYMGICTEDYYKGAMAYIEERVSNPVYYIFSDDVEYAKQHYHQ DNMHVVDWNSKADSIYDMQLMSKCKHNICANSTFSMWAARLNQNKEKIMIRPLHHDNYETTTATQVK QNWKNWILLDQNGQVCE Lachnospiraceae WP_022742385.1 550997676 protein 22.96 MTMNIIRMSGGLGSQMFQYALYLKLKSMGKEVKFDDINEYRGEKARPIMLAVFGIEYPRATWDEITSFTDG 243 bacterium 10-1 [Lachnospiraceae SMDLLKRLRRKIFGRKAIEYEEQGFYDPNVLNFDSMYLRGNFQSEKYFQDIKEEVRKLYRFSTLEDMRLPERL bacterium 10-1] YKATKACLDGIESSESVGLHMYRSDSRVDGELYDGICFGNYYKGAVRFIQDKVPDAKFYIFSNEPKWVRGW VVDLIQSQIQEGMSPSQVKEMEKRFVMVEANTEYTGYLDMMLMSKCKHNIISNSSFSWWSAWMNDHP EKVVVAPDRWSSDKEGNEIYTTGMTLVNEKGRVNYTIHENSTVK Prevotella WP_004362670.1 490496500 protein 22.96 MILSYITGRLGNQLFEYAYARSLLLKRGKNEELILNFSLVRAAGKEIEGFDDNLRYFNVYSYTELDKDIVLS 244 nigrescens; [Prevotella KGDLLQLFIYILFKLDQKLFRIIKKEKWFSFFRRFGIIFQDYLDNISNLIIPRTKNVFCYGKYENPKYFDDI Prevotella nigrescens] RSILLKEFTPRIPPLKNNDQLYSVIESTNSVCISIRRGDFLCDKFKDRFLVCDKEYFLEAMEEAKKRISNST nigrescens FIFFSDDIEWVRENIHSDVPCYYESGKDPVWEKLRLMYSCKHFIISNSTFSWWAQYLSRNEEKVVIAPDRWS F0103 NVPGEKSFLLSNSFIKIPIGILP Bacteroides sp. WP_022353235.1 547952493 fucosyl 22.95 MIYVEINGRLGNNMFEIAAAKSLTDEVTLWCKGDWQLNCIKMYSDTLFKNYPIVKSLPNNIRIYEEPEFTFH 245 CAG: 875 transferase PIPYKENQDLLIKGYFQSYKYLDREKVLKLYPCPMPVKLDIEKRFGDILSQYTVVSINVRRGDYLNLPHRHPFV [Bacteroides GKKFLERAMLWFGDKVHYIISSDDIEWCKAHFKQFDNVHYLTNSYPLLDLYIQTACHHNIISNSSFSWWGA sp. CAG: 875] YLNNHPQKIVIAPHRWFGMSTNINTQDLLPPEWMIEQCVYEPKVFLKALPLHAKYLLKRVLK Prevotella sp. YP_008444280.1 532354444 protein 22.9 MDSQLLKHIKLSGGFGNQLFQYFFGEYLKEKYNCSISFFSEPALDINQLQIHRFFPTLRISHNTELRRFHYAFT 246 oral taxon 299 HMPREF0669_00176 QQLAYRCMRKLLLLFPFLNRKVKIENGSNYQNQSFNDTYCFDGYWQSYRYLSAFTPSLQFEDQLINDISADY str. F0039; [Prevotella INAIEQSEAVFLHIRRGDYLNKENQKVFAECPLNYFENAVNKIKEGNKTYHFFVFSNDIEWVKCHLKLNNNE Prevotella sp. oral taxon VTFIQNEGSSCDLKDFYLMTRCKHAIISNSTFSWWAAYLINNNDKKVIAPKRWYNDLSMNNATKDLIPPTW sp. oral 299 str. 
F0039] IRL taxon 299 Paraprevotella WP_008628783.1 495904204 alpha-1,2- 22.87 MKIVCLKGGLGNQMFEYCRFRDLMDSGNGKVYLFYDRRRLKQHDGLRLSDCFELELPSCPWGIRLVVWGL 247 xylaniphila; fucosyl- KICRAIGVLKRLYDDEKPDAVLIDDYSQHRRFIPNARRYFSFRQFLAELQSGFVQMIRAVDYPVSVHVRRGD Paraprevotella transferase YLHPSNSSFVLCGVDYFRQAIAYVRKKRPDARFFFFSDDMEWVRENLWMEDAVYVEHTELMPDYMDLYL xylaniphila [Paraprevotella MTLCRGHIISNSTFSFWGAYLAVDGNGMKIYPRRWFRDPTWITPPIFSEEWVGL YIT 11841 xylaniphila] Dethio- WP_005658864.1 491897177 glycosyl 22.84 MFQYAFGRALALDLGLDLKLDISNFGSDSRPFSLGIYSLTKNIPFGCYLSTSTRLKVKMTKKLRRWGVWGMD 248 sulfovibrio transferase KNMPGVLVEPFPPVLVSLDEVLSEKLSHLFVDGYWQSEKYFSRYSDVIRSDFRVIEESSAFLAWKKRMLSEP peptidovorans; family 11 GGSISVHVRRGDYVTDSSANRVHGVLPIEYYLRAKEILNTISDGLVFYVFTDDPVWARNNLCLGDKTIYVSGE Dethio- [Dethio- DLKDYEELALMSCCDHHVVANSSFSWWGAWLGQDTSTVTIAPGRWFRKMDSSFVIPDNWIKIWT sulfovibrio sulfovibrio peptidovorans peptidovorans] DSM 11002 Lachnospiraceae WP_016229292.1 510896192 protein 22.83 MIIIQVMGGLGNQLQQYALYRKFVRMGKEARLDISWFLDKEKRGEVLAERELELDYFDRLIYETCTPEEKEQ 249 bacterium 10-1 [Lachnospiraceae LIGSEGVAGKLKRKFLPGRIRWFHESKIYHPELLQMENMYLSGYFACEKYYADILYDLREKIQFPVNDHPKNI bacterium 10-1] KMAQEMQERESVSVHLRRGDYLDEKNTAMFGNICTDAYYCKAIEYMKTLCSKPHFYIFSDDIPYVRQRFTG EEYTVVDINHGRDSFFDMWLMSRCRHNICANSTFSFWGARLNSNDNKIMIRPTIHKNSQVFVKEEMEQL WPGWKFISPDGGIK Treponema WP_016525279.1 513872223 protein 22.82 MFCAAFVEALKHAGQKVFVDTSLYNKGTVRSGIDFCHNGLETEHLFGIKFDEADKADVHRLSTSAEGLLNRI 250 maltophilum; [Treponema RRKYFTKKTHYIDTVFRYTPEVLSDKSDRYLEGFWQTEKYFLPIESDIRTLFRFRQPLSEKSAAVQSALQAQ Treponema maltophilum] EPASLSASIHVRRGDFLHTKTLNVCTETYYNNAIEYAAKKYAVSAFYVFSDDIQWCREHLNFFGARSVFIDW maltophilum NIGADSWQDMVLMSMCRCNIIANSSFSWWAAWLNAASDKIVLAPAIWNRRQLEYADRYYGYDYSDVIPET ATCC 51939 WIRIPI Bacteroides WP_016276676.1 511022363 protein 22.79 MKLVSFTAGLGNQLFQYCFYRYLLNKFPNEKIYGYYNKKWLKKHGGIIIEHFFDVKLPRSTRWINLYGQYLRI 251 massiliensis; [Bacteroides IYKCFSCGVSKDDDFEMNRTMFVGYWQDQCFFSGINISYKKNLVISEKNTWLLGEILKCNSVAIHFRRGDYM Bacteroides 
massiliensis] LPQFKKIFGEVCTVKYYLKSIRKVEEKISEPVFFVFSDDIDWVKQNFTFNKVYFVDWNKGQNSFWDMYLMS massiliensis QCSANIIANSTFSFWGAYLNKNNPFVIYPQKWVRTNLKQPNIFPKTWMAL dnLKV3 Enterococcus YP_006376560.1 389869137 family 11 22.71 MIVLTLGGGLGNQMFQYGYARYIQKIHREKFIYINDSEVIKEADRFNSLGNLNTVNIKVLPRIISKPLNETERL 252 faecium; glycosyl- VRKIMVRLFGVAGFNESAIFQSLNKFGIYYHPSVYKFYESLKTGFPIKIIEGGFQSWKYLETCPEIKQELRVKY Enterococcus transferase EPMGENLRLLNLISQSESVCVHIRRGDYLSPKYKHLNVCDYQYYFESMNYIISKLNNPTFFIFSNTSDDLDWIK faecium [Enterococcus ENYSLPGKIVYVKNDNPDYEELRLMYSCKHFIISNSTFSWWAQYLSNNSGIVIAPEIWNRLNHDGIADLYMP DO; faecium DO] NWITMKVNR Enterococcus faecium EnGen0035 Bacteroides; WP_004313284.1 490442319 glycosyl 22.67 MDVVVIFNGLGNQMSQYAYYLAKKKVNPNTKVIFDIMSKHNHYGYDLERAFGIEVNKTLLIKVLQIIYVLSR 253 Bacteroides sp. transferase KFRLFKSVGVRTIYEPLNYDYTPLLMQKGPWGINYYVGGWHSEKNFMNVPDEVKKAFMFREQPNEDRFN 2_1_22; family 11 EWLQVIRGDNSSVSVHIRRGDYMNIEPTGYYQLNGVATLDYYHEAIDYIRQYVDTPHFYVFSNDLDWCKE Bacteroides sp. [Bacteroides] QFGVENFFYIECNQGVNSWRDMYLMSECHYHINANSTFSWWGAWLCKFEDSITVCPERFIRNVVTKDFY 2_2_4; PERWHKIKSC Bacteroides sp. 
D1; Bacteroides xylanisolvens SD CC 2a; Bacteroides xylanisolvens SD CC 1b; Bacteroides ovatus CAG: 22 Synechococcus YP_004322362.1 326781960 glycosyl- 22.6 MIGFNALGRMGRLANQMFQYASLKGIARNTGVDFCVPYHEEAVNDGIGNMLRTEIFDSFDLQVNVGLLN 254 phage transferase KGHAPVVQERFFHFDEELFRMCPDHVDIRGYFQTEKYFKHIEDEIREDFTFKDEILNPCKEMIAGVDNPLAL S-SM2 family 11 HVRRTDYVTNSANHPPCTLEYYEAALKHFDDDRNVIVFSDDPAWCKEQELFSDDRFMISENEDNRIDLCLM [Synechococcus SLCDDFIIANSTYSWWGAWLSANKDKKVIAPVQWFGTGYTKDHDTSDLIPDGWTRIATA phage S-SM2] Geobacter YP_006720295.1 404496189 glycosyl- 22.58 MDIHVLSYGLGNQLSQYAFFINRRQLMQRAYAFYAFKQHNGYELDRIFGLKEGLPWYLQFVRVVFRLGISR 255 metallireducens; transferase RFYSKRTADFVLSLFRIKVIDEAYNYEFDPSLLKPWFGIRILYGGWHDSRYFHPSEAAVRTAFSFPPLDDVND Geobacter [Geobacter AILQQIDAVYGVSIHVRRGDYLKGINSNLFGGIATLEYYRNAIGWAITYCKHRSLEIKFYVFSDDIDWCKQNL metallireducens metallireducens GLRDAVYVSGNSKTDSWKDILLMSHCRANIIANSTFSWWAAWLNQQPNKVVICPTKFINTDSPNQTIYPA GS-15; GS-15] AWHQIEG Geobacter metallireducens RCH3 Lachnospiraceae WP_022780989.1 551037245 protein 22.58 MIIVRFHGGLGNQMFEYAFYRYMTNKYGADNVIGDMTWFDRNYSEHQGYELKKVFDIDIPAIDYKTLAKI 256 bacterium [Lachnospiraceae HEYYPRYHRFAGLRYLSRMYAKYKNKHLKPTGEYIMDFGPSQYIHNDAFDKLDTNKDYYIEGVFCSDAYIKY NK4A136 bacterium YENQIKKDLTFKPNYSQHTKDMLPKIEETNSVAIHVRRGDYVGNVFDIVTPDYYRQAVNYIRERVENPVFFV NK4A136] FSDDMDYIKANFDFLGDFVPVHNCGKDSFQDMYLISRCRHMIIANSSFSYFGALLGEKDSTIVIAPKKYKADE DLALARENWVLL Bacteroides WP_008144634.1 495419937 protein 22.56 MGFIVNMACGLANRMFQYSYYLFLKKQGYKVTVDFYRSAKLAHEKVAWNSIFPYAEIKQASRLKVFLWGG 257 coprophilus; [Bacteroides GSDLCSKVRRRYFPSSTNVRTTTGAFDASLPANTARNEYIIGVFLNASIVEAVDDEIKKCFTFLPFTDEMNLR Bacteroides coprophilus] LKKEIEECESVAIHVRKGKDYQSRIWYQNTCSMEYYRKAILQMKEKLQHSKFYVFTDNVDWVKENFQEIDYT coprophilus DSM LVEGNPADGYGSHFDMQLMSLCKHNIISNSTYSWWSAFLNRNPEKVVIAPEIWFNPDSCDEFRSDRALCK 18228 = JCM GWIVL 13818 Bacteroidetes; WP_008619736.1 495895157 alpha-1,2- 22.53 MKIVCLKGGLGNQMFEYCRFRDLMESGHDEVYLFYDHRRLKQHNGLRLSDCFELELPSCPWGIKLVVWGL 258 Capnocytophaga fucosyl- 
KICRAVGVLKRLYDDEKPEAVLIDDYSQHRRFIPNARRYFFFRQFLAELQSGFVQMIRAVDYPVSVHVRRGD sp. oral transferase YLHPSNSSFGLCGVDYFQQAIAYVRKKRPDARFFFFSDDMEWVRENLWMEDAVYVEHTELLPDYVDLYL taxon 329 [Bacteroidetes] MTLCRGHIISNSTFSFWGAYLAVDGNGMKIYPRRWFRDPTWTSPPIFSEEWVGL str. F0087; Paraprevotella clara YIT 11840 Butyrivibrio sp. WP_022770361.1 551026242 glycosyl 22.47 MLIIQIAGGLGNQMQQYAVYTKLREMGKDVKLDLSWFDPQVQKNMLAPREFELPIFGGTDYEECSAYERD 259 NC2007 transferase ALLKQGAFAAIAGKVLKKLGLRDEANPKVFSEKEMYHPEVFELEDKYIKGYFACQKYYGDIMDKLQEKFIFPE [Butyrivibrio HSDPDLHARNMALVERMEREPSVSVHIRRGDYLDPSNVEILGNIATEQYYQGAMDYFTVKEPDTHFYIFTS sp. NC2007] DHEYAREKFSDESKYTIVDWNNGKNSVQDLMLMSHCKGNICANSTFSFWGARLNKRPDKTVIRTYKMRN NQPVNPQIMHDYWKGWILMDEKGSII Paraprevotella WP_008628536.1 495903957 glycosyl 22.45 MKILVFTGGLGNQMFAYAFYLYLKRLFPQERFYGLYGKKLSEHYGLEIDKWFKVSLPRQPWWVLPVTGLFY 260 xylaniphila; transferase LYKQCVPNSKWLDLNQEICKNPRAIVFFPFKFTKKYIPDDNIWLEWKVDESGLSEKNRLLLSEIRSSDCCFVH Paraprevotella [Paraprevotella VRRGDYLSPTFKSLFEGCCTLSYYQRALKSMKEISPFVKFVCFSDDIQWVKQNLELGNRAVFVDWNSGTDS xylaniphila xylaniphila] PLDMYLMSQCRYGIMANSTFSYWGARLGRKKKRIYYPQKWWNHGTGLPDIFPNTWVKI YIT 11841 Blautia WP_005944761.1 492742598 protein 22.44 MEIHVYLTGRLGNQLFQYAFARHLQKEYGGKIICNIYELEHRSEKAAWVPGKFNYEMSNYKLNDSILIEDIKL 261 hydrogenotrophica [Blautia] PWFADFSNPIIRIVKKVIPRIYFNLMASKGYLLWQKNSYINIPAIRNNEIIVNGWWQDVRFFHDVEAELSNEI DSM VPTTKPISENEYLYNIAERENSVCVSIRGGNYLVPKVKKKLFVCDKEYFYNAIELIKSKVRNAIFIVFSDDLE 10507; Blautia; WVKSYIKLEEKFPECKFYYESGKDTVEEKLRMMTKCKHFIISNSSFSWWAQYLAKNENKIVIAPDAWFTNGDK Blautia NGLYIDDWILIPTQTKDM hydrogenotrophica CAG: 147 Geobacter YP_001952981.1 189425804 glycoside 22.44 MITVLLNGGLGNQLFQYAAGRALAEKHDVELLLDLSRLQHPKPGDTPRCFELAPFNIKASLLAEEGRQPLGS 262 lovleyi; hydrolase YQACMHRLLLKASIPLWGSIILKEQGCGFDPLIFRAPSSCILDGFWQSECYFKQITSLLQQELSLKAPSPALR Geobacter family KASSVLSDATVAVHVRRGDYVTNPAAASFHGICSQDYYQAAVANILTSYPDSQFLVFSDDPAWCQEHLDLG lovleyi SZ protein QPFRLAADFGLNGSAEELVLISRCAHQIIANSSFSWWGAWLNPSPHKLVVAPCRWFTDPAITTNDLLPETW [Geobacter 
VRLP lovleyi SZ] Lachnospiraceae WP_022781176.1 551037435 protein 22.41 MVISHLSGGFGNQLYSYAFAYAVAKARKEELWIDTAIQDAPWFFRNPDILNLNIKYDKRVSYKIGEKKIDKIF 263 bacterium [Lachnospiraceae NRINFRNAIGWNTKIINESDMPNIDDWFDTCVNQKGNIYIKGNWSYEKLFISVKQEIIDMFTFKNELSKEAN NK4A136 bacterium DIAQDINSQETSVGIHYRLGDYVKIGIVINPDYFISAMTSMVEKYGNPVFYSFSEDNDWVKKQFEGLPYNIKY NK4A136] VEYSSDDKGLEDFRLYSMCKHQIASNSSYSWWGAYLNNNPNKYIIAPTDYNGGWKSEIYPKHWDVRPFEF LK Bacteroides WP_005840359.1 492426440 glycosyl 22.37 MFHYKFLLFGGGLGNQIFEYYFYLWLRKKYPNIVFLGCYRKASFKAHNGLEISDVFDVDLPNDGGLSGRFISY 264 vulgatus; transferase VLSVLSRIIPSLSMKANTEYSSKYLLINAYQPNLLFYLNEEKIKFRPFKLDEVNRRLLNSIKMESSVSIHVRR Bacteroides family 11 GDYLFGQYRDIYSNICTLAYYQKAVDKCKGILESPRFFVFSDDIEWARDVFVGREYEFVSNNIGKNSFIDMFL vulgatus [Bacteroides MSNCKIQIIANSTFSYWAAYLSNSLVKIYPAKWINGIERPNIFPDNWIGL PC510 vulgatus] Planctomyces YP_004271766.1 325110698 glycosyl 22.37 MIIARIENGLGNQLFKYAAGRALSLKHRTSLYTIPGSVRKPHETFILSKYFNVQAKSVSPFLLQTGFRLRLLK 265 brasiliensis; transferase GYENHSFGFDPRFETTRNNTVVSGNFQSARYFLPFFDQINRELTLKPEVVDGLESVYPHVLESLRTPNSVCVH Planctomyces family protein IRLGDYVSSGYDICGPEYYAKAISRLQQLHGELRAFVFSDTPQAASRFLPADIDAQIMSEFPEVRDAARSLTV brasiliensis [Planctomyces ERSTIRDYFLMQQCRHFVIPNSSFSYWAALLSSSDGDVIYPNRWYIDIDTSPRDLGLAPAEWTPIPLT DSM 5305 brasiliensis DSM 5305] Butyrivibrio sp. WP_022772730.1 551028648 glycosyl 22.36 MIILQIAGGLGNQMQQYALYRKLLKCGKTVKLDLSWFGPEIQKNMLAPREFELVLFKDLPFEICFKEEKDALI 266 AE2015 transferase KQNLFQKIAGKVSQKLGKSASSNAKVFVETKMYHEEIFDLDDVYITGYFACQYYYDDVMAELQDLFVFPSHS [Butyrivibrio IPELDQRNAVLASKMEKENSVSVHIRRGDYLSPENVGILGNIASDKYYESAMNYFLEKDENTHFYIFTNDHEY sp. 
AE2015] AREHYSDESRYTIIDWNTGKNSLQDLMLMSHCKGNICANSTFSFWGARLNKRPDRELVRTLKMRNNQEA QPEIMHEYWKNWILIDENGVIV Roseovarius WP_009813856.1 497499658 alpha-1,2- 22.34 MTDTPPPSQVITSRLFGGAGNQLFQYAAGRALADRLGCDLMIDARYVAGSRDRGDCFTHFAKARLRRDVA 267 nubinhibens fucosyl- LPPAKSDGPLRYALWRKFGRSPRFHRERGLGVDPEFFNLPRGTYLHGYWQSEQYFGPDTDALRRDLTLTTA ISM; transferase, LDAPNAAMAAQIDAAPCPVSFHVRRGDYIAAGAYAACFPDYYRAAADHLATTLGKPLTCFIFSNDPAWAR Roseovarius [Roseovarius DNLDLGQDQVIVDLNDEATGHFDMALMARCAHHVIANSTFSWWGAWLNPDPDKLVVAPRNWFATQA nubinhibens nubinhibens] LHNPDLIPEQWHRL Eubacterium sp. WP_022505071.1 548315094 protein 22.33 MIEVNIVGQLGNQMFEYACARQLQKKYGGEIVLNTYEMRKETPNFKLSILDYKLSENVKIISDKPLSSANAN 268 CAG: 581 [Eubacterium NYLVKIMRQYFPNWYFNFMAKRGTFVWKSARKYKELPELNEQLSKHIVLNGYWQCDKYFNDVVDTIREDF sp. CAG: 581] TPKYPLKAENEQLLEKIKSTESVCVTIRRGDFMNEKNKDTFYICDDDYFNKALSKIKELCPDCTFFGFSDDVE WIKKNVNFPGEVYFESGNDPVWEKLRLMSACKHFVLSNSSFSWWAQYLSDNNNKIVVAPDIWYKTGDPK KTALYQDGWNLIHIGD Providencia AFH02807.1 383289327 glycosyl- 22.26 MKINGKESSMKIKQKKIISHLIGGLGNQLFQYATSYALAKENNAKIVIDDRLFKKYKLHGGYRLDKLNIIGE 269 alcalifaciens transferase KISSIDKLLFPLILCKLSQKENFIFKSTKKFILEKKTSSFKYLTFSDKEHTKMLIGYWQNAIYFQKYFSELK [Providencia EMFVPLDISQEQLDLSIQIHAQQSVALHVRRGDYISNKNALAMHGICSIDYYKNSIQHINAKLEKPFFYIFS alcalifaciens] NDKLWCEENLTPLFDGNFHIVENNSQEIDLWLISQCQHHIIANSTFSWWGAWLANSDSQIVITPDPWFNKEI DIPSPVLSHWLKLKK Salmonella AFW04804.1 411146173 glycosyl- 22.26 MFSCLSGGLGNQMFQYSAAYILKKNICHAQLIIDDSYFYCQPQKDTPRNFEINQFNIVFDRVTTDEEKRAISK 270 enterica transferase LRKFKKIPLPLFKSNVITEFLFGKSLLTDEDFYKVLKKNQFTVKMNACLFSLYQDSSLINKYRDLILPLFTIN [Salmonella DELLQVCQQLDSYGFICEHTNTTSLHIRRGDYVTNPHAAKFHGTLSMNYYSQAMNYVDHKLGKQLFIIFSDDV enterica] QWAAEKFGGRSDCYIVNNVNCQFSAIDMYLMSLCNNNIIANSTYSWWGAWLNKSEEKLVIAPRKWFAEDK ESLLAVNDWISI Sulfurospirillum YP_003304829.1 268680398 protein 22.18 MIIIKIMGGLASQLHKYSVGRALSLKYNTELKLDIFWFDNISGSDTIREYHLDKYNVVAKIATEQEIKQFKPNK 271 deleyianum; Sdel_1779 YLLKINNLFQKFTNWKINYRNYCNESFISLENFNLLPDNIYVEGEWSGDRYFSHIKEILQKELTLKSEYMDSTN 
Sulfurospirillum [Sulfuro- HFLAKQSSDFAHDDNASKLHCTCSLEYYKKALQYISKNLLKMKLLIFSDDLDWLKPNFNFLDNVEFEFVEGF deleyianum spirillum QDYEEFHLMTLSKHNIIANSGFSLFFAWLNINHNKIIISLSEWVFEEKLNKYIIDNIKDKNILFLENLE DSM 6946 deleyianum DSM 6946] Pseudovibrio YP_005080114.1 374329930 alpha-1,2- 22.15 MSVASQVRISGAARRRKLKPTLIVRIRGGIGNQLFQYALGRKIALETGMKLRFDRSEYDQYFNRSYCLNLFKT 272 sp. FO-BEG1 fucosyl- QGLSATESEMSAVLWPAQSFGQTVKLCRKFYPFYQRRYIREDELLQDSETPVLKQSAYLDGYWQTWEIPFSI transferase MEQLRDEITLKKPMVLERLKLLQRIKSGPSAALHVRYGDYSQAHNLQNFGLCSAGYYKGAMDFLTERVPGL [Pseudovibrio TFYVFSDSPERAREVVPQQENVYFSDPMQDGKDHEDLMVMSSCDHIVTANSTFSWWAAFLNGNEDKHV sp. FO-BEG1] IAPLKWFKNPNLDDSLIVPPHWQRL Prevotella sp. WP_009236633.1 496529942 alpha-1,2- 22.11 MKIVCIKGGLGNQLFEYCRYHGLLRQHNNHGVYLHYDRRRTKQHGGVWLDKAFLITLPTEPWRVKLMVM 273 oral taxon 472 fucosyl- ALKMLRKLHLFKRLYREDDPRAVLIDDYSQHKQFITNAAEILNFRPFAQLDYVDEITSEPFAVSVHVRRGD str. F0295; transferase YLLPANKANFGVCSVHYYLSAAVAVRERHPDARFFVFSDDIEWAKMNLNLPNCVFVEHAQPQPDHADLYLM Prevotella [Prevotella SLCKGHIIANSTFSFWGAYLSMGSSAIAIYPKQWFAEPTWNAPDIFLGHWIAL sp. oral sp. oral taxon 472 taxon 472] Butyrivibrio WP_022752732.1 551008155 glycosyl 22.08 MLIIRVAGGLGNQMQQYAMYRKLKSLGKEVKLDLSWFDVENQEGQLAPRKCELKYFDGVDFEECTDAER 274 fibrisolvens transferase AYFTKRSILTKALNKVFPATCKIFEETEMFHPEIYSFKDKYLEGYFLCNKYYDDILPFIQNEIVFPKHSDPK [Butyrivibrio RMQKNEELMERMDGWHTASIHLRRGDYITEPQNEALFGNIATDAYYDAAIRYVLDKDYQTHFYIFSNDPEYA fibrisolvens] REHYSDESRYTIVTGNDGDNSLLDMELMSHCRYNICANSTFSFWGARLNKRSDKEMIRTFKMRNNQEVTARE MTDYWKDWILIDEKGNRIF Lewinella WP_020571066.1 522059857 protein 22.04 MVISRLHSGLGNQMFQYAFARRIQLQLNVKLRIDLSILLDSRPPDGYIKREYDLDIFKLSPAYHCNPTSLRI 275 persica [Lewinella LYAPGKYRWSQVVRDLARKGYPVYMEKSFSVDNTLLDSPPDNVIYQGYWQSERYFSEVANTIRKDFAFQHSI persica] QPQSESLAREIRKEDSVCLNIRRKDYLASPTHNVTDETYYENCIQQMRERFSGARFFLFSDDLVWCREFFAD FHDVVIVGHDHAGPKFGNYLQLMAQCHHYIIPNSTFAWWAAWLGERTGSVIMAPERWFGTDEFDYRDVV PERWLKVPN
OTHER EMBODIMENTS
(118) While the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
(119) The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference. All published foreign patents and patent applications cited herein are hereby incorporated by reference. GenBank and NCBI submissions indicated by accession number cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.
(120) While this invention has been particularly shown and described with reference to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.