Group of UDP-glycosyltransferase for catalyzing carbohydrate chain elongation and application thereof

11542484 · 2023-01-03

Abstract

The present invention relates to a group of glycosyltransferases and applications thereof. Specifically, provided is the use of glycosyltransferases GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15, as well as derived polypeptides thereof, to elongate a carbohydrate chain at the first glycosyl at position C-20, the first glycosyl at position C-6, and the first glycosyl at position C-3 of a tetracyclic triterpene compound substrate, thereby obtaining ginsenoside products such as ginsenoside Rg3, ginsenoside Rd, ginsenoside Rb1, ginsenoside Rb3, saponin DMGG, saponin DMGX, gypenoside LXXV, gypenoside XVII, gypenoside XIII, gypenoside IX, notoginsenoside U, notoginsenoside R1, notoginsenoside R2, notoginsenoside R3, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-PPD, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-CK, 20-O-Glucosylginsenoside Rf, and Ginsenoside F3. The glycosyltransferases of the present invention can further be applied to the construction of artificially synthesized ginsenosides, novel ginsenosides, and derivatives thereof.

Claims

1. An in vitro glycosylation method, comprising the steps of: (a) expressing a glycosyltransferase in a host cell, wherein the glycosyltransferase comprises SEQ ID NO:4; (b) collecting lysates of the host cell wherein the glycosyltransferase has been expressed; and (c) in vitro transferring a glycosyl group from a glycosyl donor to the following positions of a tetracyclic triterpenoid in the presence of the lysates of the host cell: the first glycosyl group at position C20, or positions C3 and C20; thereby forming a glycosylated tetracyclic triterpenoid.

2. The method of claim 1, wherein the glycosyltransferase further comprises a tag sequence or a signal sequence, and wherein the glycosyltransferase is operably linked to the tag sequence or signal sequence.

3. The method of claim 2, wherein the tag sequence is selected from the group consisting of FLAG, HA, HA1, c-Myc, Poly-His with 2-10 residues, Poly-Arg with 5-6 residues, and Strep-TagII.

4. The method of claim 2, wherein the signal sequence comprises a pelB signal peptide sequence.

5. The method of claim 1, wherein the glycosyl donor comprises UDP-glucose, UDP-xylose, UDP-arabinose, or any combination thereof.

6. The method of claim 1, wherein the tetracyclic triterpenoid comprises protopanaxadiol ginsenoside CK, ginsenoside Rd, protopanaxatriol ginsenoside F1, or a combination thereof.

Description

DESCRIPTION OF FIGURES

(1) FIG. 1 (A) shows SDS-PAGE analysis of the expression of glycosyltransferase genes GT29-32, GT29-33, and GT29-34 in E. coli; lane control represents total protein of lysate or lysis supernatant of empty vector recombinant pet28a; lane GT29-32 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-32; lane GT29-33 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-33; lane GT29-34 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-34; (B) shows Western blot analysis of the expression of glycosyltransferase genes GT29-32, GT29-33 and GT29-34 in E. coli; lane control represents total protein of lysate or lysis supernatant of empty vector recombinant pet28a; lane GT29-32 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-32; lane GT29-33 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-33; lane GT29-34 represents total protein of lysate or lysis supernatant of recombinant E. coli BL21-GT29-34.

(2) FIG. 2 shows a TLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-32, GT29-33 and GT29-34 with ginsenoside CK or Rd as a glycosyl acceptor and UDP-glucose or UDP-xylose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32, GT29-33 and GT29-34 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 as an enzyme solution.

(3) FIG. 3 shows an HPLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-32, GT29-33 and GT29-34 with ginsenoside CK or Rd as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32, GT29-33 and GT29-34 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 as an enzyme solution.

(4) FIG. 4 shows an HPLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-32, GT29-33, and GT29-34 with ginsenoside Rd as a glycosyl acceptor and UDP-xylose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32, GT29-33, and GT29-34 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 as an enzyme solution.

(6) FIG. 5 shows a TLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-32, GT29-33 and GT29-34 with ginsenoside F1 as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32, GT29-33, and GT29-34 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 as an enzyme solution.

(7) FIG. 6 shows an HPLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-32, GT29-33, and GT29-34 using ginsenoside F1 as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32, GT29-33, and GT29-34 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 as an enzyme solution.

(8) FIG. 7 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 with ginsenoside Rg1 as a glycosyl acceptor and UDP-xylose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-4, BL21-GT29-5, BL21-GT29-7, BL21-GT29-9, BL21-GT29-11, BL21-GT29-13, BL21-GT29-17, BL21-GT29-18, BL21-GT29-24 and BL21-GT29-25 as an enzyme solution.

(9) FIG. 8 shows an HPLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases GT29-4, GT29-5, GT29-7 and GT29-9 using ginsenoside Rg1 as a glycosyl acceptor and UDP-xylose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-4, GT29-5, GT29-7, and GT29-9 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-4, BL21-GT29-5, BL21-GT29-7 and BL21-GT29-9 as an enzyme solution.

(10) FIG. 9 shows an HPLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 with ginsenoside Rg1 as a glycosyl acceptor and UDP-xylose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-11, BL21-GT29-13, BL21-GT29-17, BL21-GT29-18, BL21-GT29-24, and BL21-GT29-25 as an enzyme solution.

(11) FIG. 10 shows Western blot detection of the protein expression of glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 respectively represent the lysate supernatants of recombinant E. coli BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14, and BL21-PNUGT29-15 as an enzyme solution.

(12) FIG. 11 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 with ginsenoside Rd as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 respectively represent the lysate supernatants of recombinant E. coli BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14, and BL21-PNUGT29-15 as an enzyme solution.

(13) FIG. 12 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 with ginsenoside CK as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 respectively represent the lysate supernatants of recombinant E. coli BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14, and BL21-PNUGT29-15 as an enzyme solution.

(14) FIG. 13 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 with ginsenoside Rh2 as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, and PNUGT29-15 respectively represent the lysate supernatants of recombinant E. coli BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14, and BL21-PNUGT29-15 as an enzyme solution.

(15) FIG. 14 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases GT29-36, GT29-37, GT29-42 and GT29-43 with ginsenoside Rd as a glycosyl acceptor and UDP-xylose as a glycosyl donor. GT29-36, GT29-37, GT29-42 and GT29-43 respectively represent the lysate supernatants of recombinant E. coli BL21-GT29-36, BL21-GT29-37, BL21-GT29-42 and BL21-GT29-43 as an enzyme solution.

(16) FIG. 15 shows a TLC pattern of a transglycosyl reaction catalyzed by the glycosyltransferases GT29-45 and GT29-46 using ginsenoside Rh2 as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-45 and GT29-46 respectively represent the lysate supernatants of the recombinant E. coli BL21-GT29-45 and BL21-GT29-46 as an enzyme solution.

(17) FIG. 16 shows a TLC pattern of a transglycosyl reaction catalyzed by glycosyltransferases GT29-45 and GT29-46 using ginsenoside CK as a glycosyl acceptor and UDP-glucose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-45 and GT29-46 respectively represent the lysate supernatants of the recombinant E. coli BL21-GT29-45 and BL21-GT29-46 as an enzyme solution.

(18) FIG. 17 shows a TLC pattern of a transglycosyl reaction catalyzed by glycosyltransferase GT29-32 using ginsenoside CK as a glycosyl acceptor and UDP-arabinose as a glycosyl donor. Control represents the lysate supernatant of the pet28a empty vector recombinant as an enzyme solution; GT29-32 represents the lysate supernatant of the recombinant E. coli BL21-GT29-32 as an enzyme solution.

DETAILED DESCRIPTION

(19) After extensive and in-depth study, the present inventors have for the first time provided new glycosyltransferases and the corresponding glycosyltransfer catalytic sites. Specifically, the glycosyltransferases GT29-32 (SEQ ID NO.: 4), GT29-33 (SEQ ID NO.: 6), GT29-34 (SEQ ID NO.: 8), GT29-4 (SEQ ID NO.: 12), GT29-5 (SEQ ID NO.: 14), GT29-7 (SEQ ID NO.: 16), GT29-9 (SEQ ID NO.: 18), GT29-11 (SEQ ID NO.: 20), GT29-13 (SEQ ID NO.: 22), GT29-17 (SEQ ID NO.: 24), GT29-18 (SEQ ID NO.: 26), GT29-19 (SEQ ID NO.: 116), GT29-20 (SEQ ID NO.: 118), GT29-21 (SEQ ID NO.: 120), GT29-22 (SEQ ID NO.: 122), GT29-23 (SEQ ID NO.: 124), GT29-24 (SEQ ID NO.: 28), GT29-25 (SEQ ID NO.: 30), GT29-36 (SEQ ID NO.: 90), GT29-37 (SEQ ID NO.: 92), GT29-42 (SEQ ID NO.: 94), GT29-43 (SEQ ID NO.: 96), GT29-45 (SEQ ID NO.: 98), GT29-46 (SEQ ID NO.: 100), PNUGT29-1 (SEQ ID NO.: 39), PNUGT29-2 (SEQ ID NO.: 41), PNUGT29-3 (SEQ ID NO.: 43), PNUGT29-4 (SEQ ID NO.: 45), PNUGT29-5 (SEQ ID NO.: 47), PNUGT29-6 (SEQ ID NO.: 49), PNUGT29-7 (SEQ ID NO.: 51), PNUGT29-8 (SEQ ID NO.: 53), PNUGT29-9 (SEQ ID NO.: 55), PNUGT29-14 (SEQ ID NO.: 57), and PNUGT29-15 (SEQ ID NO.: 59) can specifically and efficiently catalyze the hydroxyl glycosylation of the first glycosyl group on the C-20, C-6, or C-3 position of a tetracyclic triterpene compound substrate or replace the original glycosyl group with a glycosyl group to extend the carbohydrate chain.

(20) The glycosyltransferases of the present invention are particularly capable of converting ginsenosides CK, DMG, F2, Rd, F1, Rh1, and Rg1 to ginsenoside Rg3, ginsenoside Rd, ginsenoside Rb1, ginsenoside Rb3, saponin DMGG, saponin DMGX, gypenoside LXXV, gypenoside XVII, gypenoside XIII, gypenoside IX, notoginsenoside U, notoginsenoside R1, notoginsenoside R2, notoginsenoside R3, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-PPD, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-CK, 20-O-Glucosylginsenoside Rf, and Ginsenoside F3.

Definition

(21) As used herein, the terms “active polypeptide”, “polypeptide of the present invention and the derivative polypeptide thereof”, “the enzyme of the present invention” and “glycosyltransferase” can be used interchangeably and all refer to GT29-32 (SEQ ID NO.: 4), GT29-33 (SEQ ID NO.: 6), GT29-34 (SEQ ID NO.: 8), GT29-4 (SEQ ID NO.: 12), GT29-5 (SEQ ID NO.: 14), GT29-7 (SEQ ID NO.: 16), GT29-9 (SEQ ID NO.: 18), GT29-11 (SEQ ID NO.: 20), GT29-13 (SEQ ID NO.: 22), GT29-17 (SEQ ID NO.: 24), GT29-18 (SEQ ID NO.: 26), GT29-19 (SEQ ID NO.: 116), GT29-20 (SEQ ID NO.: 118), GT29-21 (SEQ ID NO.:120), GT29-22 (SEQ ID NO.:122), GT29-23 (SEQ ID NO.:124), GT29-24 (SEQ ID NO.: 28), GT29-25 (SEQ ID NO.: 30), GT29-36 (SEQ ID NO.: 90), GT29-37 (SEQ ID NO.: 92), GT29-42 (SEQ ID NO.: 94), GT29-43 (SEQ ID NO.: 96), GT29-45 (SEQ ID NO.: 98), GT29-46 (SEQ ID NO.: 100), PNUGT29-1 (SEQ ID NO.: 39), PNUGT29-2 (SEQ ID NO.: 41), PNUGT29-3 (SEQ ID NO.: 43), PNUGT29-4 (SEQ ID NO.: 45), PNUGT29-5 (SEQ ID NO.: 47), PNUGT29-6 (SEQ ID NO.: 49), PNUGT29-7 (SEQ ID NO.: 51), PNUGT29-8 (SEQ ID NO.: 53), PNUGT29-9 (SEQ ID NO.: 55), PNUGT29-14 (SEQ ID NO.: 57), PNUGT29-15 (SEQ ID NO.: 59) polypeptides and the derivative polypeptides thereof.

(22) As used herein, “the isolated polypeptide” or “active polypeptide” means that the polypeptide is substantially free of other proteins, lipids, carbohydrates, or other substances with which it is naturally associated. Those skilled in the art can purify the polypeptide using standard protein purification techniques. Substantially pure polypeptides can form a single main band on a non-reduced polyacrylamide gel. The purity of the polypeptide can be further analyzed using the amino acid sequence.

(23) The active polypeptide of the present invention may be a recombinant polypeptide, a natural polypeptide, or a synthetic polypeptide. The polypeptides of the present invention may be naturally purified products or chemically synthesized products, or produced from prokaryotic or eukaryotic hosts (e.g., bacteria, yeast, plants) using recombinant techniques. Depending on the host used in the recombinant production protocol, the polypeptides of the present invention may be glycosylated or may be non-glycosylated. The polypeptides of the present invention may also include or exclude the starting methionine residue.

(24) The present invention further provides fragments, derivatives and analogs of the polypeptides. As used herein, the terms “fragment”, “derivative” and “analog” refer to a polypeptide that substantially retains the same biological function or activity of the polypeptide.

(25) The polypeptide fragment, derivative or analog of the present invention may be (i) a polypeptide in which one or more conservative or non-conservative amino acid residues (preferably conservative amino acid residues) are substituted, where such substituted amino acid residues may or may not be encoded by the genetic code, or (ii) a polypeptide having a substituent group in one or more amino acid residues, or (iii) a polypeptide formed by fusion of a mature polypeptide with another compound, such as a compound that extends the half-life of the polypeptide (for example, polyethylene glycol), or (iv) a polypeptide formed by fusing an additional amino acid sequence to this polypeptide sequence (such as a leader sequence, a secretion sequence, a sequence or protease sequence used to purify this polypeptide, or a fusion protein formed with an antigen IgG fragment). In accordance with the teachings herein, these fragments, derivatives, and analogs are within the knowledge of those skilled in the art.

(26) The active polypeptide of the present invention has glycosyltransferase activity and can catalyze one or more of the following reactions:

(27) ##STR00006##

(28) wherein R1 is H, a monosaccharide glycosyl or a polysaccharide glycosyl; R2 is H or OH; R3 is a monosaccharide glycosyl; R4 is a monosaccharide glycosyl, and the polypeptide is selected from SEQ ID NO: 4, 6, 8, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 98, 100, 116, 118, 120, 122, and 124 and a derivative polypeptide thereof.

(29) In another preferred embodiment, the monosaccharide comprises glucose (Glc), rhamnose (Rha), acetylglucose (Glc(6)Ac), arabinofuranose (Araf), arabinopyranose (Arap), or xylose (Xyl) and the like.

(30) In another preferred embodiment, the polysaccharide comprises Glc (2-1) Glc, Glc (6-1) Glc, Glc (6) Ac, Glc (2-1) Rha, Glc (6-1) Arap, Glc (6-1) Xyl, Glc (6-1) Araf, Glc (3-1) Glc (3-1), Glc (2-1) Glu (6) Ac, Glc (6-1) Arap (4-1) Xyl, Glc (6-1) Arap (2-1) Xyl, or Glc (6-1) Arap (3-1) Xyl and other polysaccharides composed of 2-4 monosaccharides.

(31) The R1-R4 substituted compounds are shown in the table below:

(32) TABLE-US-00005
substrate   R1            R2   R3    R4          product
CK          H             OH   Glc   Glc         Gypenoside LXXV
DMG         H             H    Glc   Glc         DMGG
F2          Glc           OH   Glc   Glc         Gypenoside XVII
Rd          Glc(2-1)Glc   OH   Glc   Glc         Rb1
CK          H             OH   Glc   Xyl         Gypenoside XIII
DMG         H             H    Glc   Xyl         DMGX
F2          Glc           OH   Glc   Xyl         Gypenoside IX
Rd          Glc(2-1)Glc   OH   Glc   Xyl         Rb3
CK          H             OH   Glc   Arabinose   Ginsenoside F3

(33) that is, when R1 is H, R2 is OH, and R3 is a glucosyl, the compound of formula (I) is ginsenoside CK (CK); when R1 and R2 are both H, and R3 is a glucosyl, the compound of Formula (I) is ginsenoside DMG; when R1 is a glucosyl, R2 is OH, and R3 is a glucosyl, the compound of Formula (I) is ginsenoside F2 (F2); or

(34) when R1 is two glucosyls (Glc (2-1) Glc), R2 is OH, and R3 is a glucosyl, the compound of Formula (I) is ginsenoside Rd;
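For readers who want to use the table above programmatically, the substrate and sugar pairs can be held in a small lookup. The following is only an illustrative sketch: the dictionary name, the function name, and the shorthand "Ara" for arabinose are assumptions made here, while the rows themselves are transcribed from the table.

```python
# Rows transcribed from the table above:
# (substrate, sugar added as R4) -> product named in the table.
# "Ara" is shorthand chosen here for the "Arabinose" entry.
C20_EXTENSION_PRODUCTS = {
    ("CK", "Glc"): "Gypenoside LXXV",
    ("DMG", "Glc"): "DMGG",
    ("F2", "Glc"): "Gypenoside XVII",
    ("Rd", "Glc"): "Rb1",
    ("CK", "Xyl"): "Gypenoside XIII",
    ("DMG", "Xyl"): "DMGX",
    ("F2", "Xyl"): "Gypenoside IX",
    ("Rd", "Xyl"): "Rb3",
    ("CK", "Ara"): "Ginsenoside F3",
}


def c20_extension_product(substrate, added_sugar):
    """Return the product listed for this pair, or None if the pair is not in the table."""
    return C20_EXTENSION_PRODUCTS.get((substrate, added_sugar))


if __name__ == "__main__":
    print(c20_extension_product("Rd", "Xyl"))   # pair listed in the table
    print(c20_extension_product("Rg1", "Glc"))  # pair not listed in this table
```

A pair that is absent from the table returns None rather than a guessed product, so the lookup never asserts a reaction the table does not list.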

(35) ##STR00007##

(36) wherein R1 is H, a glycosyl or a polysaccharide glycosyl, R2 is a glycosyl, R3 is a glycosyl, the polypeptide is selected from SEQ ID NOs.: 4 and a derivative polypeptide thereof;

(37) The R1-R3 substituted compounds are shown in the table below:

(38) TABLE-US-00006
substrate   R1    R2    R3    product
F1          H     Glc   Glc   Notoginsenoside U
Rg1         Glc   Glc   Glc   Notoginsenoside R3

(39) that is, when R1 is H and R2 is a glucosyl, the compound of Formula (III) is ginsenoside F1 (F1); or when R1 and R2 are glucosyls, the compound of Formula (III) is ginsenoside Rg1 (Rg1);

(40) ##STR00008##

(41) wherein R1 and R2 are H or glycosyls, and R3 and R4 are glycosyls. The polypeptide is selected from SEQ ID NOs.: 12, 14, 16, 18, 20, 22, 24, 26, 28, 30 or a derivative polypeptide thereof;

(42) The R1-R4 substituted compounds are shown in the table below:

(43) TABLE-US-00007
substrate   R1   R2    R3    R4    product
Rg1         H    Glc   Glc   Xyl   Notoginsenoside R1
Rg1         H    Glc   Glc   Glc   20-O-Glucosylginsenoside Rf
Rh1         H    H     Glc   Xyl   Notoginsenoside R2
Rh1         H    H     Glc   Glc   Ginsenoside Rf

(44) that is, when R1 is H and both R2 and R3 are glucosyls, the compound of Formula (V) is ginsenoside Rg1;

(45) when R1 and R2 are H, and R3 is glucosyl, the compound of Formula (V) is ginsenoside Rh1.

(46) ##STR00009##

(47) wherein R1 is a glycosyl; R2 and R3 are OH or H; R4 is a glycosyl or H; R5 is a glycosyl, R5-R1-O is a glycosyl derived from the first glycosyl of C3; and the polypeptide is selected from SEQ ID NOs.: 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 98, 100, 116, 118, 120, 122, and 124 and a derivative polypeptide thereof;

(48) TABLE-US-00008
substrate         R1    R2   R3   R4            R5    product
Rh2               Glc   H    OH   H             Glc   Rg3
F2                Glc   H    OH   Glc           Glc   Rd
Gypenoside XVII   Glc   H    OH   Glc(6,1)Glc   Glc   Rb1
Gypenoside IX     Glc   H    OH   Glc(6,1)Xyl   Glc   Rb3

(49) that is, when R1 is a glucosyl, R2 is H, R3 is OH, and R4 is H, the compound of Formula (VII) is Rh2;

(50) when R1 is a glucosyl, R2 is H, R3 is OH, and R4 is a glucosyl, the compound of Formula (VII) is F2;

(51) when R1 is a glucosyl, R2 is H, R3 is OH, and R4 is two glucosyl groups, the compound of Formula (VII) is Gypenoside XVII;

(52) when R1 is a glucosyl, R2 is H, R3 is OH, and R4 is a glucosyl extended with a xylose, the compound of Formula (VII) is Gypenoside IX;

(53) ##STR00010##

(54) wherein R1 is a glycosyl; R2 and R3 are OH or H; R4 is a glycosyl or H; R5 is a glycosyl and R5-R1-O is a glycosyl derived from the first glycosyl of C3; R6 is a glycosyl and R6-R1-O is a glycosyl derived from the first glycosyl of C3, and the polypeptide is selected from SEQ ID NOs.: 41, 45, 90, 92, 94, and 96 and a derivative polypeptide thereof;

(55) When R1 is two glucosyl groups, R2 is H, R3 is OH, and R4 is H, the compound of Formula (IX) is Rg3.

(56) When R1 is two glucosyl groups, R2 is H, R3 is OH, and R4 is a glucosyl, the compound of Formula (IX) is Rd.
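The polypeptide lists given for the five reaction schemes above overlap, so it can help to index them by SEQ ID NO. The sketch below is illustrative only: the labels "Formula (I)" through "Formula (IX)" stand for the substrate formulas named in the text, the dictionary and function names are hypothetical, and the SEQ ID sets are copied from the "the polypeptide is selected from" clauses of each scheme.

```python
# SEQ ID NOs listed in the text for each reaction scheme, keyed by the
# substrate formula named there. Labels and names are illustrative.
REACTION_SEQ_IDS = {
    "Formula (I)": {4, 6, 8, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
                    98, 100, 116, 118, 120, 122, 124},
    "Formula (III)": {4},
    "Formula (V)": {12, 14, 16, 18, 20, 22, 24, 26, 28, 30},
    "Formula (VII)": {26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59,
                      98, 100, 116, 118, 120, 122, 124},
    "Formula (IX)": {41, 45, 90, 92, 94, 96},
}


def reactions_for(seq_id):
    """List the reaction schemes whose polypeptide list includes this SEQ ID NO."""
    return [label for label, ids in REACTION_SEQ_IDS.items() if seq_id in ids]


if __name__ == "__main__":
    for seq_id in (4, 26, 41, 90):
        print(seq_id, reactions_for(seq_id))
```

The result follows the order in which the schemes appear in the text, and a SEQ ID NO that is not listed for any scheme gives an empty list.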

(57) The preferred sequence of the polypeptide is as shown in SEQ ID NO.: 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 90, 92, 94, 96, 98, 100, 116, 118, 120, 122, or 124, and the term also includes polypeptide variants and the derived polypeptides that have the same function as the indicated polypeptides of SEQ ID NO 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 90, 92, 94, 96, 98, 100, 116, 118, 120, 122, or 124. These variant forms include (but are not limited to): one or more (usually 1-50, preferably 1-30, more preferably 1-20, most preferably 1-10) amino acid deletions, insertions and/or substitutions and the addition of one or several (usually within 20, preferably within 10, more preferably within 5) amino acids at the C-terminus and/or N-terminus. For example, in the art, the substitution of amino acids with similar or close properties usually does not change the function of the protein. As another example, adding one or several amino acids to the C-terminus and/or N-terminus usually does not change the function of the protein. The term also includes active fragments and active derivatives of the polypeptides of the present invention. The present invention also provides analogues of the polypeptides. The difference between these analogues and the natural polypeptide of the present invention may be a difference in amino acid sequence, a difference in the modification form that does not affect the sequence, or both. These polypeptides include natural or induced genetic variants. Induced variants can be obtained by various techniques, such as random mutagenesis by radiation or exposure to mutagen, or by site-directed mutagenesis or other known molecular biology techniques. 
Analogs also include analogs with residues different from natural L-amino acids (such as D-amino acids), and analogs with non-naturally occurring or synthetic amino acids (such as β, γ-amino acids). It should be understood that the polypeptide of the present invention is not limited to the representative polypeptides exemplified above.

(58) Modified (usually without changing the primary structure) forms include: in vivo or in vitro chemically derived forms of the polypeptide such as acetylation or carboxylation. Modifications also include glycosylation, such as those produced by glycosylation modification during the synthesis and processing of polypeptides or during further processing steps. This modification can be accomplished by exposing the polypeptide to an enzyme that performs glycosylation (such as mammalian glycosylation or deglycosylation enzymes). Modified forms also include sequences with phosphorylated amino acid residues (e.g., phosphotyrosine, phosphoserine, phosphothreonine). Also included are peptides that have been modified to improve their proteolytic resistance or optimize their solubility. The amino or carboxyl terminus of the GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15 protein of the present invention may also contain one or more polypeptide fragments as protein tags. Any suitable tag can be used in the present invention. For example, the tags may be FLAG, HA, HA1, c-Myc, Poly-His, Poly-Arg, Strep-TagII, AU1, EE, T7, 4A6, c, B, gE, and Ty1. These tags can be used to purify proteins. Table 1 lists some of the commercially available tags.

(59) TABLE-US-00009 TABLE 1
tag           number of residues
Poly-Arg      5-6 (usually 5)
Poly-His      2-10 (usually 6)
FLAG          8
Strep-TagII   8
C-myc         10
GST           220

(60) In order to make the translated protein secreted and expressed (such as secreted out of the cell), a signal peptide sequence, such as pelB signal peptide and the like can be added to the amino terminus of GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-32, GT29-33, GT29-34, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-9, PNUGT29-14 or PNUGT29-15. The signal peptide can be cleaved during the secretion of the polypeptide from the cell.

(61) The polynucleotide of the present invention may be in the form of DNA or RNA. DNA form includes cDNA, genomic DNA, or synthetic DNA. DNA can be single-stranded or double-stranded. DNA can be a coding strand or a non-coding strand. The coding region sequence encoding the mature polypeptide can be the same with the coding region sequence as shown in SEQ ID NO.: 3, 5, 7, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 89, 91, 93, 95, 97, 99, 115, 117, 119, 121, or 123 or degenerate variants. As used herein, “degenerate variant” in the present invention refers to a nucleic acid sequence encoding the protein having SEQ ID NO.: 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 90, 92, 94, 96, 98, 100, 116, 118, 120, 122, or 124, but differing in the coding region sequences as shown in SEQ ID NO.: 3, 5, 7, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 89, 91, 93, 95, 97, 99, 115, 117, 119, 121, or 123, respectively.
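The notion of a degenerate variant, a coding sequence that differs from the reference but encodes the same protein, can be checked mechanically. The sketch below is an illustration only: it uses the standard genetic code, and the short sequences in the usage block are hypothetical placeholders rather than any SEQ ID NO of the present invention.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA sequence codon by codon ('*' marks a stop)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_degenerate_variant(reference, candidate):
    """True if the two sequences differ as DNA but encode the same protein."""
    return reference.upper() != candidate.upper() and translate(reference) == translate(candidate)


if __name__ == "__main__":
    reference = "ATGGCTAAATAA"   # hypothetical toy sequence
    synonymous = "ATGGCCAAGTGA"  # different codons, same encoded residues
    missense = "ATGGATAAATAA"    # one codon now encodes a different residue
    print(is_degenerate_variant(reference, synonymous))
    print(is_degenerate_variant(reference, missense))
```

A sequence identical to the reference is deliberately not reported as a variant, which matches the wording "differing in the coding region sequences" above.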

(62) Polynucleotides encoding mature polypeptides of SEQ ID NO.: 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 90, 92, 94, 96, 98, 100, 116, 118, 120, 122, or 124 include: coding sequences encoding mature polypeptides only; coding sequences encoding mature polypeptides and various additional coding sequences; mature polypeptide coding sequences (and optional additional coding sequences) and non-coding sequences.

(63) The term “polynucleotide encoding a polypeptide” may include a polynucleotide encoding the polypeptide, or a polynucleotide further including additional coding and/or non-coding sequences.

(64) The present invention also relates to variants of the aforementioned polynucleotides, which encode polypeptides having the same amino acid sequence as those of the present invention, or fragments, analogues and derivatives of such polypeptides. This polynucleotide variant may be a naturally occurring allelic variant or a non-naturally occurring variant. These nucleotide variants include substitution variants, deletion variants and insertion variants. As known in the art, an allelic variant is an alternative form of a polynucleotide. It may be a substitution, deletion, or insertion of one or more nucleotides, but it will not substantially change the function of the polypeptide encoded.

(65) The present invention also relates to polynucleotides that hybridize to the above-mentioned sequences and have at least 50%, preferably at least 70%, more preferably at least 80%, 85%, 90%, or 95% identity between the two sequences. The present invention particularly relates to polynucleotides that can hybridize to the polynucleotides of the present invention under stringent conditions. In the present invention, “stringent conditions” means: (1) hybridization and elution at a lower ionic strength and higher temperature, such as 0.2×SSC, 0.1% SDS, 60° C.; or (2) hybridization in the presence of a denaturing agent, such as 50% (v/v) formamide, 0.1% calf serum/0.1% Ficoll, 42° C., etc.; or (3) hybridization only when the identity between the two sequences is at least 90%, more preferably at least 95%. Furthermore, the polypeptides encoded by the hybridizable polynucleotides have the same biological function and activity as the mature polypeptides as shown in SEQ ID NO.: 4, 6, 8, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 90, 92, 94, 96, 98, 100, 116, 118, 120, 122, or 124.
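The identity thresholds named above (for example at least 90% or 95%) can be illustrated with a simple position-by-position comparison. This is a minimal sketch under a strong assumption: the two sequences are already aligned, ungapped, and of equal length. Real comparisons would normally start from a sequence alignment, which is outside the scope of this illustration, and the function names are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold=90.0):
    """True if the identity reaches the threshold (90% is the value in condition (3))."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    a = "ACGTACGTAC"  # hypothetical toy sequences, one mismatch in ten positions
    b = "ACGTACGTAA"
    print(percent_identity(a, b))
    print(meets_identity_threshold(a, b, 90.0))
    print(meets_identity_threshold(a, b, 95.0))
```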

(66) The present invention also relates to a nucleic acid fragment hybridized to the aforementioned sequences. As used herein, “nucleic acid fragment” contains at least 15 nucleotides in length, preferably at least 30 nucleotides, more preferably at least 50 nucleotides, and most preferably at least 100 nucleotides or more. Nucleic acid fragments can be used in nucleic acid amplification techniques (such as PCR) to determine and/or isolate polynucleotides encoding GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-32, GT29-33, GT29-34, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15 protein.

(67) The polypeptide and polynucleotide in the present invention are preferably provided in an isolated form, and are more preferably purified to homogeneity.

(68) A full-length nucleotide sequence or fragment thereof of GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15 of the present invention can usually be obtained by a PCR amplification method, a recombination method or an artificial synthesis method. For the PCR amplification method, primers can be designed according to the relevant nucleotide sequence disclosed in the present invention, especially the open reading frame sequence, and a commercially available cDNA library or a cDNA library prepared according to conventional methods known to those skilled in the art is used as a template to amplify and obtain the relevant sequences.

(69) When the sequence is long, it is often necessary to perform two or more PCR amplifications, and then splice the amplified fragments together in the correct order.
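The splicing of separately amplified fragments in the correct order can be sketched in silico as follows. This is only an illustration under stated assumptions: the fragments are supplied 5' to 3' in order, adjacent fragments share an exact overlap, and the function names and the default minimum overlap of 15 nucleotides are hypothetical choices, not parameters taken from the present invention.

```python
def splice_pair(left: str, right: str, min_overlap: int = 15) -> str:
    """Join two fragments on the longest exact suffix/prefix overlap."""
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first and shrink until one matches.
    for size in range(max_len, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    raise ValueError("no overlap of the required length between fragments")


def splice_fragments(fragments: list, min_overlap: int = 15) -> str:
    """Assemble ordered fragments by joining each one to the growing sequence."""
    assembled = fragments[0]
    for fragment in fragments[1:]:
        assembled = splice_pair(assembled, fragment, min_overlap)
    return assembled
```

A sequence cut into three pieces that overlap by six nucleotides is recovered by calling `splice_fragments(pieces, min_overlap=6)`; fragments without a sufficient overlap raise an error instead of being joined silently.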

(70) Once the relevant sequence is obtained, the relevant sequence can be obtained in large quantities by the recombination method. This is usually done by cloning it into a vector, then transferring it into cells, and then isolating and obtaining the relevant sequence from the proliferated host cells by conventional methods.

(71) In addition, artificial synthetic methods can be used to synthesize the relevant sequences, especially when the length of the fragments is short. Generally, a long sequence can be obtained by synthesizing multiple small fragments and then connecting them.

(72) At present, the DNA sequence encoding the protein (or fragment or derivative thereof) of the present invention can be obtained entirely by chemical synthesis. This DNA sequence can then be introduced into various existing DNA molecules (such as vectors) and cells known in the art. In addition, mutations can also be introduced into the protein sequence of the present invention by chemical synthesis.

(73) The method of amplifying DNA/RNA using PCR technology is preferably used to obtain the gene of the present invention. Especially when it is difficult to obtain full-length cDNA from the library, the RACE method (rapid amplification of cDNA ends) can preferably be used. The primers used for PCR can be appropriately selected based on the sequence information of the present invention disclosed herein, and can be synthesized by conventional methods. The amplified DNA/RNA fragments can be separated and purified by conventional methods such as gel electrophoresis.

(74) The present invention also relates to a vector comprising the polynucleotide of the present invention, a host cell produced by genetic engineering using the vector of the present invention or the protein coding sequence of GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15, and a method of producing the polypeptide of the present invention by recombinant technology.

(75) Through conventional recombinant DNA technology, the polynucleotide sequence of the present invention can be used to express or produce recombinant GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15 polypeptide.

(76) Generally speaking, there are the following steps:

(77) (1). transforming or transducing a suitable host cell with a polynucleotide (or a variant) encoding a polypeptide of GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 or PNUGT29-15 of the present invention, or with a recombinant expression vector containing the polynucleotide;

(78) (2). culturing the host cell in a suitable medium;

(79) (3). isolating and purifying the protein from the culture medium or the cells.

(80) In the present invention, polynucleotide sequences of GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-9, PNUGT29-14 or PNUGT29-15 can be inserted into recombinant expression vectors. The term “recombinant expression vector” refers to a bacterial plasmid, bacteriophage, yeast plasmid, plant cell virus, mammalian cell virus such as adenovirus or retrovirus, or other vectors well known in the art. Any plasmid or vector can be used as long as it can replicate and remain stable in the host. An important feature of expression vectors is that they usually contain an origin of replication, a promoter, a marker gene and a translation control element.

(81) Methods well known to those skilled in the art can be used to construct expression vectors containing DNA sequences encoding GT29-32, GT29-33, GT29-34, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-24, GT29-25, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45, GT29-46, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-9, PNUGT29-14 or PNUGT29-15 and appropriate transcription/translation control signals. These methods include in vitro recombinant DNA technology, DNA synthesis technology, in vivo recombinant technology and the like. The DNA sequence can be operably linked to an appropriate promoter in an expression vector to direct mRNA synthesis. Representative examples of these promoters are: lac or trp promoters of E. coli; λ phage PL promoters; eukaryotic promoters including CMV immediate early promoters, HSV thymidine kinase promoters, early and late SV40 promoters, retroviral LTRs and other known promoters that control gene expression in prokaryotic or eukaryotic cells or their viruses. The expression vector also includes a ribosome binding site for translation initiation and a transcription terminator.

(82) In addition, the expression vector preferably contains one or more selectable marker genes to provide phenotypic traits for selection of transformed host cells, such as dihydrofolate reductase, neomycin resistance, and green fluorescent protein (GFP) for eukaryotic cell culture, or tetracycline or ampicillin resistance for E. coli.

(83) Vectors containing the appropriate DNA sequences and appropriate promoters or control sequences as described above can be used to transform appropriate host cells so that they can express proteins.

(84) The host cell may be a prokaryotic cell, such as a bacterial cell; or a lower eukaryotic cell, such as a yeast cell; or a higher eukaryotic cell, such as a mammalian cell. Representative examples are: E. coli, Streptomyces; bacterial cells of Salmonella typhimurium; fungal cells such as yeast; plant cells; insect cells of Drosophila S2 or Sf9; animal cells such as CHO, COS, 293 cells, or Bowes melanoma cells and the like.

(85) When the polynucleotide of the present invention is expressed in higher eukaryotic cells, transcription is enhanced if an enhancer sequence is inserted into the vector. Enhancers are cis-acting elements of DNA, usually about 10 to 300 base pairs in length, which act on the promoter to enhance gene transcription. Examples include the SV40 enhancer of 100 to 270 base pairs on the late side of the replication origin, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.

(86) Those of ordinary skill in the art know how to select appropriate vectors, promoters, enhancers and host cells.

(87) Transformation of host cells with recombinant DNA can be performed using conventional techniques well known to those skilled in the art. When the host is a prokaryotic organism such as E. coli, competent cells that can absorb DNA can be harvested after the exponential growth phase and treated with the CaCl₂ method. The procedures used are well known in the art. Another method is to use MgCl₂. If necessary, transformation can also be carried out by electroporation. When the host is a eukaryote, the following DNA transfection methods can be used: calcium phosphate co-precipitation, conventional mechanical methods such as microinjection, electroporation, liposome packaging, etc.

(88) The obtained transformant can be cultured by a conventional method and express the polypeptide encoded by the gene of the present invention. Depending on the host cell used, the medium used in the culture can be selected from various conventional mediums. The cultivation is carried out under conditions suitable for the growth of host cells. When the host cell grows to an appropriate cell density, the selected promoter is induced by an appropriate method (such as temperature conversion or chemical induction), and the cell is cultured for a period of time.

(89) The recombinant polypeptide in the above method may be expressed in a cell, on a cell membrane, or secreted out of the cell. If necessary, the recombinant protein can be isolated and purified by various separation methods using its physical, chemical and other characteristics. These methods are well known to those skilled in the art. Examples of these methods include, but are not limited to: conventional renaturation treatment, treatment with a protein precipitation agent (salting-out method), centrifugation, osmotic disruption of bacteria, ultrasonication, ultracentrifugation, molecular sieve chromatography (gel filtration), adsorption chromatography, ion exchange chromatography, high performance liquid chromatography (HPLC), various other liquid chromatography techniques, and combinations of these methods.

(90) Use

(91) The active polypeptide or glycosyltransferase of the present invention can be used to artificially synthesize known ginsenosides, new ginsenosides and derivatives thereof, and can convert CK, DMG, F2, Rd, F1, Rh1 and Rg1 into ginsenoside Rg3, ginsenoside Rd, ginsenoside Rb1, ginsenoside Rb3, saponin DMGG, saponin DMGX, gypenoside LXXV, gypenoside XVII, gypenoside XIII, gypenoside IX, notoginsenoside U, notoginsenoside R1, notoginsenoside R2, notoginsenoside R3, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-PPD, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-CK, 20-O-Glucosylginsenoside Rf and Ginsenoside F3.

(92) The Main Advantages of the Invention:

(93) (1) The glycosyltransferase of the present invention can specifically and efficiently transfer a glycosyl to, or replace a glycosyl on, the first glycosyl at the C-20 position and/or the first glycosyl at the C-6 or C-3 position of a tetracyclic triterpene compound substrate to extend the carbohydrate chain;

(94) (2) The glycosyltransferase of the present invention is particularly capable of converting CK, DMG, F2, Rd, F1, Rh1 and Rg1 into active ginsenoside Rg3, ginsenoside Rd, ginsenoside Rb1, ginsenoside Rb3, saponin DMGG, saponin DMGX, gypenoside LXXV, gypenoside XVII, gypenoside XIII, gypenoside IX, notoginsenoside U, notoginsenoside R1, notoginsenoside R2, notoginsenoside R3, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-PPD, 3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl)-CK, 20-O-Glucosylginsenoside Rf and Ginsenoside F3.

(95) (3) Ginsenoside Rb1 protects nerve cells and has anti-inflammatory and antioxidant effects; ginsenoside Rb3 alleviates myocardial ischemia and has antidepressant effects. Notoginsenoside R1 is the main active ingredient of notoginseng saponins and has anti-inflammatory effects. Notoginsenoside R2 has a neuroprotective effect.

Example 1 Isolation of Ginseng Glycosyltransferase and the Coding Gene Thereof

(96) Ginseng RNA was extracted and reverse transcribed to obtain ginseng cDNA. With this cDNA as a template, PCR amplification was performed using primer pair 1 (SEQ ID NO.: 1 and SEQ ID NO.: 2), primer pair 2 (SEQ ID NO.: 9 and SEQ ID NO.: 10) or primer pair 3 (SEQ ID NO.: 113 and SEQ ID NO.: 114) to obtain a 1.4-1.5 kb amplification product. The high-fidelity KOD DNA polymerase from Bao Bioengineering Co., Ltd. was used as the DNA polymerase. PCR products were detected by agarose gel electrophoresis.

(97) The target DNA band was excised under UV irradiation. The amplified DNA fragment was then recovered from the agarose gel using the Axygen Gel Extraction Kit (AXYGEN). After an A overhang was added to the ends of this DNA fragment using rTaq DNA polymerase from Bao Bioengineering Co., Ltd., it was ligated into the commercially available cloning vector pMD18-T Vector, and the ligation product was transformed into commercially available E. coli EPI300 competent cells. The transformed E. coli culture was spread on LB plates supplemented with 50 μg/mL ampicillin, 0.5 mM IPTG and 25 μg/mL X-Gal, and the recombinant clones were further verified by PCR and enzyme digestion. Several clones were selected, and the recombinant plasmids were extracted and sequenced to obtain 29 different nucleic acid sequences, named GT29-32 (SEQ ID NO.: 3), GT29-33 (SEQ ID NO.: 5), GT29-34 (SEQ ID NO.: 7), GT29-4 (SEQ ID NO.: 11), GT29-5 (SEQ ID NO.: 13), GT29-7 (SEQ ID NO.: 15), GT29-9 (SEQ ID NO.: 17), GT29-11 (SEQ ID NO.: 19), GT29-13 (SEQ ID NO.: 21), GT29-17 (SEQ ID NO.: 23), GT29-18 (SEQ ID NO.: 25), GT29-19 (SEQ ID NO.: 116), GT29-20 (SEQ ID NO.: 118), GT29-21 (SEQ ID NO.: 120), GT29-22 (SEQ ID NO.: 122), GT29-23 (SEQ ID NO.: 124), GT29-24 (SEQ ID NO.: 27), GT29-25 (SEQ ID NO.: 29), GT29-36 (SEQ ID NO.: 89), GT29-37 (SEQ ID NO.: 91), GT29-42 (SEQ ID NO.: 93), GT29-43 (SEQ ID NO.: 95), GT29-45 (SEQ ID NO.: 97) and GT29-46 (SEQ ID NO.: 99), respectively. BESTORF software was used to find the ORFs. Sequence alignment showed that the amplification products all have the conserved functional domain of glycosyltransferase family 1, indicating that they are glycosyltransferase genes.
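The ORF search step can be illustrated with a simplified stand-in. The sketch below is not the BESTORF algorithm. It assumes that an ORF runs from the first ATG in a reading frame to the next in-frame stop codon and that only the three forward-strand frames are scanned; the function name is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> str:
    """Return the longest ATG-to-stop ORF on the forward strand (stop included).

    Returns an empty string when no complete ORF is present.
    """
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None  # position of the currently open ATG, if any
        for pos in range(frame, len(dna) - 2, 3):
            codon = dna[pos:pos + 3]
            if start is None and codon == "ATG":
                start = pos
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:pos + 3]
                if len(orf) > len(best):
                    best = orf
                start = None  # close this ORF and look for the next ATG
    return best
```

Applied to a sequenced insert, the returned ORF could then be translated and compared with known glycosyltransferase family 1 sequences, as described above.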

(98) GT29-32: The glycosyltransferase gene GT29-32 encodes a protein GT29-32 containing 442 amino acids and has the amino acid sequence as shown in SEQ ID NO: 4 in the sequence listing. The theoretical molecular weight of this protein is predicted to be 49.2 kDa by software, and the isoelectric point pI is 6.09. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 (Genbank accession AKA44579.1) is 92%.

(99) GT29-33: The glycosyltransferase gene GT29-33 encodes a protein GT29-33 containing 448 amino acids with the amino acid sequence as shown in SEQ ID NO: 6 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 50.0 kDa by software, and the isoelectric point pI is 6.77. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 90%.

(100) GT29-34: The glycosyltransferase gene GT29-34 encodes a protein GT29-34 containing 446 amino acids and has the amino acid sequence as shown in SEQ ID NO: 8 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.7 kDa by software, and the isoelectric point pI is 6.23. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 90%.

(101) GT29-4: The glycosyltransferase gene GT29-4 encodes a protein GT29-4 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 12 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.8 kDa by software, and the isoelectric point pI is 5.63. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 92%.

(102) GT29-5: The glycosyltransferase gene GT29-5 encodes a protein GT29-5 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 14 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.7 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 93%.

(103) GT29-7: The glycosyltransferase gene GT29-7 encodes a protein GT29-7 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 16 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.8 kDa by software, and the isoelectric point pI is 5.8. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 92%.

(104) GT29-9: The glycosyltransferase gene GT29-9 encodes a protein GT29-9 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 18 in the sequence listing. The theoretical molecular weight of this protein is predicted to be 49.8 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 92%.

(105) GT29-11: The glycosyltransferase gene GT29-11 encodes a protein GT29-11 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 20 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.9 kDa by software, and the isoelectric point pI is 5.90. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 91%.

(106) GT29-13: The glycosyltransferase gene GT29-13 encodes a protein GT29-13 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 22 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.9 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 91%.

(107) GT29-17: The glycosyltransferase gene GT29-17 encodes a protein GT29-17 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 24 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.3 kDa by software, and the isoelectric point pI is 5.35. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 93%.

(108) GT29-18: The glycosyltransferase gene GT29-18 encodes a protein GT29-18 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 26 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.9 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 91%.

(109) GT29-24: The glycosyltransferase gene GT29-24 encodes a protein GT29-24 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 28 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.9 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 91%.

(110) GT29-25: The glycosyltransferase gene GT29-25 encodes a protein GT29-25 containing 446 amino acids with the amino acid sequence as shown in SEQ ID NO: 30 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.9 kDa by software, and the isoelectric point pI is 5.93. The amino acid sequence identity between the glycosyltransferase and the functionally identified glycosyltransferase UGTPg29 is 91%.

(111) GT29-19: The glycosyltransferase gene GT29-19 encodes a protein GT29-19 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 116 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.47.

(112) GT29-20: The glycosyltransferase gene GT29-20 encodes a protein GT29-20 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 118 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.93.

(113) GT29-21: The glycosyltransferase gene GT29-21 encodes a protein GT29-21 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 120 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.80.

(114) GT29-22: The glycosyltransferase gene GT29-22 encodes a protein GT29-22 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 122 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.93.

(115) GT29-23: The glycosyltransferase gene GT29-23 encodes a protein GT29-23 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 124 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.0 kDa by software, and the isoelectric point pI is 5.61.

(116) GT29-36: The glycosyltransferase gene GT29-36 encodes a protein GT29-36 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO:102 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.93.

(117) GT29-37: The glycosyltransferase gene GT29-37 encodes a protein GT29-37 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 104 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.62.

(118) GT29-42: The glycosyltransferase gene GT29-42 encodes a GT29-42 protein containing 444 amino acids with the amino acid sequence as shown in SEQ ID NO: 106 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.4 kDa by software, and the isoelectric point pI is 6.16.

(119) GT29-43: The glycosyltransferase gene GT29-43 encodes a protein GT29-43 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 108 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.78.

(120) GT29-45: The glycosyltransferase gene GT29-45 encodes a protein GT29-45 containing 448 amino acids with the amino acid sequence as shown in SEQ ID NO: 110 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 50.0 kDa by software, and the isoelectric point pI is 7.25.

(121) GT29-46: The glycosyltransferase gene GT29-46 encodes a protein GT29-46 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 112 in the sequence listing. The theoretical molecular weight of this protein is predicted to be 49.1 kDa by software, and the isoelectric point pI is 5.48.
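The theoretical molecular weights listed above are software predictions. A minimal sketch of how such a value can be computed from an amino acid sequence is given below, assuming standard average residue masses for the 20 common amino acids plus one water molecule for the free termini. The function names are hypothetical, and the software actually used in the present invention is not specified. The second helper gives the coding-region length implied by a residue count.

```python
# Average residue masses in Da (amino acid minus one water); standard values.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N-terminal H and C-terminal OH


def theoretical_mass_da(protein: str) -> float:
    """Sum of average residue masses plus one water, in Da."""
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


def orf_length_bp(n_residues: int, include_stop: bool = True) -> int:
    """Length of the coding region for a protein of n_residues amino acids."""
    return 3 * n_residues + (3 if include_stop else 0)
```

For proteins of 442 to 448 amino acids, `orf_length_bp` gives coding regions of 1329 to 1347 bp including the stop codon, slightly shorter than the 1.4-1.5 kb amplification products reported in Example 1.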

Example 2 Expression of Glycosyltransferase Genes GT29-32, GT29-33 and GT29-34 in E. coli

(122) Using the plasmids GT29-32-pMD18T, GT29-33-pMD18T and GT29-34-pMD18T constructed in Example 1 containing GT29-32, GT29-33 and GT29-34 genes as templates, the target genes GT29-32, GT29-33 and GT29-34 were amplified with the primers as shown in Table 1.

(123) After the expression vector pET28a (purchased from Merck) was digested with NcoI/SalI, GT29-32, GT29-33 and GT29-34 were cloned into pET28a (one-step cloning kit, purchased from Novizan) to construct E. coli expression vectors GT29-32-pET28a, GT29-33-pET28a and GT29-34-pET28a. Using the 6×His tag sequence on pET28a, the recombinant proteins GT29-32, GT29-33 and GT29-34 carried a 6×His tag at the C-terminus. The plasmids were transformed into commercially available E. coli BL21 to construct recombinant strains BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34. A recombinant strain was inoculated into LB medium and cultured at 37° C., 200 rpm to an OD600 of about 0.6-0.8. The culture was then cooled to 4° C., IPTG was added to a final concentration of 100 μM, and expression was induced at 18° C., 120 rpm for 16 h. The bacteria were collected by centrifugation at 4° C., and the cells were disrupted by ultrasound. The supernatant of the cell lysate was collected by centrifugation at 12000 g at 4° C. for 10 min. Samples were taken for SDS-PAGE and Western blot. SDS-PAGE showed that the cell lysates of the recombinant transformants of GT29-32-pET28a, GT29-33-pET28a and GT29-34-pET28a were not significantly different from the cell lysate of the empty vector pET28a recombinant transformant, and soluble expression was not obvious (FIG. 1A). Anti-6×His tag Western blot (FIG. 1B) showed a clear band between 45 and 55 kDa, indicating that the glycosyltransferases GT29-32, GT29-33 and GT29-34 were expressed in soluble form at low levels in E. coli.
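The induction step specifies only the final IPTG concentration (100 μM). As a worked illustration of the dilution arithmetic (C1·V1 = C2·V2), the sketch below computes the volume of stock solution to add. The stock concentration and culture volume in the example are hypothetical values, not taken from the present invention, and the small volume added is neglected.

```python
def stock_volume_ul(final_um: float, culture_ml: float, stock_mm: float) -> float:
    """Volume of stock (in microliters) needed to reach a final concentration.

    Uses C1*V1 = C2*V2 and neglects the volume of stock added to the culture.
    final_um: desired final concentration in micromolar
    culture_ml: culture volume in milliliters
    stock_mm: stock concentration in millimolar
    """
    if stock_mm <= 0:
        raise ValueError("stock concentration must be positive")
    final_mm = final_um / 1000.0          # micromolar -> millimolar
    volume_ml = final_mm * culture_ml / stock_mm
    return volume_ml * 1000.0             # milliliters -> microliters
```

Under these assumptions, a 50 mL culture induced at 100 μM from a 100 mM stock requires `stock_volume_ul(100, 50, 100)`, that is, about 50 μL of stock.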

(124) TABLE 1 Primers used to amplify genes

Gene       Forward primer   SEQ ID NO.   Reverse primer   SEQ ID NO.
UGT29-4    UGT29-4-F        31           UGT29-4-R        34
UGT29-5    UGT29-5-F        33           UGT29-5-R        32
UGT29-7    UGT29-7-F        35           UGT29-7-R        32
UGT29-9    UGT29-9-F        33           UGT29-9-R        32
UGT29-11   UGT29-11-F       33           UGT29-11-R       32
UGT29-13   UGT29-13-F       33           UGT29-13-R       32
UGT29-17   UGT29-17-F       31           UGT29-17-R       32
UGT29-18   UGT29-18-F       33           UGT29-18-R       34
UGT29-24   UGT29-24-F       33           UGT29-24-R       34
UGT29-25   UGT29-25-F       33           UGT29-25-R       32
UGT29-32   UGT29-32-F       31           UGT29-32-R       32
UGT29-33   UGT29-33-F       36           UGT29-33-R       37
UGT29-34   UGT29-34-F       36           UGT29-34-R       34
UGT29-19   UGT29-19-F       125          UGT29-19-R       126
UGT29-20   UGT29-20-F       127          UGT29-20-R       128
UGT29-21   UGT29-21-F       129          UGT29-21-R       130
UGT29-22   UGT29-22-F       131          UGT29-22-R       132
UGT29-23   UGT29-23-F       133          UGT29-23-R       134
UGT29-36   UGT29-36-F       101          UGT29-36-R       102
UGT29-37   UGT29-37-F       103          UGT29-37-R       104
UGT29-42   UGT29-42-F       105          UGT29-42-R       106
UGT29-43   UGT29-43-F       107          UGT29-43-R       108
UGT29-45   UGT29-36-F       109          UGT29-36-R       110
UGT29-46   UGT29-36-F       111          UGT29-36-R       112

Example 3 In Vitro Transglycosylation Activity and Product Identification of GT29-32, GT29-33 and GT29-34

(125) The cell lysate supernatants of recombinant E. coli BL21-GT29-32, BL21-GT29-33 and BL21-GT29-34 from Example 2 were used as crude enzyme solutions to perform the transglycosylation reaction, and the cell lysate of the recombinant E. coli with empty vector pET28a was used as a control.

(126) As shown in FIG. 2, using protopanaxadiol ginsenoside CK as a glycosyl acceptor and UDP-glucose as a glycosyl donor, GT29-32 and GT29-34 can catalyze the formation of a new product.

(127) As shown in FIG. 3: using ginsenoside Rd as a glycosyl acceptor and UDP-glucose as a glycosyl donor, GT29-32, GT29-33 and GT29-34 can catalyze the formation of Rb1. The HPLC results are consistent with the TLC results.

(128) Therefore, GT29-32 and GT29-34 can catalyze the extension of the C20-O-Glc of CK by one molecule of glucose to generate gypenoside LXXV. When UDP-xylose is used as the glycosyl donor, GT29-32 can convert Rd into three products. One of the products has the same mobility on TLC as Rb3; that is, GT29-32 can extend the C20-O-Glc by one molecule of xylose to produce Rb3 (FIG. 2). The HPLC results are consistent with the TLC results: GT29-32 catalyzes the production of three products from Rd and UDP-xylose (FIG. 4).

(129) Using protopanaxatriol ginsenoside F1 as a glycosyl acceptor and UDP-glucose as a glycosyl donor, GT29-32 can catalyze the formation of a new product. It is speculated that GT29-32 likewise extends the C20-O-Glc of F1 by one molecule of glucose and that the product is notoginsenoside R3 (FIG. 5 and FIG. 6).

(130) Using protopanaxadiol ginsenoside CK as a glycosyl acceptor and UDP-arabinose as a glycosyl donor, GT29-32, GT29-33 and GT29-34 can catalyze the extension of the first glycosyl at C-20 of CK by an arabinosyl to generate ginsenoside F3, with GT29-32 showing the strongest activity (FIG. 17).

Example 4 Expression of Glycosyltransferase Genes GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 in E. coli

(131) Plasmids GT29-4-pMD18T, GT29-5-pMD18T, GT29-7-pMD18T, GT29-9-pMD18T, GT29-11-pMD18T, GT29-13-pMD18T, GT29-17-pMD18T, GT29-18-pMD18T, GT29-24-pMD18T and GT29-25-pMD18T constructed in Example 1, containing the GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 genes, were used as templates to amplify the target genes GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 with the primers shown in Table 1. After the expression vector pET28a (purchased from Merck) was digested with NcoI/SalI, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 were cloned into pET28a (one-step cloning kit, purchased from Novizan), and E. coli expression vectors GT29-4-pET28a, GT29-5-pET28a, GT29-7-pET28a, GT29-9-pET28a, GT29-11-pET28a, GT29-13-pET28a, GT29-17-pET28a, GT29-18-pET28a, GT29-24-pET28a and GT29-25-pET28a were constructed.

(132) Using the 6×His tag sequence on pET28a, the recombinant proteins GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 carried a 6×His tag at the C-terminus. The plasmids were transformed into commercially available E. coli BL21 to construct recombinant strains BL21-GT29-4, BL21-GT29-5, BL21-GT29-7, BL21-GT29-9, BL21-GT29-11, BL21-GT29-13, BL21-GT29-17, BL21-GT29-18, BL21-GT29-24 and BL21-GT29-25. A recombinant strain was inoculated into LB medium and cultured at 37° C., 200 rpm to an OD600 of about 0.6-0.8. The culture was then cooled to 4° C., IPTG was added to a final concentration of 100 μM, and expression was induced at 18° C., 120 rpm for 16 h. The bacteria were collected by centrifugation at 4° C., and the cells were disrupted by ultrasound. The supernatant of the cell lysate was collected by centrifugation at 12000 g at 4° C. for 10 min. Samples were taken for SDS-PAGE and Western blot.

(133) SDS-PAGE showed that the cell lysates of the recombinant transformants of GT29-4-pET28a, GT29-5-pET28a, GT29-7-pET28a, GT29-9-pET28a, GT29-11-pET28a, GT29-13-pET28a, GT29-17-pET28a, GT29-18-pET28a, GT29-24-pET28a and GT29-25-pET28a were not significantly different from the cell lysate of the empty vector pET28a recombinant transformant, and soluble expression was not obvious. Anti-6×His tag Western blot showed a clear band between 45 and 55 kDa, indicating that glycosyltransferases GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 had a small amount of soluble expression in E. coli.

Example 5 In Vitro Transglycosylation Activity and Products Identification of GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25

(134) The cell lysate supernatants of recombinant E. coli BL21-GT29-4, BL21-GT29-5, BL21-GT29-7, BL21-GT29-9, BL21-GT29-11, BL21-GT29-13, BL21-GT29-17, BL21-GT29-18, BL21-GT29-24 and BL21-GT29-25 in Example 2 was used as a crude enzyme solution for transglycosylation reaction, and cell lysate of recombinant E. coli with empty vector pET28a was used as a control.

(135) As shown in FIG. 7, using Protopanaxatriol Ginsenoside Rg1 as a glycosyl acceptor and UDP-xylose as a glycosyl donor, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24 and GT29-25 can catalyze the formation of Notoginsenoside R1. The HPLC results are consistent with the TLC results (FIG. 8 and FIG. 9). Therefore, GT29-4, GT29-5, GT29-7, GT29-9, GT29-11, GT29-13, GT29-17, GT29-18, GT29-24, and GT29-25 are capable of catalyzing the extension of the C6-O-Glc of Rg1 by one molecule of xylose to produce notoginsenoside R1.

(136) As shown in FIG. 10, GT29-24 and GT29-25 can use Protopanaxadiol Ginsenoside Rh2 as a glycosyl acceptor and UDP-glucose as a glycosyl donor to catalyze the production of ginsenoside Rg3 by extending a glucosyl at the C-3 glycosyl of Rh2. When the substrate is changed to F2, GT29-24 and GT29-25 can further catalyze the extension of a glucosyl at the C-3 glycosyl of F2 to produce ginsenoside Rd.

Example 6 Isolation of Panax notoginseng Glycosyltransferase and the Coding Gene Thereof

(137) RNA in Panax notoginseng was extracted and reverse transcription was performed to obtain cDNA of Panax notoginseng. Using this cDNA as a template, primer pair 1 (SEQ ID NO.: 82 and SEQ ID NO.: 83), primer pair 2 (SEQ ID NO.: 84 and SEQ ID NO.: 85), primer pair 3 (SEQ ID NO.: 84 and SEQ ID NO.: 86), primer pair 4 (SEQ ID NO.: 87 and SEQ ID NO.: 88) were used for PCR amplification to obtain a 1.4-1.5 kb amplification product. The high-fidelity KOD DNA polymerase from Bao Bioengineering Co., Ltd. was used as the DNA polymerase. PCR products were detected by agarose gel electrophoresis.

(138) Following the method of Example 1, several clones were selected, and the recombinant plasmids were extracted and sequenced to obtain 14 different nucleic acid sequences, named PNUGT29-1 (SEQ ID NO.: 38), PNUGT29-2 (SEQ ID NO.: 40), PNUGT29-3 (SEQ ID NO.: 42), PNUGT29-4 (SEQ ID NO.: 44), PNUGT29-5 (SEQ ID NO.: 46), PNUGT29-6 (SEQ ID NO.: 48), PNUGT29-7 (SEQ ID NO.: 50), PNUGT29-8 (SEQ ID NO.: 52), PNUGT29-9 (SEQ ID NO.: 54), PNUGT29-14 (SEQ ID NO.: 56) and PNUGT29-15 (SEQ ID NO.: 58), respectively. BESTORF software was used to find the ORFs. Sequence alignment shows that the amplification products all have the conserved functional domain of glycosyltransferase family 1, indicating that they are glycosyltransferase genes.
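The ORF search above was performed with BESTORF, whose internals are not described here. Purely as an illustration of what locating an ORF involves, the following minimal sketch scans the three forward reading frames for the longest ATG-to-stop stretch and relates ORF length to protein length. It is not BESTORF's algorithm; the function names and the toy input sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_forward_orf(dna: str) -> str:
    """Return the longest ATG...stop ORF (stop codon included) found in
    the three forward reading frames, or an empty string if none exists."""
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first start codon in this stretch
            elif start is not None and codon in STOP_CODONS:
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None                   # look for the next ORF
    return best


def orf_length_nt(protein_length_aa: int) -> int:
    """Nucleotides in an ORF encoding the given number of residues,
    counting the stop codon."""
    return (protein_length_aa + 1) * 3


# Toy sequence (hypothetical, not from the sequence listing).
orf = longest_forward_orf("CCATGAAATGGTAAGG")
print(orf, len(orf) // 3 - 1, "residues")

# A 447-residue protein, as reported for most PNUGT29 proteins in this example.
print(orf_length_nt(447), "nt")
```

For a 447-residue protein the ORF is 1344 nt including the stop codon, which fits within the 1.4-1.5 kb amplification products reported above.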

(139) PNUGT29-1: The glycosyltransferase gene PNUGT29-1 encodes a protein PNUGT29-1 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 39 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.688 kDa by software, and the isoelectric point pI is 6.58.

(140) PNUGT29-2: The glycosyltransferase gene PNUGT29-2 encodes a protein PNUGT29-2 containing 442 amino acids with the amino acid sequence as shown in SEQ ID NO: 41 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.118 kDa by software, and the isoelectric point pI is 6.20.

(141) PNUGT29-3: The glycosyltransferase gene PNUGT29-3 encodes a protein PNUGT29-3 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 43 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.729 kDa by software, and the isoelectric point pI is 6.58.

(142) PNUGT29-4: The glycosyltransferase gene PNUGT29-4 encodes a protein PNUGT29-4 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 45 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.715 kDa by software, and the isoelectric point pI is 6.58.

(143) PNUGT29-5: The glycosyltransferase gene PNUGT29-5 encodes a protein PNUGT29-5 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 47 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.718 kDa by software, and the isoelectric point pI is 6.45.

(144) PNUGT29-6: The glycosyltransferase gene PNUGT29-6 encodes a protein PNUGT29-6 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 49 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.657 kDa by software, and the isoelectric point pI is 6.70.

(145) PNUGT29-7: The glycosyltransferase gene PNUGT29-7 encodes a protein PNUGT29-7 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 51 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.749 kDa by software, and the isoelectric point pI is 6.58.

(146) PNUGT29-8: The glycosyltransferase gene PNUGT29-8 encodes a protein PNUGT29-8 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 53 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.657 kDa by software, and the isoelectric point pI is 6.70.

(148) PNUGT29-9: The glycosyltransferase gene PNUGT29-9 encodes a protein PNUGT29-9 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 55 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.695 kDa by software, and the isoelectric point pI is 6.58.

(149) PNUGT29-14: The glycosyltransferase gene PNUGT29-14 encodes a protein PNUGT29-14 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 57 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.778 kDa by software, and the isoelectric point pI is 6.70.

(150) PNUGT29-15: The glycosyltransferase gene PNUGT29-15 encodes a protein PNUGT29-15 containing 447 amino acids with the amino acid sequence as shown in SEQ ID NO: 59 in the sequence listing. The theoretical molecular weight of the protein is predicted to be 49.755 kDa by software, and the isoelectric point pI is 6.63.
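The theoretical molecular weights listed above are software predictions; the tool is not named. As a rough illustration of how such a value can be estimated from an amino acid sequence, the sketch below sums standard average residue masses and adds one water molecule for the free termini. The function name is hypothetical, the short input sequences are toy examples rather than entries from the sequence listing, and the isoelectric point is not covered.

```python
# Average masses (Da) of amino acid residues, i.e. the free amino acid minus water.
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # Da, added once for the N- and C-terminal H and OH


def theoretical_mw_kda(sequence: str) -> float:
    """Estimate the average molecular weight (kDa) of an unmodified peptide chain."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence is empty")
    unknown = set(sequence) - set(AVERAGE_RESIDUE_MASS)
    if unknown:
        raise ValueError(f"unsupported residue codes: {sorted(unknown)}")
    total_da = sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    return total_da / 1000.0


# Toy peptide (hypothetical) with and without a C-terminal 6xHis tag.
untagged = theoretical_mw_kda("MGSTLK")
tagged = theoretical_mw_kda("MGSTLK" + "HHHHHH")
print(f"6xHis tag adds about {tagged - untagged:.2f} kDa")
```

On this estimate a 6×His tag contributes roughly 0.8 kDa, so a tagged protein of about 49.7 kDa would still be expected near the 50 kD region of a blot.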

Example 7 Expression of Panax notoginseng Glycosyltransferase Genes PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 and PNUGT29-15 in E. coli

(151) Plasmids PNUGT29-1-pMD18T, PNUGT29-2-pMD18T, PNUGT29-3-pMD18T, PNUGT29-4-pMD18T, PNUGT29-5-pMD18T, PNUGT29-6-pMD18T, PNUGT29-7-pMD18T, PNUGT29-8-pMD18T, PNUGT29-9-pMD18T, PNUGT29-14-pMD18T and PNUGT29-15-pMD18T containing the PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 and PNUGT29-15 genes constructed in Example 6 were used as templates, and the target genes PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 and PNUGT29-15 were amplified with the primers shown in Table 2. Referring to the method in Example 2, the recombinant strains BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14 and BL21-PNUGT29-15 were constructed and sampled for SDS-PAGE electrophoresis and western blot. Anti-6×His tag Western Blot (FIG. 10) shows a clear band between 45 and 65 kD, indicating that glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 and PNUGT29-15 have a small amount of soluble expression in E. coli.

(152) TABLE 2 Primers used to amplify genes

| Gene | Primer | SEQ ID NO. |
|---|---|---|
| PNUGT29-1 | PNUGT29-1-F | 60 |
| | PNUGT29-1-R | 61 |
| PNUGT29-2 | PNUGT29-2-F | 62 |
| | PNUGT29-2-R | 63 |
| PNUGT29-3 | PNUGT29-3-F | 64 |
| | PNUGT29-3-R | 65 |
| PNUGT29-4 | PNUGT29-4-F | 66 |
| | PNUGT29-4-R | 67 |
| PNUGT29-5 | PNUGT29-5-F | 68 |
| | PNUGT29-5-R | 69 |
| PNUGT29-6 | PNUGT29-6-F | 70 |
| | PNUGT29-6-R | 71 |
| PNUGT29-7 | PNUGT29-7-F | 72 |
| | PNUGT29-7-R | 73 |
| PNUGT29-8 | PNUGT29-8-F | 74 |
| | PNUGT29-8-R | 75 |
| PNUGT29-9 | PNUGT29-9-F | 76 |
| | PNUGT29-9-R | 77 |
| PNUGT29-14 | PNUGT29-14-F | 78 |
| | PNUGT29-14-R | 79 |
| PNUGT29-15 | PNUGT29-15-F | 80 |
| | PNUGT29-15-R | 81 |

Example 8 In Vitro Transglycosylation Activity and Product Identification of Panax notoginseng Glycosyltransferases PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14 and PNUGT29-15

(153) The cell lysate supernatants of recombinant E. coli BL21-PNUGT29-1, BL21-PNUGT29-2, BL21-PNUGT29-3, BL21-PNUGT29-4, BL21-PNUGT29-5, BL21-PNUGT29-6, BL21-PNUGT29-7, BL21-PNUGT29-8, BL21-PNUGT29-9, BL21-PNUGT29-14 and BL21-PNUGT29-15 in Example 7 were used as a crude enzyme solution for transglycosylation reaction. Cell lysate of recombinant E. coli with empty vector pET28a was used as a control.

(154) As shown in FIG. 11: using Protopanaxadiol Ginsenoside Rd as a glycosyl acceptor, UDP-glucose as a glycosyl donor, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, PNUGT29-15 can catalyze the extension of a glucosyl at the C-20 glycosyl of Rd to generate Rb1.

(155) As shown in FIG. 12: using Protopanaxadiol Ginsenoside CK as a glycosyl acceptor, UDP-glucose as a glycosyl donor, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, PNUGT29-15 can catalyze the extension of a glucosyl at the C-20 glycosyl to generate Gypenoside LXXV.

(156) As shown in FIG. 13: using Protopanaxadiol Ginsenoside Rh2 as a glycosyl acceptor, UDP-glucose as a glycosyl donor, PNUGT29-1, PNUGT29-2, PNUGT29-3, PNUGT29-4, PNUGT29-5, PNUGT29-6, PNUGT29-7, PNUGT29-8, PNUGT29-9, PNUGT29-14, PNUGT29-15 can catalyze the extension of a glucosyl at the C-3 glycosyl of Rh2 to generate Rg3.

Example 9 Expression of Glycosyltransferase Genes GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45 and GT29-46 in E. coli

(157) Plasmids GT29-19-pMD18T, GT29-20-pMD18T, GT29-21-pMD18T, GT29-22-pMD18T, GT29-23-pMD18T, GT29-36-pMD18T, GT29-37-pMD18T, GT29-42-pMD18T, GT29-43-pMD18T, GT29-45-pMD18T, and GT29-46-pMD18T containing GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45 and GT29-46 genes constructed in Example 1 were used as a template, and the target genes GT29-36, GT29-37, GT29-42, GT29-43, GT29-45 and GT29-46 were amplified with the primers as shown in Table 1.

(158) Referring to Example 2, recombinant strains BL21-GT29-19, BL21-GT29-20, BL21-GT29-21, BL21-GT29-22, BL21-GT29-23, BL21-GT29-36, BL21-GT29-37, BL21-GT29-42, BL21-GT29-43, BL21-GT29-45 and BL21-GT29-46 were constructed, and samples were taken for SDS-PAGE electrophoresis and western blot.

(159) Protopanaxadiol Ginsenoside Rh2 was used as a glycosyl acceptor, and UDP-glucose was used as a glycosyl donor, and the above-mentioned glycosyltransferases GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-36, GT29-37, GT29-42, GT29-43, GT29-45 and GT29-46 can all catalyze the extension of a glucosyl at the C-3 glycosyl of Rh2 to generate Rg3. FIG. 15 shows GT29-45 and GT29-46 can catalyze Rh2 to generate Rg3.

(160) Protopanaxadiol Ginsenoside Rd was used as a glycosyl acceptor and UDP-xylose was used as a glycosyl donor, and the above glycosyltransferases GT29-19, GT29-20, GT29-21, GT29-22, GT29-23, GT29-36, GT29-37, GT29-42, GT29-43 can all catalyze the replacement of the second glucose at C-3 position of Rd with xylose to produce a new triterpene saponin (3-O-β-(D-xylopyranosyl)-β-(D-glucopyranosyl), 20-O-β-(D-glucopyranosyl)-PPD), of which GT29-36, GT29-37, GT29-42 and GT29-43 are the most active (FIG. 14).

(161) As shown in FIG. 16, with Protopanaxadiol Ginsenoside CK as a glycosyl acceptor and UDP-glucose as a glycosyl donor, GT29-45 and GT29-46 can catalyze the extension of the C-20 glycosyl of CK by a glucosyl to produce Gypenoside LXXV, with GT29-45 showing strong activity.

Example 10 Further Verification of the Glycosyltransferase Activity

(162) Examples 3, 5, and 8 above were repeated, except that other glycosyl donors and substrates were used. The experimental results are shown in Tables 3 to 5:

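(163) TABLE 3 C-3

| SEQ ID NO.: | Name | UDP-xylose: Rd | UDP-G: F2 | UDP-G: Rh2 |
|---|---|---|---|---|
| 116 | GT29-19 | ++ | ++ | +++ |
| 118 | GT29-20 | ++ | ++ | +++ |
| 120 | GT29-21 | + | ++ | +++ |
| 122 | GT29-22 | + | ++ | ++ |
| 124 | GT29-23 | + | ++ | ++ |
| 90 | GT29-36 | +++ | ++ | ++ |
| 92 | GT29-37 | +++ | ++ | ++ |
| 94 | GT29-42 | +++ | ++ | ++ |
| 96 | GT29-43 | +++ | ++ | ++ |
| 98 | GT29-45 | NS | ++ | ++ |
| 100 | GT29-46 | NS | ++ | ++ |
| 39 | PNUGT29-1 | NS | ++ | +++ |
| 41 | PNUGT29-2 | NS | +++ | +++ |
| 43 | PNUGT29-3 | NS | +++ | +++ |
| 45 | PNUGT29-4 | NS | +++ | +++ |
| 47 | PNUGT29-5 | NS | ++ | ++ |
| 49 | PNUGT29-6 | NS | ++ | ++ |
| 51 | PNUGT29-7 | NS | ++ | ++ |
| 53 | PNUGT29-8 | NS | ++ | ++ |
| 55 | PNUGT29-9 | NS | +++ | +++ |
| 57 | PNUGT29-14 | NS | ++ | ++ |
| 59 | PNUGT29-15 | NS | ++ | ++ |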

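(164) TABLE 4 C-6

| SEQ ID NO.: | Name | UDP-xylose: Rg1 | UDP-xylose: Rh1 | UDP-G: Rg1 | UDP-G: Rh1 |
|---|---|---|---|---|---|
| 12 | GT29-4 | ++ | ++ | + | + |
| 14 | GT29-5 | ++ | ++ | + | + |
| 16 | GT29-7 | +++ | ++ | ++ | ++ |
| 18 | GT29-9 | ++ | ++ | + | + |
| 20 | GT29-11 | +++ | +++ | ++ | ++ |
| 22 | GT29-13 | ++ | ++ | ++ | ++ |
| 24 | GT29-17 | ++ | ++ | + | + |
| 26 | GT29-18 | +++ | +++ | ++ | ++ |
| 28 | GT29-24 | +++ | +++ | ++ | ++ |
| 30 | GT29-25 | +++ | +++ | + | + |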

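(165) TABLE 5 C-20 (glycosyl donors, in column order: UDP-xylose, UDP-G, UDP-arabinose)

| SEQ ID NO.: | Name | Rd | CK | F1 | Rd | CK |
|---|---|---|---|---|---|---|
| 4 | GT29-32 | +++ | +++ | ++ | + | ++ |
| 6 | GT29-33 | + | ++ | + | ++ | ++ |
| 8 | GT29-34 | + | ++ | + | ++ | ++ |
| 98 | GT29-45 | + | ++ | + | + | NS |
| 100 | GT29-46 | + | + | + | + | NS |
| 39 | PNUGT29-1 | + | +++ | + | + | NS |
| 41 | PNUGT29-2 | + | + | + | + | NS |
| 43 | PNUGT29-3 | + | +++ | ++ | + | NS |
| 45 | PNUGT29-4 | + | +++ | ++ | ++ | NS |
| 47 | PNUGT29-5 | + | +++ | + | ++ | NS |
| 49 | PNUGT29-6 | + | +++ | ++ | ++ | NS |
| 51 | PNUGT29-7 | + | +++ | + | + | NS |
| 53 | PNUGT29-8 | + | +++ | + | + | NS |
| 55 | PNUGT29-9 | + | +++ | + | + | NS |
| 57 | PNUGT29-14 | + | +++ | + | + | NS |
| 59 | PNUGT29-15 | + | +++ | + | + | NS |

* NS stands for not shown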

(166) It can be seen from Tables 3 to 5 that the glycosyltransferases of the present invention can utilize common glycosyl donors and substrates, and have glycosyl extension or glycosyl substitution activity on different sites of tetracyclic triterpenes.
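For readers who want to compare enzymes across donor and acceptor combinations, the semi-quantitative ratings can be held in a small lookup structure. The sketch below encodes the Table 4 (C-6) ratings as given above and lists the enzymes with the highest rating for a chosen donor and acceptor. Treating "+", "++" and "+++" as ordinal ranks 1 to 3 is an assumption made for illustration, since the scale is not defined numerically in the text; the names `TABLE_4`, `RANK` and `top_enzymes` are hypothetical.

```python
# Column order of Table 4 (C-6): (glycosyl donor, glycosyl acceptor).
COLUMNS = (("UDP-xylose", "Rg1"), ("UDP-xylose", "Rh1"),
           ("UDP-G", "Rg1"), ("UDP-G", "Rh1"))

# Ratings transcribed from Table 4.
TABLE_4 = {
    "GT29-4":  ("++", "++", "+", "+"),
    "GT29-5":  ("++", "++", "+", "+"),
    "GT29-7":  ("+++", "++", "++", "++"),
    "GT29-9":  ("++", "++", "+", "+"),
    "GT29-11": ("+++", "+++", "++", "++"),
    "GT29-13": ("++", "++", "++", "++"),
    "GT29-17": ("++", "++", "+", "+"),
    "GT29-18": ("+++", "+++", "++", "++"),
    "GT29-24": ("+++", "+++", "++", "++"),
    "GT29-25": ("+++", "+++", "+", "+"),
}

# Assumption: ratings are treated as ordinal ranks only.
RANK = {"+": 1, "++": 2, "+++": 3}


def top_enzymes(donor: str, acceptor: str) -> list[str]:
    """Return the enzymes sharing the highest rating for a donor/acceptor pair,
    in table order."""
    column = COLUMNS.index((donor, acceptor))
    ranks = {name: RANK[row[column]] for name, row in TABLE_4.items()}
    best = max(ranks.values())
    return [name for name, rank in ranks.items() if rank == best]


print(top_enzymes("UDP-xylose", "Rg1"))
print(top_enzymes("UDP-G", "Rh1"))
```

Because the ratings are ordinal, the comparison only says which enzymes share the top rating in a column; it does not quantify how much more active they are.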

(167) All documents mentioned in the present application are incorporated herein by reference, as though each were individually incorporated by reference. Additionally, it should be understood that, after reading the above teaching, those skilled in the art may make many variations and modifications, and these equivalents also fall within the scope defined by the appended claims.