Transferase enzymes
12247209 · 2025-03-11
Assignee
Inventors
- Anne OSBOURN (Norwich, GB)
- James REED (Norwich, GB)
- Anastasia ORME (Norwich, GB)
- Thomas Louveau (Norwich, GB)
Cpc classification
C12P5/007
CHEMISTRY; METALLURGY
C12N9/00
CHEMISTRY; METALLURGY
C12Y106/02004
CHEMISTRY; METALLURGY
A61K39/39
HUMAN NECESSITIES
C12N15/8243
CHEMISTRY; METALLURGY
Y02A50/30
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
International classification
Abstract
The present invention relates generally to genes and polypeptides which have utility in glycosylating quillaic acid in host cells, including enzymes capable of successive glycosylation at the C-3 position of quillaic acid. The invention further relates to systems, methods and products employing the same.
Claims
1. A method of converting a host from a phenotype whereby the host is unable to perform the biosynthesis of the 3-O branched trisaccharide quillaic acid (QA) derivative (QA-3-O-TriS), which QA-3-O-TriS is 3-{[-
2. The method of claim 1, wherein the heterologous nucleic acid encodes all three types of polypeptide.
3. The method of claim 1, wherein the nucleotide sequences are from Q. saponaria.
4. The method of claim 3, wherein: the QA-GlcAT of (i) comprises the amino acid sequence of SEQ ID: No 2 or 26; the QA-GalT of (ii) comprises the amino acid sequence of SEQ ID: No 4; the QA-RhaT/XylT of (iii) comprises the amino acid sequence of SEQ ID: No 6, 28, 30, or 32.
5. The method of claim 1, wherein the heterologous nucleic acid further comprises a plurality of nucleotide sequences each of which encodes a polypeptide which in combination have QA biosynthesis activity (QA polypeptide), wherein the nucleic acid encodes all of the following QA polypeptides: (i) a β-amyrin synthase (bAS) for cyclisation of 2,3-oxidosqualene (OS) to a triterpene; (ii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid (C-28 oxidase); (iii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16 position to an alcohol (C-16 oxidase); and (iv) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde (C-23 oxidase).
6. The method of claim 5, wherein the C-28 oxidase, C-16 oxidase, and C-23 oxidase are all CYP450 enzymes.
7. The method of claim 6, wherein (i) the C-28 oxidase is a CYP716; (ii) the C-16 oxidase is a CYP716 or CYP87; (iii) the C-23 oxidase is a CYP714, CYP72, or CYP94.
8. The method of claim 5, wherein: the β-amyrin synthase (bAS) comprises the amino acid sequence of SEQ ID: No 12 or an amino acid sequence that is at least 80% identical to SEQ ID: No 12; the C-28 oxidase comprises the amino acid sequence of SEQ ID: No 14 or an amino acid sequence that is at least 80% identical to SEQ ID: No 14; the C-16 oxidase comprises the amino acid sequence of SEQ ID: No 16 or an amino acid sequence that is at least 80% identical to SEQ ID: No 16; and the C-23 oxidase comprises the amino acid sequence of SEQ ID: No 18 or an amino acid sequence that is at least 80% identical to SEQ ID: No 18.
9. The method of claim 1, wherein the nucleic acid further comprises a plurality of nucleotide sequences encoding one or more of the following polypeptides: (i) an HMG-CoA reductase (HMGR); (ii) a squalene synthase (SQS).
10. The method of claim 1, wherein the nucleotide sequences are present on two or more different nucleic acid molecules.
11. The method of claim 10, wherein the host is a plant and the nucleic acid molecules are introduced by co-infiltration with a plurality of Agrobacterium tumefaciens strains each carrying one or more of the nucleic acid molecules.
12. The method of claim 11, wherein the nucleic acid molecules are transient expression vectors, wherein each of the transient expression vectors comprises an expression cassette comprising: (i) a promoter, operably linked to (ii) an enhancer sequence derived from the RNA-2 genome segment of a bipartite RNA virus, in which a target initiation site in the RNA-2 genome segment has been mutated; (iii) a nucleotide sequence encoding one of the polypeptides which in combination have said QA-3-O-TriS biosynthesis activity; (iv) a terminator sequence; and optionally (v) a 3′ UTR located upstream of said terminator sequence.
13. The method of claim 1, wherein the host is a plant that has modified QA-3-O-TriS content.
14. A genetically engineered host cell comprising a heterologous nucleic acid that is not native to the cell, wherein the heterologous nucleic acid comprises a plurality of nucleotide sequences each of which encodes a polypeptide which in combination have QA-3-O-TriS biosynthesis activity, wherein expression of said nucleic acid imparts on the host cell the ability to carry out QA-3-O-TriS biosynthesis, wherein the nucleic acid encodes two or three of the following types of polypeptide (i), (ii) or (iii): (i) a QA-GlcAT capable of transferring GlcpA at the 3-O position of quillaic acid to form QA-GlcpA; (ii) a QA-GalT capable of transferring Galp via a β-1->2 linkage to QA-GlcpA to form QA-GlcpA-Galp; (iii) a QA-RhaT/XylT, capable of transferring Rhap and/or Xylp via a 1,3 linkage to QA-GlcpA-Galp to form QA-GlcpA-[Galp]-Rhap and/or QA-GlcpA-[Galp]-Xylp respectively; wherein: the QA-GlcAT of (i) comprises the amino acid sequence of SEQ ID: No 2 or 26, or an amino acid sequence that is at least 80% identical to SEQ ID: No 2 or 26; the QA-GalT of (ii) comprises the amino acid sequence of SEQ ID: No 4, or an amino acid sequence that is at least 80% identical to SEQ ID: No 4; and the QA-RhaT/XylT of (iii) comprises the amino acid sequence of SEQ ID: No 6, 28, 30, or 32, or an amino acid sequence that is at least 80% identical to SEQ ID: No 6, 28, 30, or 32.
15. The genetically engineered host cell of claim 14, further comprising a plurality of nucleotide sequences each of which encodes a polypeptide which in combination have QA biosynthesis activity, wherein the nucleic acid encodes all of the following QA polypeptides: (i) a β-amyrin synthase (bAS) for cyclisation of 2,3-oxidosqualene (OS) to a triterpene; (ii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid (C-28 oxidase); (iii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16 position to an alcohol (C-16 oxidase); and (iv) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde (C-23 oxidase).
16. A method for producing the host cell of claim 14, comprising co-infiltrating a plurality of recombinant constructs comprising the heterologous nucleic acid into the cell for transient expression thereof.
17. The method for producing the host cell of claim 14, comprising transforming a cell with the heterologous nucleic acid by introducing said nucleic acid into the cell via a vector and causing or allowing recombination between the vector and the cell genome to introduce the heterologous nucleic acid into the genome.
18. The method of claim 17, wherein the cell is a plant cell and the method further comprises regenerating a plant from the transformed cell.
19. A transgenic plant made by the method of claim 18, or a clone or descendant of said transgenic plant that comprises the heterologous nucleic acid, wherein expression of said heterologous nucleic acid imparts an increased ability to carry out QA-3-O-TriS synthesis compared to a wild-type plant otherwise corresponding to said transgenic plant except that it does not comprise the heterologous nucleic acid.
20. A transgenic plant comprising a heterologous nucleic acid that is not native to the plant, wherein the nucleic acid encodes one or more of: (i) a QA 3-O glucuronosyl transferase (QA-GlcAT) capable of transferring
21. A recombinant vector comprising a nucleic acid encoding: (i) a QA 3-O glucuronosyl transferase (QA-GlcAT) capable of transferring
22. The vector of claim 21 wherein the promoter is an inducible promoter.
23. A method comprising the step of introducing a vector of claim 21 into a host cell to transform the host cell.
24. A host cell comprising a vector according to claim 21.
25. The host cell of claim 24, wherein the host cell is microbial.
26. The host cell of claim 25, further comprising a nucleic acid which comprises one or more nucleotide sequences each of which encodes a cytochrome P450 reductase (CPR).
27. The method of claim 23, wherein the host cell is a plant cell, and the method further comprises regenerating a plant from the transformed host cell.
28. A transgenic plant made by the method of claim 27, or a clone or descendant of said transgenic plant that comprises the vector.
29. The method of claim 1, further comprising isolating QA-3-O-TriS, or a downstream product thereof from the host.
30. A method comprising culturing the host cell of claim 14 and isolating QA-3-O-TriS, or a downstream product thereof.
31. A method comprising isolating QA-3-O-TriS, or a derivative thereof from a plant of claim 20.
32. The method of claim 1, wherein: the QA-GlcAT of (i) comprises the amino acid sequence of SEQ ID: No 2 or 26, or an amino acid sequence that is at least 90% identical to SEQ ID: No 2 or 26; the QA-GalT of (ii) comprises the amino acid sequence of SEQ ID: No 4, or an amino acid sequence that is at least 90% identical to SEQ ID: No 4; and the QA-RhaT/XylT of (iii) comprises the amino acid sequence of SEQ ID: No 6, 28, 30, or 32, or an amino acid sequence that is at least 90% identical to SEQ ID: No 6, 28, 30, or 32.
33. The method of claim 1, wherein the QA-GlcAT of (i) comprises the amino acid sequence of SEQ ID: No 2 or 26, or an amino acid sequence that is at least 95% identical to SEQ ID: No 2 or 26; the QA-GalT of (ii) comprises the amino acid sequence of SEQ ID: No 4, or an amino acid sequence that is at least 95% identical to SEQ ID: No 4; and the QA-RhaT/XylT of (iii) comprises the amino acid sequence of SEQ ID: No 6, 28, 30, or 32, or an amino acid sequence that is at least 95% identical to SEQ ID: No 6, 28, 30, or 32.
Description
EXAMPLES
Example 1: Identification and Cloning of Glycosyltransferases from Q. saponaria
(20) In order to augment the publicly available transcriptome, we generated genome sequence data (PacBio sequencing performed by the Earlham Institute, Norwich, Norfolk). The genome sequence was annotated using publicly available data (including the 1KP leaf transcriptomic data from Q. saponaria [4]) and proteins from related plant species in Phytozome.
(21) From these data, we shortlisted a series of sequences which were annotated as putative Family 1 UDP-dependent glycosyltransferases (UGTs), an important class of enzymes which are known to participate in biosynthesis of many plant natural products, including triterpenes [5, 6]. We refined the initial list (containing 200 sequences) down to sequences which were also represented in the 1KP database from which the original QA biosynthetic enzymes were found. The Q. saponaria contig identifiers consist of a 4-letter code (OQHZ) followed by seven digits. Where possible, this seven-digit code is included for all of the candidate genes below. To further refine this list, we performed phylogenetic analysis using a series of characterised GTs from other plant species (Table 3). This allowed us to prioritise the enzymes which fell into the same phylogenetic groups as currently characterised triterpene UGTs from other plant species (Groups A, D and L) and UGTs with relevant sugar-donor specificity (Group B).
(22) Finally, in recent years it has been proposed that a number of chemically diverse plant natural products are synthesised by enzymes encoded by physically co-localised genes. These so-called biosynthetic gene clusters (BGCs) could facilitate identification of additional candidate genes. We therefore deployed the PlantiSMASH genome mining tool [7] to predict possible BGCs within the Q. saponaria genome. This combination of approaches resulted in a final list of 30 candidate Q. saponaria UGTs (
(23) As described above, the genes for quillaic acid biosynthesis appear to be expressed in leaf tissue and were previously amplified by PCR from leaf cDNA. The same approach was therefore utilised for amplification of the GT candidates. A series of oligonucleotide primers was designed which incorporated 5′ attB sites upstream of the target sequence to allow for Gateway cloning. With these primers, genes were successfully amplified and cloned into pDONR207. The clones were sequenced before transfer into the plant expression vector pEAQ-HT-DEST1 [14]. Finally, the expression constructs were transformed individually into Agrobacterium tumefaciens (LBA4404) for transient expression in N. benthamiana.
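The primer design step described above can be illustrated with a minimal sketch that prepends attB adapters to gene-specific priming regions. The adapter sequences shown are the commonly published Gateway attB1/attB2 sequences and are an assumption here, as are the toy coding sequence, the 20-nt annealing length and the function names; the primers actually used are those listed in Table 2.

```python
# Hypothetical sketch: build attB-flanked PCR primers for Gateway BP cloning.
# ATTB1/ATTB2 are the commonly published adapter sequences (an assumption;
# the primers actually used in this work are those of Table 2).
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCT"
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def attb_primers(orf: str, anneal_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers with 5' attB1/attB2 adapters.

    The forward primer anneals to the start of the coding sequence; the
    reverse primer is the reverse complement of its end.
    """
    orf = orf.upper()
    if len(orf) < anneal_len:
        raise ValueError("coding sequence shorter than annealing length")
    forward = ATTB1 + orf[:anneal_len]
    reverse = ATTB2 + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Toy coding sequence for illustration only (not a Q. saponaria gene).
TOY_ORF = "ATGGCTTCTAAGGGTGAAGAACTGTTCACCGGTGTTGTTCCGATCCTGGTTGAACTGTAA"
fwd, rev = attb_primers(TOY_ORF)
print("forward:", fwd)
print("reverse:", rev)
```

In practice the annealing length would be chosen per gene to balance melting temperatures, which this sketch does not attempt.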
(24) Screening of the 31 candidate GTs was performed using transient expression in N. benthamiana. All infiltrations included the four A. tumefaciens strains carrying the constructs for QA biosynthesis (QsbAS and C-28/C-23/C-16 oxidases) along with a strain carrying tHMGR, a key yield-enhancing enzyme for triterpene production.
Example 2: Identification of Quillaic Acid 3-O Glucuronosyl Transferase
(25) Following LC-MS analysis of the samples, it was discovered that, unexpectedly, one candidate, a predicted cellulose synthase-like (CSL) enzyme (named herein QsCSL1) was active upon quillaic acid. Co-expression of this enzyme with the five A. tumefaciens strains for QA-production resulted in significant depletion of the QA peak at 19.2 minutes, accompanied by the appearance of a new peak at 13.9 minutes (
Example 3: Identification of QA-GlcpA Galactosyl Transferase
(26) Following the identification of a putative glucuronosyl transferase, the next proposed step was the addition of the -
(27) A triterpene 3-O-glucuronoside-β-1,2-galactosyltransferase, GmUGT73P2, has been previously identified in soybean (Glycine max) (Shibuya et al, 2010). This enzyme catalyses the addition of
(28) Interestingly, the phylogenetic analysis of the Q. saponaria UGT enzymes showed that one candidate, Qs_2073886_D6, is closely related to GmUGT73P2 (
(29) Qs_2073886_D6 was coexpressed with the six genes required for production of the putative QA-GlcpA (tHMGR/QsbAS/CYP716-C-28/CYP716-C-16/CYP714-C23/QsCSL1). HPLC-MS analysis revealed that Qs_2073886_D6 appeared to convert the putative QA-GlcpA product to a new, more polar product at 12.6 minutes (
(30) To establish further evidence for the identity of the new product, we utilised the soybean (Glycine max) triterpene 3-O-glucuronoside-β-1,2-galactosyltransferase enzyme, GmUGT73P2. It was reasoned that this enzyme may show similar galactosyltransferase activity towards the putative QA-GlcpA product. An infiltration was thus also performed with coexpression of the six enzymes necessary for synthesis of the QA-GlcpA and GmUGT73P2. LC-MS analysis of the infiltrated leaf extracts revealed that a peak could indeed be observed in the GmUGT73P2-expressing samples which had a matching retention time and mass spectrum to the product seen at 12.6 minutes in the Q. saponaria galactosyltransferase-expressing samples (
(31) We additionally performed a large scale infiltration of tHMGR/QsbAS/CYP716-C-28/CYP716-C-16/CYP714-C23/QsCSL1/Qs-3-O-GalT in N. benthamiana as previously described [19] to purify this compound (32.1 g) to assign its structure by NMR. This confirmed it to be 3-{[-
Example 4: Identification of QA-GlcpA-Galp Dual Rhamnosyl/Xylosyl Transferase
(32) We next repeated the process of screening the remaining GT candidates against the QA-GlcpA-Galp product. As before, GT candidates were screened by co-expression with the seven genes required to make QA-GlcpA-Galp (tHMGR/QsbAS/CYP716-C-28/CYP716-C-16/CYP714-C23/QsCSL1/Qs-3-O-GalT). With this strategy we identified a UGT enzyme which resulted in depletion of the QA-GlcpA-Galp product. However, rather than a single new product, we observed the appearance of two new products with very close retention times to the former QA-GlcpA-Galp (
(33) Q. saponaria is known to produce in excess of 100 different saponins [16]. Within these saponins, the 3-O-GlcpA--1,2-
(34) Previously, chemical profiling of Q. saponaria trees has demonstrated the existence of distinct chemotypes which vary in their ability to produce saponins containing either Rhap or Xylp attached to GlcpA-3-O (see WO 2018/057031). One explanation for these observations is the presence of two distinct alleles of the terminal sugar transferase with differing sugar specificity as previously demonstrated for soybean [18]. Notwithstanding this, the present disclosure provides an enzyme which is capable of catalysing addition of two distinct sugars at the same position.
Example 5: Purification and NMR Validation of the Trisaccharide Produced in N. benthamiana
(35) To verify the structures of the compounds 1 and 2 (
Example 6: Use of QA-3-O-TriS Genes, Optionally in Combination with QA Genes, for Production of Stably Transformed Plants
(36) Triterpenes have previously been produced using engineered transgenic plant lines (e.g. Arabidopsis, wheat). A series of Golden Gate [23] vectors which allow for construction of multigene vectors and integration of an entire pathway into a single locus has been reported. These can be applied analogously to the present invention, in the light of the disclosure herein.
(37) The QA-3-O-TriS genes described herein, optionally in conjunction with QA genes of prior-filed unpublished PCT/EP2018/086430 (subsequently published as WO 2019/122259), may thus be used to produce stable transgenic plants in the light of the present disclosure in combination with known transgenic technologies.
Example 7: Identification of Quillaic Acid 3-O Glucuronosyl Transferase CSLG2 (QsCSLG2)
(38) As described in the preceding Examples, the 1KP Q. saponaria leaf transcriptome was used to identify genes involved in the biosynthesis of quillaic acid (QsbAS, QsCYP716-C-16, QsCYP714-C-23 and QsCYP716-C-28) and the trisaccharide at the C-3 position of QS-21 (QsCSL1, Qs-3-O-GalT and Qs-3-O-RhaT/XylT).
(39) Genes involved in triterpene glycoside biosynthesis are typically co-expressed [25]. In order to investigate the expression pattern of the characterised QS-21 biosynthetic genes across multiple tissues, RNA-seq data were generated for six Q. saponaria tissues (primordia, expanding leaf, mature leaf, old leaf, green stem and root). The gene expression profiles for QsbAS, QsCYP716-C-16, QsCYP714-C-23, QsCYP716-C-28 and Qs-3-O-GalT showed a pattern of low expression in old leaf and high expression in primordia, with some variability in expression levels in root, expanding leaf, green stem and mature leaf (
(40) As the expression profile for QsCSL1 did not follow the general pattern seen for the other characterised QS-21 genes, it was investigated whether there might be genes related to QsCSL1 that did have the QS-21 gene expression pattern and which therefore might be involved in QS-21 biosynthesis. QsCSL1 was used in a BLASTp search to identify cellulose synthase-like genes in the Q. saponaria annotated genome. This identified 39 additional cellulose synthase superfamily genes, of which five (named CslG2 to CslG6) were in the same subfamily as QsCSL1 (
(41) Analysis of the expression profile of these genes shows that CslG3-CslG6 are expressed most highly in old leaf or in the root (
Example 8: Identification of QA-GlcpA-Galp Xylosyltransferase and Rhamnosyltransferases
(42) As explained in Example 4, the DNA sequence for the dual glycosyltransferase Qs-3-O-RhaT/XylT was not identified in the Quillaja saponaria genomic dataset. Instead, this gene appeared to be a chimera between two adjacent genes, Qs_0283860 (a pseudogene) and Qs_0283870 (
(43) It is theoretically possible that there are alleles of these genes that are not represented in the genomic Q. saponaria dataset or that this region was incorrectly resolved. As an alternative database, a de novo transcriptome assembly was generated from the Q. saponaria primordia RNA-seq reads [26]. A BLASTn search using the three genomic genes and Qs-3-O-RhaT/XylT as queries identified two full-length transcripts: DN20529_c0_g2_i6, which was identical to the sequence of Qs_0283870, corroborating the sequence of this gene; and DN20529_c0_g2_i8, which had 99% DNA sequence identity to the Qs_0283860 pseudogene and 98% DNA sequence identity to Qs_0283850 (Table 9).
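The percent identity values quoted above come from BLASTn alignments. As a minimal sketch of how such a figure is derived, the function below counts identical columns over the length of a pairwise alignment. The sequences are toy examples, the function name is hypothetical, and counting gapped columns in the denominator (as BLAST-style reports do) is an assumption about the convention.

```python
# Hypothetical sketch: percent identity between two pre-aligned sequences.
# Gapped columns count towards the alignment length (BLAST-style convention,
# assumed here); columns where both sequences are gaps are ignored.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return identical columns as a percentage of alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


# Toy aligned fragments for illustration only.
print(percent_identity("ATGGCTTCTA", "ATGGCATCTA"))  # one mismatch in ten
print(round(percent_identity("ATG-CT", "ATGACT"), 1))  # one gap in six
```

The same function applies to aligned amino acid sequences, for example when checking a sequence against an identity threshold.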
(44) To investigate the presence and function of these genes, we attempted to amplify the sequences from Q. saponaria leaf cDNA. Qs_0283850 and Qs_0283870 were successfully amplified. Primers designed to amplify the pseudogene Qs_0283860 amplified a full-length sequence with 100% sequence identity in the coding region of the gene predicted by the de novo transcriptome, DN20529_c0_g2_i8. This amplified sequence is subsequently referred to as DN20529_c0_g2_i8. These three amplified genes (Qs_0283850, Qs_0283870 and DN20529_c0_g2_i8) were cloned into the plant expression vector pEAQ-HT-DEST1 and transformed into A. tumefaciens for transient expression in Nicotiana benthamiana.
(45) As described above, co-expression of Qs-3-O-RhaT/XylT with the seven genes able to make QA-GlcpA-Galp (tHMGR/QsbAS/CYP716-C-28/CYP716-C-16/CYP714-C23/QsCSL1/Qs-3-O-GalT) resulted in the appearance of trisaccharides QA-GlcpA-[Galp]-Rhap (retention time=12.5 min, MW=970) and QA-GlcpA-[Galp]-Xylp (retention time=12.75 min, MW=956), which have very close retention times to the former QA-GlcpA-Galp (retention time=12.6 min, MW=824) (
(46) Similarly, co-expression of either Qs_0283850, Qs_0283870 or DN20529_c0_g2_i8 with the genes required to make QA-GlcpA-Galp revealed that all three enzymes were able to convert QA-GlcpA-Galp, but resulted in the production of one new product each (
(47) Co-expression of Qs_0283870 with the genes required to make QA-GlcpA-Galp also reduced the QA-GlcpA-Galp peak; however, it led to accumulation of a less polar compound with the same retention time (12.75 min) and molecular weight (MW=956) as QA-GlcpA-[Galp]-Xylp (
(48) This suggests that Qs_0283870 is primarily a xylosyltransferase and can produce QA-GlcpA-[Galp]-Xylp without producing significant amounts of QA-GlcpA-[Galp]-Rhap.
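The molecular weights used above to distinguish the rhamnosylated and xylosylated products follow from simple mass arithmetic. The sketch below adds nominal sugar residue masses (monosaccharide minus water) to the aglycone. The nominal mass of quillaic acid (486, C30H46O5) and the residue masses are standard chemical values supplied here as assumptions rather than taken from the text; the function name is hypothetical.

```python
# Hypothetical sketch: nominal molecular weights of quillaic acid glycosides.
# Residue mass = monosaccharide mass minus one water lost on glycosylation.
QUILLAIC_ACID_MW = 486  # nominal mass of C30H46O5 (assumed, not from the text)

RESIDUE_MASS = {
    "GlcpA": 176,  # hexuronic acid residue
    "Galp": 162,   # hexose residue
    "Rhap": 146,   # deoxyhexose residue
    "Xylp": 132,   # pentose residue
}


def glycoside_mw(sugars: list[str], aglycone_mw: int = QUILLAIC_ACID_MW) -> int:
    """Return the nominal MW of the aglycone carrying the listed sugars."""
    return aglycone_mw + sum(RESIDUE_MASS[s] for s in sugars)


print(glycoside_mw(["GlcpA"]))                  # 662
print(glycoside_mw(["GlcpA", "Galp"]))          # 824
print(glycoside_mw(["GlcpA", "Galp", "Rhap"]))  # 970
print(glycoside_mw(["GlcpA", "Galp", "Xylp"]))  # 956
```

The 14-unit difference between the two trisaccharides (970 versus 956) is the difference between a deoxyhexose and a pentose residue, which is why the two products can be told apart by mass despite their very close retention times.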
(49) Materials and Methods
(50) Phylogenetic Analysis of UGT Candidates
(51) Amino acid sequences were deduced from the predicted full-length coding sequences of the Q. saponaria UGTs. Representative amino acid sequences of characterised glycosyltransferase family 1 UGTs from other plant species (Table 3) were obtained from the NCBI database and incorporated into the phylogenetic analysis. Protein sequences were aligned using MAFFT (https://mafft.cbrc.jp/alignment/software/). The unrooted trees were constructed in MEGA7 by the Neighbor-Joining method with 1000 bootstrap replicates [20, 21].
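Neighbor-Joining operates on a matrix of pairwise distances computed from the alignment. The sketch below shows one simple way to obtain such a matrix. The distance measure (p-distance with pairwise deletion of gapped sites) is an assumption, since the MEGA7 settings are not stated above, and the sequence names and fragments are toy examples.

```python
# Hypothetical sketch: pairwise p-distance matrix from aligned protein
# sequences, the kind of input a Neighbor-Joining tree is built from.
# The distance model is an assumption; MEGA7 settings are not given above.
from itertools import combinations


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Proportion of differing sites, ignoring sites gapped in either sequence."""
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Return p-distances for every unordered pair of named sequences."""
    if len({len(seq) for seq in alignment.values()}) != 1:
        raise ValueError("all sequences must have the same aligned length")
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Toy aligned fragments for illustration only (not real UGT sequences).
toy_alignment = {
    "UGT_a": "MKELVFIPSP",
    "UGT_b": "MKELVLIPSP",
    "UGT_c": "MRE-VLIPAP",
}
for pair, dist in distance_matrix(toy_alignment).items():
    print(pair, round(dist, 3))
```

Bootstrap support would then be estimated by resampling alignment columns with replacement and rebuilding the tree for each replicate, which is outside the scope of this sketch.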
(52) Primers and Cloning
(53) The genes encoding the enzymes described herein (QsCSL1, Qs-3-O-GalT, Qs-3-O-RhaT/XylT, QsCslG2, Qs_0283850, DN20529_c0_g2_i8 and Qs_0283870) were amplified by PCR from cDNA derived from leaf tissue of Q. saponaria. PCR was performed using the primers detailed in Tables 2 and 10, using iProof polymerase with thermal cycling according to the manufacturer's recommendations. The resultant PCR products were purified (Qiagen PCR cleanup kit) and each cloned into the pDONR207 vector using BP clonase according to the manufacturer's instructions. The BP reaction was transformed into E. coli, the resulting transformants were cultured, and the plasmids were isolated by miniprep (Qiagen). The isolated plasmids were sequenced (Eurofins) to verify the presence of the correct genes. Next, each of the genes was further subcloned into the pEAQ-HT-DEST1 expression vector using LR clonase. The resulting vectors were used to transform A. tumefaciens LBA4404 by flash freezing in liquid N₂.
(54) Agroinfiltration of N. benthamiana Leaves
(55) Agroinfiltration was performed using a needleless syringe as previously described [19]. All genes were expressed from pEAQ-HT-DEST1 binary expression vectors [14] in A. tumefaciens LBA4404 as described above. Cultivation of bacteria and plants is as described in [19].
(56) Preparation of N. benthamiana Leaf Extracts for LC-MS Analysis
(57) Leaves were harvested 5 days after agroinfiltration and freeze-dried. Freeze-dried leaf material (10 mg per sample) was ground at 1000 rpm for 1 min (Geno/Grinder 2010, Spex SamplePrep). Extractions were carried out in 550 µL 80% methanol with 20 µg/mL of digitoxin (internal standard; Sigma) for 20 min at 40 °C, with shaking at 1400 rpm (Thermomixer Comfort, Eppendorf). The sample was partitioned twice with 400 µL hexane. The aqueous phase was dried under vacuum at 40 °C (EZ-2 Series Evaporator, Genevac). Dried material was resuspended in 75 µL of 100% methanol and filtered at 12,500 g for 30 sec (0.2 µm, Spin-X, Costar). Filtered samples were transferred to glass vials and analysed as detailed below.
(58) LC-MS Analysis of N. benthamiana Leaf Extracts
(59) Analysis was carried out using a Prominence HPLC system with single quadrupole mass spectrometer LCMS-2020 (Shimadzu) and Corona Veo RS Charged Aerosol Detector (CAD) (Dionex). Detection: MS (dual ESI/APCI ionization, DL temp 250 °C, neb gas flow 15 L·min⁻¹, heat block temp 400 °C, spray voltage Pos 4.5 kV, Neg 3.5 kV). CAD: data collection rate 10 Hz, filter constant 3.6 s, 925 evaporator temp. 35 °C, ion trap voltage 20.5 V. Method: Solvent A: [H₂O+0.1% formic acid]; Solvent B: [acetonitrile (CH₃CN)+0.1% formic acid]. Injection volume: 10 µL. Gradient: 15% [B] from 0 to 1.5 min, 15% to 60% [B] from 1.5 to 26 min, 60% to 100% [B] from 26 to 26.5 min, 100% [B] from 26.5 to 28.5 min, 100% to 15% [B] from 28.5 to 29 min, 35% [B] from 29 to 30 min. The method was performed using a flow rate of 0.3 mL·min⁻¹ and a Kinetex column 2.6 µm XB-C18 100 Å, 50 × 2.1 mm (Phenomenex). Analysis was performed using LabSolutions software (Shimadzu).
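For readers who wish to relate a retention time to the mobile phase composition at that moment, the gradient can be written as a table of linear segments. The sketch below is a minimal illustration with hypothetical names; the segment values are transcribed as printed above, including the final 35% hold, which does not match the 15% endpoint of the preceding step and is kept unaltered.

```python
# Hypothetical sketch: solvent B percentage at time t for the LC gradient.
# Segments are (start_min, end_min, %B at start, %B at end), transcribed as
# printed in the method text; the last segment is kept as printed (35%).
GRADIENT = [
    (0.0, 1.5, 15.0, 15.0),
    (1.5, 26.0, 15.0, 60.0),
    (26.0, 26.5, 60.0, 100.0),
    (26.5, 28.5, 100.0, 100.0),
    (28.5, 29.0, 100.0, 15.0),
    (29.0, 30.0, 35.0, 35.0),
]


def percent_b(t_min: float) -> float:
    """Linearly interpolate %B within the first segment that contains t_min."""
    for start, end, b_start, b_end in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + fraction * (b_end - b_start)
    raise ValueError("time is outside the 0 to 30 min programme")


# Composition at the retention times reported in the Examples.
for rt in (12.5, 12.6, 12.75, 13.9, 19.2):
    print(f"{rt:>5} min: {percent_b(rt):.1f}% B")
```

The values returned are programmed compositions at the pump and do not account for the dwell volume of the system, so the composition actually reaching the column at a given time lags slightly behind.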
(60) Large Scale Vacuum Infiltration of N. benthamiana
(61) A total of 198 plants were infiltrated by vacuum as previously described [19, 22] with the A. tumefaciens strains carrying the pEAQ-HT-DEST1 constructs for tHMGR, QsbAS, CYP716-C-28, CYP716-C-16, CYP714-C-23, QsCSL1, Qs-3-O-GalT and Qs-3-O-RhaT/XylT. Plants were harvested after 4 days and freeze dried, resulting in a total of 175.25 g dry leaf material.
(62) Purification of Compounds from Large Scale Infiltrations of N. benthamiana
(63) General Procedures
(64) Organic solvents used for extraction and flash chromatography were reagent grade and used directly without further distillation. HPLC mobile phases were prepared using HPLC grade solvents. LC-MS spectral data were recorded on SHIMADZU-2020, single quad, using Kinetex-XB-C18 (50 × 10 mm i.d.; 2.6 µm; USA), (JIC, UK). 1D and 2D NMR spectra were recorded on Bruker Avance 600 MHz spectrometer equipped with a BBFO Plus Smart probe and a triple resonance TCI cryoprobe, respectively (JIC, UK). The chemical shifts are relative to the residual signal solvent (MeOH-d4: δH 3.31; δC 49.15). Preparative HPLC experiments were performed on Ultimate 3000 using Luna C18 column (250 × 10 mm i.d.; 5 µm; USA). Flash column chromatography (FCC) was performed using an Isolera One (Biotage), using SNAP Ultra 50 g columns. Analytical TLC experiments were performed on silica gel precoated aluminium plates (F254, 20 × 20 cm, Merck KGaA, Germany). TLC plates were visualized under UV light (254 nm) followed by staining with p-anisaldehyde (2% v/v p-anisaldehyde, 2% v/v conc. H₂SO₄).
(65) Extraction and Isolation
(66) Dried N. benthamiana powder was mixed with quartz sand (0.3-0.9 mm). This mixture was layered on top of a bottom layer of quartz sand (0.3-0.9 mm) 3 cm in depth within a 120 mL extraction cell. Extraction was performed using a Speed Extractor E-914 (Büchi) with three cycles at 100 °C and a pressure of 130 bar. Cycle one had zero hold time, and cycles two and three had 5 min hold times. The run finished with a 1 min solvent flush and 12 min N₂ flush. The dried leaves were initially extracted with hexane for defatting, followed by exhaustive extraction using methanol. Organic layers were combined and evaporated under reduced pressure. The crude methanolic extract was dissolved in the minimum amount of methanol and diluted with an equivalent volume of water, then successively partitioned in a separating funnel against hexane, dichloromethane, ethyl acetate and n-butanol. The butanol layer was collected and dried over anhydrous Na₂SO₄, evaporated under reduced pressure and subjected to normal phase silica-gel flash chromatography (35-70 µm), using a long gradient of DCM/MeOH [100/0-0/100] over 30 min. The column was further washed with ethyl acetate/acetone/water/formic acid (5/3/0.5/0.5). All fractions were monitored by TLC using different eluent systems and combined according to their polarities. Based on the LC-MS profiling and ¹H NMR, promising fractions were taken forward for further preparative chromatographic purification by reversed phase (preparative/semipreparative C18-HPLC) using the eluent system water/acetonitrile containing 0.1% formic acid, to finally afford pure saponins. The detailed isolation scheme of the isolated compounds for the purification of compounds 1 and 2 (see Examples 4 and 5) and their quantities is given (
(67) NMR Analysis
(68) NMR spectra were recorded in Fourier transform mode at a nominal frequency of 600 MHz for ¹H NMR and 150 MHz for ¹³C NMR in deuterated methanol unless otherwise indicated. Chemical investigation of the n-butanol fraction of N. benthamiana leaves (Examples 4 and 5) afforded the isolation of two previously reported triterpene saponins, namely 3-{[-
(69) Alignment of RNA Sequences and Heatmaps
(70) RNA-seq data (Illumina-sequenced reads) were aligned to the Q. saponaria genome using the STAR package (version 2.5) [27] and quantified using the featureCounts program (http://subread.sourceforge.net/, version 1.6.0). The heatmaps were drawn in R using heatmap.2 (https://CRAN.R-project.org/package=gplots).
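Expression heatmaps of this kind are commonly drawn from per-gene scaled values so that genes with different absolute expression levels can be compared across tissues. The sketch below shows one such transformation, a log2 transform followed by row-wise z-scoring. The transformation actually applied before plotting is not stated above, so this is an assumption, and the counts are invented purely for illustration.

```python
# Hypothetical sketch: per-gene scaling of read counts across tissues before
# drawing a heatmap. The log2(count + 1) transform and row z-scoring are
# assumptions; the counts below are invented for illustration.
import math
import statistics

TISSUES = ["primordia", "expanding leaf", "mature leaf",
           "old leaf", "green stem", "root"]


def row_z_scores(counts: list[float]) -> list[float]:
    """Return z-scores of log2(count + 1) values for one gene across tissues."""
    logged = [math.log2(c + 1) for c in counts]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # sample standard deviation (n - 1)
    if sd == 0:
        return [0.0] * len(logged)
    return [(value - mean) / sd for value in logged]


# Invented counts with high expression in primordia and low in old leaf.
toy_counts = [1023, 255, 127, 3, 63, 31]
for tissue, z in zip(TISSUES, row_z_scores(toy_counts)):
    print(f"{tissue:>15}: {z:+.2f}")
```

After scaling, each gene has mean zero across tissues, so the colour scale reflects relative expression pattern rather than absolute abundance.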
REFERENCES
(71)
1. Del Giudice, G., R. Rappuoli, and A. M. Didierlaurent, Correlates of adjuvanticity: A review on adjuvants in licensed vaccines. Seminars in Immunology, 2018. 39: p. 14-21.
2. Marciani, D. J., Elucidating the Mechanisms of Action of Saponin-Derived Adjuvants. Trends in Pharmacological Sciences, 2018. 39(6): p. 573-585.
3. FDA. Available from: https://www.fda.gov/vaccines-blood-biologics/vaccines/shingrix.
4. Johnson, M. T. J., et al., Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes. PLOS ONE, 2012. 7(11): p. e50226.
5. Bowles, D., et al., Glycosyltransferases of lipophilic small molecules. Annu Rev Plant Biol, 2006. 57: p. 567-97.
6. Louveau, T. and A. Osbourn, The Sweet Side of Plant-Specialized Metabolism. Cold Spring Harb Perspect Biol (In Press), 2019.
7. Kautsar, S. A., et al., plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Res, 2017.
8. Luang, S., et al., Rice Os9BGlu31 is a transglucosidase with the capacity to equilibrate phenylpropanoid, flavonoid, and phytohormone glycoconjugates. J Biol Chem, 2013. 288(14): p. 10111-23.
9. Matsuba, Y., et al., A novel glucosylation reaction on anthocyanins catalyzed by acylglucose-dependent glucosyltransferase in the petals of carnation and delphinium. Plant Cell, 2010. 22(10): p. 3374-89.
10. Miyahara, T., et al., Isolation of an acyl-glucose-dependent anthocyanin 7-O-glucosyltransferase from the monocot Agapanthus africanus. J Plant Physiol, 2012. 169(13): p. 1321-6.
11. Miyahara, T., et al., Isolation of anthocyanin 7-O-glucosyltransferase from Canterbury bells (Campanula medium). Plant Biotechnology, 2014. advpub.
12. Nishizaki, Y., et al., p-Hydroxybenzoyl-glucose is a zwitter donor for the biosynthesis of 7-polyacylated anthocyanin in Delphinium. Plant Cell, 2013. 25(10): p. 4150-65.
13. Song, X., et al., Genome-wide characterization of the cellulose synthase gene superfamily in Solanum lycopersicum. Gene, 2019. 688: p. 71-83.
14. Sainsbury, F., E. C. Thuenemann, and G. P. Lomonossoff, pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnol J, 2009. 7(7): p. 682-93.
15. Louveau, T., et al., Analysis of two new arabinosyltransferases belonging to the carbohydrate-active enzyme (CAZY) glycosyl transferase family 1 provides insights into disease resistance and sugar donor specificity. The Plant Cell, 2018: p. tpc.00641.2018.
16. Kite, G. C., M. J. Howes, and M. S. Simmonds, Metabolomic analysis of saponins in crude extracts of Quillaja saponaria by liquid chromatography/mass spectrometry for product authentication. Rapid Commun Mass Spectrom, 2004. 18(23): p. 2859-70.
17. Fleck, J. D., et al., Saponins from Quillaja saponaria and Quillaja brasiliensis: Particular Chemical Characteristics and Biological Activities. Molecules, 2019. 24(1).
18. Sayama, T., et al., The Sg-1 glycosyltransferase locus regulates structural diversity of triterpenoid saponins of soybean. Plant Cell, 2012. 24(5): p. 2123-38.
19. Reed, J., et al., A translational synthetic biology platform for rapid access to gram-scale quantities of novel drug-like molecules. Metab Eng, 2017. 42: p. 185-193.
20. Kumar, S., G. Stecher, and K. Tamura, MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol Biol Evol, 2016. 33(7): p. 1870-4.
21. Saitou, N. and M. Nei, The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol Biol Evol, 1987. 4(4): p. 406-25.
22. Stephenson, M. J., et al., Transient Expression in Nicotiana benthamiana Leaves for Triterpene Production at a Preparative Scale. Journal of visualized experiments: JoVE, 2018(138): p. 58169.
23. Guo, S., et al., Triterpenoid saponins from Quillaja saponaria. Phytochemistry, 1998. 48(1): p. 175-180.
24. Miettinen, K., et al., The ancient CYP716 family is a major contributor to the diversification of eudicot triterpenoid biosynthesis. Nat Commun, 2017. 8: p. 14153.
25. Thimmappa, R., K. Geisler, T. Louveau, P. O'Maille, and A. Osbourn, Triterpene biosynthesis in plants. Annu Rev Plant Biol, 2014. 65: p. 225-257.
26. Grabherr, M. G., et al., Full-length transcriptome assembly from RNA-seq data without a reference genome. Nat Biotechnol, 2011. 29: p. 644-652.
27. Dobin, A., et al., STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 2013. 29: p. 15-21.
Tables and Sequences
(72) TABLE-US-00001 TABLE 1 1H, 13C NMR spectral data for compounds 1 and 2 in MeOH-d4 (600, 150 MHz). Compound 1: QA-GlcpA-[Galp]-Rhap. Compound 2: QA-GlcpA-[Galp]-Xylp.
Position No. | (1) δC, Type | (1) δH mult (J in Hz) | (2) δC, Type | (2) δH mult (J in Hz)
1 | 39.4, CH2 | 1.71/1.13, m | 39.3, CH2 | 1.69/1.10, m
2 | 25.7, CH2 | 2.02/1.79, m | 25.8, CH2 | 2.05/1.77, m
3 | 86.0, CH | 3.85, dd (11.9, 4.55) | 85.9, CH | 3.86, m
4 | 56.4, Cq | | 56.5, Cq |
5 | 49.2, CH | 1.35, m | 49.1, CH | 1.32, m
6 | 21.4, CH2 | 1.50/0.90, m | 21.5, CH2 | 1.47/0.90, m
7 | 33.7, CH2 | 1.57/1.25, m | 33.8, CH2 | 1.55/1.23, m
8 | 41.0, Cq | | 41.1, Cq, HMBC |
9 | 48.2, CH | 1.77, m | 48.2, CH | 1.75, m
10 | 37.2, Cq | | 37.1, Cq |
11 | 24.6, CH2 | 1.93/1.93, m | 24.6, CH2 | 1.92/1.92
12 | 123.3, CH | 5.30, br t (3.7) | 123.4, CH | 5.30, bd s
13 | 145.3, Cq | | 145.3, Cq, HMBC |
14 | 42.8, Cq | | 42.8, Cq |
15 | 36.3, CH2 | 1.84/1.34, m | 36.4, CH2 | 1.84/1.33, m
16 | 75.4, CH | 4.45, m | 75.5, CH | 4.44, br s
17 | 49.7, Cq | | 49.7, Cq |
18 | 42.2, CH | 3.01, dd (14.3, 4.4) | 42.3, CH | 3.01, d (14.3, 4.4)
19 | 47.8, CH2 | 2.30, t (13.6)/1.04, m | 47.9, CH2 | 2.29, t (13.6)/1.03, m
20 | 31.5, Cq | | 30.9, Cq |
21 | 36.7, CH2 | 1.96/1.16, m | 36.8, CH2 | 1.95/1.14, m
22 | 32.9, CH2 | 1.91/1.78, m | 32.9, CH2 | 1.89/1.76, m
23 | 210.9, CH | 9.44, s | 210.8, CH | 9.43, s
24 | 10.9, CH3 | 1.15, s | 10.9, CH3 | 1.13, s
25 | 16.4, CH3 | 1.00, s | 16.5, CH3 | 1.00, s
26 | 17.9, CH3 | 0.79, s | 17.9, CH3 | 0.80, s
27 | 27.4, CH3 | 1.40, s | 27.4, CH3 | 1.39, s
28 | 181.2, Cq | | 181.0, Cq, HMBC |
29 | 33.6, CH3 | 0.88, s | 33.6, CH3 | 0.88, s
30 | 25.0, CH3 | 0.97, s | 25.1, CH3 | 0.97, s
GlcA-1 | 104.2, CH | 4.46, m | 104.5, CH | 4.36, d (7)
GlcA-2 | 78.4, CH | 3.63, m | 78.7, CH | 3.65, m
GlcA-3 | 86.0, CH | 3.63, m | 86.7, CH | 3.67, m
GlcA-4 | 73.2, CH | 3.48, m | 72.1, CH | 3.55, m
GlcA-5 | 77.0, CH | 3.73, m | Not detected | Not detected
GlcA-6 | 174.2, Cq | | 176.3, Cq, HMBC |
Gal-1 | 104.4, CH | 4.45, m | 104.0, CH | 4.78, d (6)
Gal-2 | 73.2, CH | 3.48, m | 73.8, CH | 3.45, m
Gal-3 | 75.2, CH | 3.47, m | 75.6, CH | 3.43, m
Gal-4 | 70.8, CH | 3.81, m | 70.9, CH | 3.80, m
Gal-5 | 77.1, CH | 3.47, m | 76.8, CH | 3.48, m
Gal-6 | 62.4, CH2 | 3.78/3.73, m | 62.3, CH2 | 3.75/3.72, m
Rha-1 | 103.4, CH | 5.03, d (1.8) | |
Rha-2 | 72.3, CH | 4.01, dd (3.4, 1.8) | |
Rha-3 | 72.4, CH | 3.65, m | |
Rha-4 | 74.0, CH | 3.40, m | |
Rha-5 | 70.7, CH | 3.96, m | |
Rha-6 | 18.0, CH3 | 1.24, d (6.1) | |
Xyl-1 | | | 104.9, CH | 4.61, d (7.7)
Xyl-2 | | | 75.5, CH | 3.23, m
Xyl-3 | | | 78.3, CH | 3.31, m
Xyl-4 | | | 71.3, CH | 3.49, m
Xyl-5 | | | 67.3, CH2 | 3.89/3.21, m
(73) TABLE-US-00002 TABLE 2 Primers used to clone the three glycosyltransferases required for biosynthesis of the trisaccharide at C-3 of quillaic acid. Gene specific sequences are shown in black, while the attB sites required for Gateway cloning are shown in red.
Name | Sequence | SEQ ID NO.
QsCSL1_attB1F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGAAATCCCCCTCTAACCCAAATC | 33
QsCSL1_attB2R | GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGACCATTTTCTTGCTGATTCTAG | 34
Qs-3-O-GalT_attB1F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGTGGAGTCTCCAGCAGATC | 35
Qs-3-O-GalT_attB2R | GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGACACCCTGAATTCTTGATTTC | 36
Qs-3-O-RhaT/XylT_attB1F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGTCTCCGGCGACGACGATG | 37
Qs-3-O-RhaT/XylT_attB2R | GGGGACCACTTTGTACAAGAAAGCTGGGTATCACGATTCATGATCTTGTGCAGCC | 38
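Because the black/red colouring that distinguishes the gene-specific portion from the attB portion of each primer does not survive in plain text, the split can be sketched computationally. The following minimal Python example assumes that the attB1 and attB2 adapters are the 5' stretches shared by all forward and all reverse primers of Table 2, ending immediately before the start codon (forward) or the reverse-complemented stop codon (reverse). The two prefix constants and the function name are assumptions inferred from the listed sequences, not definitions given in the text.

```python
# Assumed adapter prefixes, inferred from the shared 5' ends of the Table 2 primers.
ATTB1_PREFIX = "GGGGACAAGTTTGTACAAAAAAGCAGGCTTA"
ATTB2_PREFIX = "GGGGACCACTTTGTACAAGAAAGCTGGGTA"

# Primer sequences as listed in Table 2 (SEQ ID NOs. 33-38).
PRIMERS = {
    "QsCSL1_attB1F": "GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGAAATCCCCCTCTAACCCAAATC",
    "QsCSL1_attB2R": "GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGACCATTTTCTTGCTGATTCTAG",
    "Qs-3-O-GalT_attB1F": "GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGTGGAGTCTCCAGCAGATC",
    "Qs-3-O-GalT_attB2R": "GGGGACCACTTTGTACAAGAAAGCTGGGTATCAGACACCCTGAATTCTTGATTTC",
    "Qs-3-O-RhaT/XylT_attB1F": "GGGGACAAGTTTGTACAAAAAAGCAGGCTTAATGGTCTCCGGCGACGACGATG",
    "Qs-3-O-RhaT/XylT_attB2R": "GGGGACCACTTTGTACAAGAAAGCTGGGTATCACGATTCATGATCTTGTGCAGCC",
}

def split_primer(primer):
    """Return (adapter_name, gene_specific_part); adapter_name is None if no prefix matches."""
    for name, prefix in (("attB1", ATTB1_PREFIX), ("attB2", ATTB2_PREFIX)):
        if primer.startswith(prefix):
            return name, primer[len(prefix):]
    return None, primer

SPLIT = {name: split_primer(seq) for name, seq in PRIMERS.items()}

for name, (adapter, gene_part) in SPLIT.items():
    print(f"{name}: {adapter} + {gene_part}")
```

Under these assumptions every forward gene-specific portion begins with ATG and every reverse portion begins with TCA, the reverse complement of a TGA stop codon.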
(74) TABLE-US-00003 TABLE 4 Alignment of UGT protein sequences in the region of the Family 1 UGT 44-amino acid Plant Secondary Product Glycosyltransferase (PSPG) motif. Qs_2073886_D6 (Qs-3-O-GalT) shares the histidine residue conserved in UGTs that transfer D-galactose or L-arabinose. Figure adapted from Louveau, Orme [15]. Accession numbers: AsUGT99A6 (AZQ26916), MtUGT73K1 (AAW56091), AtUGT78D2 (NP_197207), GmSSAT (XP_003532274), AtUGT78D3 (NP_197205), AcGaT (BAD06514), GmUGT73P2 (BAI99584).
Group | Enzyme | PSPG motif region | Last residue
GlcT | AsUGT99A6 | WAPQALILSHRAAGAFVTHCGWNSTLEAVAAGLPVVTWPHFTD | Q
GlcT | MtUGT73K1 | WVPQALILDHPSIGGFLTHCGWNATVEAISSGVPMVTMPGFGD | Q
GlcT | AtUGT78D2 | WAPQVELLKHEATGVFVTHCGWNSVLESVSGGVPMICRPFFGD | Q
AraT/GalT | Qs_2073886_D6 | WAPQLLILDHPAIGGLLNHSGWNSVLEGATAGLPMITWPLYAE | H
AraT/GalT | GmSSAT | WVPQGLILKHDAIGGFLTHCGANSVVEAICEGVPLITMPRFGD | H
AraT/GalT | AtUGT78D3 | WAPQVELLNHEAMGVFVSHGGWNSVLESVSAGVPMICRPIFGD | H
AraT/GalT | AcGaT | WAPQIQVLSHDAVGVVITHGGWNSVVESIAAGVPVICRPFFGD | H
AraT/GalT | GmUGT73P2 | WAPQLLILENPAIGGLVTHCGWNTVVESVNAGLPMATWPLFAE | H
(SEQ ID NOs.: 39-46)
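The grouping in Table 4 turns on the final residue of the 44-amino acid PSPG motif (Q in the glucosyltransferases, H in the arabinosyl-/galactosyltransferases). A minimal Python sketch of that rule follows. It uses three motifs copied from Table 4; the function name and the group labels it returns are illustrative assumptions, and a single residue is at best a hint of sugar donor preference rather than a demonstrated predictor.

```python
# PSPG motifs (44 residues, last residue included) copied from Table 4.
PSPG_MOTIFS = {
    "AsUGT99A6": "WAPQALILSHRAAGAFVTHCGWNSTLEAVAAGLPVVTWPHFTDQ",
    "Qs_2073886_D6": "WAPQLLILDHPAIGGLLNHSGWNSVLEGATAGLPMITWPLYAEH",
    "GmUGT73P2": "WAPQLLILENPAIGGLVTHCGWNTVVESVNAGLPMATWPLFAEH",
}

def donor_group_hint(motif):
    """Classify a PSPG motif by its last residue, following the grouping in Table 4."""
    if len(motif) != 44:
        raise ValueError("expected a 44-residue PSPG motif")
    last = motif[-1].upper()
    if last == "Q":
        return "GlcT-like"
    if last == "H":
        return "AraT/GalT-like"
    return "unclassified"

for name, motif in PSPG_MOTIFS.items():
    print(name, donor_group_hint(motif))
```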
(75) TABLE-US-00004 TABLE 5 Glycosyltransferases identified herein (Qs QA-3-O-TriS sequences):
Enzyme | Biological activity | Nucleotide sequence (CDS) SEQ ID NO | AA sequence SEQ ID NO
QsCSL1 | QA-GlcAT | 1 | 2
QsCSLG2 | QA-GlcAT | 25 | 26
QA-GlcAT: Capable of transferring D-glucuronic acid (GlcpA) at the 3-O position of quillaic acid to form 3-{[β-D-glucopyranosiduronic acid]oxy}-quillaic acid (QA-GlcpA).
Qs-3-O-GalT | QA-GalT | 3 | 4
QA-GalT: Capable of transferring D-Galactose (Galp) via a β-1→2 linkage to QA-GlcpA to form 3-{[β-D-galactopyranosyl-(1→2)-β-D-glucopyranosiduronic acid]oxy}-quillaic acid (QA-GlcpA-Galp).
Qs-3-O-RhaT/XylT | QA-RhaT/XylT | 5 | 6
Qs_0283850 | QA-RhaT/XylT | 27 | 28
DN20529_c0_g2_i8 | QA-RhaT/XylT | 29 | 30
Qs_0283870 | QA-RhaT/XylT | 31 | 32
QA-RhaT/XylT: The enzymes are capable of transferring D-Xylose (Xylp) or L-Rhamnose via a 1,3 linkage to QA-GlcpA-Galp to form 3-{[β-D-xylopyranosyl-(1→3)-[β-D-galactopyranosyl-(1→2)]-β-D-glucopyranosiduronic acid]oxy}-quillaic acid (QA-GlcpA-[Galp]-Xylp) and/or 3-{[α-L-rhamnopyranosyl-(1→3)-[β-D-galactopyranosyl-(1→2)]-β-D-glucopyranosiduronic acid]oxy}-quillaic acid (QA-GlcpA-[Galp]-Rhap), respectively.
(76) TABLE-US-00005 TABLE 6 Other GTs which may be used in QA-glycosylation (QA-3-O-TriS sequences)
Enzyme | Activity | Nucleotide sequence (CDS) SEQ ID NO | AA sequence SEQ ID NO
GmUGT73P2 | QA-GalT: a triterpene 3-O-glucuronoside-β-1,2-D-galactosyltransferase | 19 | 20
(77) TABLE-US-00006 TABLE 7 Ancillary activities
Enzyme | Activity | CDS SEQ ID NO | AA sequence SEQ ID NO
AsHMGR | HMG-CoA reductase (HMGR) | 7 | 8
tHMGR | HMG-CoA reductase (HMGR) | 9 | 10
AsSQS (Avena strigosa squalene synthase) | squalene synthase (SQS) | 21 | 22
AtATR2 (Arabidopsis thaliana cytochrome P450 reductase 2) | cytochrome P450 reductase | 23 | 24
(78) TABLE-US-00007 TABLE 8 QA biosynthesis activities
Enzyme (QA polypeptides) | Activity | CDS SEQ ID NO | AA sequence SEQ ID NO
QsbAS (β-amyrin synthase) | cyclisation of 2,3-oxidosqualene (OS) to a triterpene | 11 | 12
QsCYP716-C-28 | enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid | 13 | 14
QsCYP716-C-16 | enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16 position to an alcohol | 15 | 16
QsCYP714-C-23 | enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde | 17 | 18
(79) TABLE-US-00008 TABLE 9 DNA (top right) and protein (bottom left) sequence identity between the gene and protein sequences of the three UGT sequences identified in the Q. saponaria genome, the new sequence DN20529_c0_g2_i8 identified in the de novo transcriptome, and Qs-3-O-RhaT/XylT. "Qs_0283860 region" corresponds to the genomic region of the Qs_0283860 pseudogene from the predicted start codon to the predicted stop codon. Entries above the diagonal are nucleic acid (NA) identity; entries below the diagonal are protein identity.
 | Qs_0283850 | Qs_0283860 region | Qs_0283870 | DN20529_c0_g2_i8 | Qs-3-O-RhaT/XylT
Qs_0283850 | | 97% | 90% | 98% | 92%
Qs_0283860 region | N/A | | 89% | 99% | 92%
Qs_0283870 | 86% | N/A | | 89% | 97%
DN20529_c0_g2_i8 | 98% | N/A | 86% | | 92%
Qs-3-O-RhaT/XylT | 90% | N/A | 96% | 90% |
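For readers unfamiliar with how a percent identity figure of the kind reported in Table 9 is obtained, the following minimal Python sketch counts matching positions between two pre-aligned sequences of equal length, skipping columns that contain a gap. It is illustrative only: the alignment method and software behind Table 9 are not stated here, the function name is hypothetical, and the two short inputs are the Qs_0283870 and Qs_0283850 forward primer sequences from Table 10, not the full-length genes.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity between two aligned sequences, ignoring gapped columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue  # skip alignment columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Forward primer sequences listed in Table 10 (22 nt each, one mismatch).
fwd_0283870 = "ATGGTCTCCGGCGACGACGATG"
fwd_0283850 = "ATGGTCTCCGGCGACGACGACG"
identity = percent_identity(fwd_0283870, fwd_0283850)
print(f"{identity:.1f}% identity over {len(fwd_0283870)} positions")
```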
(80) TABLE-US-00009 TABLE 10 Primers used to clone the four glycosyltransferases. Gene specific sequences are shown in black, while the attB sites required for Gateway cloning are shown in grey.
Name | Sequence | SEQ ID NO.
CslG2_attB1F | ATGGCGACCGTCTCCTCCCT | 47
CslG2_attB2R | TTAGGCCTTTCCCTTGCCTTT | 48
Qs_0283870_attB1F | ATGGTCTCCGGCGACGACGATG | 49
Qs_0283870_attB2R | TCACGATTCATGATCTTGTGCAGCC | 50
Qs_0283850_attB1F | ATGGTCTCCGGCGACGACGACG | 51
Qs_0283850_attB2R | TCATGCAACCTTGCCATTGTTAGCCCT | 52
Qs_0283860_attB1F | ATGGTCTCCGGCGACGACGAC | 53
Qs_0283860_attB2R | TCATGATTTCATTGCAGCCTTGCCA | 54
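A quick consistency check on the Table 10 primers can be sketched in Python: each forward sequence should begin at a start codon, and the reverse complement of each reverse sequence should end at a stop codon, so that the pair brackets a complete coding sequence. The helper names below are hypothetical, and the check says nothing about primer specificity or melting temperature.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]

def brackets_cds(forward, reverse):
    """True if forward starts with ATG and reverse, read on the coding strand, ends with a stop codon."""
    starts_ok = forward.upper().startswith("ATG")
    stops_ok = reverse_complement(reverse)[-3:] in STOP_CODONS
    return starts_ok and stops_ok

# Gene-specific primer sequences as listed in Table 10 (SEQ ID NOs. 47-54).
PRIMER_PAIRS = {
    "CslG2": ("ATGGCGACCGTCTCCTCCCT", "TTAGGCCTTTCCCTTGCCTTT"),
    "Qs_0283870": ("ATGGTCTCCGGCGACGACGATG", "TCACGATTCATGATCTTGTGCAGCC"),
    "Qs_0283850": ("ATGGTCTCCGGCGACGACGACG", "TCATGCAACCTTGCCATTGTTAGCCCT"),
    "Qs_0283860": ("ATGGTCTCCGGCGACGACGAC", "TCATGATTTCATTGCAGCCTTGCCA"),
}

CHECKS = {gene: brackets_cds(fwd, rev) for gene, (fwd, rev) in PRIMER_PAIRS.items()}
for gene, ok in CHECKS.items():
    print(gene, "start/stop codons present" if ok else "check primer ends")
```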
(81) TABLE-US-00010 TABLE 11 Full NMR data for quillaic acid 3-O-β-D-glucopyranosiduronic acid (QsbAS/QsCYP716-C-16/QsCYP714-C-23/QsCYP716-C-28/QsCSL1 product) in MeOH-d4 (600, 150 MHz)
(82) TABLE-US-00011 TABLE 12 Full NMR data for quillaic acid 3-O-β-D-glucopyranosiduronic acid (QsbAS/QsCYP716-C-16/QsCYP714-C-23/QsCYP716-C-28/CslG2 product) in MeOH-d4 (600, 150 MHz)
(83) TABLE-US-00012 TABLE 13 Full NMR data for quillaic acid 3-O-{β-D-galactopyranosyl-(1→2)-β-D-glucopyranosiduronic acid} (QsbAS/QsCYP716-C-16/QsCYP714-C-23/QsCYP716-C-28/QsCSL1/Qs-3-O-GalT product) in MeOH-d4 (600, 150 MHz)
(84) TABLE-US-00013 TABLE 14 1H, 13C NMR spectral data for QA-GlcpA-[Galp]-Rhap (QsbAS/QsCYP716-C-16/QsCYP714-C-23/QsCYP716-C-28/QsCslG2/Qs-3-O-GalT/Qs_0283850 product) in MeOH-d4 (400, 100 MHz)
(85) TABLE-US-00014 TABLE 15 1H, 13C NMR spectral data for QA-GlcpA-[Galp]-Xylp (QsbAS/QsCYP716-C-16/QsCYP714-C-23/QsCYP716-C-28/QsCslG2/Qs-3-O-GalT/Qs_0283870 product) in MeOH-d4 (400, 100 MHz)
(86) TABLE-US-00015 TABLE 3 Family 1 UDP-dependent glycosyltransferases (UGT) used in phylogenetic analysis. UGTs believed to be active on triterpenes are highlighted in bold.
Enzyme | Accession number | UGT family | UGT Group | Plant species | Reported activity | Reference
AtUGT79B1 | Q9LVW3 | UGT79 | A | Arabidopsis thaliana | Anthocyanidin 3-O-glucoside [1,2]-xylosyltransferase | Yonekura-Sakakibara et al. (2012)
AtUGT79B6 | Q9FN26 | UGT79 | A | Arabidopsis thaliana | Flavonol 3-O-galactoside [1,2]-glucosyltransferase | Yonekura-Sakakibara et al. (2014)
Cs1-6RhaT | ABA18631 | UGT79 | A | Citrus sinensis | Flavonoid 7-O/3-O-glucoside [1,6]-rhamnosyltransferase | Frydman et al. (2013)
GmUGT79A6 | BAN91401 | UGT79 | A | Glycine max | Flavonol 3-O-glucoside/galactoside [1,6]-rhamnosyltransferase | Rojas Rodas et al. (2014)
LeABRT2 | BAU68118 | UGT79 | A | Lobelia erinus | Delphinidin 3-O-glucoside [1,6]-rhamnosyltransferase | Hsu et al. (2017)
GmUGT91H4 | BAI99585 | UGT91 | A | | Triterpene 3-O-galactoside [1,2]-rhamnosyltransferase | Shibuya et al. (2010)
GmUGT91H9 | NP_001348424 | UGT91 | A | | Triterpene 3-O-galactoside [1,2]-glucosyltransferase | Yano et al. (2018)
In3GGT | Q53UH4 | UGT91 | A | Ipomoea nil | Anthocyanidin 3-O-glucoside [1,2]-glucosyltransferase | Morita et al. (2005)
GjUGT94E5 | F8WKW8 | UGT94 | A | Gardenia jasminoides | Apocarotenoid glucoside [1,6]-glucosyltransferase | Nagatoshi et al. (2012)
BpUGT94B1 | Q5NTH0 | UGT94 | A | Bellis perennis | Anthocyanidin 3-O-glucoside [1,2]-glucuronosyltransferase | Sawada et al. (2005)
Cm1-2RhaT1 | AAL06646 | UGT94 | A | Citrus maxima | Flavonoid 7-O-glucoside [1,2]-rhamnosyltransferase | Frydman et al. (2013)
PgUGT94Q2 | AGR44632 | UGT94 | A | | Triterpene 3-O-glucoside [1,2]-glucosyltransferase | Jung et al. (2014)
SlGAME18 | XP_004243636 | UGT94 | A | Solanum lycopersicum | Steroidal alkaloid 3-O-glucoside [1,2]-glucosyltransferase | Itkin et al. (2013)
VpUGT94F1 | BAI44133 | UGT94 | A | Veronica persica | Flavonoid 3-O-glucoside [1,2]-glucosyltransferase | Ono et al. (2010)
AtUGT89C1 | AAF80123 | UGT89 | B | Arabidopsis thaliana | Flavonol 7-O-rhamnosyltransferase | Yonekura-Sakakibara et al. (2007)
UGT89A2-Col-0 | Q9LZD8 | UGT89 | B | Arabidopsis thaliana | Dihydroxybenzoic acid xylosyltransferase | Chen and Li (2017)
PoUGT90A7 | ACB56926 | UGT90 | C | Pilosella officinarum | Flavonol glucosyltransferase | Witte et al. (2009)
AcUGT73G1 | AAP88406 | UGT73 | D | Allium cepa | Flavonoid glucosyltransferase | Kramer et al. (2003)
AtUGT73B3 | AAM47999 | UGT73 | D | Arabidopsis thaliana | Flavonoid-7-O-glucosyltransferase | Kim et al. (2006)
AtUGT73C1 | AEC09294 | UGT73 | D | Arabidopsis thaliana | Cytokinin glucosyltransferase 1 | Gandia-Herrero et al. (2008)
AsUGT99D1 | AZQ26921 | UGT99 | D | | Triterpene-3-O-arabinosyltransferase | Louveau et al. (2018)
BvUGT73C10 | AFN26666 | UGT73 | D | | Triterpene-3-O-glucosyltransferase | Augustin et al. (2012)
CbBet5OGT | CAB56231 | UGT73 | D | Cleretum bellidiforme | Betanidin-5-O-glucosyltransferase | Vogt et al. (1999)
CsUGT73A20 | ALO19886 | UGT73 | D | Camellia sinensis | Flavonoid 7-O/3-O-glucosyltransferase | Zhao et al. (2017)
CsUGT73AM3 | KGN59015 | UGT73 | D | | Triterpene-3-O-glucosyltransferase | Zhong et al. (2017)
GmUGT73F2 | BAM29362 | UGT73 | D | Glycine max | Triterpene 22-O-arabinoside [1,3]-glucosyltransferase | Sayama et al. (2012)
GmUGT73F4 | BAM29363 | UGT73 | D | Glycine max | Triterpene 22-O-arabinoside [1,3]-xylosyltransferase | Sayama et al. (2012)
GmUGT73P2 (GmSGT2) | BAI99584 | UGT73 | D | Glycine max | Triterpene 3-O-glucuronide [1,2]-galactosyltransferase | Shibuya et al. (2010)
GuUGAT | ANJ03631 | UGT73 | D | | Triterpene 3-O-glucuronosyltransferase/Triterpene 3-O-glucuronide [1,2]-glucuronosyltransferase | Xu et al. (2016)
MtUGT73F3 | ACT34898 | UGT73 | D | | Triterpene 28-O-glucosyltransferase | Naoumkina et al. (2010)
SlUGT73L4 | ADQ37966 | UGT73 | D | Solanum lycopersicum | Steroidal alkaloid 3-O-glucoside [1,3]-xylosyltransferase | Itkin et al. (2013)
StSGT3 | ABB84472 | UGT73 | D | Solanum tuberosum | Steroidal alkaloid 3-O-glucoside/galactoside [1,2]-rhamnosyltransferase | McCue et al. (2007)
CsUGT707B1 | CCG85331 | UGT707 | E | Crocus sativus | Flavonol 3-O-glucoside [1,2]-glucosyltransferase | Trapero et al. (2012)
AtUGT71B6 | NP_188815 | UGT71 | E | Arabidopsis thaliana | Abscisate β-glucosyltransferase | Priest et al. (2006)
AtUGT71C1 | NP_180536 | UGT71 | E | Arabidopsis thaliana | UDP-glucosyl transferase 71C1 | Lim et al. (2008)
OsUGT707A3 | BAC83989 | UGT71 | E | Oryza sativa | Flavonoid 3-O-glycosyltransferase | Ko et al. (2008)
AtUGT72B1 | Q9M156 | UGT72 | E | Arabidopsis thaliana | UDP-glycosyltransferase 72B1 | Brazier-Hicks et al. (2007)
AtUGT72E2 | AED98252 | UGT72 | E | Arabidopsis thaliana | Hydroxycinnamate 4-β-glucosyltransferase | Lanot et al. (2006)
MtUGT71G1 | AAW56092 | UGT71 | E | | Triterpenoid-O-glucosyltransferase | Achnine et al. (2005)
PgUGTPg1 | AIE12479 | UGT71 | E | | Protopanaxadiol-20-O-glucosyltransferase | Yan et al. (2014)
ScUGT5 | BAJ11653 | UGT88 | E | Sinningia cardinalis | 3-Deoxyanthocyanidin 5-O-glucosyltransferase | Nakatsuka and Nishihara (2010)
AtUGT78D1 | Q9S9P6 | UGT78 | F | Arabidopsis thaliana | Flavonol 3-O-glucosyltransferase | Jones et al. (2003)
Fh3GT1 | ADK75021 | UGT78 | F | Freesia hybrid cultivar | Anthocyanidin 3-O-glucosyltransferase | Sun et al. (2016)
VmUF3GaT | BAA36972 | UGT78 | F | Vigna mungo | Flavonoid 3-O-galactosyltransferase | Mato et al. (1998)
VvGT1 | AAB81683 | UGT78 | F | Vitis vinifera | Anthocyanidin 3-O-glucosyltransferase | Ford et al. (1998)
AtUGT85A1 | AAF18537 | UGT85 | G | Arabidopsis thaliana | Cytokinin-O-glucosyltransferase 2 | Hou et al. (2004)
PdUGT85A19 | ABV68925 | UGT85 | G | Prunus dulcis | Cyanohydrin glucoside [1,6]-glucosyltransferase | Franks et al. (2008)
SbUGT85B1 | AAF17077 | UGT85 | G | Sorghum bicolor | Cyanohydrin glycosyltransferase UGT85B1 | Hansen et al. (2003)
AtUGT76D1 | AEC07843 | UGT76 | H | Arabidopsis thaliana | Flavonoid-7-O-glucosyltransferase | Lim et al. (2004)
SrUGT76G1 | AAR06912 | UGT76 | H | Stevia rebaudiana | Diterpenoid 13-O-glucoside [1,3]-glucosyltransferase | Richman et al. (2005)
AtUGT83A1 | Q9SGA8 | UGT83 | I | Arabidopsis thaliana | Unknown | Ross et al. (2001)
AtUGT87A1 | O64732 | UGT87 | J | Arabidopsis thaliana | Unknown | Ross et al. (2001)
AtUGT87A2 | NP_001077979 | UGT87 | J | Arabidopsis thaliana | Unknown | Wang et al. (2012)
AtUGT86A1 | Q9SJL0 | UGT86 | K | Arabidopsis thaliana | Unknown | Ross et al. (2001)
AtUGT74E2 | NP_172059 | UGT74 | L | Arabidopsis thaliana | Auxin (IBA) glycosyltransferase | Tognetti et al. (2010)
AsUGT74H5 | ACD03250 | UGT74 | L | Avena strigosa | N-Methylanthranilate O-glucosyltransferase | Owatworakit et al. (2013)
PgUGT74A1 | AGR44631 | UGT74 | L | | Triterpene-3-O-glucosyltransferase | Jung et al. (2014)
SgUGT74AC1 | AEM42999 | UGT74 | L | | Triterpene (PPD)-3-O-glucosyltransferase | Dai et al. (2015)
VhUGT74M1 | ABK76266 | UGT74 | L | | Triterpene carboxylic acid 28-O-glucosyltransferase | Meesapyodsuk et al. (2007)
ZmIAGT | AAA59054 | UGT74 | L | Zea mays | Auxin glucosyltransferase | Szerszen et al. (1994)
AtUGT75C1 | Q0WW21 | UGT75 | L | Arabidopsis thaliana | Anthocyanin 5-O-glucosyltransferase | Yamazaki et al. (1999)
GjUGT75L6 | F8WKW0 | UGT75 | L | Gardenia jasminoides | Apocarotenoid glucosyltransferase | Nagatoshi et al. (2012)
Via5GT | AHL68667 | UGT75 | L | Vitis amurensis Rupr. cv. Zuoshanyi | Anthocyanin 5-O-glucosyltransferase | He et al. (2015)
AtUGT84A1 | Q5XF20 | UGT84 | L | Arabidopsis thaliana | Hydroxycinnamate glucosyltransferase 2 | Milkowski et al. (2000)
GtUF6CGT1 | BAQ19550 | UGT84 | L | Gentiana triflora | Flavonoid 6-C-glucosyltransferase | Sasaki et al. (2015)
CuLGT | BAA93039 | UGT84 | L | | Triterpene (limonoid)-17-O-glucosyltransferase | Kita et al. (2000)
AtUGT92A1 | Q9LXV0 | UGT92 | M | Arabidopsis thaliana | Unknown | Ross et al. (2001)
CcDOPA5GT | BAD91804 | UGT92 | M | Celosia cristata | Cyclo-DOPA 5-O-glucosyltransferase | Sasaki et al. (2005)
MjcDOPA5GT | BAD91803 | UGT92 | M | Mirabilis jalapa | Cyclo-DOPA 5-O-glucosyltransferase | Sasaki et al. (2005)
AtUGT82A1 | Q9LHJ2 | UGT82 | N | Arabidopsis thaliana | Unknown | Ross et al. (2001)
SlGAME17 | XP_004243637 | UGT93 | O | Solanum lycopersicum | Steroidal alkaloid 3-O-galactoside [1,4]-glucosyltransferase | Itkin et al. (2013)
ZmcisZog1 | AAK53551 | UGT93 | O | Zea mays | cis-zeatin O-glucosyltransferase | Martin et al. (2001)
OsUGT709A4 | BAC80066 | UGT709A4 | P | Oryza sativa | Isoflavonoid-7-O-glucosyltransferase | Ko et al. (2008)
References for Table 3
(87) Achnine, L., Huhman, D. V., Farag, M. A., Sumner, L. W., Blount, J. W., and Dixon, R. A. (2005). Genomics-based selection and functional characterization of triterpene glycosyltransferases from the model legume Medicago truncatula. Plant J., 41:875-87. Augustin, J. M., Drok, S., Shinoda, T., Sanmiya, K., Nielsen, J. K., Khakimov, B., Olsen, C. E., Hansen, E. H., Kuzina, V., Ekstrom, C. T., Hauser, T., and Bak, S. (2012). UDP-glycosyltransferases from the UGT73C subfamily in Barbarea vulgaris catalyze sapogenin 3-O-glucosylation in saponin-mediated insect resistance. Plant Physiol., 160:1881-95. Brazier-Hicks, M., Offen, W. A., Gershater, M. C., Revett, T. J., Lim, E.-K., Bowles, D. J., Davies, G. J., and Edwards, R. (2007). Characterization and engineering of the bifunctional N- and O-glucosyltransferase involved in xenobiotic metabolism in plants. Proc. Natl. Acad. Sci. U.S.A., 104:20238-43. Chen, H.-Y. and Li, X. (2017). Identification of a residue responsible for UDP-sugar donor selectivity of a dihydroxybenzoic acid glycosyltransferase from Arabidopsis natural accessions. Plant J., 89:195-203. Dai, L., Liu, C., Zhu, Y., Zhang, J., Men, Y., Zeng, Y., and Sun, Y. (2015). Functional characterization of cucurbitadienol synthase and triterpene glycosyltransferase involved in biosynthesis of mogrosides from Siraitia grosvenorii. Plant Cell Physiol., 56:1172-82. Ford, C. M., Boss, P. K., and Hoj, P. B. (1998). Cloning and characterization of Vitis vinifera UDP-glucose:flavonoid 3-O-glucosyltransferase, a homologue of the enzyme encoded by the maize Bronze-1 locus that may primarily serve to glucosylate anthocyanidins in vivo. J. Biol. Chem., 273:9224-33. Franks, T. K., Yadollahi, A., Wirthensohn, M. G., Guerin, J. R., Kaiser, B. N., Sedgley, M., and Ford, C. M. (2008). A seed coat cyanohydrin glucosyltransferase is associated with bitterness in almond (Prunus dulcis) kernels. Funct. Plant Biol., 35(3):236-246. Frydman, A., Liberman, R., Huhman, D. 
V., Carmeli-Weissberg, M., Sapir-Mir, M., Ophir, R., Sumner, L. W., and Eyal, Y. (2013). The molecular and enzymatic basis of bitter/non-bitter flavor of citrus fruit: evolution of branch-forming rhamnosyltransferases under domestication. Plant J., 73:166-78. Gandia-Herrero, F., Lorenz, A., Larson, T., Graham, I. A., Bowles, D. J., Rylott, E. L., and Bruce, N. C. (2008). Detoxification of the explosive 2,4,6-trinitrotoluene in Arabidopsis: discovery of bifunctional O- and C-glucosyltransferases. Plant J., 56:963-74. Hansen, K. S., Kristensen, C., Tattersall, D. B., Jones, P. R., Olsen, C. E., Bak, S., and Moller, B. L. (2003). The in vitro substrate regiospecificity of recombinant UGT85B1, the cyanohydrin glucosyltransferase from Sorghum bicolor. Phytochemistry, 64:143-51. He, F., Chen, W.-K., Yu, K.-J., Ji, X.-N., Duan, C.-Q., Reeves, M. J., and Wang, J. (2015). Molecular and biochemical characterization of the UDP-glucose: Anthocyanin 5-O-glucosyltransferase from Vitis amurensis. Phytochemistry, 117:363-72. Hou, B., Lim, E.-K., Higgins, G. S., and Bowles, D. J. (2004). N-glucosylation of cytokinins by glycosyltransferases of Arabidopsis thaliana. J. Biol. Chem., 279:47822-32. Hsu, Y.-H., Tagami, T., Matsunaga, K., Okuyama, M., Suzuki, T., Noda, N., Suzuki, M., and Shimura, H. (2017). Functional characterization of UDP-rhamnose-dependent rhamnosyltransferase involved in anthocyanin modification, a key enzyme determining blue coloration in Lobelia erinus. Plant J., 89:325-337. Itkin, M., Heinig, U., Tzfadia, O., Bhide, A. J., Shinde, B., Cardenas, P. D., Bocobza, S. E., Unger, T., Malitsky, S., Finkers, R., Tikunov, Y., Bovy, A., Chikate, Y., Singh, P., Rogachev, I., Beekwilder, J., Giri, A. P., and Aharoni, A. (2013). Biosynthesis of antinutritional alkaloids in solanaceous crops is mediated by clustered genes. Science, 341:175-9. Jones, P., Messner, B., Nakajima, J.-I., Schaffner, A. R., and Saito, K. (2003).
UGT73C6 and UGT78D1, glycosyltransferases involved in flavonol glycoside biosynthesis in Arabidopsis thaliana. J. Biol. Chem., 278:43910-8. Jung, S.-C., Kim, W., Park, S. C., Jeong, J., Park, M. K., Lim, S., Lee, Y., Im, W.-T., Lee, J. H., Choi, G., and Kim, S. C. (2014). Two ginseng UDP-glycosyltransferases synthesize ginsenoside Rg3 and Rd. Plant Cell Physiol., 55:2177-88. Kim, J. H., Kim, B. G., Park, Y., Ko, J. H., Lim, C. E., Lim, J., Lim, Y., and Ahn, J.-H. (2006). Characterization of flavonoid 7-O-glucosyltransferase from Arabidopsis thaliana. Biosci. Biotechnol. Biochem., 70:1471-7. Kita, M., Hirata, Y., Moriguchi, T., Endo-Inagaki, T., Matsumoto, R., Hasegawa, S., Suhayda, C. G., and Omura, M. (2000). Molecular cloning and characterization of a novel gene encoding limonoid UDP-glucosyltransferase in Citrus. FEBS Lett., 469:173-8. Ko, J. H., Kim, B. G., Kim, J. H., Kim, H., Lim, C. E., Lim, J., Lee, C., Lim, Y., and Ahn, J.-H. (2008). Four glucosyltransferases from rice: cDNA cloning, expression, and characterization. J. Plant Physiol., 165:435-44. Kramer, C. M., Prata, R. T. N., Willits, M. G., De Luca, V., Steffens, J. C., and Graser, G. (2003). Cloning and regiospecificity studies of two flavonoid glucosyltransferases from Allium cepa. Phytochemistry, 64:1069-76. Lanot, A., Hodge, D., Jackson, R. G., George, G. L., Elias, L., Lim, E.-K., Vaistij, F. E., and Bowles, D. J. (2006). The glucosyltransferase UGT72E2 is responsible for monolignol 4-O-glucoside production in Arabidopsis thaliana. Plant J., 48:286-95. Lim, C. E., Choi, J. N., Kim, I. A., Lee, S. A., Hwang, Y.-S., Lee, C. H., and Lim, J. (2008). Improved resistance to oxidative stress by a loss-of-function mutation in the Arabidopsis UGT71C1 gene. Mol. Cells, 25:368-75. Lim, E.-K., Ashford, D. A., Hou, B., Jackson, R. G., and Bowles, D. J. (2004). Arabidopsis glycosyltransferases as biocatalysts in fermentation for regioselective synthesis of diverse quercetin glucosides. Biotechnol. 
Bioeng., 87:623-31. Louveau, T., Orme, A., Pfalzgraf, H., Stephenson, M. J., Melton, R., Saalbach, G., Hemmings, A. M., Leveau, A., Rejzek, M., Vickerstaff, R. J., Langdon, T., Field, R. A., and Osbourn, A. (2018). Analysis of two new arabinosyltransferases belonging to the carbohydrate-active enzyme (CAZY) glycosyl transferase family 1 provides insights into disease resistance and sugar donor specificity. Plant Cell, 30(12):3038-3057. Martin, R. C., Mok, M. C., Habben, J. E., and Mok, D. W. (2001). A maize cytokinin gene encoding an O-glucosyltransferase specific to cis-zeatin. Proc. Natl. Acad. Sci. U.S.A., 98:5922-6. Mato, M., Ozeki, Y., Itoh, Y., Higeta, D., Yoshitama, K., Teramoto, S., Aida, R., Ishikura, N., and Shibata, M. (1998). Isolation and characterization of a cDNA clone of UDP-galactose: flavonoid 3-O-galactosyltransferase (UF3GaT) expressed in Vigna mungo seedlings. Plant Cell Physiol., 39:1145-55. McCue, K. F., Allen, P. V., Shepherd, L. V. T., Blake, A., Maccree, M. M., Rockhold, D. R., Novy, R. G., Stewart, D., Davies, H. V., and Belknap, W. R. (2007). Potato glycosterol rhamnosyltransferase, the terminal step in triose side-chain biosynthesis. Phytochemistry, 68:327-34. Meesapyodsuk, D., Balsevich, J., Reed, D. W., and Covello, P. S. (2007). Saponin biosynthesis in Saponaria vaccaria. cDNAs encoding β-amyrin synthase and a triterpene carboxylic acid glucosyltransferase. Plant Physiol., 143:959-69. Milkowski, C., Baumert, A., and Strack, D. (2000). Identification of four Arabidopsis genes encoding hydroxycinnamate glucosyltransferases. FEBS Lett., 486:183-4. Morita, Y., Hoshino, A., Kikuchi, Y., Okuhara, H., Ono, E., Tanaka, Y., Fukui, Y., Saito, N., Nitasaka, E., Noguchi, H., and Iida, S. (2005).
Japanese morning glory dusky mutants displaying reddish-brown or purplish-gray flowers are deficient in a novel glycosylation enzyme for anthocyanin biosynthesis, UDP-glucose:anthocyanidin 3-O-glucoside-2-O-glucosyltransferase, due to 4-bp insertions in the gene. Plant J., 42:353-63. Nagatoshi, M., Terasaka, K., Owaki, M., Sota, M., Inukai, T., Nagatsu, A., and Mizukami, H. (2012). UGT75L6 and UGT94E5 mediate sequential glucosylation of crocetin to crocin in Gardenia jasminoides. FEBS Lett., 586:1055-61. Nakatsuka, T. and Nishihara, M. (2010). UDP-glucose:3-deoxyanthocyanidin 5-O-glucosyltransferase from Sinningia cardinalis. Planta, 232:383-92. Naoumkina, M. A., Modolo, L. V., Huhman, D. V., Urbanczyk-Wochniak, E., Tang, Y., Sumner, L. W., and Dixon, R. A. (2010). Genomic and coexpression analyses predict multiple genes involved in triterpene saponin biosynthesis in Medicago truncatula. Plant Cell, 22:850-66. Ono, E., Ruike, M., Iwashita, T., Nomoto, K., and Fukui, Y. (2010). Co-pigmentation and flavonoid glycosyltransferases in blue Veronica persica flowers. Phytochemistry, 71:726-35. Owatworakit, A., Townsend, B., Louveau, T., Jenner, H., Rejzek, M., Hughes, R. K., Saalbach, G., Qi, X., Bakht, S., Roy, A. D., Mugford, S. T., Goss, R. J. M., Field, R. A., and Osbourn, A. (2013). Glycosyltransferases from oat (Avena) implicated in the acylation of avenacins. J. Biol. Chem., 288(6):3696-3704. Priest, D. M., Ambrose, S. J., Vaistij, F. E., Elias, L., Higgins, G. S., Ross, A. R. S., Abrams, S. R., and Bowles, D. J. (2006). Use of the glucosyltransferase UGT71B6 to disturb abscisic acid homeostasis in Arabidopsis thaliana. Plant J., 46:492-502. Richman, A., Swanson, A., Humphrey, T., Chapman, R., McGarvey, B., Pocs, R., and Brandle, J. (2005). Functional genomics uncovers three glucosyltransferases involved in the synthesis of the major sweet glucosides of Stevia rebaudiana. Plant J., 41:56-67. Rojas Rodas, F., Rodriguez, T.
O., Murai, Y., Iwashina, T., Sugawara, S., Suzuki, M., Nakabayashi, R., Yonekura-Sakakibara, K., Saito, K., Kitajima, J., Toda, K., and Takahashi, R. (2014). Linkage mapping, molecular cloning and functional analysis of soybean gene Fg2 encoding flavonol 3-O-glucoside (1->6) rhamnosyltransferase. Plant Mol. Biol., 84:287-300. Ross, J., Li, Y., Lim, E., and Bowles, D. J. (2001). Higher plant glycosyltransferases. Genome Biol., 2: REVIEWS3004. Sasaki, N., Nishizaki, Y., Yamada, E., Tatsuzawa, F., Nakatsuka, T., Takahashi, H. and Nishihara, M. (2015). Identification of the glucosyltransferase that mediates direct flavone C-glucosylation in Gentiana triflora. FEBS Lett., 589:182-187. Sasaki, N., Wada, K., Koda, T., Kasahara, K., Adachi, T., and Ozeki, Y. (2005). Isolation and characterization of cDNAs encoding an enzyme with glucosyltransferase activity for cyclo-DOPA from four o'clocks and feather cockscombs. Plant Cell Physiol., 46:666-70. Sawada, S., Suzuki, H., Ichimaida, F., Yamaguchi, M.-A., Iwashita, T., Fukui, Y., Hemmi, H., Nishino, T., and Nakayama, T. (2005). UDP-glucuronic acid:anthocyanin glucuronosyltransferase from red daisy (Bellis perennis) flowers. Enzymology and phylogenetics of a novel glucuronosyltransferase involved in flower pigment biosynthesis. J. Biol. Chem., 280:899-906. Sayama, T., Ono, E., Takagi, K., Takada, Y., Horikawa, M., Nakamoto, Y., Hirose, A., Sasama, H., Ohashi, M., Hasegawa, H., Terakawa, T., Kikuchi, A., Kato, S., Tatsuzaki, N., Tsukamoto, C., and Ishimoto, M. (2012). The Sg-1 glycosyltransferase locus regulates structural diversity of triterpenoid saponins of soybean. Plant Cell, 24:2123-38. Shibuya, M., Nishimura, K., Yasuyama, N., and Ebizuka, Y. (2010). Identification and characterization of glycosyltransferases involved in the biosynthesis of soyasaponin I in Glycine max. FEBS Lett., 584:2258-64. Sun, W., Liang, L., Meng, X., Li, Y., Gao, F., Liu, X., Wang, S., Gao, X., and Wang, L. (2016).
Biochemical and molecular characterization of a flavonoid 3-O-glycosyltransferase responsible for anthocyanins and flavonols biosynthesis in Freesia hybrida. Front. Plant Sci., 7:410. Szerszen, J. B., Szczyglowski, K., and Bandurski, R. S. (1994). iaglu, a gene from Zea mays involved in conjugation of growth hormone indole-3-acetic acid. Science, 265:1699-701. Tognetti, V. B., Van Aken, O., Morreel, K., Vandenbroucke, K., van de Cotte, B., De Clercq, I., Chiwocha, S., Fenske, R., Prinsen, E., Boerjan, W., Genty, B., Stubbs, K. A., Inze, D., and Van Breusegem, F. (2010). Perturbation of indole-3-butyric acid homeostasis by the UDP-glucosyltransferase UGT74E2 modulates Arabidopsis architecture and water stress tolerance. Plant Cell, 22:2660-79. Trapero, A., Ahrazem, O., Rubio-Moraga, A., Jimeno, M. L., Gomez, M. D., and Gomez-Gomez, L. (2012). Characterization of a glucosyltransferase enzyme involved in the formation of kaempferol and quercetin sophorosides in Crocus sativus. Plant Physiol., 159:1335-54. Vogt, T., Grimm, R., and Strack, D. (1999). Cloning and expression of a cDNA encoding betanidin 5-O-glucosyltransferase, a betanidin- and flavonoid-specific enzyme with high homology to inducible glucosyltransferases from the Solanaceae. Plant J., 19:509-19. Wang, B., Jin, S.-H., Hu, H.-Q., Sun, Y.-G., Wang, Y.-W., Han, P., and Hou, B.-K. (2012). UGT87A2, an Arabidopsis glycosyltransferase, regulates flowering time via FLOWERING LOCUS C. New Phytol., 194:666-75. Witte, S., Moco, S., Vervoort, J., Matern, U., and Martens, S. (2009). Recombinant expression and functional characterisation of regiospecific flavonoid glucosyltransferases from Hieracium pilosella L. Planta, 229:1135-46. Xu, G., Cai, W., Gao, W., and Liu, C. (2016). A novel glucuronosyltransferase has an unprecedented ability to catalyse continuous two-step glucuronosylation of glycyrrhetinic acid to yield glycyrrhizin. New Phytol., 212:123-35. 
Yamazaki, M., Gong, Z., Fukuchi-Mizutani, M., Fukui, Y., Tanaka, Y., Kusumi, T., and Saito, K. (1999). Molecular cloning and biochemical characterization of a novel anthocyanin 5-O-glucosyltransferase by mRNA differential display for plant forms regarding anthocyanin. J. Biol. Chem., 274:7405-11. Yan, X., Fan, Y., Wei, W., Wang, P., Liu, Q., Wei, Y., Zhang, L., Zhao, G., Yue, J., and Zhou, Z. (2014). Production of bioactive ginsenoside compound K in metabolically engineered yeast. Cell Res., 24:770-3. Yano, R., Takagi, K., Tochigi, S., Fujisawa, Y., Nomura, Y., Tsuchinaga, H., Takahashi, Y., Takada, Y., Kaga, A., Anai, T., Tsukamoto, C., Seki, H., Muranaka, T., and Ishimoto, M. (2018). Isolation and characterization of the soybean Sg-3 gene that is involved in genetic variation in sugar chain composition at the C-3 position in soyasaponins. Plant Cell Physiol., 59:792-805. Yonekura-Sakakibara, K., Fukushima, A., Nakabayashi, R., Hanada, K., Matsuda, F., Sugawara, S., Inoue, E., Kuromori, T., Ito, T., Shinozaki, K., Wangwattana, B., Yamazaki, M., and Saito, K. (2012). Two glycosyltransferases involved in anthocyanin modification delineated by transcriptome independent component analysis in Arabidopsis thaliana. Plant J., 69:154-67. Yonekura-Sakakibara, K., Nakabayashi, R., Sugawara, S., Tohge, T., Ito, T., Koyanagi, M., Kitajima, M., Takayama, H., and Saito, K. (2014). A flavonoid 3-O-glucoside:2-O-glucosyltransferase responsible for terminal modification of pollen-specific flavonols in Arabidopsis thaliana. Plant J., 79:769-82. Yonekura-Sakakibara, K., Tohge, T., Niida, R., and Saito, K. (2007). Identification of a flavonol 7-O-rhamnosyltransferase gene determining flavonoid pattern in Arabidopsis by transcriptome coexpression analysis and reverse genetics. J. Biol. Chem., 282:14932-41. Zhong, Y., Xue, X., Liu, Z., Ma, Y., Zeng, K., Han, L., Qi, J., Ro, D.-K., Bak, S., Huang, S., Zhou, Y., and Shang, Y. (2017). 
Developmentally regulated glucosylation of bitter triterpenoid in cucumber by the UDP-glucosyltransferase UGT73AM3. Mol. Plant, 10:1000-1003. Zhao, X., Wang, P., Li, M., Jiang, X., Cui, L., Qian, Y., Zhuang, J., Gao, L., and Xia, T. (2017). Functional characterization of a new tea (Camellia sinensis) flavonoid glycosyltransferase. J. Agric. Food Chem., 65:2074-2083.
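In the sequence listing, each coding sequence is paired with a translated sequence, and the stated lengths are consistent with one stop codon per coding sequence (for example, 1212 bp and 403 aa for AsSQS). The following is a minimal Python sketch of that relationship, assuming the standard genetic code; the names `translate` and `expected_cds_length` are hypothetical helpers introduced here for illustration and are not part of the disclosure.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    seq = "".join(cds.split()).upper()  # drop spaces between printed blocks
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def expected_cds_length(n_residues: int) -> int:
    """Length in bp of a coding sequence for n_residues plus one stop codon."""
    return 3 * (n_residues + 1)


# First 48 nucleotides of SEQ ID NO: 21 (AsSQS), copied from the listing.
prefix = "ATGGGGGCGCTGTCGCGGCCGGAGGAGGTGGTGGCGCTGGTCAAGCTG"
print(translate(prefix))         # compare with the start of SEQ ID NO: 22
print(expected_cds_length(403))  # compare with the stated 1212 bp
```

The same length check can be applied to any coding/translated pair in the listing, and to the long and short isoforms of the C-16 oxidase, which differ by 63 nucleotides (21 amino acids).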
(88) TABLE-US-00016 Sequences
SEQ ID NO: 1 - Q. saponaria quillaic acid 3-O-glucuronosyltransferase (cellulose synthase-like enzyme QsCSL1) coding sequence (2142 bp):
ATGAAATCCCCCTCTAACCCAAATCAGAAACCCATCCTCCACACTTGTACAATTCAGCAGCCTCGT GCTACCCTTAACAAAATTCATAGTCTTATTCATTTCTCAGCCATACTTGTCCTATTTTATTACCGG ATAACCCGTCTATTCTTCACCGACGATTTCAAGGTACCCAAGTTACTATGGACTCTAATGACAATC TCCGAGTTCATTCTTGCCTTCATTTGGGTTCTCATCCAACCTTTCCGGTGGCGACCGGTGTCCCGT TCCGTCATACCAGAGAATATGCCGAAGGACATCAGTTTGCCGGCGGTGGACGTGTTTGTATGCACA GCTGACCCTCAAAAAGAACCCACAGTGGAGGTGATGAACACAATTTTATCAGCCATGGCTTTAGAC TACCCGGCGGAGAAGCTCGCCGTGTATCTTTCCGATGATGGGGGTTCTGCTGTCACCTTATATGCT ATAAAAGAAGCTTGTTGTTTTGCTAAGATGTGGCTTCCGTTTTGTAACAAGTATGGGATCAAATCA AGGTGTCCCGAGGCTTATTTTTCAAAGCTTGCCGCTGACGAGTGGCTTCACCGGAGTGTGGAATTC GTGGCAGAAGAAAAGGAGGTCAAGGCTAATTATGAAGAGTTCAAGAGAAATGTGCAGAAATTTGGT GAGCAACAAGAAAACAGTCGTGTTGTGCATGATCGTCACCCTCATGTTGAGATTATACACAATAAT TGGAATAACGAAGACCAAGCTCATGAGATGCCACTCCTTGTTTATGTCTCTCGTGAAAGAAGACCA TCTCACCATCCTCGATTCAAAGCTGGAGCTCTTAACACCCTTCTTCGAGTTTCTGGCATCATCAGC AACAGCCCCTACATACTGGTTCTAGACTGTGACATGTACTGCAATGACCCAACCTCAGCTAGACAA GCAATGTGCTTCCATCTTGATCCCCAACTGTCTAAAAATCTTGCTTTTGTACAATTCCCTCAAATA TTCTATAACGCTAGTAAGAATGACGTCTATGATGCCCAAGTCAGGGCGGCATACCAGACAAAGTGG CAGGGTATGGATGGACTTCAAGGACCAATTTTTTCTGGCACTGGCTTTTACTTAAAGAGGAAGGCA ATGTATGGAAACCCTGATCAAGATGATAATTGTCTACTCAAGCCATATAAGAAATTTGGCATGTCT GGAGAATTTGTAGAATCACTTAAGGTCCTTAACGAACAAGATGGTACCCAGAAGAAATTATTGGAT GGATTTTTACAAGAGGCCAAACTATTGGCCTCGTGTGCCTATGAAACAAAGACAAGTTGGGGTAAA GAGATTGGATTCTCATATGACTGTTTAATAGAGAGCACTTTCACTGGTTATCTTTTGCACTGCAGA GGGTGGATATCTGTTTATCTTTATCCCAAGAGACCATGTTTTTTAGGATGCTGTCCTACTGATATG AAGGATGCCATGGTTCAATATACCAAGTGGATGTCTGAGCTATTTTCAATTGCTATCTCAAGATTC AATCCTCTGCTCTATGGGGTGGCAAGAATGTCCATTCTTCAAAGCCTGTGTTATGGATCCTTTACA CTGGCGCCTATTTTGTCATTTCCTTTGTTCTTATATGGAACGGTTCCTCAATTATGCCTCTTGAAA GGCATATCTTTGTTTCCAAAGGTTTCGGACCCATGGTTTGCTGTGTTTGCAGCTATCTTTGTATCC TCCCTGTGTCAACACTGGTTCGAGGTCCTCTCTTGTGATGGTACATTTACGACTTGGTGTAATGAA 
CAGCGGAGTTGGCTTATAAAGTCGGTTTCCGGTAGTTTGTTTGGAGTTGTGGGCGCAATCTTGCAG CGGCTAGGCTTGAAGACAAAGTTTAGTTTATCAAACAAAGCCATGGACAAAGAAAAGCTGGAGAAA TATGAAAAGGGTAAATTTAATTTCCAAGGGGCTGCCATGTTCATGGTTCCTGTGTCTATTTTAGTC ATACTGAACACATTTTGCTTCCTCGGTGGGTTTTGGAAAGTGATCATAATGAAGAATATCCTGGAC ATGTTTGGACAACTTTCTCTCTCTGCCTACGTTCTGGTTCTCAGTTGTCCAGTTCTTGAAGGGATG TTAACTAGAATCAGCAAGAAAATGGTCTGA
SEQ ID NO: 2 - Q. saponaria quillaic acid 3-O-glucuronosyltransferase (cellulose synthase-like enzyme QsCSL1) translated nucleotide sequence (713 aa):
MKSPSNPNQKPILHTCTIQQPRATLNKIHSLIHFSAILVLFYYRITRLFFTDDFKVPKLLWTLMTI SEFILAFIWVLIQPFRWRPVSRSVIPENMPKDISLPAVDVFVCTADPQKEPTVEVMNTILSAMALD YPAEKLAVYLSDDGGSAVTLYAIKEACCFAKMWLPFCNKYGIKSRCPEAYFSKLAADEWLHRSVEF VAEEKEVKANYEEFKRNVQKFGEQQENSRVVHDRHPHVEIIHNNWNNEDQAHEMPLLVYVSRERRP SHHPRFKAGALNTLLRVSGIISNSPYILVLDCDMYCNDPTSARQAMCFHLDPQLSKNLAFVQFPQI FYNASKNDVYDAQVRAAYQTKWQGMDGLQGPIFSGTGFYLKRKAMYGNPDQDDNCLLKPYKKFGMS GEFVESLKVLNEQDGTQKKLLDGFLQEAKLLASCAYETKTSWGKEIGFSYDCLIESTFTGYLLHCR GWISVYLYPKRPCFLGCCPTDMKDAMVQYTKWMSELFSIAISRFNPLLYGVARMSILQSLCYGSFT LAPILSFPLFLYGTVPQLCLLKGISLFPKVSDPWFAVFAAIFVSSLCQHWFEVLSCDGTFTTWCNE QRSWLIKSVSGSLFGVVGAILQRLGLKTKFSLSNKAMDKEKLEKYEKGKFNFQGAAMFMVPVSILV ILNTFCFLGGFWKVIIMKNILDMFGQLSLSAYVLVLSCPVLEGMLTRISKKMV*
SEQ ID NO: 3 - Q. saponaria QA-GlcpA β-1,2-D-galactosyltransferase (Qs-3-O-GalT) coding sequence (1479 bp):
ATGGTGGAGTCTCCAGCAGATCATGATGTGCTCAAAATCATTGTCCTTCCATGGGTAACCTCAGGT CACATGATTCCCATGGTAGATGCAGCCAGACTATTTGCTATGCATGGTGCAGATGTTACCATCATC ACCACCCCAGCTAATGCCCTTACATTCCAGAAATCCGTCGACCGTGATTTCAATTCCGGTCGTTTA ATCAGAACTCACACCCTTAAATTCCCTGCAGCAGAAGTTGGTGTACCTGAAGGAGTTGAAAACTTC AACAATACTTCCCCTGAAATGACCTCCAAAGTCTACCTTGGAGTCTCAATGCTCCGAGAACCAACC CAACAATTGATTGAGGATCTGCGTCCAGATTGTCTTATCACTGATATGTTCTATCCTTGGGCTGTG GATGTTGCTGACAAATTAGGCATTCCAAGGCTAATTTTTCAAGGTCCTGGAAGTTTTGGTTTGTCA GCTATGCATTCTATCAAACAGTATGAGCCCTTTAAGTCAGTAACTTCAGATACTGAGACATTCCCA CTACCTGGATTGCCGCATAAGGTAGAGATGACAAGGTTGCAGATACCAAAATGGGTTCGTGAGCCA AATGGGTACACTCAATTGATGGGCAGGGTAAAAGATTCGGAGAGAAGAAGCTATGGGTCATTGGTG 
AATAGCTTTTATGACTTCGAAGGCCCTTATGAAGAGCACTATAGGAAGGCAACAGGACAGAGGGTT TGGAGCATTGGACCAGTTTCAGTTTGGGTGAACCAAGATGCTGCAGATAAGGTTGGAAGAGGACAG GATCTTGTTGCTGAAGACCAAAACAGCTGGTTGAATTGGCTCAATTCCAAAGAGAAAAACTCTGTT CTGTATGTAAGTTTTGGGAGCATGGCCAAGTTCCCATCTGCTCAGCTTCTTGAAATAGCTCATGGG CTTGAAGCTTCAGGTCATAGTTTCATCTGGGTTGTCAGAAAAGTTGACGGGGATGATGATGTAGAC GTGTGGCTTCCAGATTTTGAGAAGAAAATGAAAGAGAACAACAAGGGTTTCATCATAAGGAATTGG GCACCACAATTGCTCATATTGGACCATCCAGCAATTGGAGGTTTGCTGAATCACAGTGGATGGAAT TCAGTACTGGAAGGTGCTACAGCAGGCTTGCCAATGATCACTTGGCCTCTGTATGCCGAGCATTTT TACAATGAAAGGTTGGTTCTAGATGTGTTGAAAATTGGAGTACCAGTTGGGGTGAAGGAGTGGAAG AACTTGCATGAGGTGGGTGAGTTGGTGAGAAGGGATGCAATTGCCAAGGCAATTAAATTGTTAATG GGTAGTGGAGAAGAAGCTGAGGTAATGAGGAAAAAAGCCAAAGAGCTTGGTGTTGGAGCAAAGAAA GGTATTCAGGTTGGAGGTTCTTCTCATACCAATTTGATAGCAGTGATTGATGAGTTAAAGTCACTA AAGAAATCAAGAATTCAGGGTGTCTGA
SEQ ID NO: 4 - Q. saponaria QA-GlcpA β-1,2-D-galactosyltransferase (Qs-3-O-GalT) translated nucleotide sequence (492 aa):
MVESPADHDVLKIIVLPWVTSGHMIPMVDAARLFAMHGADVTIITTPANALTFQKSVDRDFNSGRL IRTHTLKFPAAEVGVPEGVENFNNTSPEMTSKVYLGVSMLREPTQQLIEDLRPDCLITDMFYPWAV DVADKLGIPRLIFQGPGSFGLSAMHSIKQYEPFKSVTSDTETFPLPGLPHKVEMTRLQIPKWVREP NGYTQLMGRVKDSERRSYGSLVNSFYDFEGPYEEHYRKATGQRVWSIGPVSVWVNQDAADKVGRGQ DLVAEDQNSWLNWLNSKEKNSVLYVSFGSMAKFPSAQLLEIAHGLEASGHSFIWVVRKVDGDDDVD VWLPDFEKKMKENNKGFIIRNWAPQLLILDHPAIGGLLNHSGWNSVLEGATAGLPMITWPLYAEHF YNERLVLDVLKIGVPVGVKEWKNLHEVGELVRRDAIAKAIKLLMGSGEEAEVMRKKAKELGVGAKK GIQVGGSSHTNLIAVIDELKSLKKSRIQGV*
SEQ ID NO: 5 - Q. saponaria QA-GlcpA-Galp dual β-1,3-D-xylosyltransferase/α-1,3-L-rhamnosyltransferase (Qs-3-O-RhaT/XylT) coding sequence (1515 bp):
ATGGTCTCCGGCGACGACGATGTTTCTCGTCGGCCACTGAAAGTTTACTTCATTGCACACCCCTCA CCTGGCCATATTGCCCCTCTGACCAAAATAGCCCATCTCTTCGCTGCCCTCGGTGAGCACGTGACT ATTCTCACTACTCCCGCCAATGTCCACTTCCATGAGAAATCCATCGACAAAGGAAAGGCTTCCGGC TATCATGTTAACATCCACACCGTTAAATTTCCTTCTAAAGAGGTCGGTCTCCCTGACGGCATCGAA AACTTCTCTTACGCCTCCGATGTTGAAACAGCAGCTAAAATTTGGGCTGGATTCGCCATGCTACAA ACTGAAATGGAGCAATATATGGAGCTTAACCCACCCGATTGCATCGTTGCCGACATGTTCACCTCC 
TGGACCTCCGACTTTGCTATCAAATTGGGAATCACAAGAATCGTTTTCAACGTCTATTGTATTTTC ACACGCTGTTTGGAAGAAGCCATCCGATCACCGGACTCGCCACACTTGAACAAAGAAATCTCTGAT AATGAACCGTTTGTTATCCCGGGTCTACCAGACCCCATAACAATTACCCGAGCTCAACTGCCCGAC GGTACCTTTTCTCCCATGAAAGAACTAGCTAGAACAGCTGAGTTGAAGAGCTTTGGAATGGTGATC AACGGGTTTTCCGAACTCGAAACCGATTACATCGAGCATTACAAGAAAATCATGGGTCACAAACGG ATTTGGCATGTCGGACCCCTTCAGCTAATCCACCGTAACGATGAAGACAAAATTCAGAGGAGCCAC AAGACAGCGGTGCTGAGTGATAACGATAACGAGTTAGTGAGTTGGCTTAACTCGAAGAAACCCGAC TCAGTTATTTACATTTGCTTCGGTAGTGCAACTCGTTTCTCTAATCACCAGCTCTATGAAATCGCC TGTGGATTAGAAGCTTCCGGGCACCCATTTTTGTGGGGCCTACTTTGGGTGCCAGAAGATGAAGAT AACGATGACGTGGGCAACAAATGGTTGCCAGCTTTCGAAGAAAGAATTAAAAAGGAAAATAAGGGA ATGATTTTAAGGGGGTGGGCTCCACAGATGTTAATCTTAAACCACCCGGCGATCGGTGGTTTCATG ACGCATTGTGGTTGGAATGCGGTGGTGGAAGCACTTTCATTCGGTGTTCCGACTATTACGCTTCCA GTTTTCTCGGAGCAGTTTTATACTGAGAGACTGATATCACAAGTGCTCAAGACTGGTGTGGAGGTT GGTGCAGAGAAGTGGACCTATGCATTTGATGCGGGGAAATATCCGGTGAGTAGGGAAAAGATAGCG ACGGCGGTGAAGAAGATATTAGACGATGGAGAAGAGGCAGAAGGAATGAGAAAGCGGGCCAGGGAG ATGAAAGAAAAAGCCCAAAAAAGTGTTGAAGAAGGTGGATCCTCTTATAATAATTTAACGGCTATG ATTGAAGATCTTAAAGAATTTAGGGCTAACAATGGCAAGGCTGCACAAGATCATGAATCGTGA
SEQ ID NO: 6 - Q. saponaria QA-GlcpA-Galp dual β-1,3-D-xylosyltransferase/α-1,3-L-rhamnosyltransferase (Qs-3-O-RhaT/XylT) translated nucleotide sequence (504 aa):
MVSGDDDVSRRPLKVYFIAHPSPGHIAPLTKIAHLFAALGEHVTILTTPANVHFHEKSIDKGKASG YHVNIHTVKFPSKEVGLPDGIENFSYASDVETAAKIWAGFAMLQTEMEQYMELNPPDCIVADMFTS WTSDFAIKLGITRIVFNVYCIFTRCLEEAIRSPDSPHLNKEISDNEPFVIPGLPDPITITRAQLPD GTFSPMKELARTAELKSFGMVINGFSELETDYIEHYKKIMGHKRIWHVGPLQLIHRNDEDKIQRSH KTAVLSDNDNELVSWLNSKKPDSVIYICFGSATRFSNHQLYEIACGLEASGHPFLWGLLWVPEDED NDDVGNKWLPAFEERIKKENKGMILRGWAPQMLILNHPAIGGFMTHCGWNAVVEALSFGVPTITLP VFSEQFYTERLISQVLKTGVEVGAEKWTYAFDAGKYPVSREKIATAVKKILDDGEEAEGMRKRARE MKEKAQKSVEEGGSSYNNLTAMIEDLKEFRANNGKAAQDHES*
Part 2 - Other biosynthetic enzymes:
SEQ ID NO: 7 - AsHMGR (Avena strigosa HMG-CoA reductase) coding sequence (1689 bp):
The full-length HMGR sequence is provided below. The 5' region (underlined) can be removed to generate a truncated feedback-insensitive form (tHMGR). The sequence for tHMGR is also given separately below.
ATGGCTGTGGAGGTTCACCGCCGGGCTCCCGCGCCCCATGGCCGGGGCACCGGGGAGAAGGGCCGC GTGCAGGCCGGGGACGCGCTGCCGCTGCCGATCCGCCACACCAACCTCATCTTCTCGGCGCTCTTC GCCGCCTCCCTCGCATACCTCATGCGCCGCTGGAGGGAGAAGATCCGCAACTCCACGCCGCTCCAC GTCGTGGGGCTCACCGAGATCTTCGCCATCTGCGGCCTCGTCGCCTCCCTCATCTACCTCCTCAGC TTCTTCGGCATCGCCTTCGTGCAGTCCGTCGTATCCAACAGCGACGACGAGGACGAGGACTTCCTC ATCGCGGCTGCAGCATCCCAGGCCCCCCCGCCGCCCTCCTCCAAGCCCGCGCCGCAGCAGTGCGCC CTGCTGCAGAGCGCCGGAGTCGCGCCCGAGAAAATGCCCGAGGAGGACGAGGAAATCGTCGCCGGG GTCGTCGCAGGGAAGATCCCCTCCTACGTGCTCGAGACCAGGCTAGGCGACTGCCGCAGGGCAGCC GGGATCCGCCGCGAGGCGCTGCGCCGGATCACCGGCAGGGAGATCGACGGCCTTCCCCTCGACGGC TTCGACTACGACTCGATTCTCGGACAGTGCTGCGAGATGCCCGTCGGGTACGTGCAGCTGCCGGTC GGCGTCGCGGGGCCGCTCGTCCTCGACGGCCGCCGCATATACGTCCCGATGGCCACCACGGAGGGC TGCCTAATCGCCAGCACCAACCGCGGATGCAAGGCCATTGCCGAGTCCGGAGGCGCATCCAGCGTC GTGTACCGCGACGGGATGACCCGCGCCCCCGTAGCCCGCTTCCCCTCCGCACGACGCGCCGCAGAG CTCAAGGGCTTCCTGGAGAATCCGGCCAACTACGACACCCTGTCCGTGGTCTTTAACAGATCAAGC AGATTTGCAAGGCTGCAGGGGGTCAAGTGCGCCATGGCTGGGAGGAACTTGTACATGAGGTTCACC TGCAGCACCGGGGATGCCATGGGGATGAACATGGTCTCCAAGGGCGTCCAAAATGTGCTCGACTAT CTGCAGGAGGACTTCCCTGACATGGACGTTGTCAGCATCTCAGGCAACTTTTGTTCCGACAAGAAA TCAGCTGCTGTAAACTGGATTGAAGGCCGTGGAAAGTCCGTGGTTTGTGAGGCAGTAATCAGAGAG GAAGTTGTCCACAAGGTTCTCAAGACCAACGTTCAGTCACTCGTGGAGTTGAATGTGATCAAGAAC CTTGCTGGCTCAGCAGTTGCTGGTGCTCTTGGGGGTTTCAACGCCCACGCAAGCAACATCGTAACG GCTATCTTCATTGCCACTGGTCAGGATCCTGCACAGAATGTGGAGAGCTCACAGTGTATCACTATG TTGGAAGCTGTAAATGATGGCAGAGACCTTCACATCTCCGTTACAATGCCATCTATCGAGGTGGGC ACAGTTGGTGGAGGCACGCAGCTGGCCTCACAGTCGGCCTGCTTGGACCTACTGGGCGTCAAAGGC GCCAACAGGGAATCTCCGGGGTCGAACGCTAGGCTGCTGGCCACGGTGGTGGCTGGTGCCGTCCTA GCTGGGGAGCTGTCCCTCATCTCCGCCCAAGCTGCCGGCCATCTGGTCCAGAGCCACATGAAATAC AACAGATCCAGCAAGGACATGTCCAAGATCGCCTGCTGA
SEQ ID NO: 8 - AsHMGR (Avena strigosa HMG-CoA reductase) translated nucleotide sequence (562 aa):
MAVEVHRRAPAPHGRGTGEKGRVQAGDALPLPIRHTNLIFSALFAASLAYLMRRWREKIRNSTPLH 
VVGLTEIFAICGLVASLIYLLSFFGIAFVQSVVSNSDDEDEDFLIAAAASQAPPPPSSKPAPQQCA LLQSAGVAPEKMPEEDEEIVAGVVAGKIPSYVLETRLGDCRRAAGIRREALRRITGREIDGLPLDG FDYDSILGQCCEMPVGYVQLPVGVAGPLVLDGRRIYVPMATTEGCLIASTNRGCKAIAESGGASSV VYRDGMTRAPVARFPSARRAAELKGFLENPANYDTLSVVFNRSSRFARLQGVKCAMAGRNLYMRFT CSTGDAMGMNMVSKGVQNVLDYLQEDFPDMDVVSISGNFCSDKKSAAVNWIEGRGKSVVCEAVIRE EVVHKVLKTNVQSLVELNVIKNLAGSAVAGALGGFNAHASNIVTAIFIATGQDPAQNVESSQCITM LEAVNDGRDLHISVTMPSIEVGTVGGGTQLASQSACLDLLGVKGANRESPGSNARLLATVVAGAVL AGELSLISAQAAGHLVQSHMKYNRSSKDMSKIAC*
SEQ ID NO: 9 - AstHMGR (Avena strigosa truncated HMG-CoA reductase) coding sequence (1275 bp):
ATGGCGCCCGAGAAAATGCCCGAGGAGGACGAGGAAATCGTCGCCGGGGTCGTCGCAGGGAAGATC CCCTCCTACGTGCTCGAGACCAGGCTAGGCGACTGCCGCAGGGCAGCCGGGATCCGCCGCGAGGCG CTGCGCCGGATCACCGGCAGGGAGATCGACGGCCTTCCCCTCGACGGCTTCGACTACGACTCGATT CTCGGACAGTGCTGCGAGATGCCCGTCGGGTACGTGCAGCTGCCGGTCGGCGTCGCGGGGCCGCTC GTCCTCGACGGCCGCCGCATATACGTCCCGATGGCCACCACGGAGGGCTGCCTAATCGCCAGCACC AACCGCGGATGCAAGGCCATTGCCGAGTCCGGAGGCGCATCCAGCGTCGTGTACCGCGACGGGATG ACCCGCGCCCCCGTAGCCCGCTTCCCCTCCGCACGACGCGCCGCAGAGCTCAAGGGCTTCCTGGAG AATCCGGCCAACTACGACACCCTGTCCGTGGTCTTTAACAGATCAAGCAGATTTGCAAGGCTGCAG GGGGTCAAGTGCGCCATGGCTGGGAGGAACTTGTACATGAGGTTCACCTGCAGCACCGGGGATGCC ATGGGGATGAACATGGTCTCCAAGGGCGTCCAAAATGTGCTCGACTATCTGCAGGAGGACTTCCCT GACATGGACGTTGTCAGCATCTCAGGCAACTTTTGTTCCGACAAGAAATCAGCTGCTGTAAACTGG ATTGAAGGCCGTGGAAAGTCCGTGGTTTGTGAGGCAGTAATCAGAGAGGAAGTTGTCCACAAGGTT CTCAAGACCAACGTTCAGTCACTCGTGGAGTTGAATGTGATCAAGAACCTTGCTGGCTCAGCAGTT GCTGGTGCTCTTGGGGGTTTCAACGCCCACGCAAGCAACATCGTAACGGCTATCTTCATTGCCACT GGTCAGGATCCTGCACAGAATGTGGAGAGCTCACAGTGTATCACTATGTTGGAAGCTGTAAATGAT GGCAGAGACCTTCACATCTCCGTTACAATGCCATCTATCGAGGTGGGCACAGTTGGTGGAGGCACG CAGCTGGCCTCACAGTCGGCCTGCTTGGACCTACTGGGCGTCAAAGGCGCCAACAGGGAATCTCCG GGGTCGAACGCTAGGCTGCTGGCCACGGTGGTGGCTGGTGCCGTCCTAGCTGGGGAGCTGTCCCTC ATCTCCGCCCAAGCTGCCGGCCATCTGGTCCAGAGCCACATGAAATACAACAGATCCAGCAAGGAC ATGTCCAAGATCGCCTGCTGA
SEQ ID NO: 10 - AstHMGR (Avena strigosa truncated HMG-CoA reductase) translated nucleotide sequence (424 aa):
MAPEKMPEEDEEIVAGVVAGKIPSYVLETRLGDCRRAAGIRREALRRITGREIDGLPLDGFDYDSI LGQCCEMPVGYVQLPVGVAGPLVLDGRRIYVPMATTEGCLIASTNRGCKAIAESGGASSVVYRDGM TRAPVARFPSARRAAELKGFLENPANYDTLSVVFNRSSRFARLQGVKCAMAGRNLYMRFTCSTGDA MGMNMVSKGVQNVLDYLQEDFPDMDVVSISGNFCSDKKSAAVNWIEGRGKSVVCEAVIREEVVHKV LKTNVQSLVELNVIKNLAGSAVAGALGGFNAHASNIVTAIFIATGQDPAQNVESSQCITMLEAVND GRDLHISVTMPSIEVGTVGGGTQLASQSACLDLLGVKGANRESPGSNARLLATVVAGAVLAGELSL ISAQAAGHLVQSHMKYNRSSKDMSKIAC
SEQ ID NO: 11 - Q. saponaria β-amyrin synthase, QsbAS (OQHZ-2074321) coding sequence (2277 bp):
ATGTGGAGGCTGAAGATAGCAGAAGGTGGTTCCGATCCATATCTGTTCAGCACAAACAACTTCGTG GGTCGCCAGACATGGGAGTTCGAACCGGAGGCCGGCACACCTGAGGAGCGAGCAGAGGTCGAAGCT GCCCGCCAAAACTTTTACAACAACCGTTACCAGGTCAAGCCCTGTGACGACCTCCTTTGGAGATAT CAGTTCCTGAGAGAGAAGAATTTCAAACAAACAATACCGCCTGTCAAGGTTGAAGATGGCCAAGAA ATTACTTATGAGATGGCCACAACCTCAATGCAGAGGGCGGCCCGTCACCTATCAGCCTTGCAGGCC AGCGATGGCCATTGGCCAGCTCAAATTGCTGGCCCCTTGTTCTTCATGCCACCCTTGGTCTTTTGT GTGTACATTACTGGGCATCTTAATACAGTATTCCCATCTGAACATCGCAAAGAAATCCTTCGTTAC ATGTACTATCACCAGAACGAAGATGGTGGGTGGGGACTGCACATAGAGGGTCACAGCACCATGTTT TGCACAGCACTCAACTACATTTGTATGCGTATCCTTGGGGAAGGACCAGAGGGGGGTCAAGACAAT GCTTGTGCCAGAGCACGAATGTGGATTCTTGATCATGGTGGTGTAACACATATTCCATCTTGGGGA AAGACCTGGCTTTCGATACTTGGTCTATTTGAGTGGTCTGGAAGCAATCCAATGCCTCCAGAGTTT TGGATCCTTCCTTCATTTCTTCCTATGCATCCAGCAAAAATGTGGTGCTATTGCCGGATGGTTTAC ATGCCCATGTCTTATTTATATGGGAAAAGGTTTGTTGGCCCAATCACGCCTCTCATTGTTCAGTTA AGAGAGGAAATACACACTCAAAATTACCATGAAATCAACTGGAAGTCAGTCCGCCATCTATGTGCA AAGGAGGATATCTACTATCCCCATCCACTCATCCAAGATTTGATTTGGGACAGTTTGTACATACTA ACGGAGCCTCTTCTCACTCGCTGGCCCTTGAACAAGTTGGTGCGGGAGAGGGCTCTCCAAGTAACA ATGAAGCATATCCACTATGAAGATGAAAATAGTCGATACATAACCATTGGATGTGTGGAAAAGGTG TTATGTATGCTTGCTTGTTGGGTTGATGATCCAAATGGAGATGCTTTCAAGAAGCACCTTGCTCGA GTCCCAGATTACGTATGGGTCTCTGAAGATGGAATTACTATGCAGAGTTTTGGTAGTCAAGAATGG GATGCTGGCTTTGCCGTCCAGGCTCTGCTTGCTTCTAATCTTACCGAGGAACTTGGCCCTGCTCTT GCCAAAGGACATGACTTCATAAAGCAATCTCAGGTTAAGGACAATCCTTCAGGTGACTTCAAAAGC ATGTATCGTCACATTTCTAGAGGATCATGGACCTTCTCTGACCAAGATCATGGATGGCAAGTTTCT 
GATTGCACTGCAGAAGGTCTGAAGTGTTGCCTGCTTTTGTCGATGTTGCCACCAGAAATTGTTGGT GAAAAAATGGAACCACAAAGGCTATTTGATTCTGTCAATGTGCTGCTCTCTCTACAGAGCAAAAAA GGTGGTTTAGCTGCCTGGGAGCCAGCAGGGGCGCAAGATTGGTTGGAATTACTCAATCCCACAGAA TTTTTTGCGGACATTGTCGTTGAGCATGAATATGTTGAATGTACTGGATCAGCAATTCAGGCATTA GTTTTGTTCAAGAAGCTGTATCCGGGGCACAGGAAAAAAGAGATTGACAGTTTCATTACAAATGCT GTCCGGTTCCTTGAGAATACACAAACGGCAGATGGCTCTTGGTATGGAAACTGGGGAGTTTGCTTC ACCTATGGTTGTTGGTTCGCACTGGGAGGGCTAGCAGCAGCTGGCAAGACTTACAACAACTGTCCT GCAATACGCAAAGCTGTTAATTTCCTACTTACAACACAAAGAGAAGACGGTGGTTGGGGAGAAAGC TATCTTTCAAGCCCAAAAAAGATATATGTACCCCTGGAAGGAAGCCGATCAAATGTGGTACATACT GCATGGGCTATGATGGGTCTAATTCATGCTGGGCAGGCTGAAAGAGACTCAACTCCTCTTCATCGT GCAGCAAAGTTGATCATCAATTATCAACTAGAAAATGGCGATTGGCCGCAACAGGAAATCACTGGA GTATTCATGAAAAACTGCATGTTACATTACCCTATGTACAGAAACATCTACCCAATGTGGGCTCTT GCAGAATACCGGAGGCGGGTTCCATTGCCTTAA
SEQ ID NO: 12 - QsbAS (OQHZ-2074321) translated nucleotide sequence (758 aa):
MWRLKIAEGGSDPYLFSTNNFVGRQTWEFEPEAGTPEERAEVEAARQNFYNNRYQVKPCDDLLWRY QFLREKNFKQTIPPVKVEDGQEITYEMATTSMQRAARHLSALQASDGHWPAQIAGPLFFMPPLVFC VYITGHLNTVFPSEHRKEILRYMYYHQNEDGGWGLHIEGHSTMFCTALNYICMRILGEGPEGGQDN ACARARMWILDHGGVTHIPSWGKTWLSILGLFEWSGSNPMPPEFWILPSFLPMHPAKMWCYCRMVY MPMSYLYGKRFVGPITPLIVQLREEIHTQNYHEINWKSVRHLCAKEDIYYPHPLIQDLIWDSLYIL TEPLLTRWPLNKLVRERALQVTMKHIHYEDENSRYITIGCVEKVLCMLACWVDDPNGDAFKKHLAR VPDYVWVSEDGITMQSFGSQEWDAGFAVQALLASNLTEELGPALAKGHDFIKQSQVKDNPSGDFKS MYRHISRGSWTFSDQDHGWQVSDCTAEGLKCCLLLSMLPPEIVGEKMEPQRLFDSVNVLLSLQSKK GGLAAWEPAGAQDWLELLNPTEFFADIVVEHEYVECTGSAIQALVLFKKLYPGHRKKEIDSFITNA VRFLENTQTADGSWYGNWGVCFTYGCWFALGGLAAAGKTYNNCPAIRKAVNFLLTTQREDGGWGES YLSSPKKIYVPLEGSRSNVVHTAWAMMGLIHAGQAERDSTPLHRAAKLIINYQLENGDWPQQEITG VFMKNCMLHYPMYRNIYPMWALAEYRRRVPLP*
SEQ ID NO: 13 - QsCYP716-C-28 (OQHZ-2073932) (C-28 oxidase, named previously as CYP716A224 [24]) coding sequence (1443 bp):
ATGGAGCACTTGTATCTCTCCCTTGTGCTCCTGTTTGTTTCCTCAATCTCCCTCTCCCTCTTCTTC CTGTTCTACAAACACAAATCTATGTTCACCGGGGCCAACCTACCACCTGGTAAAATCGGTTACCCA TTGATCGGAGAGAGCTTGGAGTTCTTGTCCACGGGATGGAAGGGCCACCCGGAGAAATTCATCTTC 
GATCGCATGAGCAAGTACTCATCCCAAATCTTCAAGACCTCGATTTTAGGGGAACCAACGGCGGTG TTCCCGGGAGCCGTATGCAACAAGTTCCTCTTCTCCAACGAGAACAAGCTGGTGAATGCATGGTGG CCTGCCTCCGTGGACAAGATCTTTCCTTCCTCACTCCAGACATCCTCCAAAGAAGAGGCCAAGAAG ATGAGGAAGTTGCTTCCTCAGTTTCTCAAGCCCGAAGCTCTGCACCGCTACATTGGTATTATGGAT TCTATTGCCCAGAGACACTTTGCCGATAGCTGGGAAAACAAAAACCAAGTCATTGTCTTTCCTCTA GCAAAGAGGTATACTTTCTGGCTGGCTTGCCGTTTGTTCATTAGCGTCGAGGATCCGACCCACGTA TCCAGATTTGCTGACCCGTTCCAACTTTTGGCCGCCGGAATCATATCAATCCCAATCGACTTGCCA GGGACACCGTTCCGCAAGGCAATCAATGCGTCCCAGTTCATCAGGAAGGAATTGTTGGCCATCATC AGGCAGAGAAAGATCGATTTGGGTGAAGGGAAGGCATCTCCGACGCAGGACATACTGTCTCACATG TTGCTCACATGCGACGAGAACGGACAATACATGAATGAATTGGACATTGCCGACAAGATTCTTGGC TTGTTGGTCGGCGGACATGACACTGCCAGTGCCGCTTGCACTTTCATTGTCAAGTTCCTCGCTGAG CTTCCCCACATTTATGAACAAGTCTACAAGGAGCAAATGGAGATTGCAAAATCAAAAGTGCCAGGA GAGTTGTTGAATTGGGAGGACATCCAAAAGATGAAATATTCGTGGAACGTAGCTTGTGAAGTGATG AGACTTGCCCCTCCACTCCAAGGAGCTTTCAGGGAAGCCATTACTGACTTCGTCTTCAACGGTTTC TCCATTCCAAAAGGCTGGAAGTTGTACTGGAGCGCAAATTCCACCCACAAAAGTCCGGATTATTTC CCTGAGCCCGACAAGTTCGACCCAACTAGATTCGAAGGAAATGGACCTGCGCCTTACACCTTTGTT CCATTTGGGGGAGGACCCAGGATGTGCCCGGGCAAAGAGTATGCCCGATTGGAAATACTTGTGTTC ATGCATAACTTGGTGAAGAGGTTCAAGTGGGAGAAATTGGTTCCTGATGAAAAGATTGTGGTTGAT CCAATGCCCATTCCAGCAAAGGGTCTTCCTGTTCGCCTTTATCCTCACAAAGCTTGA
SEQ ID NO: 14 - QsCYP716-C-28 (OQHZ-2073932) translated nucleotide sequence (480 aa):
MEHLYLSLVLLFVSSISLSLFFLFYKHKSMFTGANLPPGKIGYPLIGESLEFLSTGWKGHPEKFIF DRMSKYSSQIFKTSILGEPTAVFPGAVCNKFLFSNENKLVNAWWPASVDKIFPSSLQTSSKEEAKK MRKLLPQFLKPEALHRYIGIMDSIAQRHFADSWENKNQVIVFPLAKRYTFWLACRLFISVEDPTHV SRFADPFQLLAAGIISIPIDLPGTPFRKAINASQFIRKELLAIIRQRKIDLGEGKASPTQDILSHM LLTCDENGQYMNELDIADKILGLLVGGHDTASAACTFIVKFLAELPHIYEQVYKEQMEIAKSKVPG ELLNWEDIQKMKYSWNVACEVMRLAPPLQGAFREAITDFVFNGFSIPKGWKLYWSANSTHKSPDYF PEPDKFDPTRFEGNGPAPYTFVPFGGGPRMCPGKEYARLEILVFMHNLVKRFKWEKLVPDEKIVVD PMPIPAKGLPVRLYPHKA*
SEQ ID NO: 15 - QsCYP716-C-16 (OQHZ-2012090) (C-16 oxidase) coding sequence (1506 bp/1443 bp):
Long and short isoforms as described herein are distinguished by the presence of the first 63 nucleotides, underlined in the sequences below (21 amino acids).
ATGATATATAATAATGATAGTAATGATAATGAATTAGTAATCAGCTCAGTTCAGCAACCATCCATG GATCCTTTCTTCATTTTTGGCTTACTTCTCTTGGCTCTCTTTCTCTCTGTTTCTTTTCTTCTCTAC CTTTCCCGTAGAGCCTATGCTTCTCTCCCCAACCCTCCGCCGGGGAAGCTCGGCTTCCCCGTCGTC GGCGAGAGTCTCGAATTTCTCTCCACCCGACGCAAAGGTGTTCCTGAGAAATTCGTCTTCGACAGA ATGGCCAAATACTGTCGGGATGTCTTTAAGACATCAATATTGGGAGCAACCACCGCCGTCATGTGC GGCACCGCCGGTAACAAATTCTTGTTCTCCAACGAGAAAAAACACGTCACTGGTTGGTGGCCGAAA TCTGTAGAGCTGATTTTCCCAACCTCACTTGAGAAATCATCCAACGAAGAATCCATCATGATGAAA CAATTCCTTCCCAACTTCTTGAAACCAGAACCTTTGCAGAAGTACATACCCGTTATGGACATAATT ACCCAAAGACACTTCAATACAAGCTGGGAAGGACGCAACGTGGTCAAAGTGTTTCCTACGGCTGCC GAATTCACCACGTTGCTGGCTTGTCGGGTATTCCTCAGTGTTGAGGATCCCATTGAAGTAGCCAAG ATTTCAGAGCCATTTGAAATCTTAGCTGCTGGGTTTCTTTCAATACCCATAAATCTTCCGGGTACC AAATTAAATAAAGCGGTTAAGGCAGCGGATCAGATTAGAGACGCAATTGTACAGATTTTGAAACGG AGAAGGGTTGAAATTGCGGAGAATAAAGCAAATGGAATGCAAGATATAGCGTCCATGTTGTTGACG ACACCAACTAATGCTGGGTTTTATATGACCGAGGCTCACATTTCTGAGAAAATTTTGGGTATGATT GTTGGTGGCCGTGATACTGCTAGTACTGTTATCACCTTCATCATCAAGTATTTGGCAGAGAATCCT GAAATTTATAATAAGGTCTATGAGGAGCAAATGGAAGTGGTAAAGTCAAAGAAACCAGGTGAGTTG CTGAACTGGGAAGATGTGCAGAAAATGAAGTACTCTTGGTGCGTAGCATGTGAAGCTATGCGACTT GCTCCTCCTGTTCAAGGTGGTTTCAAGGTGGCCATTAATGACTTTGTGTATTCTGGGTTCAACATT CGCAAGGGTTGGAAGTTATATTGGAGTGCCATTGCAACACACATGAATCCAGAATATTTCCCAGAA CCTGAGAAATTCAACCCCTCAAGGTTTGAAGGGAAGGGACCAGTACCTTACAGCTTCGTACCCTTC GGAGGCGGACCTCGGATGTGTCCCGGGAAAGAGTATTCCCGGCTGGAAACACTTGTTTTCATGCAT CATTTGGTGACGAGGTACAATTGGGAGAAAGTGTATCCCACAGAGAAGATAACAGTGGATCCAATG CCATTCCCTGTCAACGGCCTCCCCATTCGCCTTATTCCTCACAAGCACCAATGA
SEQ ID NO: 16 - QsCYP716-C-16 translated nucleotide sequence (501 aa/480 aa):
MIYNNDSNDNELVISSVQQPSMDPFFIFGLLLLALFLSVSFLLYLSRRAYASLPNPPPGKLGFPVV GESLEFLSTRRKGVPEKFVFDRMAKYCRDVFKTSILGATTAVMCGTAGNKFLFSNEKKHVTGWWPK SVELIFPTSLEKSSNEESIMMKQFLPNFLKPEPLQKYIPVMDIITQRHFNTSWEGRNVVKVFPTAA EFTTLLACRVFLSVEDPIEVAKISEPFEILAAGFLSIPINLPGTKLNKAVKAADQIRDAIVQILKR RRVEIAENKANGMQDIASMLLTTPTNAGFYMTEAHISEKILGMIVGGRDTASTVITFIIKYLAENP 
EIYNKVYEEQMEVVKSKKPGELLNWEDVQKMKYSWCVACEAMRLAPPVQGGFKVAINDFVYSGFNI RKGWKLYWSAIATHMNPEYFPEPEKFNPSRFEGKGPVPYSFVPFGGGPRMCPGKEYSRLETLVFMH HLVTRYNWEKVYPTEKITVDPMPFPVNGLPIRLIPHKHQ*
SEQ ID NO: 17 - QsCYP714-C-23 (C-23 oxidase) coding sequence (1524 bp):
ATGTGGTTCACAGTAGGATTGGTCTTGGTTTTCGCCCTATTCATACGTCTCTACAGCAGTCTGTGG TTGAAGCCTCGTGCAACTCGGATTAAGCTTAGCAATCAAGGAATTAAAGGTCCAAAACCAGCATTT CTTCTGGGTAATGTTGCAGAGATGAGAAGATTTCAATCTAAGCTTCCAAAATCTGAACTCAAACAA GGCCAAGTTTCTCATGATTGGGCTTCTAAATCTCTGTTTCCATTTTTCAGTCTTTGGTCCCAGAAA TACGGAAATACGTTCGTGTTCTCATTGGGGAACATACAGGTGCTCTATGTTTCTGATCATGAGTTG GTGAAAGAAATTAATCAGAATACCTCTTTAGATTTGGGCAAACCCAAGTACCTGCAGAAGGAGCGT GGCCCTTTGCTGGGACAAGGTATTTTGACCTCCAATGGACAGCTTTGGGCGTACCAGAGAAAAATC ATGACTCCTGAACTCTACAAGGAGAAAATCAAGGGCATGTGCGAGTTGATGGTGGAATCTGTAGCT TGGTTGGTTGAGGAATGGGGAACGAAGATCCAAGCTGAGGGTGGGGCAGCAGACATTAGAATAGAC GAGGATCTTAGAAGCTTCTCTGGTGATGTAATTTCAAAAGCTTGTTTTGGGAGCTGCTATGCCGGA GGGAGGGAAATCTTTCTTAGGCTCAGAGCTCTTCAACACCAAATTGCTTCCAAAGCCTTACTCATG GGCTTCCCTGGATTAAAGTACCTGCCCATTAAGAGCAACAGAGAGATATGGAGATTGGAGAAGGAG ATCTTCCAGCTGATTATGAAGCTGGCTGAAGATAGAAAAAAAGAACAACATGAGAGAGACCTATTA CAGATTATAATTGAGGGAGCTAAAAGTAGTGATCTGAGTTCGGAAGCAATGGCAAAATTCATTGTG GACAACTGCAAGAATGTCTACTTGGCTGGCCATGAAACTACTGCAATGTCTGCTGGTTGGACTTTG CTTCTCTTGGCTAATCATCCTGAGTGGCAAGCCCGTGTCCGTGATGAGATTTTACAAGTCACCGAG GGCCGCAATCCTGATTTTGACATGCTGCACAAGATGAAACTGTTAACAATGGTAATTCAGGAGGCA CTGCGACTCTACCCAACAGTCATATTCATGTCAAGAGAAGCATTGGAAGATATTAATGTTGGAAAC ATCCAAGTTCCAAAAGGTGTTAACATATGGATACCTGTGGTAAATCTTCAAAGGGACACAACGGTA TGGGGTGCAGACGCAAACGAGTTTAATCCTGAAAGGTTTGCCAATGGAGTTAACAATTCATGCAAG GTTCCACAACTTTACCTACCATTTGGAGCTGGACCTCGCATTTGTCCTGGAATTAATCTGGCCATG ACTGAGATCAAGATACTTCTGTGTATCCTGCTCACCAAGTTTTCGTTTTCAGTTTCACCCAACTAT CGCCACTCACCGGTGTTTAAATTGGTGCTTGAGCCTGAAAATGGAATCAATGTCATCATGAAGAAG CTCTAA
SEQ ID NO: 18 - QsCYP714-C-23 translated nucleotide sequence (507 aa):
MWFTVGLVLVFALFIRLYSSLWLKPRATRIKLSNQGIKGPKPAFLLGNVAEMRRFQSKLPKSELKQ GQVSHDWASKSLFPFFSLWSQKYGNTFVFSLGNIQVLYVSDHELVKEINQNTSLDLGKPKYLQKER 
GPLLGQGILTSNGQLWAYQRKIMTPELYKEKIKGMCELMVESVAWLVEEWGTKIQAEGGAADIRID EDLRSFSGDVISKACFGSCYAGGREIFLRLRALQHQIASKALLMGFPGLKYLPIKSNREIWRLEKE IFQLIMKLAEDRKKEQHERDLLQIIIEGAKSSDLSSEAMAKFIVDNCKNVYLAGHETTAMSAGWTL LLLANHPEWQARVRDEILQVTEGRNPDFDMLHKMKLLTMVIQEALRLYPTVIFMSREALEDINVGN IQVPKGVNIWIPVVNLQRDTTVWGADANEFNPERFANGVNNSCKVPQLYLPFGAGPRICPGINLAM TEIKILLCILLTKFSFSVSPNYRHSPVFKLVLEPENGINVIMKKL*
SEQ ID NO: 19 - GmSGT2 (GmUGT73P2) (Glycine max (soybean) β-D-galactosyltransferase) coding sequence (1488 bp):
ATGTGGTTCACAGTAGGATTGGTCTTGGTTTTCGCCCTATTCATACGTCTCTACAGCAGTCTGTGG TTGAAGCCTCGTGCAACTCGGATTAAGCTTAGCAATCAAGGAATTAAAGGTCCAAAACCAGCATTT CTTCTGGGTAATGTTGCAGAGATGAGAAGATTTCAATCTAAGCTTCCAAAATCTGAACTCAAACAA GGCCAAGTTTCTCATGATTGGGCTTCTAAATCTCTGTTTCCATTTTTCAGTCTTTGGTCCCAGAAA TACGGAAATACGTTCGTGTTCTCATTGGGGAACATACAGGTGCTCTATGTTTCTGATCATGAGTTG GTGAAAGAAATTAATCAGAATACCTCTTTAGATTTGGGCAAACCCAAGTACCTGCAGAAGGAGCGT GGCCCTTTGCTGGGACAAGGTATTTTGACCTCCAATGGACAGCTTTGGGCGTACCAGAGAAAAATC ATGACTCCTGAACTCTACAAGGAGAAAATCAAGGGCATGTGCGAGTTGATGGTGGAATCTGTAGCT TGGTTGGTTGAGGAATGGGGAACGAAGATCCAAGCTGAGGGTGGGGCAGCAGACATTAGAATAGAC GAGGATCTTAGAAGCTTCTCTGGTGATGTAATTTCAAAAGCTTGTTTTGGGAGCTGCTATGCCGGA GGGAGGGAAATCTTTCTTAGGCTCAGAGCTCTTCAACACCAAATTGCTTCCAAAGCCTTACTCATG GGCTTCCCTGGATTAAAGTACCTGCCCATTAAGAGCAACAGAGAGATATGGAGATTGGAGAAGGAG ATCTTCCAGCTGATTATGAAGCTGGCTGAAGATAGAAAAAAAGAACAACATGAGAGAGACCTATTA CAGATTATAATTGAGGGAGCTAAAAGTAGTGATCTGAGTTCGGAAGCAATGGCAAAATTCATTGTG GACAACTGCAAGAATGTCTACTTGGCTGGCCATGAAACTACTGCAATGTCTGCTGGTTGGACTTTG CTTCTCTTGGCTAATCATCCTGAGTGGCAAGCCCGTGTCCGTGATGAGATTTTACAAGTCACCGAG GGCCGCAATCCTGATTTTGACATGCTGCACAAGATGAAACTGTTAACAATGGTAATTCAGGAGGCA CTGCGACTCTACCCAACAGTCATATTCATGTCAAGAGAAGCATTGGAAGATATTAATGTTGGAAAC ATCCAAGTTCCAAAAGGTGTTAACATATGGATACCTGTGGTAAATCTTCAAAGGGACACAACGGTA TGGGGTGCAGACGCAAACGAGTTTAATCCTGAAAGGTTTGCCAATGGAGTTAACAATTCATGCAAG GTTCCACAACTTTACCTACCATTTGGAGCTGGACCTCGCATTTGTCCTGGAATTAATCTGGCCATG ACTGAGATCAAGATACTTCTGTGTATCCTGCTCACCAAGTTTTCGTTTTCAGTTTCACCCAACTAT 
CGCCACTCACCGGTGTTTAAATTGGTGCTTGAGCCTGAAAATGGAATCAATGTCATCATGAAGAAG CTCTAA
SEQ ID NO: 20 - GmSGT2 (GmUGT73P2) (Glycine max (soybean) β-D-galactosyltransferase) translated nucleotide sequence (495 aa):
MEKKKGELKSIFLPFLSTSHIIPLVDMARLFALHDVDVTIITTAHNATVFQKSIDLDASRGRPIRT HVVNFPAAQVGLPVGIEAFNVDTPREMTPRIYMGLSLLQQVFEKLFHDLQPDFIVTDMFHPWSVDA AAKLGIPRIMFHGASYLARSAAHSVEQYAPHLEAKFDTDKFVLPGLPDNLEMTRLQLPDWLRSPNQ YTELMRTIKQSEKKSYGSLFNSFYDLESAYYEHYKSIMGTKSWGIGPVSLWANQDAQDKAARGYAK EEEEKEGWLKWLNSKAESSVLYVSFGSINKFPYSQLVEIARALEDSGHDFIWVVRKNDGGEGDNFL EEFEKRMKESNKGYLIWGWAPQLLILENPAIGGLVTHCGWNTVVESVNAGLPMATWPLFAEHFFNE KLVVDVLKIGVPVGAKEWRNWNEFGSEVVKREEIGNAIASLMSEEEEDGGMRKRAKELSVAAKSAI KVGGSSHNNMKELIRELKEIKLSKEAQETAPNP*
SEQ ID NO: 21 - AsSQS (Avena strigosa squalene synthase) coding sequence (1212 bp):
ATGGGGGCGCTGTCGCGGCCGGAGGAGGTGGTGGCGCTGGTCAAGCTGAGGGTGGCGGCGGGGCAG ATCAAGCGCCAGATCCCGGCCGAGGAACACTGGGCCTTCGCCTACGACATGCTCCAGAAGGTCTCC CGCAGCTTCGCGCTCGTCATCCAGCAGCTCGGACCCGAACTCCGCAATGCCGTGTGCATCTTCTAC CTCGTGCTCCGGGCCCTGGACACCGTCGAGGACGACACCAGCATCCCCAACGACGTGAAGCTGCCC ATCCTTCGGGATTTCTACCGCCATGTCTACAACCCCGACTGGCGTTATTCATGTGGAACAAACCAC TACAAGGTGCTGATGGATAAGTTCAGACTCGTCTCCACGGCTTTCCTGGAGCTAGGCGAAGGATAT CAAAAGGCAATTGAAGAAATCACTAGGCGAATGGGAGCAGGAATGGCAAAATTTATATGCCAGGAG GTTGAAACGATTGATGACTATAATGAGTACTGCCACTATGTAGCAGGGCTAGTAGGCTATGGACTT TCCAGGCTCTTTCATGCTGCTGGGACAGAAGATCTGGCTTCAGATCAACTTTCGAATTCAATGGGT TTGTTTCTTCAGAAAACCAATATAATAAGGGATTATTTGGAGGATATAAATGAGATACCAAAGTGC CGTATGTTTTGGCCTCGAGAAATATGGAGTAAATATGCAGATAAACTTGAGGACCTCAAGTATGAG GAAAATTCAGAAAAAGCAGTGCAATGCTTGAATGATATGGTGACTAATGCTTTGGTCCACGCCGAA GACTGTCTTCAATACATGTCTGCGTTGAAGGATAATACTAATTTTCGGTTTTGTGCAATACCTCAG ATAATGGCAATTGGGACATGTGCTATTTGCTACAATAATGTGAAAGTCTTTAGAGGAGTTGTTAAG ATGAGGCGTGGGCTCACTGCACGAATAATTGATGAGACAAAATCAATGTCAGATGTCTATTCTGCT TTCTATGAGTTCTCTTCATTGCTAGAGTCAAAGATTGACGATAACGACCCAAGTTCTGCACTAACA CGGAAGCGTGTAGAGGCAATAAAGAGGACTTGCAAGTCATCCGGTTTACTAAAGAGAAGGGGATAC GACCTGGAAAAGTCAAAGTATAGGCATATGTTGATCATGCTTGCACTTCTGTTGGTGGCTATTATC TTCGGTGTACTGTACGCCAAGTGA 
SEQ ID NO: 22 - AsSQS (Avena strigosa squalene synthase) translated nucleotide sequence (403 aa):
MGALSRPEEVVALVKLRVAAGQIKRQIPAEEHWAFAYDMLQKVSRSFALVIQQLGPELRNAVCIFY LVLRALDTVEDDTSIPNDVKLPILRDFYRHVYNPDWRYSCGTNHYKVLMDKFRLVSTAFLELGEGY QKAIEEITRRMGAGMAKFICQEVETIDDYNEYCHYVAGLVGYGLSRLFHAAGTEDLASDQLSNSMG LFLQKTNIIRDYLEDINEIPKCRMFWPREIWSKYADKLEDLKYEENSEKAVQCLNDMVTNALVHAE DCLQYMSALKDNTNFRFCAIPQIMAIGTCAICYNNVKVFRGVVKMRRGLTARIIDETKSMSDVYSA FYEFSSLLESKIDDNDPSSALTRKRVEAIKRTCKSSGLLKRRGYDLEKSKYRHMLIMLALLLVAII FGVLYAK*
SEQ ID NO: 23 - AtATR2 (Arabidopsis thaliana cytochrome P450 reductase 2) coding sequence (2325 bp):
ATGAAAAACATGATGAATTATAAATTAAAACTCTGTTCTGTCTCAAAAAACTCAAAAGGAGTCTCT CTCTCACCTACACCACACCTAACCAAACCCCCTACGATTCACACAGAGAGAGATCTTCTTCTTCCT TCTTCTTCCTTCTTCTTTCTTCTTCTTTCTTCTTCTAGCTACAACATCTACAACGCCATGTCCTCT TCTTCTTCTTCGTCAACCTCCATGATCGATCTCATGGCAGCAATCATCAAAGGAGAGCCTGTAATT GTCTCCGACCCAGCTAATGCCTCCGCTTACGAGTCCGTAGCTGCTGAATTATCCTCTATGCTTATA GAGAATCGTCAATTCGCCATGATTGTTACCACTTCCATTGCTGTTCTTATTGGTTGCATCGTTATG CTCGTTTGGAGGAGATCCGGTTCTGGGAATTCAAAACGTGTCGAGCCTCTTAAGCCTTTGGTTATT AAGCCTCGTGAGGAAGAGATTGATGATGGGCGTAAGAAAGTTACCATCTTTTTCGGTACACAAACT GGTACTGCTGAAGGTTTTGCAAAGGCTTTAGGAGAAGAAGCTAAAGCAAGATATGAAAAGACCAGA TTCAAAATCGTTGATTTGGATGATTACGCGGCTGATGATGATGAGTATGAGGAGAAATTGAAGAAA GAGGATGTGGCTTTCTTCTTCTTAGCCACATATGGAGATGGTGAGCCTACCGACAATGCAGCGAGA TTCTACAAATGGTTCACCGAGGGGAATGACAGAGGAGAATGGCTTAAGAACTTGAAGTATGGAGTG TTTGGATTAGGAAACAGACAATATGAGCATTTTAATAAGGTTGCCAAAGTTGTAGATGACATTCTT GTCGAACAAGGTGCACAGCGTCTTGTACAAGTTGGTCTTGGAGATGATGACCAGTGTATTGAAGAT GACTTTACCGCTTGGCGAGAAGCATTGTGGCCCGAGCTTGATACAATACTGAGGGAAGAAGGGGAT ACAGCTGTTGCCACACCATACACTGCAGCTGTGTTAGAATACAGAGTTTCTATTCACGACTCTGAA GATGCCAAATTCAATGATATAAACATGGCAAATGGGAATGGTTACACTGTGTTTGATGCTCAACAT CCTTACAAAGCAAATGTCGCTGTTAAAAGGGAGCTTCATACTCCCGAGTCTGATCGTTCTTGTATC CATTTGGAATTTGACATTGCTGGAAGTGGACTTACGTATGAAACTGGAGATCATGTTGGTGTACTT TGTGATAACTTAAGTGAAACTGTAGATGAAGCTCTTAGATTGCTGGATATGTCACCTGATACTTAT TTCTCACTTCACGCTGAAAAAGAAGACGGCACACCAATCAGCAGCTCACTGCCTCCTCCCTTCCCA 
CCTTGCAACTTGAGAACAGCGCTTACACGATATGCATGTCTTTTGAGTTCTCCAAAGAAGTCTGCT TTAGTTGCGTTGGCTGCTCATGCATCTGATCCTACCGAAGCAGAACGATTAAAACACCTTGCTTCA CCTGCTGGAAAGGATGAATATTCAAAGTGGGTAGTAGAGAGTCAAAGAAGTCTACTTGAGGTGATG GCCGAGTTTCCTTCAGCCAAGCCACCACTTGGTGTCTTCTTCGCTGGAGTTGCTCCAAGGTTGCAG CCTAGGTTCTATTCGATATCATCATCGCCCAAGATTGCTGAAACTAGAATTCACGTCACATGTGCA CTGGTTTATGAGAAAATGCCAACTGGCAGGATTCATAAGGGAGTGTGTTCCACTTGGATGAAGAAT GCTGTGCCTTACGAGAAGAGTGAAAACTGTTCCTCGGCGCCGATATTTGTTAGGCAATCCAACTTC AAGCTTCCTTCTGATTCTAAGGTACCGATCATCATGATCGGTCCAGGGACTGGATTAGCTCCATTC AGAGGATTCCTTCAGGAAAGACTAGCGTTGGTAGAATCTGGTGTTGAACTTGGGCCATCAGTTTTG TTCTTTGGATGCAGAAACCGTAGAATGGATTTCATCTACGAGGAAGAGCTCCAGCGATTTGTTGAG AGTGGTGCTCTCGCAGAGCTAAGTGTCGCCTTCTCTCGTGAAGGACCCACCAAAGAATACGTACAG CACAAGATGATGGACAAGGCTTCTGATATCTGGAATATGATCTCTCAAGGAGCTTATTTATATGTT TGTGGTGACGCCAAAGGCATGGCAAGAGATGTTCACAGATCTCTCCACACAATAGCTCAAGAACAG GGGTCAATGGATTCAACTAAAGCAGAGGGCTTCGTGAAGAATCTGCAAACGAGTGGAAGATATCTT AGAGATGTATGGTAA
SEQ ID NO: 24 - AtATR2 (Arabidopsis thaliana cytochrome P450 reductase 2) translated nucleotide sequence (774 aa):
MKNMMNYKLKLCSVSKNSKGVSLSPTPHLTKPPTIHTERDLLLPSSSFFFLLLSSSSYNIYNAMSS SSSSSTSMIDLMAAIIKGEPVIVSDPANASAYESVAAELSSMLIENRQFAMIVTTSIAVLIGCIVM LVWRRSGSGNSKRVEPLKPLVIKPREEEIDDGRKKVTIFFGTQTGTAEGFAKALGEEAKARYEKTR FKIVDLDDYAADDDEYEEKLKKEDVAFFFLATYGDGEPTDNAARFYKWFTEGNDRGEWLKNLKYGV FGLGNRQYEHFNKVAKVVDDILVEQGAQRLVQVGLGDDDQCIEDDFTAWREALWPELDTILREEGD TAVATPYTAAVLEYRVSIHDSEDAKFNDINMANGNGYTVFDAQHPYKANVAVKRELHTPESDRSCI HLEFDIAGSGLTYETGDHVGVLCDNLSETVDEALRLLDMSPDTYFSLHAEKEDGTPISSSLPPPFP PCNLRTALTRYACLLSSPKKSALVALAAHASDPTEAERLKHLASPAGKDEYSKWVVESQRSLLEVM AEFPSAKPPLGVFFAGVAPRLQPRFYSISSSPKIAETRIHVTCALVYEKMPTGRIHKGVCSTWMKN AVPYEKSENCSSAPIFVRQSNFKLPSDSKVPIIMIGPGTGLAPFRGFLQERLALVESGVELGPSVL FFGCRNRRMDFIYEEELQRFVESGALAELSVAFSREGPTKEYVQHKMMDKASDIWNMISQGAYLYV CGDAKGMARDVHRSLHTIAQEQGSMDSTKAEGFVKNLQTSGRYLRDVW*
SEQ ID NO: 25 - Q. saponaria quillaic acid 3-O-glucuronosyltransferase (cellulose synthase-like enzyme QsCslG2) coding sequence (2124 bp):
ATGGCGACCGTCTCCTCCCTCCACACTTGCACTGTACAGCAACCCCGTGCAGCCATTAATCGAATT CACATTTTCTTACACTTTATTGCCATACTTTTCCTCTTTTACTACCGGGTCACCGGTCTTTTCTAT GACAATGCAGTACCCACTTTAGCTTGGTCTCTAATGACCTTAGCTGAGTTGATTTTCGCCTTCGTT TGGGTGCTCAGCCAAGCCTTCCGGTGGCGCCCGGTGTTGCGTTCAGTTATTCCTGAGAGGATTCCC AAAGATGTACGATTGCCCGCGGTGGATATCTTAATTTGTACGGCTGACCCATTAAAGGAACCGACG GTGGAGGTGATGAACACTGTCTTGTCCGCCATGGCATTGGACTATCCTGCGGAGAATCTGGCTGTA TATCTTTCTGATGACGGGGGTTCTCCGGTCACCTTATTTGCTATGAAGCAAGTGGGTCCGTTTGCT AAGCTGTGGCTTCCGTTTTGCAACAAGTACGGAATCAAAACAAGGCATCCTGAGTCTTTTTTCTCG GCATTTGCGGATGACGAAAGGCTTCACCGGAGTGATGAATTCAGGGCAGAGGAGGAGGCGATCAAG GACAAATATGAAGAATTTAAGAGAACTATAGAGAAATATGGTGGAGAAGGAAAAAATAGTCATGTT GTACAAGACCGGCCTCCTCATGTGGAGATTATACATGACACTAGGAAGATTAGAGAGAACAGTGAA GACCAAGCTGTGCCTCTTCTTGTCTACGTCTCTCGTGAGAAAAGACCATCCTACAATTCTCGGTTC AAAGCAGGAGCTCTGAACACCCTTCTTCGAGTTTCTGGGGTAATCAGCAATAGCCCATATGTATTG GTGTTAGACTGTGACATGTACTGCAATGATCCAACATCAGCTAGACAAGCAATGTGCTTCCATCTT GATCCACAAATGTCTCGCACTCTCTCTTTTGTACAATTCCCCCAGGTTTTCTACAATGTTAGTAAA AATGATATCTATGATGGCCAAGCTAGGGCAGCCTTTAAGACAAAGTGGCAAGGTATGGATGGACTA CGTGGGCCACTGCTTTCTGGTACTGGCTTTTATTTGAAGAGGAAGTCCTTGTATGGAAGTCCAAAC CAAGAAGATGATTGTTTACTTGAGCCCCATAAGAATTTTGGAAAGTGTGACAAGCTCATAGAATCA GTAAAGGTCATTTATGAACGTGATGTTTCAATAAAGGCAGATTCATCAGATGCCATTTTGCAAGAT GCCAAACAATTAGCATCTTGTCCCTATGAAACAAACACAAGCTGGGGCAAAGAGGTTGGGTTCTCG TATGACTGCTTATTAGAGAGTACATTCACAGGTTATCTGTTGCACTGCAGAGGGTGGACATCAGTT TATCTTTATCCAAAGAAGCCATGTTTCTTAGGGTGTACTCCAGTTGATATGAAGGAAGCCATGGTT CAGTATACGAAGTGGATTTCTGAATTATTTTTACTTGCTATCTCAAGATTCAACCCTCTGACATTT GGGATATCCAGAATGTCCATTCTCCAGAGCATGTGTTACGGATACCTTACAATCATGCCCATTTTA TCTGTTGCTATGATCTTCTATGCCACAGTTCCTCAATTGTGCCTCTTGAGAGGCGTACCTCTGTTT CCCAAGGTTTCAGACCCATGGTTTGCAGTGTTCCTAGCAATATTTGTGTCCTCCCTCTGTCAGCAC TTAATTGAAGTCCTCACGAGTGATGGCACGCTCAAGACTTGGTGGAATGAACAAAGAAATTGGGTG ATAAAGTCTGGTTCCGGTAGCGTATTTGGAGCTCTGAGTGGAATATTGAAGTGGTTTGGCATGAAG ATTAAATTTGGTTTATCAAACAAAGCCGTGGACAAAGAAAAGCTTGAGAAATATGAAAAGGGTAAG 
TTTGATTTCCAAGGGGCTGCCATGTTTATGGTTCCCTTAACTATATCAGTCATCTTGAACACATTA TGCCTTATCGGTGGTTTATGGAGAGTAATCACACTTAAAAACTTCGAAGAGATGTCAGGGCAGTTC ATCATCTCCTTGTACTTTCTAGCTCTCAGCTATCCAATTCTTGAAGGGTTACTAAGAAAAGGCAAG GGAAAGGCCTAA
SEQ ID NO: 26 - Q. saponaria quillaic acid 3-O-glucuronosyltransferase (cellulose synthase-like enzyme QsCslG2) translated nucleotide sequence (707 aa):
MATVSSLHTCTVQQPRAAINRIHIFLHFIAILFLFYYRVTGLFYDNAVPTLAWSLMTLAELIFAFV WVLSQAFRWRPVLRSVIPERIPKDVRLPAVDILICTADPLKEPTVEVMNTVLSAMALDYPAENLAV YLSDDGGSPVTLFAMKQVGPFAKLWLPFCNKYGIKTRHPESFFSAFADDERLHRSDEFRAEEEAIK DKYEEFKRTIEKYGGEGKNSHVVQDRPPHVEIIHDTRKIRENSEDQAVPLLVYVSREKRPSYNSRF KAGALNTLLRVSGVISNSPYVLVLDCDMYCNDPTSARQAMCFHLDPQMSRTLSFVQFPQVFYNVSK NDIYDGQARAAFKTKWQGMDGLRGPLLSGTGFYLKRKSLYGSPNQEDDCLLEPHKNFGKCDKLIES VKVIYERDVSIKADSSDAILQDAKQLASCPYETNTSWGKEVGFSYDCLLESTFTGYLLHCRGWTSV YLYPKKPCFLGCTPVDMKEAMVQYTKWISELFLLAISRFNPLTFGISRMSILQSMCYGYLTIMPIL SVAMIFYATVPQLCLLRGVPLFPKVSDPWFAVFLAIFVSSLCQHLIEVLTSDGTLKTWWNEQRNWV IKSGSGSVFGALSGILKWFGMKIKFGLSNKAVDKEKLEKYEKGKFDFQGAAMFMVPLTISVILNTL CLIGGLWRVITLKNFEEMSGQFIISLYFLALSYPILEGLLRKGKGKA
SEQ ID NO: 27 - Q. saponaria QA-GlcpA-Galp-1,3-L-rhamnosyltransferase (Qs_0283850) coding sequence (1485 bp):
ATGGTCTCCGGCGACGACGACGTTTCTCGTCGGCCACTGAAAGTTTACTTTATTGCACACCCCTCA CCTGGCCATATTGCCCCTCTAACCAAAATAGCCCAACTCTTTGCTGCACGTGGTGAGCACGTGACT ATTCTTACTACTCCCGCCAATGTCCACTTCCATGAGAAATCCATCGACAAAGGAAAGACTTCCGGC TATCATGTTAACATCCACGCCGTTAAATTTCCTTCTAAAGAGGTCGGTCTCCCCGACGGCATCGAA AACTTCTCTCACGCCTCCGATAATGAAACAGCAGCCAAAATTTGGGCCGGATTCTCCATGCTTCAA ACTGAAATGGAGCAATATATGGAACAAAACCCACCCGATTGCATTGTTGCCGACATGTTCAACCGC TGGACTTCCGACTTCGCTATCAAATTGGGAATCCCGAGAATAGTTTTCAACGTCTACTGTATTTTC ACACGCTGTTTGGAAGAAGCAATCAGATCACCTGACTCGCCACACTTGAAACTAAACTCCGATAAT GAACAGTTTATTATTCCGGGTCTACCCGACCCCATAACAATTACCCGAGCTCAACTGCCCGACGGT GCCTTTTCTGTCGTCAAAGAACAAGTTAGTGAAGCTGAGTTGAAAAGCTTCGGAATGGTGATCAAC GGGTTTTCCGAACTCGAAACCGAATACATCGAGTATTACAAGAATATCATGGGTCGAAAACGGATT TGGCATGTCGGACCCCTTCAGCTCATTTACCAAAACGATGACCCCAAAGTTCAGAGGAGCCAGAAG
ACAGCGGTCGTGAGTGACAACGAGTTAGTGAGTTGGCTTGACTCGAAGAAACCCGACTCAGTGATT TACATTTCCTTCGGTAGTGCAATTCGTTTCTCTAATAAGCAGCTCTATGAAATAGCATGTGGATTA GAAGCTTCCGGCTACCCATTTTTGTGGGCCTTACTTTGGGTGCCAGAAGATGACGACGACGTGGGC AACAAATGGTTGCCTGATTTCGAAGAAAGAATAAAAAGAGAAAATAAGGGAATAATTTTCAGGGGG TGGGCCCCACAGATGTTAATCTTAAACCACCCGGCGATCGGTGGTTTCATGACGCATTGTGGTTGG AATGCGGTGGTGGAAGCGCTTTCTTTCGGTGTTCCGACTATTACGCTTCCGGTTTTCTCGGAGCAG TTTTATACTGAGAGACTGATATCACAAGTGCTCAAGACTGGTGTCGAGGTCGGTGCAGAGAAGTGG ACCTATGCATTTGATGCGGGGAAATATCCGGTGAGTCGGGAAAAGATAGCGACGGCGGTGAAGAAG ATATTAGACTGTGGAGAAGAGGCAGAAGGAATGAGAAAGCGGGCCAGGGAGATGAAAGAAAAAGCC CAAAAAAGTGTTGAAGAAGGTGGGTCCTCTTATAATAATTTAACGGCTATGATTGAAGATCTTAAA GAATTTAGGGCTAACAATGGCAAGGTTGCATGA
SEQ ID NO: 28 - Q. saponaria QA-GlcpA-Galp-1,3-L-rhamnosyltransferase (Qs_0283850) translated nucleotide sequence (494 aa):
MVSGDDDVSRRPLKVYFIAHPSPGHIAPLTKIAQLFAARGEHVTILTTPANVHFHEKSIDKGKTSG YHVNIHAVKFPSKEVGLPDGIENFSHASDNETAAKIWAGFSMLQTEMEQYMEQNPPDCIVADMFNR WTSDFAIKLGIPRIVFNVYCIFTRCLEEAIRSPDSPHLKLNSDNEQFIIPGLPDPITITRAQLPDG AFSVVKEQVSEAELKSFGMVINGFSELETEYIEYYKNIMGRKRIWHVGPLQLIYQNDDPKVQRSQK TAVVSDNELVSWLDSKKPDSVIYISFGSAIRFSNKQLYEIACGLEASGYPFLWALLWVPEDDDDVG NKWLPDFEERIKRENKGIIFRGWAPQMLILNHPAIGGFMTHCGWNAVVEALSFGVPTITLPVFSEQ FYTERLISQVLKTGVEVGAEKWTYAFDAGKYPVSREKIATAVKKILDCGEEAEGMRKRAREMKEKA QKSVEEGGSSYNNLTAMIEDLKEFRANNGKVA
SEQ ID NO: 29 - Q. saponaria QA-GlcpA-Galp-1,3-L-rhamnosyltransferase (TRINITY_DN20529_c0_g2_i8) coding sequence (1491 bp):
ATGGTCTCCGGCGACGATACCGTTTCACGGCCACTGATAGTTTACTTTATTGCACACCCCTCACCT GGCCATATTGCCCCTCTAACCAAAATAGCCCAACTCTTCGCTGCACGTGGTGAGCACGTCACTATT CTTACTACTCCCGCCAATGTCCACTTCCATGAGAAATCCATCGACAAAAGAAAGAATTCCGGCTAT CATGTTAACATCCACACCGTTAAATTTCCTTCTAAAGAGGTCGGTCTCCCTGACGGCATCGAAAAC TTCTCTCACGCCTCCGATAATGAAACAGCAGCCAAAATTTGGGCCGGATTCTCCATGCTTCAAACT GAAATGGAGCAATATATGGAACAAAACCCACCCGATTGCATCGTTGCCGACATGTTCAACCGCTGG ACTTCCGACTTCGCTATCAAATTGGGAATCCCGAGAATAGTTTTCAACGTCTACTGTATTTTCACA CGCTGTTTGGAAGAAGCAATCAGATCACCTGACTCGCCACACTTGAAACTAAACTCCGATAATGAA
CAGTTTATTATTCCCGGTCTACCCGACCCCATAACAATTACCCGAGCTCAACTCCCCGACGGTGCC TTTTCTGTCGTCAAAGAACAAGTTAGTGAAGCTGAGTTGAAAAGCTTCGGAATGGTGATCAACGGG TTTTCCGAACTCGAAACTGAATACATCGAGTATTACAAGAATATCATGGGTCGCAAACGGATTTGG CATGTCGGACCCCTTCAGCTAATTTACCAAAACGACGACCCCAAAGTTCAGAGGAGCCAGAAGACA GCGGTCTTGAGTGACAACGAGTTAGTGAGTTGGCTTGACTCGAAGAAACCCGACTCAGTGATTTAC ATTTCCTTCGGTAGTGCAATTCGTTTCTCTAATAAGCAGCTCTATGAAATCGCATGTGGATTAGAA GCTTCCGGCTACCCATTTTTGTGGGCCTTACTTTGGGTGCCAGAAGATGATGACGACGTGGGCAAC AAATGGTTGCCGGGTTTCGAAGAAAGAATAAAAAGAGAAAATAAGGGAATAATTTTCAGGGGGTGG GCCCCACAGATGTTAATCTTAAACCACCCGGCGATCGGTGGTTTCATGACGCATTGTGGTTGGAAT GCGGTGGTGGAAGCACTTTCATTCGGTGTTCCGACTATTACGCTTCCAGTTTTCTCGGAGCAGTTT TATACTGAGAGACTGATATCACAAGTGCTCAAGACTGGTGTGGAGGTTGGTGCAGAGAAGTGGACC TATGCATTTGATGCGGGGAAATATCCGGTGAGTAGGGAAAAGATAGCGACGGCGGTGAAGAAGATA TTAGACGATGGAGAAGAGGCAGAAGGAATGAGAAAGCGGGCCAGGGAGATGAAAGAAAAAGCCCAA AAAAGTGTTGAAGAAGGTGGATCCTCTTATAATAATTTAACGGCTATGATTGAAGATCTTAAAGAA TTTAGGGCTAACAATGGCAAGGCTGCAATGAAATCATGA
SEQ ID NO: 30 - Q. saponaria QA-GlcpA-Galp-1,3-L-rhamnosyltransferase (TRINITY_DN20529_c0_g2_i8) translated nucleotide sequence (496 aa):
MVSGDDTVSRPLIVYFIAHPSPGHIAPLTKIAQLFAARGEHVTILTTPANVHFHEKSIDKRKNSGY HVNIHTVKFPSKEVGLPDGIENFSHASDNETAAKIWAGFSMLQTEMEQYMEQNPPDCIVADMFNRW TSDFAIKLGIPRIVFNVYCIFTRCLEEAIRSPDSPHLKLNSDNEQFIIPGLPDPITITRAQLPDGA FSVVKEQVSEAELKSFGMVINGFSELETEYIEYYKNIMGRKRIWHVGPLQLIYQNDDPKVQRSQKT AVLSDNELVSWLDSKKPDSVIYISFGSAIRFSNKQLYEIACGLEASGYPFLWALLWVPEDDDDVGN KWLPGFEERIKRENKGIIFRGWAPQMLILNHPAIGGFMTHCGWNAVVEALSFGVPTITLPVFSEQF YTERLISQVLKTGVEVGAEKWTYAFDAGKYPVSREKIATAVKKILDDGEEAEGMRKRAREMKEKAQ KSVEEGGSSYNNLTAMIEDLKEFRANNGKAAMKS
SEQ ID NO: 31 - Q. saponaria QA-GlcpA-Galp-1,3-D-xylosyltransferase (Qs_0283870) coding sequence (1515 bp):
ATGGTCTCCGGCGACGACGATGTTTCTCGTCGGCCACTGAAAGTTTACTTCATTGCACACCCCTCA CCTGGCCATATTGCCCCTCTGACCAAAATAGCCCATCTCTTCGCTGCCCTCGGTGAGCACGTGACT ATTCTCACTACTCCCGCCAATGTCCACTTCCATGAGAAATCCATCGACAAAGGAAAGGCTTCCGGC TATCATGTTAACATCCACACCGTTAAATTTCCTTCTAAAGAGGTCGGTCTCCCTGACGGCATCGAA
AACTTCTCTTACGCCTCCGATGTTGAAACAGCAGCTAAAATTTGGGCTGGATTCGCCATGCTACAA ACTGAAATGGAGCAATATATGGAGCTTAACCCACCCGATTGCATCGTTGCCGACATGTTCACCTCC TGGACCTCCGACTTTGCTATCAAATTGGGAATCACAAGAATCGTTTTCAACGTCTATTGTATTTTC ACACGCTGTTTGGAAGAAGCCATCCGATCACCGGACTCGCCACACTTGAACAAAGAAATCTCTGAT AATGAACCGTTTGTTATCCCGGGTCTACCAGACCCCATAACAATTACCCGAGCTCAACTGCCCGAC GGTACCTTTTCTCCCATGAAAGAACTAGCTAGAACAGCTGAGTTGAAGAGCTTTGGAATGGTGATC AACGGGTTTTCCGAACTCGAAACCGATTACATCGAGCATTACAAGAAAATCATGGGTCACAAACGG ATTTGGCATGTCGGACCCCTTCAGCTAATCCACCGTAACGATGAAGACAAAATTCAGAGGAGCCAC AAGACAGCGGTGCTGAGTGATAACGATAACGAGTTAGTGAGTTGGCTTAACTCGAAGAAACCCGAC TCAGTTATTTACATTTGCTTCGGTAGTGCAACTCGTTTCTCTAATCACCAGCTCTATGAAATCGCC TGTGGATTAGAAGCTTCCGGGCACCCATTTTTGTGGGGCCTACTTTGGGTGCCAGAAGATGAAGAT AACGATGACGTGGGCAACAAATGGTTGCCAGCTTTCGAAGAAAGAATTAAAAAGGAAAATAAGGGA ATGATTTTAAGGGGGTGGGCTCCACAGATGTTAATCTTGAATCACCCGGCGATCGGTGGTTTCATG ACGCATTGTGGTTGGAATGCGGCGGTGGAGGCGCTTTCTTCCGGTGTTCCGATTATTACATTTCCG GTTTTCTCGGATCAGTTTTATAATGAAAGGCTGATATCACAAGTGCATAAGTGTGGGGTGGGGGTT GGTACGGAGGCGTGGAGCTATGCATTCGATGCCGGGAAGAATCCGGTGGGTCGGGAAAAGATAATG ACGGCGGTGAAGAAGATATTAGACGGTGGAGAAGAGGCGGAAGGAATGAGAAAGAGGGCCCGGGAG CTGAAAGAAATAGCTAAAAGAAGTGTGGAAGAAGGTGGGTCCTCTTATAATAATTTAACGGCTATG ATTCAAGATCTGAAAGAATTTAGAGCTAACAATGGCAAGGCTGCACAAGATCATGAATCGTGA
SEQ ID NO: 32 - Q. saponaria QA-GlcpA-Galp-1,3-D-xylosyltransferase (Qs_0283870) translated nucleotide sequence (504 aa):
MVSGDDDVSRRPLKVYFIAHPSPGHIAPLTKIAHLFAALGEHVTILTTPANVHFHEKSIDKGKASG YHVNIHTVKFPSKEVGLPDGIENFSYASDVETAAKIWAGFAMLQTEMEQYMELNPPDCIVADMFTS WTSDFAIKLGITRIVFNVYCIFTRCLEEAIRSPDSPHLNKEISDNEPFVIPGLPDPITITRAQLPD GTFSPMKELARTAELKSFGMVINGFSELETDYIEHYKKIMGHKRIWHVGPLQLIHRNDEDKIQRSH KTAVLSDNDNELVSWLNSKKPDSVIYICFGSATRFSNHQLYEIACGLEASGHPFLWGLLWVPEDED NDDVGNKWLPAFEERIKKENKGMILRGWAPQMLILNHPAIGGFMTHCGWNAAVEALSSGVPIITFP VFSDQFYNERLISQVHKCGVGVGTEAWSYAFDAGKNPVGREKIMTAVKKILDGGEEAEGMRKRARE LKEIAKRSVEEGGSSYNNLTAMIQDLKEFRANNGKAAQDHES
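Each coding sequence above is paired with a translated sequence, and the stated lengths are related by simple arithmetic: a coding sequence of N bp that ends in a stop codon encodes N/3 - 1 residues (for example, 2124 bp gives 707 aa). The following is a minimal sketch of how that relationship, and the translation itself, could be checked. It assumes the standard nuclear genetic code applies to these sequences; the function names `translate` and `expected_protein_length` are hypothetical and are not part of the listing.

```python
# Illustrative helpers (not from the listing): translate a coding sequence
# with the standard genetic code and relate bp length to protein length.

BASES = "TCAG"
# Amino acids for all 64 codons in TCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def clean(seq: str) -> str:
    """Remove the whitespace that separates sequence blocks and upper-case."""
    return "".join(seq.split()).upper()


def translate(cds: str) -> str:
    """Translate a DNA coding sequence codon by codon ("*" for stop)."""
    cds = clean(cds)
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def expected_protein_length(cds_bp: int, has_stop: bool = True) -> int:
    """Residue count implied by a coding sequence of cds_bp base pairs."""
    codons, remainder = divmod(cds_bp, 3)
    if remainder:
        raise ValueError("coding sequence length is not a multiple of 3")
    return codons - 1 if has_stop else codons


if __name__ == "__main__":
    # Lengths taken from the SEQ ID NO: 25/26 headers.
    print(expected_protein_length(2124))
    # First ten codons of SEQ ID NO: 27.
    print(translate("ATGGTCTCCG GCGACGACGA CGTTTCTCGT"))
```

Comparing `translate(...)` of a full coding sequence with the listed protein (ignoring the trailing `*`) is one way to spot transcription errors in either member of a pair.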