METABOLIC ENGINEERING

20230279444 · 2023-09-07

    Abstract

    The invention relates generally to materials and methods for biosynthesising quillaic acid (QA) in a host by expressing heterologous nucleotide sequences in the host, each of which encodes a polypeptide, which polypeptides in combination have said QA biosynthesis activity. Example polypeptides include (i) a β-amyrin synthase; (ii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid; (iii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16α position to an alcohol; and (iv) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde. Preferred nucleotide sequences are obtained from, or derived from, Q. saponaria.

    Claims

    1. A method of converting a host from a phenotype whereby the host is unable to carry out quillaic acid (QA) biosynthesis from 2,3-oxidosqualene (OS) to a phenotype whereby the host is able to carry out said QA biosynthesis, which method comprises the step of expressing a heterologous nucleic acid within the host or one or more cells thereof, following an earlier step of introducing the nucleic acid into the host or an ancestor of either, wherein the heterologous nucleic acid comprises a plurality of nucleotide sequences each of which encodes a polypeptide which in combination have said QA biosynthesis activity.

    2. A method as claimed in claim 1 wherein the nucleic acid encodes all of the following polypeptides (i) a β-amyrin synthase (bAS) for cyclisation of OS to a triterpene; (ii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid (“C-28 oxidase”); (iii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16α position to an alcohol (“C-16α oxidase”); and (iv) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde (“C-23 oxidase”), wherein each of the polypeptides is optionally obtained from Q. saponaria.

    3. A method as claimed in claim 2 wherein the C-28 oxidase, C-16α oxidase, and C-23 oxidase are all CYP450 enzymes.

    4. A method as claimed in claim 3 wherein (i) the C-28 oxidase is a CYP716; (ii) the C-16α oxidase is a CYP716 or CYP87; (iii) the C-23 oxidase is a CYP714, CYP72, or CYP94.

    5. A method as claimed in claim 4 wherein the bAS, C-28 oxidase, C-16α oxidase, and C-23 oxidase polypeptides are selected from the respective polypeptides in Tables 1 or 2, or substantially homologous variants or fragments of any of said polypeptides, optionally as defined in Table 4 or are encoded by the respective polynucleotides in Tables 1 or 2, or substantially homologous variants or fragments of any of said polynucleotides, optionally as defined in Table 4.

    6. A method as claimed in claim 5 wherein the polypeptides are selected from the list consisting of: (i) the β-amyrin synthase (bAS) shown in SEQ ID: No 2; (ii) the C-28 oxidase shown in SEQ ID: No 4 or 18 or as encoded by any of SEQ ID NOs: 19-28; (iii) the C-16α oxidase shown in SEQ ID: No 6, 10 or 12; (iv) the C-23 oxidase shown in the SEQ ID: No 8, 14 or 16; or substantially homologous variants or fragments of any of said polypeptides.

    7. A method as claimed in claim 6 wherein the polypeptides are selected from the list consisting of: (i) the β-amyrin synthase (bAS) shown in SEQ ID: No 2; (ii) the C-28 oxidase shown in SEQ ID: No 4; (iii) the C-16α oxidase shown in SEQ ID: No 6; (iv) the C-23 oxidase shown in the SEQ ID: No 8; or substantially homologous variants or fragments of any of said polypeptides.

    8. A method as claimed in claim 1 wherein the nucleic acid further encodes one or more of the following polypeptides: (i) an HMG-CoA reductase (HMGR); (ii) a squalene synthase (SQS); wherein the HMGR or SQS are optionally selected from the respective polypeptides in Table 3 or substantially homologous variants or fragments of any of said polypeptides, or are encoded by the respective polynucleotides in Table 3, or substantially homologous variants or fragments of any of said polynucleotides.

    9. A method as claimed in claim 1 wherein the nucleotide sequences are present on two or more different nucleic acid molecules.

    10. A method as claimed in claim 9 wherein the nucleic acid molecules are introduced by co-infiltration of a plurality of Agrobacterium tumefaciens strains each carrying one or more of the nucleic acid molecules.

    11. A method as claimed in claim 10 wherein the nucleic acid molecules are transient expression vectors, wherein each of the transient expression vectors comprises an expression cassette comprising: (i) a promoter, operably linked to (ii) an enhancer sequence derived from the RNA-2 genome segment of a bipartite RNA virus, in which a target initiation site in the RNA-2 genome segment has been mutated; (iii) a nucleotide sequence encoding one of the polypeptides which in combination have said QA biosynthesis activity; (iv) a terminator sequence; and optionally (v) a 3′ UTR located upstream of said terminator sequence.

    12. (canceled)

    13. A host cell containing or transformed with a heterologous nucleic acid which comprises a plurality of nucleotide sequences each of which encodes a polypeptide which in combination have quillaic acid (QA) from 2,3-oxidosqualene (OS) biosynthesis activity, wherein expression of said nucleic acid imparts on the transformed host the ability to carry out QA biosynthesis.

    14. A host cell as claimed in claim 13 wherein the heterologous nucleic acid encodes all of the following polypeptides: (i) a β-amyrin synthase (bAS) for cyclisation of OS to a triterpene; (ii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-28 position to a carboxylic acid (“C-28 oxidase”); (iii) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-16α position to an alcohol (“C-16α oxidase”); and (iv) an enzyme capable of oxidising β-amyrin or an oxidised derivative thereof at the C-23 position to an aldehyde (“C-23 oxidase”), wherein each of the polypeptides is optionally obtained from Q. saponaria.

    15. A process for producing the host cell of claim 13 by co-infiltrating a plurality of recombinant constructs comprising said nucleic acid into the cell for transient expression thereof.

    16. A process for producing the host cell of claim 13 by transforming a cell with heterologous nucleic acid by introducing said nucleic acid into the cell via a vector and causing or allowing recombination between the vector and the cell genome to introduce the nucleic acid into the genome.

    17. A method for producing a transgenic plant, which method comprises the steps of: (a) performing a process as claimed in claim 16 wherein the host cell is a plant cell, (b) regenerating a plant from the transformed plant cell.

    18. A transgenic plant which is obtained by the method of claim 17, or which is a clone, or selfed or hybrid progeny or other descendant of said transgenic plant, wherein expression of said heterologous nucleic acid imparts an increased ability to carry out QA synthesis compared to a wild-type plant otherwise corresponding to said transgenic plant.

    19. A plant as claimed in claim 18 which is a crop plant or a moss.

    20. A host cell as claimed in claim 13 which is a microorganism.

    21. A host cell as claimed in claim 20 which is a yeast.

    22. A host cell as claimed in claim 21 which further contains or is transformed with heterologous nucleic acid which comprises one or more nucleotide sequences each of which encodes a polypeptide which is a plant cytochrome P450 reductase (CPR).

    23. A host cell as claimed in claim 22 wherein the CPR is shown in SEQ ID No: 35 or is a substantially homologous variant or fragment of said polypeptide.

    24. (canceled)

    25. A method of producing a product which is QA or a derivative thereof in a heterologous host, which method comprises culturing a host cell as claimed in claim 13.

    26. A method of producing a product which is QA or a derivative thereof in a heterologous host, which method comprises growing a plant as claimed in claim 18 and then harvesting it and purifying the product therefrom.

    27. (canceled)

    28. An isolated nucleic acid molecule which nucleic acid comprises a nucleotide sequence which is selected from the group consisting of: (i) a nucleotide sequence which encodes all or part of SEQ ID NO: 2, 4, 6, or 8; (ii) a nucleotide sequence which encodes a variant sequence which is a homologous variant of any of these SEQ ID NOs sharing at least about 60% identity therewith; and\or (iii) a nucleotide sequence which is selected from SEQ ID NO: 1, 3, 5, or 7 or the genomic equivalent thereof.

    29. A nucleic acid as claimed in claim 28 wherein the QA-nucleotide sequence encodes a derivative of the amino acid sequence shown in SEQ ID NO: 2, 4, 6, or 8 by way of addition, insertion, deletion or substitution of one or more amino acids.

    30. An isolated polypeptide encoded by the nucleic acid of claim 28.

    31. (canceled)

    32. A recombinant vector which comprises the nucleic acid of claim 28.

    33. A vector as claimed in claim 32 wherein the nucleic acid is operably linked to a promoter for transcription in a host cell, wherein the promoter is optionally an inducible promoter.

    34. (canceled)

    35. (canceled)

    36. A method which comprises the step of introducing the vector of claim 32 into a host cell.

    37. A host cell containing or transformed with a vector according to claim 32.

    38. A host cell as claimed in claim 37 which is microbial, optionally a yeast cell.

    39. A host cell which is a plant cell having a heterologous nucleic acid as claimed in claim 28 within its chromosome.

    40. A method which comprises the step of introducing the vector of claim 32 into a host cell, and causing or allowing recombination between the vector and the host cell genome such as to transform the host cell.

    41. A method for producing a transgenic plant, which method comprises the steps of: (a) performing a method as claimed in claim 40 wherein the host cell is a plant cell, (b) regenerating a plant from the transformed plant cell.

    42. A transgenic plant which is obtained by the method of claim 41, or which is a clone, or selfed or hybrid progeny or other descendant of said transgenic plant, which in each case includes a heterologous nucleic acid having a nucleotide sequence which is selected from the group consisting of: (i) a nucleotide sequence which encodes all or part of SEQ ID NO: 2, 4, 6, or 8; (ii) a nucleotide sequence which encodes a variant sequence which is a homologous variant of any of these SEQ ID NOs sharing at least about 60% identity therewith; and\or (iii) a nucleotide sequence which is selected from SEQ ID NO: 1, 3, 5, or 7 or the genomic equivalent thereof.

    43. A plant which comprises a heterologous nucleic acid having a nucleotide sequence which is selected from the group consisting of: (i) a nucleotide sequence which encodes all or part of SEQ ID NO: 2, 4, 6, or 8; (ii) a nucleotide sequence which encodes a variant sequence which is a homologous variant of any of these SEQ ID NOs sharing at least about 60% identity therewith; and\or (iii) a nucleotide sequence which is selected from SEQ ID NO: 1, 3, 5, or 7 or the genomic equivalent thereof.

    44. A method for influencing or affecting the QA biosynthesis in a cell, the method comprising the step of causing or allowing expression of a heterologous nucleic acid as claimed in claim 28 within the cell.

    Description

    FIGURES

    [0228] FIG. 1: QS-21.

    [0229] FIG. 2: Production of quillaic acid via β-amyrin, from common universal precursors. The pathway from β-amyrin requires oxidation at three positions (C-16α, C-23 and C-28). These oxidation steps are shown in a linear fashion for simplicity only; as explained above, they can in principle proceed in another sequence (see FIG. 11).

    [0230] FIG. 3: PCR amplification of candidate genes in leaf (L) and root (R) tissue of Q. saponaria. It was possible to get a product for most candidates in both tissues.

    [0231] FIG. 4: Expression of Q. saponaria β-amyrin synthase (QsbAS) in Nicotiana benthamiana. GC-MS analysis of leaf extracts reveals production of β-amyrin only in leaves expressing the cloned β-amyrin synthase, but not in control (GFP) leaves.

    [0232] FIG. 5: Conversion of β-amyrin by P450s from Q. saponaria. Two P450s in the CYP716 family were found to oxidise β-amyrin. Left side: GC-MS analysis of N. benthamiana leaf extracts showing that CYP716-2073932 converted the majority of β-amyrin to a new product identified as oleanolic acid at 12.08 min. The mass spectrum for this product versus an authentic oleanolic acid standard is shown on the right side. CYP716-2012090 (both long and short isoforms) converted a small amount of β-amyrin to a product putatively identified as 16α-hydroxy-β-amyrin (marked with *). The mass spectrum for this product is given in FIG. 5S.

    [0233] FIG. 5S: EI mass spectrum for the putative 16α-hydroxy-β-amyrin. Trace amounts of this product were formed upon coexpression of QsbAS and CYP716-2012090.

    [0234] FIG. 6A: Conversion of oleanolic acid to echinocystic acid by CYP716-2012090. Left side: GC-MS analysis of N. benthamiana leaf extracts showing that coexpression of the two CYP716 members from Q. saponaria with QsbAS and CYP716-2073932 results in accumulation of a product at 12.42 min identified as echinocystic acid. The mass spectrum for this compound versus an authentic echinocystic acid standard is shown on the right side.

    [0235] FIG. 6B: Conversion of oleanolic acid to hederagenin by OQHZ-2018687. Screening C-23 oxidase candidates for oleanolic acid-oxidising activity revealed a new product in samples expressing candidates #6 and #7 (which carry the same enzyme, also referred to as CYP714-7 herein). This new product had an identical retention time and mass spectrum to a 23-hydroxy-oleanolic acid (hederagenin) standard, suggesting that the enzyme is a C-23 oxidase.

    [0236] FIG. 7: LC-MS analysis of leaf extracts of N. benthamiana expressing combinations of QsbAS and the C-28 (CYP716-2073932), C-16α (CYP716-2012090) and C-23 (CYP714-7) oxidases from Q. saponaria. Quillaic acid (19.886 min) was observed only in the samples expressing all three P450s. Mass spectra for the various samples at 19.886 min are shown below along with a quillaic acid standard.

    [0237] FIG. 8: Comparison of quillaic acid production between plant samples expressing different C-23 oxidases. All samples derive from leaves expressing tHMGR, QsbAS, and Q. saponaria C-28 (CYP716-2073932) and C-16α (CYP716-2012090) oxidases. The C-23 oxidases were derived from either Q. saponaria (CYP714-7, top), M. truncatula (CYP72A68, 2nd down) or A. strigosa (CYP94D65, 3rd down).

    [0238] The CAD chromatogram is shown at the top. Mass spectra (negative mode) of interest are shown below.

    [0239] An ion with m/z 485 (shown in red) was common to both the quillaic acid standard and the novel peak in tHMGR/QsbAS/CYP716-2073932/CYP716-2012090/CYP94D65 samples. This ion fits the expected molecular mass of quillaic acid (minus H⁺). *A second compound with m/z 487 was found in high abundance and was putatively identified as caulophyllogenin (featuring a C-23 alcohol instead of the aldehyde seen in quillaic acid). Mass spectra for these products are shown in FIG. 8S.

    [0240] Fewer alternative C-23-oxidised side products, including the C-23 alcohol (caulophyllogenin) and acid (16α-hydroxy-gypsogenic acid (16OH-GA)), were found in the Q. saponaria C-23-expressing sample, suggesting greater specificity for production of the aldehyde.
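
    The m/z assignments in the FIG. 8 description can be checked with a short calculation. The sketch below is illustrative only: the elemental formulas (C30H46O5 for quillaic acid, C30H48O5 for the putative C-23 alcohol analogue) are standard literature values assumed here, not values stated in this document, and the function name is hypothetical.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
PROTON = 1.00727647  # mass of the H+ lost on deprotonation (negative mode)


def deprotonated_mz(formula: dict[str, int]) -> float:
    """Return m/z of the singly charged [M-H]- ion for an elemental formula."""
    neutral = sum(MONOISOTOPIC[element] * count for element, count in formula.items())
    return neutral - PROTON


# Assumed literature formulas (not given in the text above).
QUILLAIC_ACID = {"C": 30, "H": 46, "O": 5}
C23_ALCOHOL_ANALOGUE = {"C": 30, "H": 48, "O": 5}  # C-23 alcohol instead of aldehyde

qa_mz = deprotonated_mz(QUILLAIC_ACID)
alcohol_mz = deprotonated_mz(C23_ALCOHOL_ANALOGUE)

print(f"quillaic acid [M-H]-:        {qa_mz:.3f}")
print(f"C-23 alcohol analogue [M-H]-: {alcohol_mz:.3f}")
```

    Under these assumptions the two ions round to m/z 485 and 487, and the difference of two hydrogen atoms corresponds to reduction of the C-23 aldehyde to an alcohol.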

    [0241] FIG. 9: Expression of Q. saponaria genes in yeast. GC-MS traces are given at the top for the different strains, mass spectra for peaks of interest are given below.

    [0242] FIG. 10: A) Simplified overview of the mevalonate (MVA) pathway required for triterpene biosynthesis and potential rate-limiting enzymes. B) β-amyrin content in N. benthamiana can be improved from coexpression of tHMGR or SQS with an oat β-amyrin synthase (AsbAS). C) Coexpression of SQS with tHMGR further improves β-amyrin content over tHMGR alone.

    [0243] FIG. 11: Oxidised derivatives of β-amyrin.

    [0244] FIG. 12: Biosynthesis of quillaic acid from 2,3-oxidosqualene and the associated enzymes from Q. saponaria. The oxidation steps may not occur exactly in this order.

    [0245] FIG. 13: LC-CAD analysis of representative leaves expressing the four characterised enzymes from Q. saponaria required to make quillaic acid (upper). As a control, the C-16α oxidase was excluded (lower); these leaves instead accumulate the precursor gypsogenin (see FIG. 12).

    [0246] FIG. 14: LC analysis of a quillaic acid standard versus the product isolated from N. benthamiana. A) LC-CAD traces showing analysis of the isolated product (middle) and the quillaic acid standard (lower). Both samples showed a major peak at 19.5 minutes. A methanol-only blank run is shown in the top trace. B) MS (ESI/APC) analysis of the product at 19.5 minutes in both positive (upper) and negative (lower) mode. The isolated product is shown to the left with the quillaic acid standard on the right.

    [0247] FIG. 15: GC-MS analysis of a quillaic acid standard versus the product isolated from N. benthamiana. A) The standard is shown in the lower trace, with the isolated product shown in the upper trace. Both samples showed a major peak at 15.3 minutes. B) Comparison of EI mass spectra of the two products at 15.3 min. The isolated product is shown above, with the quillaic acid standard below.

    [0248] FIG. 16: ¹H NMR (methanol-d₄) comparison of a quillaic acid standard (bottom) versus the isolated product from N. benthamiana (top).

    EXAMPLES

    Example 1 - Mining for Candidate Quillaic Acid Biosynthetic Genes in a Q. saponaria Transcriptome

    [0249] Recently, a transcriptomic dataset from Q. saponaria was made available through the 1KP project [1]. This dataset is derived from HiSeq sequencing (Illumina) of Q. saponaria leaf tissue.

    [0250] Although commercial sources of QS-21 are usually derived from bark, the leaf tissue has also been shown to be a substantial source of QS-21 and other saponins [2], so we reasoned the relevant biosynthetic genes might be present in this database. The transcriptome dataset was mined for potential biosynthetic genes.

    β-Amyrin Synthase

    [0251] The first candidate searched for was the β-amyrin synthase (bAS) OSC. Numerous bAS enzymes are characterised, including from related Fabales species.

    [0252] A bAS enzyme from Glycyrrhiza glabra (Genbank ID Q9MB42.1) was used as a query to identify OSC sequences. This returned a single full-length sequence (OQHZ-2074321) predicted to be a triterpene synthase (henceforth referred to as QsbAS).

    [0253] Other partial OSC sequences were also identified in this dataset; however, these were predicted to be sterol (cycloartenol) synthases and were discounted.

    [0254] The full nucleotide and predicted protein sequence of QsbAS are given as SEQ ID NOs: 1 and 2 in Sequence Appendix A.

    β-Amyrin Oxidases

    [0255] We surmised that a likely class of enzymes responsible for oxidation of β-amyrin would be cytochrome P450s (P450s). These enzymes are encoded by very large gene superfamilies with usually more than 200 representatives in a single plant genome.

    [0256] Although function is often difficult to predict based on sequence homology, in recent years the CYP716 family has emerged as a preeminent family of triterpene oxidases [3]. Previously, 11 CYP716s had been characterised as β-amyrin C-28 oxidases (Sequence Appendix B). These P450s were isolated from taxonomically distinct species (including Fabales species), suggesting that C-28 oxidation of β-amyrin in Q. saponaria may be catalysed by a member of this family.

    [0257] Furthermore CYP716 enzymes have also been shown to be capable of catalysing oxidation at other (non-C-28) positions around the β-amyrin scaffold, including one C-16α oxidase (CYP716Y1), from Bupleurum falcatum (Sequence Appendix B). Two full-length CYP716s were identified in the transcriptome dataset, using the Medicago truncatula C-28 oxidase CYP716A12 as a search query. These are OQHZ-2073932 and OQHZ-2012090 (which may be referred to herein as CYP716-2073932 and CYP716-2012090).

    [0258] (Note that CYP716-2073932 has also been formally designated CYP716A224 by the P450 nomenclature committee [3]). The full nucleotide and predicted protein sequence of these CYP716s are given as SEQ ID NOs: 3 and 4 in Sequence Appendix A.

    Example 2 - Cloning Candidate Genes From Q. saponaria

    [0259] Q. saponaria trees were sourced from a nursery (Burncoose Nurseries, Cornwall) within the UK. RNA was extracted from the leaves and roots of a single tree using a Qiagen RNeasy Plant RNA extraction kit, with a modified protocol as detailed by [26]. This RNA was then used as a template for cDNA synthesis using Superscript III (Invitrogen) according to the manufacturer’s instructions.

    [0260] For amplification of target genes, primers were designed for each of the four genes described above (SEQ ID NOs: 1, 3, 5, and 7). For CYP716-2012090, two sets of primers were designed allowing cloning of both long and short isoforms of the protein, differing at the N-terminus by 21 amino acids. This was due to poor alignment of this region with other characterised CYP716s.

    [0261] Each of the primers incorporated attB adapters at the 5′ end to allow directional Gateway®-based cloning. In the table below, each primer is written with the adapter first, followed after a space by the gene-specific sequence, in the 5′ -> 3′ direction.

    TABLE-US-00002
    Primer name | Sequence 5′ --> 3′
    QsbAS1_F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTA ATGTGGAGGCTGAAGATAGCAGAAGG
    QsbAS1_R | GGGGACCACTTTGTACAAGAAAGCTGGGTA TTAAGGCAATGGAACCCGCCTCC
    QsCYP716_2012090L_F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTA ATGATATATAATAATGATAGTAATGATAATG
    QsCYP716_2012090S_F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTA ATGGATCCTTTCTTCATTTTTGGC
    QsCYP716_2012090_R | GGGGACCACTTTGTACAAGAAAGCTGGGTA TCATTGGTGCTTGTGAGG
    QsCYP716_2073932_F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTA ATGGAGCACTTGTATCTCTCCCTTGTG
    QsCYP716_2073932_R | GGGGACCACTTTGTACAAGAAAGCTGGGTA TCAAGCTTTGTGAGGATAAAGGCGAAC
    QsCYP714_2018687_F | GGGGACAAGTTTGTACAAAAAAGCAGGCTTA ATGTGGTTCACAGTAGGATTGG
    QsCYP714_2018687_R | GGGGACCACTTTGTACAAGAAAGCTGGGTA TTAGAGCTTCTTCATGATGACATTG
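
    The primer layout described above (attB adapter followed by the gene-specific sequence) can be sanity-checked programmatically. The following is a minimal sketch, not part of the original method: the helper names are hypothetical, and the only assumptions are that each forward primer's gene-specific part should begin with the ATG start codon and that each reverse primer's gene-specific part should be the reverse complement of a region ending in a stop codon.

```python
ATTB1 = "GGGGACAAGTTTGTACAAAAAAGCAGGCTTA"  # adapter on the forward primers
ATTB2 = "GGGGACCACTTTGTACAAGAAAGCTGGGTA"   # adapter on the reverse primers
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Primer sequences copied from the table above (adapter + gene-specific part).
PRIMERS = {
    "QsbAS1_F": ATTB1 + "ATGTGGAGGCTGAAGATAGCAGAAGG",
    "QsbAS1_R": ATTB2 + "TTAAGGCAATGGAACCCGCCTCC",
    "QsCYP716_2012090L_F": ATTB1 + "ATGATATATAATAATGATAGTAATGATAATG",
    "QsCYP716_2012090S_F": ATTB1 + "ATGGATCCTTTCTTCATTTTTGGC",
    "QsCYP716_2012090_R": ATTB2 + "TCATTGGTGCTTGTGAGG",
    "QsCYP716_2073932_F": ATTB1 + "ATGGAGCACTTGTATCTCTCCCTTGTG",
    "QsCYP716_2073932_R": ATTB2 + "TCAAGCTTTGTGAGGATAAAGGCGAAC",
    "QsCYP714_2018687_F": ATTB1 + "ATGTGGTTCACAGTAGGATTGG",
    "QsCYP714_2018687_R": ATTB2 + "TTAGAGCTTCTTCATGATGACATTG",
}


def revcomp(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gene_specific(name: str, primer: str) -> str:
    """Strip the expected attB adapter and return the gene-specific part."""
    adapter = ATTB1 if name.endswith("_F") else ATTB2
    if not primer.startswith(adapter):
        raise ValueError(f"{name}: expected adapter not found")
    return primer[len(adapter):]


def primer_ok(name: str, primer: str) -> bool:
    """Forward primers must start at ATG; reverse primers must cover a stop codon."""
    core = gene_specific(name, primer)
    if name.endswith("_F"):
        return core.startswith("ATG")
    return revcomp(core)[-3:] in STOP_CODONS


results = {name: primer_ok(name, seq) for name, seq in PRIMERS.items()}
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'CHECK'}")
```

    A check of this kind only confirms that each primer frames a complete open reading frame; it says nothing about melting temperature or specificity.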

    [0262] Two PCR reactions were performed for each gene, using either leaf or root cDNA as a template. As described above, two separate sets of PCRs were set up for CYP716-2012090, using different forward primers. PCRs were performed in a total volume of 50 μL using iProof (BioRad) with HF buffer according to the manufacturer’s instructions. For amplification of QsbAS and the CYP716 enzymes, PCR thermal cycling involved an initial denaturation step at 98° C. (30 sec), followed by 30 cycles of denaturation (98° C., 10 sec), annealing (50° C., 10 sec) and extension (72° C., 3 min), with a final extension at 72° C. (5 min). These parameters were identical for amplification of the CYP714, except that the extension time during the 30 cycles was reduced to 2 min.
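
    As an illustration of the cycling parameters just described, the sketch below encodes the two programs and sums their programmed hold times. The class and field names are hypothetical, and the totals exclude temperature ramping, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PcrProgram:
    """Hold times in seconds for a three-step PCR program."""
    initial_denaturation_s: int
    denaturation_s: int
    annealing_s: int
    extension_s: int
    cycles: int
    final_extension_s: int

    def hold_time_s(self) -> int:
        """Total programmed hold time, ignoring ramp times between steps."""
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        return self.initial_denaturation_s + self.cycles * per_cycle + self.final_extension_s


# QsbAS and CYP716 genes: 3 min extension per cycle.
QSBAS_CYP716 = PcrProgram(30, 10, 10, 180, 30, 300)
# CYP714: identical except for a 2 min extension per cycle.
CYP714 = PcrProgram(30, 10, 10, 120, 30, 300)

for label, program in (("QsbAS/CYP716", QSBAS_CYP716), ("CYP714", CYP714)):
    total = program.hold_time_s()
    print(f"{label}: {total} s ({total / 60:.1f} min)")
```

    With these values the shorter extension saves 30 × 60 s, i.e. 30 minutes of programmed time per run.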

    [0263] Successful amplification of all genes was observed using the cDNA from both root and leaf tissues as a PCR template (FIG. 3). PCR products derived from the leaf cDNA were further purified and recombined into a pDONR207 Entry vector as described previously [5]. The resulting plasmids were sequenced by Eurofins Genomics to verify the presence and sequence of the inserted genes. A single representative plasmid was chosen for each gene and recombined into the binary vector pEAQ-HT-DEST1 [4], before transformation into competent Agrobacterium tumefaciens as described previously [5]. For transient expression in N. benthamiana, A. tumefaciens strains were grown and prepared for infiltration as described previously [5, 27].

    Example 3 - Transient Expression of Q. saponaria Genes in N. benthamiana

    QsbAS Is a Monofunctional β-Amyrin Synthase

    [0264] Transient expression of the various cloned genes was performed in N. benthamiana. All combinations included coinfiltration of a strain carrying a feedback-insensitive truncated form of the A. strigosa HMG-CoA reductase (tHMGR). This enzyme has been demonstrated to increase triterpene content upon transient expression in N. benthamiana [5]. The sequences utilised are shown as SEQ ID Nos 29-32.

    [0265] Leaves were harvested, extracted and analysed by GC-MS as described previously [5]. GC-MS analysis of QsbAS-expressing leaves revealed the presence of a compound identified as β-amyrin by comparison with the retention time and mass spectrum of a β-amyrin standard (FIG. 4). No other new products were found in the chromatogram, suggesting that QsbAS is a monofunctional β-amyrin synthase.

    Discovery of the C-28 and C-16α Oxidases

    [0266] Next, QsbAS was tested with combinations of the various P450s. This revealed that both of the CYP716 enzymes showed activity towards β-amyrin. CYP716-2073932 was found to be the C-28 oxidase and converted most of the β-amyrin to oleanolic acid. CYP716-2012090 converted a small amount of β-amyrin to a product putatively identified as 16α-hydroxy-β-amyrin, based on comparison to previously published mass spectra [6, 7] (FIG. 5; FIG. 5S).

    [0267] When these two CYP716 enzymes were combined, a third product was identified with an identical retention time and mass spectrum to echinocystic acid, an intermediate to quillaic acid consisting of β-amyrin plus the C-28 carboxylic acid and C-16α alcohol (FIG. 6A).

    Example 4 - Discovery of the C-23 Oxidase From Q. saponaria

    [0268] Following the discovery of the C-28 and C-16α oxidases, attention was focussed on the outstanding Q. saponaria C-23 oxidase. The identification of the C-28 and C-16α oxidases was facilitated by homology-based searches using known triterpene-oxidising P450s. Other candidates were considered based on homology to known triterpene oxidases, including two CYP72 family members (OQHZ-2012357 and OQHZ-2019977), a family for which a C-23 oxidase has been identified in the related Fabaceae species Medicago truncatula. However, upon cloning and testing in planta, neither of these candidates displayed obvious activity towards β-amyrin or its C-28/C-16α oxidised derivatives (data not shown).

    [0269] Consequently, it was deduced that the outstanding Q. saponaria C-23 oxidase may be within a P450 family not previously implicated in triterpene oxidation.

    [0270] The 1KP transcriptome data was therefore searched for all putative cytochrome P450s.

    [0271] Approximately 150 P450-encoding contigs were found in the dataset. Out of these, 35 appeared to encode a full-length enzyme (approx. 1500 bp, see Table 5).

    TABLE-US-00003 List of all 35 full-length cytochrome P450s represented in the Q. saponaria 1KP dataset. Putative families/clans were assigned based on Genbank BLAST searches. Candidates anticipated to be involved in primary metabolism were not considered further. This resulted in 25 final candidates (“Quick Ref” column). Note that candidate names used here derive from the contig number of the independently assembled transcriptome. Consequently, this numbering results in a different naming system from the one used previously for the CYP716/CYP72 enzymes.
    Quick Ref | Name | Clan | Putative Family | Comments | Potential Candidate | Cloned/Tested
    - | >CYP51_c13199_g1_i1 | 51 | 51G | Sterol demethylase | |
    - | >CYP701_c35443_g1_i2 | 71 | 701A | Gibberellin biosynthesis | |
    1 | >CYP704_c31665_g1_i1 | 86 | 704C | | ✓ | ✓
    2 | >CYP704_c36842_g1_i1 | 86 | 704C | | ✓ | ✓
    3 | >CYP704_c36842_g1_i3 | 86 | 704C | | ✓ |
    - | >CYP707_c29564_g1_i1 | 85 | 707A | Abscisic acid deactivation | |
    4 | >CYP71_c35642_g1_i1 | 71 | 71D | | ✓ | ✓
    - | >CYP710_c19839_g1_i1 | 710 | 710A | Sterol C-22 desaturase | |
    5 | >CYP712_c19176_g1_i2 | 71 | 93A | | ✓ | ✓
    6 | >CYP714_c36368_g1_i1 | 72 | 714C | Identical to 7 | ✓ | ✓
    7 | >CYP714_c36368_g1_i2 | 72 | 714C | Q. saponaria C23 oxidase (1KP: OQHZ-2018687) | ✓ | ✓
    - | >CYP716_c41117_g1_i1 | 85 | 716A | Q. saponaria C28 oxidase (CYP716-2073932) | |
    - | >CYP716_c23557_g1_i1 | 85 | 716A | Q. saponaria C16α oxidase (CYP716-2012090) | |
    - | >CYP72_c34500_g2_i1 | 72 | 72A | Cloned (OQHZ-2012357) | |
    - | >CYP721_c37141_g1_i1 | 72 | 734A | Brassinosteroid inactivation | |
    - | >CYP73_c37071_g1_i2 | 71 | 73A | trans-Cinnamate 4-monooxygenase | |
    8 | >CYP74_c32585_g1_i1 | 71 | 74A | | ✓ |
    9 | >CYP75_c4825_g1_i1 | 71 | 75B | | ✓ |
    10 | >CYP75_c38772_g1_i1 | 71 | 75B | | ✓ | ✓
    11 | >CYP77_c33191_g1_i1 | 71 | 77A | | ✓ | ✓
    12 | >CYP78_c41068_g1_i1 | 71 | 78A | | ✓ |
    13 | >CYP81_c36730_g1_i2 | 71 | 81E | | ✓ |
    14 | >CYP82_c34310_g1_i1 | 71 | 82C | | ✓ |
    15 | >CYP82_c36962_g1_i1 | 71 | 82C | | ✓ |
    16 | >CYP82_c37078_g1_i1 | 71 | 82D | Identical to 17 | ✓ | ✓
    17 | >CYP82_c37078_g1_i2 | 71 | 82D | | ✓ | ✓
    18 | >CYP82_c3431_g1_i1 | 71 | 82D | | ✓ | ✓
    19 | >CYP84_c28124_g1_i1 | 71 | 84A | | ✓ | ✓
    20 | >CYP86_c36146_g2_i1 | 86 | 86A | | ✓ |
    21 | >CYP89_c37100_g1_i1 | 71 | 89A | | ✓ | ✓
    - | >CYP90_c31983_g1_i1 | 85 | 90A | Brassinosteroid biosynthesis | |
    22 | >CYP92_c28169_g1_i1 | 71 | 71A | | ✓ |
    23 | >CYP94_c30674_g1_i1 | 86 | 94A | | ✓ | ✓
    24 | >CYP94_c11979_g1_i1 | 86 | 94A | | ✓ | ✓
    25 | >CYP96_c36742_g2_i1 | 86 | 86B | | ✓ |

    [0272] Amongst these full-length contigs were the C-28 and C-16α oxidases described above. It was therefore reasoned that the outstanding C-23 oxidase might also be represented within these sequences.

    [0273] The 35 P450 candidates were further assigned putative clan and families based on their homology to named P450s from other species (Table 5). A number of the candidates were anticipated to be involved in primary metabolism (and shared a high degree of sequence conservation to enzymes from unrelated species such as Arabidopsis), and were subsequently eliminated from the list.

    [0274] This gave a final list of 25 candidates, for which cloning primers were ordered. For easy reference, these are numbered 1-25 in Table 5 and described herein using these numbers.
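
    The filtering and numbering step described in paragraphs [0273] and [0274] can be sketched as follows. This is an illustration only: the contig names are transcribed from Table 5 (without the leading ">" and with names normalised to the CYP prefix), and the exclusion labels are shorthand for the table comments, with the previously characterised CYP716 and CYP72 entries treated here as excluded because they carry no Quick Ref number.

```python
# (contig name, reason for exclusion or None), in the order listed in Table 5.
ROWS = [
    ("CYP51_c13199_g1_i1", "sterol demethylase"),
    ("CYP701_c35443_g1_i2", "gibberellin biosynthesis"),
    ("CYP704_c31665_g1_i1", None),
    ("CYP704_c36842_g1_i1", None),
    ("CYP704_c36842_g1_i3", None),
    ("CYP707_c29564_g1_i1", "abscisic acid deactivation"),
    ("CYP71_c35642_g1_i1", None),
    ("CYP710_c19839_g1_i1", "sterol C-22 desaturase"),
    ("CYP712_c19176_g1_i2", None),
    ("CYP714_c36368_g1_i1", None),
    ("CYP714_c36368_g1_i2", None),
    ("CYP716_c41117_g1_i1", "already characterised (C-28 oxidase)"),
    ("CYP716_c23557_g1_i1", "already characterised (C-16α oxidase)"),
    ("CYP72_c34500_g2_i1", "already cloned (OQHZ-2012357)"),
    ("CYP721_c37141_g1_i1", "brassinosteroid inactivation"),
    ("CYP73_c37071_g1_i2", "trans-cinnamate 4-monooxygenase"),
    ("CYP74_c32585_g1_i1", None),
    ("CYP75_c4825_g1_i1", None),
    ("CYP75_c38772_g1_i1", None),
    ("CYP77_c33191_g1_i1", None),
    ("CYP78_c41068_g1_i1", None),
    ("CYP81_c36730_g1_i2", None),
    ("CYP82_c34310_g1_i1", None),
    ("CYP82_c36962_g1_i1", None),
    ("CYP82_c37078_g1_i1", None),
    ("CYP82_c37078_g1_i2", None),
    ("CYP82_c3431_g1_i1", None),
    ("CYP84_c28124_g1_i1", None),
    ("CYP86_c36146_g2_i1", None),
    ("CYP89_c37100_g1_i1", None),
    ("CYP90_c31983_g1_i1", "brassinosteroid biosynthesis"),
    ("CYP92_c28169_g1_i1", None),
    ("CYP94_c30674_g1_i1", None),
    ("CYP94_c11979_g1_i1", None),
    ("CYP96_c36742_g2_i1", None),
]

# Keep rows with no exclusion reason and number them in listing order.
candidates = [name for name, reason in ROWS if reason is None]
quick_ref = {number: name for number, name in enumerate(candidates, start=1)}

print(f"{len(ROWS)} full-length contigs, {len(quick_ref)} candidates")
print("candidate #7:", quick_ref[7])
```

    Numbering the retained rows in listing order reproduces the Quick Ref column, so candidate #7 maps to the CYP714 contig later identified as the C-23 oxidase.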

    [0275] PCR amplification of the 25 candidates was next attempted. As with the previous candidates, two PCRs were performed for each candidate using cDNA templates derived from leaf (L) and root (R), respectively. Strong PCR products were successfully produced for 20 out of the 25 candidates (data not shown). These were subsequently purified (from the leaf cDNA template samples) and cloned into the Gateway® Entry vector pDONR207.

    [0276] Candidates were sequenced to verify that the correct gene had been cloned. In most cases the cloned sequences closely matched the anticipated sequence. Some redundancy was found amongst the clones; the sequences of #6 and #7 were found to be identical, as were #16 and #17. Upon checking the predicted sequences in the original transcriptomic data, it was realised that the contigs for these pairs were highly similar and primers had not been designed to distinguish between them. Regardless, the clones were treated as separate and cloned into the pEAQ-HT-DEST1 binary vector before transformation into A. tumefaciens.

    [0277] The 15 candidates were next transiently expressed in N. benthamiana. The candidates were first assessed for their potential to oxidise β-amyrin by coexpression with the Q. saponaria β-amyrin synthase (QsbAS). No new products were detected in these samples by GC-MS analysis. Candidates were therefore further assessed for their ability to oxidise oleanolic acid, by coexpression with QsbAS and the C-28 oxidase (CYP716-2073932). This time, a distinct new product could be detected in extracts of leaves expressing candidates #6 and #7 (6 and 7 encode the same enzyme, as described above). The new products had identical retention times and mass spectra to a standard of 23-hydroxy-oleanolic acid (aka hederagenin). The enzyme encoded by candidate #7 is expected to be a CYP714 family member (yet to be formally named). It is believed that, before the presently claimed priority date, no members of this family had been reported to be triterpene oxidases. Since the priority date, other examples have been reported (see e.g. Kim et al. (2018), “A Novel Multifunctional C-23 Oxidase, CYP714E19, Is Involved in Asiaticoside Biosynthesis”, Plant Cell Physiol., 1200-1213).

    [0278] The sequences are included in Appendix A as SEQ ID Nos 7 and 8.

    [0279] As the C-23 candidates were derived from our own assembly of this data, the corresponding sequences in the 1KP dataset were searched for by BLASTn (https://db.cngb.org/blast4onekp/). Surprisingly, #7 is not represented by a full-length sequence in this database, but several smaller contigs are returned (Table 6). The top hit from these is OQHZ-2018687, an 821 bp contig.

    TABLE-US-00004 List of contigs from the 1KP dataset which are returned from a BLASTn query of the C-23 oxidase. The top-scoring hit is OQHZ-2018687.

    Sequences producing significant alignments    Length     Score (Bits)   E-Value
    scaffold-OQHZ-2018687-Quillaja_saponaria      821 bp     1222           0.0
    scaffold-OQHZ-2012766-Quillaja_saponaria      705 bp     985            0.0
    scaffold-OQHZ-2018686-Quillaja_saponaria      859 bp     843            0.0
    scaffold-OQHZ-2012767-Quillaja_saponaria      661 bp     841            0.0
    scaffold-OQHZ-2022788-Quillaja_saponaria      102 bp     185            9e-46
    scaffold-OQHZ-2041685-Quillaja_saponaria      129 bp     170            2e-41
    scaffold-OQHZ-2022787-Quillaja_saponaria      102 bp     161            1e-38
    scaffold-OQHZ-2008891-Quillaja_saponaria      323 bp     95.1           1e-18
    scaffold-OQHZ-2072427-Quillaja_saponaria      1046 bp    66.2           6e-10
    scaffold-OQHZ-2049459-Quillaja_saponaria      196 bp     50.0           4e-05
    scaffold-OQHZ-2007159-Quillaja_saponaria      892 bp     50.0           4e-05

    Example 5 - Combinatorial Biosynthesis With Q. saponaria Enzymes Allows for Synthesis of Quillaic Acid in N. benthamiana

    [0280] The β-amyrin synthase and C-28, C-16α and C-23 oxidases from Q. saponaria described above should be sufficient for production of quillaic acid when expressed together (see FIG. 2).

    [0281] Prior to testing the C-23 oxidase from Q. saponaria, the other candidate genes from Q. saponaria were combined with C-23 β-amyrin oxidases characterised from other species, i.e. CYP72A68v2 from M. truncatula (barrel medic) and CYP94D65 from Avena strigosa (black oat) (SEQ ID Nos 13-16).

    [0282] In this first experiment, the QsbAS and two CYP716 enzymes from Q. saponaria were combined with the M. truncatula and A. strigosa C-23 oxidases using transient expression in N. benthamiana to determine whether quillaic acid could be observed in these samples. LC-MS-CAD analysis revealed that both of the following combinations resulted in the appearance of novel products which matched the retention time and mass spectrum of a quillaic acid standard (results not shown):

    [0283] tHMGR/QsbAS/CYP716-2073932/CYP716-2012090/CYP72A68v2

    [0284] tHMGR/QsbAS/CYP716-2073932/CYP716-2012090/CYP94D65

    [0285] The abundance of quillaic acid appeared to be highest in the sample expressing CYP72A68v2.

    [0286] Other related products were also observed in these samples. In the combination expressing the oat C-23 oxidase (CYP94D65), the most abundant new peak was identified as caulophyllogenin (C-23 alcohol instead of the aldehyde seen in quillaic acid), while the Medicago C-23 oxidase (CYP72A68v2) gave rise to substantial accumulation of 16α-hydroxy gypsogenic acid (C-23 carboxylic acid instead of the aldehyde seen in quillaic acid).

    [0287] To verify that quillaic acid could be produced in N. benthamiana with the exclusive use of the Q. saponaria enzymes, the QsbAS enzyme was transiently expressed with various combinations of the P450s. As expected, analysis of leaves coexpressing QsbAS with all P450s resulted in appearance of a peak which matched the retention time and mass spectrum of a quillaic acid standard. This peak was absent in samples from leaves expressing any less than the full pathway (FIG. 7).

    [0288] Furthermore, a comparison was made between the present sample expressing the full Q. saponaria complement of enzymes and the equivalent (stored) samples where C-23 oxidases from M. truncatula and oat had been used. This revealed that the amount of quillaic acid appeared to be highest in the sample expressing the Q. saponaria C-23 oxidase (FIG. 8). The sample expressing the Q. saponaria C-23 oxidase also appeared to contain significantly less of the unwanted putative side products caulophyllogenin and 16α-hydroxy gypsogenic acid (FIG. 8). These metabolites reflect the different C-23 oxidase specificity of the oat and Medicago enzymes, which predominantly make the C-23 alcohol and acid, respectively. Hence, the Q. saponaria C-23 oxidase appears to be much more specific for the C-23 aldehyde, reflecting its expected function in QS-21 biosynthesis.

    Example 6 - Expressing Q. saponaria Genes in Yeast

    [0289] Saccharomyces cerevisiae may be utilised as a host chassis for commercial QA production.

    [0290] We therefore demonstrated that cloned Quillaja genes are active in this host. A strain of S. cerevisiae derived from S288C (Genotype: MATa/MATα; ura3Δ0/ura3Δ0; leu2Δ0/leu2Δ0; his3Δ1/his3Δ1; met15Δ0/MET15; LYS2/lys2Δ0; YHR072w/YHR072w::kanM) was used, which contains three auxotrophic selection markers (-URA/-HIS/-LEU) allowing for expression of genes from up to three plasmids.

    [0291] Three Gateway-compatible yeast expression vectors were employed, including pYES-DEST52 (uracil selection), pAG423 (histidine selection) and pAG435 (leucine selection).

    [0292] The Q. saponaria enzymes were recombined into these vectors as described in Table 7. Briefly, the β-amyrin synthase (QsbAS) was recombined into the pYES-DEST52 vector, while the C-28 oxidase (CYP716-2073932) and C-16α oxidase (both long (L) and short (S) isoforms) were recombined into pAG423.

    [0293] To enhance the efficiency of functioning of the cytochrome P450s, the third plasmid (pAG435) was used to express the Arabidopsis thaliana cytochrome P450 reductase 2 (AtATR2) enzyme. This serves as a coenzyme for reducing plant P450s back to an active state following substrate oxidation. All vectors contain galactose-inducible promoters for expression of the inserted genes.

    TABLE-US-00005 List of yeast strains generated

    Strain Number   Media             Vectors
                                      pYES2 (URA3)   pAG423 (HIS3)            pAG435 (LEU2)
    62              -URA              QsbAS          -                        -
    63              -URA -LEU -HIS    QsbAS          QsCYP716-2073932         AtATR2
    64              -URA -LEU -HIS    QsbAS          QsCYP716-2012090-long    AtATR2
    65              -URA -LEU -HIS    QsbAS          QsCYP716-2012090-short   AtATR2
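    The relationship between the plasmids carried by a strain and the dropout medium listed for it can be sketched as follows. This is a minimal illustration only: the marker assignments are those stated above (uracil, histidine and leucine selection), whereas the names PLASMID_MARKERS and dropout_medium are hypothetical helpers introduced here and form no part of the disclosure.

```python
# Selection marker carried by each vector, as stated in the text
# (pYES-DEST52 appears as pYES2 in the table header).
PLASMID_MARKERS = {
    "pYES-DEST52": "URA",
    "pAG423": "HIS",
    "pAG435": "LEU",
}

# Order in which dropouts are written in the strain table.
DROPOUT_ORDER = ["URA", "LEU", "HIS"]


def dropout_medium(plasmids):
    """Return the dropout medium needed to retain all given plasmids."""
    markers = {PLASMID_MARKERS[name] for name in plasmids}
    return " ".join(f"-{m}" for m in DROPOUT_ORDER if m in markers)


# Strain 62 carries only the QsbAS plasmid; strains 63-65 carry all three.
strain_62 = dropout_medium(["pYES-DEST52"])
strain_63 = dropout_medium(["pYES-DEST52", "pAG423", "pAG435"])
print(strain_62)
print(strain_63)
```

    Under these assumptions a strain is grown on medium lacking exactly those nutrients whose biosynthetic markers are supplied by its plasmids, which is consistent with the media listed for strains 62 to 65.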

    [0294] The yeast strains were cultured in synthetic yeast media with galactose and incubated for 2 days at 30° C. Strains were pelleted by centrifugation and saponified, and metabolites were extracted with ethyl acetate. GC-MS analysis revealed that all strains accumulated a peak at 10.6 minutes which was identified as β-amyrin (FIG. 9). Strain 63 (expressing the C-28 oxidase) was found to accumulate small amounts of additional products which were identified as C-28 oxidised β-amyrin derivatives, including oleanolic acid (12.01 min) and the intermediate C-28 alcohol erythrodiol (11.51 min) (FIG. 9, 2nd trace down). No products which could readily be identified as 16-hydroxy-β-amyrin were detected in strain 64 or 65 (expressing the C-16α oxidase isoforms), implying that β-amyrin may not be the optimal substrate for this enzyme.

    [0295] The above data demonstrate that yeast can be engineered to produce quillaic acid precursors.

    Example 7 - Production of QA by Stable Transformation

    [0296] Triterpenes have previously been produced using engineered transgenic plant lines (e.g. Arabidopsis, Wheat). A series of Golden Gate [23] vectors which allow for construction of multigene vectors and allow integration of an entire pathway into a single locus have been reported. These can be applied analogously to the present invention, in the light of the disclosure herein.

    Example 8 - Conclusions From Examples 1 to 7

    [0297] Quillaic acid is a triterpenoid and a key precursor to the saponin QS-21 produced by Quillaja saponaria.

    [0298] Here, four enzymes (a β-amyrin synthase and C-16α, C-23 and C-28 oxidases) from Q. saponaria were identified which were capable of producing quillaic acid when transiently expressed in Nicotiana benthamiana. These enzymes are predicted to be involved in the early steps of the QS-21 biosynthetic pathway, required for generation of the quillaic acid scaffold (FIG. 1).

    [0299] The identity of the products described herein was validated through use of authentic standards, giving a high degree of confidence in these results.

    [0300] The activity of the β-amyrin synthase (QsbAS) and three cytochrome P450 monooxygenases which oxidise β-amyrin at the C-28, C-23 and C-16α positions (referred to herein as CYP716-2073932, CYP714-7 and CYP716-2012090, respectively) in the biosynthesis of quillaic acid is shown schematically in FIG. 12.

    Example 9 - Estimating Production of Quillaic Acid in N. benthamiana

    [0301] To estimate quillaic acid production in N. benthamiana following transient expression, an analysis was carried out by LC-CAD. Agroinfiltration was performed as previously described using the Q. saponaria β-amyrin synthase and C-16α, C-23 and C-28 oxidases. As a control, leaves infiltrated with only two (C-23 and C-28) oxidases were used and accumulate gypsogenin instead of quillaic acid (FIG. 12).

    [0302] The oat HMG-CoA reductase (tHMGR) was also included in all infiltrations as it increases production of β-amyrin. Representative chromatograms from these samples are shown in FIG. 13. Three leaves from different plants were used for each test condition as biological replicates.

    [0303] To estimate production of quillaic acid in these leaves, the area of the quillaic acid peak was compared to that of the internal standard (included at 1.1 mg/g dry leaf weight). The average value from the three replicates was found to be 1.44 mg/g.

    Example 10 - Purification of Quillaic Acid From N. benthamiana

    [0304] To determine unambiguously that quillaic acid production had been achieved in N. benthamiana, purification of the product was undertaken.

    [0305] A total of 209 N. benthamiana plants were vacuum infiltrated with A. tumefaciens carrying the pEAQ-HT-DEST1 constructs harbouring the Q. saponaria β-amyrin synthase, C-16α, C-23 and C-28 oxidases. The oat tHMGR was also included to boost yields. Leaves were harvested four days after infiltration, yielding 150.3 g dry material after lyophilisation. Metabolites were extracted with ethanol using a Büchi Speed Extractor E-914, and several rounds of silica gel flash chromatography were used to isolate a total of 30 mg of product. The isolated product was found to have an identical retention time and mass spectrum to that of an authentic quillaic acid standard (Extrasynthese) by LC-MS (FIG. 14) and GC-MS (FIG. 15). Furthermore, ¹H NMR spectroscopic analysis of the isolated product was also in accordance with the quillaic acid standard (FIG. 16).

    [0306] This confirms that quillaic acid can be produced in N. benthamiana through transient expression of the Q. saponaria enzymes. The isolated yield of the product was in the region of 0.2 mg/g dry weight, although some minor impurities were detected in the sample. This yield is lower than the estimated yield from LC-CAD in Example 9, indicating losses of the product during the isolation process. Nevertheless, this demonstrates that practical quantities of quillaic acid can be produced and isolated from N. benthamiana using the presently characterised enzymes.
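    The isolated yield quoted above follows directly from the figures given in this Example, and can be compared with the LC-CAD estimate of Example 9. The short calculation below is a sketch only: it treats the 30 mg of isolated product as pure quillaic acid and assumes that the 1.44 mg/g estimate from the small-scale replicates also applies to the large-scale batch, neither of which is established by the data.

```python
# Figures taken from the text above.
isolated_product_mg = 30.0        # product isolated after flash chromatography
dry_leaf_mass_g = 150.3           # lyophilised leaf material
estimated_yield_mg_per_g = 1.44   # LC-CAD estimate from Example 9

# Isolated yield per gram of dry leaf material.
isolated_yield_mg_per_g = isolated_product_mg / dry_leaf_mass_g

# Apparent recovery relative to the LC-CAD estimate (assumption: the estimate
# from the small-scale replicates also holds for the large-scale batch).
recovery_fraction = isolated_yield_mg_per_g / estimated_yield_mg_per_g

print(f"isolated yield: {isolated_yield_mg_per_g:.2f} mg/g dry weight")
print(f"apparent recovery: {recovery_fraction:.0%}")
```

    On these assumptions the isolated yield is approximately 0.20 mg/g dry weight, corresponding to an apparent recovery of roughly one seventh of the estimated quillaic acid content.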

    Methods

    Infiltration

    [0307] Agroinfiltration was performed using a needleless syringe as previously described (Reed et al., 2017). All genes were expressed from pEAQ-HT-DEST1 binary expression vectors (Sainsbury et al., 2009) in A. tumefaciens LBA4404. All plants co-expressed the oat tHMGR, the Quillaja β-amyrin synthase (QsbAS), and β-amyrin C-28 (CYP716-2073932) and C-16α (CYP716-2012090S) oxidases. For quillaic acid production the C-23 (CYP714-7) oxidase was also co-expressed while green fluorescent protein (GFP) was used instead for controls. Cultivation of bacteria and plants is as described in (Reed et al., 2017). Three plants were infiltrated per test condition and analysed separately as biological replicates.

    LC-MS Analysis

    [0308] Leaves were harvested 5 days after agroinfiltration and freeze-dried. Freeze-dried leaf material (10 mg per sample) was ground at 1000 rpm for 1 min (Geno/Grinder 2010, Spex SamplePrep). Extractions were carried out in 550 µL 80% methanol with 20 µg/mL of digitoxin (internal standard; Sigma) for 20 min at 40° C., with shaking at 1400 rpm (Thermomixer Comfort, Eppendorf). The sample was partitioned twice with 400 µL hexane. The aqueous phase was dried under vacuum at 40° C. (EZ-2 Series Evaporator, Genevac). Dried material was resuspended in 75 µL of 100% methanol and filtered at 12,500 g for 30 sec (0.2 µm, Spin-X, Costar). Filtered samples were transferred to glass vials and analysed as detailed below.

    Preparation of N. Benthamiana Leaf Extracts

    [0309] Analysis was carried out using a Prominence HPLC system with single quadrupole mass spectrometer LCMS-2020 (Shimadzu) and Corona Veo RS Charged Aerosol Detector (CAD) (Dionex). Detection: MS (dual ESI/APCI ionization, DL temp 250° C., neb gas flow 15 L/min, heat block temp 400° C., spray voltage Pos 4.5 kV, Neg -3.5 kV); CAD (data collection rate 10 Hz, filter constant 3.6 s, evaporator temp. 35° C., ion trap voltage 20.5 V). Method: Solvent A: [H₂O + 0.1% formic acid]; Solvent B: [acetonitrile (CH₃CN) + 0.1% formic acid]. Injection volume: 10 µL. Gradient: 15% [B] from 0 to 1.5 min, 15% to 60% [B] from 1.5 to 26 min, 60% to 100% [B] from 26 to 26.5 min, 100% [B] from 26.5 to 28.5 min, 100% to 15% [B] from 28.5 to 29 min, 35% [B] from 29 to 30 min. Method was performed using a flow rate of 0.3 mL/min and a Kinetex column 2.6 µm XB-C18 100 Å, 50 × 2.1 mm (Phenomenex).

    Analysis of N. Benthamiana Leaf Extracts

    [0310] Analysis was performed using LabSolutions software (Shimadzu). To provide an estimate of product yields, the area of the peak for quillaic acid (as determined by CAD) was divided by that of the internal standard (digitoxin, 1.1 µg/mg dry leaf tissue). Results were averaged from the three replicates. A minor peak for an endogenous N. benthamiana product with the same retention time as quillaic acid was observed in controls (calculated average 0.25 µg/mg). Therefore this value was subtracted from the estimated quillaic acid yield.
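    The estimation described above can be sketched as follows. The internal standard loading is derived from the extraction conditions given above (550 µL of extraction solvent containing 20 µg/mL digitoxin per 10 mg of dry leaf). The peak areas in the example are hypothetical placeholders, not measured data, and the calculation assumes that quillaic acid and digitoxin give an equal CAD response per unit mass, which is an assumption of this sketch. The function names are likewise illustrative only.

```python
import math


def internal_standard_loading(volume_ul, conc_ug_per_ml, leaf_mass_mg):
    """Internal standard per unit dry leaf mass, in µg/mg."""
    return (volume_ul / 1000.0) * conc_ug_per_ml / leaf_mass_mg


def estimate_yield(analyte_area, standard_area, loading_ug_per_mg,
                   background_ug_per_mg=0.0):
    """Estimate the analyte content (µg/mg) from the CAD peak-area ratio.

    Assumes equal detector response per unit mass for the analyte and the
    internal standard (an assumption of this sketch).
    """
    ratio = analyte_area / standard_area
    return ratio * loading_ug_per_mg - background_ug_per_mg


# Extraction conditions stated in the Methods.
loading = internal_standard_loading(volume_ul=550, conc_ug_per_ml=20,
                                    leaf_mass_mg=10)

# Hypothetical (quillaic acid area, digitoxin area) pairs for three replicates.
replicate_areas = [(1500.0, 1000.0), (1600.0, 1000.0), (1400.0, 1000.0)]

estimates = [
    estimate_yield(qa, std, loading, background_ug_per_mg=0.25)
    for qa, std in replicate_areas
]
mean_estimate = sum(estimates) / len(estimates)

print(f"internal standard loading: {loading:.2f} µg/mg")
print(f"mean estimated content: {mean_estimate:.2f} µg/mg")
```

    Because the background correction is a constant, subtracting it from each replicate before averaging gives the same result as subtracting it from the averaged value.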

    Large Scale Infiltration

    [0311] Agroinfiltration was carried out as detailed above using tHMGR, QsbAS, CYP716-2073932, CYP716-2012090S and CYP714-7 oxidases. A total of 209 plants were infiltrated by vacuum as previously described (Reed et al., 2017) and were harvested after four days.

    Purification of Quillaic Acid From N. benthamiana

    [0312] Leaves from the large scale infiltration were harvested and lyophilised, and extraction was performed using a SpeedExtractor E-914 (Büchi) as detailed in (Reed et al., 2017) with the exception that the program involved four cycles (100° C. and 130 bar pressure). Cycle one (hexane) had zero hold time, and cycles two to four (ethanol) had 5 min hold times. The run finished with a 2 min solvent flush and 6 min N₂ flush. The hexane portion of the extraction was discarded and the ethanol portion was used for subsequent flash chromatography, performed using an Isolera One (Biotage) with details of individual columns given below. Fractions were checked for quillaic acid after each column by GC-MS and thin layer chromatography (TLC) as detailed in (Reed et al., 2017). At each stage, the purest fractions were pooled and dried onto silica gel 60 (Material Harvest) for loading onto the subsequent column. Column 1: SNAP Ultra 50 g (Biotage), flow rate: 100 mL/min, 90 mL fractions with the following gradient: Solvent A: [hexane] Solvent B: [ethyl acetate]; gradients: 5% [B] to 100% [B] over 10 column volumes, and held at 100% [B] for a further 5 column volumes. Column 2: SNAP Ultra 50 g column (Biotage), flow rate 100 mL/min, 90 mL fractions with the following gradient: Solvent A: [dichloromethane] Solvent B: [ethyl acetate]; 10% [B] to 60% [B] over 10 column volumes, and held at 100% [B] for a further 2 column volumes. Column 3: SNAP Ultra 10 g (Biotage), flow rate: 36 mL/min, 17 mL fractions with same gradient as column 2. Following column 3 the fractions were treated with activated charcoal to remove coloured impurities and loaded onto column 4. Column 4: SNAP Ultra 10 g column (Biotage) (36 mL/min, 17 mL fractions) with an isocratic mobile phase 15% ethyl acetate in dichloromethane over 20 column volumes. The pooled fractions were treated with a small amount of HCl (400 µL of conc HCl in ~40 mL ethanol) which helped to reduce streaking on the TLC plate. Column 5: SNAP Ultra 10 g column (Biotage) (36 mL/min, 17 mL fractions) with an isocratic mobile phase 15% ethyl acetate in dichloromethane over 30 column volumes with a final flush of 100% ethyl acetate over 5 column volumes. The purest fractions were pooled and dried to yield 30 mg of a white powder with small amounts of yellow impurities. This was analysed by GC-MS, LC-MS and NMR as below.

    GC-MS, LC-MS and NMR Analysis of Purified Quillaic Acid

    [0313] GC-MS analysis was performed as described in (Reed et al., 2017). LC-MS analysis was performed as described above for quillaic acid quantification. NMR spectra were recorded in Fourier transform mode at a nominal frequency of 400 MHz for ¹H NMR in deuterated methanol. For each method of analysis a quillaic acid standard (Extrasynthese) was used for comparison.

    References for Materials and Methods

    [0314] Reed J, Stephenson MJ, Miettinen K, Brouwer B, Leveau A, Brett P, Goss RJM, Goossens A, O′Connell MA, Osbourn A. 2017. A translational synthetic biology platform for rapid access to gram-scale quantities of novel drug-like molecules. Metab Eng 42: 185-193.

    [0315] Sainsbury F, Thuenemann EC, Lomonossoff GP. 2009. pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnol J 7(7): 682-693.

    Other References

    [0316] 1. Johnson, M.T.J., et al., Evaluating Methods for Isolating Total RNA and Predicting the Success of Sequencing Phylogenetically Diverse Plant Transcriptomes. PLOS ONE, 2012. 7(11): p. e50226.

    [0317] 2. Schlotterbeck, T., et al., The Use of Leaves from Young Trees of Quillaja saponaria (Molina) Plantations as a New Source of Saponins. Economic Botany, 2015. 69(3): p. 262-272.

    [0318] 3. Miettinen, K., et al., The ancient CYP716 family is a major contributor to the diversification of eudicot triterpenoid biosynthesis. Nat Commun, 2017. 8: p. 14153.

    [0319] 4. Sainsbury, F., E.C. Thuenemann, and G.P. Lomonossoff, pEAQ: versatile expression vectors for easy and quick transient expression of heterologous proteins in plants. Plant Biotechnol J, 2009. 7(7): p. 682-93.

    [0320] 5. Reed, J., et al., A translational synthetic biology platform for rapid access to gram-scale quantities of novel drug-like molecules. Metab Eng, 2017. 42: p. 185-93.

    [0321] 6. Moses, T., et al., Combinatorial biosynthesis of sapogenins and saponins in Saccharomyces cerevisiae using a C-16α hydroxylase from Bupleurum falcatum. Proc Natl Acad Sci USA, 2014. 111(4): p. 1634-39.

    [0322] 7. Moses, T., et al., Unravelling the Triterpenoid Saponin Biosynthesis of the African Shrub Maesa lanceolata. Mol Plant, 2014. 8: p. 122-35.

    [0323] 8. Fukushima, E.O., et al., Combinatorial biosynthesis of legume natural and rare triterpenoids in engineered yeast. Plant Cell Physiol, 2013. 54(5): p. 740-9.

    [0324] 9. Fukushima, E.O., et al., CYP716A subfamily members are multifunctional oxidases in triterpenoid biosynthesis. Plant Cell Physiol, 2011. 52(12): p. 2050-61.

    [0325] 10. Carelli, M., et al., Medicago truncatula CYP716A12 is a multifunctional oxidase involved in the biosynthesis of hemolytic saponins. Plant Cell, 2011. 23(8): p. 3070-81.

    [0326] 11. Han, J.Y., et al., The involvement of β-amyrin 28-oxidase (CYP716A52v2) in oleanane-type ginsenoside biosynthesis in Panax ginseng. Plant Cell Physiol, 2013. 54(12): p. 2034-46.

    [0327] 12. Fiallos-Jurado, J., et al., Saponin determination, expression analysis and functional characterization of saponin biosynthetic genes in Chenopodium quinoa leaves. Plant Sci, 2016. 250: p. 188-97.

    [0328] 13. Khakimov, B., et al., Identification and genome organization of saponin pathway genes from a wild crucifer, and their use for transient production of saponins in Nicotiana benthamiana. Plant J, 2015. 84(3): p. 478-90.

    [0329] 14. Andre, C.M., et al., Multifunctional oxidosqualene cyclases and cytochrome P450 involved in the biosynthesis of apple fruit triterpenic acids. New Phytol, 2016. 211(4): p. 1279-94.

    [0330] 15. Huang, L., et al., Molecular characterization of the pentacyclic triterpenoid biosynthetic pathway in Catharanthus roseus. Planta, 2012. 236(5): p. 1571-81.

    [0331] 16. Xu, G., et al., A novel glucuronosyltransferase has an unprecedented ability to catalyse continuous two-step glucuronosylation of glycyrrhetinic acid to yield glycyrrhizin. New Phytologist, 2016. 212(1): p. 123-135.

    [0332] 17. Shibuya, M., et al., Identification and characterization of glycosyltransferases involved in the biosynthesis of soyasaponin I in Glycine max. FEBS Lett, 2010. 584(11): p. 2258-64.

    [0333] 18. Wang, P., et al., Synthesis of the potent immunostimulatory adjuvant QS-21A. J Am Chem Soc, 2005. 127(10): p. 3256-7.

    [0334] 19. Moses, T., et al., Comparative analysis of CYP93E proteins for improved microbial synthesis of plant triterpenoids. Phytochemistry, 2014. 108: p. 47-56.

    [0335] 20. Dai, Z., et al., Producing aglycons of ginsenosides in bakers’ yeast. Sci Rep, 2014. 4: p. 3698.

    [0336] 21. Dai, Z., et al., Metabolic engineering of Saccharomyces cerevisiae for production of ginsenosides. Metab Eng, 2013. 20(0): p. 146-56.

    [0337] 22. Salmon, M., et al., A conserved amino acid residue critical for product and substrate specificity in plant triterpene synthases. Proc Natl Acad Sci USA, 2016. 113(30): p. E4407-14.

    [0338] 23. Engler, C., et al., A golden gate modular cloning toolbox for plants. ACS Synth Biol, 2014. 3(11): p. 839-43.

    [0339] 24. Mugford, S.T., et al., Modularity of plant metabolic gene clusters: a trio of linked genes that are collectively required for acylation of triterpenes in oat. Plant Cell, 2013. 25(3): p. 1078-92.

    [0340] 25. Paddon, C.J., et al., High-level semi-synthetic production of the potent antimalarial artemisinin. Nature, 2013. 496(7446): p. 528-32.

    [0341] 26. MacKenzie, D.J., et al., Improved RNA Extraction from Woody Plants for the Detection of Viral Pathogens by Reverse Transcription-Polymerase Chain Reaction. Plant Disease, 1997. 81(2): p. 222-226.

    [0342] 27. Sainsbury, F. and G.P. Lomonossoff, Transient expressions of synthetic biology in plants. Current Opinion in Plant Biology, 2014. 19(0): p. 1-7.

    Appendix A: Sequence Tables and Sequences

    [0343] TABLE-US-00006 Q. saponaria sequences. Clone number refers to the contig number from the original 1KP transcriptome assembly (https://db.cngb.org/blast4onekp/).

    Activity   SID   Clone/name      Length     Other comment
    QsbAS      1     OQHZ-2074321    2277 bp    Q. saponaria β-amyrin synthase, QsbAS1
               2                     758aa
    C-28       3     OQHZ-2073932    1443 bp    Q. saponaria β-amyrin - C-28 oxidase
               4     CYP716A224      480aa
    C-16α      5     OQHZ-2012090    1506 bp    Q. saponaria β-amyrin/oleanolic acid C-16α oxidase
               6     CYP716          501aa
    C-23       7     OQHZ-2018687    1524 bp    Q. saponaria oleanolic acid C-23 oxidase
               8     CYP714          507aa
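    The nucleotide and amino acid lengths listed for the Q. saponaria sequences can be cross-checked against one another. The sketch below assumes that each coding sequence comprises a whole number of codons and ends in a single stop codon, so that the protein length equals the number of codons minus one. It uses the 2277 bp given in the heading of SEQ ID NO: 1 for QsbAS, and the 1443 bp short isoform of the C-16α oxidase noted in the heading of SEQ ID NO: 5 (the long isoform less its first 63 nucleotides, i.e. 21 amino acids). The function name expected_protein_length is illustrative only.

```python
# Coding sequence length (bp) and protein length (aa) as stated in this document.
STATED_LENGTHS = {
    "QsbAS (SEQ ID NO: 1/2)": (2277, 758),
    "C-28 oxidase (SEQ ID NO: 3/4)": (1443, 480),
    "C-16α oxidase, long isoform (SEQ ID NO: 5/6)": (1506, 501),
    "C-16α oxidase, short isoform": (1506 - 63, 501 - 21),
    "C-23 oxidase (SEQ ID NO: 7/8)": (1524, 507),
}


def expected_protein_length(cds_bp):
    """Protein length implied by a coding sequence ending in one stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError(f"{cds_bp} bp is not a whole number of codons")
    return cds_bp // 3 - 1


results = {
    name: expected_protein_length(cds_bp) == protein_aa
    for name, (cds_bp, protein_aa) in STATED_LENGTHS.items()
}

for name, consistent in results.items():
    print(f"{name}: {'consistent' if consistent else 'MISMATCH'}")
```

    A mismatch in such a check would point to a transcription error in one of the stated lengths rather than to any property of the enzyme itself.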

    Table 2 - Non-Q. saponaria Sequences

    [0344] Cytochrome P450s which oxidise β-amyrin (or derivatives thereof) at the relevant positions (16α, 28, 23) found in quillaic acid. Enzymes named in bold have been tested by transient expression in N. benthamiana and found to generate products consistent with those reported by the referenced studies.

    [0345] Initials preceding gene name are species as follows: As - Avena strigosa, At - Arabidopsis thaliana, Bf - Bupleurum falcatum, Bv - Barbarea vulgaris, Cq - Chenopodium quinoa, Cr -Catharanthus roseus, Md - Malus domestica, Ml - Maesa lanceolata, Mt - Medicago truncatula, Pg - Panax ginseng, Vv - Vitis vinifera.

    TABLE-US-00007

    Gene     SEQ ID NO   Enzyme          Preferred substrate   Genbank ID (nucleotide)   Reference (P lab)
    C-16α    9 nt        BfCYP716Y1      β-amyrin              KC963423.1                [6] (Goossens lab, VIB, Ghent, Belgium)
             10 aa
             11 nt       MlCYP87D16      β-amyrin              KF318735.1                [7] (Goossens lab, VIB, Ghent, Belgium)
             12 aa

    TABLE-US-00008

    C-23     13 nt       MtCYP72A68v2    Oleanolic acid        AB558150.1                [8] (Muranaka Lab, Osaka, Japan)
             14 aa
             15 nt       AsCYP94D65      β-amyrin              UNPUBLISHED               UNPUBLISHED (Osbourn Lab, JIC)
             16 aa

    TABLE-US-00009

    C-28     17 nt       MtCYP716A12     β-amyrin              FN995113.1                [9, 10] (Muranaka Lab, Osaka, Japan / Calderini Lab, IGV, Perugia, Italy)
             18 aa

    TABLE-US-00010

    SEQ ID NO   Enzyme           Preferred substrate   Genbank ID        Reference
    19          VvCYP716A15      β-amyrin                                [9]
    20          VvCYP716A17      β-amyrin              AB619803.1        [9]
    21          PgCYP716A52v2    β-amyrin              JX036032.1        [11]
    22          MlCYP716A75      β-amyrin              KF318733.1        [7]
    23          CqCYP716A78      β-amyrin              KX343075.1        [12]
    24          CqCYP716A79      β-amyrin              KX343076.1        [12]
    25          BvCYP716A80      β-amyrin              KP795926.1        [13]
    26          BvCYP716A81      β-amyrin              KP795925.1        [13]
    27          MdCYP716A175     β-amyrin              XM_008392874.2    [14]
    28          CrCYP716AL1      β-amyrin              JN565975.1        [15]

    TABLE-US-00011 Accessory enzymes

    SEQ ID NO   Name
    29          AsHMGR (Avena strigosa HMG-CoA reductase) coding sequence (1689bp)
    30          AsHMGR (Avena strigosa HMG-CoA reductase) translated nucleotide sequence (562aa)
    31          AstHMGR (Avena strigosa truncated HMG-CoA reductase) coding sequence (1275bp)
    32          AstHMGR (Avena strigosa truncated HMG-CoA reductase) translated nucleotide sequence (424aa)
    33          AsSQS (Avena strigosa squalene synthase) coding sequence (1212bp)
    34          AsSQS (Avena strigosa squalene synthase) translated nucleotide sequence (403aa)
    35          AtATR2 (Arabidopsis thaliana cytochrome P450 reductase 2) coding sequence (2325bp)
    36          AtATR2 (Arabidopsis thaliana cytochrome P450 reductase 2) translated nucleotide sequence (774aa)

    TABLE-US-00012 Comparisons between the gene sequences as found in the 1KP dataset and the sequenced clones obtained by PCR from the Q. saponaria plants in the present disclosure

    Name     1KP Contig Number   Nucleotide substitution   Amino acid substitution
    QsbAS    OQHZ-2074321        C1020G                    F340L
                                 G1635A                    -
    C-28     OQHZ-2073932        G904A                     1304V
    C-16     OQHZ-2012090        G1296A                    -
                                 T1305C                    -
                                 T1311C                    -
                                 T1314A                    -
                                 A1317C                    -
                                 T1326C                    -
                                 A1347G
                                 G1359C                    -
                                 T1363C                    -
                                 G1368A                    -
                                 G1371A                    -
                                 G1374T                    -
                                 G1377T                    -
                                 T1395G                    -
                                 A1397C                    K466T
                                 A1407T                    K469N
                                 G1412A                    G471E
                                 A1413G
                                 T1467C                    -
    C-23     OQHZ-2018687        A564T                     -
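    The correspondence between a nucleotide substitution and the residue it affects can be checked arithmetically: for a coding sequence numbered from the A of the start codon, nucleotide n falls in codon (n - 1) div 3 + 1. The sketch below applies this to four of the substitutions listed above for which an amino acid change is given. The function name codon_of is illustrative only, and the numbering convention (position 1 is the first nucleotide of the start codon) is an assumption of this sketch.

```python
import re


def codon_of(substitution):
    """Return (codon number, position within codon) for e.g. 'C1020G'.

    Assumes nucleotide numbering starts at 1 on the first base of the
    start codon.
    """
    match = re.fullmatch(r"([ACGT])(\d+)([ACGT])", substitution)
    if match is None:
        raise ValueError(f"unrecognised substitution: {substitution!r}")
    position = int(match.group(2))
    codon_number = (position - 1) // 3 + 1
    position_in_codon = (position - 1) % 3 + 1
    return codon_number, position_in_codon


# Nucleotide substitutions paired with the residue number stated in the table.
STATED = {
    "C1020G": 340,  # F340L (QsbAS)
    "A1397C": 466,  # K466T (C-16α oxidase)
    "A1407T": 469,  # K469N (C-16α oxidase)
    "G1412A": 471,  # G471E (C-16α oxidase)
}

checks = {sub: codon_of(sub)[0] == residue for sub, residue in STATED.items()}
for sub, ok in checks.items():
    print(sub, "matches stated residue" if ok else "does not match")
```

    The same arithmetic places A1413G in codon 471, the codon also affected by G1412A.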

    Table 8

    [0346] Pairwise alignments of the 18 P450s were made using Clustal Omega (version 1.2.4 - accessed through https://www.ebi.ac.uk). Numbers in the table represent percentage amino acid identity between genes. Sequences are organised according to function and the Q. saponaria genes characterised herein are given in bold. All pairwise values are represented twice, therefore redundant values are shown in the upper right of the table with a grey background. The Table is split across two pages for ease of presentation. [Table 8: two embedded images, not reproduced]

    SEQ ID NO: 1 - Q. saponaria β-Amyrin Synthase, QsbAS (OQHZ-2074321) Coding Sequence (2277bp)

    [0347] TABLE-US-00013 ATGTGGAGGCTGAAGATAGCAGAAGGTGGTTCCGATCCATATCTGTTCAG CACAAACAACTTCGTGGGTCGCCAGACATGGGAGTTCGAACCGGAGGCCG GCACACCTGAGGAGCGAGCAGAGGTCGAAGCTGCCCGCCAAAACTTTTAC AACAACCGTTACCAGGTCAAGCCCTGTGACGACCTCCTTTGGAGATATCA GTTCCTGAGAGAGAAGAATTTCAAACAAACAATACCGCCTGTCAAGGTTG AAGATGGCCAAGAAATTACTTATGAGATGGCCACAACCTCAATGCAGAGG GCGGCCCGTCACCTATCAGCCTTGCAGGCCAGCGATGGCCATTGGCCAGC TCAAATTGCTGGCCCCTTGTTCTTCATGCCACCCTTGGTCTTTTGTGTGT ACATTACTGGGCATCTTAATACAGTATTCCCATCTGAACATCGCAAAGAA ATCCTTCGTTACATGTACTATCACCAGAACGAAGATGGTGGGTGGGGACT GCACATAGAGGGTCACAGCACCATGTTTTGCACAGCACTCAACTACATTT GTATGCGTATCCTTGGGGAAGGACCAGAGGGGGGTCAAGACAATGCTTGT GCCAGAGCACGAATGTGGATTCTTGATCATGGTGGTGTAACACATATTCC ATCTTGGGGAAAGACCTGGCTTTCGATACTTGGTCTATTTGAGTGGTCTG GAAGCAATCCAATGCCTCCAGAGTTTTGGATCCTTCCTTCATTTCTTCCT ATGCATCCAGCAAAAATGTGGTGCTATTGCCGGATGGTTTACATGCCCAT GTCTTATTTATATGGGAAAAGGTTTGTTGGCCCAATCACGCCTCTCATTG TTCAGTTAAGAGAGGAAATACACACTCAAAATTACCATGAAATCAACTGG AAGTCAGTCCGCCATCTATGTGCAAAGGAGGATATCTACTATCCCCATCC ACTCATCCAAGATTTGATTTGGGACAGTTTGTACATACTAACGGAGCCTC TTCTCACTCGCTGGCCCTTGAACAAGTTGGTGCGGGAGAGGGCTCTCCAA GTAACAATGAAGCATATCCACTATGAAGATGAAAATAGTCGATACATAAC CATTGGATGTGTGGAAAAGGTGTTATGTATGCTTGCTTGTTGGGTTGATG ATCCAAATGGAGATGCTTTCAAGAAGCACCTTGCTCGAGTCCCAGATTAC GTATGGGTCTCTGAAGATGGAATTACTATGCAGAGTTTTGGTAGTCAAGA ATGGGATGCTGGCTTTGCCGTCCAGGCTCTGCTTGCTTCTAATCTTACCG AGGAACTTGGCCCTGCTCTTGCCAAAGGACATGACTTCATAAAGCAATCT CAGGTTAAGGACAATCCTTCAGGTGACTTCAAAAGCATGTATCGTCACAT TTCTAGAGGATCATGGACCTTCTCTGACCAAGATCATGGATGGCAAGTTT CTGATTGCACTGCAGAAGGTCTGAAGTGTTGCCTGCTTTTGTCGATGTTG CCACCAGAAATTGTTGGTGAAAAAATGGAACCACAAAGGCTATTTGATTC TGTCAATGTGCTGCTCTCTCTACAGAGCAAAAAAGGTGGTTTAGCTGCCT GGGAGCCAGCAGGGGCGCAAGATTGGTTGGAATTACTCAATCCCACAGAA TTTTTTGCGGACATTGTCGTTGAGCATGAATATGTTGAATGTACTGGATC AGCAATTCAGGCATTAGTTTTGTTCAAGAAGCTGTATCCGGGGCACAGGA AAAAAGAGATTGACAGTTTCATTACAAATGCTGTCCGGTTCCTTGAGAAT ACACAAACGGCAGATGGCTCTTGGTATGGAAACTGGGGAGTTTGCTTCAC CTATGGTTGTTGGTTCGCACTGGGAGGGCTAGCAGCAGCTGGCAAGACTT 
ACAACAACTGTCCTGCAATACGCAAAGCTGTTAATTTCCTACTTACAACA CAAAGAGAAGACGGTGGTTGGGGAGAAAGCTATCTTTCAAGCCCAAAAAA GATATATGTACCCCTGGAAGGAAGCCGATCAAATGTGGTACATACTGCAT GGGCTATGATGGGTCTAATTCATGCTGGGCAGGCTGAAAGAGACTCAACT CCTCTTCATCGTGCAGCAAAGTTGATCATCAATTATCAACTAGAAAATGG CGATTGGCCGCAACAGGAAATCACTGGAGTATTCATGAAAAACTGCATGT TACATTACCCTATGTACAGAAACATCTACCCAATGTGGGCTCTTGCAGAA TACCGGAGGCGGGTTCCATTGCCTTAA

    SEQ ID NO: 2 - QsbAS (OQHZ-2074321) Translated Nucleotide Sequence (758aa)

    [0348] TABLE-US-00014 MWRLKIAEGGSDPYLFSTNNFVGRQTWEFEPEAGTPEERAEVEAARQNFY NNRYQVKPCDDLLWRYQFLREKNFKQTIPPVKVEDGQEITYEMATTSMQR AARHLSALQASDGHWPAQIAGPLFFMPPLVFCVYITGHLNTVFPSEHRKE ILRYMYYHQNEDGGWGLHIEGHSTMFCTALNYICMRILGEGPEGGQDNAC ARARMWILDHGGVTHIPSWGKTWLSILGLFEWSGSNPMPPEFWILPSFLP MHPAKMWCYCRMVYMPMSYLYGKRFVGPITPLIVQLREEIHTQNYHEINW KSVRHLCAKEDIYYPHPLIQDLIWDSLYILTEPLLTRWPLNKLVRERALQ VTMKHIHYEDENSRYITIGCVEKVLCMLACWVDDPNGDAFKKHLARVPDY VWVSEDGITMQSFGSQEWDAGFAVQALLASNLTEELGPALAKGHDFIKQS QVKDNPSGDFKSMYRHISRGSWTFSDQDHGWQVSDCTAEGLKCCLLLSML PPEIVGEKMEPQRLFDSVNVLLSLQSKKGGLAAWEPAGAQDWLELLNPTE FFADIVVEHEYVECTGSAIQALVLFKKLYPGHRKKEIDSFITNAVRFLEN TQTADGSWYGNWGVCFTYGCWFALGGLAAAGKTYNNCPAIRKAVNFLLTT QREDGGWGESYLSSPKKIYVPLEGSRSNVVHTAWAMMGLIHAGQAERDST PLHRAAKLIINYQLENGDWPQQEITGVFMKNCMLHYPMYRNIYPMWALAE YRRRVPLP*

    SEQ ID NO: 3 - QsCYP716-2073932 (OQHZ-2073932) (C-28 Oxidase, Named Previously as CYP716A224 [3]) Coding Sequence (1443bp)

    [0349] TABLE-US-00015 ATGGAGCACTTGTATCTCTCCCTTGTGCTCCTGTTTGTTTCCTCAATCTC CCTCTCCCTCTTCTTCCTGTTCTACAAACACAAATCTATGTTCACCGGGG CCAACCTACCACCTGGTAAAATCGGTTACCCATTGATCGGAGAGAGCTTG GAGTTCTTGTCCACGGGATGGAAGGGCCACCCGGAGAAATTCATCTTCGA TCGCATGAGCAAGTACTCATCCCAAATCTTCAAGACCTCGATTTTAGGGG AACCAACGGCGGTGTTCCCGGGAGCCGTATGCAACAAGTTCCTCTTCTCC AACGAGAACAAGCTGGTGAATGCATGGTGGCCTGCCTCCGTGGACAAGAT CTTTCCTTCCTCACTCCAGACATCCTCCAAAGAAGAGGCCAAGAAGATGA GGAAGTTGCTTCCTCAGTTTCTCAAGCCCGAAGCTCTGCACCGCTACATT GGTATTATGGATTCTATTGCCCAGAGACACTTTGCCGATAGCTGGGAAAA CAAAAACCAAGTCATTGTCTTTCCTCTAGCAAAGAGGTATACTTTCTGGC TGGCTTGCCGTTTGTTCATTAGCGTCGAGGATCCGACCCACGTATCCAGA TTTGCTGACCCGTTCCAACTTTTGGCCGCCGGAATCATATCAATCCCAAT CGACTTGCCAGGGACACCGTTCCGCAAGGCAATCAATGCGTCCCAGTTCA TCAGGAAGGAATTGTTGGCCATCATCAGGCAGAGAAAGATCGATTTGGGT GAAGGGAAGGCATCTCCGACGCAGGACATACTGTCTCACATGTTGCTCAC ATGCGACGAGAACGGACAATACATGAATGAATTGGACATTGCCGACAAGA TTCTTGGCTTGTTGGTCGGCGGACATGACACTGCCAGTGCCGCTTGCACT TTCATTGTCAAGTTCCTCGCTGAGCTTCCCCACATTTATGAACAAGTCTA CAAGGAGCAAATGGAGATTGCAAAATCAAAAGTGCCAGGAGAGTTGTTGA ATTGGGAGGACATCCAAAAGATGAAATATTCGTGGAACGTAGCTTGTGAA GTGATGAGACTTGCCCCTCCACTCCAAGGAGCTTTCAGGGAAGCCATTAC TGACTTCGTCTTCAACGGTTTCTCCATTCCAAAAGGCTGGAAGTTGTACT GGAGCGCAAATTCCACCCACAAAAGTCCGGATTATTTCCCTGAGCCCGAC AAGTTCGACCCAACTAGATTCGAAGGAAATGGACCTGCGCCTTACACCTT TGTTCCATTTGGGGGAGGACCCAGGATGTGCCCGGGCAAAGAGTATGCCC GATTGGAAATACTTGTGTTCATGCATAACTTGGTGAAGAGGTTCAAGTGG GAGAAATTGGTTCCTGATGAAAAGATTGTGGTTGATCCAATGCCCATTCC AGCAAAGGGTCTTCCTGTTCGCCTTTATCCTCACAAAGCTTGA

    SEQ ID NO: 4 - QsCYP716_2073932 (OQHZ-2073932) Translated Nucleotide Sequence (480aa)

    [0350] TABLE-US-00016 MEHLYLSLVLLFVSSISLSLFFLFYKHKSMFTGANLPPGKIGYPLIGESL EFLSTGWKGHPEKFIFDRMSKYSSQIFKTSILGEPTAVFPGAVCNKFLFS NENKLVNAWWPASVDKIFPSSLQTSSKEEAKKMRKLLPQFLKPEALHRYI GIMDSIAQRHFADSWENKNQVIVFPLAKRYTFWLACRLFISVEDPTHVSR FADPFQLLAAGIISIPIDLPGTPFRKAINASQFIRKELLAIIRQRKIDLG EGKASPTQDILSHMLLTCDENGQYMNELDIADKILGLLVGGHDTASAACT FIVKFLAELPHIYEQVYKEQMEIAKSKVPGELLNWEDIQKMKYSWNVACE VMRLAPPLQGAFREAITDFVFNGFSIPKGWKLYWSANSTHKSPDYFPEPD KFDPTRFEGNGPAPYTFVPFGGGPRMCPGKEYARLEILVFMHNLVKRFKW EKLVPDEKIVVDPMPIPAKGLPVRLYPHKA*
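Each "Translated Nucleotide Sequence" in this listing can be checked against its coding sequence by translating the coding sequence with the standard genetic code. The following is a minimal sketch of such a check, not part of the original disclosure. The function name `translate` and the two test fragments are illustrative; the fragments are the first 30 and last 12 nucleotides of SEQ ID NO: 3 as listed above. Input is upper-cased first, so the sketch also accepts the lower-case listing used for SEQ ID NO: 35.

```python
# Standard genetic code (NCBI translation table 1), built in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    # Remove the spaces used to break the listing into blocks of 50.
    seq = "".join(cds.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First 30 and last 12 nucleotides of SEQ ID NO: 3, copied from the listing.
seq3_start = "ATGGAGCACTTGTATCTCTCCCTTGTGCTC"
seq3_end = "CACAAAGCTTGA"
print(translate(seq3_start))
print(translate(seq3_end))
```

The two fragments translate to the first ten residues and the final residues (including the stop, shown as `*`) of SEQ ID NO: 4. Applied to a full coding sequence, the same function should return the whole listed protein sequence, assuming no ambiguous bases are present.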

    SEQ ID NO: 5 - QsCYP716-2012090 (OQHZ-2012090) (C-16α Oxidase) Coding Sequence (1506bp/1443bp): NB Long and Short Isoforms as Described Herein are Distinguished by the Presence of the First 63 Nucleotides, Underlined in the Sequences Below (21 Amino Acids)

    [0351] TABLE-US-00017 ATGATATATAATAATGATAGTAATGATAATGAATTAGTAATCAGCTCAGT TCAGCAACCATCCATGGATCCTTTCTTCATTTTTGGCTTACTTCTCTTGG CTCTCTTTCTCTCTGTTTCTTTTCTTCTCTACCTTTCCCGTAGAGCCTAT GCTTCTCTCCCCAACCCTCCGCCGGGGAAGCTCGGCTTCCCCGTCGTCGG CGAGAGTCTCGAATTTCTCTCCACCCGACGCAAAGGTGTTCCTGAGAAAT TCGTCTTCGACAGAATGGCCAAATACTGTCGGGATGTCTTTAAGACATCA ATATTGGGAGCAACCACCGCCGTCATGTGCGGCACCGCCGGTAACAAATT CTTGTTCTCCAACGAGAAAAAACACGTCACTGGTTGGTGGCCGAAATCTG TAGAGCTGATTTTCCCAACCTCACTTGAGAAATCATCCAACGAAGAATCC ATCATGATGAAACAATTCCTTCCCAACTTCTTGAAACCAGAACCTTTGCA GAAGTACATACCCGTTATGGACATAATTACCCAAAGACACTTCAATACAA GCTGGGAAGGACGCAACGTGGTCAAAGTGTTTCCTACGGCTGCCGAATTC ACCACGTTGCTGGCTTGTCGGGTATTCCTCAGTGTTGAGGATCCCATTGA AGTAGCCAAGATTTCAGAGCCATTTGAAATCTTAGCTGCTGGGTTTCTTT CAATACCCATAAATCTTCCGGGTACCAAATTAAATAAAGCGGTTAAGGCA GCGGATCAGATTAGAGACGCAATTGTACAGATTTTGAAACGGAGAAGGGT TGAAATTGCGGAGAATAAAGCAAATGGAATGCAAGATATAGCGTCCATGT TGTTGACGACACCAACTAATGCTGGGTTTTATATGACCGAGGCTCACATT TCTGAGAAAATTTTGGGTATGATTGTTGGTGGCCGTGATACTGCTAGTAC TGTTATCACCTTCATCATCAAGTATTTGGCAGAGAATCCTGAAATTTATA ATAAGGTCTATGAGGAGCAAATGGAAGTGGTAAAGTCAAAGAAACCAGGT GAGTTGCTGAACTGGGAAGATGTGCAGAAAATGAAGTACTCTTGGTGCGT AGCATGTGAAGCTATGCGACTTGCTCCTCCTGTTCAAGGTGGTTTCAAGG TGGCCATTAATGACTTTGTGTATTCTGGGTTCAACATTCGCAAGGGTTGG AAGTTATATTGGAGTGCCATTGCAACACACATGAATCCAGAATATTTCCC AGAACCTGAGAAATTCAACCCCTCAAGGTTTGAAGGGAAGGGACCAGTAC CTTACAGCTTCGTACCCTTCGGAGGCGGACCTCGGATGTGTCCCGGGAAA GAGTATTCCCGGCTGGAAACACTTGTTTTCATGCATCATTTGGTGACGAG GTACAATTGGGAGAAAGTGTATCCCACAGAGAAGATAACAGTGGATCCAA TGCCATTCCCTGTCAACGGCCTCCCCATTCGCCTTATTCCTCACAAGCAC CAATGA

    SEQ ID NO: 6 - QsCYP716_2012090 (OQHZ-2012090) Translated Nucleotide Sequence (501aa/480aa)

    [0352] TABLE-US-00018 MIYNNDSNDNELVISSVQQPSMDPFFIFGLLLLALFLSVSFLLYLSRRAY ASLPNPPPGKLGFPVVGESLEFLSTRRKGVPEKFVFDRMAKYCRDVFKTS ILGATTAVMCGTAGNKFLFSNEKKHVTGWWPKSVELIFPTSLEKSSNEES IMMKQFLPNFLKPEPLQKYIPVMDIITQRHFNTSWEGRNVVKVFPTAAEF TTLLACRVFLSVEDPIEVAKISEPFEILAAGFLSIPINLPGTKLNKAVKA ADQIRDAIVQILKRRRVEIAENKANGMQDIASMLLTTPTNAGFYMTEAHI SEKILGMIVGGRDTASTVITFIIKYLAENPEIYNKVYEEQMEVVKSKKPG ELLNWEDVQKMKYSWCVACEAMRLAPPVQGGFKVAINDFVYSGFNIRKGW KLYWSAIATHMNPEYFPEPEKFNPSRFEGKGPVPYSFVPFGGGPRMCPGK EYSRLETLVFMHHLVTRYNWEKVYPTEKITVDPMPFPVNGLPIRLIPHKH Q*
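The long and short isoform lengths stated for SEQ ID NOs: 5 and 6 follow from simple arithmetic on the 63-nucleotide leader. The sketch below is illustrative only and is not part of the original disclosure. The helper names are hypothetical, and `SEQ5_PREFIX` is the first 72 nucleotides of SEQ ID NO: 5 copied from the listing above. It assumes that each stated coding-sequence length includes the stop codon.

```python
LONG_CDS_BP = 1506   # long isoform coding sequence, as stated for SEQ ID NO: 5
LEADER_BP = 63       # first 63 nucleotides, present only in the long isoform


def protein_length(cds_bp: int) -> int:
    """Residues encoded by a CDS whose stated length includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1


def short_isoform(long_cds: str, leader_bp: int = LEADER_BP) -> str:
    """Drop the leader; the remainder must itself begin with a start codon."""
    short = long_cds[leader_bp:]
    if not short.upper().startswith("ATG"):
        raise ValueError("sequence after the leader does not start with ATG")
    return short


SHORT_CDS_BP = LONG_CDS_BP - LEADER_BP
long_aa = protein_length(LONG_CDS_BP)
short_aa = protein_length(SHORT_CDS_BP)

# First 72 nt of SEQ ID NO: 5 (listing spaces removed).
SEQ5_PREFIX = (
    "ATGATATATAATAATGATAGTAATGATAATGAATTAGTAATCAGCTCAGT"
    "TCAGCAACCATCCATGGATCCT"
)
print(SHORT_CDS_BP, long_aa, short_aa, long_aa - short_aa)
print(short_isoform(SEQ5_PREFIX))
```

Under these assumptions the short coding sequence is 1443 bp, the proteins are 501 and 480 residues, and the difference is the 21 amino acids noted in the heading of SEQ ID NO: 5. The slice also shows that nucleotide 64 begins an in-frame ATG, consistent with the short isoform starting at an internal methionine.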

    SEQ ID NO: 7 - QsCYP714_c36368 (C-23 Candidate #7) Coding Sequence (1524bp)

    [0353] TABLE-US-00019 ATGTGGTTCACAGTAGGATTGGTCTTGGTTTTCGCCCTATTCATACGTCT CTACAGCAGTCTGTGGTTGAAGCCTCGTGCAACTCGGATTAAGCTTAGCA ATCAAGGAATTAAAGGTCCAAAACCAGCATTTCTTCTGGGTAATGTTGCA GAGATGAGAAGATTTCAATCTAAGCTTCCAAAATCTGAACTCAAACAAGG CCAAGTTTCTCATGATTGGGCTTCTAAATCTCTGTTTCCATTTTTCAGTC TTTGGTCCCAGAAATACGGAAATACGTTCGTGTTCTCATTGGGGAACATA CAGGTGCTCTATGTTTCTGATCATGAGTTGGTGAAAGAAATTAATCAGAA TACCTCTTTAGATTTGGGCAAACCCAAGTACCTGCAGAAGGAGCGTGGCC CTTTGCTGGGACAAGGTATTTTGACCTCCAATGGACAGCTTTGGGCGTAC CAGAGAAAAATCATGACTCCTGAACTCTACAAGGAGAAAATCAAGGGCAT GTGCGAGTTGATGGTGGAATCTGTAGCTTGGTTGGTTGAGGAATGGGGAA CGAAGATCCAAGCTGAGGGTGGGGCAGCAGACATTAGAATAGACGAGGAT CTTAGAAGCTTCTCTGGTGATGTAATTTCAAAAGCTTGTTTTGGGAGCTG CTATGCCGGAGGGAGGGAAATCTTTCTTAGGCTCAGAGCTCTTCAACACC AAATTGCTTCCAAAGCCTTACTCATGGGCTTCCCTGGATTAAAGTACCTG CCCATTAAGAGCAACAGAGAGATATGGAGATTGGAGAAGGAGATCTTCCA GCTGATTATGAAGCTGGCTGAAGATAGAAAAAAAGAACAACATGAGAGAG ACCTATTACAGATTATAATTGAGGGAGCTAAAAGTAGTGATCTGAGTTCG GAAGCAATGGCAAAATTCATTGTGGACAACTGCAAGAATGTCTACTTGGC TGGCCATGAAACTACTGCAATGTCTGCTGGTTGGACTTTGCTTCTCTTGG CTAATCATCCTGAGTGGCAAGCCCGTGTCCGTGATGAGATTTTACAAGTC ACCGAGGGCCGCAATCCTGATTTTGACATGCTGCACAAGATGAAACTGTT AACAATGGTAATTCAGGAGGCACTGCGACTCTACCCAACAGTCATATTCA TGTCAAGAGAAGCATTGGAAGATATTAATGTTGGAAACATCCAAGTTCCA AAAGGTGTTAACATATGGATACCTGTGGTAAATCTTCAAAGGGACACAAC GGTATGGGGTGCAGACGCAAACGAGTTTAATCCTGAAAGGTTTGCCAATG GAGTTAACAATTCATGCAAGGTTCCACAACTTTACCTACCATTTGGAGCT GGACCTCGCATTTGTCCTGGAATTAATCTGGCCATGACTGAGATCAAGAT ACTTCTGTGTATCCTGCTCACCAAGTTTTCGTTTTCAGTTTCACCCAACT ATCGCCACTCACCGGTGTTTAAATTGGTGCTTGAGCCTGAAAATGGAATC AATGTCATCATGAAGAAGCTCTAA

    SEQ ID NO: 8 - QsCYP714_c36368 (C-23 Candidate #7) Translated Nucleotide Sequence (507aa)

    [0354] TABLE-US-00020 MWFTVGLVLVFALFIRLYSSLWLKPRATRIKLSNQGIKGPKPAFLLGNVA EMRRFQSKLPKSELKQGQVSHDWASKSLFPFFSLWSQKYGNTFVFSLGNI QVLYVSDHELVKEINQNTSLDLGKPKYLQKERGPLLGQGILTSNGQLWAY QRKIMTPELYKEKIKGMCELMVESVAWLVEEWGTKIQAEGGAADIRIDED LRSFSGDVISKACFGSCYAGGREIFLRLRALQHQIASKALLMGFPGLKYL PIKSNREIWRLEKEIFQLIMKLAEDRKKEQHERDLLQIIIEGAKSSDLSS EAMAKFIVDNCKNVYLAGHETTAMSAGWTLLLLANHPEWQARVRDEILQV TEGRNPDFDMLHKMKLLTMVIQEALRLYPTVIFMSREALEDINVGNIQVP KGVNIWIPVVNLQRDTTVWGADANEFNPERFANGVNNSCKVPQLYLPFGA GPRICPGINLAMTEIKILLCILLTKFSFSVSPNYRHSPVFKLVLEPENGI NVIMKKL*

    SEQ ID NO: 9; BfCYP716Y1 (Bupleurum Falcatum C-16α Oxidase) Coding Sequence (1437bp)

    [0355] TABLE-US-00021 ATGGAACTTTCTATCACTCTGATGCTTATTTTCTCAACAACCATCTTCTT TATATTTCGTAATGTGTACAACCATCTCATCTCTAAACACAAAAACTATC CCCCTGGAAGTATGGGCTTGCCTTACATTGGCGAAACACTTAGTTTCGCG AGATACATCACCAAAGGAGTCCCTGAAAAATTCGTAATAGAAAGACAAAA GAAATATTCAACAACAATATTTAAGACCTCCTTGTTCGGAGAAAACATGG TGGTGTTGGGCAGTGCAGAGGGCAACAAATTTATTTTTGGAAGCGAGGAG AAGTATTTACGAGTGTGGTTTCCAAGTTCTGTGGACAAAGTGTTCAAAAA ATCTCATAAGAGAACGTCGCAGGAAGAAGCTATTAGGTTGCGCAAAAACA TGGTGCCATTTCTCAAAGCAGATTTGTTGAGAAGTTATGTACCAATAATG GACACATTTATGAAACAACATGTGAACTCGCATTGGAATTGCGAGACCTT GAAGGCTTGTCCTGTGATCAAGGATTTTACGTTTACTTTAGCTTGTAAAC TTTTTTTTAGTGTAGACAATCCTTTGGAGCTAGAGAAGTTAATCAAGCTA TTTGTGAATATAGTGAATGGCCTCCTTACGGTCCCTATTGATCTCCCGGG GACAAAATTTAGAGGAGTTATAAAGAGTGTCAAGACTATTCGCCATGCGC TTAAAGTGTTGATCAGGCAACGAAAGGTGGATATTAGAGAGAAAAGAGCC ACACCTACGCAAGATATATTGTCGATAATGCTGGCACAGGCTGAGGACGA GAACTATGAAATGAATGATGAAGATGTGGCCAATGACTTTCTTGCAGTTT TGCTTGCTAGTTATGATTCTGCCAATACTACACTCACCATGATTATGAAA TATCTTGCTGAATATCCCGAAATGTATGATCGAGTTTTCAGAGAACAAAT GGAGGTGGCAAAGACGAAAGGAAAAGATGAATTACTCAACTTGGACGACT TGCAAAAGATGAATTATACTTGGAATGTAGCTTGTGAAGTACTGAGAATT GCAACACCAACGTTCGGAGCATTCAGAGAGGTTATTGCAGATTGTACATA CGAAGGGTACACCATACCAAAAGGCTGGAAGCTATATTATGCCCCGCGTT TTACCCATGGAAGTGCAAAATACTTTCAAGATCCAGAGAAATTTGATCCA TCGCGATTTGAAGGTGATGGTGCGCCTCCTTATACATTCGTTCCATTCGG AGGAGGGCTCCGGATGTGCCCTGGATACAAGTATGCAAAGATTATAGTAC TAGTGTTCATGCACAATATAGTTACAAAGTTCAAATGGGAGAAAGTTAAC CCTAATGAGAAAATGACAGTAGGAATCGTATCAGCGCCAAGTCAAGGACT TCCACTGCGTCTCCATCCCCACAAATCTCCATCTTAA

    SEQ ID NO: 10; BfCYP716Y1 (Bupleurum Falcatum C-16α Oxidase) Translated Nucleotide Sequence (478aa)

    [0356] TABLE-US-00022 MELSITLMLIFSTTIFFIFRNVYNHLISKHKNYPPGSMGLPYIGETLSFA RYITKGVPEKFVIERQKKYSTTIFKTSLFGENMVVLGSAEGNKFIFGSEE KYLRVWFPSSVDKVFKKSHKRTSQEEAIRLRKNMVPFLKADLLRSYVPIM DTFMKQHVNSHWNCETLKACPVIKDFTFTLACKLFFSVDNPLELEKLIKL FVNIVNGLLTVPIDLPGTKFRGVIKSVKTIRHALKVLIRQRKVDIREKRA TPTQDILSIMLAQAEDENYEMNDEDVANDFLAVLLASYDSANTTLTMIMK YLAEYPEMYDRVFREQMEVAKTKGKDELLNLDDLQKMNYTWNVACEVLRI ATPTFGAFREVIADCTYEGYTIPKGWKLYYAPRFTHGSAKYFQDPEKFDP SRFEGDGAPPYTFVPFGGGLRMCPGYKYAKIIVLVFMHNIVTKFKWEKVN PNEKMTVGIVSAPSQGLPLRLHPHKSPS*

    SEQ ID NO: 11; MlCYP87D16 (Maesa Lanceolata C-16α Oxidase) Coding Sequence (1428bp)

    [0357] TABLE-US-00023 ATGTGGGTAGTGGGATTAATTGGTGTGGCTGTGGTAACAATATTGATAAC TCAGTATGTATACAAATGGAGAAATCCAAAGACTGTGGGTGTTCTGCCAC CTGGTTCAATGGGTCTGCCTTTGATCGGGGAGACTCTTCAACTTCTCAGC CGTAATCCATCCTTGGATCTTCATCCTTTCATCAAGAGCAGAATCCAAAG ATATGGGCAGATATTCGCGACCAATATCGTAGGTCGACCCATAATAGTAA CCGCTGATCCGCAGCTCAATAATTACCTTTTCCAACAAGAAGGAAGAGCA GTAGAACTGTGGTACTTGGACAGCTTTCAAAAGCTATTTAACTTAGAAGG TGCAAACAGGCCGAACGCAGTTGGTCACATTCACAAGTACGTTAGAAGTG TATACTTGAGTCTCTTTGGCGTCGAGAGCCTTAAAACAAAGTTGCTTGCC GATATTGAGAAAACAGTCCGCAAAAATCTTATTGGTGGGACAACCAAAGG CACCTTTGATGCAAAACATGCTTCTGCCAATATGGTTGCTGTTTTTGCTG CAAAATACTTGTTCGGACATGATTACGAGAAATCGAAAGAAGATGTAGGC AGCATAATCGACAACTTCGTACAAGGACTTCTCGCATTCCCATTGAATGT TCCCGGTACAAAGTTCCACAAATGTATGAAGGACAAGAAAAGGCTGGAAT CAATGATCACTAACAAGCTAAAGGAGAGAATAGCTGATCCGAACAGCGGA CAAGGGGATTTCCTTGATCAAGCAGTGAAAGACTTGAATAGCGAATTCTT CATAACAGAGACTTTTATCGTTTCGGTGACGATGGGAGCTTTATTTGCGA CGGTTGAATCGGTTTCGACAGCAATTGGACTAGCTTTCAAGTTTTTTGCA GAGCACCCCTGGGTTTTGGATGACCTCAAGGCTGAGCATGAGGCTGTCCT TAGCAAAAGAGAGGATAGAAATTCACCTCTCACGTGGGACGAATATAGAT CGATGACACACACGATGCACTTTATCAATGAAGTCGTCCGTTTGGGAAAT GTTTTTCCTGGAATTTTGAGGAAAGCACTGAAAGATATTCCATATAATGG TTATACAATTCCGTCCGGTTGGACCATTATGATTGTGACCTCTACCCTTG CGATGAACCCTGAGATATTCAAGGATCCTCTTGCATTCAATCCGAAACGT TGGCGGGATATTGATCCCGAAACTCAAACTAAAAACTTTATGCCTTTCGG TGGTGGGACGAGACAATGCGCAGGTGCAGAGCTAGCCAAGGCATTCTTTG CTACCTTCCTCCATGTTTTAATCAGCGAATATAGCTGGAAGAAAGTGAAG GGAGGAAGCGTTGCTCGGACACCTATGTTAAGTTTTGAAGATGGCATATT TATTGAGGTCACCAAGAAAAACAAGTGA

    SEQ ID NO: 12; MlCYP87D16 (Maesa Lanceolata C-16α Oxidase) Translated Nucleotide Sequence (475aa)

    [0358] TABLE-US-00024 MWVVGLIGVAVVTILITQYVYKWRNPKTVGVLPPGSMGLPLIGETLQLLS RNPSLDLHPFIKSRIQRYGQIFATNIVGRPIIVTADPQLNNYLFQQEGRA VELWYLDSFQKLFNLEGANRPNAVGHIHKYVRSVYLSLFGVESLKTKLLA DIEKTVRKNLIGGTTKGTFDAKHASANMVAVFAAKYLFGHDYEKSKEDVG SIIDNFVQGLLAFPLNVPGTKFHKCMKDKKRLESMITNKLKERIADPNSG QGDFLDQAVKDLNSEFFITETFIVSVTMGALFATVESVSTAIGLAFKFFA EHPWVLDDLKAEHEAVLSKREDRNSPLTWDEYRSMTHTMHFINEVVRLGN VFPGILRKALKDIPYNGYTIPSGWTIMIVTSTLAMNPEIFKDPLAFNPKR WRDIDPETQTKNFMPFGGGTRQCAGAELAKAFFATFLHVLISEYSWKKVK GGSVARTPMLSFEDGIFIEVTKKNK*

    SEQ ID NO: 13; MtCYP72A68v2 (Medicago Truncatula C-23 Oxidase) Coding Sequence (1563bp)

    [0359] TABLE-US-00025 ATGGAATTATCTTGGGAAACAAAATCAGCCATAATTCTCATCACTGTGAC ATTTGGTTTGGTATACGCATGGAGGGTATTGAATTGGATGTGGCTGAAGC CAAAGAAGATAGAGAAGCTTTTAAGAGAACAAGGCCTTCAAGGGAACCCT TATAGACTTTTGCTTGGAGATGCAAAGGATTATTTTGTGATGCAAAAGAA AGTTCAATCCAAACCCATGAATCTATCTGATGATATTGCGCCACGTGTCG CTCCTTACATTCATCATGCTGTTCAAACTCATGGGAAAAAGTCTTTTATT TGGTTTGGAATGAAACCATGGGTGATTCTCAATGAACCTGAACAAATAAG AGAAGTATTCAACAAGATGTCTGAGTTCCCAAAGGTTCAATATAAGTTTA TGAAGTTAATAACTCGCGGTCTTGTTAAACTAGAAGGAGAAAAGTGGAGC AAGCATAGAAGAATAATCAACCCTGCGTTTCACATGGAAAAATTGAAGAT TATGACACCAACATTCTTGAAAAGCTGCAATGATTTGATTAGCAATTGGG AAAAAATGTTGTCTTCAAATGGATCATGTGAAATGGACGTATGGCCTTCC CTTCAGAGCTTGACAAGTGATGTTATCGCTCGTTCGTCATTTGGAAGTAG TTATGAAGAAGGAAGAAAAGTATTTCAACTTCAAATAGAGCAAGGTGAAC TTATAATGAAAAATCTAATGAAATCTTTAATCCCTTTATGGAGGTTTTTA CCTACCGCTGATCATAGAAAGATAAATGAAAATGAAAAACAAATAGAAAC TACTCTTAAGAATATAATTAACAAGAGGGAAAAAGCAATTAAGGCAGGTG AAGCCACTGAGAATGACTTATTAGGTCTCCTCCTAGAGTCGAACCACAGA GAAATTAAAGAACATGGAAACGTCAAGAATATGGGATTGAGTCTTGAAGA AGTAGTCGGGGAATGCAGGTTATTCCATGTTGCAGGGCAAGAGACTACTT CAGATTTGCTTGTTTGGACGATGGTGTTGTTGAGTAGGTACCCTGATTGG CAAGAACGTGCAAGGAAGGAAGTATTAGAGATATTTGGCAATGAAAAACC CGACTTTGATGGACTAAATAAACTTAAGATTATGGCCATGATTTTGTATG AGGTTTTGAGGTTGTACCCTCCTGTAACCGGCGTTGCTCGAAAAGTTGAG AATGATATAAAACTTGGAGACTTGACATTATATGCTGGAATGGAGGTTTA CATGCCAATTGTTTTGATTCACCATGATTGTGAACTATGGGGTGATGATG CTAAGATTTTCAATCCTGAGAGATTTTCTGGTGGAATTTCCAAAGCAACA AACGGTAGATTTTCATATTTTCCGTTTGGAGCGGGTCCTAGAATCTGCAT TGGACAAAACTTTTCCCTGTTGGAAGCAAAGATGGCAATGGCATTGATTT TAAAGAATTTTTCATTTGAACTTTCTCAAACATATGCTCATGCTCCATCT GTGGTGCTTTCTGTTCAGCCACAACATGGTGCTCATGTTATTCTACGCAA AATCAAAACATAA

    SEQ ID NO: 14; MtCYP72A68v2 (Medicago Truncatula C-23 Oxidase) Translated Nucleotide Sequence (520aa)

    TABLE-US-00026 MELSWETKSAIILITVTFGLVYAWRVLNWMWLKPKKIEKLLREQGLQGNP YRLLLGDAKDYFVMQKKVQSKPMNLSDDIAPRVAPYIHHAVQTHGKKSFI WFGMKPWVILNEPEQIREVFNKMSEFPKVQYKFMKLITRGLVKLEGEKWS KHRRIINPAFHMEKLKIMTPTFLKSCNDLISNWEKMLSSNGSCEMDVWPS LQSLTSDVIARSSFGSSYEEGRKVFQLQIEQGELIMKNLMKSLIPLWRFL PTADHRKINENEKQIETTLKNIINKREKAIKAGEATENDLLGLLLESNHR EIKEHGNVKNMGLSLEEVVGECRLFHVAGQETTSDLLVWTMVLLSRYPDW QERARKEVLEIFGNEKPDFDGLNKLKIMAMILYEVLRLYPPVTGVARKVE NDIKLGDLTLYAGMEVYMPIVLIHHDCELWGDDAKIFNPERFSGGISKAT NGRFSYFPFGAGPRICIGQNFSLLEAKMAMALILKNFSFELSQTYAHAPS VVLSVQPQHGAHVILRKIKT*

    SEQ ID NO: 15; AsCYP94D65 (Avena Strigosa C-23 Oxidase) Coding Sequence (1551bp)

    [0360] TABLE-US-00027 ATGGAGCCGGCGCCCTTGAGCTCATCGCCGGTGCTTATCTGCCTCCTACT CCTACTCCTACCCATCGTCCTCTATTTTGTGTACCGGAAAAATAATCTGA AGAGGAAGCAGCAGCAGCAGCAGCAGAATGGGCCGCGGGAGCTGCGGGCG TACCCGATCGTGGGCACGCTTCCACACTTCATCAAGAACGGGCGGCGCTT CCTGGAGTGGTCGTCGGCCGTCATGCAGCGCAGCCCGACGCACACCATGA TCCTCAAGGTGCTGGGCCTGTCGGGCACCGTGTTCACGGCGAGCCCGGCC AGCGTGGAACACGTGCTGAAGACGCGCTTCGCGAACTACCCGAAAGGCGG TCTGGTCGATATCCAGACCGACTTCCTTGGGCACGGCATCTTCAACTCGG ACGGCGAGGAGTGGCAGCAGCAGCGCAAGATGGCCAGCTACGAGTTCAAC CAGCGGTCGCTCAGGAGCTTCGTGGTGCACGCCGTCCGTTTCGAGGTGGT GGAGCGCCTGCTGCCGCTGCTGGAGCGGGCCGCCGGGGCTGGAGCGGCCG TCGACCTGCAGGACGTGCTGGAGCGCTTCGCCTTCGACAACATCTGCCGC GTGGCTTTCGGCCAGGACCCGGCATGCCTCACGGAGGAGAGCATGGGCGC GAGGCAGAGCGTGGAGTTGATGCACGCCTTCGATGTGGCAAGCACCATCG TCATTACCAGGTTCGTGTCTCCGACGTGGTTGTGGCGCCTGATGAAGCTG CTCAACGTGGGGCCGGAGCGGCGGATGCGGAAGGCACTGGCATCCATCCA CGGCTACGCCGACAACATCATCCGGGAGAGGAAGAAGAAGAAGAAGACAT CAGGGAAGGACGACGACCTCCTGTCGCGCTTCGCCGATTCCGGCGAGCAC AGCGACGAGAGCCTCCGCTACGTGATCACCAACTTCATACTCGCCGGCCG CGACTCCAGCTCCGCCGCGCTCACATGGTTTTTCTGGCTCGTCTCCACCA GGCCCGAGGTACAGGACAGGATCTCCAAGGAGATCCGAGCGGCGCGCCAG GCAAGCGCAACGACGACGGGGCCCTTCGGCCTGGAGGAGCTGCGCGAGAT GCACTACATCCACGCCGCCATCACGGAGTCCATGCGGCTCTACCCGCCGG TGCCCATCAACGCGCGCACCTCCACCGAGGACGATGTCCTTCCAGACGGC ACCGTGGTCGGGAAAGGCTGGCGGGTGATCTACTCCGCCTACGCCATGGG GCGGATGGAGGACGCCTGGGGAAAGGACGGGGACGAGTTCCGGCCGGAGA GGTGGCTGGACGCGGAGACAGGGGTGTTCAGGCCGGAGGCACCCTGCAAG TACCCGGTGTTCCACGTCGGCCCAAGAATGTGCCTCGGCAAAGAGATGGC CTACATACAGATGAAGTCCATCGTGGCGTCCGTGTTTGAGAGGTTCAGCT TGCGCTACCTCGGCGGGGACGCCCATCCCGGCCTCCAGCTCGCTGGAACT CTGCGCATGGAAGGCGGCTTGCCGATGCACCTAGAAATCAGTACTAACTA G

    SEQ ID NO: 16; AsCYP94D65 (Avena Strigosa C-23 Oxidase) Translated Nucleotide Sequence (516aa)

    [0361] TABLE-US-00028 MEPAPLSSSPVLICLLLLLLPIVLYFVYRKNNLKRKQQQQQQNGPRELRA YPIVGTLPHFIKNGRRFLEWSSAVMQRSPTHTMILKVLGLSGTVFTASPA SVEHVLKTRFANYPKGGLVDIQTDFLGHGIFNSDGEEWQQQRKMASYEFN QRSLRSFVVHAVRFEVVERLLPLLERAAGAGAAVDLQDVLERFAFDNICR VAFGQDPACLTEESMGARQSVELMHAFDVASTIVITRFVSPTWLWRLMKL LNVGPERRMRKALASIHGYADNIIRERKKKKKTSGKDDDLLSRFADSGEH SDESLRYVITNFILAGRDSSSAALTWFFWLVSTRPEVQDRISKEIRAARQ ASATTTGPFGLEELREMHYIHAAITESMRLYPPVPINARTSTEDDVLPDG TVVGKGWRVIYSAYAMGRMEDAWGKDGDEFRPERWLDAETGVFRPEAPCK YPVFHVGPRMCLGKEMAYIQMKSIVASVFERFSLRYLGGDAHPGLQLAGT LRMEGGLPMHLEISTN*

    SEQ ID NO: 17; MtCYP716A12 (Medicago Truncatula C-28 Oxidase) Coding Sequence (1440bp)

    [0362] TABLE-US-00029 ATGGAGCCTAATTTCTATCTCTCCCTTCTCCTTCTCTTTGTCACTTTCAT ATCTCTCTCTCTTTTTTTCATATTCTACAAACAGAAATCTCCATTAAATT TGCCACCTGGTAAAATGGGTTACCCAATCATAGGTGAAAGCCTTGAGTTC TTATCAACAGGATGGAAAGGACATCCTGAAAAATTCATTTTCGACCGTAT GCGTAAATATTCCTCAGAACTCTTTAAAACATCAATCGTAGGAGAATCTA CGGTGGTTTGTTGCGGAGCAGCAAGTAACAAGTTTTTGTTTTCAAACGAG AATAAACTTGTGACTGCATGGTGGCCAGATAGTGTAAACAAAATCTTCCC TACTACTTCTCTTGACTCTAACTTGAAGGAAGAATCCATCAAGATGAGAA AATTGCTTCCACAATTCTTTAAACCCGAAGCTCTACAACGTTATGTTGGT GTCATGGATGTTATTGCTCAAAGACATTTTGTTACTCATTGGGATAATAA AAATGAAATCACCGTCTACCCCTTGGCCAAGAGGTACACCTTTTTGTTAG CTTGTCGGTTGTTCATGAGCGTTGAAGACGAGAATCATGTAGCAAAATTT AGTGATCCATTTCAGTTAATTGCGGCCGGAATCATATCTCTACCAATTGA TTTGCCAGGAACACCATTCAACAAAGCTATAAAGGCCTCAAACTTTATAA GAAAGGAGTTGATTAAGATCATAAAGCAAAGGAGGGTAGATTTGGCAGAA GGGACAGCATCACCAACACAAGATATATTGTCTCACATGTTGTTGACAAG TGATGAAAATGGAAAGAGTATGAATGAACTTAATATTGCTGATAAGATTC TTGGCCTTTTGATCGGAGGACATGACACTGCTAGCGTCGCATGCACTTTC CTTGTCAAATATCTCGGCGAGTTACCTCACATTTATGATAAAGTCTATCA AGAGCAAATGGAAATTGCAAAATCGAAACCAGCAGGAGAATTGTTGAATT GGGATGACCTGAAGAAAATGAAATACTCTTGGAACGTAGCTTGTGAAGTA ATGAGACTTTCCCCTCCACTCCAAGGAGGTTTCAGGGAAGCCATCACTGA CTTTATGTTCAATGGATTCTCAATTCCTAAGGGATGGAAGCTTTATTGGA GTGCAAATTCAACACATAAGAACGCAGAATGTTTTCCCATGCCAGAGAAA TTTGACCCAACAAGATTTGAAGGAAATGGACCAGCTCCTTATACTTTTGT TCCCTTTGGTGGAGGACCAAGGATGTGTCCTGGAAAAGAGTATGCAAGAT TAGAAATACTTGTTTTCATGCACAATTTGGTGAAAAGGTTTAAGTGGGAA AAGGTGATTCCAGATGAGAAGATTATTGTTGATCCATTCCCCATCCCTGC AAAGGATCTTCCAATTCGCCTTTATCCACACAAAGCTTAA

    SEQ ID NO: 18; MtCYP716A12 (Medicago Truncatula C-28 Oxidase) Translated Nucleotide Sequence (479aa)

    [0363] TABLE-US-00030 MEPNFYLSLLLLFVTFISLSLFFIFYKQKSPLNLPPGKMGYPIIGESLEF LSTGWKGHPEKFIFDRMRKYSSELFKTSIVGESTVVCCGAASNKFLFSNE NKLVTAWWPDSVNKIFPTTSLDSNLKEESIKMRKLLPQFFKPEALQRYVG VMDVIAQRHFVTHWDNKNEITVYPLAKRYTFLLACRLFMSVEDENHVAKF SDPFQLIAAGIISLPIDLPGTPFNKAIKASNFIRKELIKIIKQRRVDLAE GTASPTQDILSHMLLTSDENGKSMNELNIADKILGLLIGGHDTASVACTF LVKYLGELPHIYDKVYQEQMEIAKSKPAGELLNWDDLKKMKYSWNVACEV MRLSPPLQGGFREAITDFMFNGFSIPKGWKLYWSANSTHKNAECFPMPEK FDPTRFEGNGPAPYTFVPFGGGPRMCPGKEYARLEILVFMHNLVKRFKWE KVIPDEKIIVDPFPIPAKDLPIRLYPHKA*

    SEQ ID NO: 29; AsHMGR (Avena Strigosa HMG-CoA Reductase) Coding Sequence (1689bp): NB: The Full-Length HMGR Sequence is Provided Below. The 5′ Region (Underlined) Can Be Removed to Generate a Truncated Feedback-Insensitive Form (tHMGR). The Sequence for tHMGR is Also Given Separately Below

    [0364] TABLE-US-00031 ATGGCTGTGGAGGTTCACCGCCGGGCTCCCGCGCCCCATGGCCGGGGCAC CGGGGAGAAGGGCCGCGTGCAGGCCGGGGACGCGCTGCCGCTGCCGATCC GCCACACCAACCTCATCTTCTCGGCGCTCTTCGCCGCCTCCCTCGCATAC CTCATGCGCCGCTGGAGGGAGAAGATCCGCAACTCCACGCCGCTCCACGT CGTGGGGCTCACCGAGATCTTCGCCATCTGCGGCCTCGTCGCCTCCCTCA TCTACCTCCTCAGCTTCTTCGGCATCGCCTTCGTGCAGTCCGTCGTATCC AACAGCGACGACGAGGACGAGGACTTCCTCATCGCGGCTGCAGCATCCCA GGCCCCCCCGCCGCCCTCCTCCAAGCCCGCGCCGCAGCAGTGCGCCCTGC TGCAGAGCGCCGGAGTCGCGCCCGAGAAAATGCCCGAGGAGGACGAGGAA ATCGTCGCCGGGGTCGTCGCAGGGAAGATCCCCTCCTACGTGCTCGAGAC CAGGCTAGGCGACTGCCGCAGGGCAGCCGGGATCCGCCGCGAGGCGCTGC GCCGGATCACCGGCAGGGAGATCGACGGCCTTCCCCTCGACGGCTTCGAC TACGACTCGATTCTCGGACAGTGCTGCGAGATGCCCGTCGGGTACGTGCA GCTGCCGGTCGGCGTCGCGGGGCCGCTCGTCCTCGACGGCCGCCGCATAT ACGTCCCGATGGCCACCACGGAGGGCTGCCTAATCGCCAGCACCAACCGC GGATGCAAGGCCATTGCCGAGTCCGGAGGCGCATCCAGCGTCGTGTACCG CGACGGGATGACCCGCGCCCCCGTAGCCCGCTTCCCCTCCGCACGACGCG CCGCAGAGCTCAAGGGCTTCCTGGAGAATCCGGCCAACTACGACACCCTG TCCGTGGTCTTTAACAGATCAAGCAGATTTGCAAGGCTGCAGGGGGTCAA GTGCGCCATGGCTGGGAGGAACTTGTACATGAGGTTCACCTGCAGCACCG GGGATGCCATGGGGATGAACATGGTCTCCAAGGGCGTCCAAAATGTGCTC GACTATCTGCAGGAGGACTTCCCTGACATGGACGTTGTCAGCATCTCAGG CAACTTTTGTTCCGACAAGAAATCAGCTGCTGTAAACTGGATTGAAGGCC GTGGAAAGTCCGTGGTTTGTGAGGCAGTAATCAGAGAGGAAGTTGTCCAC AAGGTTCTCAAGACCAACGTTCAGTCACTCGTGGAGTTGAATGTGATCAA GAACCTTGCTGGCTCAGCAGTTGCTGGTGCTCTTGGGGGTTTCAACGCCC ACGCAAGCAACATCGTAACGGCTATCTTCATTGCCACTGGTCAGGATCCT GCACAGAATGTGGAGAGCTCACAGTGTATCACTATGTTGGAAGCTGTAAA TGATGGCAGAGACCTTCACATCTCCGTTACAATGCCATCTATCGAGGTGG GCACAGTTGGTGGAGGCACGCAGCTGGCCTCACAGTCGGCCTGCTTGGAC CTACTGGGCGTCAAAGGCGCCAACAGGGAATCTCCGGGGTCGAACGCTAG GCTGCTGGCCACGGTGGTGGCTGGTGCCGTCCTAGCTGGGGAGCTGTCCC TCATCTCCGCCCAAGCTGCCGGCCATCTGGTCCAGAGCCACATGAAATAC AACAGATCCAGCAAGGACATGTCCAAGATCGCCTGCTGA

    SEQ ID NO: 30; AsHMGR (Avena Strigosa HMG-CoA Reductase) Translated Nucleotide Sequence (562aa)

    [0365] TABLE-US-00032 MAVEVHRRAPAPHGRGTGEKGRVQAGDALPLPIRHTNLIFSALFAASLAY LMRRWREKIRNSTPLHVVGLTEIFAICGLVASLIYLLSFFGIAFVQSVVS NSDDEDEDFLIAAAASQAPPPPSSKPAPQQCALLQSAGVAPEKMPEEDEE IVAGVVAGKIPSYVLETRLGDCRRAAGIRREALRRITGREIDGLPLDGFD YDSILGQCCEMPVGYVQLPVGVAGPLVLDGRRIYVPMATTEGCLIASTNR GCKAIAESGGASSVVYRDGMTRAPVARFPSARRAAELKGFLENPANYDTL SVVFNRSSRFARLQGVKCAMAGRNLYMRFTCSTGDAMGMNMVSKGVQNVL DYLQEDFPDMDVVSISGNFCSDKKSAAVNWIEGRGKSVVCEAVIREEVVH KVLKTNVQSLVELNVIKNLAGSAVAGALGGFNAHASNIVTAIFIATGQDP AQNVESSQCITMLEAVNDGRDLHISVTMPSIEVGTVGGGTQLASQSACLD LLGVKGANRESPGSNARLLATVVAGAVLAGELSLISAQAAGHLVQSHMKY NRSSKDMSKIAC*

    SEQ ID NO: 31; AstHMGR (Avena Strigosa Truncated HMG-CoA Reductase) Coding Sequence (1275bp)

    [0366] TABLE-US-00033 ATGGCGCCCGAGAAAATGCCCGAGGAGGACGAGGAAATCGTCGCCGGGGT CGTCGCAGGGAAGATCCCCTCCTACGTGCTCGAGACCAGGCTAGGCGACT GCCGCAGGGCAGCCGGGATCCGCCGCGAGGCGCTGCGCCGGATCACCGGC AGGGAGATCGACGGCCTTCCCCTCGACGGCTTCGACTACGACTCGATTCT CGGACAGTGCTGCGAGATGCCCGTCGGGTACGTGCAGCTGCCGGTCGGCG TCGCGGGGCCGCTCGTCCTCGACGGCCGCCGCATATACGTCCCGATGGCC ACCACGGAGGGCTGCCTAATCGCCAGCACCAACCGCGGATGCAAGGCCAT TGCCGAGTCCGGAGGCGCATCCAGCGTCGTGTACCGCGACGGGATGACCC GCGCCCCCGTAGCCCGCTTCCCCTCCGCACGACGCGCCGCAGAGCTCAAG GGCTTCCTGGAGAATCCGGCCAACTACGACACCCTGTCCGTGGTCTTTAA CAGATCAAGCAGATTTGCAAGGCTGCAGGGGGTCAAGTGCGCCATGGCTG GGAGGAACTTGTACATGAGGTTCACCTGCAGCACCGGGGATGCCATGGGG ATGAACATGGTCTCCAAGGGCGTCCAAAATGTGCTCGACTATCTGCAGGA GGACTTCCCTGACATGGACGTTGTCAGCATCTCAGGCAACTTTTGTTCCG ACAAGAAATCAGCTGCTGTAAACTGGATTGAAGGCCGTGGAAAGTCCGTG GTTTGTGAGGCAGTAATCAGAGAGGAAGTTGTCCACAAGGTTCTCAAGAC CAACGTTCAGTCACTCGTGGAGTTGAATGTGATCAAGAACCTTGCTGGCT CAGCAGTTGCTGGTGCTCTTGGGGGTTTCAACGCCCACGCAAGCAACATC GTAACGGCTATCTTCATTGCCACTGGTCAGGATCCTGCACAGAATGTGGA GAGCTCACAGTGTATCACTATGTTGGAAGCTGTAAATGATGGCAGAGACC TTCACATCTCCGTTACAATGCCATCTATCGAGGTGGGCACAGTTGGTGGA GGCACGCAGCTGGCCTCACAGTCGGCCTGCTTGGACCTACTGGGCGTCAA AGGCGCCAACAGGGAATCTCCGGGGTCGAACGCTAGGCTGCTGGCCACGG TGGTGGCTGGTGCCGTCCTAGCTGGGGAGCTGTCCCTCATCTCCGCCCAA GCTGCCGGCCATCTGGTCCAGAGCCACATGAAATACAACAGATCCAGCAA GGACATGTCCAAGATCGCCTGCTGA

    SEQ ID NO: 32; AstHMGR (Avena Strigosa Truncated HMG-CoA Reductase) Translated Nucleotide Sequence (424aa)

    [0367] TABLE-US-00034 MAPEKMPEEDEEIVAGVVAGKIPSYVLETRLGDCRRAAGIRREALRRITG REIDGLPLDGFDYDSILGQCCEMPVGYVQLPVGVAGPLVLDGRRIYVPMA TTEGCLIASTNRGCKAIAESGGASSVVYRDGMTRAPVARFPSARRAAELK GFLENPANYDTLSVVFNRSSRFARLQGVKCAMAGRNLYMRFTCSTGDAMG MNMVSKGVQNVLDYLQEDFPDMDVVSISGNFCSDKKSAAVNWIEGRGKSV VCEAVIREEVVHKVLKTNVQSLVELNVIKNLAGSAVAGALGGFNAHASNI VTAIFIATGQDPAQNVESSQCITMLEAVNDGRDLHISVTMPSIEVGTVGG GTQLASQSACLDLLGVKGANRESPGSNARLLATVVAGAVLAGELSLISAQ AAGHLVQSHMKYNRSSKDMSKIAC*
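The relationship between the full-length AsHMGR sequences (SEQ ID NOs: 29 and 30) and the truncated tHMGR sequences (SEQ ID NOs: 31 and 32) can be illustrated with a short calculation. The sketch below is not part of the original disclosure. It assumes, based on comparing the listed sequences, that the truncated coding sequence consists of a new ATG start codon followed by the unchanged 3′ portion of the full-length coding sequence. The function `truncate_cds` and the toy input in the last line are hypothetical.

```python
FULL_CDS_BP = 1689    # SEQ ID NO: 29, stated length
TRUNC_CDS_BP = 1275   # SEQ ID NO: 31, stated length
FULL_AA = 562         # SEQ ID NO: 30, stated length
START_CODON_BP = 3

# Nucleotides shared by both forms: everything in tHMGR after its start codon.
shared_bp = TRUNC_CDS_BP - START_CODON_BP
# 5' region of the full-length CDS that is absent from tHMGR.
removed_bp = FULL_CDS_BP - shared_bp
removed_codons = removed_bp // 3
# Protein length after removing the N-terminal region and adding a new Met.
trunc_aa = FULL_AA - removed_codons + 1


def truncate_cds(full_cds: str, n_codons_removed: int) -> str:
    """Remove the first n codons and prepend a fresh ATG start codon."""
    seq = "".join(full_cds.split()).upper()
    return "ATG" + seq[3 * n_codons_removed:]


print(removed_bp, removed_codons, trunc_aa)
# Toy (hypothetical) sequence, used only to show the slicing behaviour.
print(truncate_cds("ATGAAACCCGGGTGA", 2))
```

Under this assumption the removed 5′ region is 417 nucleotides (139 codons), and the resulting protein length agrees with the 424 residues stated for SEQ ID NO: 32.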

    SEQ ID NO: 33; AsSQS (Avena Strigosa Squalene Synthase) Coding Sequence (1212bp)

    [0368] TABLE-US-00035 ATGGGGGCGCTGTCGCGGCCGGAGGAGGTGGTGGCGCTGGTCAAGCTGAG GGTGGCGGCGGGGCAGATCAAGCGCCAGATCCCGGCCGAGGAACACTGGG CCTTCGCCTACGACATGCTCCAGAAGGTCTCCCGCAGCTTCGCGCTCGTC ATCCAGCAGCTCGGACCCGAACTCCGCAATGCCGTGTGCATCTTCTACCT CGTGCTCCGGGCCCTGGACACCGTCGAGGACGACACCAGCATCCCCAACG ACGTGAAGCTGCCCATCCTTCGGGATTTCTACCGCCATGTCTACAACCCC GACTGGCGTTATTCATGTGGAACAAACCACTACAAGGTGCTGATGGATAA GTTCAGACTCGTCTCCACGGCTTTCCTGGAGCTAGGCGAAGGATATCAAA AGGCAATTGAAGAAATCACTAGGCGAATGGGAGCAGGAATGGCAAAATTT ATATGCCAGGAGGTTGAAACGATTGATGACTATAATGAGTACTGCCACTA TGTAGCAGGGCTAGTAGGCTATGGACTTTCCAGGCTCTTTCATGCTGCTG GGACAGAAGATCTGGCTTCAGATCAACTTTCGAATTCAATGGGTTTGTTT CTTCAGAAAACCAATATAATAAGGGATTATTTGGAGGATATAAATGAGAT ACCAAAGTGCCGTATGTTTTGGCCTCGAGAAATATGGAGTAAATATGCAG ATAAACTTGAGGACCTCAAGTATGAGGAAAATTCAGAAAAAGCAGTGCAA TGCTTGAATGATATGGTGACTAATGCTTTGGTCCACGCCGAAGACTGTCT TCAATACATGTCTGCGTTGAAGGATAATACTAATTTTCGGTTTTGTGCAA TACCTCAGATAATGGCAATTGGGACATGTGCTATTTGCTACAATAATGTG AAAGTCTTTAGAGGAGTTGTTAAGATGAGGCGTGGGCTCACTGCACGAAT AATTGATGAGACAAAATCAATGTCAGATGTCTATTCTGCTTTCTATGAGT TCTCTTCATTGCTAGAGTCAAAGATTGACGATAACGACCCAAGTTCTGCA CTAACACGGAAGCGTGTAGAGGCAATAAAGAGGACTTGCAAGTCATCCGG TTTACTAAAGAGAAGGGGATACGACCTGGAAAAGTCAAAGTATAGGCATA TGTTGATCATGCTTGCACTTCTGTTGGTGGCTATTATCTTCGGTGTACTG TACGCCAAGTGA

    SEQ ID NO: 34; AsSQS (Avena Strigosa Squalene Synthase) Translated Nucleotide Sequence (403aa)

    TABLE-US-00036 MGALSRPEEVVALVKLRVAAGQIKRQIPAEEHWAFAYDMLQKVSRSFALV IQQLGPELRNAVCIFYLVLRALDTVEDDTSIPNDVKLPILRDFYRHVYNP DWRYSCGTNHYKVLMDKFRLVSTAFLELGEGYQKAIEEITRRMGAGMAKF ICQEVETIDDYNEYCHYVAGLVGYGLSRLFHAAGTEDLASDQLSNSMGLF LQKTNIIRDYLEDINEIPKCRMFWPREIWSKYADKLEDLKYEENSEKAVQ CLNDMVTNALVHAEDCLQYMSALKDNTNFRFCAIPQIMAIGTCAICYNNV KVFRGVVKMRRGLTARIIDETKSMSDVYSAFYEFSSLLESKIDDNDPSSA LTRKRVEAIKRTCKSSGLLKRRGYDLEKSKYRHMLIMLALLLVAIIFGVL YAK*

    SEQ ID NO: 35; AtATR2 (Arabidopsis Thaliana Cytochrome P450 Reductase 2) Coding Sequence (2325bp)

    [0369] TABLE-US-00037 atgaaaaacatgatgaattataaattaaaactctgttctgtctcaaaaaa ctcaaaaggagtctctctctcacctacaccacacctaaccaaacccccta cgattcacacagagagagatcttcttcttccttcttcttccttcttcttt cttcttctttcttcttctagctacaacatctacaacgccatgtcctcttc ttcttcttcgtcaacctccatgatcgatctcatggcagcaatcatcaaag gagagcctgtaattgtctccgacccagctaatgcctccgcttacgagtcc gtagctgctgaattatcctctatgcttatagagaatcgtcaattcgccat gattgttaccacttccattgctgttcttattggttgcatcgttatgctcg tttggaggagatccggttctgggaattcaaaacgtgtcgagcctcttaag cctttggttattaagcctcgtgaggaagagattgatgatgggcgtaagaa agttaccatctttttcggtacacaaactggtactgctgaaggttttgcaa aggctttaggagaagaagctaaagcaagatatgaaaagaccagattcaaa atcgttgatttggatgattacgcggctgatgatgatgagtatgaggagaa attgaagaaagaggatgtggctttcttcttcttagccacatatggagatg gtgagcctaccgacaatgcagcgagattctacaaatggttcaccgagggg aatgacagaggagaatggcttaagaacttgaagtatggagtgtttggatt aggaaacagacaatatgagcattttaataaggttgccaaagttgtagatg acattcttgtcgaacaaggtgcacagcgtcttgtacaagttggtcttgga gatgatgaccagtgtattgaagatgactttaccgcttggcgagaagcatt gtggcccgagcttgatacaatactgagggaagaaggggatacagctgttg ccacaccatacactgcagctgtgttagaatacagagtttctattcacgac tctgaagatgccaaattcaatgatataaacatggcaaatgggaatggtta cactgtgtttgatgctcaacatccttacaaagcaaatgtcgctgttaaaa gggagcttcatactcccgagtctgatcgttcttgtatccatttggaattt gacattgctggaagtggacttacgtatgaaactggagatcatgttggtgt actttgtgataacttaagtgaaactgtagatgaagctcttagattgctgg atatgtcacctgatacttatttctcacttcacgctgaaaaagaagacggc acaccaatcagcagctcactgcctcctcccttcccaccttgcaacttgag aacagcgcttacacgatatgcatgtcttttgagttctccaaagaagtctg ctttagttgcgttggctgctcatgcatctgatcctaccgaagcagaacga ttaaaacaccttgcttcacctgctggaaaggatgaatattcaaagtgggt agtagagagtcaaagaagtctacttgaggtgatggccgagtttccttcag ccaagccaccacttggtgtcttcttcgctggagttgctccaaggttgcag cctaggttctattcgatatcatcatcgcccaagattgctgaaactagaat tcacgtcacatgtgcactggtttatgagaaaatgccaactggcaggattc ataagggagtgtgttccacttggatgaagaatgctgtgccttacgagaag agtgaaaactgttcctcggcgccgatatttgttaggcaatccaacttcaa gcttccttctgattctaaggtaccgatcatcatgatcggtccagggactg 
gattagctccattcagaggattccttcaggaaagactagcgttggtagaa tctggtgttgaacttgggccatcagttttgttctttggatgcagaaaccg tagaatggatttcatctacgaggaagagctccagcgatttgttgagagtg gtgctctcgcagagctaagtgtcgccttctctcgtgaaggacccaccaaa gaatacgtacagcacaagatgatggacaaggcttctgatatctggaatat gatctctcaaggagcttatttatatgtttgtggtgacgccaaaggcatgg caagagatgttcacagatctctccacacaatagctcaagaacaggggtca atggattcaactaaagcagagggcttcgtgaagaatctgcaaacgagtgg aagatatcttagagatgtatggtaa

    SEQ ID NO: 36; AtATR2 (Arabidopsis Thaliana Cytochrome P450 Reductase 2) Translated Nucleotide Sequence (774aa)

    [0370] TABLE-US-00038 MKNMMNYKLKLCSVSKNSKGVSLSPTPHLTKPPTIHTERDLLLPSSSFFF LLLSSSSYNIYNAMSSSSSSSTSMIDLMAAIIKGEPVIVSDPANASAYES VAAELSSMLIENRQFAMIVTTSIAVLIGCIVMLVWRRSGSGNSKRVEPLK PLVIKPREEEIDDGRKKVTIFFGTQTGTAEGFAKALGEEAKARYEKTRFK IVDLDDYAADDDEYEEKLKKEDVAFFFLATYGDGEPTDNAARFYKWFTEG NDRGEWLKNLKYGVFGLGNRQYEHFNKVAKVVDDILVEQGAQRLVQVGLG DDDQCIEDDFTAWREALWPELDTILREEGDTAVATPYTAAVLEYRVSIHD SEDAKFNDINMANGNGYTVFDAQHPYKANVAVKRELHTPESDRSCIHLEF DIAGSGLTYETGDHVGVLCDNLSETVDEALRLLDMSPDTYFSLHAEKEDG TPISSSLPPPFPPCNLRTALTRYACLLSSPKKSALVALAAHASDPTEAER LKHLASPAGKDEYSKWVVESQRSLLEVMAEFPSAKPPLGVFFAGVAPRLQ PRFYSISSSPKIAETRIHVTCALVYEKMPTGRIHKGVCSTWMKNAVPYEK SENCSSAPIFVRQSNFKLPSDSKVPIIMIGPGTGLAPFRGFLQERLALVE SGVELGPSVLFFGCRNRRMDFIYEEELQRFVESGALAELSVAFSREGPTK EYVQHKMMDKASDIWNMISQGAYLYVCGDAKGMARDVHRSLHTIAQEQGS MDSTKAEGFVKNLQTSGRYLRDVW*