Biosynthesis of opiate alkaloids

09862979 · 2018-01-09

Abstract

The disclosure relates to a nucleic acid molecule isolated from a Papaver somniferum cultivar that produces the opiate alkaloid noscapine which comprises 10 genes involved in the biosynthesis of opiate alkaloids.

Claims

1. An expression vector comprising a nucleotide molecule selected from the group consisting of: i) the nucleotide sequence of SEQ ID NO: 7, 8, 9 or 10; ii) a nucleotide sequence degenerate to the nucleotide sequence defined in (i) as a result of the genetic code; iii) a nucleotide sequence comprising at least 90% sequence identity to the nucleotide sequence of SEQ ID NO: 7, 8, 9 or 10, wherein said nucleotide sequence encodes a polypeptide having opiate alkaloid biosynthetic activity; iv) a nucleotide sequence that encodes a polypeptide comprising the amino acid sequence of SEQ ID NO: 17, 18, 19 or 20; and v) a nucleotide sequence that encodes a polypeptide comprising at least 90% sequence identity to the amino acid sequence of SEQ ID NO: 17, 18, 19 or 20, wherein said polypeptide has opiate alkaloid biosynthetic activity.

2. The expression vector according to claim 1, wherein said nucleic acid molecule comprises or consists of the nucleotide sequence of SEQ ID NO: 7, wherein said nucleic acid molecule encodes a polypeptide with cytochrome P450 activity.

3. The expression vector according to claim 1, wherein said nucleic acid molecule comprises or consists of the nucleotide sequence of SEQ ID NO: 8, wherein said nucleic acid molecule encodes a polypeptide with carboxylesterase activity.

4. The expression vector according to claim 1, wherein said nucleic acid molecule comprises or consists of the nucleotide sequence of SEQ ID NO: 9, wherein said nucleic acid molecule encodes a polypeptide with short-chain dehydrogenase/reductase activity.

5. The expression vector according to claim 1, wherein said nucleic acid molecule comprises or consists of the nucleotide sequence of SEQ ID NO: 10, wherein said nucleic acid molecule encodes a polypeptide with acetyltransferase activity.

6. The expression vector according to claim 1, wherein said nucleic acid molecule is operably linked to a promoter for expression in a microbial cell.

7. The expression vector according to claim 1, wherein said nucleic acid molecule is operably linked to a promoter for expression in a plant cell.

8. The expression vector according to claim 6, wherein said promoter is a constitutive promoter or inducible promoter.

9. The expression vector according to claim 7, wherein said promoter is a constitutive promoter or inducible promoter.

10. The expression vector according to claim 1, wherein said vector is a viral vector.

11. A microbial cell transformed with the expression vector according to claim 1.

12. The microbial cell according to claim 11, wherein said microbial cell is a bacterial cell.

13. The microbial cell according to claim 11, wherein said microbial cell is a yeast cell.

14. A plant cell transformed with the expression vector according to claim 1.

15. The plant cell according to claim 14, wherein said plant cell is of the genus Papaver.

16. A process for modifying one or more opiate alkaloids or opiate alkaloid intermediate metabolites, comprising: i) providing the microbial cell according to claim 11 in culture with at least one opiate alkaloid or opiate alkaloid intermediate metabolite; ii) cultivating the microbial cell under conditions that modify one or more opiate alkaloid or opiate alkaloid intermediate; and optionally iii) isolating said opiate alkaloid or opiate alkaloid intermediate from the microbial cell or cell culture.

17. The process according to claim 16, wherein said microbial cell is a bacterial cell.

18. The process according to claim 16, wherein said microbial cell is a yeast cell.

19. A process for modifying one or more opiate alkaloids, comprising: i) cultivating the plant cell of claim 15 to produce a transgenic plant; and optionally ii) harvesting said transgenic plant or part thereof.

20. The process according to claim 19, wherein said harvested plant or part thereof is dried and opiate alkaloid is extracted.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) An embodiment of the invention will now be described by example only and with reference to the following figures:

(2) FIGS. 1A-1B: Identification of genes exclusively present in the genome of a noscapine producing poppy variety, HN1 (High Noscapine 1). (A) Relative abundance of the major alkaloids extracted from the capsules of three commercial varieties of poppy, HM1 (High Morphine 1), HT1 (High Thebaine 1) and HN1. M=morphine, C=codeine, T=thebaine, O=oripavine and N=Noscapine. (B) EST libraries from stem and capsule were generated by pyrosequencing and unique contiguous sequences assembled as described in material and methods. Ten genes (PSMT1, PSMT2, PSMT3, CYP82X1, CYP82X2, CYP82Y1, CYP719A21, PSAT1, PSSDR1 and PSCXE1) as defined in the text, were represented only in EST libraries from the HN1 variety. EST abundance of five other functionally characterized P. somniferum genes (BBE, TNMT, SalR, SalAT and T6DM) show them to be expressed in all three varieties and at consistently higher levels in stem compared to capsule as is also the case for the HN1 specific genes as shown in colour code (FIG. 1B). PCR on genomic DNA from all three varieties revealed that the ten HN1 specific genes are absent from the genomes of the HM1 and HT1 varieties (FIG. 5A);

(3) FIGS. 2A-2B: Segregation analysis of noscapine content in an F2 mapping population demonstrates requirement for the noscapine gene cluster. (A) Box plot depiction of noscapine levels as percentage dry weight (DW) in glasshouse grown parental lines HN1 and HM1 and the F1 generation. (B) The field grown F2 generation segregated into three classes of zero, low and high noscapine. F2 GC- and F2 GC+ indicate the absence and presence respectively of the noscapine gene cluster. Numbers in brackets indicate number of individuals in each class;

(4) FIG. 3: The HN1 gene cluster. The structure and position of the ten HN1 specific genes expressed in stems and capsule tissues is shown above the central black line which represents 401 Kb of genomic sequence. Exons are represented by filled grey boxes and introns by fine black lines. Arrows indicate the 5′ to 3′ orientation of each gene. Additional open reading frames depicted below the central black line are as defined by the key. None of these ORFs are represented in the stem and capsule EST libraries;

(5) FIGS. 4A-4G: Functional characterisation using virus induced gene silencing of 6 genes from the HN1 gene cluster. Results from both leaf latex and capsules are consistent with each of these genes encoding enzymes involved in noscapine biosynthesis (A-F). All compounds that accumulate, apart from scoulerine, have been putatively identified on the basis of mass spectra as detailed in FIGS. 6A-6F. The mass-to-charge (m/z) value (M) followed by retention time (T) in seconds is shown for each compound on the horizontal axis. (G) Proposed pathway for noscapine biosynthesis based on VIGS data. Solid arrows depict steps supported by VIGS data, dotted arrows depict additional proposed steps. For the secoberbine intermediates, R1=H or OH, R2=H or OH and R3=CH2OH or CHO or COOH (FIGS. 6A-6F). The noscapine structure is numbered according to the IUPAC convention;

(6) FIGS. 5A-5B: The ten genes exclusively expressed in the HN1 variety occur in the genome of HN1 but are absent from that of varieties HT1 and HM1. (A) Amplification of fragments from the ten genes exclusively expressed in HN1 using two different primer pairs. (B) Amplification of fragments of genes from the protoberberine and morphinan branch pathways that are expressed in all three varieties. Primers used are detailed in Table 3; HyperLadder I (Bioline Reagents, London, UK) was used as molecular size standard;

(7) FIGS. 6A-6F. Evidence for putative identities of intermediates from VIGS experiments. All panels show the mass spectra of the pseudomolecular parent ion at the chromatographic peak apex in black and corresponding MS2 fragmentation spectra in red, scaled to relative abundance. MS2 spectra were generated by targeting the parent ion with an isolation width of 3 m/z and using collisional isolation dissociation energy set to 35%. All mass spectra were obtained at a resolution setting of 7500. Text printed above selected diagnostic ions indicates the exact monoisotopic mass of the ion, the calculated formula within limits C=1:100, O=0:200, N=0:3 and H=1:200, and the number/total number of formulae returned within a 5 ppm error window. Fragments were reconciled against theoretical fragments generated by submitting candidate parent structures to Mass Frontier software (version 5.01.2; HighChem, Bratislava, Slovakia). Candidate parent structures were derived from PubChem searches and the comprehensive review of Papaver spp. alkaloids (Sariyar (2002) Pure Appl. Chem. 74, 557-574). (A) Tetrahydrocolumbamine; this compound was characterized from a peak eluting at 174 s from VIGS-silenced CYP719A21. Eight out of ten observed MS2 fragments were calculated as feasible by Mass Frontier; only the two most abundant diagnostic fragments are shown. (B) Secoberbine intermediate 1 (C21H25NO6); this compound was characterized from a peak eluting at 147 s from VIGS-silenced CYP82X2. If R1=OH, R2=H, and R3=CH2OH, then this compound is narcotolinol which is consistent with both annotated fragments. Another candidate formula fit would be demethoxylated narcotindiol (R1=H, R2=OH, R3=CH2OH); however this structure would not form the observed fragment at 206.0816. (C) Secoberbine intermediate 2 (C21H23NO6); this compound was characterized from a peak eluting at 103 s from VIGS-silenced CYP82X2. If R1=OH, R2=H, and R3=CHO, then this compound would be a desmethylated derivative of macrantaldehyde. (D) Papaveroxine; this compound was characterized from a peak eluting at 214 s from VIGS-silenced PSCXE1. The 398.1600 fragment observed is consistent with deacetylation. (E) Narcotinehemiacetal; this compound was characterized from a peak eluting at 121 s from VIGS-silenced PSSDR1. (F) Narcotoline (4-desmethylnoscapine); this compound was characterized from a peak eluting at 208 s from VIGS-silenced PSMT2. Other isobaric possibilities were 6- or 7-desmethylnoscapine. However, the 206.0816 fragment observed is consistent with a hydroxylated 4 position. Alternative structures could be discounted by comparing the candidate fragmentation spectra with that from synthetic 7-desmethylnoscapine, which eluted at a different retention time and lacked the characteristic 206.0816 fragment;

(8) FIGS. 7A-7M are sequences of (A) PSMT1 nucleic acid sequence, SEQ ID NO: 1; (B) PSMT2 nucleic acid sequence, SEQ ID NO: 2; (C) PSMT3 nucleic acid sequence, SEQ ID NO: 3; (D) CYP82X1 nucleic acid sequence, SEQ ID NO: 4; (E) CYP719A21 nucleic acid sequence, SEQ ID NO: 5; (F) CYP82X2 nucleic acid sequence, SEQ ID NO: 6; (G) CYP82Y1 nucleic acid sequence, SEQ ID NO: 7; (H) PSCXE1 nucleic acid sequence, SEQ ID NO: 8; (I) PSSDR1 nucleic acid sequence, SEQ ID NO: 9; (J) PSAT1 nucleic acid sequence, SEQ ID NO: 10; PSMT1 protein sequence, SEQ ID NO: 11; PSMT2 protein sequence, SEQ ID NO: 12; PSMT3 protein sequence, SEQ ID NO: 13; (K) CYP82X1 protein sequence, SEQ ID NO: 14; CYP719A21 protein sequence, SEQ ID NO: 15; CYP82X2 protein sequence, SEQ ID NO: 16; CYP82Y1 protein sequence, SEQ ID NO: 17; PSCXE1 protein sequence, SEQ ID NO: 18; PSSDR1 protein sequence, SEQ ID NO: 19; (L) PSAT1 protein sequence, SEQ ID NO: 20; VIGS PSMT1 protein sequence, SEQ ID NO: 21; VIGS PSMT2 protein sequence, SEQ ID NO: 22; and VIGS CYP82X1 protein sequence, SEQ ID NO: 23; VIGS CYP719A21 protein sequence, SEQ ID NO: 24; VIGS CYP82X2 protein sequence, SEQ ID NO: 25; VIGS CYP82Y1 protein sequence, SEQ ID NO: 26; VIGS PSCXE1 protein sequence, SEQ ID NO: 27; (M) VIGS PSSDR1 protein sequence, SEQ ID NO: 28; VIGS PSAT1 protein sequence, SEQ ID NO: 29; and VIGS PSPDS protein sequence, SEQ ID NO: 30.

(9) Table 1 illustrates the % identity of CYP82Y1, PSCXE1, PSSDR1 and PSAT1 (SEQ ID NOs: 17-20) with their respective closest functionally characterised homologues. Accession numbers given are from GenBank, Swiss-Prot or PDB databases;

(10) Table 2. Genotyping of F3 families derived from two F2 phenotypic classes: low noscapine and high noscapine. The observed versus expected segregation ratios strongly support the hypothesis that individuals in the low noscapine F2 class are heterozygous for the HN1 gene cluster and individuals in the high noscapine class are homozygous;

(11) Table 3. Primer sequences and associated information.

(12) TABLE 1

Protein | % Identity | Accession number | Annotation
CYP82Y1 (SEQ ID NO: 17) | 54 | | CYP82X1 from Papaver somniferum
 | 48 | | CYP82X2 from Papaver somniferum
 | 39 | ABM46919.1 | CYP82E3, nicotine demethylase from Nicotiana tomentosiformis
PSCXE1 (SEQ ID NO: 18) | 45 | 2O7R_A | AeCXE1, Carboxyl esterase from Actinidia eriantha
PSSDR1 (SEQ ID NO: 19) | 46 | AAB41550.1 | Vestitone reductase from Medicago sativa
 | 45 | ABQ97018.1 | Dihydroflavonol 4-reductase from Saussurea medusa
PSAT1 (SEQ ID NO: 20) | 66 | Q94FT4.1 | Salutaridinol 7-O-acetyltransferase from Papaver somniferum

(13) TABLE 2

Observed: observed segregation of gene cluster in F3 progeny. Expected: expected segregation in F3 if the F2 low noscapine class is heterozygous and the high noscapine class is homozygous.

Noscapine class and genotyping result of F2 individual | F3 seed family (obtained through self-pollination of F2 individual) | Number of F3 individuals genotyped | Observed GC+ | Observed GC- | Expected GC+ | Expected GC- | Chi-Square X-squared | Chi-Square p-value
low noscapine/GC+ | S-111809 | 28 | 18 | 10 | 21 | 7 | 1.714 | 0.190
low noscapine/GC+ | S-111835 | 26 | 18 | 8 | 19.5 | 6.5 | 0.462 | 0.497
high noscapine/GC? | S-111714 | 28 | 28 | | 28 | | |
high noscapine/GC? | S-111854 | 54 | 54 | | 54 | | |
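
The chi-square statistics and p-values reported in Table 2 for the two low noscapine families can be reproduced from the observed and expected counts. The following is a minimal Python sketch, assuming a goodness-of-fit test with one degree of freedom (two genotype classes) and a 3:1 expectation for the selfed progeny of a heterozygote; the function names are illustrative and are not taken from the original analysis.

```python
import math

def chi_square_gof(observed, expected):
    """Pearson chi-square statistic for observed versus expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

def expected_3_to_1(n):
    """Expected GC+ and GC- counts among n progeny of a selfed heterozygote."""
    return [0.75 * n, 0.25 * n]

# F3 family S-111809 from Table 2: 28 genotyped, 18 GC+ and 10 GC-.
observed = [18, 10]
expected = expected_3_to_1(28)  # 21 GC+ and 7 GC-
chi2 = chi_square_gof(observed, expected)
print(f"{chi2:.3f} {p_value_df1(chi2):.3f}")
```

A p-value well above 0.05, as for both families in Table 2, means the observed counts do not differ significantly from the 3:1 expectation.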

(14) TABLE-US-00003 TABLE3 Primersequences(5-to3-) SEQID SEQID Gene Forward NO Reverse NO Notes Application PSMT1 GATTCCCGATTTACTCCTG 31 AACACAAAATACGATTAC 32 primerpair1 Primersforthe ATG TTACTTTTGTCC amplificationof PSMT1 TGCCTCATGTTATTTCTGT 33 GCATGAAATGGATGTAGT 34 primerpair2 fragmentsfrom TGCC TATCTTGG genomicDNAof PSMT2 ATTGATGTCGGTGGTGGTC 35 ATTCCCGTTCAAGTAAAC 36 primerpair1 HM1,HT1andHN1 ACG ATGCGG asshownin PSMT2 GCAACTGTTTCATTAACAG 37 CAGTAAATTCACACATTC 38 primerpair2 FIG.5 GCACATCC CGTATCTTCCC PSMT3 GCTTCAGCATTGGTTAACG 39 GAGGGTAAGCCTCAATAA 40 primerpair1 AGTGC CAGACTGG PSMT3 AGACCGTTTGTACCGAATT 41 TCGTTCCATTCGTGAAGA 42 primerpair2 CTGC ATGC CYP82X1 GAACCATTAAACACTTGAG 43 TGCAATTGAATTTAGCTC 44 primerpair1 TCATGC ATCTCC CYP82X1 TTGATGAACGACAAGGAAC 45 ATTCATGATTGTGACCTT 46 primerpair2 CG TGTAATCC CYP82X2 ATGTGGAAAACGGTAAGCA 47 ACGATTCTGTCATCATCA 48 primerpair1 AGTGG TTTTCGC CYP82X2 CAACCTCAATCTAGCTAGA 49 CCCAAGATTTTCATATCC 50 primerpair2 GTCG TTTACAA CYP82Y1 CAATAATTGAGTAATTTCA 51 GCTCCGTAAGTGCTCCTG 52 primerpair1 GTTCATTCATGG TG CYP82Y1 GAATTGTGGTAAAAAATTA 53 CCCTTCACATCTACCATC 54 primerpair2 GATGCAG CCTT CYP719A21 CAAAGAGTCAATCTGACTC 55 CGAGTGCCCATGCAGTGG 56 primerpair1 AAGCTAGC CYP719A21 TCAAACCCTGCTACTAACA 57 CACTCCATCAGACACACA 58 primerpair2 CTTACTTGC AGACC PSAT1 TTTTATCGACCTTGAGGAA 59 AAATGGCAGTTCCACCGC 60 primerpair1 CAATTAGG PSAT1 GACTTCATGATGAAATCAG 61 CACTGCTGACTTCCATAT 62 primerpair2 ATGCAC CAAAGC PSCXE1 ATGCTGTTGATGCTTTAAA 63 AGCTGAATTTGTCGATCA 64 primerpair1 CTGGG ATAAGTGG PSCXE1 AATAAAAATCCAACAATGG 65 ACTGGCATGATATGCAAC 66 primerpair2 CAGATCC ATTAGC PSSDR1 GGAAGATGTGAGCCACCTT 67 GATACACTGGGAGGAGGA 68 primerpair1 AAAGC TGGG PSSDR1 GAGAGTAACCACATCTTTG 69 CGGCAAAATTCATTCCTT 70 primerpair2 TTGTCGG GAGC 71 72 BBE GTTTACTCCCACGTGCATC 71 CATTCCTCGTCTAATTCA 72 TCTGC TNMT GTTTACTCCCACGTGCATC 73 GCTTCACTACTTCTTCTT 74 GAAAAG SalR AAACAATGCTGGGGTTGC 75 CATTATAATTTCCAATGC 76 CGTAGTTC SalAT TAAGAGAGGGAGACCACGA 77 CATTCGTTGTTGTTGCTG 78 G GTAAG T6ODM 
CTTATGAAGCTAGGTAATG 79 CATCCTCATTGCTTGTGT 80 GTATGGA CC PSMT1 CTCTAAAATGCCAAACGCG 81 sequencingprimer Primersusedas PSMT1 GACCCTTTGGGACTTCCTC 82 sequencingprimer sequencing G primerstoobtain PSMT1 CGTGTTGTTTGGTCCCTCG 83 sequencingprimer genomicDNA PSMT1 TGCCTCATGTTATTTCTGT 84 sequencingprimer sequencefromHN1 TGCC PSMT1 GATTCCCGATTTACTCCTG 85 sequencingprimer ATGG PSMT1 AACACAAAATACGATTACT 86 sequencingprimer TACTTTTGTCC PSMT1 TGCCTCATGTTATTTCTGT 87 sequencingprimer TGCC PSMT1 GCATGAAATGGATGTAGTT 88 sequencingprimer ATCTTGG PSMT1 AAATCGTTCGCTCTTTACC 89 sequencingprimer GC PSMT1 CACACCAAACTTGATCATT 90 sequencingprimer GTC PSMT2 ATTGTTGATATTGAATCAG 91 sequencingprimer AAACTTTC PSMT2 TCAATACCAGTACTGTTAG 92 sequencingprimer TTTCCG PSMT2 GCAACTGTTTCATTAACAG 93 sequencingprimer GCACATCC PSMT2 ATTGATGTCGGTGGTGGTC 94 sequencingprimer ACG PSMT2 GCACACTGTCTTTTTCTTC 95 sequencingprimer CACC PSMT2 ACCGGAATGAGAATGCATA 96 sequencingprimer AAGTAAAGG PSMT2 CCAATACCCAATCAATTAA 97 sequencingprimer ACTC PSMT2 CAGTAAATTCACACATTCC 98 sequencingprimer GTATCTTCCC PSMT3 ATTGTATAGCCAAAGTTGC 99 sequencingprimer AGGTAGGG PSMT3 AGACCGTTTGTACCGAATT 100 sequencingprimer CTGC PSMT3 GCAGTGAAAGCCATATCCA 101 sequencingprimer AAGC PSMT3 AACCGTCCCCAAGATGATT 102 sequencingprimer CC PSMT3 TCGTTCCATTCGTGAAGAA 103 sequencingprimer TGC PSMT3 GAGGGTAAGCCTCAATAAC 104 sequencingprimer AGACTGG CYP82X1 GAACCATTAAACACTTGAG 105 sequencingprimer TCATGC CYP82X1 TTGATGAACGACAAGGAAC 106 sequencingprimer CG CYP82X1 TCGACAGCGCTTACGAACG 107 sequencingprimer CYP82X1 CAATTATCAAAGAATCAAT 108 sequencingprimer GC CYP82X1 TGCAATTGAATTTAGCTCA 109 sequencingprimer TCT CYP82X1 ATTCATGATTGTGACCTTT 110 sequencingprimer GTAATCC CYP82X1 GACAGAGGGCCCAAGTTAA 111 sequencingprimer GG CYP82X1 AGCAAACCATTCGTCCATC 112 sequencingprimer C CYP82X1 TACGACAGGTTGCTAGCTT 113 sequencingprimer GG CYP82X2 AATAATGGATCAGTCACGG 114 sequencingprimer CTTCC CYP82X2 AATCCATCAGATTTTCAAC 115 sequencingprimer CAGAGAGG CYP82X2 TGTCAGCCAACCATTCGTC 116 
sequencingprimer CATCCTAAC CYP82X2 GGCTTCCCGGAGATGACCC 117 sequencingprimer AGATTTTAT CYP82X2 TTGTTATTTTCATGACTAT 118 sequencingprimer TACCACCAGCTTCCTCTTA CYP82X2 AGTGGAGGAGGCACAAAAG 119 sequencingprimer TTAGGATGGAC CYP82X2 CCATGTCTGATAAATACGG 120 sequencingprimer GTCGGTGTTC CYP82X2 TTGTTGATAAGGACGACTA 121 sequencingprimer AGAATAAGCAGAAGATA CYP82X2 ACGATTCTGTCATCATCAT 122 sequencingprimer TTTCGC CYP82X2 AGTCGTGTATCGTTCGCTT 123 sequencingprimer AATGC CYP82X2 CATGCCTATCTATTTCCTC 124 sequencingprimer CCTTGCCCTC CYP82X2 TGTCAGCCAACCATTCGTC 125 sequencingprimer CATCCTAAC CYP82X2 TGTTCGATCACGTTGTCTC 126 sequencingprimer TTTTTGCCATAA CYP82X2 TAACAATAAAAGTACTGAT 127 sequencingprimer AATGGTGGTCGAAGGAGAA CYP82Y1 TATTGATGTGGACCAGTAC 128 sequencingprimer C CYP82Y1 TGTAACTCTTGGTCACATG 129 sequencingprimer G CYP82Y1 CGCGTACTTGACATTTAAC 130 sequencingprimer G CYP82Y1 GGATCATCGCCAAAAGAAA 131 sequencingprimer C CYP719A21 CAAAGAGTCAATCTGACTC 132 sequencingprimer AAGCTAGC CYP719A21 TGAAATGCCTGAGATCACT 133 sequencingprimer AAAATCG CYP719A21 TCAAACCCTGCTACTAACA 134 sequencingprimer CTTACTTGC CYP719A21 TGTAAAGACACTTCATTGA 135 sequencingprimer TGGGC CYP719A21 TTCGATTTGTGTAAACATT 136 sequencingprimer AATGATATTTGG CYP719A21 GAGATGATCAAGTGGTTTA 137 sequencingprimer ACCATTCC CYP719A21 CGAGTGCCCATGCAGTGG 138 sequencingprimer PSCXE1 AATAAAAATCCAACAATGG 139 sequencingprimer CAGATCC PSCXE1 ATGCTGTTGATGCTTTAAA 140 sequencingprimer CTGGG PSCXE1 GGTTAATCGAGAGATGTTT 141 sequencingprimer TGTGGTAGG PSCXE1 CGATGACACAGAGCAAGAA 142 sequencingprimer CGAC PSCXE1 CGCGGGTATATGTGTAGCA 143 sequencingprimer ATCG PSCXE1 CGGCAACGCCAGTTCCC 144 sequencingprimer PSSDR1 CTAACAGGCAAACAATAAC 145 sequencingprimer AGGTTGC PSSDR1 GGAAGATGTGAGCCACCTT 146 sequencingprimer AAAGC PSSDR1 AAAGGTACTGACAGAAAGA 147 sequencingprimer GCTTGCC PSSDR1 AGATACACTGGGAGGAGGA 148 sequencingprimer TGGG PSSDR1 CGGCAAAATTCATTCCTTG 149 sequencingprimer AGC PSSDR1 AACATATAGCCAAAGGACT 150 sequencingprimer CTTCG PSAT1 AGGATACACAATGACCCAA 151 
sequencingprimer C PSAT1 TTTTATCGACCTTGAGGAA 152 sequencingprimer CAATTAGG PSAT1 TGTTCACTAGGTGGAAAGA 153 sequencingprimer G PSAT1 AGTACAATACCGAGAAATC 154 sequencingprimer CGACAAG PSAT1 GCTCAATTAATGGAACAGT 155 sequencingprimer AGTTACCC specificPCR conditions: PsMT1 VIC?-CGTGTTGTTTGGTC 156 GCACACTGTCTTTTTCTT 157 30cylces,20s Primerpairs CCTCG CCACC extensionat72? forgenotyping PsMT2 VIC?-GCAACTGTTTCATT 158 GCCAGCGCTAATACAAGG 159 36cylces,50s ofthe AACAGGCACATCC ATGTGG extensionat72? F2mapping PsMT3 VIC?-GCAGTGAAAGCCAT 160 TCGTTCCATTCGTGAAGA 161 30cylces,30s population ATCCAAAGC ATGC extensionat72? CYP82X1 VIC?-GCTACGAAAGATAA 162 AGCAAACCATTCGTCCAT 163 30cylces,30s TGGTGCAGC CC extensionat72? CYP82X2 VIC?-ATGTGGAAAACGGT 164 ACGATTCTGTCATCATCA 165 30cylces,50s AAGCAAGTGG TTTTCGC extensionat72? CYP719A21 VIC?-TGAAATGCCTGAGA 166 GGAATGGTTAAACCACTT 167 30cylces,30s TCACTAAAATCG GATCATCTC extensionat72? PSCXE1 VIC?-ATGCCAGTTTAAGA 168 GGGAACTGGCGTTGCCG 169 30cylces,30s GCAATAGAAATGG extensionat72? PSSDR1 VIC?-GAAGATGTGAGCCA 170 GCTCAAGGAATGAATTTT 171 30cylces,30s CCTTAAAGC GCCG extensionat72? 
CYP82X2 GTTGACGCAGGAAGCTTTT 172 GGAACATAAGATTTAACT 173 PrimerpairforPCR GC CCGCCTC amplificationof theBAClibrary screeningprobe PSMT1 aaactcgagaagctTGGTC 174 aaaggtaccCATGTACTA 175 Primerpairsfor ATAATCATCAATCAG CTACATCATCTCC theamplification PSMT2 aaactcgagaagcttGTGT 176 aaaggtaccACTTGAATA 177 andcloningof AACTAAGCCAGCGC TATCACCGC fragmentsselected CYP82X1 aaaggatccTTTGAGTAAT 178 aaaggtaccAACATCTAC 179 forVIGS GGTGAAAAGA TCTCGAGGATTG CYP82X2 aaactcgagaagcttTAGG 180 aaaggtaccTTAACTCCG 181 AGGGTATGTCCGGC CCTCGGCTCC CYP82Y1 aaaggatccTTCAGTTCAT 182 aaaggtaccGTTCATAGT 183 TCATGGCG AAATAATAACAGGCG CYP719A21 aaactcgagaagcttATGA 184 aaaggtaccCCAACAGGC 185 TCATGAGTAACTTATGGA CATTCCGTTG PSCXE1 aaaggatccTGGCAGATCC 186 aaaggtaccTTATGATAG 187 TTATGAATTCC GAAGCAGCTTATTC PSSDR1 aaaggatccGAAATTGACG 188 aaaggtaccCATTCAAAA 189 AGACAATATGG ACGAATATGTGTGC PSAT1 aaaggatccCCTAAGAGAG 190 aaaggtaccAATACAAGT 191 ATCCTCCAACTG ATGAAAACAAGAGAATAA PSPDS GAGGTGTTCATTGCCATGT 192 GTTTCGCAAGCTCCTGCA 193 CAA TAGT
Materials and Methods
Plant Material

(15) Three GSK Australia poppy varieties that predominantly accumulate either noscapine (High Noscapine, HN1), morphine (High Morphine, HM1) or thebaine (High Thebaine, HT1), were grown in Maxi (Fleet) Rootrainers (Haxnicks, Mere, UK) under glass in 16 hour days at the University of York horticulture facilities. The growth substrate consisted of 4 parts John Innes No. 2, 1 part Perlite and 2 parts Vermiculite. The HM1×HN1 F2 mapping population was grown at the GlaxoSmithKline Australia field-trial site, Latrobe, Tasmania from September 2009 to February 2010.

(16) Crossing and Selfing

(17) Crosses were carried out between HN1 and HM1 individuals to generate F1 hybrid seed. At the hook stage of inflorescence development, immature stamens were removed from selected HN1 flower buds. HN1 stigmas were fertilized with pollen from synchronously developing HM1 flowers shortly after onset of anthesis. To prevent contaminating pollen from reaching the receptive stigmas, emasculated flowers were covered with a muslin bag for four days after pollination. Both the F1 and F2 generations were self-pollinated to produce F2 and F3 seed, respectively. Self-pollination was ensured by covering the flowers shortly before onset of anthesis with a muslin bag.

(18) RNA Isolation and cDNA Synthesis

(19) Upper stems (defined as the 2 cm section immediately underneath the capsule) and whole capsules were harvested at two developmental stages, 1-3 days and 4-6 days after petal fall. Five plants were used per developmental stage and cultivar. The material was ground to a fine powder in liquid nitrogen using a mortar and pestle. RNA was isolated from the powder using a CTAB-based extraction method (Chang et al (1993) Plant Mol. Biol. Rep. 11, 113-116) with small modifications: (i) three sequential extractions with chloroform:isoamyl alcohol (24:1) were performed and (ii) the RNA was precipitated overnight with lithium chloride at 4° C. After spectrophotometric quantification, equal amounts of RNA were pooled from five plants per cultivar, development stage and organ. The pooled samples underwent a final purification step using an RNeasy Plus MicroKit (Qiagen, Crawley, UK). RNA was typically eluted in 30-100 µl water. cDNA was prepared with the SMART cDNA Library Construction Kit (Clontech, Saint-Germain-en-Laye, France) according to the manufacturer's instructions but using SuperScript II Reverse Transcriptase (Invitrogen, Paisley, UK) for first strand synthesis. The CDSIII/3′ PCR primer was modified to: 5′ ATT CTA GAT CCR ACA TGT TTT TVN 3′ where R=A or G, V=A, C or G; N=A/T or C/G (SEQ ID NO 194). Following digestion with MmeI (New England Biolabs, Hitchin, UK) the cDNA was finally purified using a QIAquick PCR Purification kit (Qiagen, Crawley, UK).
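
The modified primer contains three degenerate positions (R, V and N), so it represents a pool of distinct oligonucleotides. As an illustration only, the short Python sketch below enumerates that pool using the base definitions given in the text; the function name is hypothetical.

```python
from itertools import product

# Degenerate codes as defined in the text: R=A or G; V=A, C or G; N=any base.
DEGENERATE = {"R": "AG", "V": "ACG", "N": "ACGT"}

def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [DEGENERATE.get(base, base) for base in primer]
    return ["".join(combo) for combo in product(*choices)]

# Modified CDSIII/3' PCR primer (SEQ ID NO 194), written without spaces.
primer = "ATTCTAGATCCRACATGTTTTTVN"
variants = expand_primer(primer)
print(len(variants))  # 2 (R) x 3 (V) x 4 (N) = 24 sequences
```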

(20) cDNA Pyrosequencing: Pyrosequencing was performed on the Roche 454 GS-FLX sequencing platform (Branford, Conn.) using cDNA prepared from the following four samples of each of the three varieties:
i. upper stem, 1-3 days after petal fall
ii. upper stem, 4-6 days after petal fall
iii. capsule, 1-3 days after petal fall
iv. capsule, 4-6 days after petal fall
Raw Sequence Analysis, Contiguous Sequence Assembly and Annotation

(21) The raw sequence datasets were derived from parallel tagged sequencing on the 454 sequencing platform (Meyer et al (2008) Nature Prot. 3, 267-78). Primer and tag sequences were first removed from all individual sequence reads. Contiguous sequence assembly was only performed on sequences longer than 40 nucleotides and containing less than 3% unknown (N) residues. These high quality Expressed Sequence Tag (EST) sequences were assembled into unique contiguous sequences with the CAP3 Sequence Assembly Program (Huang and Madan (1999) Genome Res. 9, 868-877), and the resulting contigs were annotated locally using the BLAST2 program (Altschul et al. (1997) Nucleic Acids Res. 25, 3389-3402) against the non-redundant peptide database downloaded from the NCBI.
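
The read selection criteria stated above (longer than 40 nucleotides, less than 3% unknown residues) can be expressed as a simple filter. This is a minimal Python sketch of those two thresholds only; it is not the pipeline actually used, and the function name and default arguments are illustrative.

```python
def passes_quality_filter(read, min_length=40, max_n_fraction=0.03):
    """Keep reads longer than min_length with an N fraction below max_n_fraction."""
    if len(read) <= min_length:
        return False
    n_fraction = read.upper().count("N") / len(read)
    return n_fraction < max_n_fraction

# Hypothetical reads, for illustration only.
reads = ["A" * 41, "A" * 40, "A" * 97 + "NNN", "A" * 98 + "NN"]
kept = [r for r in reads if passes_quality_filter(r)]
print(len(kept))  # the 41-nt read and the read with 2% N pass
```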

(22) Expression profiling: The number of ESTs associated with a specific consensus sequence representing each of the candidate genes detailed in FIG. 1B was counted for each EST library. EST numbers were normalised on the basis of total number of ESTs obtained per library. For each variety, EST counts were combined for the two developmental stages from both stems and capsules. Differences in candidate gene expression levels between organs and varieties were visualised as a heat map using Microsoft Excel.
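
Normalising by library size makes EST counts comparable between libraries of different depth. The sketch below illustrates the idea in Python; the scaling factor (counts per 10,000 ESTs) and all numbers are assumptions made for the example and are not data or parameters from the disclosure.

```python
def normalise_est_counts(raw_counts, library_sizes, scale=10_000):
    """Express each gene's EST count per `scale` ESTs in its library."""
    return {
        library: {
            gene: count * scale / library_sizes[library]
            for gene, count in genes.items()
        }
        for library, genes in raw_counts.items()
    }

# Hypothetical counts and library sizes, for illustration only.
raw_counts = {"HN1_stem": {"PSMT1": 30}, "HN1_capsule": {"PSMT1": 10}}
library_sizes = {"HN1_stem": 150_000, "HN1_capsule": 100_000}

normalised = normalise_est_counts(raw_counts, library_sizes)
print(normalised["HN1_stem"]["PSMT1"], normalised["HN1_capsule"]["PSMT1"])
```

With these made-up numbers the raw counts differ threefold, but the normalised values differ only twofold because the stem library is larger.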

(23) Preparation of Genomic DNA from Glasshouse Grown Plants

(24) To amplify and obtain genomic sequences of the candidate genes, 30-50 mg of leaf material was collected from 4-6 week old glasshouse-grown seedlings from each of the three varieties. Genomic DNA was extracted using the BioSprint 96 Plant kit on the BioSprint 96 Workstation (Qiagen, Crawley, UK) according to the manufacturer's protocol. Extracted DNA was quantified using Hoechst 33258 and normalized to 10 ng/µl.

(25) Amplification and Sequencing of Candidate Genes from Genomic DNA

(26) Primers for amplification and Sanger-sequencing of the candidate genes from genomic DNA were based on the respective contiguous sequences assembled from the ESTs or on BAC sequences. The primer sequences are shown in Table 3. PCR amplifications were performed on pools of genomic DNA comprising DNA from four individuals. Amplification was typically carried out on 10 ng genomic DNA in 1× Phusion High Fidelity Buffer supplemented with 200 nM forward and reverse primers, 0.2 mM dNTPs, 0.02 units/µl Phusion Hot Start DNA Polymerase (Finnzymes, Vantaa, Finland). Standard PCR conditions were used throughout with annealing temperatures and times dependent on primers and PCR equipment.
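
The final concentrations given above can be converted into pipetting volumes with the dilution relation C_stock × V_stock = C_final × V_reaction. The Python sketch below shows the arithmetic; the reaction volume and all stock concentrations are hypothetical values chosen for illustration, since the text specifies only the final concentrations and the 10 ng of template.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul):
    """Volume of stock (ul) so that C_stock * V_stock = C_final * V_reaction.

    final_conc and stock_conc must be given in the same unit.
    """
    return final_conc * reaction_volume_ul / stock_conc

REACTION_UL = 25.0  # hypothetical reaction volume (not stated in the text)

# Final concentrations are from the text; stock concentrations are assumed.
components = {
    "buffer (5x stock, 1x final)": stock_volume_ul(1.0, 5.0, REACTION_UL),
    "forward primer (10 uM stock, 0.2 uM final)": stock_volume_ul(0.2, 10.0, REACTION_UL),
    "reverse primer (10 uM stock, 0.2 uM final)": stock_volume_ul(0.2, 10.0, REACTION_UL),
    "dNTPs (10 mM stock, 0.2 mM final)": stock_volume_ul(0.2, 10.0, REACTION_UL),
    "polymerase (2 U/ul stock, 0.02 U/ul final)": stock_volume_ul(0.02, 2.0, REACTION_UL),
    "genomic DNA (10 ng/ul stock, 10 ng total)": 10.0 / 10.0,
}
# Water makes up the remainder of the reaction volume.
components["water"] = REACTION_UL - sum(components.values())

for name, volume in components.items():
    print(f"{name}: {volume:.2f} ul")
```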

(27) DNA Extraction from the Field-Grown F2 Mapping Population

(28) 40-50 mg of leaf tissue was harvested from F2 plants at the small rosette growth stage (approximately 10 leaves present on each plant) into 1.2 ml sample tubes. A 3 mm tungsten carbide bead was added to each tube and samples were kept at -80° C. for a minimum of two hours prior to freeze-drying for 18 hours. Following freeze drying, samples were powdered by bead-milling (Model TissueLyser, Qiagen, Hilden, Germany) at 30 Hz for two 60 s cycles separated by plate inversion. DNA extraction was performed with the Nucleospin Plant II kit (Macherey-Nagel, Düren, Germany) using the supplied Buffer Set PL2/3 following the manufacturer's protocol for centrifugal extraction. DNA was quantified by UV-spectroscopy.

(29) Genotyping of the HN1×HM1 F2 Mapping Population for the Presence or Absence of the HN1-Specific Candidate Genes

(30) Plants of the F2 mapping population were genotyped for the presence or absence of eight candidate genes. The gene primer pairs (Table 3) were designed with fluorescent tags (5′-VIC-labeled) for use on the ABI 3730xl capillary apparatus (Applied Biosystems, Foster City, Calif.). PCR amplifications were typically carried out on 10 ng genomic DNA in 1× GoTaq buffer supplemented with 1 mM MgCl2, 500 nM forward and reverse primer, 0.125 mM dNTPs, 0.1 U GoTaq (Promega, Southampton, UK). The amplification conditions were: 1 min at 94° C., 30-36 cycles of 30 s denaturation at 94° C., 30 s annealing at 62° C. and 20-50 s extension at 72° C., followed by a final extension for 5 min at 72° C. Cycle number and extension times depended on the candidate gene (Table 3). Amplification products were diluted 1:20 in H2O and fractionated on an ABI 3730xl capillary sequencer (Applied Biosystems, Foster City, Calif.). Data were scored using GeneMarker software (Softgenetics, State College, Pa.).

(31) Poppy Straw Analysis from Field Grown F2 Plants

(32) Poppy capsules were harvested by hand from the mapping population once capsules had dried to approximately 10% moisture on the plant. After manually separating the seed from the capsule, the capsule straw samples (Poppy Straw) were then ground in a ball mill (Model MM04, Retsch, Haan, Germany) into a fine powder. Samples of ground poppy straw were then weighed accurately to 2±0.003 g and extracted in 50 ml of a 10% acetic acid solution. The extraction suspension was shaken on an orbital shaker at 200 rpm for a minimum of 10 min, then filtered to provide a clear filtrate. The final filtrate was passed through a 0.22 µm filter prior to analysis. The loss on drying (LOD) of the straw was determined by drying in an oven at 105° C. for 3 hours.

(33) All solutions were analysed using a Waters Acquity UPLC system (Waters Ltd., Elstree, UK) fitted with a Waters Acquity BEH C18 column, 2.1 mm×100 mm with 1.7 micron packing. The mobile phase used a gradient profile with eluent A consisting of 10 mM ammonium bicarbonate of pH 10.2 and eluent B methanol. The mobile phase gradient conditions used are as listed in the table below with a linear gradient. The flow rate was 0.5 ml per minute and the column maintained at 60° C. The injection volume was 2 µl and eluted peaks were ionised in positive APCI mode and detected within 5 ppm mass accuracy using a Thermo LTQ-Orbitrap. The runs were controlled by Thermo Xcalibur software (Thermo Fisher Scientific Inc., Hemel Hempstead, UK).

(34) Gradient Flow Program:

(35) TABLE

TIME (minutes) | % Eluent A | % Eluent B | Flow (ml/min)
0.0 | 98.0 | 2.0 | 0.50
0.2 | 98.0 | 2.0 | 0.50
0.5 | 60.0 | 40.0 | 0.50
4.0 | 20.0 | 80.0 | 0.50
4.5 | 20.0 | 80.0 | 0.50
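
Because the gradient is linear between the listed time points, the mobile phase composition at any intermediate time follows by linear interpolation. The Python sketch below illustrates this using the breakpoints from the gradient flow program; the function name is illustrative, and holding the final composition after 4.5 minutes is an assumption.

```python
# (time in minutes, % eluent B) breakpoints from the gradient flow program.
GRADIENT = [(0.0, 2.0), (0.2, 2.0), (0.5, 40.0), (4.0, 80.0), (4.5, 80.0)]

def percent_b(t):
    """Percentage of eluent B at time t (minutes), linearly interpolated."""
    if t <= GRADIENT[0][0]:
        return GRADIENT[0][1]
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    # Assumption: the last composition is held after the final breakpoint.
    return GRADIENT[-1][1]

def percent_a(t):
    """Percentage of eluent A, the remainder of a two-component mobile phase."""
    return 100.0 - percent_b(t)

print(percent_b(2.25), percent_a(2.25))
```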

(36) Mass spectra were collected over the 150-900 m/z range at a resolution setting of 7500. All data analysis was carried out in the R programming language in a 64-bit Linux environment (R 2.11). Peak-picking was performed using the Bioconductor package, XCMS (Smith et al (2006) Anal. Chem. 78, 779-787), employing the centWave algorithm (Tautenhahn et al (2008) BMC Bioinformatics 9, 504). Redundancy in peak lists was reduced using the CAMERA package (Kuhl et al (2012) Anal. Chem. 84, 283-289). Alkaloids were identified by comparing exact mass and retention time values to those of standards and quantified by their pseudomolecular ion areas using custom R scripts.
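
Matching an observed ion to a candidate formula within the 5 ppm mass accuracy stated for the instrument amounts to comparing the relative mass error with a tolerance. The Python sketch below shows the calculation for the formula C21H25NO6 quoted for secoberbine intermediate 1; it assumes a singly protonated ion [M+H]+, the observed m/z values are hypothetical, and the helper names are illustrative rather than part of the custom R scripts mentioned above.

```python
# Monoisotopic masses (u) of the most abundant isotopes.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727647  # mass of a proton (u)

def protonated_mz(composition):
    """Theoretical m/z of the singly charged [M+H]+ ion for an elemental composition."""
    neutral = sum(MONOISOTOPIC[element] * n for element, n in composition.items())
    return neutral + PROTON

def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6

def within_window(observed, theoretical, tolerance_ppm=5.0):
    """True if the observed m/z lies within the ppm tolerance of the theoretical value."""
    return abs(ppm_error(observed, theoretical)) <= tolerance_ppm

theoretical = protonated_mz({"C": 21, "H": 25, "N": 1, "O": 6})
# Hypothetical observed values, for illustration only.
for observed in (388.1760, 388.1800):
    print(observed, round(ppm_error(observed, theoretical), 1),
          within_window(observed, theoretical))
```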

(37) Bacterial Artificial Chromosome (BAC) Library Construction

(38) The HN1 BAC library was constructed from high molecular weight (HMW) genomic DNA processed at Amplicon Express, Inc. (Pullman, Wash.) from four week old seedlings using the method described (Tao et al (2002) Theor. Appl. Genet. 105, 1058-1066). The HMW DNA was partially digested with the restriction enzyme HindIII and size selected prior to ligation of fragments into the pCC1BAC vector (Epicentre Biotechnologies, Madison, Wis.) and transformation of DH10B E. coli cells, which were then plated on Luria-Bertani (LB) agar with chloramphenicol, X-gal and IPTG at appropriate concentrations. Clones were robotically picked with a Genetix QPIX (Molecular Devices, Sunnyvale, Calif.) into 240 384-well plates containing LB freezing media. Plates were incubated for 16 hours, replicated and then frozen at −80° C. The replicated copy was used as a source plate for nylon filters that were made and used for screening using the PCR DIG Probe Synthesis Kit (Roche Applied Science, Indianapolis, Ind.). To estimate insert sizes, DNA aliquots of 10 BAC minipreps were digested with 5 U of NotI enzyme for 3 hours at 37° C. The digestion products were separated by pulsed-field gel electrophoresis (CHEF-DRIII system, Bio-Rad, Hercules, Calif.) in a 1% agarose gel in TBE. Insert sizes were compared to those of the Lambda Ladder MidRange I PFG Marker (New England Biolabs, Ipswich, Mass.). Electrophoresis was carried out for 18 hours at 14° C. with an initial switch time of 5 s, a final switch time of 15 s, in a voltage gradient of 6 V/cm. The average BAC clone size for the library was found to be 150 Kb.

(39) Filter Construction and Screening

(40) Filter design and screening were carried out at Amplicon Express, Inc. (Pullman, Wash.). Bioassay dishes containing LB agar plate media and 12.5 µg/mL chloramphenicol were prepared. Positively charged nylon Amersham Hybond-N.sup.+ membrane (GE Healthcare Bio-Sciences, Piscataway, N.J.) was applied to the media surface and the GeneMachines G3 (Genomics Solutions, Bath, UK) was used to robotically grid 18,432 clones in duplicate on filters. The filters were incubated at 37° C. for 12 to 14 hours. The filters were processed using the nylon filter lysis method (Sambrook et al., Molecular Cloning: A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 2001), ed. 3, vol. 1, chap. 1) with slight modifications. Following processing, the DNA was linked to the hybridization membrane filters according to the Hybond N+ manual by baking at 80° C. for 2 hours. To screen the library a 643 bp digoxigenin (DIG)-labeled probe representing position 2161-2803 in the genomic sequence of CYP82X2 (SEQ ID NO 6) was generated from 1.5 ng gDNA by PCR reaction using the primers shown in Table 3 and the PCR DIG synthesis kit (Roche Applied Science, Indianapolis, Ind.) according to the manufacturer's instructions. A non-labeled probe was amplified, diluted and spotted to each filter in the following dilutions of 2 ng, 1 ng, 0.1 ng and 0.0 ng as a positive control. The controls were baked at 80° C. for 30 min. Following a 30 min prehybridizing wash in DIG EasyHyb solution at 45° C., approximately 0.5 µl of denatured DIG labeled PCR product was added per ml of hybridization solution with the nylon filters and incubated with gentle shaking overnight at 45° C. The nylon filters were washed twice in a 2× standard sodium citrate (SSC), 0.1% sodium dodecyl sulfate (SDS) buffer at room temperature for 5 min each, and twice with a 0.5× SSC, 0.1% SDS buffer at 65° C. for 15 minutes each.
The hybridized probe was detected using NBT/BCIP stock solution according to the manufacturer's instructions (Roche Applied Science, Indianapolis, Ind.) and was found to hybridize to six BAC clones.

(41) BAC sequencing and automated sequence assembly: The six positive BAC clones from the BAC library were sequenced at Amplicon Express, Inc. (Pullman, Wash.) by Focused Genome Sequencing (FGS) with an average depth of 100× coverage. FGS is a Next Generation Sequencing (NGS) method developed at Amplicon Express that allows very high quality assembly of BAC clone sequence data using the Illumina HiSeq platform (Illumina, Inc, San Diego, Calif.). The proprietary FGS process makes NGS tagged libraries of BAC clones and generates a consensus sequence of the BAC clones with all reads assembled at 80 bp overlap and 98% identity. The gapped contiguous sequences were ordered and orientated manually based on mate pair sequences from four libraries of insert size 5000, 2000, 500 and 170 bp. Overlapping BAC clones, PS_BAC193L09, PS_BAC179L19, PS_BAC150A23 and PS_BAC164F07, which together encoded all 10 genes from the HN1 cluster, were selected for further sequence assembly. Where possible, gaps and ambiguous regions on both BAC clones were covered by primer walking with traditional Sanger sequencing to validate the assembly. Combination of the four overlapping BAC sequences gave a single continuous consensus sequence assembly of 401 Kb. The sequences of the 10 genes from the HN1 cluster were determined independently by Sanger sequencing and the 100% agreement of the Sanger determined gene sequences with the assembly from FGS provided quality assurance for the whole assembly.

(42) Annotation of the assembled sequence: The sequences of the four BAC clones were annotated with an automated gene prediction program FGENESH (Salamov and Solovyev (2002) Genome Res. 10, 516-522). The gene structure including exon-intron arrangement for the 10 genes in the HN1 cluster was validated by comparison with cDNA sequence for each gene. cDNA sequence was not available for any of the remaining ORFs detailed in FIG. 3 since they are not represented in any of the EST libraries. The predicted function of all ORFs was evaluated by BLAST analysis (Altschul et al (1997) Nucleic Acids Res. 25, 3389-3402) and those ORFs with significant hits (e-value less than 1e.sup.−8) were included in FIG. 3.
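The significance filter applied to the BLAST results can be illustrated with a short Python sketch. It assumes, purely for illustration, that the hits are available in the standard 12-column tab-separated BLAST tabular format, in which the e-value is the eleventh column; the description does not state which output format was used. The function name and the ORF and subject identifiers in the usage lines are hypothetical.

```python
E_VALUE_CUTOFF = 1e-8  # threshold stated in the description


def significant_hits(tabular_lines, cutoff=E_VALUE_CUTOFF):
    """Keep (query, subject, e-value) for hits with e-value below the cutoff.

    Assumes 12 tab-separated columns per line with the e-value in column 11.
    """
    kept = []
    for line in tabular_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        evalue = float(fields[10])
        if evalue < cutoff:
            kept.append((fields[0], fields[1], evalue))
    return kept


# Hypothetical input lines, for illustration only.
example = [
    "ORF_A\tsubject_1\t85.0\t300\t45\t0\t1\t300\t1\t300\t3e-42\t160",
    "ORF_B\tsubject_2\t40.0\t60\t36\t0\t1\t60\t1\t60\t0.002\t35",
]
print(significant_hits(example))
```

Only the first hypothetical hit passes the filter; the second, with an e-value of 0.002, is discarded.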

(43) Generation of Plasmid Constructs for Virus Induced Gene Silencing (VIGS)

(44) The tobacco rattle virus (TRV) based gene silencing system (Liu et al (2002) Plant J. 30, 415-422) was used to investigate the gene function of PSMT1, PSMT2, CYP719A21, CYP82X2, PSSDR1 and PSCXE1. DNA fragments selected for silencing were amplified by PCR and cloned into the silencing vector pTRV2 (GenBank accession no: AF406991). They were linked to a 129 bp-long fragment (SEQ ID NO: 30) of the P. somniferum PHYTOENE DESATURASE gene (PSPDS) in order to simultaneously silence the respective candidate genes and PSPDS. Plants displaying the photo-bleaching phenotype resulting from PSPDS silencing (Hileman et al (2005) Plant J. 44, 334-341) were identified as plants successfully infected with the respective silencing constructs and selected for further analysis.

(45) Generation of the pTRV2:PDS construct: A 622 bp fragment of PSPDS was amplified from cDNA prepared from HN1 using primers shown in Table 3. Sau3AI digestion of the 622 bp PCR product yielded among others a fragment of 129 bp (SEQ ID NO: 30) which was cloned into the BamHI site of the pTRV2 vector. The orientation and fidelity were confirmed by sequencing and the resulting pTRV2:PDS vector was used in the generation of the VIGS construct for each candidate gene. The pTRV2:PDS construct also served as the control in the VIGS experiments.

(46) DNA fragments selected for silencing the respective candidate genes were amplified from either HN1 genomic DNA or cDNA. Primers used for amplification as well as the positions of the selected sequences within the respective open reading frames are shown in Table 3. The PSMT1, CYP719A21 and CYP82X2 fragments were first cloned into pTV00 (Ratcliff et al (2001) Plant J., 237-245) using HindIII and KpnI and then subcloned into pTRV2:PDS using BamHI and KpnI. PSMT2, PSCXE1 and PSSDR1 fragments were cloned directly into pTRV2:PDS using BamHI and KpnI. The orientation and fidelity of all constructs were confirmed by sequencing.

(47) Transformation of Agrobacterium tumefaciens with VIGS constructs: VIGS constructs were propagated in E. coli strain DH5α and transformed into electrocompetent Agrobacterium tumefaciens (strain GV3101) by electroporation.

(48) Infiltration of plants: Separate overnight liquid cultures of A. tumefaciens containing individual VIGS constructs (each consisting of a selected DNA fragment from the target gene linked to the 129 bp-long fragment from the P. somniferum PHYTOENE DESATURASE gene) were used to inoculate LB medium containing 10 mM MES, 20 µM acetosyringone and 50 µg/ml kanamycin. Cultures were maintained at 28° C. for 24 hours, harvested by centrifugation at 3000×g for 20 min, and resuspended in infiltration solution (10 mM MES, 200 µM acetosyringone, 10 mM MgCl.sub.2) to an OD.sub.600 of 2.5. A. tumefaciens harbouring the individual VIGS constructs including the control, pTRV2:PDS, were each mixed 1:1 (v/v) with A. tumefaciens containing pTRV1 (GenBank accession no: AF406990), and incubated for two hours at 22° C. prior to infiltration. Two week old seedlings of HN1 grown under standard greenhouse conditions (22° C., 16 h photoperiod), with emerging first leaves, were infiltrated as described (Nagel and Facchini (2010) Nat. Chem. Biol. 6, 273-275).

(49) Latex and capsule analysis of silenced plants: Leaf latex of infiltrated plants displaying photo-bleaching as a visual marker for successful infection and silencing was analyzed when the first flower buds emerged (~7 week old plants). Latex was collected from cut petioles, with a single drop dispersed into 500 µl of 10% acetic acid. This was diluted 10× in 1% acetic acid to give an alkaloid solution in 2% acetic acid for further analysis. Capsules were harvested from the same plants used for latex analysis and single capsules were ground to a fine powder in a ball mill (Model MM04, Retsch, Haan, Germany). Samples of ground poppy straw were then weighed accurately to 10±0.1 mg and extracted in 0.5 ml of a 10% acetic acid solution with gentle shaking for 1 h at room temperature. Samples were then clarified by centrifugation and a 50 µl subsample diluted 10× in 1% acetic acid to give an alkaloid solution in 2% acetic acid for further analysis. All solutions were analyzed as described for the poppy straw analysis from field grown F2 plants. Likewise, all data analysis was carried out using the R programming language. Putative alkaloid peaks were quantified by their pseudomolecular ion areas using custom scripts. Peak lists were compiled and any peak-wise significant differences between samples were identified using 1-way ANOVA with p-values adjusted using the Bonferroni correction for the number of unique peaks in the data set. For any peak-wise comparisons with adjusted p-values <0.05, Tukey's HSD test was used to identify peaks that were significantly different between any given sample and the control. Alkaloids were identified by comparing exact mass and retention time values to those of standards. Where standards were not available, the Bioconductor rcdk package (Smith et al (2006) Anal. Chem. 78, 779-787) was used to generate pseudomolecular formulae from exact masses within elemental constraints C=1-100, H=1-200, O=0-200, N=0-3 and mass accuracy <5 ppm.
The hit with the lowest ppm error within these constraints was used to assign a putative formula.
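The putative formula assignment described above can be sketched in Python as a constrained search over elemental compositions. This is not the rcdk implementation used in the work and applies none of its chemical plausibility rules; it only enumerates C, H, N, O compositions within the stated elemental limits, keeps those within 5 ppm of the observed mass, and ranks them by ppm error. The function name is hypothetical, and the observed ion is assumed to be a singly protonated [M+H]+ species.

```python
# Monoisotopic masses (Da) of the most abundant isotopes, plus the proton mass.
MASS = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.9949146196}
PROTON = 1.00727646688


def candidate_formulae(mz, ppm_tol=5.0, max_c=100, max_h=200, max_o=200, max_n=3):
    """Return [(abs ppm error, formula dict)] sorted by increasing ppm error."""
    neutral = mz - PROTON  # assumes an [M+H]+ pseudomolecular ion
    hits = []
    for c in range(1, max_c + 1):
        mass_c = c * MASS["C"]
        if mass_c > neutral:
            break
        for n in range(0, max_n + 1):
            mass_cn = mass_c + n * MASS["N"]
            if mass_cn > neutral:
                break
            for o in range(0, max_o + 1):
                partial = mass_cn + o * MASS["O"]
                remainder = neutral - partial
                if remainder < 0:
                    break
                # Hydrogens are about 1.008 Da apart, far wider than the ppm
                # window, so only the nearest integer count can match.
                h = round(remainder / MASS["H"])
                if not 1 <= h <= max_h:
                    continue
                theoretical_mz = partial + h * MASS["H"] + PROTON
                ppm = abs(mz - theoretical_mz) / theoretical_mz * 1e6
                if ppm < ppm_tol:
                    hits.append((ppm, {"C": c, "H": h, "N": n, "O": o}))
    hits.sort(key=lambda hit: hit[0])
    return hits
```

The first element of the returned list corresponds to the hit with the lowest ppm error. In practice several compositions can fall within 5 ppm of a measured mass, which is why such assignments are treated as putative.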

EXAMPLE 1

Transcriptomic Analysis Reveals the Exclusive Expression of 10 Genes Encoding Five Distinct Enzyme Classes in a High Noscapine Producing Poppy Variety, HN1. These Genes are Absent from the Genome of Two Noscapine Non-Producing Varieties

(50) Capsule extract from three opium poppy varieties developed in Tasmania for alkaloid production designated as High Morphine 1 (HM1), High Thebaine 1 (HT1) and High Noscapine 1 (HN1) on the basis of the most abundant alkaloid in each case (FIG. 1A) underwent metabolite profiling. Noscapine was found to be unique to HN1 relative to HM1 and HT1. Roche 454 pyrosequencing was performed on cDNA libraries derived from stem and capsule tissue from all three varieties. Analysis of Expressed Sequence Tag (EST) abundance led to the discovery of a number of previously uncharacterized genes that are expressed in the HN1 variety but are completely absent from the HM1 and HT1 EST libraries (FIG. 1B). The corresponding genes were putatively identified as three O-methyltransferases (PSMT1, PSMT2, PSMT3), four cytochrome P450s (CYP82X1, CYP82X2, CYP82X3 and CYP719A21), an acetyltransferase (PSAT1), a carboxylesterase (PSCXE1) and a short-chain dehydrogenase/reductase (PSSDR1). In contrast a number of other functionally characterized genes associated with benzylisoquinoline alkaloid synthesis, including Berberine Bridge Enzyme (BBE), Tetrahydroprotoberberine cis-N-MethylTransferase (TNMT), Salutaridine Reductase (SalR), Salutaridinol 7-O-AcetylTransferase (SalAT) and Thebaine 6-O-demethylase (T6ODM) were expressed in all three varieties (FIG. 1B). PCR analysis on genomic DNA from all three varieties revealed that the genes exclusively expressed in the HN1 variety are present as expected in the genome of HN1 but absent from the genomes of the HM1 and HT1 varieties (FIG. 1B and FIGS. 5A-5B).

EXAMPLE 2

Analysis of an F2 Mapping Population Shows the Genes are Tightly Linked in HN1 and their Presence is Associated with the Production of Noscapine

(51) An F2 mapping population of 271 individuals was generated using HN1 and HM1 as parents. Genotyping of the field grown F2 population revealed that the HN1 specific genes are tightly linked and associated with the presence of noscapine, suggesting they occur as a gene cluster involved in noscapine biosynthesis (FIG. 2B). Analysis of noscapine levels in field grown F2 capsules revealed that individuals containing this putative gene cluster fall into two classes. The first class, containing 150 individuals, has relatively low levels of noscapine and the second class, containing 63 individuals, exhibits the high noscapine trait of the parental HN1 variety (FIG. 2B). The 58 F2 individuals that lack the putative gene cluster contain undetectable levels of noscapine (FIG. 2B). F3 family analysis confirmed that F2 individuals exhibiting the high noscapine trait were homozygous for the gene cluster while those exhibiting the low noscapine trait were heterozygous (Table 2). Noscapine levels in both the F1 population (FIG. 2A) and the heterozygous F2 class are much lower than the intermediate levels expected for a semi-dominant trait, suggesting involvement of some form of repression. The step change to high noscapine in the homozygous F2 class suggests this trait is linked to the gene cluster locus rather than spread quantitatively among other loci.
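As a worked check on the class sizes given above, the following Python sketch compares the observed F2 counts (63 high noscapine, 150 low noscapine, 58 without the cluster) with the 1:2:1 ratio expected if the cluster segregates as a single locus. This goodness-of-fit test is an illustration added here and is not an analysis reported in the description; the function names are hypothetical.

```python
import math

# Observed F2 class sizes: homozygous cluster, heterozygous, cluster absent.
OBSERVED = [63, 150, 58]
EXPECTED_RATIO = [1, 2, 1]  # single-locus Mendelian segregation (assumption)


def chi_square_gof(observed, ratio):
    """Return the chi-square statistic and the expected counts."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


def p_value_df2(chi2):
    """Upper-tail p-value for 2 degrees of freedom, where it equals exp(-x/2)."""
    return math.exp(-chi2 / 2.0)


chi2, expected = chi_square_gof(OBSERVED, EXPECTED_RATIO)
print(expected, round(chi2, 3), round(p_value_df2(chi2), 3))
```

With three classes there are two degrees of freedom. The statistic is about 3.29 and the p-value about 0.19, so under these assumptions the counts do not depart significantly from 1:2:1 at the 0.05 level, which is consistent with the F3 family analysis.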

EXAMPLE 3

Bacterial Artificial Chromosome Sequencing Confirms that the 10 Genes Exist as a Complex Gene Cluster

(52) To further characterize the putative noscapine gene cluster, a Bacterial Artificial Chromosome (BAC) library was prepared from genomic DNA isolated from HN1 and six overlapping BACs containing genes from the cluster were identified. Next generation and Sanger sequencing was used to generate a high quality assembly of 401 Kb confirming the arrangement of the 10 genes in a cluster spanning 221 Kb (FIG. 3). Only one other homologous gene, a carboxylesterase (PSCXE2), was found in the genomic sequence flanking the gene cluster (FIG. 3) but PSCXE2 was not represented in any of our EST libraries. Interspersed among the ten genes are both retrotransposon and DNA transposable element (TE) sequences (FIG. 3), which may have some function in gene rearrangement for cluster formation as thought to be the case for the thalianol and marneral clusters from A. thaliana (Field et al (2011) PNAS 108, 16116-16121).

EXAMPLE 4

Virus Induced Gene Silencing Results in Accumulation of Pathway Intermediates Allowing Gene Function to be Linked to Noscapine Synthesis and a Novel Bifurcated Biosynthetic Pathway to be Proposed

(53) In order to functionally characterize the genes in the HN1 cluster Virus Induced Gene Silencing (VIGS) was performed on poppy seedlings. VIGS in poppy seedlings persists through to mature plant stages (Hileman et al (2005) Plant J. 44, 334-341), and therefore both leaf latex and capsule extracts were routinely assayed (FIGS. 4A-4F). Silencing PSMT1 resulted in accumulation of scoulerine in capsules and also low levels of reticuline in latex, indicating that this gene product is responsible for the first committed step in the pathway to noscapine synthesis (FIG. 4A). The predicted product of PSMT1 is tetrahydrocolumbamine (FIG. 6A), which accumulated in seedlings and capsules that were silenced for CYP719A21 (FIG. 4B). CYP719A21 shows high homology to cytochrome P450 oxidases that act as methylenedioxy bridge-forming enzymes (Díaz Chávez et al (2011) Arch. Biochem. Biophys. 507, 186-193; Ikezawa et al (2009) Plant Cell Rep. 28, 123-133). Therefore CYP719A21 may encode a canadine synthase (FIG. 6). Silencing of a second cytochrome P450 gene, CYP82X2, resulted in accumulation of several secoberbine intermediates some of which may represent side products to the main synthetic pathway (FIG. 4C, FIGS. 6B-6C). Silencing of the carboxylesterase gene PSCXE1 resulted in accumulation of up to 20% total alkaloid content of putative papaveroxine (FIG. 6D) implying acetylation of a secoberbine intermediate as depicted in FIG. 4G. The PSAT1 gene from the HN1 cluster is an obvious candidate for this reaction. Silencing of PSSDR1 resulted in accumulation of what was putatively identified as narcotinehemiacetal (FIG. 6E), an immediate precursor of noscapine (FIG. 4G). These data support a biosynthetic route to noscapine that involves early O-methylation of a secoberbine intermediate at the position equivalent to the C4 hydroxyl group of noscapine (FIG. 4G).
However, silencing PSMT2 resulted in accumulation of up to 20% narcotoline, indicating that O-methylation at the C4 hydroxyl group can also occur as a final step in noscapine production (FIG. 4F). These results imply bifurcation of the main pathway at the secoberbine intermediate stage with PSMT2 being responsible for both the O-methylation of a secoberbine intermediate and narcotoline. Silencing PSMT2 results in accumulation of high levels of narcotoline as flux is directed down the desmethyl branch of the pathway (FIGS. 4F and 6F).