Cyclopropene amino acids and methods

Abstract

The invention relates to a polypeptide comprising an amino acid having a cyclopropene group wherein said cyclopropene group is joined to the amino acid via a carbamate group. Suitably the cyclopropene group is a 1,3-disubstituted cyclopropene such as a 1,3-dimethylcyclopropene. Suitably the cyclopropene group is present as a residue of a lysine amino acid. The invention also relates to methods of making the polypeptides. The invention also relates to an amino acid comprising cyclopropene wherein said cyclopropene group is joined to the amino acid moiety via a carbamate group.

Claims

1. A polypeptide comprising an amino acid having a cyclopropene group wherein said cyclopropene group is linked to the amino acid via a carbamate group and the carbamate group does not form a part of the polypeptide backbone, wherein said cyclopropene group is a 1,3-disubstituted cyclopropene.

2. A polypeptide according to claim 1 wherein said cyclopropene is a 1,3-dimethylcyclopropene.

3. A polypeptide according to claim 1 wherein said cyclopropene group is present as a residue of a lysine amino acid.

4. A polypeptide according to claim 1 further comprising a tetrazine compound linked to said cyclopropene group.

5. An amino acid comprising cyclopropene wherein said cyclopropene group is linked to the amino acid moiety via a carbamate group and the carbamate group does not form a part of the peptide backbone, wherein said cyclopropene group is a 1,3-disubstituted cyclopropene.

6. An amino acid according to claim 5 wherein said cyclopropene is a 1,3-dimethylcyclopropene.

7. An amino acid according to claim 5 wherein said amino acid is a lysine amino acid.

8. An amino acid according to claim 7 which comprises N.sup.ε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine.

9. An amino acid according to claim 7 which consists of ##STR00006##

10. A method of producing a polypeptide comprising a cyclopropene group wherein said cyclopropene group is joined to an amino acid moiety of the polypeptide via a carbamate group, said method comprising genetically incorporating said amino acid comprising said cyclopropene group joined to said amino acid moiety via said carbamate group, into said polypeptide.

11. A method according to claim 10 wherein producing the polypeptide comprises (i) providing a nucleic acid encoding the polypeptide which nucleic acid comprises an orthogonal codon encoding the amino acid having a cyclopropene group; (ii) translating said nucleic acid in the presence of an orthogonal tRNA synthetase/tRNA pair capable of recognising said orthogonal codon and incorporating said amino acid having a cyclopropene group into the polypeptide chain.

12. A method according to claim 10 wherein said orthogonal codon comprises an amber codon (TAG), said tRNA comprises MbtRNA.sub.CUA and said tRNA synthetase comprises MbPylRS; or wherein said orthogonal codon comprises an amber codon (TAG), said tRNA comprises MmtRNA.sub.CUA and said tRNA synthetase comprises MmPylRS.

13. A method according to claim 10 wherein said carbamate group does not form a part of the peptide backbone, and wherein said cyclopropene group is a 1,3-disubstituted cyclopropene.

14. A method of producing a polypeptide comprising a tetrazine group, said method comprising providing a polypeptide according to claim 1, contacting said polypeptide with a tetrazine compound, and incubating to allow joining of the tetrazine to a cyclopropene group of the polypeptide by an inverse electron demand Diels-Alder cycloaddition reaction.

15. A method according to claim 14 wherein said reaction is allowed to proceed for 10 minutes or less, preferably for 1 minute or less, preferably for 30 seconds or less.

16. A polypeptide according to claim 1 wherein said polypeptide comprises two or more amino acids each having a cyclopropene group, wherein each said cyclopropene group is linked to each said amino acid via a carbamate group and wherein each carbamate group does not form a part of the polypeptide backbone.

17. A polypeptide according to claim 16 wherein said polypeptide comprises four amino acids each having a cyclopropene group.

18. An antibody drug conjugate (ADC) comprising a polypeptide according to claim 1.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) Embodiments of the present invention will now be described further, with reference to the accompanying drawings, in which:

(2) FIG. 1A-B: SORT-M enables proteome tagging and labelling at diverse codons, with diverse chemistries, and in genetically targeted cells and tissues. FIG. 1A: Proteome tagging via SORT (stochastic orthogonal recoding of translation) uses an orthogonal aminoacyl-tRNA synthetase/tRNA pair. The pyrrolysyl-tRNA synthetase/tRNA pair is used in this study. This synthetase (and its previously evolved active-site variants) recognizes a range of unnatural amino acids (yellow star and yellow hexagon), does not aminoacylate endogenous tRNAs, but efficiently aminoacylates its cognate tRNA without regard to anticodon identity; PyltRNA is not a substrate for endogenous aminoacyl-tRNA synthetases. Orthogonal pyrrolysyl-tRNA synthetase/tRNA.sub.XXX pairs (XXX indicates choice of anticodon, yellow) in which the anticodon has been altered compete for the decoding of sense codons (dark blue and pink) via a pathway that is orthogonal to that used by natural synthetases and tRNAs (dark blue and pink) to direct natural amino acids. SORT allows the incorporation of diverse chemical groups into the proteome, in response to diverse codons. Since there is no competition at the active site of the orthogonal synthetase, starvation and minimal media are not required. In addition, the expression pattern of the orthogonal proteome tagging system can be genetically directed, allowing tissue-specific proteome labelling. Selective pressure incorporation approaches are shown for comparison to SORT. FIG. 1B: The combination of encoding amino acids (1-3) across the proteome via SORT and chemoselective modification of 3 with tetrazine probes (4a-g, 5, 6 and 7) allows detection of labelled proteins via SORT-M (stochastic orthogonal recoding of translation and chemoselective modification). Amino acid structures: N.sup.ε-((tert-butoxy)carbonyl)-L-lysine 1, N.sup.ε-((1-propynyloxy)carbonyl)-L-lysine 2 and N.sup.ε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine 3.

(3) FIG. 2A-C Shows Quantitative Site-Specific Incorporation of 3 into Proteins Expressed in E. coli and its Rapid and Quantitative Labelling with Tetrazine Probes

(4) FIG. 2A: The PylRS/tRNA.sub.CUA pair directs efficient, site-specific incorporation of 3 into sfGFP bearing an amber stop codon at position 150. Incorporation of 3 is more efficient than that of 1, a well-established, excellent substrate for the PylRS/tRNA.sub.CUA pair.

(5) FIG. 2B: Specific and quantitative labelling of 2 nmol sfGFP bearing 3 with 10 equivalents of tetrazine fluorophore 4a. ESI-MS analysis of sfGFP-3 purified from E. coli bearing the PylRS/tRNA.sub.CUA pair and sfGFP150TAG, grown with 1 mM 3, confirms the incorporation of 3. sfGFP150-3: Expected mass: 27951.5 Da, Found mass: 27950±1.0 Da; the minor peak at 27820 Da corresponds to loss of the N-terminal methionine. Labelling of sfGFP150-3 with 4a is quantitative, as judged by ESI-MS of the labelling reaction. Expected mass: 28758.4 Da, Found mass: 28758±1.0 Da; the minor peak at 28627 Da corresponds to loss of the N-terminal methionine.
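The mass assignments in this legend can be sanity-checked with simple arithmetic. The following is a minimal sketch only: it assumes the found masses read as 27950 Da and 28758 Da, uses the standard average residue mass of methionine (131.19 Da), and applies an arbitrary 2 Da matching window that is not taken from the source. The function names are hypothetical.

```python
# Average mass of a methionine residue: free Met (149.21 Da) minus water (18.02 Da).
MET_RESIDUE_MASS = 131.19

# Arbitrary matching window in Da (an assumption, not a value from the text).
TOLERANCE_DA = 2.0

def within(expected, found, tol=TOLERANCE_DA):
    """Return True if the found mass lies within +/- tol Da of the expected mass."""
    return abs(expected - found) <= tol

def des_met_mass(full_length_mass):
    """Expected mass after loss of the N-terminal methionine."""
    return full_length_mass - MET_RESIDUE_MASS

# (label, expected full-length mass, found mass, minor peak), as read from the legend.
species = [
    ("sfGFP150-3", 27951.5, 27950.0, 27820.0),
    ("sfGFP150-3 labelled with 4a", 28758.4, 28758.0, 28627.0),
]

for label, expected, found, minor in species:
    print(label,
          "| main peak matches:", within(expected, found),
          "| minor peak matches des-Met form:", within(des_met_mass(expected), minor))

# Mass gained on labelling, computed from the two expected masses.
label_shift = round(28758.4 - 27951.5, 1)
print("mass shift on labelling (Da):", label_shift)
```

Both minor peaks lie about 131 Da below the corresponding full-length mass, which is what the legend's assignment to loss of the N-terminal methionine implies.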

(6) FIG. 2C: Determining the rate constant for labelling of sfGFP-3 (sfGFP incorporating 3 at position 150) with 10 equivalents of 4a. 2 nmol of purified sfGFP-3 (10.6 µM in 20 mM Tris-HCl, 100 mM NaCl, 2 mM EDTA, pH 7.4) was incubated with 20 nmol of tetrazine-dye conjugate 4a (10 µl of a 2 mM solution in DMSO). At different time points, 8 µl aliquots were taken from the solution, quenched with a 700-fold excess of BCN and plunged into liquid nitrogen. Samples were mixed with NuPAGE LDS sample buffer supplemented with 5% β-mercaptoethanol, heated for 10 min to 90° C. and analyzed by 4-12% SDS-PAGE. The amounts of labelled proteins were quantified by scanning the fluorescent bands with a Typhoon Trio phosphoimager (GE Life Sciences). Bands were quantified with the ImageQuant TL software (GE Life Sciences) using rubber band background subtraction. The rate constant was determined by fitting the data to a single-exponential equation. The observed rate constant obtained from the fit was divided by the concentration of 4a to obtain the rate constant k for the reaction. Measurements were done in triplicate. All data processing was performed using Kaleidagraph software (Synergy Software, Reading, UK). For comparison, the rate of labelling sfGFP bearing N.sup.ε-5-norbornene-2-yloxycarbonyl-L-lysine (NorK), a known substrate for PylRS, was determined in a similar way using 11.25 µM sfGFP bearing NorK at position 150 (sfGFP-NorK) and 20 equivalents of 4a.

(7) FIG. 3 shows primers (SEQ ID NOS. 18-45).

(8) FIG. 4A-B Shows SORT-M Enables Codon Specific Proteome Tagging and Labelling in E. coli

(9) FIG. 4A: Proteome labelling with 3 via the indicated PylRS/tRNA.sub.XXX pair. Cells contained two plasmids, one encoding MbPylRS, the other encoding T4 lysozyme and the indicated tRNA.sub.XXX. Cells were grown in the presence of 0.1 mM 3 from OD.sub.600=0.2, and T4 lysozyme expression was induced by the addition of 0.2 mM arabinose after 1 h. After a further 3 h, cells were harvested. Tagged proteins in the lysate were detected via an inverse electron demand Diels-Alder reaction between incorporated 3 and tetrazine fluorophore 4a (20 mM, 1 h, RT). The amino acids in parentheses are the natural amino acids encoded by the endogenous tRNA bearing the corresponding anti-codon. FIG. 4B: Lane profile analysis for each codon.

(10) FIG. 5 Shows Specific Amino Acid Replacement in SORT Demonstrated by ESI-MS

(11) T4 lysozyme isolated after SORT with UUU(Lys) in the presence of 1 mM 3. Expected mass WT T4 lysozyme: 19512.2 Da, Found mass: 19510±2.0 Da. Expected mass WT T4 lysozyme with a single Lys→3 mutation: 19622.3 Da, Found mass: 19620±2.0 Da.
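The mass difference between the two species can be compared with the shift expected for replacing one lysine with amino acid 3. This is a rough sketch under stated assumptions: relative to lysine, 3 is taken to carry a ((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl group in place of one side-chain amine hydrogen (net gain C6H6O2), standard average atomic masses are used, and the found masses are read as 19510 Da and 19620 Da. The names are illustrative.

```python
# Standard average atomic masses in Da.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumed net elemental gain for a single Lys -> 3 substitution:
# the C6H7O2 carbamate substituent replaces one N-H hydrogen, giving C6H6O2.
NET_GAIN = {"C": 6, "H": 6, "O": 2}

def formula_mass(formula):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())

shift_calculated = formula_mass(NET_GAIN)   # from the assumed composition
shift_expected = 19622.3 - 19512.2          # from the expected masses in the legend
shift_found = 19620.0 - 19510.0             # from the found masses, as read above

print("calculated shift (Da):", round(shift_calculated, 2))
print("shift from expected masses (Da):", round(shift_expected, 1))
print("shift from found masses (Da):", round(shift_found, 1))
```

The three values agree to well within the stated 2.0 Da uncertainty, which is consistent with a single substitution.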

(12) FIG. 6 Shows Incorporation of 3 (0.1 mM) via SORT-M is Not Toxic to Cells

(13) Chemically competent DH10B cells were transformed with two plasmids: pBKwtPylRS, necessary for expression of PylRS, and a pBAD_wtT4L_MbPylT.sub.XXX plasmid, which is required for expression of PyltRNA.sub.XXX and expresses lysozyme under arabinose control. The cells were recovered in 1 ml SOB medium for one hour at 37° C. prior to aliquoting to 10 ml LB-KT (LB media with 50 µg ml.sup.-1 kanamycin and 25 µg ml.sup.-1 tetracycline) and incubated overnight (37° C., 250 rpm, 12 h). The overnight culture (OD.sub.600≈3) was diluted to an OD.sub.600≈0.3 in 10 mL LB-KT.sub.1/2 (LB media with 25 µg ml.sup.-1 kanamycin and 12.5 µg ml.sup.-1 tetracycline) supplemented with 3 at different concentrations: 0, 0.1 and 0.5 mM. 200 µL aliquots of these cultures were transferred into a 96-well plate and OD.sub.600 was measured using a microplate reader (Infinite 200 Pro, TECAN). OD.sub.600 was measured for each sample every 10 min, with linear 1 mm shaking between the measurements.
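The dilution step can be illustrated with the usual C1V1 = C2V2 relation. This sketch assumes that OD.sub.600 scales linearly with cell density over the range used and reads the two optical densities as approximately 3 and 0.3; the helper name is hypothetical.

```python
def dilution_volumes(od_start, od_target, final_volume_ml):
    """Return (culture_ml, medium_ml) needed to reach od_target in final_volume_ml.

    Assumes optical density is proportional to cell density (C1*V1 = C2*V2).
    """
    if od_target > od_start:
        raise ValueError("target OD cannot exceed the starting OD")
    culture_ml = final_volume_ml * od_target / od_start
    return culture_ml, final_volume_ml - culture_ml

# Overnight culture at OD600 of about 3, diluted to about 0.3 in 10 mL of medium.
culture_ml, medium_ml = dilution_volumes(3.0, 0.3, 10.0)
print(f"culture: {culture_ml:.2f} mL, fresh medium: {medium_ml:.2f} mL")
```

Under these assumptions the step is a ten-fold dilution.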

(14) FIG. 7 Shows Measurement of Time-Dependent Variation in Incorporation of 3 in Proteome via SORT-M at Different Concentrations of 3 in Response to AAA Codon

(15) Chemically competent DH10B cells were transformed with two plasmids: pBKwtPylRS, necessary for expression of PylRS, and the pBAD_wtT4L_MbPylT.sub.UUU plasmid, which is required for expression of PyltRNA.sub.UUU. The pBAD_wtT4L_MbPylT.sub.UUU plasmid also contains the gene for expression of T4 lysozyme downstream of an arabinose-inducible promoter. After transformation, cells were recovered in 1 ml SOB medium for one hour at 37° C. prior to inoculation in 10 ml LB-KT (LB media with 50 µg ml.sup.-1 kanamycin and 25 µg ml.sup.-1 tetracycline). The culture was incubated overnight (37° C., 250 rpm, 12 h) and subsequently diluted to an OD.sub.600≈0.3 in 30 mL LB-KT.sub.1/2 (LB media with 25 µg ml.sup.-1 kanamycin and 12.5 µg ml.sup.-1 tetracycline) supplemented with 3 at different concentrations: 0, 0.1 and 0.5 mM. The cultures were incubated (37° C., 250 rpm) for 1 h, until OD.sub.600 reached approximately 0.6. A 2 ml culture aliquot was collected in a separate tube for each of the three cultures. This is the pre-induction culture (lane labelled as 1 in the gel image). Subsequently, arabinose was added at a final concentration of 0.2% (v/v) to induce expression of T4 lysozyme and culture aliquots of 2 mL were collected every hour (lanes labelled as 2, 3 and 4, corresponding to culture collection 1, 2 and 3 h after induction). For each of the collected cultures, bacterial cells were pelleted by centrifugation at 4° C. and washed with ice cold PBS (3×1 mL), and the pellets were subsequently frozen and stored at -20° C. The pellets were then thawed in 200 µL of ice cold PBS and lysed by sonication (9×10 s ON/20 s OFF, 70% power). The lysates were clarified by centrifugation at 15,000 RPM, 4° C. for 30 minutes. The supernatants were transferred to fresh 1.5 mL tubes. 50 µL of supernatant was transferred to a new tube for the labeling reactions, and the rest was frozen in liquid nitrogen and stored at -80° C. To the 50 µL of supernatant, 0.5 µL of 2 mM 4a was added and the lysates were incubated at 25° C. for 1 hour. After 1 h, 17 µL of 4× LDS sample buffer (supplemented with 6 mM BCN and 5% BME) was added and mixed by vortexing gently. Samples were incubated for 10 min before boiling at 90° C. for 10 min. Samples were analysed by 4-12% SDS-PAGE and fluorescent images were acquired using a Typhoon Trio phosphoimager (GE Life Sciences).

(16) FIG. 8A-C shows site-specific incorporation of 3 into proteins at diverse codons and specific proteome labelling using SORT-M in human cells. FIG. 8A: Western blot analysis demonstrates the efficient amino acid dependent expression of an mCherry-EGFP fusion protein separated by an amber stop codon and bearing a C-terminal HA-tag (mCh-TAG-EGFP-HA) in HEK293T cells. Anti-FLAG detected tagged PylRS. FIG. 8B: Specific labelling of mCh-TAG-EGFP-HA (immunoprecipitated from 10.sup.6 cells) with 4a (20 µM in 50 µL PBS, 1 h, RT) confirms the incorporation of 3 into protein in HEK293 cells. FIG. 8C: SORT-M labelling of 3 that is statistically incorporated into newly synthesised proteins across the whole proteome of mammalian cells, directed by six different PylRS/PyltRNA.sub.XXX mutants using 0.5 mM 3. Labeling with 4g (20 µM in PBS, 1 h, RT, as above). The amino acids in parentheses are the natural amino acids encoded by the endogenous tRNA bearing the corresponding anti-codon.

(17) FIG. 9A: Full blots from FIG. 8A-C.

(18) FIG. 9B: Full blots from FIG. 10A-C.

(19) FIG. 10 shows site-specific incorporation of amino acid 3 into protein produced in Drosophila melanogaster. FIG. 10A: Incorporation of 3 demonstrated by a dual luciferase reporter. Dual luciferase assay on ovary extract from 10 female flies expressing Triple-Rep-L in the presence or absence of 10 mM 1 or 10 mM 3. The data show a representative example from 1 of 3 biological replicates. The error bars represent the standard deviation of 3 technical replicates from a single biological replicate. FIG. 10B: Site-specific incorporation of 3 (or 1) into GFP_TAG_mCherry-HA in flies expressing PylRS/PyltRNA.sub.CUA. The full-length protein resulting from unnatural amino acid incorporation is detected by anti-HA western blot. FIG. 10C: Specific labelling of encoded 3 with tetrazine probes. Flies were fed with no amino acid, amino acid 1 (500 flies) or amino acid 3 (100 flies). Five times more flies were fed with 1 in order to generate a comparable amount of reporter protein. The full-length protein containing the unnatural amino acid was immunoprecipitated from lysed ovaries with anti-GFP beads. The beads were labelled (4g, 4 µM, 200 µL, PBS, RT, 2 h) and washed. Full-length protein was detected by anti-HA blot, and the same gel imaged on a fluorescence scanner shows specific fluorescent labelling of the protein incorporating 3 but not 1, confirming the identity of the incorporated amino acid.

(20) FIG. 11A-B (example 6): Specific protein labeling at genetically encoded unnatural amino acids 1 and 2. FIG. 11A: Genetically encoded 1, but not 2, in calmodulin is specifically labeled with probe 3. Coomassie and fluorescence images demonstrate the specificity of labeling and ESI MS before labelling (black, expected mass: 17875, found mass: 17874) and after labelling (red, expected mass: 18553, found mass: 18552) demonstrate the reaction is quantitative. FIG. 11B: Genetically encoded 2, but not 1, in calmodulin is specifically labeled with probe 4. Coomassie and fluorescence images demonstrate the specificity of labeling and ESI MS before labeling (black, expected mass: 17930, found mass: 17930) and after labelling (green, expected mass: 18484, found mass: 18485) demonstrate the reaction is quantitative. Raw (before deconvolution) ESI-MS spectra are not shown.

(21) FIG. 12A-B (example 6): Incorporating 1 and 2 at positions 1 and 40 of Calmodulin and the kinetics of specific labelling. FIG. 12A: Expression was performed in E. coli bearing ribo-Q1, O-gst-cam.sub.1TAG-40AGTA, the PylRS/tRNA.sub.UACU pair and the MjPrpRS/tRNA.sub.CUA pair. Amino acids 1 and 2 were used at 4 and 1 mM, respectively. FIG. 12B: Labelling time course for reaction of CaM1.sub.12.sub.40 with 3 and 4. Each reaction was followed for 2 h by in gel fluorescence and mobility shift.

(22) FIG. 13A-B (example 6): Concerted, quantitative one-pot, dual labeling of Calmodulin in 30 minutes. FIG. 13A: Dye dependent labeling of CaM1.sub.12.sub.40; sequential labeling with purification after first labeling in lane 4, sequential labeling without purification in lane 5, one-pot dual labeling in lane 6. FIG. 13B: ESI-MS of one-pot protein labeling, before labeling (black, expected mass: 18000 found mass: 18000), after labeling (gold, expected mass: 19233 found mass: 19234). Raw (before deconvolution) ESI-MS spectra are not shown.

(23) FIG. 14A-B shows a concerted, rapid, one-pot quantitative dual labelling of proteins in aqueous medium at physiological pH and temperature. FIG. 14A: Unnatural amino acids and fluorophores used in this example. FIG. 14B: Concerted labeling at an encoded terminal alkyne and an encoded cyclopropene via mutually orthogonal cycloadditions.

(24) FIG. 15 shows Amino acid and DNA sequence of Drosophila GFP-amber-mCherry-HA.

(25) GFP (amino acid residues 1-238), Amber codon at position 248, mCherry (amino acid residues 255-489), HA tag (amino acid residues 491-499), Myc tag (amino acid residues 500-509), His tag (amino acid residues 510-515) and SV40 NLS (amino acid residues 523-528).

(26) FIG. 16 shows the structure of the exemplary amino acid N.sup.ε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine.

EXAMPLES

DESCRIPTION OF THE EMBODIMENTS

(27) Although illustrative embodiments of the invention have been disclosed in detail herein, with reference to the accompanying drawings, it is understood that the invention is not limited to the precise embodiment and that various changes and modifications can be effected therein by one skilled in the art without departing from the scope of the invention as defined by the appended claims and their equivalents.

Chemical Syntheses: General Methods

(28) All chemicals and solvents were purchased from Sigma-Aldrich, Alfa Aesar or Fisher Scientific and used without further purification unless otherwise stated. Qualitative analysis by thin layer chromatography (TLC) was performed on aluminium sheets coated with silica (Merck TLC 60F-254). The spots were visualized under a short-wavelength ultraviolet lamp (254 nm) or stained with basic aqueous potassium permanganate, ethanolic ninhydrin or vanillin. Flash column chromatography was performed with the specified solvent systems on silica gel 60 (mesh 230-400).

(29) LC-MS analysis was performed on an Agilent 1200 machine. The solvents used consisted of 0.2% formic acid in water (buffer A) and 0.2% formic acid in acetonitrile (buffer B). LC was performed using a Phenomenex Jupiter C18 column (150×2 mm, 5 µm) and monitored using variable wavelengths. Retention times (R.sub.t) are recorded to the nearest 0.1 min and m/z ratios to the nearest 0.01 mass units. The following programme was used for the small molecule LC gradient: 0-1 min (A:B 10:90-10:90, 0.3 mL/min), 1-8 min (A:B 10:90-90:10, 0.3 mL/min), 8-10 min (A:B 90:10-90:10, 0.3 mL/min), 10-12 min (A:B 90:10-10:90, 0.3 mL/min).
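The gradient programme can be written as a small table and interpolated to give the solvent composition at any time point. This is an illustrative sketch only: the A:B ratios are copied exactly as written in the text, the ramps are assumed to be linear between the stated endpoints (the text gives only the endpoints), and the function names are hypothetical.

```python
# Each segment: (start_min, end_min, (A, B) at start, (A, B) at end), as written in the text.
GRADIENT = [
    (0.0, 1.0, (10, 90), (10, 90)),
    (1.0, 8.0, (10, 90), (90, 10)),
    (8.0, 10.0, (90, 10), (90, 10)),
    (10.0, 12.0, (90, 10), (10, 90)),
]
FLOW_ML_PER_MIN = 0.3  # constant across all segments

def composition_at(t_min):
    """Return the (A, B) ratio at time t_min, assuming linear ramps within a segment."""
    for start, end, (a0, b0), (a1, b1) in GRADIENT:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return (a0 + fraction * (a1 - a0), b0 + fraction * (b1 - b0))
    raise ValueError("time is outside the 0-12 min programme")

def total_solvent_ml():
    """Total solvent delivered over the whole programme at the constant flow rate."""
    return FLOW_ML_PER_MIN * (GRADIENT[-1][1] - GRADIENT[0][0])

print("A:B at 4.5 min:", composition_at(4.5))
print("solvent per run (mL):", round(total_solvent_ml(), 2))
```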

(30) Mass spectrometry analysis following LC was carried out in ESI mode on a 6130 Quadrupole spectrometer and recorded in both positive and negative ion modes. NMR analysis was carried out on a Bruker 400 MHz instrument. All reported chemical shifts (δ) relative to TMS were referenced to the residual protons in the deuterated solvents used: d.sub.1-chloroform (.sup.1H δ=7.26 ppm, .sup.13C δ=77.16 ppm), d.sub.6-dimethylsulfoxide (.sup.1H δ=2.49 ppm, .sup.13C δ=39.52 ppm), D.sub.2O (.sup.1H δ=4.70 ppm). APT or two-dimensional experiments (COSY, HSQC) were performed where needed to provide additional information for analysis. Coupling constants are given in Hz and multiplicities are described as: singlet (s), doublet (d), triplet (t), quartet (q), broad singlet (br), multiplet (m), doublet of doublets (dd), etc. and combinations thereof.

Protein Expression, Purification and Labelling of Site-Specifically Incorporated 3 in E. coli

(31) Expression and purification of sfGFP-3 from E. coli: Electrocompetent E. coli DH10B cells were co-transformed with pBK-MbPylRS and psfGFP150TAG PylT.sup.14, 26. Transformed cells were recovered in S.O.B. (1 mL, supplemented with 0.2% glucose) for 1 h at 37° C. and used to inoculate LB containing 50 µg/mL kanamycin and 25 µg/mL tetracycline (LB-KT). The cells were incubated with shaking overnight at 37° C., 250 r.p.m. 1 mL of overnight culture was used to inoculate 100 mL of LB-KT, and the day culture was then incubated (37° C., 250 r.p.m.). At O.D..sub.600≈0.3, the culture was divided equally and supplemented with either 3 (1 mM) or H.sub.2O (500 µL) and incubated further (37° C., 250 r.p.m.). At O.D..sub.600≈0.6, protein expression was induced by the addition of arabinose (0.2%). After 4 h, the cells were harvested by centrifugation (4000 r.p.m., 20 min) and the pellet was frozen until further use.

(32) The frozen bacterial pellet was thawed on ice and resuspended in 2.5 mL lysis buffer (Bugbuster, Novagen, 50 µg/mL DNAse 1, Roche inhibitor cocktail and 20 mM imidazole). Cells were incubated (4° C., 30 minutes) and the lysate was then clarified by centrifugation (16000 × g, 4° C., 30 minutes). The clarified lysates were transferred to fresh tubes and 100 µL Ni-NTA slurry was added. The mixtures were incubated with agitation (4° C., 1 h) and the beads were then collected by centrifugation (1000 × g, 4° C., 5 min). The beads were resuspended three times in 500 µL wash buffer (10 mM Tris-HCl, 40 mM imidazole, 200 mM NaCl, pH 8) and collected by centrifugation (1000 × g, 4° C., 5 min). Finally, the beads were resuspended in 100 µL elution buffer (10 mM Tris-HCl, 300 mM imidazole, 200 mM NaCl, pH 8) and pelleted by centrifugation (1000 × g, 4° C., 5 min), and the supernatant was collected into fresh tubes. The elution was repeated three times with 100 µL of elution buffer. The purified proteins were analysed by 4-12% SDS-PAGE and LC-MS.

Protein Mass Spectrometry

(33) Using an Agilent 1200 LC-MS system, ESI-MS was additionally carried out with a 6130 Quadrupole spectrometer. The solvent system consisted of 0.1% formic acid in H.sub.2O as buffer A, and 0.1% formic acid in acetonitrile (MeCN) as buffer B. Protein UV absorbance was monitored at 214 and 280 nm. Protein MS acquisition was carried out in positive ion mode and total protein masses were calculated by deconvolution within the MS Chemstation software (Agilent Technologies).

In Vitro Labeling of Purified sfGFP150-3

(34) To purified sfGFP150-1 or sfGFP150-3 protein (30 µM, in elution buffer) was added 4a (10 molar equivalents, from a 2 mM stock solution in DMSO). The reactants were mixed by aspirating several times and the mixture was then incubated at room temperature for 2 hours, after which a sample was analysed by ESI-MS. Following incubation, the proteins were separated by 4-12% SDS-PAGE and analysed using a Typhoon Trio phosphoimager (GE Life Sciences).

Time Course of sfGFP150-3 and sfGFP150-NorK Labelling and Rate Constant Determination

(35) 2 nmol sfGFP-3 (10.6 µM) was labeled at room temperature by the addition of 20 nmol of tetrazine-dye conjugate 4a (10 µl of a 2 mM solution in DMSO); the samples were mixed by aspirating several times. At different time points, 8 µL aliquots were taken from the solution, quenched with a 700-fold excess of bicyclo[6.1.0]non-4-yn-9-ylmethanol (BCN) and plunged into liquid nitrogen. Samples were mixed with NuPAGE LDS sample buffer supplemented with 5% β-mercaptoethanol, heated for 10 min to 90° C. and analyzed by 4-12% SDS-PAGE. The amounts of labelled proteins were quantified by scanning the fluorescent bands with a Typhoon Trio phosphoimager (GE Life Sciences). Bands were quantified with the ImageQuant TL software (GE Life Sciences) using rubber band background subtraction. The rate constant was determined by fitting the data to a single-exponential equation. The observed rate constant obtained from the fit was divided by the concentration of 4a to obtain the rate constant k for the reaction. Measurements were done in triplicate. All data processing was performed using Kaleidagraph software (Synergy Software, Reading, UK). For comparison, the rate of labelling sfGFP bearing N.sup.ε-5-norbornene-2-yloxycarbonyl-L-lysine (NorK), a known substrate for PylRS, was determined in a similar way using 11.25 µM sfGFP bearing NorK at position 150 (sfGFP-NorK) and 20 equivalents of 4a.
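The arithmetic behind the rate constant can be sketched as follows. This is not the Kaleidagraph procedure used in the study: it swaps in a simple linearised least-squares fit, assumes pseudo-first-order behaviour (dye in excess, fraction labelled f(t) = 1 - exp(-k_obs·t)), reads the quantities as 2 nmol protein at 10.6 µM plus 10 µL of dye stock, and runs on synthetic data generated from a hypothetical k_obs. All names and the numerical k_obs are illustrative assumptions, not measured values.

```python
import math

def observed_rate(times_s, fraction_labelled):
    """Estimate k_obs (s^-1) as the through-origin slope of -ln(1 - f) versus t.

    Linearised stand-in for a nonlinear single-exponential fit.
    """
    numerator = denominator = 0.0
    for t, f in zip(times_s, fraction_labelled):
        if 0.0 <= f < 1.0:  # skip points at or beyond full labelling
            y = -math.log(1.0 - f)
            numerator += t * y
            denominator += t * t
    return numerator / denominator

def dye_concentration_molar(protein_nmol, protein_uM, dye_nmol, dye_volume_uL):
    """Dye concentration (M) after mixing, accounting for the added dye volume."""
    protein_volume_uL = protein_nmol / protein_uM * 1000.0  # nmol / µM = mL
    total_volume_L = (protein_volume_uL + dye_volume_uL) * 1e-6
    return dye_nmol * 1e-9 / total_volume_L

# SYNTHETIC, noise-free data from a HYPOTHETICAL k_obs (not experimental values).
HYPOTHETICAL_K_OBS = 0.003  # s^-1
times = [0.0, 60.0, 120.0, 300.0, 600.0]
fractions = [1.0 - math.exp(-HYPOTHETICAL_K_OBS * t) for t in times]

k_obs = observed_rate(times, fractions)
dye_M = dye_concentration_molar(2.0, 10.6, 20.0, 10.0)
k2 = k_obs / dye_M  # second-order rate constant, M^-1 s^-1

print(f"k_obs = {k_obs:.4f} s^-1, [4a] = {dye_M * 1e6:.1f} µM, k = {k2:.1f} M^-1 s^-1")
```

With real gel data, a nonlinear fit that also estimates the plateau is usually preferable, because the linearised form amplifies noise as the labelled fraction approaches 1.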

Plasmid Construction for pBAD_wtT4L_MbPylT.sub.XXX

(36) pBAD_T4L83TAG_MbPylT.sub.CUA was digested with NcoI and KpnI restriction enzymes. The same restriction enzymes were also used to digest the wild-type T4 lysozyme from (D67) pBAD_wtT4L. The insert and backbone were ligated in a 3:1 ratio using T4 DNA ligase (RT, 2 hours), transformed into chemically competent DH10B cells and grown on tetracycline agar plates (37° C., 18 hours). Single colonies were picked and the correct sequence was confirmed by DNA sequencing (GATC GmbH); this step created pBAD_wtT4L_MbPylT.sub.CUA. All final constructs were confirmed by DNA sequencing.

Proteomic Incorporation of 3 Via SORT in E. coli Expressing T4 Lysozyme

(37) Electrocompetent E. coli DH10B cells (50 µL) were either doubly transformed with the pBAD_wtT4L_MbPylT.sub.XXX plasmid (2 µL, necessary for expression of PyltRNA.sub.XXX and expressing T4 lysozyme under arabinose control) and the pBKwtPylS plasmid (2 µL, necessary for expression of PylRS) or singly transformed with pBAD_wtT4L_MbPylT.sub.XXX alone. Transformed cells were recovered in 1 mL S.O.B. (supplemented with 0.2% glucose) for 1 h at 37° C. 100 µL of the recovery was used to inoculate 5 mL LB-KT (50 µg/mL kanamycin and 25 µg/mL tetracycline) or LB-T (25 µg/mL tetracycline). Cultures were incubated overnight (37° C., 250 r.p.m.). 1 mL of each overnight culture was used to inoculate 15 mL of half-strength antibiotic-containing media (LB-T or LB-KT). Cultures were incubated at 37° C. until O.D..sub.600≈0.3 was reached, at which time each culture was divided into 5 mL aliquots and supplemented with either 3 (0.1 mM final conc.) or H.sub.2O (50 µL). Cultures were then incubated (37° C., 250 r.p.m.). At O.D..sub.600≈0.6, T4 lysozyme expression was initiated by the addition of arabinose (0.2% final conc.) and cultures were incubated for a further 4 hours. Cells were harvested by centrifugation (4000 rpm, 4° C., 20 minutes), then resuspended three times in 1 mL of ice cold PBS and collected by centrifugation (4000 rpm, 4° C., 20 minutes). The final bacterial pellets were immediately frozen for storage.

E. coli: Chemoselective Labelling Proteomes Tagged with 3 with Tetrazine-Dye Conjugates

(38) Frozen bacterial pellets were resuspended in 500 µL PBS and lysed using a bath sonicator (energy output 7.0, 90 s total sonication time, 10 s blasts and 20 s breaks, Misonix Sonicator 3000). The lysate was cleared by centrifugation (4° C., 14000 r.p.m., 30 min) and the supernatant aspirated to a fresh tube. To 50 µL of cleared cell lysate was added 4a (2 mM stock in DMSO, final concentration 20 µM). The reactions were mixed by aspirating several times and the samples then incubated in the dark (room temperature, 1 h). After this time, 17 µL of 4× LDS sample buffer (supplemented with 6 mM BCN and 5% BME) was added and mixed by vortexing gently. Samples were incubated for 10 min before boiling at 90° C. for 10 min. Samples were analysed by 4-12% SDS-PAGE and fluorescent images were acquired using a Typhoon Trio phosphoimager (GE Life Sciences).
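The labelling and quench quantities in this protocol can be checked with a short calculation. This is a sketch under stated assumptions: volumes are read as microlitres and concentrations as 2 mM stock and 20 µM final, the 0.5 µL dye volume is taken from the FIG. 7 legend, and the 6 mM BCN figure is assumed to be the concentration in the sample buffer as added. Function names are hypothetical.

```python
def stock_volume_uL(sample_uL, stock_uM, final_uM):
    """Stock volume that gives final_uM after mixing, accounting for the added volume."""
    return sample_uL * final_uM / (stock_uM - final_uM)

def final_concentration_uM(sample_uL, stock_uL, stock_uM):
    """Concentration reached after adding stock_uL of stock to sample_uL of sample."""
    return stock_uM * stock_uL / (sample_uL + stock_uL)

def amount_nmol(concentration_uM, volume_uL):
    """Amount in nmol (µM x µL gives pmol, so divide by 1000)."""
    return concentration_uM * volume_uL / 1000.0

dye_uL_needed = stock_volume_uL(50.0, 2000.0, 20.0)          # for exactly 20 µM final
dye_final_uM = final_concentration_uM(50.0, 0.5, 2000.0)     # when 0.5 µL is added
dye_nmol = amount_nmol(2000.0, 0.5)
bcn_nmol = amount_nmol(6000.0, 17.0)                         # assumes 6 mM BCN in the buffer
bcn_excess = bcn_nmol / dye_nmol

print(f"stock needed: {dye_uL_needed:.3f} µL; final with 0.5 µL: {dye_final_uM:.1f} µM")
print(f"BCN: {bcn_nmol:.0f} nmol vs dye: {dye_nmol:.1f} nmol ({bcn_excess:.0f}-fold excess)")
```

Under these assumptions, adding 0.5 µL of stock gives very nearly the stated 20 µM final concentration, and the quench supplies BCN in roughly hundred-fold molar excess over the dye.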

(39) The same protocol for fluorescent labelling of the E. coli proteins was applied for all tetrazine-dye conjugates.

Site-Specific Incorporation of 3 in HEK293 Cells and Chemoselective Labelling with Tetrazine Probes

Site Specific Incorporation of 3 in HEK Cells

(40) HEK293 cells (ATCC CRL-1573) were plated on 24 well plates and grown to near confluence. The cells were transfected using Lipofectamine 2000 (Invitrogen) with the pMmPylS-mCherry-TAG-EGFP-HA construct and the p4CMVE-U6-PylT construct..sup.18 After 16 hrs growth with or without 1 mM 3, or with 1 mM 1, the cells were lysed on ice using RIPA buffer (Sigma). The lysates were spun down and the supernatant was added to 4× LDS sample buffer (Life Technologies). The samples were run out by SDS-PAGE, transferred to a nitrocellulose membrane and blotted using primary rat anti-HA (clone 3F10, Roche, No. 867 423) and mouse anti-FLAG (clone G191, Abnova, cat. MAB8183); the secondary antibodies were anti-rat (Invitrogen, A11077) and anti-mouse (Cell Signaling Technologies, No. 7076S).

Labelling Site-Specifically Incorporated 3 from HEK 293 Cells

(41) Adherent HEK293T cells (ATCC CRL-11268; 4×10.sup.6 per immunoprecipitation) were transfected with 7.5 µg p4CMVE-U6-PylT and 7.5 µg pPylRS-mCherry-TAG-EGFP-HA.sup.18 using TransIT-293 transfection reagent according to the manufacturer's protocol and cultured for 48 hours in DMEM/10% FBS, supplemented with 0.5 mM 1 or 2 mM 3 where indicated. Cells were washed twice with PBS and lysed on ice for 30 minutes in 1 mL Lysis Buffer (150 mM NaCl, 1% Triton X-100, 50 mM Tris HCl (pH 8.0)). After clarifying the lysate by centrifugation (10 min at 16000 × g), HA-tagged proteins were captured using 50 µL MACS HA-tag MicroBeads (Miltenyi Biotec) per transfection and washed with 0.5 mL RIPA (150 mM NaCl, 1% Igepal CA-630, 0.5% sodium deoxycholate, 0.1% SDS, 50 mM Tris HCl (pH 8.0)) and 0.5 mL PBS (pH 7.4). The suspension of MicroBeads was incubated with 20 µM 4a in 50 µL PBS (pH 7.4) for 1 hour and subsequently washed with 0.5 mL RIPA to remove excess dye. HA-tagged proteins were eluted from the beads using SDS sample buffer and separated on a 4-12% Bis-Tris PAGE gel (Invitrogen), imaged using a Typhoon imager (GE Healthcare) and subsequently stained with DirectBlue or transferred for western blotting with Anti-HA-tag pAb-HRP-DirecT (MBL).

Expression and Purification of SfGFP from Mammalian Cells

(42) HEK293T cells were transfected in a 10 cm tissue culture dish with 15 µg DNA using PEI and incubated for 72 hours with 3 (0.5 M). Cells were washed twice with PBS and lysed in 1 mL RIPA buffer. Cleared lysate was added to 50 µL GFP-Trap M (ChromoTek) and incubated for 4 h. Beads were washed with 1 mL RIPA, 1 mL PBS, 1 mL PBS+500 mM NaCl and 1 mL ddH2O, and protein was eluted in 1% acetic acid/ddH2O. Purified protein was labeled with 2 µM 4a for 4 h and loaded on a 4-12% Bis-Tris PAGE gel. Fluorescence of 4a-labeled sfGFP was detected on a Typhoon imager and the gel was subsequently stained with DirectBlue.

Fly Plasmids, Transgenic Flies and Culture

(43) No randomisation or blinding was used for any of the fly experiments in this study.

Plasmid Construction for Transgenic Fly Line Generation

(44) The PyltRNA.sub.CUA anticodon was mutated using the QuikChange mutagenesis kit and pSG108 (pJet 1.2-U6-PylT, gift from S. Greiss) as a template. This contains the PylT gene without its 3′ terminal CCA fused to the Drosophila U6-b promoter. Primers FMT19 and FMT20 were used to generate PyltRNA.sub.TGC to decode alanine codons (creating pFT18); primers FMT23 and FMT24 were used to generate PyltRNAccr to decode serine codons (creating pFT20); primers FMT27 and FMT28 were used to generate PyltRNA.sub.CAG to decode leucine codons (creating pFT22); and primers FMT29 and FMT30 were used to generate PyltRNA.sub.CAT to decode methionine codons (creating pFT23). The mutated tRNA expression cassettes were subcloned from pFT18, pFT20, pFT22 and pFT23 into pUC18 using EcoRI and HindIII, then multimerised using AsiSI, BamHI and BglII to create 2, then 4 copies of the tRNA. The 4 copy versions of the tRNA cassette were subcloned into pSG118 using AsiSI and MluI to create pFT58 (Ala), pFT60 (Ser), pFT62 (Leu) and pFT63 (Met). pSG118 contains the M. mazei PylRS gene..sup.20
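The relationship between each altered anticodon and the codon it is intended to read can be illustrated with a reverse-complement lookup. This is a simplified sketch that assumes strict Watson-Crick pairing and ignores wobble; the codon table is a small subset of the standard genetic code, and the function name is hypothetical. The serine variant is omitted because its anticodon is not legible in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Subset of the standard genetic code covering the codons discussed in this document.
CODON_MEANING = {
    "GCA": "Ala",
    "CTG": "Leu",
    "ATG": "Met",
    "AAA": "Lys",
    "TAG": "amber stop",
}

def codon_for_anticodon(anticodon):
    """Return the Watson-Crick partner codon (5'->3', DNA letters) for an anticodon.

    Accepts RNA (U) or DNA (T) notation; wobble pairing is not modelled.
    """
    dna = anticodon.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(dna))

for anticodon in ["TGC", "CAG", "CAT", "UUU", "CUA"]:
    codon = codon_for_anticodon(anticodon)
    print(f"anticodon {anticodon} -> codon {codon} ({CODON_MEANING[codon]})")
```

The lookup reproduces the pairings stated in the text: TGC with an alanine codon, CAG with a leucine codon, CAT with the methionine codon, UUU with the AAA lysine codon, and CUA with the amber codon.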

Fly Lines and Culture Conditions

(45) Transgenic lines were created by P element insertion using a Drosophila embryo injection service (BestGene Inc.). Lines were generated using the following plasmids: pFT58 (Ala), pFT60 (Ser), pFT62 (Leu) and pFT63 (Met). nos-Gal4-VP16 (Bloomington 4937) and MS1096-Gal4 (Bloomington 8860) were used as Gal4 drivers. All flies were grown at 25° C. on standard Iberian medium. Flies were fed unnatural amino acids by mixing dried yeast with the appropriate concentration of amino acid (usually 10 mM) diluted in dH.sub.2O to make a paste. Ovaries were prepared from females that were grown on Iberian fly food supplemented with a yeast paste with or without the amino acid for a minimum of 48 hours. For proteome labelling experiments, transgenic male flies of constructs FT58, FT60, FT62 and FT63 were crossed with nos-vp16-GAL4 virgins to generate FT58/nos-vp16-GAL4, FT60/nos-vp16-GAL4, FT62/nos-vp16-GAL4 and FT63/nos-vp16-GAL4 respectively.

Site Specific Incorporation of 3 in D. melanogaster

Luciferase Assays

(46) Ovaries from 10 females of Triple Rep-L flies recombined with nos-Gal4-VP16 fed 3, 1 or no amino acid were dissected in 100 µl 1× Passive Lysis Buffer and processed for luciferase assays as previously described.sup.20.

Immunoprecipitation and Labelling of Site Specifically Incorporated 3

(47) Ovaries from 100 (for control and 3) or 500 (for 1) females were dissected in PBS then lysed in 300 or 1500 µl RIPA buffer containing 1× complete protease inhibitor cocktail (Roche). A sample was taken into 4× LDS buffer as a total lysate control, then the remainder was used for immunoprecipitation with GFP-TRAP agarose beads (Chromotek) following the manufacturer's instructions. The total volume of the IP was 3 ml. After overnight incubation, the beads were washed 2× with RIPA buffer then 2× with PBS. For tetrazine labeling, the beads were resuspended in 200 µl PBS + 4 µM 4g and incubated for 2 hours on a roller at RT. The beads were washed 3 times with 500 µL of wash buffer then resuspended in 4× LDS sample buffer.

Example 1: Synthesis of N.sup.ε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine 3

(48) A class of reaction useful in protein labelling is the very rapid and specific inverse electron demand Diels-Alder reaction between strained alkenes (or alkynes) and tetrazines..sup.21-25

(49) While we, and others, have previously encoded unnatural amino acids bearing strained alkenes, alkynes and tetrazines via genetic code expansion and demonstrated their use for site-specific protein labelling via inverse electron demand Diels-Alder reactions,.sup.26-30 all the molecules used to date are rather large. We have previously shown that a variety of carbamate derivatives of lysine are good substrates for PylRS,.sup.31 and it has been demonstrated that 1,3-disubstituted cyclopropenes, unlike 3,3-disubstituted cyclopropenes,.sup.32,24 react efficiently with tetrazines..sup.22 We therefore designed and synthesized a carbamate derivative of lysine bearing a 1,3-disubstituted cyclopropene (N.sup.ε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine 3, FIG. 1b) for incorporation into proteins and labelling with tetrazines.

Synthesis of N.sup.ε-[({2-Methylcycloprop-2-en-1-yl}methoxy)carbonyl]-L-lysine (3)

(50) ##STR00004##

i. Ethyl 2-methylcycloprop-2-ene-1-carboxylate S1

(51) A 100 mL 2-neck round bottom flask was charged with CH.sub.2Cl.sub.2 (2 mL) and rhodium acetate (442 mg, 1 mmol, 0.05 eq), and fitted with a dry ice condenser. Propyne (approx. 10 mL) was condensed into the rhodium acetate suspension and the flask lowered into a water bath (20° C.); a steady reflux of propyne was obtained. Ethyl diazoacetate (2.1 mL, 20 mmol, 1 eq) was added to the stirred propyne solution drop-wise over 1 h using a syringe pump. The reaction was stirred at room temperature for a further 10 minutes, after which TLC analysis showed the reaction to be complete. The cyclopropene product was then purified by silica gel flash column chromatography eluting with pentane and diethyl ether (90:10). This gave the desired product S1 as a colourless volatile liquid (1.9 g, 75% yield). .sup.1H NMR analysis δ.sub.H (400 MHz, CDCl.sub.3) 6.35 (1H, t, J 1.4), 4.18-4.09 (2H, m), 2.16 (3H, d, J 1.3), 2.12 (1H, d, J 1.6), 1.26 (3H, t, J 7.1); LRMS m/z (ES.sup.+) 127.2 [M+H].sup.+.

(52) These values are in good agreement with the literature (Liao, 2004).

ii. and iii. (2-Methylcycloprop-2-en-1-yl)methyl (4-nitrophenyl) Carbonate S3

(53) DIBAL-H (22.5 mL of a 1 M solution in CH.sub.2Cl.sub.2, 22.5 mmol, 1.5 eq) was added drop-wise to a stirred solution of cyclopropene ester S1 (1.9 g, 15 mmol, 1 eq) in CH.sub.2Cl.sub.2 (15 mL) at 10° C. The reaction was stirred at 10° C. for 20 minutes before quenching with the cautious addition of H.sub.2O (1 mL), then NaOH (1 mL of a 1 M solution in H.sub.2O) and H.sub.2O (2.3 mL). The mixture was stirred for a further 2 h at room temperature before it was dried (Na.sub.2SO.sub.4) and filtered. Hünig's base (3.9 mL, 22.5 mmol, 1.5 eq) was added to the filtrate (containing crude cyclopropene alcohol S2) followed by 4-nitrophenyl chloroformate (3.3 g, 16.5 mmol, 1.1 eq). After stirring at room temperature for 18 hours a significant colourless precipitate had formed, and TLC analysis showed complete consumption of the crude cyclopropene alcohol S2. The reaction was diluted with CH.sub.2Cl.sub.2 and then dry loaded onto silica gel, and the activated carbonate S3 was purified by silica gel column chromatography eluting with ethyl acetate and hexane (20:80). This gave the desired cyclopropene carbonate S3 as a colourless oil (2.7 g, 73% yield over 2 steps). .sup.1H NMR analysis δ.sub.H (400 MHz, CDCl.sub.3) 8.28 (2H, d, J 9.2), 7.39 (2H, d, J 9.2), 6.62 (1H, s), 4.21 (1H, dd, J 10.9, 5.3), 4.14 (1H, dd, J 10.9, 5.3), 2.18 (3H, d, J 1.3), 1.78 (1H, td, J 5.3, 1.3).

iv. N.sup.α-(Fmoc)-N.sup.ε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine S4

(54) Fmoc-Lys-OH.HCl (6.7 g, 16.5 mmol, 1.5 eq) was dissolved in THF (30 mL) and DMF (10 mL). To this solution was added Hünig's base (9.0 mL, 55.0 mmol, 5 eq) followed by cyclopropene carbonate S3 (2.7 g, 11.0 mmol, 1 eq); an immediate yellow coloration was observed upon addition of the carbonate. The reaction was stirred at room temperature for 6 hours, after which TLC analysis showed complete consumption of the starting material. The crude reaction mixture was dry loaded onto silica gel and the major product purified by silica gel column chromatography eluting with ethyl acetate, hexane and acetic acid (50:49:1 then 99:0:1). This gave the desired product S4 as a colourless gum (4.3 g, 82% yield). .sup.1H NMR analysis δ.sub.H (400 MHz, CDCl.sub.3) 7.77 (2H, t, J 7.6), 7.65-7.55 (2H, m), 7.39 (2H, t, J 7.6), 7.31 (2H, t, J 7.3), 6.54 (1H, s), 5.68-5.57 (1H, m), 4.84 (1H, br-s), 4.44-4.32 (2H, m), 4.22 (1H, t, J 7.0), 3.98-3.87 (1H, m), 3.17-3.09 (2H, m), 2.15-2.06 (6H, m), 1.99-1.86 (1H, m), 1.84-1.70 (1H, m), 1.68-1.59 (1H, m), 1.58-1.34 (2H, m); LRMS m/z (ES.sup.+) 479.3 [M+H].sup.+, 501.3 [M+Na].sup.+, m/z (ES.sup.−) 477.2 [M−H].sup.−.
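The isolated yields quoted for S1, S3 and S4 can be cross-checked against the stated masses and the amount of limiting reagent. The sketch below is illustrative only: the molecular formulae are derived here from the compound names (they are not stated in the text), average atomic weights are used, and the helper names are hypothetical.

```python
# Sketch: recompute percent yields for steps i to iv from the quoted masses.
# Molecular formulae are assumptions derived from the compound names.
ATOMIC_WEIGHT = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def molar_mass(formula: dict) -> float:
    """Average molar mass in g/mol for a formula given as {element: count}."""
    return sum(ATOMIC_WEIGHT[element] * count for element, count in formula.items())


def percent_yield(product_g: float, formula: dict, limiting_mmol: float) -> float:
    """Percent yield from isolated mass and the limiting reagent amount."""
    product_mmol = product_g / molar_mass(formula) * 1000.0
    return 100.0 * product_mmol / limiting_mmol


# (isolated mass in g, assumed formula, limiting reagent in mmol, reported yield in %)
STEPS = {
    "S1": (1.9, {"C": 7, "H": 10, "O": 2}, 20.0, 75),
    "S3": (2.7, {"C": 12, "H": 11, "N": 1, "O": 5}, 15.0, 73),
    "S4": (4.3, {"C": 27, "H": 30, "N": 2, "O": 6}, 11.0, 82),
}

for name, (grams, formula, mmol, reported) in STEPS.items():
    computed = percent_yield(grams, formula, mmol)
    print(f"{name}: computed {computed:.1f}%, reported {reported}%")
```

With these assumed formulae the recomputed yields agree with the reported values to within about one percentage point, the small differences being attributable to rounding of the quoted masses.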

N.sup.ε-[({2-methylcycloprop-2-en-1-yl}methoxy)carbonyl]-L-lysine 3

(55) N.sup.α-(Fmoc)-N.sup.ε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine S4 (3.5 g, 7.0 mmol, 1 eq) was dissolved in THF and H.sub.2O (3:1, 40 mL). To this solution was added sodium hydroxide (0.9 g, 22.6 mmol, 3.1 eq). The reaction was stirred at room temperature for 8 hours, after which time the reaction was adjudged complete by LC-MS analysis. The reaction mixture was diluted with H.sub.2O (100 mL) and the pH adjusted to 5 by the addition of HCl (1 M). The aqueous solution was washed with Et.sub.2O (5 × 100 mL), then concentrated to dryness, yielding a colourless solid. The solid was purified by preparative HPLC, the product fractions were combined and the solvent removed by freeze-drying. This gave N.sup.ε-[({2-methylcycloprop-2-en-1-yl}methoxy)carbonyl]-L-lysine 3 as a colourless solid. δ.sub.H (400 MHz, D.sub.2O) 6.45 (1H, s), 3.90-3.61 (2H, m), 3.09 (1H, t, J 6.4), 2.98-2.86 (2H, m), 1.92 (3H, s), 1.52-1.37 (2H, m), 1.37-1.22 (2H, m), 1.21-1.08 (2H, m), 0.83 (1H, d, J 5.2). LRMS m/z (ES.sup.+) 257.2 [M+H].sup.+, m/z (ES.sup.−) 255.2 [M−H].sup.−. δ.sub.C (100 MHz, D.sub.2O) 101.1 (CH), 72.3 (CH.sub.2), 55.9 (CH), 40.2 (CH.sub.2), 34.3 (CH.sub.2), 28.9 (CH.sub.2), 20.3 (CH.sub.2), 16.6 (CH.sub.3), 10.8 (CH). HRMS (ES.sup.+) Found: (M+Na).sup.+ 279.1302. C.sub.12H.sub.20O.sub.4N.sub.2Na required M.sup.+, 279.1315.
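The required value quoted for the sodium adduct can be reproduced from monoisotopic atomic masses. The following is a minimal sketch: the isotope masses are standard values rounded to seven decimals, the function name is hypothetical, and the ppm deviation is a derived figure that is not reported in the text.

```python
# Sketch: monoisotopic m/z of the singly charged [M+Na]+ ion of C12H20N2O4.
MONOISOTOPIC_MASS = {
    "C": 12.0000000,
    "H": 1.0078250,
    "N": 14.0030740,
    "O": 15.9949146,
    "Na": 22.9897693,
}
ELECTRON_MASS = 0.0005486  # removed once for a singly charged cation


def cation_mz(formula: dict) -> float:
    """m/z of a singly charged cation whose full composition is given."""
    neutral = sum(MONOISOTOPIC_MASS[el] * n for el, n in formula.items())
    return neutral - ELECTRON_MASS


required = cation_mz({"C": 12, "H": 20, "O": 4, "N": 2, "Na": 1})
found = 279.1302  # value quoted in the text
ppm_error = (found - required) / required * 1e6

print(f"required m/z = {required:.4f}")
print(f"deviation of found value = {ppm_error:.1f} ppm")
```

The computed value rounds to the required mass stated above, and the found value lies within a few ppm of it.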

Example 2: Encoding the Site-Specific Incorporation of 3 in E. coli

(56) We demonstrated that 3 is efficiently and site-specifically incorporated into recombinant proteins in response to the amber codon using the PylRS/tRNA.sub.CUA pair and an SfGFP gene bearing an amber codon at position 150 (Supplementary FIG. 2a). The yield of protein is 8 mg per litre of culture, which is greater than that obtained for a well-established efficient substrate for PylRS, N.sup.ε-[(tert-butoxy)carbonyl]-L-lysine 1 (4 mg per litre of culture)..sup.33 Electrospray ionisation mass spectrometry of SfGFP bearing 3 at position 150 (SfGFP-3) confirms the incorporation of the unnatural amino acid (Supplementary FIG. 2b). SfGFP-3 was specifically labelled with the fluorescent tetrazine probe 4a, while SfGFP-1 was left unlabelled (Supplementary FIG. 2b). 2 nmol of SfGFP-3 was quantitatively labelled with 10 equivalents of 4a in 30 minutes, as judged by both fluorescence imaging and mass spectrometry (Supplementary FIG. 2b). The second order rate constant for labelling SfGFP-3 with 4a was 27 ± 1.8 M.sup.−1 s.sup.−1 (Supplementary FIG. 2c)..sup.26
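To give a feel for what a second order rate constant of about 27 M.sup.−1 s.sup.−1 implies, the sketch below estimates labelling progress under pseudo-first-order conditions, that is, assuming the tetrazine probe is in large excess so that its concentration stays effectively constant. The 20 µM probe concentration is an illustrative assumption (the reaction concentrations for this experiment are not given here), and the function names are hypothetical.

```python
# Sketch: pseudo-first-order estimate of tetrazine labelling kinetics.
import math

K2 = 27.0  # second order rate constant, M^-1 s^-1 (value quoted in the text)


def half_life_s(k2: float, probe_molar: float) -> float:
    """Half-life of unlabelled protein when the probe is in large excess."""
    return math.log(2) / (k2 * probe_molar)


def fraction_labelled(k2: float, probe_molar: float, seconds: float) -> float:
    """Fraction of protein labelled after the given time (probe in excess)."""
    return 1.0 - math.exp(-k2 * probe_molar * seconds)


probe = 20e-6  # assumed probe concentration in mol/L, for illustration only
print(f"half-life: {half_life_s(K2, probe) / 60:.1f} min")
print(f"labelled after 1 h: {fraction_labelled(K2, probe, 3600):.0%}")
```

Because the observed rate scales linearly with probe concentration in this regime, raising the probe concentration shortens the labelling time proportionally; the numbers printed here depend entirely on the assumed concentration.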

(57) Since PylRS does not recognize the anticodon of its cognate tRNA,.sup.34 it is possible to alter the anticodon of this tRNA to decode distinct codons. We created a new tRNA in which the anticodon of PyltRNA.sub.CUA was converted from CUA to UUU (Supplementary Table 1), to decode a set of lysine codons. We added 0.1 mM 3 to cells containing PylRS, PyltRNA.sub.UUU, and the gene for T4 lysozyme. Following expression of T4 lysozyme we detected proteins in the lysate bearing 3 with the tetrazine probe 4a (20 µM, 1 h, Supplementary FIG. 3). Control experiments show that the observed labelling requires the presence of the synthetase and tRNA, and electrospray ionization mass spectrometry demonstrates the incorporation of 3 in place of lysine in T4 lysozyme (Supplementary FIG. 4). The addition of 3 (0.1 or 0.5 mM) has little or no effect on cell growth (Supplementary FIG. 5), suggesting that the amino acid is not toxic at the concentrations used, and there is substantial labelling within 1 h of amino acid addition (Supplementary FIG. 6).

Example 3: Genetic Encoding of 3 in Human Cells

(58) Full-length mCherry-3-EGFP-HA was expressed in HEK293 cells carrying the PylRS/tRNA.sub.CUA pair and mCherry-TAG-EGFP-HA (a fusion between the mCherry gene and the EGFP gene with a C-terminal HA tag, separated by the amber stop codon (TAG))..sup.18 Full-length protein was detected only in the presence of 3 (FIG. 8a; full gels in Supplementary FIG. 11). mCherry-3-EGFP-HA was selectively labelled with 4a, while mCherry-1-EGFP-HA was not labelled (FIG. 8b),.sup.18 demonstrating the site-specific incorporation of 3 with the PylRS/tRNA.sub.CUA pair in human cells.

Example 4: Genetic Encoding of 3 in D. melanogaster

(59) We demonstrated that 3 can be site specifically incorporated into proteins in D. melanogaster. To achieve this, we used flies containing the PylRS/tRNA.sub.CUA pair (with the tRNA expressed ubiquitously from a U6 promoter and UAS-PylRS expression directed to ovaries using a nos-vp16-GAL4 driver), and a dual luciferase reporter bearing an amber codon between firefly and renilla luciferase..sup.20 We observe a strong luciferase signal that is dependent on the addition of 1 or 3, and the dual luciferase signal is larger with 3. These experiments demonstrate that 3 is taken up by flies and is more efficiently incorporated in vivo in response to an amber codon than 1 (FIG. 10a), a known excellent substrate for PylRS. 3 may be supplied by feeding food supplemented with amino acid 3 at 10 mM. In additional experiments, we demonstrated by western blot the efficient incorporation of 3 into a GFP-TAG-mCherry-HA construct (Supplementary FIG. 15) expressed in ovaries.sup.20 (FIG. 10b), and the specific fluorescent labelling of the incorporated amino acid with 4g (FIG. 10c).

Example 5: Synthesis of Tetrazine-BODIPY FL 4d

(60) ##STR00005##

(61) Boc-protected Tetrazine S5 was synthesized using the procedure reported earlier.sup.6. 4 M HCl in dioxane (500 µL, 2.0 mmol) was added to a stirring solution of Tetrazine S5 (8 mg, 0.02 mmol) in DCM (500 µL). The reaction was carried out for 2 h at room temperature and the solvent was subsequently removed under reduced pressure to yield primary amine hydrochloride S6 as a pink solid (6 mg, 0.02 mmol, 100%). The compound was used directly in the next step without further purification.

ii. 4d

(62) BODIPY FL succinimidyl ester (5 mg, 0.013 mmol, Life Technologies) and Hünig's base (50 µl, 2.8 mmol) were added to the solution of Tetrazine-amine S6 (6 mg, 0.02 mmol) in dry DMF (1 mL). The reaction mixture was stirred at room temperature for 16 h. The reaction mixture was diluted with 4 ml of water and the product was purified by semi-preparative reverse phase HPLC using a gradient from 10% to 90% of buffer B in buffer A (buffer A: H.sub.2O; buffer B: acetonitrile). The identity and purity of the tetrazine-BODIPY FL conjugate 4d was confirmed by LC-MS. ESI-MS: [M−H].sup.−, calcd. 581.38, found 581.2.

Summary of Examples 1 to 5

(63) We have characterized the synthesis of, and the genetically encoded, site-specific incorporation of a cyclopropene containing amino acid 3, and demonstrated the quantitative labelling of 3, with tetrazine probes, in proteins expressed in E. coli, mammalian cells and D. melanogaster, thereby showing the widespread utility and industrial application of the present invention.

SUPPLEMENTARY REFERENCES TO EXAMPLES 1 to 5

(64) 1. Gautier, A. et al. Genetically Encoded Photocontrol of Protein Localization in Mammalian Cells. Journal of the American Chemical Society 132, 4086-4088 (2010).
2. Karp, N. A., Kreil, D. P. & Lilley, K. S. Determining a significant change in protein expression with DeCyder during a pair-wise comparison using two-dimensional difference gel electrophoresis. Proteomics 4, 1421-1432 (2004).
3. Karp, N. A. & Lilley, K. S. Design and analysis issues in quantitative proteomics studies. Proteomics 7 Suppl 1, 42-50 (2007).
4. Lilley, K. S. in Current Protocols in Protein Science (John Wiley & Sons, Inc., 2001).
5. Von Stetina, J. R., Lafever, K. S., Rubin, M. & Drummond-Barbosa, D. A Genetic Screen for Dominant Enhancers of the Cell-Cycle Regulator alpha-Endosulfine Identifies Matrimony as a Strong Functional Interactor in Drosophila. G3 (Bethesda) 1, 607-613 (2011).
6. Lang, K. et al. Genetically encoded norbornene directs site-specific cellular protein labelling via a rapid bioorthogonal reaction. Nat Chem 4, 298-304 (2012).

Example 6: Dual Labelling of Proteins

(65) The ability to attach two distinct molecules to programmed sites in proteins will facilitate a variety of applications including FRET.sup.1,2 to study protein structure, conformation and dynamics. Several approaches for doubly labeling proteins have been reported. One approach relies on the installation of one unnatural amino acid that is specifically labeled in combination with cysteine thiol labeling, but this approach is generally limited to proteins that do not contain free thiols..sup.3,4 Chemical ligation approaches can be combined with the genetic encoding of a single unnatural amino acid for protein labeling,.sup.5 but this may limit the size and/or sites that may be labeled. Perhaps the most generally applicable approach for protein double labelling is based on the genetic incorporation of two distinct amino acids in response to two distinct codons introduced at user defined sites in the gene of interest.

(66) An ideal strategy for dual labeling requires i) the efficient, cellular incorporation of two distinct unnatural amino acids into a protein that can be labelled in mutually orthogonal reactions, and ii) the development of mutually orthogonal reactions that allow the simultaneous addition of two molecules to the protein for rapid, quantitative labelling of the protein in aqueous media at physiological pH, temperature and pressure.

(67) Scheme A (FIG. 14) shows concerted, rapid, one-pot quantitative dual labelling of proteins in aqueous medium at physiological pH and temperature. (a) Unnatural amino acids and fluorophores used in this example. (b) Concerted labeling at an encoded terminal alkyne and an encoded cyclopropene via mutually orthogonal cycloadditions.

(68) The cellular, genetically directed incorporation of two distinct unnatural amino acids into proteins has been demonstrated in response to an amber and quadruplet codon,.sup.6 two distinct stop codons,.sup.7,8 or two distinct quadruplet codons..sup.9 We previously demonstrated the evolution of an orthogonal ribosome (ribo-Q1) that efficiently reads quadruplet codons and amber codons on orthogonal mRNA using cognate extended anticodon tRNAs or amber suppressors respectively..sup.6 We demonstrated that the pyrrolysyl-tRNA synthetase/tRNA pair and synthetically evolved derivatives of the MjTyrRS/tRNA pair are mutually orthogonal in their aminoacylation specificity and can be used to direct the incorporation of pairs of unnatural amino acids in response to amber and quadruplet codons..sup.6 We recently described several major advances in this system, including the evolution of a series of quadruplet decoding tRNAs based on the pyrrolysyl-tRNA synthetase (PylRS)/tRNA pair that efficiently direct the incorporation of unnatural amino acids in response to quadruplet codons using the evolved orthogonal translation machinery..sup.9 We demonstrated the very efficient incorporation of a matrix of pairs of unnatural amino acids using the evolved PylRS/tRNA.sub.UACU pair and derivatives of the MjTyrRS/tRNA.sub.CUA pair with orthogonal messages bearing TAG and AGTA codons and ribo-Q1..sup.9

(69) A limited range of chemistries has been investigated for the double labeling of proteins containing pairs of unnatural amino acids. The incorporation of azide- and alkyne-containing amino acids, and their non-quantitative labeling with alkyne and azide based fluorophores, has been reported,.sup.7 but this is not ideal for double labeling of proteins; if the encoded azide and alkyne are in proximity they can react to form a triazole in the protein, a strategy which allows genetically directed protein stapling,.sup.6 but precludes labeling with probes. Moreover, an efficient one-pot reaction is not feasible because the azide- and alkyne-bearing probes react with each other. The incorporation of ketone and azide containing amino acids has been reported,.sup.8,10 which allows one-pot reaction of the encoded ketone with alpha effect nucleophiles, and the azides with alkyne probes..sup.10 However, this approach is problematic because encoded azides are subject to reduction in many proteins when expressed in E. coli,.sup.8,11 which prevents quantitative labeling. Moreover, ketone labeling with alpha effect nucleophiles is very slow (rate constant approximately 10.sup.−4 M.sup.−1 s.sup.−1) and the reaction is optimal at pH 4-5.5,.sup.12 which limits its utility for many proteins that are denatured or precipitate when kept for long periods under acidic conditions.
We recently genetically installed a deactivated tetrazine containing amino acid.sup.13 and a norbornene containing amino acid.sup.14-16 into proteins using our optimized orthogonal translation system..sup.9 Because the rate of inverse electron demand Diels Alder reaction between the deactivated tetrazine and norbornene is very slow, but the tetrazine can react with bicyclononyne based probes and the norbornene can react with activated tetrazine probes, we were able to use this approach to specifically and quantitatively double label proteins..sup.9 While this approach has the advantage of proceeding in aqueous media at physiological pH, temperature and pressure, it does require sequential labeling steps (to avoid inverse electron demand reactions between probes), each of which takes several hours, with purification between steps. All approaches reported to date for doubly labeling proteins at genetically encoded unnatural amino acids take tens of hours to days to reach completion.

(70) An ideal approach to double label proteins would allow rapid one-pot labeling of genetically installed bio-orthogonal functional groups, proceed rapidly in aqueous media at physiological pH, temperature and pressure, and be implemented simply by adding the labeling reagents to a recombinant protein bearing the site specifically incorporated bioorthogonal groups. A promising pair of mutually orthogonal reactions for one-pot labeling under aqueous conditions at physiological pH are the Cu(I)-catalysed 3+2 cycloaddition between azides and terminal alkynes,.sup.17 and the inverse electron demand Diels Alder reaction of a strained alkene and a tetrazine.sup.18-23 (FIG. 11). The reaction of strained alkynes and azides can also be orthogonal to strained alkene tetrazine reactions, but since tetrazines react with strained alkynes this approach requires careful tuning of the rate constants for each reaction..sup.24 No combination of 3+2 cycloaddition and inverse electron demand Diels Alder reaction has been demonstrated for protein labelling.

(71) We demonstrated in examples 1 to 5 that a 1,3-disubstituted cyclopropene containing amino acid, 2 (referred to as 3 in examples 1 to 5 and elsewhere in this document), can be efficiently and site specifically incorporated into proteins using the PylRS/tRNA.sub.CUA pair..sup.25 This amino acid, unlike the 3,3-disubstituted cyclopropene incorporated for photoclick reactions,.sup.26 reacts with tetrazines.sup.19,27 with on-protein rate constants of 27 M.sup.−1 s.sup.−1..sup.25 Here we demonstrate the efficient genetic encoding of a terminal alkyne containing amino acid 1 and a cyclopropene containing amino acid 2 into a single protein and their rapid, quantitative, one-pot labeling with azide and tetrazine probes (FIG. 11). This work provides the first approach to the concerted double labeling of proteins in a one-pot process under aqueous conditions, at physiological pH, and provides a step change in the speed of double labeling, from days in previous work to 30 minutes in the approach reported here.

(72) Proteins containing either 1 or 2 were overexpressed to examine the specificity and orthogonality of the proposed labeling reactions. A fusion protein of glutathione-S-transferase and calmodulin (GST-CaM) with amino acid 1 at position 1 in calmodulin was expressed from cells containing ribo-Q1 (an evolved orthogonal ribosome.sup.6,28,29), O-gst-cam.sub.1TAG (a fusion gene between glutathione-S-transferase (gst) and calmodulin (cam) on an orthogonal message.sup.30 in which the first codon of cam is replaced with a TAG codon), and MjPrpRS/tRNA.sub.CUA (a synthetase/tRNA pair developed for incorporating 1 in response to the TAG codon).sup.31 grown in the presence of 1 (4 mM). The GST tag was subsequently removed by cleavage using thrombin at an engineered thrombin-cleavage site between GST and CaM. CaM1.sub.1 (CaM containing 1 at position 1, 100 pmol) was labelled with the azide containing fluorophore 3 (2 nmol), in a Cu(I)-catalysed click reaction. The reaction was quantitative as judged by both the quantitative shift of the fluorescently labelled protein by SDS-PAGE and electrospray ionization mass spectrometry (ESI-MS) (FIG. 11a).

(73) The cyclopropene containing amino acid, 2, was site specifically incorporated at position 40 of calmodulin. The modified protein was expressed in cells bearing the PylRS/tRNA.sub.CUA (that efficiently directs the site specific incorporation of 2),.sup.25 ribo-Q1, and O-gst-cam.sub.40TAG grown in the presence of 2 (1 mM). CaM2.sub.40 (100 pmol) (obtained after thrombin cleavage of the GST tag) was labelled with the tetrazine containing fluorophore 4 (2 nmole). The reaction was quantitative as judged by both the quantitative shift of the fluorescently labelled protein by SDS-PAGE and electrospray ionization mass spectrometry (ESI-MS) (FIG. 11b). CaM2.sub.40 was not labeled with 3 under the conditions that led to quantitative labeling of CaM1.sub.1 with 3 (FIG. 11a). Similarly, CaM1.sub.1 was not labeled with 4 under conditions where CaM2.sub.40 was quantitatively labeled with 4. These experiments demonstrate that the two labeling reagents react quantitatively with their target amino acid, but do not react with their non-targeted unnatural amino acid in proteins.

(74) Next we investigated labeling 1 and 2 within the same protein. We site-specifically incorporated 1 and 2 at positions 1 and 40 of calmodulin to produce CaM1.sub.12.sub.40 (FIG. 12). We directed the incorporation of amino acid 1 with an MjPrpRS/tRNA.sub.CUA pair and the incorporation of amino acid 2 with the evolved PylRS/tRNA.sub.UACU pair, which efficiently decodes the quadruplet AGTA codon on orthogonal messages using ribo-Q1..sup.9 Unnatural amino acids were incorporated in response to UAG and AGTA codons at positions 1 and 40 in calmodulin, within a GST-calmodulin gene on an orthogonal message (O-gst-cam.sub.1TAG-40AGTA). Expression of full-length GST-CaM1.sub.12.sub.40 was dependent on the addition of amino acids 1 and 2 to E. coli, and ESI-MS demonstrated the genetically directed incorporation of amino acids 1 and 2 (FIG. 12c). The yield of full length GST-CaM1.sub.12.sub.40 was 2 mg per L of culture.

(75) To determine the time required to quantitatively label CaM1.sub.12.sub.40 with azide 3 or tetrazine 4 we incubated 100 pmol of CaM1.sub.12.sub.40 with 2 nmol of either 3 or 4 and followed each reaction by both mobility shift on SDS-PAGE and fluorescent imaging upon labeling (FIG. 12b). These experiments demonstrate that fluorophore labeling is complete in 30 minutes.

(76) Next we investigated the labeling of CaM1.sub.12.sub.40 with both 3 and 4 (FIG. 13). We first tested the addition of 4 (2 nmol) to CaM1.sub.12.sub.40 (100 pmol) followed by purification to remove free 4, and subsequent labelling with 3 (2 nmol) (FIG. 13a lane 4). This led to efficient double labelling as judged by SDS-PAGE mobility shift and fluorescence imaging. Next we performed sequential labeling without purification by incubating CaM1.sub.12.sub.40 with 4 for 30 minutes and then adding 3 and click reagents and incubating for a further 30 min (FIG. 13a lane 5). This also led to efficient double labelling as judged by SDS-PAGE mobility shift and fluorescence imaging. Finally, we simultaneously added 4 (2 nmol), 3 (2 nmol) and click reagents to CaM1.sub.12.sub.40 (100 pmol) and incubated for 30 minutes (FIG. 13a lane 6). This again led to efficient double labelling as judged by SDS-PAGE mobility shift and fluorescence imaging. In all doubly labeled proteins we observe a decrease in the BODIPY-FL fluorescence relative to the singly labeled control upon excitation at 488 nm (compare lanes 4, 5, and 6 to lane 3 in FIG. 13a), consistent with in-gel Förster resonance energy transfer (FRET) between BODIPY-FL and BODIPY-TMR-X. ESI-MS further demonstrates that this concerted, one-pot protocol leads to genetically directed, efficient, rapid and quantitative double labeling of proteins.

(77) In summary, in this example we show an efficient and rapid protocol for expressing recombinant proteins bearing a site specifically incorporated alkyne and a site specifically incorporated cyclopropene. We demonstrate that the inverse electron demand Diels Alder reaction of an encoded 1,3 disubstituted cyclopropene and tetrazine probe, and the 3+2 cycloaddition reaction of the encoded alkyne and azide probe are mutually orthogonal to each other and to the functional groups in proteins. By combining the genetic encoding of an alkyne and a cyclopropene in a single protein and labelling with the mutually orthogonal reactions we demonstrate the concerted, one-pot rapid double labeling of a protein in aqueous media at physiological pH and temperature. This strategy has utility for doubly labeling proteins for a variety of studies and applications, and may be extended to the double labeling of diverse molecules in diverse cells and organisms.

(78) Note on example 6: The chemical designations in example 6 and in the corresponding figures (drawings) discussed in example 6 are self-contained and apply only to example 6. Chemical designations in the rest of this document are consistent with one another, with the exception of example 6. For example, the skilled reader will immediately appreciate that compound 2 of example 6 corresponds to compound 3 in the rest of this document (i.e. the exemplary cyclopropene amino acid of the invention). Compounds 3 and 4 of example 6 are the azide containing and tetrazine containing fluorophores, respectively.

REFERENCES TO EXAMPLE 6

(79) (1) Zhang, J.; Campbell, R. E.; Ting, A. Y.; Tsien, R. Y. Nature Reviews Molecular Cell Biology 2002, 3, 906.
(2) Kajihara, D.; Abe, R.; Iijima, I.; Komiyama, C.; Sisido, M.; Hohsaka, T. Nat Methods 2006, 3, 923.
(3) Brustad, E. M.; Lemke, E. A.; Schultz, P. G.; Deniz, A. A. J Am Chem Soc 2008, 130, 17664.
(4) Nguyen, D. P.; Elliott, T.; Holt, M.; Muir, T. W.; Chin, J. W. J Am Chem Soc 2011, 133, 11418.
(5) Wissner, R. F.; Batjargal, S.; Fadzen, C. M.; Petersson, E. J. J Am Chem Soc 2013, 135, 6529.
(6) Neumann, H.; Wang, K.; Davis, L.; Garcia-Alai, M.; Chin, J. W. Nature 2010, 464, 441.
(7) Wan, W.; Huang, Y.; Wang, Z.; Russell, W. K.; Pai, P. J.; Russell, D. H.; Liu, W. R. Angew Chem Int Ed Engl 2010, 49, 3211.
(8) Chatterjee, A.; Sun, S. B.; Furman, J. L.; Xiao, H.; Schultz, P. G. Biochemistry 2013.
(9) Wang, K.; Sachdeva, A.; Cox, D. J.; Wilt N. W.; Wallace, S.; Mehl, R. A.; Chin, J. W. submitted.
(10) Wu, B.; Wang, Z.; Huang, Y.; Liu, W. R. Chembiochem: a European journal of chemical biology 2012, 13, 1405.
(11) Sasmal, P. K.; Carregal-Romero, S.; Han, A. A.; Streu, C. N.; Lin, Z.; Namikawa, K.; Elliott, S. L.; Koster, R. W.; Parak, W. J.; Meggers, E. ChemBioChem 2012, 13, 1116.
(12) Rotenberg, S. A.; Calogeropoulou, T.; Jaworski, J. S.; Weinstein, I. B.; Rideout, D. Proceedings of the National Academy of Sciences of the United States of America 1991, 88, 2490.
(13) Seitchik, J. L.; Peeler, J. C.; Taylor, M. T.; Blackman, M. L.; Rhoads, T. W.; Cooley, R. B.; Refakis, C.; Fox, J. M.; Mehl, R. A. J Am Chem Soc 2012, 134, 2898.
(14) Lang, K.; Davis, L.; Torres-Kolbus, J.; Chou, C.; Deiters, A.; Chin, J. W. Nat Chem 2012, 4, 298.
(15) Plass, T.; Milles, S.; Koehler, C.; Szymański, J.; Mueller, R.; Wießler, M.; Schultz, C.; Lemke, E. A. Angewandte Chemie International Edition 2012, 51, 4166.
(16) Kaya, E.; Vrabel, M.; Deiml, C.; Prill, S.; Fluxa, V. S.; Carell, T. Angewandte Chemie International Edition 2012, 51, 4466.
(17) Wang, Q.; Chan, T. R.; Hilgraf, R.; Fokin, V. V.; Sharpless, K. B.; Finn, M. G. J Am Chem Soc 2003, 125, 3192.
(18) Devaraj, N. K.; Weissleder, R. Accounts of Chemical Research 2011, 44, 816.
(19) Yang, J.; Šečkutė, J.; Cole, C. M.; Devaraj, N. K. Angewandte Chemie International Edition 2012, 51, 7476.
(20) Blackman, M. L.; Royzen, M.; Fox, J. M. J Am Chem Soc 2008, 130, 13518.
(21) Lang, K.; Davis, L.; Wallace, S.; Mahesh, M.; Cox, D. J.; Blackman, M. L.; Fox, J. M.; Chin, J. W. J Am Chem Soc 2012, 134, 10317.
(22) Borrmann, A.; Milles, S.; Plass, T.; Dommerholt, J.; Verkade, J. M. M.; Wießler, M.; Schultz, C.; van Hest, J. C. M.; van Delft, F. L.; Lemke, E. A. ChemBioChem 2012, 13, 2094.
(23) Schoch, J.; Staudt, M.; Samanta, A.; Wiessler, M.; Jäschke, A. Bioconjug Chem 2012, 23, 1382.
(24) Karver, M. R.; Weissleder, R.; Hilderbrand, S. A. Angew Chem Int Ed Engl 2012, 51, 920.
(25) Bianco, A.; Elliott, T. S.; Townsley, F. M.; Pisa, R.; Davis, L.; Elsässer, S. J.; Ernst, R. J.; Lang, K.; Sachdeva, A.; Chin, J. W. Under Review.
(26) Yu, Z.; Pan, Y.; Wang, Z.; Wang, J.; Lin, Q. Angewandte Chemie International Edition 2012, 51, 10600.
(27) Kamber, D. N.; Nazarova, L. A.; Liang, Y.; Lopez, S. A.; Patterson, D. M.; Shih, H. W.; Houk, K. N.; Prescher, J. A. J Am Chem Soc 2013, 135, 13680.
(28) Wang, K.; Schmied, W. H.; Chin, J. W. Angew Chem Int Ed Engl 2012, 51, 2288.
(29) Wang, K.; Neumann, H.; Peak-Chew, S. Y.; Chin, J. W. Nature biotechnology 2007, 25, 770.
(30) Rackham, O.; Chin, J. W. Nature chemical biology 2005, 1, 159.
(31) Deiters, A.; Schultz, P. G. Bioorganic & Medicinal Chemistry Letters 2005, 15, 1521.

CPC classification: C07K14/43595; C07C271/22; C12N9/2462; C07K1/1072; C07C269/06; C07C2601/02; C12Y601/01; C12P21/02 (all CHEMISTRY; METALLURGY)

International classification: C07K14/00; C07C271/22; C12N9/36; C07K14/435; C07C269/06; C12P21/02; C07K1/107 (all CHEMISTRY; METALLURGY)