CIRCULARLY PERMUTATED HALOALKANE TRANSFERASE FUSION MOLECULES
20220275350 · 2022-09-01
Assignee
Inventors
- Julien HIBLOT (Heidelberg, DE)
- Magnus HUPPERTZ (Oldenburg, DE)
- Kai JOHNSSON (Heidelberg, DE)
- Wilhelm JONAS (Heidelberg, DE)
Cpc classification
C12N2320/50
CHEMISTRY; METALLURGY
C07K2319/70
CHEMISTRY; METALLURGY
International classification
Abstract
The invention relates to a modular polypeptide comprising a first partial effector sequence comprising a first part of a circularly permuted HaloTag protein connected to a sensor module sequence, which is connected to a second part of a circularly permuted HaloTag protein. The sensor module is a single polypeptide or a polypeptide pair capable of undergoing conformational change from a first conformation to a second conformation depending on the presence or concentration of an analyte compound. The modular polypeptide is catalytically active in response to an environmental stimulus or in response to the sensor pair interacting.
The invention further relates to nucleic acid sequences encoding the modular polypeptide, and to kits comprising same.
Claims
1. A modular polypeptide comprising or consisting essentially of a first partial effector sequence comprising or consisting essentially of an N-terminal first effector sequence part characterized by SEQ ID NO 002 or by a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 002, a C-terminal first effector sequence part characterized by SEQ ID NO 003 or by a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 003, an internal cpHalo linker consisting of 10 to 35 amino acids, particularly consisting of 12 to 20 amino acids, more particularly of ca. 15 amino acids, wherein the internal cpHalo linker connects the C-terminus of the N-terminal first effector sequence part to the N-terminus of the C-terminal first effector sequence part; connected to a sensor module sequence, which is connected to a second partial effector sequence comprising or consisting essentially of a sequence selected from SEQ ID NO 006 (PEP1), SEQ ID NO 007 (PEP2) and a sequence at least (≥) 75% identical (particularly ≥80%, 85%, 90% or ≥95% identical) to SEQ ID NO 007 (PEP2), particularly wherein said sequence at least (≥) 75% identical (particularly ≥80%, 85%, 90% or ≥95% identical) to SEQ ID NO 007 (PEP2) bears at least one mutation at position A151, R146, E147, T148, or T154 with respect to SEQ ID NO 007 (PEP2), more particularly a mutation selected from A151L, R146A, E147A, T148A, or T154A, wherein the first and second partial effector sequences together constitute a circularly permuted haloalkane dehalogenase, and are capable, when brought into close proximity of each other, to effect covalent attachment of a halogen alkane moiety, and wherein the sensor module sequence is selected from a. 
a single sensor polypeptide capable of undergoing conformational change from a first conformation to a second conformation depending on the presence or concentration of an analyte compound, wherein in the first conformation, the first and second partial effector sequences are in close proximity (lead to the first and second partial effector sequences constituting a catalytically active entity), and in the second conformation, the first and second partial effector sequences are not in close proximity (lead to the first and second partial effector sequences constituting a catalytically inactive entity), when the first partial effector sequence is attached to the C-terminus of the sensor module sequence (the single sensor polypeptide) and the second partial effector sequence is attached to the N-terminus of the sensor module sequence (the single sensor polypeptide) and b. a sensor polypeptide pair comprising a first sensor polypeptide and a second sensor polypeptide, wherein the first sensor polypeptide is covalently attached through a peptide bond to the first partial effector sequence and the second sensor polypeptide is covalently attached to the second partial effector sequence, the first sensor polypeptide and the second sensor polypeptide are capable of specific molecular interaction (protein-protein binding), and the first and second sensor polypeptides are part of separate polypeptide chains.
2. The modular polypeptide of claim 1, wherein the first partial effector sequence and the second partial effector sequence, when brought into close proximity of each other, are characterized by an activity of 10.sup.2 s.sup.−1M.sup.−1 in a fluorescence polarization assay using N-(10-(2-carboxy-5-((2-(2-((6-chlorohexyl)oxy)ethoxy)ethyl)carbamoyl)phenyl)-7-(dimethylamino)-9,9-dimethylanthracen-2(9H)-ylidene)-N-methylmethanaminium as the substrate.
3. The modular polypeptide of claim 1, wherein the first partial effector sequence and the second partial effector sequence, when brought into close proximity of each other, have at least 0.5%, particularly ≥1% or ≥2%, of the activity of SEQ ID NO 001.
4. The modular polypeptide according to claim 1, wherein the internal cpHalo linker comprises or consists of the amino acids G, A, J, S, T, P, C, V, M, particularly wherein the cpHalo linker comprises or consists of the amino acids G, S, A and T.
5. The modular polypeptide according to claim 1, wherein the first partial effector sequence comprises or essentially consists of a. SEQ ID NO 004, or of b. a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 004, or of c. a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to construct consisting of SEQ ID NO 002 joined by a linker to SEQ ID NO 003, particularly wherein the first and second partial effector sequences together are characterized by at least 0.5%, ≥1% or ≥2%, of the activity of SEQ ID NO 004 together with SEQ ID NO 007 (PEP2).
6. The modular polypeptide according to claim 1, wherein the first partial effector sequence is connected to the sensor module sequence by a first intermodular linker sequence, and/or the second partial effector sequence is connected to the sensor module sequence by a second intermodular linker sequence.
7. The modular polypeptide according to claim 1, wherein the sensor module sequence is a single sensor polypeptide that consists of an N-terminal first partial sensor sequence and a C-terminal second partial sensor sequence connected by a sensor linker sequence.
8. The modular polypeptide according to claim 1, wherein the first partial sensor sequence and the second partial sensor sequence are selected from a calmodulin-binding peptide and a calmodulin polypeptide, particularly wherein the first partial sensor sequence is a calmodulin polypeptide and the second partial sensor sequence is a calmodulin-binding peptide, more particularly wherein the calmodulin polypeptide is or comprises SEQ ID NO 009 (CaM), or a sequence at least 90% identical to SEQ ID NO 009 (CaM) and having substantially the same biological activity, and the calmodulin-binding peptide is or comprises SEQ ID NO 008 (M13), even more particularly wherein the sensor module sequence is constituted by a sensor polypeptide pair comprising: a. a first sensor peptide that is or comprises a calmodulin binding peptide, particularly wherein the calmodulin-binding peptide is or comprises SEQ ID NO 008 (M13), b. and a second sensor polypeptide that is or comprises a calmodulin polypeptide, particularly wherein the calmodulin polypeptide is or comprises SEQ ID NO 009 (CaM), or a sequence at least 90% identical to SEQ ID NO 009 (CaM) and having substantially the same biological activity, wherein the first sensor peptide is covalently attached through a peptide bond to the first partial effector sequence and the second sensor polypeptide is covalently attached to the second partial effector sequence, and the first and second sensor polypeptides are part of separate polypeptide chains, particularly wherein the first partial effector sequence is connected to the C-terminus of the first sensor peptide by a first intermodular linker sequence having 2 to 6 amino acids, and/or the second partial effector sequence is connected to the N-terminus of the second sensor polypeptide by a second intermodular linker having 2 to 6 amino acids, more particularly wherein the first intermodular linker sequence and the second intermodular linker sequence are dipeptides or tripeptides the 
amino acid constituents of which are each independently selected from G, S and T residues.
9. The modular polypeptide according to claim 8, wherein the modular polypeptide is characterized by a first polypeptide sequence consisting or comprising SEQ ID NO 010 (SPLT1) or a sequence at least 90% identical to SEQ ID NO 010 (SPLT1), and a second polypeptide sequence SEQ ID NO 011 (SPLT2) or a sequence at least 90% identical to SEQ ID NO 011 (SPLT2), wherein the first and the second polypeptide sequence together have at least 0.5%, particularly ≥1% or ≥2% of the activity of the combination of SEQ ID NO 010 (SPLT1) and SEQ ID NO 011 (SPLT2).
10. The modular polypeptide according to claim 8, wherein the sensor module sequence is constituted by a single sensor polypeptide, comprising, from N to C-terminus, a. a calmodulin polypeptide, particularly SEQ ID NO 009 (CaM), or a sequence at least 90% identical to SEQ ID NO 009 (CaM) and having substantially the same biological activity; b. a peptide linker, particularly a polyproline-type rigid helix, more particularly a P.sub.n proline polypeptide wherein n is selected from an integer from 15 to 35, optionally flanked by 1-4 amino acids; c. a calmodulin binding peptide (second partial sensor sequence), particularly a sequence comprising or consisting of SEQ ID NO 008 (M13). particularly wherein the modular polypeptide comprises or consists of a sequence selected from SEQ ID NO 013 (CONF1) and SEQ ID NO 014 (CONF2), or a sequence at least 90% identical to SEQ ID NO 013 (CONF1) or SEQ ID NO 014 (CONF2) having at least 0.5%, particularly ≥1% or ≥2% of the activity of SEQ ID NO 013 (CONF1).
11. The modular polypeptide according to claim 1, wherein the sensor module sequence comprises or essentially consists of a. an N-terminal part of a glutamate binding protein, particularly wherein the first sensor polypeptide is or comprises SEQ ID NO 020 (GLT1), or a sequence at least 90% identical to SEQ ID NO 020 (GLT1), and a C-terminal part of a glutamate binding protein, particularly wherein the second sensor polypeptide is or comprises SEQ ID NO 021 (GLT2), or a sequence at least 90% identical to SEQ ID NO 021 (GLT2) and having substantially the same biological activity, particularly a bacterial periplasmic glutamate binding protein, more particularly from Gltl; wherein the combination of the first sensor polypeptide and the second sensor polypeptide and have substantially the same biological activity as a combination of SEQ ID NO 020 (GLT1) and SEQ ID NO 021 (GLT2); or b. a sequence at least (≥) 90% identical to a construct consisting of SEQ ID NO 020 (GLT1) joined by a polypeptide linker to SEQ ID NO 021 (GLT2), particularly wherein the sensor module sequence is or comprises SEQ ID NO 022 (GLT3), or a sequence at least 90% identical to SEQ ID NO 022 (GLT3) and having substantially the same biological activity, particularly wherein the modular polypeptide is characterized by a first polypeptide sequence consisting of or comprising SEQ ID NO 023 (GLTIND1), or a sequence at least 90% identical to SEQ ID NO 023 (GLTIND1), and a second polypeptide sequence SEQ ID NO 024 (GLTIND2) or a sequence at least 90% identical to SEQ ID NO 024 (GLTIND2), wherein the first polypeptide sequence and the second polypeptide sequence together have at least 0.5%, particularly ≥1% or ≥2% of the activity of SEQ ID NO 025 (GLTIND3).
12. The modular polypeptide according to claim 1, wherein the sensor module sequence is constituted by a sensor polypeptide pair comprising: a. a first sensor polypeptide that is or comprises an FKBP12 polypeptide, particularly wherein the FKBP12 polypeptide is or comprises SEQ ID NO 015 (FKBP), or a sequence at least 90% identical to SEQ ID NO 015 (FKBP) and having substantially the same biological activity, b. and a second sensor polypeptide that is or comprises a FRB peptide, particularly wherein the FRB peptide is or comprises SEQ ID NO 016 (FRB), or a sequence at least 90% identical to SEQ ID NO 016 (FRB) and having substantially the same biological activity, wherein the first sensor polypeptide is covalently attached through a peptide bond to the first partial effector sequence and the second sensor polypeptide is covalently attached to the second partial effector sequence, and the first and second sensor polypeptides are part of separate polypeptide chains, particularly wherein first partial effector sequence is connected to the C-terminus of the first sensor polypeptide by a first intermodular linker sequence having 2 to 9 amino acids, and/or the second partial effector sequence is connected to the N-terminus of the second sensor polypeptide by a second intermodular linker having 2 to 9 amino acids, more particularly wherein the first intermodular linker sequence and the second intermodular linker sequence are tripeptides, for which the amino acid constituents are each independently selected from G, S and T residues.
13. The modular polypeptide according to claim 12, wherein the modular polypeptide is characterized by a first polypeptide sequence consisting or comprising SEQ ID NO 017 (RAPIND1) or a sequence at least 90% identical to SEQ ID NO 017 (RAPIND1) and a second polypeptide sequence selected from SEQ ID NO 018 (RAPIND2) and SEQ ID NO 019 (RAPIND3) or a sequence at least 90% identical to SEQ ID NO 018 (RAPIND2), wherein the first and the second polypeptide sequence together have at least 50% of the activity of the combination of SEQ ID NO 017 (RAPIND1) and SEQ ID NO 018 (RAPIND2).
14. A nucleic acid sequence, or a plurality of nucleic acid sequences, encoding a modular polypeptide according to claim 1.
15. A combination of nucleic acid sequences comprising a. a first nucleic acid sequence encoding a first partial effector sequence, wherein the encoded first partial effector sequence comprises, from N to C-terminus, i. SEQ ID NO 002, or a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 002, ii. a polypeptide linker sequence having 10-35 (particularly approx. 15) amino acids, more particularly a polypeptide linker sequence having 12-20 amino acids selected from G, A, J, S, T, iii. SEQ ID NO 003 or a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 003; b. a second nucleic acid sequence encoding a second partial effector sequence characterized by SEQ ID NO 006 (PEP1) or 007 (PEP2), or encoding a sequence at least (≥) 95% identical (particularly ≥96%, 97%, 98% or ≥99% identical) to SEQ ID NO 006 (PEP1), wherein the first and second partial effector sequences together constitute a circularly permuted haloalkane dehalogenase, and are capable, when brought into close proximity of each other, to effect covalent attachment of a halogen alkane moiety.
16. A nucleic acid expression system comprising a. the nucleic acid sequence according to claim 14, or b. a first nucleic acid sequence encoding a first partial effector sequence, wherein the encoded first partial effector sequence comprises, from N to C-terminus, SEQ ID NO 002, or a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 002, a polypeptide linker sequence having 10-35 (particularly approx. 15) amino acids, more particularly a polypeptide linker sequence having 12-20 amino acids selected from G, A, J, S, T, and SEQ ID NO 003 or a sequence at least (≥) 90% identical (particularly ≥93%, 95%, 97% or ≥98% identical) to SEQ ID NO 003; and a second nucleic acid sequence encoding a second partial effector sequence characterized by SEQ ID NO 006 (PEP1) or 007 (PEP2), or encoding a sequence at least (≥) 95% identical (particularly ≥96%, 97%, 98% or ≥99% identical) to SEQ ID NO 006 (PEP1), wherein the first and second partial effector sequences together constitute a circularly permuted haloalkane dehalogenase, and are capable, when brought into close proximity of each other, to effect covalent attachment of a halogen alkane moiety, and wherein each of nucleic acid sequences a. and b. are under control of a promoter sequence.
17. A cell comprising the nucleic acid expression system according to claim 16, particularly wherein the promoter is operable in said cell.
18. A non-human transgenic animal or plant comprising the nucleic acid expression system according to claim 16.
19. A kit comprising a nucleic acid sequence according to claim 14, and a halotag7 substrate.
Description
BRIEF DESCRIPTION OF THE FIGURES
[0116]-[0132] (Figure descriptions not reproduced.)
BRIEF DESCRIPTION OF THE DESCRIBED SEQUENCES
[0133] The nucleic and/or amino acid sequences provided herewith are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids, as defined in 37 C.F.R. 1.822. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand. The Sequence Listing is submitted as an ASCII text file named 95083_381_2_ST25.txt, created Apr. 3, 2022, about 53 KB, which is incorporated by reference herein.
EXAMPLES
Example 1: Design of a Calcium Signal Integrator Based on Split HaloTag7
[0134] HaloTag7 is a self-labelling protein derived from the haloalkane dehalogenase DhaA of Rhodococcus rhodochrous that specifically reacts with and covalently binds a synthetic chloroalkane ligand. A split system was generated, wherein the original termini of HaloTag7 were connected via a (GGS/T).sub.5 linker and a peptide was excised from the HaloTag7 protein in between the CP sites of cpHalo141-145 (cpHaloTag7 with new termini at position 141 and 145) and cpHalo156-154. The part between those positions was excised to generate a split consisting of cpHalo156-141 and the short 9mer peptide from position 145 to 153/154.
[0135] Since the cpHaloΔ9mer-9merPeptide couple showed promising preliminary results, a first version of a calcium integrator was designed by fusing them via GGS linkers to an M13 peptide and a calmodulin protein, respectively. At elevated calcium concentration, calmodulin binds up to four calcium ions resulting in a large conformational change that strongly increases its affinity to the M13 recognition peptide. The resulting association of calmodulin and M13 leads to the complementation of cpHaloΔ9mer by the 9mer peptide. Upon complementation, the enzyme regains its activity and is able to react with fluorescent HaloTag substrates, leaving a permanent mark and thus integrating the signal.
Example 2: Affinity Between cpHaloΔ9mer and the 9mer Peptide
[0136] Isothermal Titration Calorimetry
[0137] The affinity between cpHaloΔ9mer and the 9mer peptide was measured via a label-free approach using isothermal titration calorimetry (ITC).
[0138] The 9mer peptide is soluble at ˜30 mM in activity buffer (HEPES 50 mM pH 7.3-NaCl 50 mM) and forms a gel at higher concentrations in a strongly temperature dependent manner.
[0139] This maximal concentration of peptide was titrated against a 0.6 mM cpHaloΔ9mer (
Example 3: Affinities of Fluorescent Ligands to HaloTag7 and cpHaloΔ9mer
[0140] The inventors decided to perform the experiment with some representative fluorescent HaloTag substrates. Exchanging the aspartate106 residue in the active site for an alanine removes the nucleophile responsible for the self-labelling reaction and eliminates the catalytic activity of the protein. Catalytically dead mutants of HaloTag7 (HaloTag7D106A) and cpHaloΔ9mer (cpHaloΔ9merD106A) were generated to measure substrate affinities without displacement of the equilibrium by the enzymatic reaction.
[0141] Affinities for the different fluorescent substrates towards HaloTag7D106A were determined in an FP binding assay (
Example 4: Background Labelling of cpHaloΔ9mer
[0142] A well-performing protein complementation assay requires a low background. The inventors thus measured the extent of background labelling of cpHaloΔ9mer with different fluorophores (Halo-TMR, Halo-CPY and Halo-SiR). To this end, cpHaloΔ9mer was incubated with an excess of the different fluorescent ligands.
[0143] The labelling efficiency at 37° C. and at different time points was determined via in-gel fluorescence measurements (
Example 5: Characterization and Optimization of a Calcium Signal Integrator Based on Split HaloTag7: Linker Optimization
[0144] The initial design of a calcium integrator consisted of the fusion of cpHaloΔ9mer to an M13 peptide and the 9mer peptide to calmodulin. Both fusion partners were taken from the calcium sensing domain of GCaMP6f. Kinetic measurements by fluorescence polarization using Halo-TMR as a substrate revealed no background in the absence of calcium and a second order rate constant for the labelling of (6.7+/−0.5)*10.sup.3 s.sup.−1M.sup.−1 in the presence of calcium.
[0145] Optimal linkers should assist the placement of the 9mer peptide at a good distance and in the proper orientation to complement cpHaloΔ9mer. Therefore, three variants of each part of the split were produced with varying linker lengths ((GGS).sub.1-3). These constructs allowed all combinations of linker lengths to be screened in order to identify the optimal combination. For both the M13-cpHaloΔ9mer linker and the 9mer-calmodulin linker, it was observed that increasing the linker length leads to a significant loss in calcium induced activity. Interestingly, the initially chosen single GGS linkers clearly performed best, since they showed the fastest labelling kinetics with calcium and no detectable background after 2.5 h (
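To give a feeling for what a second order rate constant of this magnitude means in practice, the following minimal sketch converts it into an approximate labelling half-time. It assumes pseudo-first-order conditions (protein in excess over the fluorescent substrate) and a protein concentration of 200 nM; both are illustrative assumptions, and the function name is hypothetical.

```python
import math


def labelling_half_time(k2, protein_conc):
    """Approximate labelling half-time in seconds.

    k2           -- second order rate constant in M^-1 s^-1
    protein_conc -- protein concentration in M, assumed to be in excess
                    over the fluorescent substrate (pseudo-first-order)
    """
    k_obs = k2 * protein_conc  # apparent first-order rate constant, s^-1
    return math.log(2) / k_obs


# Rate constant taken from the text; 200 nM protein is an assumed value.
t_half = labelling_half_time(6.7e3, 200e-9)
print(f"half-time: about {t_half / 60:.1f} min")
```

Under these assumptions the labelling proceeds on a timescale of minutes, which is why a plate reader is adequate for such constructs.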
Example 6: Design and Testing of an Intramolecular Calcium Integrator Based on Split HaloTag7
[0146] The inventors designed a simplified intramolecular calcium integrator that they refer to as “CaProLa” (calcium dependent protein labeling) by fusing the two integrator compartments between calmodulin and M13 using different linker domains (
[0151] Each construct as well as the split system were tested in an FP kinetic assay using Halo-TMR as a substrate in four different conditions (
[0156] All tested variants showed similar rate constants ranging between 3*10.sup.3 s.sup.−1M.sup.−1 and 5*10.sup.3 s.sup.−1M.sup.−1 in the presence of calcium. Furthermore, induction via calcium spiking after one hour and the reversibility experiments were successful, suggesting that once the 9mer peptide has bound to the cpHaloΔ9mer structure, it is still able to unbind (even in an intramolecular system), which gives the sensor good dynamics.
[0157] However, differences can be seen in the background labelling of the sensors. The split integrator showed no detectable background over 2 h, while the CaProLa constructs exhibited background labelling of varying extent, correlating with the M13-CaM linker rigidity. The least background was observed with the Pro.sub.30 linker and the Pro.sub.15-SNAP-Pro.sub.15 domain.
Example 7: Calcium Responsivity of the Different CaProLa Constructs
[0158] The calmodulin-M13 pair chosen for the first design of CaProLa was taken from GCaMP6f. Depending on the structural context, the responsivity of the pair toward calcium can vary. The calmodulin moiety binds up to four calcium ions with a very complex allosteric behaviour. However, if incorporated in a sensor, a simple titration of the sensor activity against the free calcium concentration leads to the identification of an EC50 that represents the response range of the sensor.
[0159] The calcium dependence of CaProLa constructs with different M13-CaM linkers was characterized by measuring the calcium dependent EC.sub.50. To this end, labelling at different calcium concentrations was monitored in an FP kinetic assay. To achieve defined calcium concentrations in the nanomolar range, a K.sub.2EGTA-CaEGTA buffered system was used. Initial reaction rates were determined to calculate the calcium dependent EC50 (
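As an illustration of how a K.sub.2EGTA-CaEGTA buffer sets the free calcium concentration, the sketch below applies the usual buffer relation [Ca]free = K.sub.d × [CaEGTA]/[K.sub.2EGTA]. The apparent K.sub.d of EGTA for calcium (150 nM here) and the 10 mM total EGTA are assumed placeholder values, not taken from the text; the real K.sub.d depends strongly on pH, temperature and ionic strength and has to be determined for the buffer actually used.

```python
def free_calcium(kd_egta, ca_egta, k2_egta):
    """Free Ca2+ concentration in a CaEGTA/K2EGTA buffer.

    Uses [Ca]free = Kd * [CaEGTA] / [K2EGTA]; the result has the unit of
    kd_egta. ca_egta and k2_egta only need to share a common unit.
    """
    if k2_egta <= 0:
        raise ValueError("free EGTA must be present for the buffer relation to hold")
    return kd_egta * ca_egta / k2_egta


KD_EGTA_NM = 150.0    # assumed apparent Kd of EGTA for Ca2+, in nM
TOTAL_EGTA_MM = 10.0  # assumed total EGTA (CaEGTA + K2EGTA), in mM

for ca_egta_mm in (1.0, 2.0, 5.0, 8.0, 9.0):
    free_nm = free_calcium(KD_EGTA_NM, ca_egta_mm, TOTAL_EGTA_MM - ca_egta_mm)
    print(f"CaEGTA {ca_egta_mm:4.1f} mM -> free Ca2+ about {free_nm:7.1f} nM")
```

Varying the CaEGTA to K.sub.2EGTA ratio in this way spans the nanomolar to low micromolar range around the assumed K.sub.d.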
Example 8: Tuning the Calcium Responsivity of CaProLa Constructs
[0160] The resting calcium concentration in neurons is reported to be 50 nM to 100 nM. As a consequence, the first generation of CaProLa was considered to be too sensitive towards calcium to be functional in neurons. Thus, a second generation of CaProLa was designed with the aim to generate different constructs exhibiting different calcium responsivities, especially with increased EC.sub.50.
[0161] The calmodulin-M13 couple has been studied extensively, and a large number of mutations have been reported and used in sensors to modify the calcium responsivity. The inventors thus decided to base their design on as yet unpublished versions of the calcium integrator CaMPARI2 deposited on Addgene (Schreiter, E. Addgene plasmids #101061, #101062 and #101064). These CaMPARI2 variants are annotated with EC.sub.50 values ranging from 110 nM to 825 nM and were designed for a similar application.
[0162] Three of the modified M13 peptides were implemented in a second generation of CaProLa constructs (CaProLa 2.1-2.3). These constructs are all based on CaProLa 1.3 (Pro.sub.30 M13-CaM linker) due to its low background. EC.sub.50 values for the new CaProLa constructs were determined as described above (
[0163] EC.sub.50 values of CaProLa 2.1-2.3 are comparable to the values annotated for CaMPARI2 (table 2). CaProLa 2.1 and 2.2 both feature an EC.sub.50 significantly higher than the version 1.4 which might be appropriate for the integration of neuronal calcium waves (500 nM-10 μM).
[0164] Similar to the first generation, all new CaProLa constructs were tested regarding calcium induced kinetics, reversibility and background labelling in an FP kinetic experiment (
[0165] The CaProLa 2.2 construct was then tested in an in-gel fluorescence assay, to confirm the results obtained via FP assays (
Example 9: Fluorescence Polarization-Based Assay to Test the Performance of HaloTag7 or Circular Permutations of HaloTag7
[0166] Production and Purification of Proteins
[0167] Proteins (HaloTag7-cpHalo variants or X-ProLa) fused to purification tags (His-tag and potentially Strep-tag) are expressed in the Escherichia coli BL21(DE3)-pLysS strain and purified by classic IMAC (and potentially StrepTrap affinity chromatography). After buffer exchange and concentration, the N-terminal His-tag is removed, if necessary (as is the case with the cpHalo variants), by TEV cleavage and reverse IMAC purification. The buffer is then exchanged to a suitable buffer (e.g. 50 mM NaCl, 50 mM HEPES, pH 7.3). If required, the proteins can be further purified by size exclusion chromatography in the same buffer.
[0168] Fluorescence Polarization Assay
[0169] Labelling kinetics are measured by mixing 100 μL of protein at 400 nM with 100 μL of fluorescent HaloTag substrate (e.g. Halo-CPY) at 100 nM in a 96 well plate (black, non-binding, flat bottom) in buffer (50 mM NaCl, 50 mM HEPES, pH 7.3, 0.5 mg/ml BSA). The increase in fluorescence polarization is recorded using a microplate reader with appropriate spectral filters/monochromators (TECAN Spark20). Since the kinetics of cpHalo variants and HaloTag7 are usually extremely fast, it is mandatory to use a plate reader with an internal injector to minimize the offset between mixing and the start of the measurement. However, even with this equipment it might be impossible to observe a reaction that can complete in less than a second. In this case, a stopped flow setup capable of measuring fluorescence polarization with a high sampling rate is needed (e.g. BioLogic SFM). The decreased sensitivity of such instruments may require an increase in fluorescent substrate and protein concentrations (e.g. 1 μM substrate and 10 μM protein mixed 1:1).
[0170] Additionally, a fluorescence polarization time course without any protein is always recorded and subtracted from the data to account for dilution and evaporation effects. The kinetic data obtained are fitted to a second order reaction rate law (see equation below) to derive a second order rate constant (k). In order to estimate errors, the experiment should be performed at least in triplicate. To compare different variants, all assays need to be performed with the same concentrations and substrates.
with:
[0171] t=time
[0172] FP.sub.0=FP at t=0
[0173] FP.sub.top=upper plateau
[0174] k=second order rate constant
[0175] A.sub.0=starting concentration of reactant A
[0176] B.sub.0=starting concentration of reactant B
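The fitted equation itself is not shown in this text. A minimal sketch of a model that uses exactly the symbols listed above is given here, assuming the standard integrated rate law for an irreversible reaction A + B → AB and assuming that A is the fluorescent substrate, so that the FP signal scales with the labelled fraction [AB]/A.sub.0. The function names are hypothetical and this is not necessarily the exact expression used by the inventors.

```python
import math


def product_conc(t, k, a0, b0):
    """[AB](t) for irreversible A + B -> AB, d[AB]/dt = k (a0 - [AB]) (b0 - [AB])."""
    if math.isclose(a0, b0, rel_tol=1e-12):
        # Equal starting concentrations: limiting form of the rate law.
        return a0 * a0 * k * t / (1.0 + a0 * k * t)
    x = (b0 - a0) * k * t
    if x > 0:
        # Written with exp(-x) so that large t cannot overflow.
        em = math.exp(-x)
        return a0 * b0 * (1.0 - em) / (b0 - a0 * em)
    e = math.exp(x)
    return a0 * b0 * (e - 1.0) / (b0 * e - a0)


def fp_signal(t, fp0, fp_top, k, a0, b0):
    """FP time course; A is assumed to be the fluorescent (FP-reporting) species."""
    labelled_fraction = product_conc(t, k, a0, b0) / a0
    return fp0 + (fp_top - fp0) * labelled_fraction


# Illustrative values only: 50 nM substrate (A), 200 nM protein (B).
for t in (0, 60, 300, 1800):
    print(t, round(fp_signal(t, 50.0, 200.0, 5e3, 50e-9, 200e-9), 1))
```

In a fit, FP.sub.0, FP.sub.top and k would be the free parameters, while A.sub.0 and B.sub.0 are fixed to the known starting concentrations.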
[0177] Generalization of the Assay for any X-ProLa Variant
[0178] The fluorescence polarization assay is also used to test the performance of any X-ProLa variant. The general procedure is the same as above. However, since these constructs are often slower than HaloTag7 or its CP variants, the plate reader assay may be sufficient. In addition to the fluorescent substrate, the respective metabolite/ion/small molecule that activates the sensor also needs to be added. By recording labelling kinetics with and without the metabolite, the signal over background can be measured, and by titrating different metabolite concentrations, an EC50 value of the X-ProLa can be derived (EC50 is defined as the concentration at which the speed of labelling is half of the maximum speed of labelling).
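The two read-outs described above can be sketched as follows. This is a minimal illustration, assuming that labelling rates have already been extracted from the kinetic traces; the EC50 is estimated by interpolating, on a logarithmic concentration axis, the point at which the rate reaches half of the maximum observed rate. Function names and the synthetic titration data are hypothetical.

```python
import math


def signal_over_background(rate_with, rate_without):
    """Ratio of labelling rates with and without the activating analyte."""
    return rate_with / rate_without


def ec50_from_titration(concs, rates):
    """Concentration at which the rate is half of the maximum observed rate.

    Assumes concs are sorted in ascending order and rates increase with
    concentration; interpolates linearly on a log10 concentration axis.
    """
    half = max(rates) / 2.0
    points = list(zip(concs, rates))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo <= half <= r_hi and r_hi > r_lo:
            frac = (half - r_lo) / (r_hi - r_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("half-maximal rate is not bracketed by the data")


# Synthetic titration (hypothetical): simple saturation curve, EC50 = 200 nM.
true_ec50_nm = 200.0
concs_nm = [10 ** (1 + 0.25 * i) for i in range(13)]  # 10 nM to 10 uM
rates = [c / (c + true_ec50_nm) for c in concs_nm]
print(f"estimated EC50: {ec50_from_titration(concs_nm, rates):.0f} nM")
```

Because the maximum is taken from the data, the estimate is only reliable when the titration reaches saturation.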
Example 10: Protein-Protein Assaying System
[0179] The inventors further tested the performance of the split-HaloTag system of the invention for labelling protein-protein interactions in a simple model system. They used the strong interaction of the proteins FKBP and FRB, which is conditional on the presence of the small-molecule drug rapamycin. After fusing the split HaloTag fragments to FKBP and FRB, labelling was observed only in the presence of rapamycin, showing that the strategy works in a model system (
TABLE 1 Affinities of fluorescent ligands to the catalytically dead mutants HaloTag7.sub.D106A and cpHaloΔ9mer.sub.D106A. K.sub.d values are given with the standard error resulting from the non-linear regression.
Protein | Fluorescent ligand | K.sub.d [μM]
HaloTag7.sub.D106A | Halo-carbopyronine | 0.622 +/− 0.0037
HaloTag7.sub.D106A | Halo-tetramethylrhodamine | 6.68 +/− 0.16
HaloTag7.sub.D106A | Halo-tetramethylrhodamine-azetidine | 4.40 +/− 0.022
HaloTag7.sub.D106A | Halo-Oregon Green | 39.6 +/− 2.15
HaloTag7.sub.D106A | Halo-Alexa Fluor 488 | 94.0 +/− 2.01
HaloTag7.sub.D106A | Halo-silicon-rhodamine | 22.7 +/− 1.4
HaloTag7.sub.D106A | Halo-silicon-rhodamine-azetidine (JF646) | 54.6 +/− 0.5
HaloTag7.sub.D106A | Halo-silicon-rhodamine-3-fluoroazetidin (JF635) | 172.2 +/− 8.2
cpHaloΔ9mer.sub.D106A | Halo-carbopyronine | 90.5 +/− 1.8
cpHaloΔ9mer.sub.D106A | Halo-tetramethylrhodamine | 115 +/− 1.8
cpHaloΔ9mer.sub.D106A | Halo-tetramethylrhodamine-azetidine | 227 +/− 6.7
Example 11: Scanning the Mutation Tolerance on the Complementing Peptide
[0180] Experimental Procedure
[0181] Kinetics by fluorescence polarization are performed in buffer (50 mM NaCl, 50 mM HEPES, 100 μM EGTA, 0.5 g/l BSA, pH 7.3). 100 μl of 400 nM protein with 100 nM Halo-TMR in buffer are placed in a black flat bottom 96 well plate equilibrated at 37° C. The reaction is triggered by injecting 100 μl of 10 mM CaCl.sub.2 in buffer. Final concentrations: 200 nM protein, 50 nM Halo-TMR, 5 mM CaCl.sub.2. Additional background wells without protein are included. Fluorescence polarization is read out until a plateau is reached. Background values are subtracted from the measurements. A second order reaction rate is fitted to obtain a k.sub.2.sup.app.
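The final concentrations stated above follow directly from the 1:1 mixing of two 100 μl volumes. The short sketch below reproduces that arithmetic; the helper name is hypothetical.

```python
def final_conc(stock_conc, stock_volume_ul, total_volume_ul):
    """Concentration after dilution; the result keeps the unit of stock_conc."""
    return stock_conc * stock_volume_ul / total_volume_ul


total_ul = 100 + 100  # protein/substrate mix plus injected CaCl2 solution

protein_nm = final_conc(400, 100, total_ul)   # 400 nM protein in the mix
halo_tmr_nm = final_conc(100, 100, total_ul)  # 100 nM Halo-TMR in the mix
cacl2_mm = final_conc(10, 100, total_ul)      # 10 mM CaCl2 in the injected volume

print(protein_nm, "nM protein;", halo_tmr_nm, "nM Halo-TMR;", cacl2_mm, "mM CaCl2")
```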
[0182] Results
[0183] Mutations in the 10mer peptide that complements the activity of cpHaloΔ have a large direct effect on the sequence conservation (%) relative to the native sequence. The inventors therefore performed an alanine scan over the peptide in the context of an already optimized CaProLa construct in order to evaluate the influence of peptide mutations on the overall labeling kinetics at calcium saturation (
[0184] Side-by-side comparisons of labeling kinetics suggest that:
[0185] Mutation of Ala145 into leucine affects the integrator kinetics. This can be explained by the tight hydrophobic packing in the area: the bulk of a leucine might disrupt this packing and reduce the ability of the peptide to fold into an α-helix and/or interact with the substrate.
[0186] Mutations of Arg146, Glu147 and Thr148 into alanine were not detrimental to integrator function. The inventors hypothesize that only the ability to form an α-helix is essential at these positions.
[0187] Mutations of Phe149 and Gln150 drastically reduce the integrator kinetics, especially in the case of the phenylalanine, which participates in the hydrophobic core of the substrate accommodation site. The Gln is more surface exposed but seems to cap the region and help in the proper folding of the peptide.
[0188] Mutation of A151 into leucine unexpectedly led to an increase in labeling speed as compared to the parental protein (˜3 fold). The inventors therefore further investigated mutations at this position and found that, while the methionine mutation performed equivalently to the parental protein, all other tested modifications (Cys/Phe/Ile/Thr/Val) were deleterious for the activity.
[0189] Mutation of Phe152 or Arg153 leads to a loss of the protein's ability to label. While Phe152 is part of the hydrophobic core of the protein active site, Arg153 interacts with multiple surrounding residues. Both are most probably crucial for proper α-helical folding of the peptide.
[0190] Mutation of Thr154 also leads to a decrease in protein labeling velocity; this residue seems to lock the peptide in the proper orientation by interacting with a residue of the adjacent α-helix.
[0191] To summarize, Ala145, Phe149, Gln150, Phe152 and Arg153 do not appear to tolerate modification in the CaProLa sensor context. On the other hand, modifications of Arg146, Glu147 and Thr148 are less of an issue. Finally, modification of A151 can even lead to an increase in activity, but this is highly dependent on the nature of the modification.
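The residue numbering used in this example and the effect of point mutations on percent identity can be illustrated with a short Python sketch. It assumes the numbering 145 to 154 given for SEQ ID NO 007 (PEP2) in the sequence listing; the helper names are hypothetical, and identity is computed by simple position-wise comparison of equal-length peptides.

```python
PEP2 = "ARETFQAFRT"      # SEQ ID NO 007, residues 145 to 154
FIRST_POSITION = 145     # number of the first residue of the peptide


def apply_mutation(peptide, mutation, first_position=FIRST_POSITION):
    """Apply a point mutation written like 'A151L' to the peptide."""
    wild_type, replacement = mutation[0], mutation[-1]
    index = int(mutation[1:-1]) - first_position
    if not 0 <= index < len(peptide):
        raise ValueError(f"{mutation}: position outside the peptide")
    if peptide[index] != wild_type:
        raise ValueError(f"{mutation}: expected {wild_type}, found {peptide[index]}")
    return peptide[:index] + replacement + peptide[index + 1:]


def percent_identity(seq_a, seq_b):
    """Position-wise identity in percent for two sequences of equal length."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


single = apply_mutation(PEP2, "A151L")
triple = PEP2
for mutation in ("R146A", "E147A", "T148A"):
    triple = apply_mutation(triple, mutation)

print(single, percent_identity(PEP2, single))
print(triple, percent_identity(PEP2, triple))
```

Because the peptide has only ten residues, each substitution lowers the identity to SEQ ID NO 007 by ten percentage points.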
Example 12: Development of a Glutamate Integrator Based on cpHaloΔ/H-Peptide
[0192] Experimental Procedure
[0193] Fluorescence polarization kinetic experiments were performed in black flat-bottom 96-well plates at 37 or 22° C. Buffer composition was 50 mM NaCl, 50 mM HEPES, 0.5 g/L BSA, pH 7.3. 150 μL of 400 nM protein and glutamate at 2× final concentration were equilibrated for half an hour, and the reaction was initiated by injection of 100 μL Halo-CPY in buffer. Final concentrations of reagents were 200 nM protein and 50 nM Halo-CPY. Fluorescence polarization was read out until the measurements reached a plateau. Curves were fitted to a mono-exponential in Prism 8 (GraphPad). Saturation of glutamate is observed at 1 mM.
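The mono-exponential model referred to above can be written out as a minimal Python sketch. The fits in this example were performed by non-linear regression in Prism 8; the estimator below is a simpler log-linear substitute that assumes the baseline and plateau are already known, and all names and numerical values are illustrative.

```python
import math


def mono_exponential(t, baseline, plateau, k):
    """Signal rising from baseline to plateau with rate constant k."""
    return baseline + (plateau - baseline) * (1.0 - math.exp(-k * t))


def estimate_rate(times, signals, baseline, plateau):
    """Least-squares slope through the origin of -ln(1 - fraction) versus t.

    Points at t <= 0 or outside the open interval (baseline, plateau)
    are skipped because the logarithm is undefined or uninformative there.
    """
    numerator = 0.0
    denominator = 0.0
    for t, s in zip(times, signals):
        fraction = (s - baseline) / (plateau - baseline)
        if t <= 0 or not 0.0 < fraction < 1.0:
            continue
        numerator += t * -math.log(1.0 - fraction)
        denominator += t * t
    if denominator == 0.0:
        raise ValueError("no usable data points")
    return numerator / denominator


def half_time(k):
    """Time at which half of the total signal change has occurred."""
    return math.log(2.0) / k


# Synthetic, noise-free trace with hypothetical parameters.
times = [10.0 * i for i in range(1, 30)]
signals = [mono_exponential(t, 50.0, 250.0, 0.01) for t in times]
k_estimate = estimate_rate(times, signals, 50.0, 250.0)
```

With noisy data the log-linear estimate weights late time points heavily, which is one reason non-linear regression is generally preferred for the actual fit.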
[0194] Results
[0195] The inventors have successfully generated an integrator for glutamate (GluProLa: Glutamate dependent Protein Labeling), the primary excitatory neurotransmitter in the mammalian Central Nervous System (CNS). GluProLa is designed around the architecture of an existing real-time, intensiometric sensor of glutamate, iGluSnFR. iGluSnFR is derived from the bacterial periplasmic glutamate binding protein GltI. Insertion of circularly permuted green fluorescent protein (cpGFP) into a flexible hinge in GltI resulted in a green-fluorescent sensor which responds to changes in glutamate concentration with an increase in fluorescence intensity. As the N- and C-termini of GltI are on the same face of GltI, and mechanistic studies of iGluSnFR suggest that these positions should show glutamate-binding dependent changes in their relative distance and/or orientation, the inventors reasoned that these might be suitable sites for fusion to H-peptide and cpHaloΔ. The inventors therefore created GluProLa constructs linking H-peptide (SEQ ID NO 007 (PEP2)) to the N-terminus of iGluSnFR and cpHaloΔ (SEQ ID NO 004) to the C-terminus. The inventors cloned and purified a small family of constructs with flexible GGTGGS (SEQ ID NO 026) and/or Pro10 linkers between H-peptide and iGluSnFR and between iGluSnFR and cpHaloΔ. All constructs showed labeling in the presence of glutamate and a fluorescent HaloTag substrate, as determined by in vitro fluorescence polarization assays (e.g.
TABLE-US-00003
TABLE 2 Summary of second generation CaProLa constructs, used CaM-M13 variants, EC.sub.50 values reported for CaMPARI2 and EC.sub.50 measured for CaProLa.
CaProLa version         CaM-M13 origin   Reported EC.sub.50   Measured
CaProLa 1.4 (1. gen.)   CaMPARI          111                  146 ± 44.6
CaProLa 2.1             CaMPARI2         825 nM               625 ± 25 nM
CaProLa 2.2             CaMPARI2         360 nM               448 ± 7 nM
CaProLa 2.3             CaMPARI2         110 nM               82.6 ± 4.5 nM
[0196] Sequences
[0197] HaloTag7 (see GenBank AQS79242); the cp version employed in creating the invention does not contain the C-terminal 27 amino acids of this sequence
TABLE-US-00004
SEQ ID NO 001: HaloTag7 circular permutated sequence
FARETFQAFRTTDVGRKLIIDQNVFIEGTLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRF
PNELPIAGEPANIVALVEEYMDWLHQSPVPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIG
PGLNLLQEDNPDLIGSEIARWLSTLEIGGTGGSGGTGGSGGSIGTGFPFDPHYVEVLGERM
HYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPDLIGMGKSDKPDLGYFFDDH
VRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKRNPERVKGIAFMEFIRPIPTWDEW

SEQ ID NO 002: cpHaloΔ N-terminal sequence
DVGRKLIIDQNVFIEGTLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANI
VALVEEYMDWLHQSPVPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNP
DLIGSEIARWLSTLEI

SEQ ID NO 003: cpHaloΔ C-terminal sequence
IGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPD
LIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKRNPERVKGI
AFMEFIRPIPTWDEW

SEQ ID NO 004: cpHaloΔ full sequence
DVGRKLIIDQNVFIEGTLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANI
VALVEEYMDWLHQSPVPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNP
DLIGSEIARWLSTLEIGGTGGSGGTGGSGGSIGTGFPFDPHYVEVLGERMHYVDVGPRDGT
PVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPDLIGMGKSDKPDLGYFFDDHVRFMDAFIEALG
LEEVVLVIHDWGSALGFHWAKRNPERVKGIAFMEFIRPIPTWDEW

SEQ ID NO 005: cpHaloΔ internal linker sequence
GGTGGSGGTGGSGGS

SEQ ID NO 006 (PEP1): 9mer Peptide
145-ARETFQAFR-153

SEQ ID NO 007 (PEP2): 10mer Peptide (higher propensity to complement the activity = faster kinetics)
145-ARETFQAFRT-154

SEQ ID NO 008 (M13)
RVDSSRRKFNKTGKALRAIGRLSSLE

SEQ ID NO 009 (CaM)
DQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQDMINEVDADGDGTI
DFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNLGEKLTDEEVDEM
IREADIDGDGQVNYEEFVVMMTAK

SEQ ID NO 010 (SPLT1)
RVDSSRRKFNKTGKALRAIGRLSSLEGGSDVGRKLIIDQNVFIEGTLPMGVVRPLTEVEMDH
YREPFLNPVDREPLWRFPNELPIAGEPANIVALVEEYMDWLHQSPVPKLLFWGTPGVLIPPA
EAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGSEIARWLSTLEIGGTGGSGGTGGSGGSIG
TGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPDLI
GMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKRNPERVKGIAF
MEFIRPIPTWDEW

SEQ ID NO 011 (SPLT2)
ARETFQAFRGGSDQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQD
MINEVDADGDGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNL
GEKLTDEEVDEMIREADIDGDGQVNYEEFVVMMTAK

SEQ ID NO 012 (SPLT3)
ARETFQAFRTGSDQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQD
MINEVDADGDGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNL
GEKLTDEEVDEMIREADIDGDGQVNYEEFVVMMTAK

SEQ ID NO 013 (CONF1)
ARETFQAFRGGSDQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQD
MINEVDADGDGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNL
GEKLTDEEVDEMIREADIDGDGQVNYEEFVVMMTAKEFPPPPPPPPPPPPPPPPPPPPPP
PPPPPPPGGSRVDSSRRKFNKTGKALRAIGRLSSLEGGSDVGRKLIIDQNVFIEGTLPMGVV
RPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANIVALVEEYMDWLHQSPVPKLLF
WGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGSEIARWLSTLEIGGTGGSG
GTGGSGGSIGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVA
PTHRCIAPDLIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKR
NPERVKGIAFMEFIRPIPTWDEW

SEQ ID NO 014 (CONF2)
ARETFQAFRTGSDQLTEEQIAEFKEAFSLFDKDGDGTITTKELGTVMRSLGQNPTEAELQD
MINEVDADGDGTIDFPEFLTMMARKMKDTDSEEEIREAFRVFDKDGNGYISAAELRHVMTNL
GEKLTDEEVDEMIREADIDGDGQVNYEEFVVMMTAKEFPPPPPPPPPPPPPPPPPPPPPPP
PPPPPPPGGSRVDSSRRKFNKTGKALRAIGRLSSLEGGSDVGRKLIIDQNVFIEGTLPMGVV
RPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANIVALVEEYMDWLHQSPVPKLLF
WGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGSEIARWLSTLEIGGTGGSG
GTGGSGGSIGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVA
PTHRCIAPDLIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKR
NPERVKGIAFMEFIRPIPTWDEW

SEQ ID NO 015 (FKBP)
MGVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKFDSSRDRNKPFKFMLGKQEVIRGW
EEGVAQMSVGQRAKLTISPDYAYGAIGHPGIIPPHATLVFDVELLKLE

SEQ ID NO 016 (FRB)
AILWHEMWHEGLEEASRLYFGERNVKGMFEVLEPLHAMMERGPQTLKETSFNQAYGRDL
MEAQEWCRKYMKSGNVKDLLQAWDLYYHVFRRISK

SEQ ID NO 017 (RAPIND1)
MGVQVETISPGDGRTFPKRGQTCVVHYTGMLEDGKKFDSSRDRNKPFKFMLGKQEVIRGW
EEGVAQMSVGQRAKLTISPDYAYGAIGHPGIIPPHATLVFDVELLKLEGSGGTGGSGDVGR
KLIIDQNVFIEGTLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANIVALV
EEYMDWLHQSPVPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGS
EIARWLSTLEIGGTGGSGGTGGSGGSIGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFL
HGNPTSSYVWRNIIPHVAPTHRCIAPDLIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEV
VLVIHDWGSALGFHWAKRNPERVKGIAFMEFIRPIPTWDEW

SEQ ID NO 018 (RAPIND2)
ARETFQAFRGGSAILWHEMWHEGLEEASRLYFGERNVKGMFEVLEPLHAMMERGPQTLK
ETSFNQAYGRDLMEAQEWCRKYMKSGNVKDLLQAWDLYYHVFRRISK

SEQ ID NO 019 (RAPIND3)
ARETFQAFRTGSAILWHEMWHEGLEEASRLYFGERNVKGMFEVLEPLHAMMERGPQTLKE
TSFNQAYGRDLMEAQEWCRKYMKSGNVKDLLQAWDLYYHVFRRISK

SEQ ID NO 020 (GLT1)
AAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQKVVGYSQDYSNAIVEAVKKKLNKPDLQV
KLIPITSQNRIPLLQNGTFDFECGSTTNNVERQKQAAFSDTIFVVGTRLLTKKGGDIKDFANLK
DKAVVVTSGTTSEVLLNKLNEEQKMNMRIISAKDHGDSFRTLESGRAVAFMMDDVLLAGER
AKAKKPDNWEIVGKPQSQEAYGCMLRKDDPQFKKLMDDTIAQVQTSGEAEKWFDKWFKNP
ILV

SEQ ID NO 021 (GLT2)
NPLNMNFELSDEMKALFKEPNDKALK

SEQ ID NO 022 (GLT3)
AAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQKVVGYSQDYSNAIVEAVKKKLNKPDLQV
KLIPITSQNRIPLLQNGTFDFECGSTTNNVERQKQAAFSDTIFVVGTRLLTKKGGDIKDFANLK
DKAVVVTSGTTSEVLLNKLNEEQKMNMRIISAKDHGDSFRTLESGRAVAFMMDDVLLAGER
AKAKKPDNWEIVGKPQSQEAYGCMLRKDDPQFKKLMDDTIAQVQTSGEAEKWFDKWFKNP
ILVSHNVYIMADKQRNGIKANFKIRHNIEDGGVQLAYHYQQNTPIGDGPVLLPDNHYLSTQSK
LSKDPNEKRDHMVLLEFVTAAGITLGMDELYKGGTGGSMVSKGEELFTGVVPILVELDGDV
NGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFF
KSAMPEGYIQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNN
PLNMNFELSDEMKALFKEPNDKALK

SEQ ID NO 023 (GLTIND1)
ARETFQAFRTGGTGGSAAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQKVVGYSQDYSNAI
VEAVKKKLNKPDLQVKLIPITSQNRIPLLQNGTFDFECGSTTNNVERQKQAAFSDTIFVVGTRL
LTKKGGDIKDFANLKDKAVVVTSGTTSEVLLNKLNEEQKMNMRIISAKDHGDSFRTLESGRA
VAFMMDDVLLAGERAKAKKPDNWEIVGKPQSQEAYGCMLRKDDPQFKKLMDDTIAQVQTS
GEAEKWFDKWFKNPILV

SEQ ID NO 024 (GLTIND2)
NPLNMNFELSDEMKALFKEPNDKALKGGTGGSDVGRKLIIDQNVFIEGTLPMGVVRPLTEVE
MDHYREPFLNPVDREPLWRFPNELPIAGEPANIVALVEEYMDWLHQSPVPKLLFWGTPGVLIP
PAEAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGSEIARWLSTLEIGGTGGSGGTGGSGGSIG
TGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPDLIG
MGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKRNPERVKGIAFM
EFIRPIPTWDEW

SEQ ID NO 025 (GLTIND3)
ARETFQAFRTGGTGGSAAGSTLDKIAKNGVIVVGHRESSVPFSYYDNQQKVVGYSQDYSNAI
VEAVKKKLNKPDLQVKLIPITSQNRIPLLQNGTFDFECGSTTNNVERQKQAAFSDTIFVVGTRL
LTKKGGDIKDFANLKDKAVVVTSGTTSEVLLNKLNEEQKMNMRIISAKDHGDSFRTLESGRA
VAFMMDDVLLAGERAKAKKPDNWEIVGKPQSQEAYGCMLRKDDPQFKKLMDDTIAQVQTS
GEAEKWFDKWFKNPILVSHNVYIMADKQRNGIKANFKIRHNIEDGGVQLAYHYQQNTPIGD
GPVLLPDNHYLSTQSKLSKDPNEKRDHMVLLEFVTAAGITLGMDELYKGGTGGSMVSKGEE
LFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQ
CFSRYPDHMKQHDFFKSAMPEGYIQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKE
DGNILGHKLEYNFNNPLNMNFELSDEMKALFKEPNDKALKGGTGGSDVGRKLIIDQNVFIEG
TLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANIVALVEEYMDWLHQSP
VPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNPDLIGSEIARWLSTLEIGG
TGGSGGTGGSGGSIGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIP
HVAPTHRCIAPDLIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHW
AKRNPERVKGIAFMEFIRPIPTWDEW
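As a cross-check of the listing, SEQ ID NO 004 reads as SEQ ID NO 002 followed by the internal linker SEQ ID NO 005 and then SEQ ID NO 003. The minimal Python sketch below assembles the three parts in that order and verifies that only the twenty standard one-letter amino acid codes occur. The sequences are copied from the listing with line breaks removed; the function and variable names are illustrative.

```python
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# SEQ ID NO 002, cpHaloΔ N-terminal part (line breaks removed).
CPHALO_N_TERMINAL = (
    "DVGRKLIIDQNVFIEGTLPMGVVRPLTEVEMDHYREPFLNPVDREPLWRFPNELPIAGEPANI"
    "VALVEEYMDWLHQSPVPKLLFWGTPGVLIPPAEAARLAKSLPNCKAVDIGPGLNLLQEDNP"
    "DLIGSEIARWLSTLEI"
)
# SEQ ID NO 005, internal cpHalo linker.
INTERNAL_LINKER = "GGTGGSGGTGGSGGS"
# SEQ ID NO 003, cpHaloΔ C-terminal part (line breaks removed).
CPHALO_C_TERMINAL = (
    "IGTGFPFDPHYVEVLGERMHYVDVGPRDGTPVLFLHGNPTSSYVWRNIIPHVAPTHRCIAPD"
    "LIGMGKSDKPDLGYFFDDHVRFMDAFIEALGLEEVVLVIHDWGSALGFHWAKRNPERVKGI"
    "AFMEFIRPIPTWDEW"
)


def invalid_residues(sequence):
    """Return (index, character) pairs that are not standard residue codes."""
    return [(i, c) for i, c in enumerate(sequence) if c not in STANDARD_RESIDUES]


def assemble(*parts):
    """Concatenate sequence parts in N- to C-terminal order."""
    sequence = "".join(parts)
    problems = invalid_residues(sequence)
    if problems:
        raise ValueError(f"non-standard characters found: {problems}")
    return sequence


cphalo_full = assemble(CPHALO_N_TERMINAL, INTERNAL_LINKER, CPHALO_C_TERMINAL)
print(len(INTERNAL_LINKER), len(cphalo_full))
```

The same helper can be used to compose the fusion constructs of the listing from their parts, for example a peptide, a short linker and a sensor domain, provided the parts are supplied in N- to C-terminal order.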