Labelled compounds and methods for mass spectrometry-based quantification

Abstract

Methods for peptide and/or protein quantification by mass spectrometry using labeled peptides, wherein multiple labels lead to distinct fragments for the labeled peptides and their unlabeled variant, thus facilitating data analysis and enhancing the potential for quantification. Methods for selecting the label and label position are further given, as well as sets of labeled peptides resulting from or for use in the above-mentioned methods. The methods and substances are especially useful for data-independent or multiplexed parallel reaction monitoring proteomics applications involving peptide quantification.

Claims

1. A method for the absolute or relative quantitative analysis of proteins and/or peptides with or without post translational modification(s) using a mass spectrometry method in which: in a first step unlabeled proteins from an endogenous mixture are digested and subsequently digestion products thereof selected, in a second step said digestion products are fragmented, and in a third step a combined fragment spectrum is acquired comprising b-ions and y-ions of said digestion products, wherein at least one reference peptide is added to said mixture before and/or after digestion, is fragmented, acquired, and stored in said combined fragment spectrum comprising also b-ions and y-ions of said digestion products, wherein the said at least one reference peptide is added in a known concentration for absolute quantification or in always the same concentration in a series of experiments for relative quantitative analysis, wherein said at least one reference peptide is selectively isotopically labeled by having incorporated: one isotopically labeled amino acid forming its very C-terminus or being one of the four terminal amino acids at the C-terminus and one isotopically labeled amino acid forming its very N-terminus, or being one of the four terminal amino acids at the N-terminus, and wherein the isotopically labeled amino acids are unmodified naturally occurring proteinogenic amino acids or amino acids carrying a chemically modifying moiety, wherein said unmodified naturally occurring proteinogenic amino acids or amino acids carrying a chemically modifying moiety comprise one or more atoms that are isotopically labeled such that said one or more atoms are present in the amino acid and not in the chemically modifying moiety.

2. The method according to claim 1, wherein in said reference peptide, apart from the isotopically labeled amino acid at or close to the C-terminus and the isotopically labeled amino acid at or close to the N-terminus, not more than one additional amino acid is isotopically labeled.

3. The method according to claim 1, wherein in said reference peptide one isotopically labeled amino acid is forming its very C-terminus and one further isotopically labeled amino acid is forming its very N-terminus.

4. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window having a full-range mass isolation window, or a width in terms of mass-to-charge ratio in the range of (2×1.036426×10^−8 kg/C)−(1000×1.036426×10^−8 kg/C).

5. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window of (5×1.036426×10^−8 kg/C)−(30×1.036426×10^−8 kg/C).

6. The method according to claim 1, wherein said post translational modification is one or more selected from the group consisting of: phosphorylation, acetylation, methylation, sulfation, hydroxylation, lipidation, ubiquitylation, sumoylation, and glycosylation.

7. The method according to claim 1, wherein said reference peptide consists of 5-100 amino acids.

8. The method according to claim 1, wherein it involves using DIA or mPRM techniques.

9. The method according to claim 1, wherein in said reference peptide, apart from the isotopically labeled amino acid at or close to the C-terminus and the isotopically labeled amino acid at or close to the N-terminus, no additional amino acid is isotopically labeled.

10. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window having a full-range mass isolation window, or a width in terms of mass-to-charge ratio in the range of (5×1.036426×10^−8 kg/C)−(100×1.036426×10^−8 kg/C).

11. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window of (10×1.036426×10^−8 kg/C)−(25×1.036426×10^−8 kg/C).

12. The method according to claim 1, wherein said reference peptide consists of 7-30 amino acids.

13. The method according to claim 1, wherein said reference peptide consists of 10-20 amino acids.

Description

BRIEF DESCRIPTION OF THE DRAWINGS

(1) Preferred embodiments of the invention are described in the following with reference to the drawings, which are for the purpose of illustrating the present preferred embodiments of the invention and not for the purpose of limiting the same. In the drawings,

(2) FIG. 1 shows a) an MS1 spectrum wherein the mass window for fragmentation containing the unlabeled and the labeled precursor is marked, and b) a combined fragment ion spectrum comprising fragment ions from the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation;

(3) FIG. 2 shows schematic drawings of a) a peptide fragmentation pattern and b) of a peptide and its y- and b-fragment ions;

(4) FIG. 3 shows a schematic drawing comparing DDA with DIA, wherein mass windows containing several precursors are fragmented in the data-independent acquisition experiment and the resulting data are stored in combined fragment ion spectra;

(5) FIG. 4 shows a schematic drawing of an mPRM experiment, wherein either larger mass windows containing several precursors or several mass windows containing precursors are fragmented and the resulting data are stored together;

(6) FIG. 5 shows a) fragment overlap for unlabeled peptides and peptides with a single heavy label and b) displays a schematic drawing of the y- and b-ions;

(7) FIG. 6 shows a) a fragment ion spectrum without fragment overlap and b) displays a schematic drawing of the y- and b-ions for unlabeled peptides and double-heavy-labeled peptides;

(8) FIG. 7 exemplifies processes in a method to select optimal label positions;

(9) FIG. 8 shows a schematic drawing of a calculation mode for selecting label positions;

(10) FIG. 9 in a) and b) exemplifies the outcome of an analysis for optimal label positions: barplots show the frequency with which each amino acid would be labeled for different n_globalMaxVal and a human blood plasma peptide spectral library containing two isotopically labeled amino acids per peptide;

(11) FIG. 10 shows a schematic drawing of an isotopic labeling experiment wherein either single- or double-labeled reference peptides are combined with an unlabeled peptide mixture and the acquisition method is DIA;

(12) FIG. 11 shows a schematic drawing of an isotopic labeling experiment wherein labeled reference peptides are combined with an unlabeled peptide mixture and the acquisition method is DIA;

(13) FIG. 12 shows a schematic drawing of an isotopic labeling experiment wherein labeled reference peptides are combined with an unlabeled peptide mixture and the acquisition method is mPRM;

(14) FIG. 13 shows a) a combined fragment ion spectrum acquired with mPRM comprising fragment ions from the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, b) the fragment ion traces for fragment ions from the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, and c) a barplot depicting fragment-ion intensities for the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K and the fragment ion intensity ratio between the two variants. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation;

(15) FIG. 14 shows a) a combined fragment ion spectrum acquired with mPRM comprising fragment ions from the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, b) the fragment ion traces for fragment ions from the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, and c) a barplot depicting fragment-ion intensities for the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K and the fragment ion intensity ratio between the two variants. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation;

(16) FIG. 15 shows a) a combined fragment ion spectrum acquired with DIA comprising fragment ions from the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, b) the fragment ion traces for fragment ions from the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, and c) a barplot depicting fragment-ion intensities for the unlabeled and the single-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K and the fragment ion intensity ratio between the two variants. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation; and

(17) FIG. 16 shows a) a combined fragment ion spectrum acquired with DIA comprising fragment ions from the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, b) the fragment ion traces for fragment ions from the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K, and c) a barplot depicting fragment-ion intensities for the unlabeled and the double heavy-labeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K and the fragment ion intensity ratio between the two variants. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation.

(18) FIG. 17 shows a) a barplot depicting the intensity correlation score (average over 3 replicates) for the unlabeled variant of peptide DIASGLIGPLIIC[+C2+H3+N+O]K in an experimental setup using DIA and using reference peptides with a single C-terminal label, or double-labeled reference peptides. The code [+C2+H3+N+O] denotes a carbamidomethyl modification at cysteine that is typically introduced on purpose during sample preparation. b) a barplot depicting the average intensity correlation score (over 3 replicates and 5 peptides) for the unlabeled variants of 5 peptides in an experimental setup using DIA and using reference peptides with a single C-terminal label, or double-labeled reference peptides.

DESCRIPTION OF PREFERRED EMBODIMENTS

(19) Hereinafter, the present invention is described in further detail and is exemplified. However, the examples are not intended to limit the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It must be noted that as used herein and in the claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a label” includes a plurality of such labels and so forth.

(20) Although any materials and methods similar or equivalent to those described herein can be used to practice or test the present invention, the preferred materials and methods are now described.

(21) This description specifically details the application of labeled reference peptides in quantitative proteomics studies wherein combined fragment ion spectra are obtained. It describes methods for the quantitative analysis of peptides and/or proteins, methods for the selection of suitable reference peptides and label positions, and the reference peptides used in said methods. Different aspects relating to the experimental setup, and labeling strategies are discussed. Finally, examples of applications illustrate the potential of the methods and substances of the present invention to improve the accuracy of quantitative studies.

(22) Mass Spectrometry Methods:

(23) Mass spectrometry (MS) methods are widely used for peptide and/or protein identification and quantification, especially in proteomics studies where large numbers of analytes are monitored. A standard sample preparation workflow for bottom-up liquid chromatography (LC)-MS experiments includes the following steps: Proteins comprised in a sample are digested to peptides using a protease such as trypsin. The peptides are then separated by liquid chromatography, most commonly reversed-phase LC. As soon as the peptides elute from the chromatography column, they are ionized by electrospray ionization (ESI): At the ion source, a voltage is applied which disperses the liquid sample into fine droplets containing charged peptide molecules. These precursors then enter the mass spectrometer, where they fly in an electric field and are resolved according to their mass-to-charge (m/z) ratio. Finally, the precursor ions are detected and their m/z ratio is registered, resulting in MS1 (or MS) spectra acquired over the whole gradient. Single peptide precursors or wider mass ranges are sequenced as follows: The ions in the selected mass window are isolated and fragmented, e.g. by collision with helium gas, a process termed collision-induced dissociation (CID), or by higher-energy C-trap dissociation (HCD). All fragment ions are then recorded in one MS/MS, MS2, or fragment ion spectrum.

(24) The fragment ion spectra serve as a basis for peptide identification. Peptides do not disintegrate randomly during fragmentation, but rather fragment according to a pattern into a, b, c, x, y, and z-ions (FIG. 2a). In common proteomics studies, the most prominent ion series are often y- and b-ions and special attention is paid to them. These two form complementary fragment ion series (FIG. 2b), wherein y-ions include the peptide's C-terminus and b-ions include the N-terminus. Since peptide fragmentation follows a known pattern, the peptide sequence can be derived from the fragment ion peaks in an MS2 spectrum. Once the peptide has been identified, it can further be quantified using the acquired MS1 or MS2 data.
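The complementary b- and y-ion series can be illustrated with a short calculation. The following is a minimal sketch, assuming singly charged ions and standard monoisotopic residue masses; the function name `fragment_ions` and the example sequence SAMPLEK are hypothetical and serve only for illustration.

```python
# Monoisotopic residue masses in Da (standard values for the 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added to every y-ion
PROTON = 1.007276   # charge carrier for singly charged ions


def fragment_ions(sequence):
    """Return singly charged b- and y-ion m/z values keyed by ion name."""
    masses = [RESIDUE_MASS[aa] for aa in sequence]
    n = len(masses)
    ions = {}
    for i in range(1, n):
        # b_i contains the first i residues (N-terminal side).
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        # y_i contains the last i residues (C-terminal side) plus water.
        ions[f"y{i}"] = sum(masses[n - i:]) + WATER + PROTON
    return ions


ions = fragment_ions("SAMPLEK")  # hypothetical example peptide
for name in ("b2", "y5", "y1"):
    print(name, round(ions[name], 4))
```

In this sketch, b2 and y5 together cover the whole peptide, which is why the two series are complementary: their m/z values sum to the neutral peptide mass plus two protons.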

(25) Different mass spectrometry approaches can be used in bottom-up proteomics experiments. While the basic steps of the protocols remain the same for all approaches, other parts, such as fragmentation, identification, and quantification of peptides, vary depending on the MS method used.

(26) One of the most frequently used mass spectrometry approaches in proteomics is data-dependent acquisition (DDA), also called “shotgun” (FIG. 3, left panel). In a classical data-dependent workflow only the precursors with the highest signal intensities in the MS1 spectrum are sequenced: The ions in a small mass window around the desired precursor m/z are isolated and fragmented (FIG. 3, left panel). All fragment ions derived from this small mass window are then recorded in one MS/MS, MS2, or fragment ion spectrum. To identify the peptides and proteins contained in the sample, the MS/MS spectra are searched against a database containing the theoretical spectra of the whole proteome of interest. After the peptides have been identified, peptide and/or protein quantification is typically done on the MS1 level by creating extracted ion chromatograms (XIC), i.e. by monitoring the signal of a certain precursor m/z peak over the LC gradient. Since it can identify thousands of proteins with minimal prior knowledge about a sample's protein content, DDA is widely used for discovery studies. However, a disadvantage of the DDA approach is that only a limited number of precursors is selected for fragmentation. As a consequence many peptides remain unidentified. Furthermore, changes in precursor intensities can result in different sets of peptides being sequenced even in replicate MS acquisitions of the same sample. Additionally, sensitivity is lower compared to other mass spectrometry approaches.

(27) In recent years, data-independent acquisition (DIA) has emerged as a new MS approach which remedies many of DDA's disadvantages. Techniques which are based on this principle include for example HRM, SWATH, MS^E and All-Ion-Fragmentation. The core feature of all DIA methods is that instead of a single precursor as for DDA, larger mass windows, or swaths, containing multiple precursors are fragmented (FIG. 3, right panel). Usually, a quadrupole acts as a mass filter here and targets certain mass ranges for fragmentation. The resulting fragment ions are then acquired on a high resolution mass analyzer, such as a time-of-flight (TOF) or an Orbitrap. This produces complex MS2 spectra (combined fragment ion spectra) containing fragment ions of several precursors. Due to the complexity of the MS2 spectra, it is vital to acquire fragment ions with high resolution and high mass accuracy in order to later assign the different fragments to their corresponding peptide precursors.

(28) Data analysis can be challenging due to the spectra containing fragments of several peptides.

(29) To identify and quantify the peptides present in a sample, the combined fragment ion spectra can be searched against a spectral library or theoretical spectra, or can be mined using SRM-like transitions. Fragments from the same peptide are subsequently arranged in SRM-like peak groups: The signal corresponds to the intensity of each fragment monitored over time in sequential spectra. Fragments of the same peptide will produce similarly shaped elution peaks with maxima at identical retention times (RT). These SRM-like peak groups can then be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. That is, the quantification is done based on MS2 level data. Alternatively, peptide and/or protein quantification can be done on MS1 level if the corresponding MS1 data was acquired.
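The notion of an SRM-like peak group can be sketched as follows. This is a simplified illustration only, assuming that fragment ion traces have already been extracted as intensity lists over a shared retention time axis; the function names, the tolerance, and all numerical values are hypothetical and do not reflect the implementation of any particular software.

```python
def apex_rt(rts, intensities):
    """Retention time at which a fragment trace reaches its maximum."""
    best = max(range(len(intensities)), key=intensities.__getitem__)
    return rts[best]


def peak_area(rts, intensities):
    """Area under a fragment trace by trapezoidal integration."""
    return sum(
        (rts[i + 1] - rts[i]) * (intensities[i] + intensities[i + 1]) / 2
        for i in range(len(rts) - 1)
    )


def co_eluting(traces, rts, tolerance=0.05):
    """True if all traces have their maxima within `tolerance` minutes."""
    apexes = [apex_rt(rts, trace) for trace in traces.values()]
    return max(apexes) - min(apexes) <= tolerance


# Hypothetical traces (arbitrary intensity units) over retention time in minutes.
rts = [10.0, 10.1, 10.2, 10.3, 10.4]
light = {"y5": [0, 200, 800, 200, 0], "y7": [0, 100, 400, 100, 0]}
heavy = {"y5": [0, 400, 1600, 400, 0], "y7": [0, 200, 800, 200, 0]}

group_ok = co_eluting({**{f"light_{k}": v for k, v in light.items()},
                       **{f"heavy_{k}": v for k, v in heavy.items()}}, rts)
ratio_y5 = peak_area(rts, light["y5"]) / peak_area(rts, heavy["y5"])
print(group_ok, round(ratio_y5, 3))
```

Here the light-to-heavy area ratio of a single fragment pair stands in for the MS2 level quantification described above; real data analysis would additionally score peak shape and relative fragment intensities.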

(30) The same data analysis concepts can be applied to the analysis of DIA and mPRM data. Traditionally, a spectral library generated from DDA-data is employed to extract quantitative features from DIA or mPRM runs and to identify peptides and/or proteins. Alternative data analysis approaches exist which do not rely on DDA-based spectral libraries, or do not rely on them exclusively: For example, mPRM or DIA data containing MS1 and MS2 scans can be converted into MS2 spectra containing fragment ions relevant for a specific MS1 feature. These spectra are searched using a database of theoretical spectra which results in peptide identifications being assigned to the precursor-fragment matches. This process is very similar to how DDA data is typically processed. The search results can be saved as spectral library. Furthermore, a spectral library can be generated from combined search results from DIA and/or DDA experiments, or from mPRM and/or DDA experiments. In either case, the search results and/or the spectral library are used to extract quantitative information from the mPRM or DIA runs, allowing peptide and/or protein quantification on MS1 and/or MS2 level.

(31) In summary, a spectral library can be generated from many sources including but not limited to the following: from data of the same acquisition, from a previous acquisition of the same sample, from an independent acquisition of a similar tissue or complete organism, from published data, from mPRM data, from DIA data, from DDA data, from a combination of DIA and DDA data, from a combination of mPRM and DDA data, from a resource database from fractionated or unfractionated samples, it can be generated on-the-fly from DIA or mPRM data, or from a combination of sources mentioned above. The spectral library can be saved and/or can be discarded after use.

(32) The following paragraph provides non-limiting examples for different data analysis approaches for DIA and/or mPRM data. A spectral library can be generated from the same sample, a similar sample, or from resource data. The data for the spectral library can stem from fractionated and/or unfractionated samples. The data for the spectral library can have been acquired with different mass spectrometry methods such as DDA, targeted mass spectrometry methods, DIA or mPRM, or any combination of them. The sample to be quantified can be fractionated or unfractionated and is acquired by DIA and/or mPRM. Peak groups and peptides in the sample are identified using the spectral library. The sample is then quantified based on MS2 and/or MS1 level data.

(33) Existing data analysis software, e.g. Spectronaut Pulsar (Biognosys AG), supports many of the proposed data analysis workflows. The person skilled in the art will know which software to use or how to modify existing software to support the desired workflow.

(34) In an exemplary peptide and/or protein quantification experiment employing DIA, the amount of the endogenous, unlabeled peptide variant relative to its labeled reference peptide variant has to be determined. To this end, unlabeled and labeled peptides comprised in a sample are fragmented. Because the label introduces only a small mass shift, the fragment ions of both precursors will most often be present in the same combined fragment spectrum. Thus, only fragment ions differing in at least one label can be distinguished between unlabeled and reference peptide. The amount of unlabeled peptide relative to reference peptide can be determined by comparing the SRM-like peaks formed by these fragment ions differing in at least one label.
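Why the unlabeled and the labeled precursor usually fall into the same isolation window can be shown with a short calculation. The sketch below assumes a hypothetical peptide of neutral monoisotopic mass 1468.83 Da, a heavy lysine label (13C6 15N2, about +8.0142 Da), a doubly charged precursor, and an illustrative 20 m/z wide window centered at m/z 740; none of these values are taken from the experiments described herein.

```python
PROTON = 1.007276
C13_DELTA = 1.0033548   # mass difference 13C - 12C in Da
N15_DELTA = 0.9970349   # mass difference 15N - 14N in Da
LYS8_SHIFT = 6 * C13_DELTA + 2 * N15_DELTA  # 13C6 15N2 lysine, about 8.0142 Da


def precursor_mz(neutral_mass, charge):
    """m/z of a protonated precursor with the given charge state."""
    return (neutral_mass + charge * PROTON) / charge


def in_window(mz, center, width):
    """True if mz lies within an isolation window of the given center and width."""
    return abs(mz - center) <= width / 2


neutral_mass = 1468.83          # hypothetical unlabeled peptide, in Da
light_mz = precursor_mz(neutral_mass, 2)
heavy_mz = precursor_mz(neutral_mass + LYS8_SHIFT, 2)

print(round(light_mz, 3), round(heavy_mz, 3))
print(in_window(light_mz, 740.0, 20.0), in_window(heavy_mz, 740.0, 20.0))
```

At charge state 2 the label shifts the precursor by only about 4 m/z units, so both variants are co-isolated and co-fragmented in this illustrative window.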

(35) DIA methods have several advantages over DDA and over targeted methods such as SRM: DIA approaches have excellent sensitivity and a large dynamic range. Moreover, since no stochastic peak picking is involved, DIA methods avoid the missing peptide ID data points typical for DDA methods, and peptides are reproducibly measured over all samples. Furthermore, DIA allows sequencing of almost complete proteomes within one run without requiring prior knowledge about targeted transitions. All these properties make DIA methods especially suitable for quantification studies where many peptides and/or proteins need to be measured.

(36) Another MS method which is frequently used for the quantification of peptides and/or proteins is Selected Reaction Monitoring (SRM). SRM is a targeted mass spectrometry approach. Herein, fragment ions of a single, pre-selected target peptide are detected on low resolution, low mass accuracy mass spectrometers. Only limited numbers of peptides can be monitored with this technique, and assay development is laborious. Multiplexed parallel reaction monitoring (mPRM), a novel targeted proteomics technique, remedies these disadvantages (FIG. 4).

(37) Usually, mPRM analyses are conducted on a quadrupole which is combined with a high resolution mass analyzer. The quadrupole acts as mass filter to target mass ranges for fragmentation in a second quadrupole, and the resulting fragment ions are acquired by the high resolution mass analyzer. Fragmentation is done in either of two ways: Several precursors can be fragmented sequentially and their fragment ions are stored together for later measurement. Alternatively, larger m/z ranges containing several precursors are fragmented together. In both cases the fragmentation procedure results in combined fragment ion spectra comprising fragment ions from several precursors.

(38) The fragment ions are analyzed in the high resolution part of the instrument, often an orbitrap analyzer. This has several advantages over using a low resolution instrument as in SRM studies:

(39) Firstly, all fragment ions a peptide produces can be monitored, rather than just a small number, leading to a higher specificity and increasing the confidence that the correct peptide was identified. Moreover, assay optimization becomes less crucial and the larger number of fragment ions that is monitored per peptide makes quantification more robust. Secondly, since the fragment ions are acquired with high resolution and mass accuracy, the probability of false positive identifications decreases.

(40) DIA and mPRM workflows produce similar combined fragment ion spectra and can sometimes even be run on the same type of mass spectrometer. Therefore, the basic principles for data analysis and quantification are also the same. Thus, for mPRM as well, the SRM-like peak groups extracted from the fragment ion spectra can be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. Hence, quantification in an mPRM experiment is usually done based on MS2 level data.

(41) The advantages of mPRM over DDA and SRM are similar to the ones mentioned above for DIA: high sensitivity, a large dynamic range, and reproducible peptide picking. As a consequence, it is especially suitable for quantification studies.

(42) The present invention solves the problem of fragment overlap for any method that produces combined fragment ion spectra. This includes mass spectrometry methods acquiring low resolution data that is stored as combined fragment ion spectrum. Moreover, mass transmission windows for selecting precursors for fragmentation can be non-overlapping, overlapping, and/or can be sliding windows with small offsets. One DIA method using the latter is SONAR (Waters). This technique uses a quadrupole that slides over a selected mass range during each MS scan using transmission mass windows with offsets of a few Daltons. One full scan covers the whole mass range and high and low collision energy are applied in an alternating fashion to the scans, thus producing both MS1 and MS2 data. The person skilled in the art will know how to set up and operate the corresponding mass spectrometry setting.
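The different transmission window layouts mentioned above can be sketched with a small helper. This is an illustration under stated assumptions only: the function name `isolation_windows`, the mass range of m/z 400 to 500, and the widths and step sizes are hypothetical and are not instrument settings of any of the named techniques.

```python
def isolation_windows(start, end, width, step=None):
    """Return (lower, upper) m/z bounds of transmission windows.

    step == width (the default) gives adjacent, non-overlapping windows;
    step < width gives overlapping windows, and a small step relative to
    the width approximates a sliding window with small offsets.
    """
    step = width if step is None else step
    windows = []
    lower = start
    while lower < end:
        # The last window is truncated at the upper end of the mass range.
        windows.append((lower, min(lower + width, end)))
        lower += step
    return windows


adjacent = isolation_windows(400, 500, 25)
overlapping = isolation_windows(400, 500, 25, step=12.5)
sliding = isolation_windows(400, 500, 24, step=2)

print(adjacent)
print(overlapping[:3])
print(len(sliding))
```

In all three layouts, each window may contain several precursors, so each resulting MS2 scan is a combined fragment ion spectrum in the sense used herein.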

(43) Combined fragment ion spectra can be produced by pooling data of fragment ions from several precursors in one transmission mass window, e.g. as described in the examples. Even DDA methods can thus produce combined fragment ion spectra if large transmission mass windows are used and several precursors are fragmented together. Alternatively, fragment ion data of precursors from different transmission mass windows can be pooled to form a combined fragment ion spectrum. This principle is for example used in multiplexed DIA (Egertson, J. D., MacLean, B., Johnson, R., Xuan, Y., and MacCoss, M. J., 2015. Multiplexed Peptide Analysis using Data Independent Acquisition and Skyline. Nature Protocols, 10(6), pp. 887-903.). The person skilled in the art will know how to set up the corresponding mass spectrometry acquisition methods. Data analysis of the combined fragment ion spectra proceeds as described.

(44) Use of Multiply-Labeled Peptides in Quantification Studies Employing DIA or mPRM:

(45) A common setup for protein and/or peptide quantification is to compare the abundances of an unlabeled, endogenous peptide and its reference peptide variant carrying a single C-terminal label. Usually, this label is an amino acid containing heavy elemental isotopes, most commonly arginine or lysine. When a combined fragment spectrum of these peptides is acquired with DIA or mPRM, the presence of a single label leads to complications: All C-terminal ions from the reference peptide will contain the label and will have an m/z distinct from their unlabeled counterparts (FIGS. 5a, 5b). However, N-terminal ions from the reference peptide, such as b-ions, will not contain any label and will have the same m/z as the corresponding ions from the unlabeled peptide (FIGS. 5a, 5b). We call this “fragment overlap”. As a consequence, none of the N-terminal fragment ions can be used for quantification. Only the C-terminal fragment ion pairs differing in one label will reflect the abundance ratio between the unlabeled and the reference peptide. The use of only roughly half of the theoretical fragments leads to a less robust quantification. To aggravate the problem, the presence of shared fragments between two peptide variants complicates data analysis and hampers peptide identification, for instance if the known relative fragment ion intensity is used for scoring. The relative fragment ion intensities are the intensities of the fragment ions within one peptide variant's peak relative to each other. An example would be if for an unlabeled peptide b7 is the most intense ion, followed by y10 and y5. The relative fragment ion intensities follow a certain pattern for each peptide sequence, usually regardless of the label. Therefore, they can be used in the identification of both unlabeled and labeled peptides. If reference peptides are used that produce fragment overlap with the peptide variants to be quantified, the relative fragment ion intensities for both peptide variants might be skewed (FIG. 13, FIG. 15). Thus, fragment overlap can impair peptide and/or protein identification.
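Fragment overlap can be made concrete by computing the singly charged b- and y-ions of the example peptide DIASGLIGPLIIC[+C2+H3+N+O]K for the unlabeled variant and for a variant with a single C-terminal label. The following is a minimal sketch; the choice of a 13C6 15N2 lysine as the label, the matching tolerance of 0.01 m/z, and all function names are assumptions made for illustration.

```python
# Monoisotopic residue masses in Da for the residues of the example peptide.
RESIDUE_MASS = {
    "D": 115.02694, "I": 113.08406, "A": 71.03711, "S": 87.03203,
    "G": 57.02146, "L": 113.08406, "P": 97.05276, "C": 103.00919,
    "K": 128.09496,
}
CARBAMIDOMETHYL = 57.02146  # +C2 H3 N O on cysteine
WATER = 18.010565
PROTON = 1.007276
LYS8_SHIFT = 6 * 1.0033548 + 2 * 0.9970349  # assumed 13C6 15N2 lysine label


def residue_masses(sequence, labels=None):
    """Residue masses; `labels` maps 1-based positions to isotope mass shifts."""
    labels = labels or {}
    masses = []
    for position, aa in enumerate(sequence, start=1):
        mass = RESIDUE_MASS[aa]
        if aa == "C":
            mass += CARBAMIDOMETHYL  # all cysteines treated as alkylated
        masses.append(mass + labels.get(position, 0.0))
    return masses


def fragment_ions(masses):
    """Singly charged b- and y-ion m/z values keyed by ion name."""
    n = len(masses)
    ions = {}
    for i in range(1, n):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON
        ions[f"y{i}"] = sum(masses[n - i:]) + WATER + PROTON
    return ions


def overlapping_ions(light, heavy, tolerance=0.01):
    """Ion names whose m/z coincide between the two peptide variants."""
    return [name for name in light if abs(light[name] - heavy[name]) <= tolerance]


sequence = "DIASGLIGPLIICK"
light = fragment_ions(residue_masses(sequence))
single = fragment_ions(residue_masses(sequence, {14: LYS8_SHIFT}))

overlap = overlapping_ions(light, single)
print(overlap)  # every b-ion is shared; no y-ion is
```

With the single C-terminal label, all thirteen b-ions coincide between the two variants, whereas every y-ion is shifted by the mass of the label.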

(46) One way to eliminate the fragment overlap during DIA- or mPRM-based peptide and/or protein quantification experiments is by selectively introducing two labels (heavy isotope containing amino acids) at different positions into the reference peptides such that most C-terminal, as well as N-terminal fragments of interest will contain a label (FIG. 6). In any case the presence of multiple labels at suitable positions in the reference peptide results in distinct m/z for fragment ions from the reference and the unlabeled peptide (FIG. 6b), both for N- and C-terminal fragment ions. Thus, no fragment overlap occurs and the fragments stemming from unlabeled and labeled peptides can be distinguished.
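The effect of label placement on fragment overlap can be illustrated with a purely positional sketch: a b- or y-ion can be distinguished from its unlabeled counterpart only if it contains at least one labeled residue. The function name `distinguishable_fragments` and the label positions shown are hypothetical examples for a peptide of 14 residues; isotope masses are deliberately left out of this illustration.

```python
def distinguishable_fragments(length, label_positions):
    """Names of b- and y-ions that contain at least one labeled residue.

    Positions are 1-based from the N-terminus. The ion b_i spans residues
    1..i, and the ion y_i spans residues (length - i + 1)..length.
    """
    result = []
    for i in range(1, length):
        if any(position <= i for position in label_positions):
            result.append(f"b{i}")
        if any(position > length - i for position in label_positions):
            result.append(f"y{i}")
    return result


length = 14  # e.g. a peptide with 14 residues
single_c_terminal = distinguishable_fragments(length, {14})
both_termini = distinguishable_fragments(length, {1, 14})
near_termini = distinguishable_fragments(length, {3, 14})

print(len(single_c_terminal), len(both_termini), len(near_termini))
```

In this sketch a single C-terminal label distinguishes only the 13 y-ions, labels at both termini distinguish all 26 fragments, and moving the N-terminal label to position 3 leaves only the two shortest b-ions shared, which is consistent with most, rather than all, fragments of interest containing a label.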

(47) The present invention makes use of such multiply-labeled reference peptides and/or proteins to provide an improved quantification method that is compatible with combined fragment ion MS spectra. Secondly, the present invention relates to a method for selecting the label and label position of at least one suitable reference peptide. Thirdly, the present invention relates to selectively double-labeled reference peptides for use in or produced by the above mentioned methods.

(48) Using such multiply-labeled reference peptides solves the problems occurring with single-labeled reference peptides in conjunction with mass spectrometry approaches producing combined fragment ion spectra. It allows exploiting the full potential of DIA and mPRM methods for quantitative studies. Firstly, combined fragment ion spectra of unlabeled and labeled precursors will contain fewer shared fragment ions, which can facilitate the identification of peptides and peak groups. For example, fragment overlap between reference peptides and peptides to be quantified might lead to skewed relative fragment intensities for both variants, as discussed above. Relative fragment intensities are often used for peptide and peak group identification and scoring. Therefore, using reference peptides that differ in at least 2 labels from the other peptide variant can aid peptide and/or protein identification.

(49) Secondly, being able to differentiate between N-terminal fragment ions, such as b-ions, from unlabeled and from labeled peptides allows including them for quantification without skewing quantitative values. Including a higher number of suitable ions will render quantification more robust and accurate.
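How additional fragment ion pairs can enter the quantification may be sketched as follows. The example assumes that peak areas for matching light and heavy fragment ions are already available, that the median of the per-fragment ratios is used as summary (one possible choice, not one prescribed herein), and that the reference peptide was added in a known amount for absolute quantification; all names and numbers are hypothetical.

```python
from statistics import median


def light_to_heavy_ratio(light_areas, heavy_areas):
    """Median of per-fragment light/heavy area ratios over shared ion names."""
    ratios = [
        light_areas[ion] / heavy_areas[ion]
        for ion in light_areas
        if heavy_areas.get(ion, 0) > 0
    ]
    return median(ratios)


def absolute_amount(ratio, reference_amount):
    """Amount of unlabeled peptide given the known amount of reference peptide."""
    return ratio * reference_amount


# Hypothetical peak areas; with a double-labeled reference, b-ions can be
# included because they no longer coincide with their unlabeled counterparts.
light_areas = {"y5": 120.0, "y7": 60.0, "b4": 30.0, "b6": 45.0}
heavy_areas = {"y5": 240.0, "y7": 120.0, "b4": 60.0, "b6": 100.0}

ratio = light_to_heavy_ratio(light_areas, heavy_areas)
amount_fmol = absolute_amount(ratio, reference_amount=100.0)  # 100 fmol spiked in
print(ratio, amount_fmol)
```

With four instead of two fragment pairs, a single deviating ratio (here b6) has less influence on the summary value, which is the sense in which more usable ions make the quantification more robust.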

(50) Steps for peptide and/or protein quantification using DIA or mPRM:

(51) In quantification experiments, unlabeled endogenous peptides and/or proteins will be pooled with reference peptides. Since sample preparation can introduce considerable inter-sample variability, the unlabeled peptides and/or proteins and the labeled peptides are preferably pooled as early as possible in the protocol. Thus, any variability introduced by later sample preparation steps will affect both the light and the heavy peptide in equal measure. The steps at which pooling is most suitable may vary and are therefore not included in the standard protocol below. Most frequently, synthetic reference peptides are added to peptide samples in a last step before liquid chromatography.

(52) A standard protocol for the quantification of peptides and/or proteins by DIA or mPRM mass spectrometry includes, but is not limited to, the following steps:

(53) 1. Protein extraction: Proteins are extracted from samples. If necessary, this can include the use of detergents, mechanical force, heat, chaotropes or other means. The suitable protein extraction protocol depends on the sample and the skilled person will know which one is suitable for a specific mixture.

(54) 2. Reduction of disulfide bonds: Prior to digestion, disulfide bonds between cysteine residues of proteins are reduced. This serves to make more residues accessible for digestion and prevents two peptides from being connected, which would result in complex fragment ion spectra. Preferably, dithiothreitol (DTT) or TCEP (tris(2-carboxyethyl)phosphine hydrochloride) is used for this step.

(55) 3. Alkylation of free cysteines: In order to avoid re-formation of disulfide bonds the free cysteines are alkylated, preferably with iodoacetamide or iodoacetic acid. The reaction is carried out in the dark to avoid formation of side products and further modifications.

(56) 4. Protein digestion: Proteins in the sample are cleaved into peptides, preferably using a protease such as trypsin and/or Lys-C. The reaction is preferably carried out at 37° C. in a suitable buffer.

(57) 5. Peptide purification: The peptides are purified prior to MS analysis. Preferably they are desalted, typically using a C18 stationary phase.

(58) 6. Liquid chromatography: Several microliters of sample are loaded onto a liquid chromatography column and are separated, preferably by increasing hydrophobicity via reversed-phase LC and a gradient of increasing acetonitrile concentrations.

(59) 7. MS analysis: Peptides elute, are ionized and subjected to MS analysis via either a DIA- or an mPRM-method. Fragment ions are detected on a high resolution instrument and combined fragment ion spectra are stored.

(60) 8. Data analysis: Quantification is usually done based on MS2 level data. Spectra can be searched against a spectral library, or theoretical spectra, or can be mined using SRM-like transitions to identify and quantify peptides and/or proteins. Examples of specialized software for these analyses are Spectronaut and Spectronaut Pulsar (Biognosys AG), DIA-Umpire (Tsou, C. C., Tsai, C. F., Teo, G., Chen, Y. J., Nesvizhskii, A. I., 2016. Untargeted, spectral library-free analysis of data independent acquisition proteomics data generated using Orbitrap mass spectrometers. Proteomics, (15-16), pp.2257-2271.) or OpenSWATH. Fragments from the same peptide are subsequently arranged in SRM-like peak groups: The signal corresponds to the intensity of each fragment monitored over time in sequential spectra. Fragments of the same peptide will produce similarly shaped elution peaks with maxima at identical retention times (RT). These SRM-like peak groups can then be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. Alternatively, data analysis approaches which do not rely on DDA-based spectral libraries, or do not rely on them exclusively, can be applied for peptide and/or protein identification and/or quantification. Analysis software, such as Spectronaut Pulsar, supports these data analysis workflows. Moreover, quantification can be done on MS1 and/or MS2 level. The details and the optimal implementation of the standard protocol depend on the purpose of the experiment, the properties of the sample and the proteins of interest, and the instruments used, among other factors. The skilled person will know how to implement and alter the standard workflow to best suit a specific setup.
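The arrangement of fragments into SRM-like peak groups can be sketched as follows. This is a simplified illustration and not the procedure of any of the software packages named above: fragment traces whose apex lies close to the median apex retention time are kept, and their areas are integrated with the trapezoidal rule. The trace values, the ion names, and the retention time tolerance are hypothetical.

```python
from statistics import median


def apex_rt(rts, intensities):
    """Retention time at which a fragment trace reaches its maximum."""
    best = max(range(len(intensities)), key=intensities.__getitem__)
    return rts[best]


def peak_area(rts, intensities):
    """Trapezoidal area under a fragment trace."""
    return sum(
        (rts[i + 1] - rts[i]) * (intensities[i] + intensities[i + 1]) / 2
        for i in range(len(rts) - 1)
    )


def peak_group(traces, rts, rt_tol=0.05):
    """Return {ion: area} for fragments whose apex lies within rt_tol
    (same unit as rts) of the median apex of all traces."""
    apexes = {ion: apex_rt(rts, trace) for ion, trace in traces.items()}
    centre = median(apexes.values())
    return {
        ion: peak_area(rts, traces[ion])
        for ion, rt in apexes.items()
        if abs(rt - centre) <= rt_tol
    }


# Hypothetical traces (retention time in minutes, arbitrary intensity units).
RTS = [10.0, 10.1, 10.2, 10.3, 10.4]
TRACES = {
    "y5": [0, 50, 100, 50, 0],
    "b4": [0, 20, 40, 20, 0],
    "y7": [0, 5, 10, 30, 60],  # does not co-elute, e.g. an interfering signal
}
group = peak_group(TRACES, RTS)
print(sorted(group))
```

Applying the same grouping to the fragments of the unlabeled and of the labeled peptide yields two sets of areas that can be compared for quantification.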

(61) The following paragraphs guide through the details of selecting a suitable label and label position for selectively double-labeled reference peptides:

(62) To produce double-labeled reference peptides, the labels are introduced selectively at certain positions within the peptide sequence. The label position is crucial to ensure an optimal balance between the information content provided (which is biggest for terminal labels) and other parameters, e.g. total label cost. Therefore, the present invention relates to a method for selecting the label and label position of at least one suitable reference peptide. A method for the selection of optimal label positions to produce double-labeled peptides can for example contain the following steps (FIG. 7):

(63) In a first step, a spectral library is selected. Moreover, any additional input data required for the optimization according to the desired parameters will be supplied. E.g. if the optimization occurs according to total label cost, the label cost for each label is obtained. In addition, the label positions to be considered during the optimization process need to be defined. This includes how many amino acid positions within the terminus will be considered, as well as if both termini of the peptide will be optimized according to the same parameters.

(64) In a second step, the most advantageous amino acid position for labeling within the considered amino acids is determined for each peptide in the spectral library. During this step different parameters can be balanced to find the optimal label, e.g. information content of labeled fragment ions, total label cost which reflects the availability of the label and the complexity of its incorporation etc. For the optimization according to total label cost, the label with the lowest label cost but yielding fragment ions with maximum information content would be selected.

(65) Optionally, the method could further include any of the following features: an estimation of the total label cost for the selected labels and label positions, a simulation of fragment collisions, a calculation of label and label position frequencies, and/or a report of the results.

(66) In FIG. 8 an example of a calculation mode for an optimal label position analysis according to total label cost is displayed. Further, non-limiting details are listed in Example 3. To produce double-labeled reference peptides based on a spectral library wherein the positions for the heavy amino acid labels are optimized according to total label cost, a list of the label costs for all labels is needed. When selecting the labels and label positions, amino acids within a selected number (n.sub.globalMaxVal) of positions from each terminus are considered for labeling. If a peptide comprises fewer amino acids than double n.sub.globalMaxVal, then instead all amino acids within n.sub.pepMaxVal positions from each terminus are considered for labeling, wherein n.sub.pepMaxVal corresponds to the peptide length divided by two and rounded down to the next lowest integer. For each peptide the amino acid with the lowest label cost will be selected from the stretch of considered amino acids (n.sub.i). The label costs of all labels for each peptide will then be summed up to estimate the total label cost for the specific n.sub.globalMaxVal. If the positioning of the labels is optimized according to a specific parameter, then the amino acids with the best “values” for the respective parameter should be preferred over other amino acids. As a consequence they are picked more frequently for labeling. FIG. 9 illustrates this: Optimal label positions were analyzed for double-labeling all peptides in a human plasma spectral library with amino acids containing heavy elemental isotopes. The label positions were optimized according to lowest label cost, e.g. the labeled amino acids with the lowest price per millimole from a certain vendor were preferred. This in turn also results in the lowest total label cost, i.e. the price for all labels used to label a certain amount of a specific set of proteins and/or peptides with a specific n.sub.globalMaxVal. 
The character “n.sub.i” denotes the length of terminal amino acid stretches that were considered for positioning the label. E.g. “n.sub.i=4” indicates that a first label can be incorporated at the position of any of the 4 most N-terminal amino acids, and a second label can be incorporated at the position of any of the 4 most C-terminal amino acids. The frequency with which each amino acid was picked for labeling all peptides of the spectral library is displayed for n.sub.globalMaxVal values from 1 to 22 (with 22 corresponding to half the length of the longest peptide in the library, rounded down to the next integer). The higher the n.sub.globalMaxVal, the more positions are considered for labeling and the closer a situation is approached where primarily label positions are picked which correspond to alanine, glycine, arginine, leucine, and valine (FIG. 9). These are the five amino acids with the lowest label cost in this specific analysis.

(67) Furthermore, we discovered that for the analysis displayed in FIG. 9, the decrease in total label cost was considerable for n.sub.globalMaxVal equal to 2, 3, 4, and 5. For higher n.sub.globalMaxVal the additional savings became smaller and a higher loss of information content occurred due to small fragment ions not being considered in the analysis.

(68) The reference peptides of the present invention can further carry post translational modification(s) (PTM(s)). The PTMs of interest can be of biological importance to study signaling cascades via protein phosphorylation for instance or to reflect the chemical treatment of the sample during sample preparation. These can be any modification occurring on peptides and/or proteins. Preferably PTMs are selected from phosphorylation, acetylation, methylation, sulfation, hydroxylation, lipidation, ubiquitylation, sumoylation, glycosylation, oxidation, and carbamidomethylation. Preferably, the post translational modification(s) occurs on peptides and/or proteins in nature, or is introduced as part of a standard sample preparation workflow, e.g. as described in this application. For example, carbamidomethylation of cysteines is commonly introduced during sample preparation by reducing disulfide bonds and alkylating residues with iodoacetamide. Other common post translational modifications that are introduced during sample preparation are e.g. carbamylation due to urea present in the sample, or methionine oxidation.

(69) Labeled peptides and their unlabeled counterparts contain the same post translational modification(s) at the same position(s) to ensure that both peptide variants exhibit similar behavior during sample preparation and LC-MS analysis. Thus, the reference peptide corresponds to the unlabeled peptide as present in the sample including any modifications, but with the respective isotopically labeled amino acids. The present invention can be particularly useful for the analysis of peptides with post translational modifications for which only few fragment ions are available for quantification, e.g. phospho-peptides. By minimizing or eliminating fragment overlap we can ensure that available N-terminal and C-terminal fragment ions can be used for identification and quantification. In some cases only a single b- or y-ion differentiates between isoforms of phospho-peptides where e.g. the phosphorylation can occur on either of two neighboring amino-acids. In such instances the present invention enables the unequivocal assignment of the modified amino acid. Chemical synthesis of peptides is usually carried out by attaching amino acid building blocks to each other. To introduce an isotopically labeled amino acid, the building block comprises the amino acid containing the corresponding heavy isotopes. To introduce an amino acid carrying a post translational modification, the building block usually already comprises the amino acid and the PTM. Building blocks are most often introduced by coupling the carboxyl group of an amino acid building block to the N-terminus of the peptide being formed. Thus, chemical synthesis usually starts at a peptide's C-terminus and proceeds to its N-terminus. To avoid side reactions during peptide synthesis, some of the amino acid building block's reactive groups have to be protected. Therefore, the individual amino acid building blocks are reacted with protecting groups before they are added to the nascent peptide. 
Once the building block has been integrated into the peptide, its N-terminus is deprotected to allow for incorporation of the next amino acid. After the peptide is fully formed, any remaining protecting groups are removed.

(70) Applications:

(71) The methods and substances of the present invention can be applied to the quantification of a variety of samples, including different cell or tissue types, environmental samples, or bodily fluids. In a preferred embodiment the methods and substances of the present invention are applied to the quantification of human plasma proteins (FIGS. 10, 11, 12).

(72) In a first aspect, we analyzed the fragment overlap occurring during DIA-based quantification of human plasma peptides and/or proteins with sets of single-labeled synthetic peptides (FIG. 1, FIG. 10). To this end human plasma was subjected to in solution digestion: 10 μl of plasma were diluted in 75 μl 10 M urea and 0.1 M ammonium bicarbonate. The samples were reduced with 5 mM TCEP for 1 h at 37° C. Subsequently, the plasma was alkylated with 25 mM iodoacetamide for 20 min at 21° C. The samples were diluted to 2 M urea and digested with trypsin at a ratio 1:100 (enzyme to protein) at 37° C. for 15 h. The samples were centrifuged at 20,000 g at 4° C. for 10 min. The peptides were desalted using C18 MacroSpin columns from The Nest Group according to the manufacturer's instructions. After drying, the peptides were resuspended in 1% ACN and 0.1% formic acid. Sets of reference peptides, each carrying a C-terminal heavy amino acid label (Arg10 or Lys8), were added to all of the samples. The reference peptides were derived from plasma protein sequences and thus allowed for the quantification of a number of endogenous plasma proteins.

(73) Two micrograms of each sample were analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 A C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides were separated by a 1 h segmented gradient from 1 to 52% acetonitrile (ACN) in 60 min with 0.1% formic acid at 250 nl/min, followed by a linear increase to 90% ACN in 2 min and 90% for 10 min. The DIA-MS method consisted of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3*10.sup.6 or 60 ms injection time). Then, 14 DIA windows were acquired at 30,000 resolution (AGC target 3*10.sup.6 and auto for injection time) spanning 350-1650 m/z. Stepped collision energy was 10% at 27%. The spectra were recorded in profile mode. The default charge state for the MS2 was set to 4.

(74) The spectra were processed to extract peptide and protein identifications and quantitative values using specialized software such as Spectronaut (Biognosys AG). To demonstrate the fragment overlap occurring in combined fragment ion spectra between N-terminal b-ions from endogenous, unlabeled peptide and single-labeled synthetic reference peptides, we further analyzed spectra from single peptides.

(75) Combined fragment ion spectra for three peptides showing an intense signal were analyzed. FIG. 1 shows DIA data for one peptide present in an unlabeled, as well as a labeled variant carrying a modified lysine residue (K8) as single C-terminal label. In a first part of the Figure a section of an MS1 spectrum is displayed (FIG. 1a). The 50 Th mass window containing both the unlabeled and the labeled precursor of peptide DIASGLIGPLIIC[+C2+H3+N+O]K is marked (FIG. 1a). All ions inside this swath were fragmented and a combined fragment ion spectrum comprising fragment ions from both the unlabeled and the labeled peptide was acquired (FIG. 1b). The fragment overlap for different fragment ions was analyzed. Fragment ions from the unlabeled (light) precursor are marked with white triangles, fragment ions from the labeled (heavy) precursor are marked with black triangles, and shared b-ions are marked with pointed circles. A mass shift between corresponding fragment ions from unlabeled and labeled peptides due to the C-terminal label is displayed as a line connecting two triangles. All y-ions show such a mass shift (for y4+ the unlabeled signal is not marked). On the other hand, fragment overlap was observed for all b-ions in the spectrum. This affects quantification: if intensity at the apex of the first monoisotopic peak is compared, the y-fragment-ions have a light (unlabeled) to heavy (labeled) ratio (L/H ratio) <0.5 which reflects the ratio between the light and heavy precursor peptide in the MS1 spectrum (FIG. 1a). However, if b-ions are to be used for quantification, they show an L/H ratio of 1 since the same shared fragment ion peaks are compared between light and heavy peptides. Thus, if the b-ions are considered in the calculation they will skew the L/H ratios towards a higher amount of unlabeled peptide. Furthermore, due to the fragment overlap of b-ions spectra of light and heavy peptides comprise shared fragments. 
All these problems, fragment overlap leading to inaccurate quantitative values or unused ions and shared fragments, do not occur if selectively double-labeled peptides are used instead of single-labeled peptides.
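The skew introduced by shared b-ions can be made concrete with a small numerical sketch. All intensities below are hypothetical and chosen only so that the y-ions reflect a precursor L/H ratio of 0.12; with a single C-terminal label, each b-ion of both variants is read from the same peak, so the same value appears for light and heavy.

```python
from statistics import mean

# Hypothetical apex intensities (arbitrary units).
LIGHT = {"y5": 120.0, "y6": 90.0, "y7": 150.0, "b4": 800.0, "b5": 600.0}
HEAVY = {"y5": 1000.0, "y6": 750.0, "y7": 1250.0, "b4": 800.0, "b5": 600.0}


def lh_ratios(light, heavy):
    """Light-to-heavy intensity ratio for every fragment ion."""
    return {ion: light[ion] / heavy[ion] for ion in light}


ratios = lh_ratios(LIGHT, HEAVY)
y_only = mean([r for ion, r in ratios.items() if ion.startswith("y")])
all_ions = mean(list(ratios.values()))

print(f"mean L/H ratio, y-ions only: {y_only:.3f}")
print(f"mean L/H ratio, b- and y-ions: {all_ions:.3f}")
```

Including the shared b-ions pulls the averaged ratio towards 1 and thus towards a higher apparent amount of unlabeled peptide, whereas excluding them leaves fewer ions for quantification.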

(76) In a preferred embodiment the methods and substances of the present invention are applied to the quantification of human plasma proteins (FIG. 11, FIG. 12). In a first step proteins are extracted from a plasma sample and solubilized. The proteins are then subjected to reduction and alkylation, prior to cleavage into peptides, preferably using a protease, typically trypsin and/or Lys-C. The digested endogenous, unlabeled peptides are then pooled with synthetic, selectively double-labeled reference peptides. The peptide mixture is desalted, typically using C18 stationary phase. Peptides are separated via liquid chromatography, typically by increasing hydrophobicity via a reversed-phase column and a gradient of increasing acetonitrile concentrations. Peptides elute, are ionized and subjected to MS analysis via either a DIA- (FIG. 11) or an mPRM-method (FIG. 12). Fragment ions are detected on a high resolution instrument and combined fragment ion spectra containing several precursors are stored. Since the reference peptides contain two strategically positioned labels, most of their fragments will be labeled. Thus, most corresponding fragment ions from unlabeled endogenous and labeled reference peptides will have distinct masses and fragment overlap between peptides is greatly reduced. Based on MS2 data peptides and/or proteins will be identified and quantified using specialized software. Alternatively, other data analysis workflows mentioned in the text can be employed, e.g. quantification based on MS1 level data. Using this workflow the endogenous, unlabeled peptides can be quantified relative to the labeled reference peptides. If the concentration of the labeled reference peptides within the sample is known, this further enables absolute quantification of the unlabeled peptides. Proteins are then quantified based on the amount of their peptides. 
In a second aspect, we analyzed the fragment overlap occurring during DIA- and mPRM-based quantification of human plasma peptides and/or proteins with sets of single-labeled and double-labeled synthetic peptides (FIGS. 11-16). FIGS. 13, and 15 show mPRM and DIA data, respectively, for peptide DIASGLIGPLIIC[+C2+H3+N+O]K present in an unlabeled, as well as a labeled variant carrying a modified lysine residue (K8) as single C-terminal label. Precursors of both peptide variants were fragmented and a combined fragment ion spectrum comprising fragment ions from both the unlabeled and the labeled peptide was stored. Using analysis software, we compared fragment ion signals attributed to the unlabeled and the labeled peptide (FIGS. 13a, 15a). The fragment overlap was analyzed. Fragment ions from the unlabeled (light) precursor are marked with white triangles, fragment ions from the labeled (heavy) precursor are marked with black triangles, and shared b-ions are marked with pointed circles. A mass shift between corresponding fragment ions from unlabeled and labeled peptides due to the C-terminal and/or N-terminal label is displayed as a line connecting two triangles. Symbols (* or #) mark mass shifts due to the C-terminal, or the N-terminal label, respectively. All y-ions show a mass shift. On the other hand, fragment overlap was observed for all b-ions in the spectrum. This affects the relative fragment ion intensities which differ between the respective peptide variants (FIGS. 13b, 15b). Moreover, it affects quantification: The y-ions show unlabeled-to-labeled intensity ratios from 0.11 to 0.18 which reflects the ratio between the light and heavy precursor (FIGS. 13c, 15c). However, if b-ions are to be used for quantification, they show an L/H ratio of 1 since the same shared fragment ion peaks are compared between light and heavy peptides. Thus, if the b-ions are considered in the calculation they will skew the L/H ratios towards a higher amount of unlabeled peptide. 
On the other hand, if all b-ions are ignored, the quantification is less robust compared to a case where all fragments correctly represent the ratios between unlabeled and labeled precursors present in the sample. All these problems, fragment overlap leading to inaccurate quantitative values or unused ions and shared fragments, do not occur if selectively double-labeled peptides are used instead of single-labeled peptides.
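When the reference peptide is added in a known amount, the step from relative to absolute quantification reduces to a multiplication. The following sketch assumes that fragment peak areas have already been extracted for both variants and that no fragments are shared; the function name, the use of summed areas, and all numbers are illustrative assumptions.

```python
def absolute_amount(light_areas, heavy_areas, spiked_amount):
    """Estimate the amount of the unlabeled peptide.

    light_areas / heavy_areas map fragment ion names to peak areas;
    spiked_amount is the known amount of labeled reference peptide.
    The result carries the same unit as spiked_amount.
    """
    common = light_areas.keys() & heavy_areas.keys()
    if not common:
        raise ValueError("no fragment ions measured for both variants")
    ratio = sum(light_areas[i] for i in common) / sum(heavy_areas[i] for i in common)
    return ratio * spiked_amount


# Hypothetical peak areas; b- and y-ions can both be used because the
# double-labeled reference peptide shares no fragments with the light peptide.
light = {"b4": 15.0, "y5": 30.0, "y7": 45.0}
heavy = {"b4": 100.0, "y5": 200.0, "y7": 300.0}

amount = absolute_amount(light, heavy, spiked_amount=50.0)  # e.g. 50 fmol spiked
print(f"estimated endogenous amount: {amount:.2f} fmol")
```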

(77) FIGS. 14 (mPRM data) and 16 (DIA data) show the corresponding plots for the unlabeled and the double labeled variants of the peptide. Both b- and y-ions show no fragment overlap (FIG. 14a, FIG. 16a). Both peptide variants produce similar relative fragment ion intensities (FIG. 14b, FIG. 16b). Moreover, b- and y-ions show similar unlabeled-to-labeled intensity ratios which reflect the ratio between the light and heavy precursor (FIGS. 14c, 16c).

(78) Moreover, we re-analyzed data from the DIA experiments described above (FIG. 15, FIG. 16) to test if using single-labeled reference peptides negatively influenced the identification of the unlabeled peptides. In our setup the reference peptides were present in higher amounts than the endogenous, unlabeled peptides. If a reference peptide with a single C-terminal label is used, some of its fragment ions overlap with fragment ions of the less abundant, unlabeled peptide. Therefore, the relative fragment ion intensities were mainly skewed for the less abundant, unlabeled peptide (FIG. 15). We analyzed the impact of these skewed relative fragment intensities on the peptide identification score. To this end we analyzed Spectronaut's intensity correlation score. The intensity correlation score takes into account the expected relative fragment ion intensities based on the spectral library and the fit with the actual relative fragment intensities of the measured peak. It is used for scoring of peptide and peak identification and thus is a good measure for how much relative fragment intensities altered by fragment overlap will affect peptide and/or protein identification. We analyzed the intensity correlation score for five peptides measured in the DIA experiments described above (FIGS. 15, 16, 17). FIG. 17a shows data from the DIA experiment and depicts the intensity correlation score for the unlabeled peptide DIASGLIGPLIIC[+C2+H3+N+O]K averaged over 3 replicates. If double-labeled reference peptides were used, the average intensity correlation score was significantly higher than when reference peptides with a single C-terminal label were used (t-test, p<0.05). This also held true for other peptides. The average intensity correlation score for 5 unlabeled peptides was significantly higher in an experimental setup using double-labeled reference peptides compared to reference peptides with a single C-terminal label (FIG. 17b).

(79) Experimental Part:

EXAMPLE 1

Quantification of Human Plasma Proteins Using Selectively Double-Labeled Peptides

(80) See FIG. 11 for a scheme of the workflow.

(81) Sample Preparation:

(82) Human plasma will be digested using in solution digestion: 10 μl of plasma will be diluted in 75 μl 10 M urea and 0.1 M ammonium bicarbonate. The samples will be reduced with 5 mM TCEP for 1 h at 37° C. Subsequently, the plasma will be alkylated with 25 mM iodoacetamide for 20 min at 21° C. The samples will be diluted to 2 M urea and digested with trypsin at a ratio 1:100 (enzyme to protein) at 37° C. for 15 h. The samples will be centrifuged at 20,000 g at 4° C. for 10 min. The peptides will be desalted using C18 MacroSpin columns from The Nest Group according to the manufacturer's instructions. After drying, the peptides will be resuspended in 1% ACN and 0.1% formic acid.

(83) Preparation of Labeled Reference Peptides:

(84) The reference peptide mix will contain synthetic double-labeled peptides covering amino acid sequences of interest, the unlabeled, endogenous version of which will be quantified within the samples. These dried, labeled reference peptides will be dissolved in 20 μl dissolution buffer before adding 100 μl of LC solution to it. Dissolution will be assisted by vortexing and/or sonication. Two microliters of this reference peptide mix will be added to each sample.

(85) Mass Spectrometry Analysis:

(86) Two micrograms of each sample will be analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 A C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides will be separated by a 1 h segmented gradient from 1 to 52% ACN in 60 min with 0.1% formic acid at 250 nl/min, followed by a linear increase to 90% ACN in 2 min and 90% for 10 min. The DIA-MS method will consist of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3*10.sup.6 or 60 ms injection time). Then, 14 DIA windows will be acquired at 30,000 resolution (AGC target 3*10.sup.6 and auto for injection time) spanning 350-1650 m/z. Stepped collision energy will be 10% at 27%. The spectra will be recorded in profile mode. The default charge state for the MS2 will be set to 4.

(87) Data Analysis:

(88) Peptide and protein identification, as well as quantification will be done using any suitable software, such as for example Spectronaut, OpenSWATH, SpectroDive or MaxQuant.

EXAMPLE 3

Method for Selecting Cheapest Amino Acid for Labeling and Estimate Total Label Costs

(89) A method was created to select optimal amino acids and positions for labeling. Furthermore, the method estimated the total label cost for double-labeling a set of peptides. It offered the following features:

(90) In a first step three pieces of input data were accepted, the first containing the label prices, i.e. the price of amino acids containing heavy elemental isotopes as stated by a certain vendor, the second containing the molecular weight of all 20 amino acids, and the third being a spectral library for human plasma.

(91) In a second step the label prices and the amino acid molecular weight data were used to estimate the cost per mmol of each labeled amino acid. Furthermore, all unique, unmodified peptide sequences were extracted from the spectral library.
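The second step can be sketched as follows, assuming that the vendor prices are stated per gram and that modified residues in the library are written with a bracketed annotation such as C[+C2+H3+N+O]. Both assumptions, the function names, and the prices are hypothetical; the molecular weights are the average values of the unlabeled amino acids.

```python
def cost_per_mmol(price_per_gram, mol_weight):
    """Convert a price per gram into a price per millimole.

    price per gram * g/mol gives the price per mole; dividing by 1000
    gives the price per millimole.
    """
    return price_per_gram * mol_weight / 1000.0


def unique_unmodified(sequences):
    """Keep each sequence once and drop entries carrying a bracketed modification."""
    return sorted({seq for seq in sequences if "[" not in seq})


# Average molecular weights (g/mol) and hypothetical prices per gram.
MOL_WEIGHT = {"G": 75.07, "A": 89.09, "V": 117.15, "L": 131.17, "R": 174.20}
PRICE_PER_GRAM = {"G": 900.0, "A": 1000.0, "V": 1500.0, "L": 1400.0, "R": 1200.0}

LABEL_COST = {aa: cost_per_mmol(PRICE_PER_GRAM[aa], MOL_WEIGHT[aa]) for aa in MOL_WEIGHT}

library = ["AGLVR", "AGLVR", "DIASGLIGPLIIC[+C2+H3+N+O]K", "GLVAR"]
peptides = unique_unmodified(library)
print(LABEL_COST)
print(peptides)
```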

(92) In a third step a value for n.sub.globalMaxVal was specified. Herein n.sub.globalMaxVal defines a positive integer that is set by the experimenter, e.g. n.sub.globalMaxVal=4. The highest possible value for n.sub.globalMaxVal is equal to the length of the longest peptide in the analyzed peptide spectral library divided by two, and rounded down to the nearest lower positive integer if the value was not an integer.

(93) In a fourth step, the value for n.sub.globalMaxVal, the values for label cost per mmol, and the peptide sequences from the spectral library were used to select the cheapest amino acid for labeling, to estimate the total label cost, and to calculate the frequency with which each amino acid was labeled for the set of peptides for different n.sub.globalMaxVal values. For each peptide stretches of n.sub.i amino acids from each terminus were considered. The n.sub.i values were peptide-specific and related to an amino acid stretch starting from the terminus of a peptide, e.g. a value of n.sub.i=1 comprised the terminal amino acid, n.sub.i=2 comprised the terminal amino acid and the amino acid one removed from the terminus, and so forth. The cheapest amino acid and the total label costs were determined as follows:

(94) For each peptide sequence extracted from the library the peptide-specific value for n.sub.i was equal to the lower of two values: either the value of the user-defined positive integer n.sub.globalMaxVal, or the value of n.sub.pepMaxVal which corresponds to the number of amino acids in the peptide divided by two and rounded down to the nearest lower integer if the value was not an integer. The position and the cost of the first label for said peptide were determined by selecting the amino acid with the lowest label cost per millimole from a stretch of amino acids of length n.sub.i starting from the C-terminus. The position and the cost of the second label were determined by applying the same procedure to the N-terminus. This was repeated for all peptide sequences. The label costs for all peptide sequences were summed up to obtain the total label cost for the selected n.sub.globalMaxVal value.

(95) This calculation was repeated for different integer values of n.sub.globalMaxVal between 1 and the maximum possible value (length of longest peptide in the library divided by two and rounded down to the next lowest integer). As a result, a separate total label cost was calculated for each n.sub.globalMaxVal value.
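The selection and summation described in the fourth step can be expressed compactly. The following is a minimal sketch under stated assumptions rather than the implementation that was actually used: the function names, the label costs, and the two example sequences are hypothetical, and ties between equally cheap residues are resolved in favor of the position closest to the respective terminus, which the description does not specify.

```python
from collections import Counter


def pick_labels(peptide, cost, n_global_max):
    """Return 0-based (N-terminal, C-terminal) positions of the cheapest labels."""
    if len(peptide) < 2:
        raise ValueError("peptide too short for two labels")
    # n_i is the lower of n_globalMaxVal and n_pepMaxVal = floor(length / 2).
    n_i = min(n_global_max, len(peptide) // 2)
    n_window = range(n_i)  # from the N-terminus inward
    c_window = range(len(peptide) - 1, len(peptide) - 1 - n_i, -1)  # from the C-terminus inward

    def by_cost(pos):
        return cost[peptide[pos]]

    return min(n_window, key=by_cost), min(c_window, key=by_cost)


def total_label_cost(peptides, cost, n_global_max):
    """Sum the cost of both labels over all peptides and count the labeled residues."""
    total, picked = 0.0, Counter()
    for peptide in peptides:
        for pos in pick_labels(peptide, cost, n_global_max):
            total += cost[peptide[pos]]
            picked[peptide[pos]] += 1
    return total, picked


def sweep(peptides, cost):
    """Repeat the calculation for n_globalMaxVal = 1 .. floor(longest length / 2)."""
    longest = max(len(p) for p in peptides)
    return {n: total_label_cost(peptides, cost, n) for n in range(1, longest // 2 + 1)}


# Hypothetical label costs per mmol (arbitrary currency units).
COST = {"A": 1.0, "G": 1.5, "L": 3.0, "I": 4.0, "K": 5.0,
        "D": 6.0, "S": 7.0, "P": 8.0, "C": 9.0}
PEPTIDES = ["AGLK", "DIASGLIGPLIICK"]

results = sweep(PEPTIDES, COST)
for n, (total, picked) in results.items():
    print(n, total, dict(picked))
```

Because a larger n.sub.globalMaxVal only widens the stretch of residues considered, the total label cost obtained in this way cannot increase with n.sub.globalMaxVal.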

(96) In a fifth step, the resulting total label costs for labeling the peptide sequences were displayed for each n.sub.globalMaxVal value. Furthermore, the frequencies with which each of the 20 amino acids had been selected for labeling were calculated (FIG. 8).

EXAMPLE 4

Exclusion of Modified Amino Acids and Analysis of Fragment Collisions

(97) A method for the selection of labels and label positions will be created which will offer the following features in addition to the label cost calculation features of Example 3:

(98) After the optimization of label positions according to total label cost as in Example 3, the present method will in a first aspect select the amino acid with the next lowest label cost for labeling if the selected amino acid is an amino acid that is often post-translationally modified in the experimental setup. In a second aspect the method will simulate the fragment masses that would be produced by the selected double-labeled peptide sequences. Based on the simulation the method will further analyze how many fragment collisions occur, i.e. how many fragment ions from the double-labeled precursor overlap with any other fragment ions of the unlabeled precursor. If the number lies above a certain threshold, the amino acid with the next lowest label cost with a number of fragment collisions which lies below the threshold will instead be selected for labeling if such a residue is available.
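The two additional selection criteria can be sketched as a fallback over the residues of one terminal stretch, ranked by label cost. The sketch below is an assumption about how such a rule might be written: the function name, the set of modification-prone residues, the collision counts, and the treatment of a count equal to the threshold as acceptable are all hypothetical. The collision count is supplied by the caller, e.g. from a fragment mass simulation.

```python
def choose_residue(window, cost, ptm_prone, collisions, max_collisions):
    """Pick the label position within one terminal stretch.

    window         -- list of (position, amino acid) tuples in the stretch
    cost           -- label cost per amino acid
    ptm_prone      -- amino acids to avoid because they are often modified
    collisions     -- callable returning the simulated number of fragment
                      collisions when the label is placed at a position
    max_collisions -- highest acceptable number of collisions
    """
    ranked = sorted(window, key=lambda item: cost[item[1]])
    allowed = [item for item in ranked if item[1] not in ptm_prone]
    for pos, _aa in allowed:
        if collisions(pos) <= max_collisions:
            return pos
    # No residue satisfies the collision criterion: keep the cheapest allowed
    # residue, or the cheapest residue overall if all are modification-prone.
    return (allowed or ranked)[0][0]


# Hypothetical inputs for an N-terminal stretch of four residues.
WINDOW = [(0, "M"), (1, "A"), (2, "L"), (3, "G")]
COST = {"M": 1.0, "A": 2.0, "G": 3.0, "L": 4.0}
SIMULATED = {0: 0, 1: 5, 2: 0, 3: 1}  # collisions per label position

position = choose_residue(WINDOW, COST, {"M"}, SIMULATED.__getitem__, max_collisions=2)
print("label position:", position)
```

In this example methionine is skipped as modification-prone, alanine is skipped because of its simulated collisions, and glycine is selected as the next cheapest residue.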

EXAMPLE 5

Set of Synthetic Double-Labeled Human Plasma Peptides

(99) A list of tryptic sequences extracted from a human plasma spectral library will be analyzed. The value for n.sub.globalMaxVal will be set equal to 4. For each peptide, stretches of n.sub.i amino acids from each terminus will be considered. The n.sub.i values will be peptide-specific and relate to an amino acid stretch starting from the terminus of a peptide, e.g. a value of n.sub.i=1 comprises the terminal amino acid, n.sub.i=2 comprises the terminal amino acid and the amino acid one removed from the terminus, and so forth.

(100) For each peptide sequence extracted from the library, the peptide-specific value for n.sub.i will be equal to the lower of two values: either the value of the user-defined positive integer n.sub.globalMaxVal, or the value of n.sub.pepMaxVal, which corresponds to the number of amino acids in the peptide divided by two, rounded down to the nearest integer if the result is not an integer.

(101) For each peptide a first amino acid having the lowest label cost from the n.sub.i most C-terminal amino acids, and a second amino acid having the lowest label cost from the n.sub.i most N-terminal amino acids will be selected for labeling. n.sub.i will adopt values 1, 2, 3, and 4 for different peptides, depending on their length, e.g. for a peptide of six amino acids n.sub.i will be 3, for a peptide of seven amino acids n.sub.i will be 3, for a peptide of eight amino acids, n.sub.i will be 4.
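The determination of n.sub.i and the selection of the two label positions may be sketched as follows. This is an illustration under stated assumptions. The label cost values are arbitrary placeholders and not actual costs, the function names are hypothetical, and ties in label cost are resolved in favor of the residue closest to the respective terminus, which the description above does not specify.

```python
def stretch_length(peptide, n_global_max=4):
    """n_i = min(n_globalMaxVal, n_pepMaxVal), with n_pepMaxVal = floor(len / 2)."""
    return min(n_global_max, len(peptide) // 2)


def select_label_positions(peptide, label_cost, n_global_max=4):
    """Return 0-based (N-terminal, C-terminal) label positions.

    Within the n_i most N-terminal and the n_i most C-terminal residues,
    the residue with the lowest label cost is chosen.
    """
    n_i = stretch_length(peptide, n_global_max)
    if n_i < 1:
        raise ValueError("peptide too short for two label positions")
    n_stretch = range(0, n_i)
    # Reversed so that ties resolve toward the C-terminus (an assumption).
    c_stretch = reversed(range(len(peptide) - n_i, len(peptide)))
    n_pos = min(n_stretch, key=lambda i: label_cost[peptide[i]])
    c_pos = min(c_stretch, key=lambda i: label_cost[peptide[i]])
    return n_pos, c_pos


# Placeholder costs in arbitrary units, for illustration only.
example_cost = {"G": 1.0, "L": 2.0, "T": 5.0, "H": 9.0, "K": 3.0}
print(stretch_length("GLTLHLK"))                        # seven residues
print(select_label_positions("GLTLHLK", example_cost))  # (N-term, C-term)
```

Because n.sub.i never exceeds half the peptide length, the N-terminal and C-terminal stretches cannot overlap, so the two selected positions are always distinct.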

(102) The most appropriate 1, 2, 3, 4, 5 or more peptides per protein will be selected based on labeling cost and other criteria (such as peptide length, hydrophobicity and so forth). Furthermore, total label costs for n.sub.globalMaxVal will be estimated. Special selection criteria will apply in case fragment collisions occur or in case the selected amino acid is easily modified. The set of quantified, double-labeled peptides corresponding to the data for n.sub.globalMaxVal=4 will be synthesized, wherein the labels are the designated amino acids containing .sup.13C and/or .sup.15N.

(103) The set of synthetic double-labeled peptides will be diluted appropriately. A suitable amount of the double-labeled peptide mix will be added to a sample containing an unlabeled protein digest from human plasma. Fragment ion spectra for the combined peptide mixture will be acquired using a DIA method. Because the labeled peptides are added in known amounts, absolute peptide abundances in the unlabeled sample can then be determined using specialized software. Because the synthetic peptides contain two labels, their b- and y-ion series will have different masses from the corresponding ions of the unlabeled peptide. Thus, no fragment overlap will occur.
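The principle of deriving an absolute abundance from a reference peptide added in a known amount may be sketched as follows. This is a generic isotope dilution calculation and is not taken from the specialized software mentioned above. It assumes that the labeled and unlabeled peptides respond equally, and that the summed fragment ion peak areas of matching light and heavy fragments are used. The function name and the numbers are hypothetical.

```python
def endogenous_amount_fmol(light_areas, heavy_areas, spiked_heavy_fmol):
    """Estimate the endogenous (light) peptide amount in fmol.

    light_areas / heavy_areas: peak areas of matching fragment ions of the
    unlabeled and the double-labeled peptide; spiked_heavy_fmol: known amount
    of labeled reference peptide present in the analyzed sample.
    """
    light_total = sum(light_areas)
    heavy_total = sum(heavy_areas)
    if heavy_total <= 0:
        raise ValueError("heavy reference signal is required for quantification")
    return light_total / heavy_total * spiked_heavy_fmol


# Illustrative numbers only: the light signal is twice the heavy signal.
print(endogenous_amount_fmol([1.0e6, 1.0e6], [0.5e6, 0.5e6], 100.0))
```

Because both b- and y-ions of a double-labeled peptide are mass-shifted, both ion series can contribute to the summed areas without interference from the unlabeled peptide.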

EXAMPLE 6

Quantification of Human Plasma Peptides Using Selectively Double-Labeled Peptides

(104) See FIGS. 11 and 12 for a scheme of the workflow using DIA and mPRM methods, respectively. See FIGS. 13, 14 and FIGS. 15, 16, 17 for results from mPRM and DIA workflows, respectively.

(105) Sample Preparation:

(106) A human plasma sample was prepared by in-solution digestion: 10 μl of plasma was diluted in 90 μl of 10 M urea in 0.1 M ammonium bicarbonate. The sample was reduced with 5 mM dithiothreitol for 30 minutes at 37° C. Subsequently, the plasma was alkylated with 27 mM iodoacetamide for 30 minutes at 21° C., protected from light. The sample was diluted to a urea concentration below 1.5 M and digested with trypsin at a ratio of 1:50 (enzyme to protein) at 37° C. for 3 hours. The sample was centrifuged at 14,000×g at 4° C. for 15 minutes, before the peptides were desalted using a C18 MacroSpin 96-well plate (The Nest Group) according to the manufacturer's instructions. After complete drying in a vacuum concentrator, the plasma sample was re-suspended in 1% ACN and 0.1% formic acid and frozen at −20° C. until further use.

(107) Preparation of Labeled Reference Peptides:

(108) The reference peptide mix contained five synthetic, double-labeled peptides covering amino acid sequences of interest, the unlabeled, endogenous version of which will be quantified within the samples.

(109) Stock solutions of the individual peptides and a working solution of the reference peptide mix were prepared according to the following table:

(110) TABLE-US-00001

Peptide                              Stock Solution   Stock Solution   Dilution   Working Solution
                                     (fmol/μl)        (μl)                        (fmol/μl)
_PVA*FSVVPTAAAAVSLK*_                670776.7         1000             404.4      1658.6
_AG*LLRPDYALLGHR*_                   702996.7         1000             1209.8     581.1
_DIA*SGLIGPLIIC[+C2+H3+N+O]K*_       742360.4         1000             583.9      1271.3
_G*LTLHLK*_                          1389099.4        1000             2271.7     611.5
_EHV*AHLLFLR*_                       879725.5         1000             276.7      3179.3

Heavy labeled amino acids are marked by a star (*) following the amino acid letter.
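The relationship between the columns of the table can be checked with a short calculation. The sketch assumes that the "Dilution" column is a dilution factor, so that the working concentration equals the stock concentration divided by that factor. This reading is consistent with the tabulated values but is not stated explicitly. The function name is hypothetical.

```python
# (stock fmol/ul, dilution factor, reported working fmol/ul), copied from the table.
TABLE = {
    "PVA*FSVVPTAAAAVSLK*": (670776.7, 404.4, 1658.6),
    "AG*LLRPDYALLGHR*": (702996.7, 1209.8, 581.1),
    "DIA*SGLIGPLIIC[+C2+H3+N+O]K*": (742360.4, 583.9, 1271.3),
    "G*LTLHLK*": (1389099.4, 2271.7, 611.5),
    "EHV*AHLLFLR*": (879725.5, 276.7, 3179.3),
}


def working_concentration(stock_fmol_per_ul, dilution_factor):
    """Concentration after diluting the stock by the given factor."""
    return stock_fmol_per_ul / dilution_factor


for peptide, (stock, dilution, reported) in TABLE.items():
    computed = working_concentration(stock, dilution)
    print(f"{peptide:30s} computed {computed:8.1f}  reported {reported:8.1f}")
```

Under this assumption the computed and reported working concentrations agree to within rounding of the tabulated figures.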

(111) Of the working solution, 2 μl was added to 6 μl of plasma sample. Additionally, 0.8 μl of iRT peptides was added to the sample before injection. Purity of the double-labeled peptides with respect to single-labeled or non-labeled contaminants was confirmed by mass-spectrometric analysis (data not shown).
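The amount of each reference peptide in the spiked sample follows from the working concentrations and the volumes given above. The sketch below is simple arithmetic under the assumption that the volumes are additive. The amount actually loaded onto the column depends on the injected fraction, which is not derived here. The names are hypothetical.

```python
# Working solution concentrations in fmol/ul, copied from the table above.
WORKING_FMOL_PER_UL = {
    "PVA*FSVVPTAAAAVSLK*": 1658.6,
    "AG*LLRPDYALLGHR*": 581.1,
    "DIA*SGLIGPLIIC[+C2+H3+N+O]K*": 1271.3,
    "G*LTLHLK*": 611.5,
    "EHV*AHLLFLR*": 3179.3,
}
SPIKE_UL = 2.0    # working solution added
PLASMA_UL = 6.0   # plasma sample
IRT_UL = 0.8      # iRT peptides


def spiked_amounts(working, spike_ul=SPIKE_UL):
    """Amount (fmol) of each reference peptide added to the sample."""
    return {peptide: conc * spike_ul for peptide, conc in working.items()}


total_ul = SPIKE_UL + PLASMA_UL + IRT_UL  # assumes additive volumes
for peptide, fmol in spiked_amounts(WORKING_FMOL_PER_UL).items():
    print(f"{peptide:30s} {fmol:8.1f} fmol  {fmol / total_ul:7.1f} fmol/ul in mix")
```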

(112) For comparison with single-labeled reference peptides, Biognosys' PlasmaDive reference peptide mix was used according to the manufacturer's instructions. The mix comprises the sequences of the five double-labeled peptides in their single-labeled variant, i.e. with a single C-terminal heavy amino acid.

(113) Mass Spectrometry Analysis:

(114) One microgram of each sample was analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 Å C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides were separated by a 40-minute linear gradient (PRM) or a 60-minute segmented gradient (DIA) from 1 to 45% ACN with 0.1% formic acid at 250 nl/min. The DIA-MS method consisted of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3*10.sup.6 or 60 ms injection time). Then, 14 DIA windows were acquired at 30,000 resolution (AGC target 3*10.sup.6 and auto for injection time) spanning 350-1,650 m/z. Normalized stepped collision energy from 10% to 27% was used and the spectra were recorded in profile mode. The default charge state for the MS2 was set to 3. For the PRM analysis, the settings were similar, but only the five heavy-labeled peptides and their endogenous counterparts were targeted, as well as the iRT peptides. The instrument was set to use multiplexing and to analyze heavy-light pairs together.

(115) Data Analysis:

(116) The multiplexed PRM files were analyzed with SpectroDive 7 (Biognosys) and the DIA runs with Spectronaut 9 (Biognosys), both using standard settings, according to the manufacturer's instructions.

LIST OF REFERENCE SIGNS/ABBREVIATIONS

(117) CID collision-induced dissociation

(118) ECD electron-capture dissociation

(119) ESI electrospray ionization

(120) ETD electron-transfer dissociation

(121) HCD Higher-energy collisional dissociation

(122) LC liquid chromatography

(123) MALDI matrix-assisted laser desorption ionization

(124) mmol millimole

(125) mPRM multiplexed parallel reaction monitoring

(126) MS mass spectrometry

(127) m/z mass to charge ratio

(128) NETD negative electron transfer dissociation

(129) PQD Pulsed Q Collision Induced Dissociation

(130) SRM selected reaction monitoring
