Labelled compounds and methods for mass spectrometry-based quantification
11293928 · 2022-04-05
Cpc classification
G01N33/6842
PHYSICS
G01N2560/00
PHYSICS
Abstract
Methods for peptide and/or protein quantification by mass spectrometry using labeled peptides, wherein multiple labels lead to distinct fragments for the labeled peptides and their unlabeled variant, thus facilitating data analysis and enhancing the potential for quantification. Methods for selecting the label and label position are further given, as well as sets of labeled peptides resulting from or for use in the above-mentioned methods. The methods and substances are especially useful for data-independent or multiplexed parallel reaction monitoring proteomics applications involving peptide quantification.
Claims
1. A method for the absolute or relative quantitative analysis of proteins and/or peptides with or without post translational modification(s) using a mass spectrometry method in which: in a first step unlabeled proteins from an endogenous mixture are digested and subsequently digestion products thereof selected, in a second step said digestion products are fragmented, and in a third step a combined fragment spectrum is acquired comprising b-ions and y-ions of said digestion products, wherein at least one reference peptide is added to said mixture before and/or after digestion, is fragmented, acquired, and stored in said combined fragment spectrum comprising also b-ions and y-ions of said digestion products, wherein the said at least one reference peptide is added in a known concentration for absolute quantification or in always the same concentration in a series of experiments for relative quantitative analysis, wherein said at least one reference peptide is selectively isotopically labeled by having incorporated: one isotopically labeled amino acid forming its very C-terminus or being one of the four terminal amino acids at the C-terminus and one isotopically labeled amino acid forming its very N-terminus, or being one of the four terminal amino acids at the N-terminus, and wherein the isotopically labeled amino acids are unmodified naturally occurring proteinogenic amino acids or amino acids carrying a chemically modifying moiety, wherein said unmodified naturally occurring proteinogenic amino acids or amino acids carrying a chemically modifying moiety comprise one or more atoms that are isotopically labeled such that said one or more atoms are present in the amino acid and not in the chemically modifying moiety.
2. The method according to claim 1, wherein in said reference peptide, apart from the isotopically labeled amino acid at or close to the C-terminus and the isotopically labeled amino acid at or close to the N-terminus, not more than one additional amino acid is isotopically labeled.
3. The method according to claim 1, wherein in said reference peptide one isotopically labeled amino acid is forming its very C-terminus and one further isotopically labeled amino acid is forming its very N-terminus.
4. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window having a full-range mass isolation window, or a width in terms of mass-to-charge ratio in the range of (2×1.036426×10⁻⁸ kg/C)−(1000×1.036426×10⁻⁸ kg/C).
5. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window of (5×1.036426×10⁻⁸ kg/C)−(30×1.036426×10⁻⁸ kg/C).
6. The method according to claim 1, wherein said post translational modification is one or more selected from the group consisting of: phosphorylation, acetylation, methylation, sulfation, hydroxylation, lipidation, ubiquitylation, sumoylation, and glycosylation.
7. The method according to claim 1, wherein said reference peptide consists of 5-100 amino acids.
8. The method according to claim 1, wherein it involves using DIA or mPRM techniques.
9. The method according to claim 1, wherein in said reference peptide, apart from the isotopically labeled amino acid at or close to the C-terminus and the isotopically labeled amino acid at or close to the N-terminus, no additional amino acid is isotopically labeled.
10. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window having a full-range mass isolation window, or a width in terms of mass-to-charge ratio in the range of (5×1.036426×10⁻⁸ kg/C)−(100×1.036426×10⁻⁸ kg/C).
11. The method according to claim 1, wherein said combined fragment spectrum is acquired using a mass isolation window of (10×1.036426×10⁻⁸ kg/C)−(25×1.036426×10⁻⁸ kg/C).
12. The method according to claim 1, wherein said reference peptide consists of 7-30 amino acids.
13. The method according to claim 1, wherein said reference peptide consists of 10-20 amino acids.
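The isolation window widths in claims 4, 5, 10 and 11 are written in SI units (kg/C). The factor 1.036426×10⁻⁸ kg/C is, to the stated precision, one atomic mass unit per elementary charge, i.e. one m/z unit. The following is a minimal sketch of that conversion, assuming the CODATA 2018 values for the atomic mass constant and the elementary charge; the function names are illustrative and not part of the claims.

```python
# CODATA 2018 values (assumed here for illustration).
ATOMIC_MASS_UNIT_KG = 1.66053906660e-27  # 1 u (1 Da) in kg
ELEMENTARY_CHARGE_C = 1.602176634e-19    # elementary charge in C

# One m/z unit (1 u per elementary charge) expressed in kg/C.
KG_PER_C_PER_MZ_UNIT = ATOMIC_MASS_UNIT_KG / ELEMENTARY_CHARGE_C

# Unit factor as written in the claims.
CLAIM_UNIT_KG_PER_C = 1.036426e-8


def claim_width_to_mz(multiplier: float) -> float:
    """Convert 'multiplier x 1.036426e-8 kg/C' into m/z units."""
    return multiplier * CLAIM_UNIT_KG_PER_C / KG_PER_C_PER_MZ_UNIT


if __name__ == "__main__":
    for claim, (low, high) in {4: (2, 1000), 5: (5, 30), 10: (5, 100), 11: (10, 25)}.items():
        print(f"claim {claim}: {claim_width_to_mz(low):.3f} to {claim_width_to_mz(high):.3f} m/z")
```

Under these assumptions the window widths of claim 11, for example, correspond to approximately 10 to 25 m/z units.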
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) Preferred embodiments of the invention are described in the following with reference to the drawings, which are for the purpose of illustrating the present preferred embodiments of the invention and not for the purpose of limiting the same. In the drawings,
DESCRIPTION OF PREFERRED EMBODIMENTS
(19) Hereinafter, the present invention is described in further detail and is exemplified. However, the examples are not intended to limit the present invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It must be noted that as used herein and in the claims, the singular forms “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise. Thus, for example, reference to “a label” includes a plurality of such labels and so forth.
(20) Although any materials and methods similar or equivalent to those described herein can be used to practice or test the present invention, the preferred materials and methods are now described.
(21) This description specifically details the application of labeled reference peptides in quantitative proteomics studies wherein combined fragment ion spectra are obtained. It describes methods for the quantitative analysis of peptides and/or proteins, methods for the selection of suitable reference peptides and label positions, and the reference peptides used in said methods. Different aspects relating to the experimental setup, and labeling strategies are discussed. Finally, examples of applications illustrate the potential of the methods and substances of the present invention to improve the accuracy of quantitative studies.
(22) Mass Spectrometry Methods:
(23) Mass spectrometry (MS) methods are widely used for peptide and/or protein identification and quantification, especially in proteomics studies where large numbers of analytes are monitored. A standard sample preparation workflow for bottom-up liquid chromatography (LC)-MS experiments includes the following steps: Proteins comprised in a sample are digested to peptides using a protease such as trypsin. The peptides are then separated by liquid chromatography, most commonly reversed-phase LC. As soon as the peptides elute from the chromatography column, they are ionized by electrospray ionization (ESI): At the ion source, a voltage is applied which disperses the liquid sample into fine droplets containing charged peptide molecules. These precursors then enter the mass spectrometer, where they travel through an electric field and are resolved according to their mass-to-charge (m/z) ratio. Finally, the precursor ions are detected and their m/z ratio is registered, resulting in MS1 (or MS) spectra acquired over the whole gradient. Single peptide precursors or wider mass ranges are sequenced as follows: The ions in the selected mass window are isolated and fragmented, e.g. by collision with helium gas, a process termed collision-induced dissociation (CID), or by higher energy C-trap dissociation (HCD). All fragment ions are then recorded in one MS/MS, MS2, or fragment ion spectrum.
(24) The fragment ion spectra serve as a basis for peptide identification. Peptides do not disintegrate randomly during fragmentation, but rather fragment according to a pattern into a, b, c, x, y, and z-ions (
(25) Different mass spectrometry approaches can be used in bottom-up proteomics experiments. While the basic steps of the protocols remain the same for all approaches, other parts, such as fragmentation, identification, and quantification of peptides, vary depending on the MS method used.
(26) One of the most frequently used mass spectrometry approaches in proteomics is data-dependent acquisition (DDA), also called “shotgun” (
(27) In recent years, data-independent acquisition (DIA) emerged as a new MS approach which remedies many of DDA's disadvantages. Techniques which are based on this principle include for example HRM, SWATH, MSᴱ and All-Ion-Fragmentation. The core feature of all DIA methods is that instead of a single precursor as for DDA, larger mass windows, or swaths, containing multiple precursors are fragmented (
(28) Data analysis can be challenging due to the spectra containing fragments of several peptides.
(29) To identify and quantify the peptides present in a sample, the combined fragment ion spectra can be searched against a spectral library, or theoretical spectra or can be mined using SRM-like transitions. Fragments from the same peptide are subsequently arranged in SRM-like peak groups: The signal corresponds to the intensity of each fragment monitored over time in sequential spectra. Fragments of the same peptide will produce similarly shaped elution peaks with maxima at identical retention times (RT). These SRM-like peak groups can then be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. I.e. the quantification is done based on MS2 level data. Alternatively, peptide and/or protein quantification can be done on MS1 level if the corresponding MS1 data was acquired.
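The peak-group logic described above can be sketched as follows. This is a minimal illustration only, assuming fragment intensities have already been extracted over retention time; the function names, the apex tolerance, and the use of a median over per-fragment area ratios are assumptions made for this example and are not prescribed by the description.

```python
from statistics import median


def trapezoid_area(rt, intensity):
    """Area under one fragment's elution trace (trapezoidal rule)."""
    return sum(
        (rt[i + 1] - rt[i]) * (intensity[i] + intensity[i + 1]) / 2.0
        for i in range(len(rt) - 1)
    )


def apex_rt(rt, intensity):
    """Retention time at which the trace reaches its maximum."""
    return rt[max(range(len(intensity)), key=lambda i: intensity[i])]


def coelutes(rt, traces, tolerance=0.05):
    """True if all traces have their apex within `tolerance` RT units."""
    apexes = [apex_rt(rt, trace) for trace in traces]
    return max(apexes) - min(apexes) <= tolerance


def light_to_heavy_ratio(rt, light, heavy):
    """Median of per-fragment area ratios (unlabeled / labeled reference)."""
    shared = sorted(set(light) & set(heavy))
    return median(
        trapezoid_area(rt, light[f]) / trapezoid_area(rt, heavy[f]) for f in shared
    )


# Synthetic traces (hypothetical values): the reference is twice as abundant.
rt = [10.0, 10.1, 10.2, 10.3, 10.4]
light = {"y5": [0, 50, 100, 50, 0], "b3": [0, 20, 40, 20, 0]}
heavy = {"y5": [0, 100, 200, 100, 0], "b3": [0, 40, 80, 40, 0]}

group_ok = coelutes(rt, list(light.values()) + list(heavy.values()))
ratio = light_to_heavy_ratio(rt, light, heavy)
```

Using the median rather than the sum is one possible design choice; it limits the influence of a single interfered fragment on the ratio.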
(30) The same data analysis concepts can be applied to the analysis of DIA and mPRM data. Traditionally, a spectral library generated from DDA-data is employed to extract quantitative features from DIA or mPRM runs and to identify peptides and/or proteins. Alternative data analysis approaches exist which do not rely on DDA-based spectral libraries, or do not rely on them exclusively: For example, mPRM or DIA data containing MS1 and MS2 scans can be converted into MS2 spectra containing fragment ions relevant for a specific MS1 feature. These spectra are searched using a database of theoretical spectra which results in peptide identifications being assigned to the precursor-fragment matches. This process is very similar to how DDA data is typically processed. The search results can be saved as spectral library. Furthermore, a spectral library can be generated from combined search results from DIA and/or DDA experiments, or from mPRM and/or DDA experiments. In either case, the search results and/or the spectral library are used to extract quantitative information from the mPRM or DIA runs, allowing peptide and/or protein quantification on MS1 and/or MS2 level.
(31) In summary, a spectral library can be generated from many sources including but not limited to the following: from data of the same acquisition, from a previous acquisition of the same sample, from an independent acquisition of a similar tissue or complete organism, from published data, from mPRM data, from DIA data, from DDA data, from a combination of DIA and DDA data, from a combination of mPRM and DDA data, from a resource database from fractionated or unfractionated samples, it can be generated on-the-fly from DIA or mPRM data, or from a combination of sources mentioned above. The spectral library can be saved and/or can be discarded after use.
(32) The following paragraph provides non-limiting examples for different data analysis approaches for DIA and/or mPRM data. A spectral library can be generated from the same sample, a similar sample, or from resource data. The data for the spectral library can stem from fractionated and/or unfractionated samples. The data for the spectral library can have been acquired with different mass spectrometry methods such as DDA, targeted mass spectrometry methods, DIA or mPRM, or any combination of them. The sample to be quantified can be fractionated or unfractionated and is acquired by DIA and/or mPRM. Peak groups and peptides in the sample are identified using the spectral library. The sample is then quantified based on MS2 and/or MS1 level data.
(33) Existing data analysis software, e.g. Spectronaut Pulsar (Biognosys AG), supports many of the proposed data analysis workflows. The person skilled in the art will know which software to use or how to modify existing software to support the desired workflow.
(34) In an exemplary peptide and/or protein quantification experiment employing DIA, the amount of the endogenous, unlabeled peptide variant relative to its labeled, reference peptide variant has to be determined. To this end, unlabeled and labeled peptides comprised in a sample are fragmented. Due to the label introducing only a small mass shift the fragment ions of both precursors will most often be present in the same combined fragment spectrum. Thus, only fragment ions differing in at least one label can be distinguished between unlabeled and reference peptide. The amount of unlabeled peptide relative to reference peptide can be determined by comparing the SRM-like peaks formed by these fragment ions differing in at least one label.
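A short worked example may help illustrate why the unlabeled and the labeled precursor usually end up in the same combined fragment spectrum. The sketch below assumes a ¹³C₆¹⁵N₂-lysine label (mass shift of about 8.0142 Da), a hypothetical peptide mass of 1500 Da, and hypothetical fixed-width isolation windows starting at 350 m/z; none of these values are taken from the description.

```python
PROTON_MASS = 1.007276        # Da
HEAVY_LYS_SHIFT = 8.014199    # Da, 13C6 15N2 lysine (assumed label)


def precursor_mz(neutral_mass: float, charge: int) -> float:
    """m/z of a protonated precursor with the given charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def window_index(mz: float, lower_bound: float, width: float) -> int:
    """Index of the fixed-width isolation window that contains `mz`."""
    return int((mz - lower_bound) // width)


light_mass = 1500.0           # hypothetical unlabeled peptide mass in Da
charge = 2

light_mz = precursor_mz(light_mass, charge)
heavy_mz = precursor_mz(light_mass + HEAVY_LYS_SHIFT, charge)
mz_shift = heavy_mz - light_mz  # about 4 m/z units at charge 2

# Hypothetical window scheme: 25 m/z wide windows starting at 350 m/z.
same_window = window_index(light_mz, 350.0, 25.0) == window_index(heavy_mz, 350.0, 25.0)
```

With these assumed values both precursors fall into the window covering 750 to 775 m/z, so their fragments are recorded together.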
(35) DIA methods have several advantages over DDA and other targeted methods such as SRM: DIA approaches have excellent sensitivity and a large dynamic range. Moreover, since no stochastic peak picking is involved DIA methods avoid the missing peptide ID data points typical for DDA methods and peptides are reproducibly measured over all samples. Furthermore, DIA allows sequencing of almost complete proteomes within one run without requiring prior knowledge about targeted transitions. All these properties make DIA methods especially suitable for quantification studies where many peptides and/or proteins need to be measured.
(36) Another MS method which is frequently used for the quantification of peptides and/or proteins is Selected Reaction Monitoring (SRM). SRM is a targeted mass spectrometry approach. Herein, fragment ions of a single, pre-selected target peptide are detected on low resolution, low mass accuracy mass spectrometers. Only limited numbers of peptides can be monitored with this technique, and assay development is laborious. Multiplexed parallel reaction monitoring (mPRM), a novel targeted proteomics technique, remedies these disadvantages (
(37) Usually, mPRM analyses are conducted on a quadrupole which is combined with a high resolution mass analyzer. The quadrupole acts as mass filter to target mass ranges for fragmentation in a second quadrupole, and the resulting fragment ions are acquired by the high resolution mass analyzer. Fragmentation is done by either of two ways: Several precursors can be fragmented sequentially and their fragment ions are stored together for later measurement. Alternatively, larger m/z ranges containing several precursors are fragmented together. In both cases the fragmentation procedure results in combined fragment ion spectra comprising fragment ions from several precursors.
(38) The fragment ions are analyzed in the high resolution part of the instrument, often an orbitrap analyzer. This has several advantages over using a low resolution instrument as in SRM studies:
(39) Firstly, all fragment ions a peptide produces can be monitored, rather than just a small number, leading to a higher specificity and increasing the confidence that the correct peptide was identified. Moreover, assay optimization becomes less crucial and the larger number of fragment ions that is monitored per peptide makes quantification more robust. Secondly, since the fragment ions are acquired with high resolution and mass accuracy, the probability of false positive identifications decreases.
(40) DIA and mPRM workflows produce similar combined fragment ion spectra and can sometimes even be run on the same type of mass spectrometers. Therefore, also the basic principles for data analysis and quantification are the same. Thus, also for mPRM the SRM-like peak groups extracted from the fragment ion spectra can be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. Hence, quantification in an mPRM experiment is usually done based on MS2 level data.
(41) The advantages of mPRM over DDA and SRM are similar to the ones mentioned above for DIA: high sensitivity, a large dynamic range, and reproducible peptide picking. As a consequence, it is especially suitable for quantification studies.
(42) The present invention solves the problem of fragment overlap for any method that produces combined fragment ion spectra. This includes mass spectrometry methods acquiring low resolution data that is stored as combined fragment ion spectrum. Moreover, mass transmission windows for selecting precursors for fragmentation can be non-overlapping, overlapping, and/or can be sliding windows with small offsets. One DIA method using the latter is SONAR (Waters). This technique uses a quadrupole that slides over a selected mass range during each MS scan using transmission mass windows with offsets of a few Daltons. One full scan covers the whole mass range and high and low collision energy are applied in an alternating fashion to the scans, thus producing both MS1 and MS2 data. The person skilled in the art will know how to set up and operate the corresponding mass spectrometry setting.
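The difference between non-overlapping, overlapping, and sliding transmission windows can be illustrated with a small generator. This is a sketch under stated assumptions: the function name, the window widths, and the step sizes are hypothetical and do not reproduce any vendor's acquisition scheme.

```python
def isolation_windows(start, stop, width, step=None):
    """Return (lower, upper) m/z bounds of isolation windows.

    step == width  -> non-overlapping windows
    step <  width  -> overlapping windows; a very small step approximates
                      a sliding window with a small offset
    """
    step = width if step is None else step
    windows = []
    lower = start
    while lower < stop:
        windows.append((lower, min(lower + width, stop)))
        lower += step
    return windows


non_overlapping = isolation_windows(350, 1650, 100)        # 13 windows
overlapping = isolation_windows(350, 1650, 100, step=50)   # 50 m/z overlap
sliding = isolation_windows(350, 1650, 20, step=2)         # small offsets
```

In all three schemes a window wider than the label-induced m/z shift will typically transmit the unlabeled and the labeled precursor together.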
(43) Combined fragment ion spectra can be produced by pooling data of fragment ions from several precursors in one transmission mass window, e.g. as described in the examples. Even DDA methods can thus produce combined fragment ion spectra if large transmission mass windows are used and several precursors are fragmented together. Alternatively, fragment ion data of precursors from different transmission mass windows can be pooled to form a combined fragment ion spectrum. This principle is for example used in multiplexed DIA (Egertson, J. D., MacLean, B., Johnson, R., Xuan, Y., and MacCoss, M. J., 2015. Multiplexed Peptide Analysis using Data Independent Acquisition and Skyline, Nature Protocols, 2015, 10(6), pp. 887-903.). The person skilled in the art will know how to set up the corresponding mass spectrometry acquisition methods. Data analysis of the combined fragment ion spectra proceeds as described.
(44) Use of Multiply-Labeled Peptides in Quantification Studies Employing DIA or mPRM:
(45) A common setup for protein and/or peptide quantification is to compare the abundances of an unlabeled, endogenous peptide and its reference peptide variant carrying a single C-terminal label. Usually, this is an amino acid containing heavy elemental isotopes, most commonly arginine or lysine. When a combined fragment spectrum of these peptides is acquired with DIA or mPRM the presence of a single label will lead to complications: All C-terminal ions from the reference peptide will contain the label and will have an m/z distinct from their unlabeled counterparts (
(46) One way to eliminate the fragment overlap during DIA- or mPRM-based peptide and/or protein quantification experiments is by selectively introducing two labels (heavy isotope containing amino acids) at different positions into the reference peptides such that most C-terminal, as well as N-terminal fragments of interest will contain a label (
(47) Firstly, the present invention makes use of such multiply-labeled reference peptides and/or proteins to provide an improved quantification method that is compatible with combined fragment ion MS spectra. Secondly, the present invention relates to a method for selecting the label and label position of at least one suitable reference peptide. Thirdly, the present invention relates to selectively double-labeled reference peptides for use in or produced by the above-mentioned methods.
(48) Using such multiply-labeled reference peptides solves the problems occurring with single-labeled reference peptides in conjunction with mass spectrometry approaches producing combined fragment ion spectra. It allows exploiting the full potential of DIA and mPRM methods for quantitative studies. Firstly, combined fragment ion spectra of unlabeled and labeled precursors will contain fewer shared fragment ions, which can facilitate the identification of peptides and peak groups. For example, fragment overlap between reference peptides and peptides to be quantified might lead to skewed relative fragment intensities for both variants, as discussed above. Relative fragment intensities are often used for peptide and peak group identification and scoring. Therefore, using reference peptides that differ in at least two labels from the other peptide variant can aid peptide and/or protein identification.
(49) Secondly, being able to differentiate between N-terminal fragment ions, such as b-ions, from unlabeled and from labeled peptides allows including them for quantification without skewing quantitative values. Including a higher number of suitable ions will render quantification more robust and accurate.
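The effect of one versus two labels on fragment overlap can be checked numerically. The following sketch computes singly charged b- and y-ion m/z values from standard monoisotopic residue masses. The peptide sequence AVLEGK, the choice of ¹³C₃¹⁵N-alanine (+4.0071 Da) and ¹³C₆¹⁵N₂-lysine (+8.0142 Da) as labels, and the matching tolerance are assumptions made for illustration only.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mz(sequence, label_shifts=None):
    """Singly charged b/y ions; label_shifts maps 0-based index -> added mass."""
    shifts = label_shifts or {}
    masses = [RESIDUE_MASS[aa] + shifts.get(i, 0.0) for i, aa in enumerate(sequence)]
    n = len(masses)
    ions = {}
    for i in range(1, n):
        ions[f"b{i}"] = sum(masses[:i]) + PROTON               # N-terminal fragment
        ions[f"y{i}"] = sum(masses[n - i:]) + WATER + PROTON   # C-terminal fragment
    return ions


def overlapping_ions(ions_a, ions_b, tolerance=0.005):
    """Names of fragments whose m/z coincide in both peptide variants."""
    return sorted(n for n in ions_a if abs(ions_a[n] - ions_b[n]) <= tolerance)


peptide = "AVLEGK"  # hypothetical sequence
light = fragment_mz(peptide)
single = fragment_mz(peptide, {5: 8.014199})               # heavy C-terminal Lys
double = fragment_mz(peptide, {0: 4.007099, 5: 8.014199})  # plus heavy N-terminal Ala

shared_single = overlapping_ions(light, single)
shared_double = overlapping_ions(light, double)
```

With a single C-terminal label every b-ion is shared between the two variants, whereas with labels at both termini no b- or y-ion is shared. If a label sits a few residues away from a terminus, the shortest fragments on that side remain shared.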
(50) Steps for peptide and/or protein quantification using DIA or mPRM:
(51) In quantification experiments, unlabeled endogenous peptides and/or proteins will be pooled with reference peptides. Since sample preparation can introduce considerable inter-sample variability, the unlabeled peptides and/or proteins and the labeled peptides are preferably pooled as early as possible in the protocol. Thus, any variability introduced by later sample preparation steps will affect both the light and the heavy peptide equally. The steps at which pooling is most suitable may vary and are therefore not included in the standard protocol below. Most frequently, synthetic reference peptides are added to peptide samples in a last step before liquid chromatography.
(52) A standard protocol for the quantification of peptides and/or proteins by DIA or mPRM mass spectrometry includes, but is not limited to, the following steps:
(53) 1. Protein extraction: Proteins are extracted from samples. If necessary, this can include the use of detergents, mechanical force, heat, chaotropes or other means. The suitable protein extraction protocol depends on the sample and the skilled person will know which one is suitable for a specific mixture.
(54) 2. Reduction of disulfide bonds: Prior to digestion, disulfide bonds between cysteine residues of proteins are reduced. This makes more residues accessible for digestion and prevents two peptides from remaining connected, which would result in complex fragment ion spectra. Preferably, dithiothreitol (DTT) or TCEP (tris(2-carboxyethyl)phosphine hydrochloride) is used for this step.
(55) 3. Alkylation of free cysteines: In order to avoid re-formation of disulfide bonds the free cysteines are alkylated, preferably with iodoacetamide or iodoacetic acid. The reaction is carried out in the dark to avoid formation of side products and further modifications.
(56) 4. Protein digestion: Proteins in the sample are cleaved into peptides, preferably using a protease such as trypsin and/or Lys-C. The reaction is preferably carried out at 37° C. in a suitable buffer.
(57) 5. Peptide purification: The peptides are purified prior to MS analysis. Preferably they are desalted, typically using a C18 stationary phase.
(58) 6. Liquid chromatography: Several microliters of sample are loaded onto a liquid chromatography column and are separated, preferably by increasing hydrophobicity via reversed-phase LC and a gradient of increasing acetonitrile concentrations.
(59) 7. MS analysis: Peptides elute, are ionized and subjected to MS analysis via either a DIA- or an mPRM-method. Fragment ions are detected on a high resolution instrument and combined fragment ion spectra are stored.
(60) 8. Data analysis: Quantification is usually done based on MS2 level data. Spectra can be searched against a spectral library, or theoretical spectra, or can be mined using SRM-like transitions to identify and quantify peptides and/or proteins. Examples of specialized software for these analyses are Spectronaut and Spectronaut Pulsar (Biognosys AG), DIA-Umpire (Tsou, C. C., Tsai, C. F., Teo, G., Chen, Y. J., Nesvizhskii, A. I., 2016. Untargeted, spectral library-free analysis of data independent acquisition proteomics data generated using Orbitrap mass spectrometers. Proteomics, (15-16), pp. 2257-2271.) or OpenSWATH. Fragments from the same peptide are subsequently arranged in SRM-like peak groups: The signal corresponds to the intensity of each fragment monitored over time in sequential spectra. Fragments of the same peptide will produce similarly shaped elution peaks with maxima at identical retention times (RT). These SRM-like peak groups can then be used to quantify e.g. an unlabeled endogenous peptide versus a labeled reference peptide. Alternatively, data analysis approaches which do not rely on DDA-based spectral libraries, or do not rely on them exclusively, can be applied for peptide and/or protein identification and/or quantification. Analysis software, such as Spectronaut Pulsar, supports these data analysis workflows. Moreover, quantification can be done on MS1 and/or MS2 level. The details and the optimal implementation of the standard protocol depend on the purpose of the experiment, the properties of the sample and the proteins of interest, and the instruments used, among other factors. The skilled person will know how to implement and alter the standard workflow to best suit a specific setup.
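The final quantification step of the protocol above reduces to simple arithmetic once a light-to-heavy signal ratio is available. The sketch below assumes that the reference peptide was spiked in a known amount (absolute quantification) or in the same amount in every sample of a series (relative quantification); the function names and the femtomole unit are illustrative assumptions.

```python
def absolute_amount(light_to_heavy_ratio: float, reference_amount_fmol: float) -> float:
    """Endogenous peptide amount from the signal ratio and the known spike-in."""
    return light_to_heavy_ratio * reference_amount_fmol


def relative_change(ratio_sample: float, ratio_control: float) -> float:
    """Fold change between two samples that received the same spike-in amount."""
    return ratio_sample / ratio_control


# Hypothetical values: ratio of 0.5 against 100 fmol of reference peptide.
amount = absolute_amount(0.5, 100.0)
fold_change = relative_change(1.5, 0.5)
```

This assumes a linear detector response and equal ionization behavior of the light and heavy variants, which is the usual premise when isotopically labeled reference peptides are used.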
(61) The following paragraphs guide through the details of selecting a suitable label and label position for selectively double-labeled reference peptides:
(62) To produce double-labeled reference peptides, the labels are introduced selectively at certain positions within the peptide sequence. The label position is crucial to ensure an optimal balance between the information content provided (which is biggest for terminal labels) and other parameters, e.g. total label cost. Therefore, the present invention relates to a method for selecting the label and label position of at least one suitable reference peptide. A method for the selection of optimal label positions to produce double-labeled peptides can for example contain the following steps (
(63) In a first step, a spectral library is selected. Moreover, any additional input data required for the optimization according to the desired parameters will be supplied. E.g. if the optimization occurs according to total label cost, the label cost for each label is obtained. In addition, the label positions to be considered during the optimization process need to be defined. This includes how many amino acid positions within the terminus will be considered, as well as if both termini of the peptide will be optimized according to the same parameters.
(64) In a second step, the most advantageous amino acid position for labeling within the considered amino acids is determined for each peptide in the spectral library. During this step different parameters can be balanced to find the optimal label, e.g. information content of labeled fragment ions, total label cost which reflects the availability of the label and the complexity of its incorporation etc. For the optimization according to total label cost, the label with the lowest label cost but yielding fragment ions with maximum information content would be selected.
(65) Optionally, the method could further include any of the following features: an estimation of the total label cost for the selected labels and label positions, a simulation of fragment collisions, a calculation of label and label position frequencies, and/or a report of the results.
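One possible reading of the selection steps above is sketched below. For each terminus it considers a configurable number of terminal positions, picks the labelable amino acid with the lowest label cost, and breaks ties in favor of the position closest to the terminus, since terminal labels carry the most information. The cost table, the tie-breaking rule, and the example sequence are hypothetical; the description does not prescribe a specific scoring function.

```python
def pick_label(sequence, costs, window=4, terminus="N"):
    """Cheapest labelable residue within `window` positions of one terminus.

    Returns (index, amino_acid, cost) or None if no residue is labelable.
    """
    n = len(sequence)
    window = min(window, n)
    if terminus == "N":
        candidates = [(i, i) for i in range(window)]            # (index, distance)
    else:
        candidates = [(n - 1 - d, d) for d in range(window)]
    scored = [(costs[sequence[i]], d, i) for i, d in candidates if sequence[i] in costs]
    if not scored:
        return None
    cost, _, index = min(scored)  # lowest cost first, then closest to the terminus
    return index, sequence[index], cost


def select_double_label(sequence, costs, window=4):
    """Pick one label near each terminus; None if no valid pair exists."""
    n_pick = pick_label(sequence, costs, window, "N")
    c_pick = pick_label(sequence, costs, window, "C")
    if n_pick is None or c_pick is None or n_pick[0] == c_pick[0]:
        return None
    return n_pick, c_pick


# Hypothetical relative label costs per amino acid.
LABEL_COST = {"K": 1.0, "R": 1.0, "L": 1.5, "V": 1.5, "A": 2.0, "G": 3.0}

selection = select_double_label("GAVLSEDTLK", LABEL_COST)
total_cost = sum(pick[2] for pick in selection)
```

In this example the cheaper valine at the third position is preferred over the N-terminal glycine, which lowers cost but leaves the b1 and b2 ions unlabeled. A weighting between cost and information content could replace the simple tie-break used here.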
(66) In
(67) Furthermore, we discovered that for the analysis displayed in
(68) The reference peptides of the present invention can further carry post translational modification(s) (PTM(s)). The PTMs of interest can be of biological importance to study signaling cascades via protein phosphorylation for instance or to reflect the chemical treatment of the sample during sample preparation. These can be any modification occurring on peptides and/or proteins. Preferably PTMs are selected from phosphorylation, acetylation, methylation, sulfation, hydroxylation, lipidation, ubiquitylation, sumoylation, glycosylation, oxidation, and carbamidomethylation. Preferably, the post translational modification(s) occurs on peptides and/or proteins in nature, or is introduced as part of a standard sample preparation workflow, e.g. as described in this application. For example, carbamidomethylation of cysteines is commonly introduced during sample preparation by reducing disulfide bonds and alkylating residues with iodoacetamide. Other common post translational modifications that are introduced during sample preparation are e.g. carbamylation due to urea present in the sample, or methionine oxidation.
(69) Labeled peptides and their unlabeled counterparts contain the same post translational modification(s) at the same position(s) to ensure that both peptide variants exhibit similar behavior during sample preparation and LC-MS analysis. Thus, the reference peptide corresponds to the unlabeled peptide as present in the sample including any modifications, but with the respective isotopically labeled amino acids. The present invention can be particularly useful for the analysis of peptides with post translational modifications for which only few fragment ions are available for quantification, e.g. phospho-peptides. By minimizing or eliminating fragment overlap we can ensure that available N-terminal and C-terminal fragment ions can be used for identification and quantification. In some cases only a single b- or y-ion differentiates between isoforms of phospho-peptides where e.g. the phosphorylation can occur on either of two neighboring amino acids. In such instances the present invention enables the unequivocal assignment of the modified amino acid.
Chemical synthesis of peptides is usually carried out by attaching amino acid building blocks to each other. To introduce an isotopically labeled amino acid, the building block comprises the amino acid containing the corresponding heavy isotopes. To introduce an amino acid carrying a post translational modification, the building block usually already comprises the amino acid and the PTM. Building blocks are most often introduced by coupling the carboxyl group of an amino acid building block to the N-terminus of the peptide being formed. Thus, chemical synthesis usually starts at a peptide's C-terminus and proceeds to its N-terminus. To avoid side reactions during peptide synthesis, some of the amino acid building block's reactive groups have to be protected. Therefore, the individual amino acid building blocks are reacted with protecting groups before they are added to the nascent peptide. Once the building block has been integrated into the peptide, its N-terminus is deprotected to allow for incorporation of the next amino acid. After the peptide is fully formed, any remaining protecting groups are removed.
(70) Applications:
(71) The methods and substances of the present invention can be applied to the quantification of a variety of samples, including different cell or tissue types, environmental samples, or bodily fluids. In a preferred embodiment the methods and substances of the present invention are applied to the quantification of human plasma proteins (
(72) In a first aspect, we analyzed the fragment overlap occurring during DIA-based quantification of human plasma peptides and/or proteins with sets of single-labeled synthetic peptides (
(73) Two micrograms of each sample were analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 Å C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides were separated by a 1 h segmented gradient from 1 to 52% acetonitrile (ACN) in 60 min with 0.1% formic acid at 250 nl/min, followed by a linear increase to 90% ACN in 2 min and 90% for 10 min. The DIA-MS method consisted of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3×10⁶ or 60 ms injection time). Then, 14 DIA windows were acquired at 30,000 resolution (AGC target 3×10⁶ and auto for injection time) spanning 350-1650 m/z. Stepped collision energy was 10% at 27%. The spectra were recorded in profile mode. The default charge state for the MS2 was set to 4.
(74) The spectra were processed to extract peptide and protein identifications and quantitative values using specialized software such as Spectronaut (Biognosys AG). To demonstrate the fragment overlap occurring in combined fragment ion spectra between N-terminal b-ions from endogenous, unlabeled peptide and single-labeled synthetic reference peptides, we further analyzed spectra from single peptides.
(75) Combined fragment ion spectra for three peptides showing an intense signal were analyzed.
(76) In a preferred embodiment the methods and substances of the present invention are applied to the quantification of human plasma proteins (
(77)
(78) Moreover, we re-analyzed data from the DIA experiments described above (
(79) Experimental Part:
EXAMPLE 1
Quantification of Human Plasma Proteins Using Selectively Double-Labeled Peptides
(80) See
(81) Sample Preparation:
(82) Human plasma will be digested using in-solution digestion: 10 μl of plasma will be diluted in 75 μl 10 M urea and 0.1 M ammonium bicarbonate. The samples will be reduced with 5 mM TCEP for 1 h at 37° C. Subsequently, the plasma will be alkylated with 25 mM iodoacetamide for 20 min at 21° C. The samples will be diluted to 2 M urea and digested with trypsin at a ratio of 1:100 (enzyme to protein) at 37° C. for 15 h. The samples will be centrifuged at 20,000×g at 4° C. for 10 min. The peptides will be desalted using C18 MacroSpin columns from The Nest Group according to the manufacturer's instructions. After drying, the peptides will be resuspended in 1% ACN and 0.1% formic acid.
(83) Preparation of Labeled Reference Peptides:
(84) The reference peptide mix will contain synthetic double-labeled peptides covering amino acid sequences of interest, the unlabeled, endogenous versions of which will be quantified within the samples. These dried, labeled reference peptides will be dissolved in 20 μl of dissolution buffer, after which 100 μl of LC solution will be added. Dissolution will be assisted by vortexing and/or sonication. Two microliters of this reference peptide mix will be added to each sample.
(85) Mass Spectrometry Analysis:
(86) Two micrograms of each sample will be analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 Å C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides will be separated by a 1 h segmented gradient from 1 to 52% ACN in 60 min with 0.1% formic acid at 250 nl/min, followed by a linear increase to 90% ACN in 2 min and 90% for 10 min. The DIA-MS method will consist of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3×10⁶ or 60 ms injection time). Then, 14 DIA windows will be acquired at 30,000 resolution (AGC target 3×10⁶ and auto for injection time) spanning 350-1650 m/z. Stepped collision energy will be 10% at 27%. The spectra will be recorded in profile mode. The default charge state for the MS2 will be set to 4.
(87) Data Analysis:
(88) Peptide and protein identification, as well as quantification, will be performed using any suitable software, for example Spectronaut, OpenSWATH, SpectroDive, or MaxQuant.
EXAMPLE 3
Method for Selecting the Cheapest Amino Acid for Labeling and Estimating Total Label Costs
(89) A method was created to select optimal amino acids and positions for labeling. Furthermore, the method estimated the total label cost for double-labeling a set of peptides. It offered the following features:
(90) In a first step three pieces of input data were accepted, the first containing the label prices, i.e. the price of amino acids containing heavy elemental isotopes as stated by a certain vendor, the second containing the molecular weight of all 20 amino acids, and the third being a spectral library for human plasma.
(91) In a second step the label prices and the amino acid molecular weight data were used to estimate the cost per mmol of each labeled amino acid. Furthermore, all unique, unmodified peptide sequences were extracted from the spectral library.
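The conversion from vendor price to cost per mmol can be sketched as follows. This is a minimal illustration and not the original implementation: it assumes that prices are quoted per gram, it uses the average molecular weights of the unlabeled free amino acids as an approximation for the labeled forms, and all prices are hypothetical placeholders rather than vendor data. The function name is likewise hypothetical.

```python
# Hypothetical inputs for illustration only; these are not vendor prices.
# Assumption: label prices are quoted per gram of labeled amino acid.
PRICE_PER_GRAM = {"A": 500.0, "G": 400.0, "L": 900.0, "K": 1500.0, "R": 1800.0, "V": 1100.0}

# Average molecular weights (g/mol) of the unlabeled free amino acids,
# used here as an approximation for the heavier labeled forms.
MOLECULAR_WEIGHT = {"A": 89.09, "G": 75.07, "L": 131.17, "K": 146.19, "R": 174.20, "V": 117.15}


def cost_per_mmol(price_per_gram: float, molecular_weight: float) -> float:
    """Convert a per-gram price to a per-mmol price (1 mmol weighs MW/1000 g)."""
    return price_per_gram * molecular_weight / 1000.0


label_cost = {
    aa: cost_per_mmol(PRICE_PER_GRAM[aa], MOLECULAR_WEIGHT[aa]) for aa in PRICE_PER_GRAM
}

# List the amino acids from cheapest to most expensive label.
for aa, cost in sorted(label_cost.items(), key=lambda item: item[1]):
    print(f"{aa}: {cost:.2f} per mmol")
```

The resulting dictionary of cost per mmol is the quantity that the subsequent steps compare when choosing which residue to label.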
(92) In a third step a value for n_globalMaxVal was specified. Herein n_globalMaxVal defines a positive integer that is set by the experimenter, e.g. n_globalMaxVal=4. The highest possible value for n_globalMaxVal is equal to the length of the longest peptide in the analyzed peptide spectral library divided by two, and rounded down to the nearest lower positive integer if the value was not an integer.
(93) In a fourth step, the value for n_globalMaxVal, the values for label cost per mmol, and the peptide sequences from the spectral library were used to select the cheapest amino acid for labeling, to estimate the total label cost, and to calculate the frequency with which each amino acid was labeled for the set of peptides for different n_globalMaxVal values. For each peptide, stretches of n_i amino acids from each terminus were considered. The n_i values were peptide-specific and related to an amino acid stretch starting from the terminus of a peptide, e.g. a value of n_i=1 comprised the terminal amino acid, n_i=2 comprised the terminal amino acid and the amino acid one removed from the terminus, and so forth. The cheapest amino acid and the total label costs were determined as follows:
(94) For each peptide sequence extracted from the library, the peptide-specific value for n_i was equal to the lower of two values: either the value of the user-defined positive integer n_globalMaxVal, or the value of n_pepMaxVal, which corresponds to the number of amino acids in the peptide divided by two and rounded down to the nearest lower integer if the value was not an integer. The position and the cost of the first label for said peptide were determined by selecting the amino acid with the lowest label cost per millimole from a stretch of amino acids of length n_i starting from the C-terminus. The position and the cost of the second label were determined by applying the same procedure to the N-terminus. This was repeated for all peptide sequences. The label costs for all peptide sequences were summed up to obtain the total label cost for the selected n_globalMaxVal value.
(95) This calculation was repeated for different integer values of n_globalMaxVal between 1 and the maximum possible value (length of longest peptide in the library divided by two and rounded down to the next lowest integer). As a result, a separate total label cost was calculated for each n_globalMaxVal value.
(96) In a fifth step, the resulting total label costs for labeling the peptide sequences were displayed for each n_globalMaxVal value. Furthermore, the frequencies with which each of the 20 amino acids had been selected for labeling were calculated (
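The third to fifth steps can be sketched as follows. This is an illustrative sketch under stated assumptions, not the original implementation: the function names, the relative costs, and the rule that ties are broken in favor of the residue closest to the terminus are all hypothetical, and the two sequences merely serve as example input.

```python
from collections import Counter


def select_label_positions(peptide, cost, n_global_max):
    """Return (N-terminal index, C-terminal index) of the cheapest residues to label."""
    # n_i is the lower of n_globalMaxVal and n_pepMaxVal = floor(length / 2).
    n_i = min(n_global_max, len(peptide) // 2)
    if n_i < 1:
        raise ValueError("peptide too short to carry two labels")
    # Assumption: ties are broken in favor of the residue closest to the terminus.
    n_pos = min(range(n_i), key=lambda i: cost[peptide[i]])
    c_stretch = reversed(range(len(peptide) - n_i, len(peptide)))
    c_pos = min(c_stretch, key=lambda i: cost[peptide[i]])
    return n_pos, c_pos


def total_label_cost(peptides, cost, n_global_max):
    """Sum the cost of both labels over all peptides and count the labeled residues."""
    total, frequency = 0.0, Counter()
    for peptide in peptides:
        for pos in select_label_positions(peptide, cost, n_global_max):
            total += cost[peptide[pos]]
            frequency[peptide[pos]] += 1
    return total, frequency


def sweep(peptides, cost):
    """Repeat the calculation for n_globalMaxVal = 1 .. floor(longest length / 2)."""
    n_max = max(len(p) for p in peptides) // 2
    return {n: total_label_cost(peptides, cost, n) for n in range(1, n_max + 1)}


# Hypothetical relative label costs per mmol and two example sequences.
COST = {"A": 1.0, "G": 2.0, "L": 3.0, "V": 4.0, "K": 5.0,
        "R": 6.0, "T": 7.0, "H": 8.0, "E": 9.0, "F": 10.0}
PEPTIDES = ["GLTLHLK", "EHVAHLLFLR"]

for n, (total, frequency) in sweep(PEPTIDES, COST).items():
    print(n, total, dict(frequency))
```

With these placeholder costs the total label cost falls as n_globalMaxVal increases, because a longer stretch at each terminus offers more candidate residues. It stops falling once no cheaper residue comes into reach.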
EXAMPLE 4
Exclusion of Modified Amino Acids and Analysis of Fragment Collisions
(97) A method for the selection of labels and label positions will be created which will offer the following features in addition to the label cost calculation features of Example 3:
(98) After the optimization of label positions according to total label cost as in Example 3, the present method will in a first aspect select the amino acid with the next lowest label cost for labeling if the selected amino acid is an amino acid that is often post-translationally modified in the experimental setup. In a second aspect the method will simulate the fragment masses that would be produced by the selected double-labeled peptide sequences. Based on the simulation the method will further analyze how many fragment collisions occur, i.e. how many fragment ions from the double-labeled precursor overlap with any other fragment ions of the unlabeled precursor. If the number lies above a certain threshold, the amino acid with the next lowest label cost with a number of fragment collisions which lies below the threshold will instead be selected for labeling if such a residue is available.
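The fragment collision analysis of the second aspect can be sketched as follows. The sketch makes several assumptions that are not part of the description above: only singly charged b- and y-ions are simulated, a fixed tolerance of 0.02 m/z defines an overlap, and the label mass shifts correspond to fully ¹³C/¹⁵N-labeled lysine (+8.014199 Da) and glycine (+3.003745 Da). Function names are hypothetical.

```python
# Monoisotopic residue masses (Da) of the 20 proteinogenic amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def fragment_mz(peptide, shifts=None):
    """Singly charged b- and y-ion m/z values; shifts maps residue index to added mass."""
    shifts = shifts or {}
    masses = [MONO[aa] + shifts.get(i, 0.0) for i, aa in enumerate(peptide)]
    ions = {}
    for k in range(1, len(peptide)):
        ions[f"b{k}"] = sum(masses[:k]) + PROTON
        ions[f"y{k}"] = sum(masses[-k:]) + WATER + PROTON
    return ions


def count_collisions(peptide, shifts, tol=0.02):
    """Count labeled fragments lying within tol of any fragment of the unlabeled peptide."""
    light = fragment_mz(peptide).values()
    heavy = fragment_mz(peptide, shifts).values()
    return sum(1 for h in heavy if any(abs(h - l) <= tol for l in light))


LYS_SHIFT = 8.014199  # assumed label: 13C6 15N2 lysine
GLY_SHIFT = 3.003745  # assumed label: 13C2 15N glycine

single = count_collisions("GLTLHLK", {6: LYS_SHIFT})
double = count_collisions("GLTLHLK", {0: GLY_SHIFT, 6: LYS_SHIFT})
print("single C-terminal label:", single, "collisions")
print("N- and C-terminal labels:", double, "collisions")
```

In this example the single C-terminal label leaves all six b-ions at the same m/z as those of the unlabeled peptide, whereas the double-labeled variant shifts both ion series. A threshold on the returned count could then be used to trigger selection of the residue with the next lowest label cost.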
EXAMPLE 5
Set of Synthetic Double-Labeled Human Plasma Peptides
(99) A list of tryptic sequences extracted from a human plasma spectral library will be analyzed. The value for n_globalMaxVal will be set equal to 4. For each peptide, stretches of n_i amino acids from each terminus will be considered. The n_i values will be peptide-specific and relate to an amino acid stretch starting from the terminus of a peptide, e.g. a value of n_i=1 comprises the terminal amino acid, n_i=2 comprises the terminal amino acid and the amino acid one removed from the terminus, and so forth.
(100) For each peptide sequence extracted from the library, the peptide-specific value for n_i will be equal to the lower of two values: either the value of the user-defined positive integer n_globalMaxVal, or the value of n_pepMaxVal, which corresponds to the number of amino acids in the peptide divided by two and rounded down to the nearest lower integer if the value is not an integer.
(101) For each peptide a first amino acid having the lowest label cost from the n_i most C-terminal amino acids, and a second amino acid having the lowest label cost from the n_i most N-terminal amino acids will be selected for labeling. n_i will adopt values 1, 2, 3, and 4 for different peptides, depending on their length, e.g. for a peptide of six amino acids n_i will be 3, for a peptide of seven amino acids n_i will be 3, for a peptide of eight amino acids n_i will be 4.
(102) The most appropriate 1, 2, 3, 4, 5 or more peptides per protein will be selected based on labeling cost and other criteria (such as peptide length, hydrophobicity and so forth). Furthermore, total label costs for n_globalMaxVal will be estimated. Special selection criteria will apply in case fragment collisions occur or in case the selected amino acid is easily modified. The corresponding set of quantified, double-labeled peptides corresponding to the data of n_globalMaxVal=4 will be synthesized, wherein the labels are the designated amino acids containing ¹³C and/or ¹⁵N.
(103) The set of synthetic double-labeled peptides will be diluted appropriately. A suitable amount of the double-labeled peptide mix will be added to a sample containing an unlabeled protein digest from human plasma. Fragment ion spectra for the combined peptide mixture will be acquired using a DIA method. Because the labeled peptides are added in known amounts, absolute peptide abundances in the unlabeled sample can then be determined using specialized software. Because the synthetic peptides contain two labels, their b- and y-ion series will have different masses from the corresponding ions of the unlabeled peptide. Thus, no fragment overlap will occur.
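The principle of determining an absolute abundance from a reference peptide added in a known amount can be sketched as a single-point calculation. This is a simplified illustration and not a description of how any particular software performs this step: it assumes that labeled and unlabeled fragments respond equally, that paired fragment intensities are simply summed, and the function name and intensities are hypothetical.

```python
def endogenous_amount_fmol(light_intensities, heavy_intensities, spiked_heavy_fmol):
    """Estimate the unlabeled peptide amount from paired fragment ion intensities.

    Assumes equal response of light and heavy fragments, so that the summed
    light/heavy intensity ratio equals the light/heavy amount ratio.
    """
    if len(light_intensities) != len(heavy_intensities):
        raise ValueError("fragment lists must be paired")
    heavy_sum = sum(heavy_intensities)
    if heavy_sum <= 0:
        raise ValueError("no signal for the reference peptide")
    ratio = sum(light_intensities) / heavy_sum
    return ratio * spiked_heavy_fmol


# Hypothetical intensities for three b-ions and three y-ions of one peptide pair.
light = [2.0e5, 1.0e5, 1.0e5, 3.0e5, 2.0e5, 1.0e5]
heavy = [4.0e5, 2.0e5, 2.0e5, 6.0e5, 4.0e5, 2.0e5]

print(endogenous_amount_fmol(light, heavy, spiked_heavy_fmol=100.0))
```

Because the double label separates both ion series, b-ions and y-ions alike can enter the ratio; with a single C-terminal label only the y-ions would be free of overlap.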
EXAMPLE 6
Quantification of Human Plasma Peptides Using Selectively Double-Labeled Peptides
(104) See
(105) Sample Preparation:
(106) Human plasma sample was prepared by in solution digestion: 10 μl of plasma was diluted in 90 μl 10 M urea in 0.1 M ammonium bicarbonate. The sample was reduced with 5 mM dithiothreitol for 30 minutes at 37° C. Subsequently, the plasma was alkylated with 27 mM iodoacetamide for 30 minutes at 21° C. protected from light. The sample was diluted to a urea concentration below 1.5 M and digested with trypsin at a ratio 1:50 (enzyme to protein) at 37° C. for 3 hours. The sample was centrifuged at 14,000×g at 4° C. for 15 minutes, before the peptides were desalted using a C18 MacroSpin 96-well plate (The Nest Group) according to the manufacturer's instructions. After complete drying in a vacuum concentrator, the plasma sample was re-suspended in 1% ACN and 0.1% formic acid and frozen at −20° C. until further use.
(107) Preparation of Labeled Reference Peptides:
(108) The reference peptide mix contained five synthetic, double-labeled peptides covering amino acid sequences of interest, the unlabeled, endogenous versions of which were quantified within the samples.
(109) Stock solutions of the individual peptides and a working solution of the reference peptide mix were prepared according to the following table:
(110) TABLE-US-00001
Peptide                           Stock Solution (fmol/μl)   Stock Solution (μl)   Dilution   Working Solution (fmol/μl)
_PVA*FSVVPTAAAAVSLK*_             670776.7                   1000                  404.4      1658.6
_AG*LLRPDYALLGHR*_                702996.7                   1000                  1209.8     581.1
_DIA*SGLIGPLIIC[+C2+H3+N+O]K*_    742360.4                   1000                  583.9      1271.3
_G*LTLHLK*_                       1389099.4                  1000                  2271.7     611.5
_EHV*AHLLFLR*_                    879725.5                   1000                  276.7      3179.3
Heavy labeled amino acids are marked by a star (*) following the amino acid letter.
(111) Two microliters of the working solution were added to 6 μl of plasma sample. Additionally, 0.8 μl of iRT peptides was added to the sample before injection. Purity of the double-labeled peptides with respect to single-labeled or non-labeled contaminants was confirmed by mass-spectrometric analysis (data not shown).
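The relationship between the tabulated values can be checked with a short calculation. The sketch assumes that the "Dilution" column is a dilution factor, so that the working concentration equals the stock concentration divided by that factor; this reading reproduces the tabulated working concentrations to within rounding. Function names are hypothetical.

```python
# peptide: (stock concentration in fmol/μl, dilution factor, tabulated working conc. in fmol/μl)
TABLE = {
    "_PVA*FSVVPTAAAAVSLK*_": (670776.7, 404.4, 1658.6),
    "_AG*LLRPDYALLGHR*_": (702996.7, 1209.8, 581.1),
    "_DIA*SGLIGPLIIC[+C2+H3+N+O]K*_": (742360.4, 583.9, 1271.3),
    "_G*LTLHLK*_": (1389099.4, 2271.7, 611.5),
    "_EHV*AHLLFLR*_": (879725.5, 276.7, 3179.3),
}
SPIKE_VOLUME_UL = 2.0  # volume of working solution added to 6 μl of plasma sample


def working_concentration(stock_fmol_per_ul, dilution_factor):
    """Working concentration, assuming 'Dilution' is a dilution factor."""
    return stock_fmol_per_ul / dilution_factor


def spiked_amount_fmol(working_fmol_per_ul, volume_ul):
    """Amount of reference peptide contained in the added volume."""
    return working_fmol_per_ul * volume_ul


for peptide, (stock, dilution, tabulated) in TABLE.items():
    computed = working_concentration(stock, dilution)
    spiked = spiked_amount_fmol(tabulated, SPIKE_VOLUME_UL)
    print(f"{peptide}: computed {computed:.1f} fmol/μl, "
          f"tabulated {tabulated} fmol/μl, spiked {spiked:.1f} fmol")
```

The spiked amount is the quantity of each reference peptide contained in the 2 μl of working solution; the amount actually loaded onto the column depends on the injected fraction of the mixture.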
(112) For comparison with single-labeled reference peptides, Biognosys' PlasmaDive reference peptide mix was used according to the manufacturer's instructions. The mix comprises the sequences of the five double-labeled peptides in their single-labeled variant, i.e. with a single C-terminal heavy amino acid.
(113) Mass Spectrometry Analysis:
(114) One microgram of each sample was analyzed using a self-made analytical column (75 μm×50 cm length, packed with ReproSil-Pur 120 Å C18-AQ, 1.9 μm) at 50° C. on an Easy-nLC 1200 connected to a Q Exactive HF mass spectrometer (Thermo Scientific). The peptides were separated by a 40-minute linear gradient (PRM) or a 60-minute segmented gradient (DIA) from 1 to 45% ACN with 0.1% formic acid at 250 nl/min. The DIA-MS method consisted of a survey scan at 120,000 resolution from 350 to 1,650 m/z (AGC target of 3×10⁶ or 60 ms injection time). Then, 14 DIA windows were acquired at 30,000 resolution (AGC target 3×10⁶ and auto for injection time) spanning 350-1650 m/z. Normalized stepped collision energy from 10% to 27% was used and the spectra were recorded in profile mode. The default charge state for the MS2 was set to 3. For the PRM analysis, the settings were similar, but only the five heavy labeled peptides and their endogenous counterparts were targeted, as well as the iRT peptides. The instrument was set to use multiplexing and to analyze heavy-light pairs together.
(115) Data Analysis:
(116) The multiplexed PRM files were analyzed with SpectroDive 7 (Biognosys) and the DIA runs with Spectronaut 9 (Biognosys), both using standard settings, according to the manufacturer's instructions.
LIST OF REFERENCE SIGNS/ABBREVIATIONS
(117) CID collision-induced dissociation
(118) ECD electron-capture dissociation
(119) ESI electrospray ionization
(120) ETD electron-transfer dissociation
(121) HCD higher-energy collisional dissociation
(122) LC liquid chromatography
(123) MALDI matrix-assisted laser desorption ionization
(124) mmol millimole
(125) mPRM multiplexed parallel reaction monitoring
(126) MS mass spectrometry
(127) m/z mass to charge ratio
(128) NETD negative electron transfer dissociation
(129) PQD pulsed Q collision-induced dissociation
(130) SRM selected reaction monitoring