Methods and apparatus for determining interference in MS scan data, filtering ions and performing mass spectrometry analysis on a sample

11527394 · 2022-12-13

Abstract

A method of determining one or more interference parameters for a particular peak of an isotopic distribution corresponding to a precursor molecule in MS scan data is provided. The MS scan data comprises a plurality of peaks. Each peak has a mass-to-charge ratio and a relative abundance. The isotopic distribution comprises a subset of the plurality of peaks. The one or more interference parameters comprises a peak purity, p.sub.i, for the particular peak. The method comprises determining that there are no interfering peaks relevant to the isotopic distribution and determining that the peak purity, p.sub.i, for the particular peak should be a maximum purity value. Alternatively, the method comprises identifying one or more interfering peaks from the MS scan data, wherein the one or more interfering peaks do not belong to the subset of peaks of the isotopic distribution, and determining the peak purity, p.sub.i, for the particular peak based on: the relative abundance, I.sub.i, of the particular peak, and the relative abundance of the one or more interfering peaks.

Claims

1. A method of data-dependent mass spectrometry comprising: generating mass spectrometer (MS) scan data comprising a plurality of peaks by performing a first MS scan, each peak having a respective mass-to-charge ratio and a relative abundance; recognizing an isotopic distribution comprising a subset of two or more of the plurality of peaks of the MS scan data, the isotopic distribution corresponding to a precursor molecule; determining one or more interference parameters for a particular peak of the isotopic distribution, wherein the one or more interference parameters include a peak purity, p.sub.i, for the particular peak; setting or determining the peak purity value, p.sub.i, for the particular peak by either: determining that there are no interfering peaks relevant to the isotopic distribution and setting the peak purity value, p.sub.i, at a maximum purity value; or identifying one or more interfering peaks from the MS scan data, wherein the one or more interfering peaks do not belong to the subset of peaks of the isotopic distribution, and determining the peak purity, p.sub.i, for the particular peak based on: the relative abundance, I.sub.i, of the particular peak, and the relative abundance of the one or more interfering peaks; and generating a purity score for the particular peak of the isotopic distribution that is based, at least in part, on the peak purity value, p.sub.i, and performing one or more mass analyses of ions corresponding to one or more peaks of the isotopic distribution, wherein either a sequence of the mass analyses or mass-to-charge values of the mass-analyzed ions are determined based on the purity score and on purity scores of other particular peaks of the isotopic distribution.

2. The method of claim 1, wherein the peak purity value, p.sub.i, for the particular peak of the isotopic distribution is determined based on the relative abundance(s) of the one or more interfering peaks and comprises selecting a first interfering peak of the one or more interfering peaks and determining the peak purity value, p.sub.i, for the particular peak of the isotopic distribution based on the relative abundance, I.sub.interf, of the first interfering peak.

3. The method of claim 2, wherein the interfering peak is a nearest interfering peak having a relative abundance above an interference threshold, such that the mass-to-charge ratio of the first interfering peak is closer to the mass-to-charge ratio of the particular peak of the isotopic distribution than any other peak in the MS scan data not belonging to the subset of MS-scan-data peaks in the isotopic distribution and having a relative abundance above the interference threshold.

4. The method of claim 2, wherein the one or more interference parameters further include an interference distance, d.sub.interf, for the particular peak of the isotopic distribution, wherein the interference distance, d.sub.interf, is based on the difference between the mass-to-charge ratio, M.sub.i, of the particular peak of the isotopic distribution and the mass-to-charge ratio, M.sub.interf, of the first interfering peak.

5. The method of claim 1, wherein the one or more interference parameters further include an isotopic m/z window, w.sub.ISD, of the isotopic distribution wherein the isotopic m/z window defines a range of mass-to-charge ratios that includes every peak of the isotopic distribution having a relative abundance above an inclusion threshold, wherein the purity score for the particular peak of the isotopic distribution is based, in part, on w.sub.ISD.

6. The method of claim 5, wherein: i) the isotopic m/z window is centered on a mass-to-charge ratio, M.sub.0, of a most abundant peak of the isotopic distribution having the highest relative abundance, I.sub.0, of the peaks in the isotopic distribution and wherein a half-width, w.sub.ISD/2, of the isotopic m/z window is defined as the absolute difference between the mass-to-charge ratio of the most abundant peak of the isotopic distribution and the mass-to-charge ratio of a furthest significant peak of the isotopic distribution, wherein the furthest significant peak has: a) a relative abundance above the inclusion threshold; and b) a mass-to-charge ratio that is furthest from the most abundant peak of the isotopic distribution, such that the absolute difference between the mass-to-charge ratio of the furthest significant peak and the most abundant peak is greater than the absolute difference between the mass-to-charge ratio of the most abundant peak and any other peak in the isotopic distribution having a relative abundance above the inclusion threshold; and ii) the setting or determining of the peak purity value, p.sub.i, for the particular peak comprises either: a) determining that there are no interfering peaks relevant to the isotopic distribution by determining that the range of mass-to-charge ratios defined by the isotopic m/z window does not contain any peaks that do not belong to the subset of peaks of the isotopic distribution and have a relative abundance above an interference threshold; or b) identifying one or more interfering peaks from the MS scan data by identifying peaks having a mass-to-charge ratio within the isotopic m/z window and having a relative abundance above an interference threshold.

7. The method of claim 5, wherein the one or more interference parameters further include an isotopic purity, p.sub.ISD, for the isotopic distribution, wherein the purity score for the particular peak of the isotopic distribution is based, in part, on p.sub.ISD, the method further comprising: determining a total relative abundance, S.sub.iso, of the subset of peaks belonging to isotopic distribution; determining the total relative abundance of all of the peaks in the MS scan having a mass-to-charge ratio falling within the range defined by the isotopic m/z window; and using the total relative abundance for the subset of peaks, S.sub.iso, and the total relative abundance for all of the peaks in the isotopic m/z window to determine the isotopic purity, p.sub.ISD.

8. A method of data-dependent mass spectrometric analysis that depends on MS scan data that comprises a plurality of peaks, each peak having a respective mass-to-charge ratio and a relative intensity, wherein a subset of the plurality of peaks corresponds to an isotopic distribution, the method comprising: (i) determining a peak purity, p.sub.i, for each peak of the isotopic distribution by: determining that there are no interfering peaks relevant to the isotopic distribution and setting the peak purity value, p.sub.i, at a maximum purity value; or identifying one or more interfering peaks from the MS scan data, wherein the one or more interfering peaks do not belong to the subset of peaks of the isotopic distribution, and determining the peak purity, p.sub.i, for the particular peak based on: the relative abundance, I.sub.i, of the particular peak, and the relative abundance of the one or more interfering peaks; (ii) determining a purity score, s.sub.i, for each peak of the isotopic distribution that is based, at least in part, on the peak purity; (iii) defining a lower boundary, W.sub.start, and an upper boundary, W.sub.end, of an isolation window so that only peaks of the isotopic distribution having a purity score greater than a predetermined threshold, T, are included in the isolation window; (iv) isolating only ions that correspond to peaks that have mass-to-charge ratios that are within the isolation window; and (v) either fragmenting or mass analyzing the isolated ions.

9. The method of claim 8, wherein: the purity score, s.sub.i, of each peak is further based on one or more of: a) an isotopic m/z window, w.sub.ISD, of the isotopic distribution that is defined as a range of mass-to-charge ratios that includes every peak of the isotopic distribution having a relative abundance above an inclusion threshold; b) an isotopic purity for the isotopic distribution, p.sub.ISD, that is determined from a ratio between a total relative abundance, S.sub.iso, of a subset of peaks belonging to the isotopic distribution and the total relative abundance of all of the peaks in the MS scan having a mass-to-charge ratio falling within the range defined by the isotopic m/z window; and c) an interference distance, d.sub.interf, for the respective peak that is based on the difference between the mass-to-charge ratio, M.sub.i, of the respective peak of the isotopic distribution and the mass-to-charge ratio, M.sub.interf, of a first interfering peak.

10. The method of claim 8, wherein the isolation window is centered around a peak of the corresponding subset of the plurality of peaks having the highest relative abundance and wherein setting the lower boundary of the isolation window and the upper boundary of the isolation window comprises defining a width of the isolation window.

Description

BRIEF DESCRIPTION OF THE FIGURES

(1) The above noted and various other aspects of the present invention will become further apparent from the following description which is given by way of example only and with reference to the accompanying drawings, not drawn to scale, in which:

(2) FIG. 1 shows a simplified example of a mass spectrum in which two isotopic clusters have been identified.

(3) FIGS. 2A and 2B illustrate how an isotopic m/z window may be defined.

(4) FIG. 3 illustrates determination of a candidate purity for a particular peak of an isotopic distribution.

(5) FIGS. 4A and 4B illustrate distances between neighbouring peaks (whose masses differ by approx. 1 amu) for ions with charge values z=2 and z=3.

DETAILED DESCRIPTION

(6) The following description is presented to enable any person skilled in the art to make and use the invention, and is provided in the context of a particular application and its requirements. Accordingly, the disclosed materials, methods, and examples are illustrative only and not intended to be limiting. Various modifications to the described embodiments will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. Thus, the present invention is not intended to be limited to the embodiments and examples shown but is to be accorded the widest possible scope in accordance with the features and principles shown and described. The particular features and advantages of the invention will become more apparent with reference to the figures taken in conjunction with the following description.

(7) Unless otherwise defined, all technical and scientific terms used herein have the meaning commonly understood by one of ordinary skill in the art to which this invention belongs. In case of conflict, the present specification, including definitions, will control.

(8) In this document, the terms “precursor ions”, “precursor ion species”, “first-generation ions” and “first-generation ion species” refer to ions as they are received by a mass analyzer from an ionization source in the absence of any controlled fragmentation in a fragmentation cell. The term “scan”, when used as a noun, should be understood in a general sense to mean “mass spectrum” regardless of whether or not the apparatus that generates the scan is actually a scanning instrument. Similarly, the term “scan”, when used as a verb, should be understood in a general sense as referring to an act or process of acquiring mass spectral data.

(9) As used herein, “a” or “an” also may refer to “at least one” or “one or more.” Also, the use of “or” is inclusive, such that the phrase “A or B” is true when “A” is true, “B” is true, or both “A” and “B” are true. Further, a word appearing in the singular encompasses its plural counterpart, and a word appearing in the plural encompasses its singular counterpart, unless implicitly or explicitly understood or stated otherwise. Furthermore, it is understood that for any given component or embodiment described herein, any of the possible candidates or alternatives listed for that component may generally be used individually or in combination with one another, unless implicitly or explicitly understood or stated otherwise. Moreover, it is to be appreciated that the figures, as shown herein, are not necessarily drawn to scale, wherein some of the elements may be drawn merely for clarity of the disclosure. Also, reference numerals may be repeated among the various figures to show corresponding or analogous elements. Additionally, it will be understood that any list of such candidates or alternatives is merely illustrative, not limiting, unless implicitly or explicitly understood or stated otherwise. Also, the use of “comprise”, “comprises”, “comprising”, “contain”, “contains”, “containing”, “include”, “includes”, and “including” are not intended to be limiting.

(10) In addition, unless otherwise indicated, numbers expressing quantities of ingredients, constituents, reaction conditions and so forth used in the specification and claims are to be understood as being modified by the term “about”, such that slight and insubstantial deviations are within the scope of the present teachings. Accordingly, unless indicated to the contrary, the numerical parameters set forth in the specification and attached claims are approximations that may vary depending upon the desired properties sought to be obtained by the subject matter presented herein. At the very least, and not as an attempt to limit the application of the doctrine of equivalents to the scope of the claims, each numerical parameter should at least be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the subject matter presented herein are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical values, however, inherently contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements.

(11) A new approach is provided for identifying molecules of a sample (the precursors) observed in mass spectra (MS) by their mass peak. Ions in a mass window around the mass peak are isolated, fragmented and then the mass spectrum (MS.sup.2) of the fragments is detected to identify the molecule based on the fragment spectrum. This may be achieved in some examples by comparing the fragment spectrum with standard spectra in a library.

(12) The sample may be eluted from a chromatography column and accordingly mass spectra may have to be detected from the eluted sample in certain time intervals. Accordingly, the time to detect the mass spectra (MS.sup.2) of the fragments of observed molecules may be limited. As a result, it may be beneficial to define a ranking for a reasonable order in which to identify the observed precursors by MS.sup.2 mass spectra.

(13) Each molecule of the sample has an isotopic distribution (also called an “isotope distribution”). The isotopic distribution results from isotopologues of a molecule having different m/z values. Each isotopologue produces a peak in the mass spectrum. The ions corresponding to the isotopologues of the molecule may be analysed by collecting ions in a mass window that includes the isotopologues of the isotopic distribution. The mass window of the ions collected for further analysis is termed the “isolation window”. The isolation window may be independent of the width of the isotopic distribution. Alternatively, the isolation window may be chosen based on the isotopic distribution (e.g. to be the same width as the isotopic distribution). In some examples, the width of the isolation window may be specified by the user (e.g. in a user interface for MS.sup.2 scans).

(14) Each precursor molecule present in the sample produces an isotopic distribution when the sample is analysed by mass spectrometry. The isotopic distributions in the scan can be analysed with the goal of identifying the corresponding precursor molecule. However, one problem is that mass peaks of the isotopic distribution of another molecule may also be observed in the mass range of the isotopic distribution of the ions of a molecule. This interference between the isotopic distributions of different molecules substantially complicates the identification of molecules by their fragments observed in the MS.sup.2 mass spectra. Therefore, it is preferred to analyse mass peaks that are not influenced, or not essentially influenced, by the interference. The degree to which a mass range of a mass spectrum is free of interference is termed its “purity”. The order in which MS.sup.2 mass spectra are detected should therefore be related to the purity of the analysed mass range or of the surroundings of the analysed mass peaks. For this task, a purity filter has been proposed which provides a purity score (also called a non-interference score). Example methods discussed in this application provide improved purity filters.

(15) These techniques can also be used when analysing MS.sup.2 scan data. For example, each fragment may have an isotopic distribution and these may overlap. If further fragmentation is required (e.g. in MS.sup.n spectroscopy) then the techniques described may be used to identify which fragments should be further analysed and in what order to do so, based on the purity of the isotopic distributions of the fragments. An isolation window around the fragments may also be determined based on the purity of the peaks in the isotopic distribution of the fragment.

(16) One example of an improved method of implementing a purity filter identifies, for one or more isotopic distributions, one or more (preferably two) parameters characterising the interference of the isotopic distribution with another isotopic distribution and, for each isotopic peak in the isotopic distribution, one or more (preferably two) parameters characterising interference from the nearest interfering peak.

(17) Isotopic distributions (also known as isotopic clusters) may be identified from a mass spectrum by an isotope and charge state defining algorithm or an advanced peak detection algorithm. This process is described in more detail in European Patent Application No. 17174330.5 (EP 3293755 B1), which is herein incorporated by reference. Other methods for identifying isotopic distributions from MS data are possible.

(18) FIG. 1 shows a simplified example of a mass spectrum in which two isotopic clusters have been identified. As can be seen, the m/z ratio ranges of the two clusters overlap.

(19) The parameters determined to define levels of interference may be used for filtering and ordering. For example, the purity of the isotopic distribution as a whole may be used for determining an order in which isotopic distributions should be analysed. Moreover, the purity of the individual isotopic peaks in the isotopic distribution (each identifying a level of interference from one or more interfering peaks in the vicinity of the isotopic peak) may be used to filter out certain peaks of the isotopic distribution before MS.sup.2 analysis (e.g. by adjusting the isolation window to exclude peaks with poor interference scores). The one or more parameters defining the interference may be combined into a single score for each isotopic peak and/or for each isotopic distribution.

(20) After isotopic distributions have been extracted from the mass spectrum of a sample, a m/z window w.sub.ISD and a purity value p.sub.ISD may be determined for each isotopic distribution. For each isotopic peak of an isotopic distribution the nearest interfering peak may be identified. The nearest interfering peak may be the nearest peak not belonging to the isotopic distribution (e.g. a peak belonging to a different isotopic distribution). In particular, the m/z distance d.sub.interf of this nearest interfering peak is determined as one parameter characterising the interference of each isotopic peak. Another parameter characterising the interference of each isotopic peak may be the isotopic peak purity p.sub.i. The isotopic peak purity may be related to the relative abundance of the isotopic peak compared to the relative abundance of the interfering peak.

(21) From one or more of the purity parameters determined, a purity score (or interference score) may be calculated for each isotopic peak and/or for each isotopic distribution. The interference score can be used as a sorting criterion to define the order in which to identify the observed precursors by MS.sup.2 mass spectra, or as a selection criterion to decide which precursors are investigated by MS.sup.2 mass spectra. In the latter case, the score is the selection criterion of a purity filter defining which precursors can be identified.

(22) The different determined parameters may be weighted and combined into a single score. One way of doing this is by multiplying each parameter by a weighting factor and adding the weighted parameters together.

(23) A typical maximum value of 10 Thomson (10 u) may be assumed as the m/z width of an isotopic distribution. 1 Thomson may be defined as 1 u/e, where u is the unified atomic mass unit and e is the elementary charge.

(24) A specific implementation of the APD-based purity filter developed for use in mass spectrometers is described below. This method may be implemented on a computer system and the instructions may be stored in computer software.

(25) In general, purity filters provide a means to select and prioritize precursors based on the amount of interference in the proximity of the precursor. In this context, “interference” refers to any signal in the mass spectrum that does not belong to the same isotopic distribution (and thus chemical species) as the precursor of interest.

(26) In the present invention, purity values are calculated individually for every isotopic distribution and their associated peaks found by a charge state detection algorithm (such as the Advanced Peak Determination (APD) algorithm of EP3293755B1) so as to avoid using predefined m/z windows. Moreover, the purity values may be used as a sorting criterion to process a list of candidate peaks by order of purity. For example, it may be beneficial to start with the “purest” candidate and continue in descending order of purity. This is particularly significant when the number of MS.sup.2 spectra has to be maximised for a given amount of time, without compromising the information provided. The samples for analysis may be eluted from a chromatography column. Samples may be collected from the column at fixed sample time intervals. Time for sample analysis may therefore be limited to the time between samples (before the next sample is eluted). During the fixed time interval between samples, the level of analysis that may be performed on the sample is limited. The purity values may therefore be used to guide the analysis so that more useful information may be obtained during the fixed interval.

(27) Moreover, the abundance of precursor molecules in the eluted sample will vary over time. Therefore, a precursor may be subject to significant levels of interference from interfering molecules when a first eluted sample is analysed. However, when a second eluted sample is analysed after a time interval, the precursor molecule may be observed in higher abundance and the interfering molecules may be observed in lower abundance (or may have completely eluted by that time interval), resulting in improved purity scores for the precursor molecule. The precursor may therefore be identified at the second time period, when it has a higher purity score, rather than at the first time period.

(28) Calculation of Purity Values

(29) The following steps may be performed for each isotopic distribution found by the charge state detection algorithm. The APD algorithm is particularly suited for this approach, since it is capable of extracting a list of isotopic distributions from a given mass spectrum.

(30) Purity of the Entire Isotopic Distribution

(31) The isotopic envelope of an isotopic distribution is defined as a range of m/z values starting from the peak in the isotopic distribution having the lowest m/z value and ending with the peak in the isotopic distribution having the highest m/z value, including the end points. The width of the isotopic envelope is therefore given by the m/z distance between the lowest-m/z and highest-m/z isotopic peak of the ISD, considering all peaks that are likely to originate from the same chemical species. The isotopic envelope includes all the peaks of the isotopic distribution.

(32) The purity of the isotopic distribution (ISD) is calculated using an individual m/z window, which is based on the isotopic envelope. This window is referred to in this application as “the m/z window”, “the mass/charge (m/z) window” or “the isotopic m/z window” of the isotopic distribution and is denoted by w.sub.ISD. The m/z window w.sub.ISD may be obtained by centring the window on the most abundant peak of the ISD and adjusting the width of the window until all peaks belonging to the ISD are included in the window (so that the window remains symmetric around the most abundant peak of the ISD). This is illustrated in FIG. 2A. More specifically, a half-width w.sub.ISD/2 is calculated first:
w.sub.ISD/2=max(M.sub.0−M.sub.low,M.sub.high−M.sub.0),
with M.sub.x being m/z values of isotopic peaks (M.sub.0: most abundant peak of the ISD, M.sub.low: lowest-m/z peak, M.sub.high: highest-m/z peak). Then the window boundaries w.sub.ISDstart, w.sub.ISDend are given by:
w.sub.ISDstart=M.sub.0−w.sub.ISD/2,
w.sub.ISDend=M.sub.0+w.sub.ISD/2,
such that w.sub.ISDend−w.sub.ISDstart=2·w.sub.ISD/2=w.sub.ISD. In this way, the window is exactly large enough to meet the two conditions (centred on the most abundant peak in the ISD and includes all peaks in the ISD) and no wider.
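
The window construction above can be sketched in a few lines of Python. This is a minimal illustration rather than the implementation used in any instrument software; the `Peak` type and the function name `isotopic_window` are assumptions introduced here, and the input is assumed to contain only peaks already assigned to one ISD.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float         # mass-to-charge ratio
    intensity: float  # relative abundance


def isotopic_window(isd_peaks):
    """Return (w_start, w_end), the symmetric m/z window of one ISD.

    The window is centred on the most abundant peak (M_0) and is just wide
    enough to contain the lowest-m/z and highest-m/z peaks of the ISD.
    """
    if not isd_peaks:
        raise ValueError("an isotopic distribution needs at least one peak")
    m0 = max(isd_peaks, key=lambda p: p.intensity).mz
    m_low = min(p.mz for p in isd_peaks)
    m_high = max(p.mz for p in isd_peaks)
    half_width = max(m0 - m_low, m_high - m0)  # w_ISD / 2
    return m0 - half_width, m0 + half_width


# Illustrative values only: a doubly charged cluster with 0.5 m/z spacing.
cluster = [Peak(500.0, 60.0), Peak(500.5, 100.0), Peak(501.0, 50.0), Peak(501.5, 20.0)]
w_start, w_end = isotopic_window(cluster)
```

For this made-up cluster the most abundant peak sits at m/z 500.5 and the furthest peak at 501.5, so the half-width is 1.0 and the window extends below the lowest isotopic peak in order to stay symmetric.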

(33) In the approach described above, the m/z window w.sub.ISD is centred on the most intense peak. Therefore it is at least as wide as the isotopic envelope. This approach is advantageous because the most intense peak of the ISD is triggered preferentially for MS.sup.2. However, the m/z window w.sub.ISD could be defined in other ways. For example, the m/z window w.sub.ISD could be centred on the average m/z value of the ISD. Alternatively, the m/z window w.sub.ISD may be identical to the isotopic envelope, i.e., defined by the lowest-m/z and highest-m/z peaks of the ISD.

(34) The term w.sub.ISD may refer to the window defined by the start and end points of the m/z window (w.sub.ISDend and w.sub.ISDstart). w.sub.ISD may also be used to refer to the m/z distance between the start and end points (i.e. w.sub.ISD=w.sub.ISDend−w.sub.ISDstart).

(35) The method of determining the m/z window w.sub.ISD assumes that the relationship between each peak in the MS data and a corresponding ISD is well-defined. In reality, some peaks may only be assigned to a particular ISD with a particular certainty. In some embodiments, peaks that are assigned to a particular ISD with a certainty below a particular threshold may be disregarded when determining the individual m/z window w.sub.ISD.

(36) In a similar manner, peaks having an intensity (relative abundance) below a certain threshold may be disregarded when determining the individual m/z window w.sub.ISD. This is illustrated in FIG. 2B.

(37) All peaks detected in the m/z range w.sub.ISDstart≤m/z≤w.sub.ISDend are checked with respect to their association with the ISD of interest. If a peak has been determined to belong to the same ISD with high probability, its intensity I is added to the isotopic intensity accumulator S.sub.iso. Otherwise, its intensity is added to the interference intensity accumulator S.sub.interf. If a peak belongs to multiple ISDs, including the ISD of interest, its intensity can be distributed proportionately to the accumulators. These proportions can be calculated, for example, based on intensity ratios of the associated ISDs. An alternative, simpler approach consists of adding a fixed proportion of 50% of the intensity to each accumulator if a peak is associated with multiple ISDs.

(38) After processing the entire m/z window, the purity value p.sub.ISD is calculated as:
p.sub.ISD=S.sub.iso/(S.sub.iso+S.sub.interf),
with 0<p.sub.ISD≤1.
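
As a rough sketch of the accumulator scheme, the following Python function sums isotopic and interfering intensity inside the window and returns p.sub.ISD. The function name `isd_purity` and the per-peak `share` value (the fraction of a peak's intensity attributed to the ISD of interest: 1.0 for a member, 0.0 for an interferer, 0.5 for the simple shared-peak rule) are assumptions made for this illustration.

```python
def isd_purity(scan_peaks, w_start, w_end):
    """Return p_ISD = S_iso / (S_iso + S_interf) for one isotopic distribution.

    scan_peaks: iterable of (mz, intensity, share) tuples for the whole scan,
    where share is the assumed fraction of the intensity belonging to the ISD.
    """
    s_iso = 0.0     # isotopic intensity accumulator
    s_interf = 0.0  # interference intensity accumulator
    for mz, intensity, share in scan_peaks:
        if not (w_start <= mz <= w_end):
            continue  # peaks outside the m/z window are ignored
        s_iso += share * intensity
        s_interf += (1.0 - share) * intensity
    if s_iso <= 0.0:
        raise ValueError("the window contains no intensity from the ISD")
    return s_iso / (s_iso + s_interf)


# Illustrative values only.
scan = [
    (500.0, 60.0, 1.0),
    (500.5, 100.0, 1.0),
    (500.7, 40.0, 0.0),   # interfering peak inside the window
    (501.0, 50.0, 0.5),   # peak shared with another ISD (50% rule)
    (503.0, 999.0, 0.0),  # outside the window, not counted
]
p_isd = isd_purity(scan, 499.5, 501.5)
```

Here S.sub.iso is 185 and S.sub.interf is 65, giving a purity of 0.74; the intense peak at m/z 503.0 has no effect because it lies outside the window.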

Purity of Isotopic Peaks

(39) After calculating the purity value of the entire ISD, each isotopic peak of the ISD with m/z value M.sub.i and intensity I.sub.i is analysed individually with respect to the nearest interference peak within the m/z window of the ISD:
1. Store the intensity (relative abundance) of the isotopic peak in I.sub.i.
2. Find the nearest interference peak (m/z value M.sub.interf) within the m/z window w.sub.ISD of the ISD and store its intensity in I.sub.interf and its m/z distance in d.sub.interf=|M.sub.interf−M.sub.i|.
3. Calculate the isotopic peak purity as p.sub.i=I.sub.i/(I.sub.i+I.sub.interf).

(40) Note that if p.sub.ISD is 1 (i.e., there are no interference peaks within the window), p.sub.i must be 1 as well for all isotopic peaks. In this case, d.sub.interf can be set to the m/z distance between the isotopic peak and the nearest boundary of the window (w.sub.ISDstart or w.sub.ISDend).
If p.sub.ISD==1: d.sub.interf=min(M.sub.i−w.sub.ISDstart,w.sub.ISDend−M.sub.i)
else: d.sub.interf=|M.sub.interf−M.sub.i|.
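
The per-peak calculation, including the special case without interference, might look as follows in Python. The function name `peak_interference` and the tuple layout are hypothetical. The sketch treats "no interfering peak inside the window" as equivalent to p.sub.ISD being 1, which holds only if no peak is shared between ISDs.

```python
def peak_interference(m_i, i_i, interferers, w_start, w_end):
    """Return (p_i, d_interf) for one isotopic peak at m/z m_i with intensity i_i.

    interferers: list of (mz, intensity) tuples for peaks that do not belong
    to the ISD of interest.
    """
    in_window = [(mz, inten) for mz, inten in interferers if w_start <= mz <= w_end]
    if not in_window:
        # No interference: maximum purity, and the distance is taken to the
        # nearest boundary of the m/z window.
        return 1.0, min(m_i - w_start, w_end - m_i)
    # Nearest interfering peak by absolute m/z distance.
    m_interf, i_interf = min(in_window, key=lambda peak: abs(peak[0] - m_i))
    return i_i / (i_i + i_interf), abs(m_interf - m_i)


# Illustrative values only.
p_i, d_interf = peak_interference(
    500.5, 100.0, [(500.7, 40.0), (501.2, 300.0), (503.0, 999.0)], 499.5, 501.5
)
```

In this example the nearest interferer is the peak at m/z 500.7, so the purity is 100/140 even though a stronger interferer sits further away at 501.2.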

(41) For performance reasons, it may be beneficial to skip the step of calculating the individual purities if the purity of the ISD is 100% (no interferences). It may also improve performance, without unduly affecting results, to skip the step of calculating the individual purities if the purity of the ISD is higher than a specified threshold, and then assigning the purity of the ISD to the individual peaks.

(42) Each isotopic peak is annotated internally in the software with the value pairs (p.sub.i, d.sub.interf) and (p.sub.ISD, w.sub.ISD), which are used by the purity filter to assess the purity with respect to a purity window (which may be user-defined). Also, the value pairs may be combined to obtain a single number as a score, which can then be used as a sorting criterion for lists of precursor candidates in data-dependent experiments.

(43) For example, to convert a pair of purity value and m/z width or distance (p, m) into a single 32-bit integer number/score, p is first multiplied by 10.sup.6, rounded to the nearest integer, and again multiplied by 1000, such that 1000≤p≤10.sup.9. m is multiplied by 100 and rounded to the nearest integer, such that 0≤m≤999. Then both converted values are added to obtain an integer score. When used as a sorting criterion, this number format puts more emphasis on the purity than on the m/z width or distance. However, depending on the priorities of the user and targeted applications of the scoring mechanism, it may be beneficial to interchange the values, such that the m/z value is placed at the higher-order digits and the purity value at the lower-order digits.
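
A minimal Python sketch of this packing is given below. The function name `pack_score` is hypothetical, and clamping the m/z part to 999 is an assumption added here so that large widths cannot spill into the purity digits; Python's built-in `round` also resolves exact ties to the even integer, which may differ from the rounding used in practice.

```python
def pack_score(p, m):
    """Pack purity p (0 < p <= 1) and m/z width or distance m into one integer.

    The purity occupies the higher-order digits and the m/z value the three
    lowest digits, so sorting by the result ranks by purity first.
    """
    purity_part = round(p * 10**6) * 1000
    mz_part = min(round(m * 100), 999)  # clamp is an assumption of this sketch
    return purity_part + mz_part


# Illustrative candidates as (purity, m/z distance) pairs.
candidates = [(0.74, 9.0), (0.80, 0.1), (1.0, 0.5)]
ranked = sorted(candidates, key=lambda c: pack_score(*c), reverse=True)
```

The largest possible value, 10.sup.9 + 999, stays below 2.sup.31, so the score fits in a signed 32-bit integer.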

(44) Filtering Peaks Based on Purity (First Way)

(45) Two different approaches to filtering based on purity score are provided. The first embodiment assumes a defined mass window width W.

(46) The purity filter depends on two parameters: The purity window (W) in m/z units, and the purity threshold (T) in the range 0-1 (or 0-100%). The filter aims at filtering (excluding) candidate peaks with purity values below the threshold T, i.e., only peaks with purity values equal to or above T may pass the filter. The purity window defines the boundaries of the purity determination and is symmetric around the candidate peak, thus yielding the boundaries
W.sub.start=M.sub.c−W/2,
W.sub.end=M.sub.c+W/2,
with M.sub.c being the m/z value of the candidate. The width of the purity window W for each of the candidate peaks may be predefined. This parameter may be defined by a user in a user interface. The width of the purity window may be one of the properties of the purity filter. The width of the purity window may in some cases be set equal to the width of the isolation window.

(47) The candidate purity, p.sub.c, may be interpolated from the parameters of the purity of the ISD, given by p.sub.ISD and d.sub.interf. The window for calculating the candidate purity is usually centred on the candidate peak (although an asymmetry could be introduced by specifying an m/z offset, similar to the isolation offset).

(48) This approach defines a purity value p.sub.c for a candidate peak having the m/z value M.sub.c. If a candidate peak has been annotated beforehand with the purity value pairs as described above, the decision whether to filter (exclude) or pass (include) the peak can be made as follows:
1. If the purity of the entire isotopic distribution p.sub.ISD is 1, or if the purity window W does not include the nearest interfering peak to the candidate peak (i.e., W/2<d.sub.interf), set the candidate purity p.sub.c=1. In other words, if no interference peak is observed in the intended mass window of width W, the purity value p.sub.c has its maximum value 1.
2. Otherwise, if the purity window W is equal to or larger than the width of the m/z window w.sub.ISD (W≥w.sub.ISD), set p.sub.c=p.sub.ISD. I.e., if the whole m/z window w.sub.ISD of the isotopic distribution of the candidate is within the purity window of width W, the purity value of the isotopic distribution p.sub.ISD is most relevant. Likewise, when the distance of the next interfering peak d.sub.interf is the same as the value of w.sub.ISD/2, set p.sub.c=p.sub.ISD.
3. If the purity window W is smaller than the mass/charge window w.sub.ISD of the complete isotopic distribution (i.e., W<w.sub.ISD), and the next interference peak is within the purity window (i.e., d.sub.interf<W/2), the individual isotopic peak purity p.sub.i of the candidate peak has more relevance and is taken into account. The formula for the purity value p.sub.c of a candidate takes into account all four determined parameters. The candidate purity p.sub.c may be calculated via linear interpolation:
a. Calculate slope a=(p.sub.ISD−p.sub.i)/(w.sub.ISD/2−d.sub.interf) and offset b=p.sub.i−a·d.sub.interf. Note that the slope is undefined if w.sub.ISD/2=d.sub.interf. This case is caught in step 2.
b. Interpolate p.sub.c by calculating p.sub.c=a·W/2+b.
c. Exclude (filter) the peak if p.sub.c<T; include it otherwise.
If a peak is not annotated with purity values (which is mostly expected for peaks with low signal-to-noise ratios), it should be filtered out by default.
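
The three cases and the default for unannotated peaks can be sketched in Python as follows. This is an illustrative sketch only; the function names and the tuple layout of the annotation are assumptions introduced here.

```python
def candidate_purity(W, p_isd, w_isd, p_i, d_interf):
    """Candidate purity p_c for a purity window of width W."""
    half_w = W / 2
    # Case 1: no interference at all, or nearest interference outside W.
    if p_isd == 1 or half_w < d_interf:
        return 1.0
    # Case 2: purity window covers the whole ISD window, or the slope
    # below would be undefined (d_interf equals w_isd / 2).
    if W >= w_isd or d_interf == w_isd / 2:
        return p_isd
    # Case 3: linear interpolation between (d_interf, p_i) and
    # (w_isd / 2, p_isd), evaluated at W / 2.
    a = (p_isd - p_i) / (w_isd / 2 - d_interf)
    b = p_i - a * d_interf
    return a * half_w + b


def passes_purity_filter(annotation, W, T):
    """annotation is (p_isd, w_isd, p_i, d_interf), or None if unannotated."""
    if annotation is None:
        # Unannotated peaks are filtered out by default.
        return False
    p_isd, w_isd, p_i, d_interf = annotation
    return candidate_purity(W, p_isd, w_isd, p_i, d_interf) >= T
```

In this sketch the boundary case d.sub.interf=W/2, which the conditions above do not list explicitly, falls through to the interpolation and yields p.sub.c=p.sub.i.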

(49) Illustration of the third case above:
p.sub.c=a·W/2+b
b=p.sub.i−a·d.sub.interf
p.sub.c=a·(W/2−d.sub.interf)+p.sub.i
a=(p.sub.ISD−p.sub.i)/(w.sub.ISD/2−d.sub.interf)
p.sub.c=(p.sub.ISD−p.sub.i)×(W/2−d.sub.interf)/(w.sub.ISD/2−d.sub.interf)+p.sub.i

(50) Conditions:
W<w.sub.ISD
d.sub.interf<W/2
FIG. 3 illustrates this calculation graphically. The values of (W/2−d.sub.interf) and (w.sub.ISD/2−d.sub.interf) in the interpolation are illustrated in the Figure. The relationship between these values and the mass windows and distances to nearest interfering peaks can also be seen.

(51) In the first embodiment, a predefined purity window (i.e. a user-defined purity window) is used to filter the peaks. Filtering is based on a pass or fail test. In other words, a candidate peak is included if its purity value is equal to or above a user-defined threshold and is excluded otherwise. This may help to avoid excessive interference in the resulting MS.sup.2 spectra. This purity filter may be part of a comprehensive filter library consisting of numerous filters for various peak characteristics (such as intensity, m/z, charge state, etc.). All these filters may perform a pass or fail (include/exclude) test to select candidates for MS.sup.2 according to the user's requirements (for example, only candidates with charge state>1, intensity>1e4, and/or purity>0.8 may be included in MS.sup.2 analysis). In many cases the user may set the purity window equal to the isolation window of the MS.sup.2 experiment. However, this is not mandatory.
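
A filter library of this kind might be composed as in the following Python sketch. The `Candidate` class, the field names, and the `select_candidates` helper are hypothetical; only the three example conditions (charge state>1, intensity>1e4, purity>0.8) are taken from the paragraph above, here combined with a logical AND.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Candidate:
    mz: float
    charge: int
    intensity: float
    purity: Optional[float]  # None if the peak carries no purity annotation


PeakFilter = Callable[[Candidate], bool]

# Each filter is an independent pass/fail test.
example_filters: List[PeakFilter] = [
    lambda c: c.charge > 1,
    lambda c: c.intensity > 1e4,
    lambda c: c.purity is not None and c.purity > 0.8,
]


def select_candidates(candidates: List[Candidate],
                      filters: List[PeakFilter]) -> List[Candidate]:
    """Keep only candidates that pass every filter."""
    return [c for c in candidates if all(f(c) for f in filters)]
```

Keeping each test as a separate callable means that the purity filter can be added to or removed from the set without affecting the other filters.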

(52) Filtering Peaks Based on Purity without Prior Knowledge of Isolation/Purity Window (Second Way)

(53) A second way of determining these interference parameters (scores) is provided below that does not require prior knowledge of the isolation window W. In other words, the user does not need to set a predefined purity window. The information derived from this second way can advantageously be used for choosing an isolation window for fragmentation.

(54) The purity values of the entire isotopic distribution p.sub.ISD and the individual isotopic peaks p.sub.i are calculated in a similar manner to the first way described above.

(55) In the second embodiment, an approach is provided that is independent of a specific mass window width W, for which the MS.sup.2 mass spectra are detected (the isolation window). This approach is based on the fact that in a mass window of 1.4 m/z units (Thomson) for a charge state of z=2, one neighbour peak of an isotopic distribution can be observed with an increased mass and one neighbour peak of an isotopic distribution can be observed with a reduced mass. This is illustrated in FIG. 4A. Moreover, for a charge state of z=3 two neighbour peaks of an isotopic distribution of increased mass can be observed and two neighbour peaks of an isotopic distribution of reduced mass can be observed. This is illustrated in FIG. 4B (not to scale).

(56) It should be noted that in the field of proteomics, precursors with z=1 are generally of low interest (as they often originate from non-analyte background ions) and therefore may be filtered out by a charge state filter. For other applications such as small molecules, in which precursors with z=1 are more interesting, the 1.4-m/z window may be adjusted to also include the adjacent isotopic peak. The purity-based precursor selection techniques described in this application may be especially beneficial to proteomics applications (discovery experiments) due to higher sample complexity and largely unknown peptides in the sample (this is in contrast to non-proteomics applications, which generally have lower sample complexity and a target-oriented workflow).

(57) Where FIGS. 4A and 4B refer to m/z values of “(m.sub.c+1)/2”, “(m.sub.c+2)/3”, “(m.sub.c−1)”, “(m.sub.c−2)” etc., the “+1” or “−1” refers to a difference in mass of approximately 1 atomic mass unit (amu). The “/2” or “/3” refers to an ion charge of 2e or 3e, where e is the elementary charge. This is a simplified illustration. The skilled person will understand that the exact spacing between the peaks of the mass spectrum may not be identical between isotopologue species.

(58) Moreover, the skilled person will further appreciate that different isotopologues may produce peaks having nearly identical, but not exactly the same, m/z values. For example, peaks having an m/z value 1 m/z unit higher than the most abundant species (in the “m.sub.c+1” position for a charge value of z=1) may have slightly different m/z values. If the m.sub.c peak relates to .sup.12CH.sub.3.sup.+ ions, there may be two peaks in the “m.sub.c+1” position, the first belonging to .sup.13CH.sub.3.sup.+ and the second belonging to .sup.12CH.sub.2D.sup.+. At low resolution these peaks appear to have identical m/z values, but differences in the m/z values may be observed at high resolution.

(59) Accordingly, if the distance d.sub.interf of the next interfering peak is greater than 0.7 m/z units, the candidate peak is closer to the neighbouring peaks of the ISD than to the interfering peak. The influence of these interfering peaks can therefore be considered to be small, and the purity value approximates the purity score for the entire ISD. This approach is therefore not related to a specific mass window width W.

(60) It is recognised that there are benefits in attributing a different interference score to each isotope peak of a cluster, without prior knowledge of an isolation window to be used for fragmentation. The information derived can be used for choosing an isolation window for fragmentation. This process may be achieved through the following steps:
a) Isotopic clusters are defined at MS.sup.1 using an isotope and charge state defining algorithm (e.g. APD).
b) Every peak is examined and attributed to an isotopic cluster. When isotopic clusters overlap, a “cluster overlap score” (or isotopic purity p.sub.ISD) is defined, which is identical for all peaks belonging to the isotopic cluster (also called “isotopologues”). In one implementation, the cluster overlap score is given by the total intensity of the isotopic peaks normalized to the total intensity of isotopic and interfering peaks within the isotopic cluster m/z range, which is symmetric around the most abundant isotopic peak and includes both the lowest-m/z and highest-m/z isotopic peak of the cluster. Other methods can also be used for the assignment of the “cluster overlap score”.
c) Each peak of the isotopic cluster is examined, and an m/z distance value from the nearest interference, together with the normalized intensity of the interfering peak, is used in order to rescore each isotopic peak. If there is more than one interfering peak within the isotopic cluster m/z range, the nearest one above an intensity/significance threshold is selected.
d) A list of parent ions is used, and the mass spectrometer moves through the list from the most intense peak with the best non-interference score towards the less intense peak with the worst interference score. A decision matrix of intensity/interference score can also be used for different types of experiments in order to maximise the utilisation of this information.
e) When filtering precursors based on the amount of interference within a given isolation window, the amount of interference of a precursor within an isolation window that is smaller than or equal to the isotopic cluster width of the precursor can be estimated by exploiting both the interference score (which only takes the nearest significant interference into account) and the cluster overlap score (which takes the entire isotopic cluster into account).
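
Steps b) and c) can be sketched in Python as follows, assuming that peaks are represented as (m/z, intensity) tuples and that the isotopic and interfering peaks have already been separated by the cluster-defining algorithm. The function names and the `min_intensity` parameter are hypothetical.

```python
def cluster_overlap_score(isotopic, interfering):
    """Return (score, (lo, hi)) for one isotopic cluster.

    The m/z range is symmetric around the most abundant isotopic peak
    and wide enough to include the lowest and highest isotopic peaks.
    """
    mzs = [mz for mz, _ in isotopic]
    apex_mz = max(isotopic, key=lambda pk: pk[1])[0]
    half = max(apex_mz - min(mzs), max(mzs) - apex_mz)
    lo, hi = apex_mz - half, apex_mz + half
    iso_sum = sum(inten for _, inten in isotopic)
    interf_sum = sum(inten for mz, inten in interfering if lo <= mz <= hi)
    return iso_sum / (iso_sum + interf_sum), (lo, hi)


def nearest_interference(peak_mz, interfering, window, min_intensity=0.0):
    """Nearest interfering peak inside the window above the threshold, or None."""
    lo, hi = window
    eligible = [(mz, inten) for mz, inten in interfering
                if lo <= mz <= hi and inten > min_intensity]
    if not eligible:
        return None
    return min(eligible, key=lambda pk: abs(pk[0] - peak_mz))
```

The distance of each isotopic peak to the peak returned by `nearest_interference`, together with that peak's intensity, would then feed the per-peak rescoring of step c).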

(61) Advantageously, this method can be performed without prior knowledge of the isolation window. Moreover, the isolation window for fragmentation can be adjusted automatically according to the “interference score” of each peak of interest, thus preserving sensitivity on low intensity peaks with good interference score (e.g. using a wider window).

(62) Application of Purity Values

(63) The value pairs (p.sub.i, d.sub.interf) and (p.sub.ISD, w.sub.ISD) can be combined to obtain a single number as a score for the isotopic peak. This score (a) may be used as a sorting criterion for lists of precursor candidates in data-dependent experiments, or (b) it may serve as a selection criterion for a purity filter.

Example: Purity Score as a Sorting Criterion

(64) To convert a pair of purity value and m/z width or distance (p, m) into a single 32-bit integer number/score, p is first multiplied by 10.sup.6, rounded to the nearest integer, and again multiplied by 1000, such that 1000≤p≤10.sup.9. m is multiplied by 100 and rounded to the nearest integer, such that 0≤m≤999. Then both converted values are added to obtain an integer score. When used as a sorting criterion, this number format puts more emphasis on the purity than on the m/z width or distance. However, depending on the priorities of the user and targeted applications of the scoring mechanism, it may be beneficial to interchange the values, such that the m/z value is placed at the higher-order digits and the purity value at the lower-order digits.

(65) Ordering the candidate peaks may provide advantages for the workflow of the peak filtering mechanism, which can operate on spectral peaks rather than isotope distributions. Also, filtering the individual peaks allows selecting those peaks in an ISD that are farthest from an interference within the ISD. For example, if the interference is close to the most intense peak of the ISD (=small distance), isotope peaks with higher distances and thus higher integer scores may preferentially be selected for MS.sup.2.
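
The effect described in this example can be illustrated with the following Python sketch. It assumes, for illustration only, that the integer score is built from the shared purity p.sub.ISD and the per-peak distance d.sub.interf, so that peaks of the same ISD differ only in the distance digits; the peak values and the helper name `purity_score` are invented for the example.

```python
def purity_score(p, m):
    # Purity in the higher-order digits, m/z distance (x100) in the lower three.
    return round(p * 1_000_000) * 1000 + min(round(m * 100), 999)


p_isd = 0.9  # shared by all peaks of this hypothetical ISD
peaks = [
    {"mz": 500.0, "intensity": 1.0e6, "d_interf": 0.12},  # interference nearby
    {"mz": 500.5, "intensity": 6.0e5, "d_interf": 0.38},
    {"mz": 501.0, "intensity": 2.0e5, "d_interf": 0.88},
]

# Highest score first: with equal purity, the peak farthest from the
# interference is ranked first.
ranked = sorted(peaks,
                key=lambda pk: purity_score(p_isd, pk["d_interf"]),
                reverse=True)
ranked_mz = [pk["mz"] for pk in ranked]
```

Here the most intense peak at m/z 500.0 is ranked last because the interference lies closest to it.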

Example: Purity Score as a Selection Criterion (Purity Filter)

(66) A purity filter filters (excludes) candidate peaks with purity values below a user-defined threshold T (0≤T≤1), i.e., only peaks with purity values equal to or above T may pass the filter and then serve as potential candidates for data-dependent MS/MS experiments.

(67) For compatibility with this filter, the purity value pairs have to be converted into a single isotopic purity score s.sub.i (0<s.sub.i≤1). Without prior knowledge of a user-defined purity window, the purity value can be weighted by the isotope-specific m/z distance d.sub.interf from the nearest interference, so that the score increases (i.e., improves) with increasing distance. This could be achieved, for example, by taking the k-th root of the purity value with k as a function of distance:
s.sub.i=min(p.sub.ISD,p.sub.i).sup.1/k with k=1,2,4,8, . . . (powers of 2)

(68) Using the minimum value of the purity of the entire isotopic distribution, p.sub.ISD, and the purity of the isotopic peak, p.sub.i, emphasizes the individual environment of the isotopic peak. Typically, isolation windows for peptides with common charge states z of 2 or 3 have a width of ˜1.4 m/z units in order to include the second isotopic peak (z=2) and additionally the third isotopic peak (z=3). This window dimension corresponds to a half-width of ˜0.7 if the window is symmetric around the precursor peak (which is mostly the case). Based on these assumptions, k is chosen according to the m/z distance d in discrete steps of 0.7 m/z units:

(69) k=2.sup.n if 0.7n≤d<0.7(n+1), n=0, 1, 2, . . .
i.e.:
k=1 if d<0.7
k=2 if 0.7≤d<1.4
. . .
d can be set equal to d.sub.interf. Alternatively, d can be set to the minimum of d.sub.interf and the half-width of the mass/charge window of the isotopic distribution, w.sub.ISD/2 (d=min(d.sub.interf, w.sub.ISD/2)). This may avoid artificially high purity scores for peaks at the edges of the mass/charge window w.sub.ISD. For example, the score for an isotopic peak with d=2, p.sub.ISD=0.5, p.sub.i=0.8 is s.sub.i=0.5.sup.1/4≈0.84.
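
The score calculation can be sketched in Python as follows. The function name and the optional `w_isd` argument are assumptions introduced here; passing `w_isd` selects the alternative in which d is capped at w.sub.ISD/2.

```python
import math


def isotopic_purity_score(p_isd, p_i, d_interf, w_isd=None, step=0.7):
    """s_i = min(p_isd, p_i) ** (1 / k), with k = 2 ** n and n = floor(d / step)."""
    # Optionally cap the distance at the half-width of the ISD window.
    d = d_interf if w_isd is None else min(d_interf, w_isd / 2)
    # Number of whole 0.7-m/z steps contained in d. Values of d that lie
    # very close to a step boundary are subject to floating-point rounding.
    n = math.floor(d / step)
    k = 2 ** n
    return min(p_isd, p_i) ** (1.0 / k)
```

With the values of the example above, `isotopic_purity_score(0.5, 0.8, 2.0)` gives n=2 and k=4, and the result rounds to 0.84.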

(70) It may be advantageous to choose higher thresholds for larger isolation windows. This is at least because the score increases with increasing distance from the nearest interference peak.

(71) Whilst the above description provides techniques in which the mass/charge window w.sub.ISD of the isotopic distribution and the isolation window are both centred on the most abundant peak, it is not essential that this should be the case. One could allow the user to specify an offset for each of these parameters (similar to the isolation offset that is available in the properties of MS.sup.2 scans) to introduce some kind of asymmetry in the purity calculation.

(72) In principle, there are multiple options with respect to the m/z window for which the purity of the ISD is calculated:
A window centred around the most intense peak, including all peaks of the ISD (as suggested above).
A window centred around the average m/z value of the ISD.
A window defined by the lowest-m/z and highest-m/z peaks of the ISD.
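
The three options could be computed as in the following Python sketch, in which peaks are (m/z, intensity) tuples. The function name and mode labels are hypothetical. Two further assumptions are made that the text does not specify: the "average m/z value" is taken as the unweighted mean of the peak positions, and the mean-centred window is made just wide enough to include all peaks of the ISD.

```python
def isd_window(peaks, mode="apex"):
    """Return (start, end) of the m/z window used for the ISD purity."""
    mzs = [mz for mz, _ in peaks]
    lo, hi = min(mzs), max(mzs)
    if mode == "extremes":
        # Window defined directly by the lowest and highest peaks.
        return lo, hi
    if mode == "apex":
        # Centred on the most intense peak.
        centre = max(peaks, key=lambda pk: pk[1])[0]
    elif mode == "mean":
        # Centred on the (unweighted) average m/z value.
        centre = sum(mzs) / len(mzs)
    else:
        raise ValueError(f"unknown mode: {mode}")
    # Symmetric window, wide enough to include every peak of the ISD.
    half = max(centre - lo, hi - centre)
    return centre - half, centre + half
```

For a typical distribution whose most intense peak is the lowest-m/z peak, the "apex" option yields the widest window of the three, since it extends below the lowest peak by the same amount as it extends above it.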

(73) In practice, the common workflow of data-dependent experiments normally triggers the (filtered) peaks in descending order of intensity. As a result, the present disclosure provides a window centred around the most intense peak, including all peaks of the ISD (above a threshold). This may provide certain practical advantages.