Controlling Hydrogen-Deuterium Exchange on a Spectrum by Spectrum Basis
20170025260 · 2017-01-26
CPC classification
H01J49/0036
ELECTRICITY
H01J49/065
ELECTRICITY
H01J49/0072
ELECTRICITY
H01J49/0031
ELECTRICITY
H01J49/0054
ELECTRICITY
Abstract
A mass spectrometer is disclosed comprising a liquid chromatography device for separating ions. A gas phase ion-neutral reaction device is arranged downstream to perform a gas phase ion-neutral reaction such as Hydrogen-Deuterium exchange. A control system is arranged to automatically and repeatedly switch the reaction device back and forth between a first mode of operation and a second mode of operation, wherein in the first mode of operation at least some parent or precursor ions are caused to react within the reaction device and wherein in the second mode of operation substantially fewer or no parent or precursor ions are caused to react.
Claims
1. A method of mass spectrometry comprising: performing hydrogen-deuterium exchange on ions within a hydrogen-deuterium exchange device and producing at least one measured spectrum of said ions using a mass spectrometer; comparing the at least one measured spectrum with a library of known spectra to identify or characterise the ions on the basis of a characteristic pattern of deuteration.
2. A method as claimed in claim 1, further comprising deconvoluting the at least one measured spectrum, wherein the deconvoluted spectrum is compared with the library of known spectra.
3. A method as claimed in claim 2, wherein the at least one measured spectrum is deconvoluted by Bayesian inference.
4. A method as claimed in claim 1, comprising switching said hydrogen-deuterium exchange device between a first mode and a second mode to record alternate spectra, wherein in said first mode of operation at least some ions are caused to become deuterated within said second device and wherein in said second mode of operation substantially fewer or no ions are caused to become deuterated.
5. A method as claimed in claim 4, comprising correlating deuterated parent or precursor ions with corresponding non-deuterated parent or precursor ions.
6. A method as claimed in claim 1, comprising separating the ions in a liquid chromatography or capillary electrophoresis device.
7. A mass spectrometer as claimed in claim 1, further comprising a device for supplying a reagent gas or vapour to said hydrogen-deuterium exchange device and wherein said reagent gas or vapour is selected from the group consisting of: (i) deuterated ammonia or ND.sub.3; (ii) deuterated methanol or CD.sub.3OD; (iii) deuterated water or D.sub.2O; and (iv) deuterated hydrogen sulphide or D.sub.2S.
8. A method of mass spectrometry comprising: performing gas-phase ion-neutral reactions on ions within a gas-phase ion-neutral reaction device and producing at least one measured spectrum of said ions using a mass spectrometer; comparing the at least one measured spectrum with a library of known spectra to identify or characterise the ions on the basis of a characteristic pattern of said gas-phase ion-neutral reaction.
9. A mass spectrometer as claimed in claim 8, wherein said gas-phase ion-neutral reaction comprises ozonolysis.
10. A method of identifying or characterising a sample containing one or more components comprising: generating a library containing a list of ions and the characteristic pattern of deuteration for each ion; performing hydrogen-deuterium exchange on the sample to produce at least one measured mass spectrum; and comparing the at least one measured mass spectrum with the library on the basis of the pattern of deuteration to identify or characterise the components of the sample.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0077] Various embodiments of the present invention will now be described, by way of example only, and with reference to the accompanying drawings in which:
[0078]
[0079]
[0080]
[0081]
[0082]
[0083]
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0084]
[0085] In a preferred embodiment the separation device 1 preferably comprises a LC or nano-LC system and preferably includes an ESI/nano or ESI ion source and an Atmospheric Pressure Ionisation (API) inlet. In an alternative embodiment the separation device 1 may comprise an ion mobility separator. According to another less preferred embodiment the separation device 1 may comprise a quadrupole mass analyser or a linear ion trap. Other less preferred separation techniques are also contemplated.
[0086] In a preferred embodiment hydrogen/deuterium exchange is preferably performed within a hydrogen-deuterium exchange device 2 which preferably comprises a stacked ring ion guide comprising a plurality of electrodes each having an aperture through which ions are transmitted in use. A travelling wave or one or more transient DC voltages or transient DC voltage waveforms is preferably applied to the electrodes of the stacked ring ion guide in order to urge ions along at least part of the length of the ion guide. When a relatively high voltage pulse (e.g. 5 to 10 V) is applied to the electrodes using a default travelling wave velocity of 300 m/s then ions are preferably prevented from rolling over the top of the travelling wave. As a result, the ion residence time within the ion guide is relatively short and hence hydrogen-deuterium exchange within the ion guide is effectively disabled since the ion residence time is too short for hydrogen-deuterium exchange to occur.
[0087] According to an embodiment hydrogen/deuterium exchange may be enabled by reducing the amplitude of the travelling wave to a relatively low voltage (e.g. 0.2 V or 0 V). This has the effect of effectively switching OFF the travelling wave voltage and hence the ion residence time increases allowing hydrogen-deuterium exchange to occur.
[0088] According to another embodiment, the amplitude of the travelling wave may be kept constant and hydrogen/deuterium exchange may be controlled by controlling the velocity of the travelling wave. For example, if the amplitude of the travelling wave is set at an intermediate level and the pulse velocity is set very high (e.g. 600 m/s to 1000 m/s) then ions may simply roll over the travelling wave. As a result, the ion residence time is then relatively long and hydrogen-deuterium exchange is enabled. Hydrogen-deuterium exchange may be disabled by setting the pulse velocity to be relatively slower (e.g. 80 m/s to 300 m/s). At lower pulse velocities the ions may be caught by the travelling wave and urged along the length of the ion guide. As a result, the ion residence time is relatively short and hydrogen-deuterium exchange is preferably disabled.
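The two switching schemes described above (amplitude control and velocity control) can be summarised in a short Python sketch. The names `WaveSettings` and `wave_settings` are hypothetical, the numerical values are simply picked from the example ranges quoted in the text, and the 2.0 V "intermediate" amplitude is an assumed figure. No real instrument control interface is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WaveSettings:
    amplitude_v: float   # travelling wave pulse amplitude in volts
    velocity_m_s: float  # travelling wave velocity in metres per second


def wave_settings(hdx_enabled: bool, scheme: str = "amplitude") -> WaveSettings:
    """Return illustrative travelling wave settings for one spectrum."""
    if scheme == "amplitude":
        # Low amplitude: wave effectively OFF, long residence time, exchange occurs.
        # High amplitude: ions are carried along quickly, exchange is disabled.
        return WaveSettings(0.2 if hdx_enabled else 8.0, 300.0)
    if scheme == "velocity":
        # Fixed (assumed) intermediate amplitude; a fast wave lets ions roll over it.
        return WaveSettings(2.0, 800.0 if hdx_enabled else 200.0)
    raise ValueError(f"unknown scheme: {scheme}")


# Alternate between non-exchanging and exchanging modes, spectrum by spectrum.
schedule = [wave_settings(hdx_enabled=(i % 2 == 1)) for i in range(4)]
```

In this sketch every odd-numbered spectrum is acquired with exchange enabled, which mirrors the alternating acquisition described for the preferred embodiment.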
[0089] In other less preferred embodiments hydrogen/deuterium exchange may be performed within an ion guide and the residence time of ions passing through the device may be controlled by other methods.
[0090] According to an embodiment the hydrogen-deuterium exchange device may comprise a segmented multipole device and an axial driving field (DC or pseudo-potential) may be used to urge ions along and through the length of the ion guide.
[0091] In a preferred embodiment a hydrogen/deuterium exchange reagent gas or vapour such as ND.sub.3, CD.sub.3OD, D.sub.2O or D.sub.2S may be provided within the ion guide or hydrogen-deuterium exchange device.
[0092] In a preferred embodiment the analytical mass analyser 3 may comprise a Time of Flight mass analyser or a Fourier Transform electrostatic trap (such as an Orbitrap). In other less preferred embodiments other types of mass analyser may be used.
[0093] According to the preferred embodiment alternate mass spectra are preferably acquired wherein the hydrogen/deuterium exchange device 2 is preferably arranged to be switched ON and OFF between an exchanging and a non-exchanging mode of operation. The resulting mass spectra are preferably deconvoluted using their elution profiles.
[0094] In an embodiment the deconvolution may be performed using a computer algorithm such as BayesSpray to automate and improve the process of matching the hydrogen/deuterium exchange product ions to corresponding precursor or parent ions. The algorithm has previously been used for, and is particularly suited to, deconvoluting complex mixtures of precursor analytes and MS/MS fragments.
[0095] BayesSpray is a Bayesian Markov chain Monte Carlo deconvolution algorithm for mass spectrometry data and the algorithm is described in GB1008542.1, filed 21 May 2010, the contents of which are incorporated into the present application. For each isotopic cluster of peaks, the total signal associated with each level of deuteration is reconstructed, which significantly simplifies the data. By associating precursor or parent ions with product ions based on chromatographic retention time, the degree of deuterium uptake is then directly depicted. This automated process of deconvolution is preferably used to generate a characteristic list (or fingerprint) of precursor or parent ions and the pattern of deuteration for each precursor or parent ion. In addition, the degree of deuteration of each precursor or parent ion is recorded. Various hydrogen/deuterium exchange specific modifications to BayesSpray (including direct modelling of deuteration) enable the speed of deconvolution and/or the quality of the results obtained in a fixed processing time to be improved.
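As a rough illustration of associating ions by retention time and reading off deuterium uptake, the following Python sketch matches a product ion to the precursor with the most similar elution profile and estimates uptake from the centroid shift of an isotopic cluster. This is not BayesSpray: a plain Pearson correlation and a centroid difference are used as simple stand-ins, and all function names and data values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally sampled elution profiles."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def match_precursor(product_profile, precursor_profiles):
    """Return the name of the precursor whose elution profile correlates best."""
    return max(precursor_profiles,
               key=lambda name: pearson(product_profile, precursor_profiles[name]))


def centroid(peaks):
    """Intensity-weighted mean m/z of a cluster given as (m/z, intensity) pairs."""
    total = sum(intensity for _, intensity in peaks)
    return sum(mz * intensity for mz, intensity in peaks) / total


def deuterium_uptake(undeuterated, deuterated, charge):
    """Average number of H-to-D exchanges, from the centroid shift of the cluster."""
    mass_shift = (centroid(deuterated) - centroid(undeuterated)) * charge
    return mass_shift / 1.00628  # mass difference between D and H in Da


# Hypothetical elution profiles (intensity per scan).
precursors = {"A": [0, 2, 9, 4, 1, 0], "B": [0, 0, 1, 5, 8, 3]}
product = [0, 1, 2, 9, 15, 6]
best = match_precursor(product, precursors)

# Hypothetical triply charged cluster shifted by 1.0 in m/z after exchange.
uptake = deuterium_uptake([(432.9, 1.0)], [(433.9, 1.0)], charge=3)
```

Here the product ion co-elutes with precursor "B", and a shift of 1.0 in m/z at charge 3 corresponds to roughly three exchanged hydrogens.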
[0096] In other embodiments other deconvolution techniques may be used.
[0097]
[0098] The system preferably has a four spectrum cycle: (i) parent ion scan i.e. hydrogen/deuterium exchange disabled, fragmentation disabled; (ii) deuterated parent ion scan i.e. hydrogen/deuterium exchange enabled, fragmentation disabled; (iii) fragment ion scan i.e. hydrogen/deuterium exchange disabled, fragmentation enabled; and finally (iv) deuterated fragment ion scan i.e. hydrogen/deuterium exchange enabled, fragmentation enabled. The resulting mass spectra are preferably deconvoluted and fragment ions are preferably assigned to precursor or parent ions using their elution profiles.
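The four spectrum cycle can be written down as a small repeating schedule. The sketch below is illustrative only; the names `FOUR_SPECTRUM_CYCLE` and `acquisition_plan` are hypothetical and do not correspond to any instrument software.

```python
from itertools import cycle, islice

# (label, hydrogen/deuterium exchange enabled, fragmentation enabled)
FOUR_SPECTRUM_CYCLE = [
    ("parent ion scan", False, False),
    ("deuterated parent ion scan", True, False),
    ("fragment ion scan", False, True),
    ("deuterated fragment ion scan", True, True),
]


def acquisition_plan(n_spectra):
    """Return the mode used for each of the first n_spectra acquisitions."""
    return list(islice(cycle(FOUR_SPECTRUM_CYCLE), n_spectra))


plan = acquisition_plan(6)
```

The fifth spectrum in the plan returns to the plain parent ion scan, so each of the four modes samples the same chromatographic peak repeatedly as it elutes.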
[0099] A further embodiment of the present invention is shown in
[0100] The multi-mode HDx devices 5 preferably comprise an ion guide which may be operated either as a hydrogen-deuterium exchange device, an ETD device or a CID device.
[0101] The multi-mode ion mobility separator device 6 preferably comprises an ion guide which may be operated either as an ion mobility separator, a CID fragmentation device or as an ion guide.
[0102] In a preferred embodiment the two multi-mode HDx devices 5 and/or the ion mobility separator device 6 comprise travelling wave enabled stacked ring ion guides, although other geometries are contemplated. According to an embodiment HDx may be performed in the hydrogen-deuterium exchange device 2, followed by ETD in the first multi-mode HDx device 5, followed by ion mobility separation (IMS) in the ion mobility separation device 6, followed by CID in the second multi-mode HDx device 5. Deconvolution is preferably performed based upon both LC retention time and ion mobility drift time.
[0103] Clearly one skilled in the art may construct other advantageous geometries without detracting from the scope of this invention.
[0104] Experimental data was generated on a modified Waters Synapt hybrid quadrupole Time of Flight mass spectrometer as shown in
[0105] The mass spectrometer was modified by the addition of a gas inlet needle valve connected to the source ion guide gas inlet allowing the introduction of fully deuterated ammonia (ND.sub.3) into the T-Wave ion guide 46 which is arranged upstream of a quadrupole rod set mass filter 47.
[0106] When the needle valve was closed so that deuterated ammonia was not introduced into the travelling wave ion guide 46 then the pressure in the travelling wave ion guide 46 was 1.40×10.sup.-3 mbar.
[0107] When ND.sub.3 was introduced into the travelling wave ion guide 46 then the indicated pressure in the travelling wave ion guide 46 was 1.42×10.sup.-3 mbar.
[0108] Angiotensin I (Asp-Arg-Val-Tyr-Ile-His-Pro-Phe-His-Leu (C.sub.62H.sub.89N.sub.17O.sub.14)) was ionised using a standard ESI probe and triply charged precursor or parent ions having a mass to charge ratio of 432.9 were monitored.
[0109] A mass spectrum of Angiotensin I was obtained under normal conditions (i.e. without introducing ND.sub.3 into the source travelling wave ion guide 46) and is shown in
[0110]
[0111]
[0112] From comparing
[0113]
[0114] Although the preferred embodiment has been described as relating to Hydrogen-Deuterium exchange wherein gas phase ions react with neutral gas, the present invention is also intended to cover other gas phase ion-neutral reactions including ozonolysis.
BayesSpray
[0115] Mass spectrometers can be used for many applications including identification, characterisation and relative and absolute quantification of proteins, peptides, oligonucleotides, phosphopeptides, polymers and fragments or a mixture of these produced inside the mass spectrometer. One of the current limiting factors in the generation of these results is the analysis of the raw data produced from the mass spectrometer, in particular the isolation and mass measurement of species present in complicated mass spectra.
[0116] The data produced by mass spectrometers are complicated due to the ionisation process, the presence of isotopes and the individual characteristics of each instrument.
[0117] Current methods for the analysis of raw data produced from mass spectrometers include maximum entropy deconvolution and various algebraic techniques based on inversion, usually by a linear filter.
[0118] In attempting to deconvolute the data, linear inversion sharpens individual peaks, which has the unfortunate side effect of introducing ringing which damages the reconstruction of complex spectra containing many overlapping peaks. The peaks interfere with each other, and the ringing is liable to produce physically-impossible regions of negative intensity.
[0119] Maximum entropy (see "Disentangling electrospray spectra with maximum entropy", Rapid Communications in Mass Spectrometry, 6, 707-711) is a nonlinear maximisation inversion, designed to produce an optimal ("best possible") result from the given data. In spectrometry, the natural measure of quality of a reconstructed mass spectrum I(M) is the entropy:
entropy=−∫I(M) log I(M) dM
[0120] Being negative information, this measures the cleanliness of the result, which result (because of the logarithm) is everywhere positive and so physically permissible. Any spectrum I* other than the maximum entropy spectrum I.sup.MaxEnt has more structure, which by definition was not required by the data, so is unreliable.
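A discrete analogue of the entropy measure may help make the idea concrete. The following Python sketch evaluates the negative sum of I log I over the bins of a spectrum; normalising the spectrum to unit total intensity is an assumption made here for illustration, and the function name is hypothetical.

```python
import math


def spectrum_entropy(intensities):
    """Discrete form of -sum I log I for a spectrum normalised to unit total."""
    if any(i < 0 for i in intensities):
        raise ValueError("intensities must be non-negative")
    total = sum(intensities)
    probs = [i / total for i in intensities if i > 0]  # 0 * log 0 is taken as 0
    return -sum(p * math.log(p) for p in probs)


flat = spectrum_entropy([1, 1, 1, 1])                 # featureless spectrum
peaked = spectrum_entropy([0.97, 0.01, 0.01, 0.01])   # one dominant line
```

The featureless spectrum has the larger entropy, consistent with the statement that any additional structure lowers the entropy and must therefore be justified by the data.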
[0121] Modern professional standards demand quantified error bars that are produced from probabilistic (aka Bayesian) analysis. In order to understand exactly which parts of the maximum entropy result are reliable and which may be unreliable, one needs not just the best but also the range of the plausible. To estimate uncertainty, quadratic expansion around the maximum entropy result yields a Gaussian approximation which appears to define the uncertainty on any specified feature. This approach has been implemented but the expansion is deceptive.
[0122] Many modern instruments produce high resolution spectra which may be digitised into a correspondingly large number N of bins. As the quality of instrumentation improves, N increases, so that the proportion of signal in any particular bin diminishes as 1/N. The same is true for the variances produced by the quadratic approximation. Hence the size of the error bars around the maximum entropy result decreases more slowly, as the square root of 1/N. The reconstructed signal in a local bin that started comfortably positive as (3±1) percent becomes, at hundred-fold greater resolution, (0.03±0.1) percent, with a substantial probability of being negative. Across the entire spectrum, it becomes almost certain that there will be many negatives in a typical result. But signals are supposed to be positive, so almost all supposedly typical results are impossible when viewed on small scales.
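The scaling argument above can be checked numerically. The sketch below assumes, as the quadratic approximation does, that the reconstructed signal in a bin is Gaussian distributed; the function names are hypothetical.

```python
import math


def probability_negative(mean, sigma):
    """P(value < 0) for a Gaussian with the given mean and standard deviation."""
    return 0.5 * (1.0 + math.erf((0.0 - mean) / (sigma * math.sqrt(2.0))))


def rescale(mean, sigma, factor):
    """Signal per bin scales as 1/N; the error bar scales as sqrt(1/N)."""
    return mean / factor, sigma / math.sqrt(factor)


# Start at (3 +/- 1) percent and increase the resolution one hundred-fold.
mean_hi, sigma_hi = rescale(3.0, 1.0, 100)  # approximately (0.03, 0.1) percent
p_before = probability_negative(3.0, 1.0)
p_after = probability_negative(mean_hi, sigma_hi)
```

Under this Gaussian assumption the chance of a negative value in the bin rises from roughly 0.1 percent to well over a third, which is the breakdown at small scales that the text describes.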
[0123] Thus the quadratic approximation breaks down at small scales, where error bars are clearly incorrect so that local structure is not properly quantified. There is therefore a need for an improved deconvolution method with the rigour, power and flexibility to deal with modern instrument performance and applications.
[0124] A method of identifying and/or characterising at least one property of a sample is disclosed, the method comprising the steps of producing at least one measured spectrum of data from a sample using a mass spectrometer; deconvoluting the at least one measured spectrum of data by Bayesian inference to produce a family of plausible deconvoluted spectra of data; inferring an underlying spectrum of data from the family of plausible deconvoluted spectra of data; and using the underlying spectrum of data to identify and/or characterise at least one property of the sample.
[0125] The method may also comprise the step of identifying the uncertainties associated with the underlying spectrum of data, e.g. from the family of plausible deconvoluted spectra of data.
[0126] Additionally or alternatively, the deconvolution step may further comprise assigning a prior, for example using a procedure that may comprise one or more, for example at least two steps. The procedure may comprise first assigning a prior to the total intensity and then, for example, modifying the prior to encompass the relative proportions of this total intensity that is assigned to specific charge states.
[0127] Optionally, the deconvolution step may further comprise the use of a nested sampling technique.
[0128] The procedure may comprise varying predicted ratios of isotopic compositions, for example to identify and/or characterise the at least one property of the sample.
[0129] The method may further comprise comparing at least one characteristic of the underlying spectrum of data, e.g. with a library of known spectra, for example to identify and/or characterise the at least one property of the sample.
[0130] The method may also comprise comparing at least one characteristic of the underlying spectrum of data, for example with candidate constituents, e.g. to identify and/or characterise the at least one property of the sample.
[0131] The deconvolution step comprises the use of importance sampling.
[0132] Optionally, the at least one measured spectrum of data may comprise electrospray mass spectral data.
[0133] The method may further comprise recording a temporal separation characteristic for the at least one measured spectrum of data and/or may include storing the underlying spectrum of data, e.g. with the recorded temporal separation characteristic, for example on a memory means.
[0134] The method may also comprise recording a temporal separation characteristic for the at least one measured spectrum of data, e.g. and using the recorded temporal separation characteristic, for example to identify and/or characterise the or a further at least one property of the sample.
[0135] A system for identifying and/or characterising a sample is disclosed, the system comprising: a mass spectrometer for producing at least one measured spectrum of data from a sample; a processor configured or programmed or adapted to deconvolute the at least one measured spectrum of data by Bayesian inference to produce a family of plausible deconvoluted spectra of data and infer an underlying spectrum of data from the family of plausible deconvoluted spectra of data; wherein the processor is further configured or programmed or adapted to use the underlying spectrum of data to identify and/or characterise at least one property of the sample.
[0136] The system may further comprise a first memory means for storing the underlying spectrum of data and/or a second memory means on which is stored a library of known spectra. The processor may be further configured or programmed or adapted to carry out a method as described above.
[0137] A computer program element is disclosed, for example comprising computer readable program code means, e.g. for causing a processor to execute a procedure to implement the method described above.
[0138] The computer program element may be embodied on a computer readable medium.
[0139] A computer readable medium having a program stored thereon is disclosed, for example where the program is to make a computer execute a procedure, e.g. to implement the method described above.
[0140] A mass spectrometer suitable for carrying out, or specifically adapted to carry out, a method as described above and/or comprising a program element as described above and/or a computer readable medium as described above is disclosed.
[0141] A retrofit kit for adapting a mass spectrometer to provide a mass spectrometer as described above is disclosed. The kit may comprise a program element as described above and/or a computer readable medium as described above.
[0142] A method and apparatus for the deconvolution of mass spectral data is provided. This method preferably uses Bayesian inference implemented using nested sampling techniques in order to produce improved deconvoluted mass spectral data.
[0143] Bayesian inference is the application of standard probability calculus to data analysis, taking proper account of uncertainties.
[0144] Bayesian inference does not provide absolute answers. Instead, data modulate our prior information into posterior results. Good data is sufficiently definitive to over-ride prior ignorance, but noisy or incomplete data is not. To account for this, the rules of probability calculus require assignment of a prior probability distribution over a range sufficient to cover any reasonable result. A mass range within which the target masses must lie might be specified, and, less obviously, information about how many target masses are reasonable could be provided.
[0145] Prior information must be specified in enough detail to represent expectations about what the target spectrum (in the preferred embodiment, a spectrum of parent masses) might be, before the data are acquired. One specifies an appropriate range of targets T through a probability distribution:
prior(T)=prior probability of target T
known in Bayesian parlance as the prior.
[0146] There is a huge number of possible targets, depending on how many masses may be present, and the myriad different values those masses and their associated intensities could take. Practical instrumentation usually has a few more calibration parameters as well, which adds to the uncertainty in the target. Nevertheless, it is assumed that the instrument can be modelled well enough that average data (known as mock data) can be calculated for any proposed target (and any proposed calibration). Actual data will be noisy, and won't fit the mock data exactly. The noise is part of the presumed-known instrumental characteristics, so that the misfit between actual and mock data lets us calculate, as a probability, how likely the actual data were. This probability is known as the likelihood:
Lhood(T)=Prob(actual data D GIVEN proposed target T)
which is the other half of the Bayesian inputs (the other being the prior).
[0147] The product law of probability calculus then gives a joint distribution:
Joint(T)=prior(T)×Lhood(T)
[0148] In the presence of complicated data, the possibility of processing the joint distribution through algebraic manipulation rapidly fades, so that it needs to be computed numerically as an ensemble of typically a few dozen plausible targets T.sub.1, T.sub.2, . . . , T.sub.n, accompanied by weights w.sub.1, w.sub.2, . . . , w.sub.n that need not be uniform.
[0149] Methods which yield these weighted ensembles are required. These methods will provide the joint distribution.
[0150] Using the probability product law the other way round gives the Bayesian outputs:
Joint(T)=Evidence×Posterior(T)
[0151] The evidence measures how well the prior model managed to predict the actual data, which assesses the quality of the model against any alternative suggestions. It is evaluated as the sum of the weights. The posterior is the inference about what the target was, which is usually the user's primary aim. It is evaluated as the ensemble of plausible targets, weighted by the relative w's.
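Given a weighted ensemble, both Bayesian outputs follow from simple sums. The following Python sketch is a minimal illustration with hypothetical names and made-up numbers: the evidence is the sum of the weights, and a posterior property (here a single mass read off each plausible target) is averaged using the relative weights.

```python
def evidence(weights):
    """Evidence is evaluated as the sum of the ensemble weights."""
    return sum(weights)


def posterior_mean_and_std(values, weights):
    """Posterior mean and spread of one scalar property across the ensemble."""
    z = evidence(weights)
    rel = [w / z for w in weights]  # relative weights sum to 1
    mean = sum(r * v for r, v in zip(rel, values))
    var = sum(r * (v - mean) ** 2 for r, v in zip(rel, values))
    return mean, var ** 0.5


# Hypothetical ensemble: one mass (Da) taken from each of four plausible targets.
masses = [1295.6, 1295.8, 1295.7, 1296.1]
weights = [2e-5, 5e-5, 2e-5, 1e-5]
z = evidence(weights)
mean_mass, std_mass = posterior_mean_and_std(masses, weights)
```

The spread returned alongside the mean is what supplies an error bar for the inferred property.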
[0152] The joint distribution thus includes both halves, evidence and posterior, of Bayesian inference. Nested sampling is the preferred method for the computation of this distribution.
[0153] It is easy to take random samples from the prior alone, ignoring the data. Each sample target has its likelihood value, so in principle it might be possible to find the good targets of high likelihood by taking random proposals. The difficulty is that there is too much choice. Suppose a mass spectrum has 100 lines each located to 100 ppm (1 in 10000 accuracy). Only one trial in 10000.sup.100=10.sup.400 will get to the right answer. Obviously, computing 10.sup.400 samples would be prohibitively time consuming and is therefore impractical.
[0154] That example illustrates that the posterior is exponentially tighter than the prior. Every relevant bit of data halves the number of plausible results, so compresses by a factor of 2. Although the number of relevant bits may be considerably less than the size of the (somewhat redundant) dataset, it is still likely to be hundreds or thousands. To accomplish exponential compression, it is essential to bridge iteratively from prior to posterior. A single step can compress by O(1), say a factor of 2, without undue inefficiency, so that the required compression can be achieved in a feasible number (say hundreds or thousands) of iterations.
[0155] The required deconvolution is preferably of electrospray mass spectrometry data. In this case, the data is complicated by the presence of variable charge attached to each target mass. Nested Sampling enables the required probability computation to be accomplished, even in the face of the extra uncertainty of how the signals from each parent mass are distributed over charge.
[0156] Nested Sampling (see "Nested sampling for general Bayesian computation", Journal of Bayesian Analysis, 1, 833-860 (2006)) is an inference algorithm specifically designed for large and difficult applications. In mass spectrometry, iteration is essential because single-pass algorithms are inherently incapable of inferring a spectrum under the nonlinear constraint that intensities must all be positive. Nested-sampling iterations steadily and systematically extract information (also known as negative entropy) from the data and yield mass spectra with ever-closer fits.
[0157] Although capable of proceeding to a final maximum likelihood solution, the algorithm is in practice stopped when it has acquired enough information to define the distribution of spectra that are both intrinsically plausible and offer a probabilistically correct fit to the data. After all, any single solution would be somehow atypical, whereas professional standards demand that results are provided with proper estimates of the corresponding uncertainties, which can only be achieved through the ensemble.
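To make the iteration concrete, the following Python sketch runs a basic form of nested sampling on a toy one-parameter problem: a flat prior on the unit interval and a narrow Gaussian likelihood. It is not the BayesSpray implementation. Replacement points are found by simple rejection sampling from the prior, which is only workable for a toy problem of this size, and the function names, the number of live points and the stopping rule (a fixed iteration count) are all assumptions.

```python
import math
import random


def nested_sampling(log_likelihood, n_live=100, n_iter=600, seed=0):
    """Return weighted samples for a flat prior on [0, 1]."""
    rng = random.Random(seed)
    live = [rng.random() for _ in range(n_live)]
    live_logl = [log_likelihood(x) for x in live]
    samples, weights = [], []
    x_prev = 1.0  # prior mass still enclosed by the likelihood constraint
    for i in range(1, n_iter + 1):
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_star = live_logl[worst]
        x_now = math.exp(-i / n_live)  # expected compression per iteration
        samples.append(live[worst])
        weights.append(math.exp(logl_star) * (x_prev - x_now))
        x_prev = x_now
        # Replace the worst point by a prior draw with higher likelihood.
        while True:
            candidate = rng.random()
            cand_logl = log_likelihood(candidate)
            if cand_logl > logl_star:
                break
        live[worst], live_logl[worst] = candidate, cand_logl
    # The surviving live points share the remaining prior mass equally.
    for x, logl in zip(live, live_logl):
        samples.append(x)
        weights.append(math.exp(logl) * x_prev / n_live)
    return samples, weights


def log_l(x):
    """Toy Gaussian log-likelihood centred at 0.5 with width 0.05."""
    return -0.5 * ((x - 0.5) / 0.05) ** 2


samples, weights = nested_sampling(log_l)
z_est = sum(weights)  # evidence estimate
mean_est = sum(w * x for x, w in zip(samples, weights)) / z_est
```

For this toy likelihood the exact evidence is close to 0.05 times the square root of 2π (about 0.125), so the estimate can be checked against a known answer; the weighted samples themselves play the role of the ensemble from which means and uncertainties are derived.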
[0158] Although nested sampling can in principle cope with arbitrary likelihood and arbitrary prior, it remains advantageous to choose an appropriate prior (the likelihood function being fixed by the responses as specified by the equipment manufacturer). If the assigned prior is not appropriate, the data will be unnecessarily surprising, which shows up as an unnecessarily low evidence value, which in turn takes longer (possibly hugely longer) to compute.
[0159] Particularly in electrospray, it is easy to choose a prior that is not appropriate. This is because a given mass M may carry charges Z varying over a substantial range, perhaps anywhere from 10 to 20 for a mass of 20000. A prior on this distribution is needed, because mock data must be predicted. Given that the charge states appear separately in the observed M/Z data, it might seem reasonable to assign a separate prior for each charge state e.g.:
[0160] However, it then becomes very unlikely that a mass will appear with a low total signal strength, because all 11 individual strengths have to be small before the total can be small. This is not usually expected: real spectra usually have many weak signals and this, according to the prior, is extremely improbable. Hence nested sampling runs much too slowly, in practice freezing onto any of a variety of wrong answers.
[0161] It is better to use a two-stage prior for the signal strengths. First, a master prior is assigned to the total intensity I. In one embodiment this may be Cauchy:
Prior(I) ∝ 1/(I.sup.2+constant).
[0162] With total intensity fixed, the subsidiary prior on charge state becomes a prior on the relative proportions assigned to specific charges. In one embodiment this may be uniform:
Prior for (Z=10 and Z=11 and . . . Z=20 GIVEN I)=constant.
[0163] In another embodiment, the charge-state signals could be correlated and/or weighted by charge. With this sort of two-stage prior, the algorithm no longer freezes inappropriately.
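A draw from a two-stage prior of this kind can be sketched as follows. The sketch assumes a half-Cauchy master prior restricted to non-negative intensities (with the "constant" equal to the square of a scale parameter) and takes "uniform" to mean uniform over all ways of splitting the total among the charge states. The function name and parameter values are hypothetical.

```python
import math
import random


def sample_two_stage_prior(charges, scale, rng):
    """Draw a total intensity, then split it over the charge states."""
    # Stage 1: half-Cauchy master prior, density proportional to
    # 1/(I^2 + scale^2) for I >= 0, sampled by inverting its cumulative form.
    total = scale * math.tan(0.5 * math.pi * rng.random())
    # Stage 2: proportions uniform over the simplex, obtained by normalising
    # independent exponential draws.
    raw = [rng.expovariate(1.0) for _ in charges]
    norm = sum(raw)
    return total, {z: total * r / norm for z, r in zip(charges, raw)}


rng = random.Random(1)
total, by_charge = sample_two_stage_prior(range(10, 21), scale=1.0, rng=rng)
```

Because the total is drawn first, a weak overall signal is no less probable for a mass with eleven charge states than for a mass with one, which is the property the separate-prior scheme lacks.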
[0164] The immediate output from nested sampling is an ensemble of several dozen typical spectra, each in the form of a list of parent masses. These masses have intensities which are separately and plausibly distributed over charge. Just as in statistical mechanics (which helped to inspire nested sampling), the ensemble can be used to define mean properties together with fluctuations. In this way, nested-sampling results can be refined to a list of reliably inferred masses, with proper error bars expressing statistical uncertainty, and full knowledge of how each mass relates to the data.
[0165] Individual parent masses are accompanied by, maybe dominated by, their isotope distributions. In typical deconvolution, the isotopic composition of a given mass M is fixed at some ratio pattern:
[0166] Parent:Isotope#1:Isotope#2: . . .
given by an average chemical composition. In the standard arrangement mock data is produced from trial parent masses by convolution with this mass-dependent isotope distribution, expanded to cover the charge states, and finally convolved with the instrumental peak shape.
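The first two steps of producing mock data (applying the isotope pattern and expanding over charge states) can be sketched in Python as below. The sketch assumes positive ion mode with protons as charge carriers and an isotope spacing of about 1.00335 Da; the isotope ratios in the example are made up, and the final convolution with the instrumental peak shape is omitted.

```python
PROTON = 1.007276          # Da, mass of a charge-carrying proton (assumed mode)
ISOTOPE_SPACING = 1.00335  # Da, approximate spacing between isotope peaks


def mock_sticks(parent_mass, intensity, isotope_ratios, charge_fractions):
    """Expand one trial parent mass into (m/z, intensity) sticks."""
    ratio_sum = sum(isotope_ratios)
    sticks = []
    for z, frac in charge_fractions.items():
        for k, ratio in enumerate(isotope_ratios):
            mz = (parent_mass + k * ISOTOPE_SPACING + z * PROTON) / z
            sticks.append((mz, intensity * frac * ratio / ratio_sum))
    return sticks


# Hypothetical trial: a parent of mass 1295.677 Da carried entirely at charge 3,
# with an illustrative Parent:Isotope#1:Isotope#2 pattern of 1.0:0.7:0.3.
sticks = mock_sticks(1295.677, 100.0, [1.0, 0.7, 0.3], {3: 1.0})
```

With these inputs the first stick falls at a mass to charge ratio of about 432.9, and the stick intensities sum to the trial intensity.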
[0167] Another complication in the analysis of mass spectral data is the presence of a variety of naturally occurring or artificially introduced isotopic variants of the elements comprising the molecules being analyzed. Furthermore, deviations from the assumed pattern can occur for particular compositions. These induce harmonic artefacts at wrong masses, as the probability factors try to fit the data better. In one arrangement a distribution:
[0168] Prior for (Parent, Isotope#1, Isotope#2, . . . )
of isotope proportions may be used. This distribution should be peaked around the average, but also allow appropriate flexibility.
[0169] For each dataset, an appropriate model of the instrumental peak shape corresponding to an isotopically pure species can be used. For example, a fixed full width at half maximum might be used for quadrupole data, whereas a fixed instrument resolution could be specified for TOF data.
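The two peak width models mentioned above differ only in how the width depends on mass to charge ratio. The following Python sketch illustrates this with a Gaussian peak shape, which is an assumption made for simplicity; the default width and resolution values are illustrative, and the function names are hypothetical.

```python
import math


def fwhm_quadrupole(mz, fixed_fwhm=0.5):
    """Quadrupole-style model: the same peak width at every m/z."""
    return fixed_fwhm


def fwhm_tof(mz, resolution=10000.0):
    """TOF-style model: fixed resolution R = (m/z) / FWHM, so width grows with m/z."""
    return mz / resolution


def gaussian_peak(x, centre, fwhm):
    """Unit-height Gaussian peak with the given full width at half maximum."""
    sigma = fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - centre) / sigma) ** 2)


width_low = fwhm_tof(500.0)    # narrower peak at low m/z
width_high = fwhm_tof(2000.0)  # wider peak at high m/z
half_height = gaussian_peak(1000.0 + fwhm_tof(1000.0) / 2.0, 1000.0, fwhm_tof(1000.0))
```

Evaluating the peak half a width away from its centre returns one half, confirming that the width parameter is indeed the full width at half maximum.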
[0170] In a further arrangement, the computation may be reformulated by using importance sampling to reduce the computational load. This statistical method has the side-effect of improving the accuracy and fidelity of the results obtained. In the original embodiment, each parent has a uniform prior over its mass:
prior(M)=flat
and the given likelihood Lhood(M) is used directly. If this is the only mass present, this likelihood yields the joint distribution:
Joint(M)=prior(M)×Lhood(M)
which represents the very simplest (single-parent) deconvolution.
[0171] But it is also possible to write:
Joint(M)=density(M)×(prior(M)×Lhood(M)/density(M))
for arbitrary density. Instead of starting with the prior and applying the likelihood, it is also possible to start with the new density and apply the modified likelihood:
Modified(M)=prior(M)×Lhood(M)/density(M)
[0172] If the density removes structure from the likelihood and modifies it to something less sharp and spiky, this will reduce the computational load.
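The effect of the rewriting can be seen in a toy single-parent example. In the Python sketch below the evidence is estimated twice by plain Monte Carlo averaging: once by drawing from the flat prior and averaging the likelihood, and once by drawing from a density and averaging the modified likelihood. The Gaussian likelihood, the choice of density and all names are assumptions made for illustration; the sketch uses simple averaging rather than nested sampling.

```python
import math
import random


def lhood(m, centre=0.3, width=0.01):
    """Toy sharp likelihood for a single parent 'mass' m on a unit interval."""
    return math.exp(-0.5 * ((m - centre) / width) ** 2)


def prior(m):
    """Flat prior on [0, 1]."""
    return 1.0 if 0.0 <= m <= 1.0 else 0.0


def density(m, centre=0.31, width=0.03):
    """Assumed sampling density: a normalised Gaussian roughly covering the peak."""
    return math.exp(-0.5 * ((m - centre) / width) ** 2) / (width * math.sqrt(2.0 * math.pi))


def evidence_from_prior(n, rng):
    """Average the likelihood over draws from the flat prior."""
    return sum(lhood(rng.random()) for _ in range(n)) / n


def evidence_from_density(n, rng):
    """Average Modified(M) = prior * Lhood / density over draws from the density."""
    total = 0.0
    for _ in range(n):
        m = rng.gauss(0.31, 0.03)  # same centre and width as density()
        total += prior(m) * lhood(m) / density(m)
    return total / n


rng = random.Random(0)
exact = 0.01 * math.sqrt(2.0 * math.pi)  # integral of the toy likelihood
z_prior = evidence_from_prior(20000, rng)
z_density = evidence_from_density(20000, rng)
```

Both estimates target the same value, but the modified likelihood is much less spiky than the original, so for the same number of draws the density-based estimate scatters far less around the exact answer.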
[0173] As it happens, there is a natural density to hand. Most mass spectrometry data is essentially linear, so that:
Mock data=(Linear matrix)·(Target masses)
Applying that linear matrix in reverse (as its transpose) to the real data yields a candidate:
density=(transpose of Linear matrix)·(real data)
[0174] This density is a doubly-blurred version of the true target, blurred once in the instrument and by the multiplicity of charge state, and again via the transpose. Nevertheless, the computational task of deconvolving it is often very much less than having to start from scratch, with a flat prior. Such a program runs much more quickly and precisely.
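The forward model and its transpose can be illustrated with a very small matrix. In the Python sketch below the response matrix and target intensities are made up, and the mock data stand in for real data; the helper names are hypothetical.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(col) for col in zip(*matrix)]


# Hypothetical response: 3 data bins, 2 candidate target masses.
linear = [[1.0, 0.0],
          [0.5, 0.5],
          [0.0, 1.0]]
targets = [2.0, 4.0]
mock_data = mat_vec(linear, targets)                    # forward model
trial_density = mat_vec(transpose(linear), mock_data)   # back-projection
```

The back-projected density is larger where the target is larger, but it is smeared relative to the true target, which is the double blurring referred to above.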
[0175] In another arrangement, the data being deconvoluted may come from a TOF, Quadrupole, FTICR, Orbitrap, Magnetic sector, 3D Ion trap or Linear ion trap. In each of these instances, an appropriate model of peak shape and width as a function of mass to charge ratio and intensity should be used.
[0176] In a further arrangement, the data being deconvolved may be produced from ions generated by an ion source from ESI, ETD etc.
[0177] In each of these instances, the distribution of charge states is characteristic of the technique. For example, ions produced by MALDI ionization are usually singly charged, while electrospray produces a distribution over a large range of charge states for large molecules.
[0178] In a yet further arrangement, the data being processed may be from species that have been separated using a separation device selected from the group including but not limited to: LC, GC, IMS, CE, FAIMS or combinations of these or any other suitable separation device. In each case, the distribution over the extra analytical dimensions is treated similarly to the distribution over charge states as described above.
[0179] In a still further arrangement, the data being deconvolved may be produced from a sample containing proteins, peptides, oligonucleotides, carbohydrates, phosphopeptides, and fragments or a mixture of these. In each case, the isotope model or models employed should reflect the composition of the type of sample being analyzed. As part of this embodiment, trial masses may be assigned individual molecule types.
[0180] Although the present invention has been described with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made without departing from the scope of the invention as set forth in the accompanying claims.