Adduct-Based System and Methods for Analysis and Identification of Mass Spectrometry Data
20170365458 · 2017-12-21
Inventors
- James R. Collins (Seattle, WA, US)
- Bethanie R. Edwards (Honolulu, HI, US)
- Helen F. Fredricks (Rochester, MA, US)
- Benjamin AS Van Mooy (Falmouth, MA, US)
Cpc classification
H01J49/0036
ELECTRICITY
G06F16/90
PHYSICS
International classification
H01J49/42
ELECTRICITY
Abstract
A system and method to screen a plurality of molecules in datasets obtained from mass spectroscopy, including selecting and receiving at least one dataset of mass spectral data, and selecting customizable m/z mass tolerance peaks to assign initial compound assignments from at least one adduct ion hierarchy database for at least one compound having a parent molecule. Adduct ion hierarchy screening is applied to at least a portion of the dataset, wherein selected dataset features are tested to determine if they represent the most abundant expected adduct of the parent molecule class and if the expected adduct assignment hierarchy is present in the dataset.
Claims
1. A method of screening a plurality of molecules in datasets obtained from mass spectroscopy, the method comprising: (a) selecting and receiving at least one dataset of mass spectral data; (b) selecting customizable m/z mass tolerance peaks to assign initial compound assignments from at least one adduct ion hierarchy database for at least one compound having a parent molecule; and (c) applying adduct ion hierarchy screening to at least a portion of the database, wherein selected dataset features are tested to determine if they represent the most abundant expected adduct of the parent molecule class and if the expected adduct assignment hierarchy are present in the dataset.
2. The method of claim 1 further including pre-processing the dataset including aggregating the dataset into subsets based on at least one feature.
3. The method of claim 2 further including applying a screening criteria to identify and remove secondary isotopes in the subsets.
4. The method of claim 2 further including applying a screening criteria of retention time to at least one subset of the database based on a relationship of retention time of the dataset features as compared with a predefined retention time window of the compound assignment's parent molecule class.
5. The method of claim 4 further including applying a screening criteria to identify and annotate isomer and isobars in the subset.
6. The method of claim 4 further including assigning one or more identifications and annotations to each feature in the subset.
7. The method of claim 1, wherein preprocessing the dataset includes at least one of feature detection, retention time correction, peak grouping, m/z, secondary isotope identification, and a combination thereof.
8. The method of claim 1, wherein the feature is selected from retention time, acyl carbon number, peak, and a combination thereof.
9. The method of claim 1, wherein the first screening criteria includes identifying and excluding secondary isotopes.
10. The method of claim 4 further including applying a screening criteria to the subset to exclude one or more specific molecules, specific chemical moieties or molecules containing a specific number of one or more chemical moieties, in a customizable manner.
11. A system for screening a plurality of molecules in mass spectroscopy datasets, the system comprising a processor programmed to execute the method of claim 1, and a user interface to present screening results to a user.
12. A computer program for screening a plurality of molecules in mass spectroscopy datasets, the program comprising the method of claim 1, wherein said program is executed on a computer device.
13. The method of claim 4, further including formatting the screened subset such that it will be analyzable by additional software on a computer device.
14. The method of claim 4, further including performing statistical analysis on the subset after adduct ion hierarchy screening is applied.
15. The method of claim 4, further including exporting the screened subset to a common file format readable by additional software on a computer device.
16. The method of claim 4 wherein the method annotates the resulting subset with codes demarking the degree to which the assignment complies with the hierarchy screening criteria.
17. The method of claim 1 wherein at least one adduct ion hierarchy database is generated in at least one of a positive ion mode and a negative ion mode.
18. The method of claim 1 further including generating the dataset utilizing at least one additional chemical that is added to an eluent to which the molecules are exposed at least prior to the mass spectroscopy.
19. The method of claim 1 further including preparing at least one adduct ion hierarchy database, and selecting that database for screening the dataset.
20. The method of claim 1 further including generating adduct ion hierarchy databases from empirical data produced from standardized parent molecules that have undergone ionization and measurement by mass spectrometry.
21. The method of claim 17 further including ranking the databases by adduct ion ranking.
22. The method of claim 1 wherein the mass spectrometry data is selected to include at least one of liquid chromatography-mass spectrometry data, gas chromatography-mass spectrometry data, Fourier transform mass spectrometry data, direct infusion mass spectrometry data, capillary electrophoresis mass spectrometry data, ion mobility shift mass spectrometry data, desorption electrospray ionization mass spectrometry data, nanostructure initiator mass spectrometry or matrix assisted mass spectrometry data.
23. The method of claim 1 further including generating a confidence value for the identifications assigned to each feature.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0020] The drawings described herein constitute part of this specification and include exemplary embodiments of the inventive system and methods, also referred to as the software, which may be further embodied in various forms. It is to be understood that in some instances, various aspects of the invention may be shown exaggerated or enlarged to facilitate an understanding of the invention. One or more drawings and Tables can be generated and presented on a user interface. In what follows, preferred embodiments of the invention are explained in more detail with reference to the drawings, in which:
BRIEF DESCRIPTION OF THE TABLES
[0030] Table 1. Quality control samples of known composition analyzed by the inventive software system.
[0031] Table 2. Progressive screening and annotation of the P. tricornutum dataset using xcms, CAMERA, and the inventive software system.
[0032] Table 3. Database dimension and ranges of structural properties considered for each lipid class.
[0033] Table 4. Relative abundances, by rank, for adduct ions of lipid and oxylipin species in the database for Example 1.
[0034] Table 5. Pigment abbreviations used in the software system's database.
[0035] Table 6. Retention time window criteria for various compounds and compound classes.
[0036] Table 7. xcms, CAMERA and inventive software system settings used in analysis of the P. tricornutum dataset.
[0037] Table 8. Evaluation of method performance using IPL standards and alternative software systems for feature detection and chromatographic alignment.
[0038] Table 9. Annotation of isomers and isobars in screened P. tricornutum dataset.
[0039] Table 10. List of groups of P. tricornutum lipidome components determined by similarity profile analysis on 0 μM and 150 μM H.sub.2O.sub.2 treatments at 24 hours.
[0040] Table 11. Molecular Characteristics of IPL, ox-IPL, and TAG observed in P. tricornutum after 24 hours.
[0041] Table 12. Examples of isomers and isobar annotation for the P. tricornutum dataset.
[0042] Table 13. Example of confidence codes available for the system to annotate compound assignments.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
[0043] The described features, advantages, and characteristics may be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize that the system may be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages may be recognized in certain embodiments that may not be present in all embodiments.
[0044] This invention may be accomplished by a system and/or method of screening a plurality of molecules in datasets obtained from mass spectroscopy, including selecting and receiving at least one dataset of mass spectral data, optionally pre-processing the dataset including aggregating the dataset into subsets based on at least one feature, and optionally applying a first screening criteria to identify and remove secondary isotopes in the subsets. The method further includes selecting customizable m/z mass tolerance peaks to assign initial compound assignments from at least one adduct ion hierarchy database for at least one compound having a parent molecule, and optionally applying a second screening criteria of retention time to at least one subset based on a relationship of retention time of the dataset features as compared with a predefined retention time window of the compound assignment's parent molecule class. The method also includes applying adduct ion hierarchy screening to at least a portion of the database, such as to at least the subset, wherein selected dataset features are tested to determine if they represent the most abundant expected adduct of the parent molecule class and if the expected adduct assignment hierarchy is present in the dataset.
Design and Scope of Adduct Ion Databases.
[0045] The inventive software system draws compound assignments from customizable databases that contain structural and adduct ion abundance data for any molecule that generates a series of reproducible ionization reactions, including nonpolar lipids, IPL, ox-IPL, and oxylipins (Table 3; Table 5). Each database entry represents a different adduct ion of a potential analyte; because analytes present differently in positive and negative ionization modes, separate onboard databases have been generated for compound identification in each mode, and custom databases can be supplied by the user for each mode. In one embodiment, the software system includes at least one default database that contains entries for 14,068 unique compounds, some of them particular to marine algae (Table 3; Table 5). Alternatively, some embodiments allow users to generate their own databases. The use of onboard databases is one distinguishing feature of the present software system over other conventional software packages that rely exclusively on external databases.
Adduct Ion Database Generation.
[0046] Databases are created in the software system by pairing empirical data with an in silico simulation. The onboard databases present in some embodiments were generated by first calculating exact masses for various triacylglycerols (TAG), free fatty acids (FFA), polyunsaturated aldehydes (PUA), and molecules belonging to eight different classes of intact polar diacylglycerol (IP-DAG). Within each of these classes, the masses of a wide range of possible structures having fatty acid (FA) moieties of different acyl chain length, unsaturation, and oxidation were calculated (Table 3). The exact masses for several photosynthetic pigments common to the marine environment were also included (Table 3; Table 5). Molecules and adduct ions are further identified by the “sum composition” of their constitutive double bonds and acyl carbon atoms in each compound (e.g., PC 34:1, rather than PC 16:0-18:1).
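The exact-mass calculation from a sum composition can be illustrated with a minimal sketch for the TAG class. It assumes the standard TAG composition (three glycerol carbons, six oxygens, two hydrogens lost per double bond) and an ammonium adduct, [M+NH4]+; the function names are hypothetical and are not taken from the software system. Under these assumptions the computed m/z for TAG 76:6 +4O lands within about 1 ppm of the value 1269.0952 reported for that compound in paragraph [0081].

```python
# Monoisotopic masses (u) of the most abundant isotopes, and the electron mass.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048, "O": 15.99491461956}
ELECTRON = 0.00054857991


def tag_formula(acyl_carbons, double_bonds, extra_oxygens=0):
    """Elemental composition of a triacylglycerol from its sum composition."""
    return {
        "C": acyl_carbons + 3,                        # three glycerol carbons
        "H": 2 * acyl_carbons + 2 - 2 * double_bonds,  # two H lost per double bond
        "O": 6 + extra_oxygens,                        # six ester oxygens plus any oxidation
    }


def exact_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


def ammonium_adduct_mz(neutral_mass):
    """m/z of a singly charged [M+NH4]+ ion (assumed adduct type)."""
    return neutral_mass + MONO["N"] + 4 * MONO["H"] - ELECTRON


mz = ammonium_adduct_mz(exact_mass(tag_formula(76, 6, extra_oxygens=4)))
print(f"{mz:.3f}")  # 1269.094
```

Other lipid classes would need their own headgroup compositions, which this sketch does not attempt to cover.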
Determination of Relative Abundances of Adduct Ions for Inclusion in Databases.
[0047] Different chemical structures (e.g. having a different shape and/or polarity) cause different molecules to interact in distinct and reproducible ways with the LC/MS eluent chemicals, including different chemicals that may be added to the LC/MS eluents. The resulting adducts are ranked based on the overall proportion of each type of adduct ion present for the specified conditions. During database generation, the system uses empirical data for the LC/MS adduct ion(s) typically formed by each compound's parent, either from default databases such as shown in Table 4, or from user-supplied databases. Multiple entries of adduct ions for each parent compound, also referred to as a parent molecule, are entered into the database, each entry representing one commonly-formed adduct ion. The ranking of adduct ions forms the basis for the hierarchy-based screening of compound assignments by systems according to the present invention. In a preferred embodiment, authentic standards for representative compounds were used to confirm any onboard databases.
[0048] In many embodiments, a series of tables can be used with the software system to define additional analytes and adducts beyond those which are included in the onboard databases. For each new molecule or molecule class, the software system requires (1) the elemental composition of the new molecule or parent molecule of the new molecule class, (2) a tabulation of expected adducts (defining, as necessary, any new adducts), (3) empirical adduct hierarchy data for any new adducts, and (4) if applicable, the ranges of acyl carbon atoms, double bonds, and oxidization states for which entries are to be generated.
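One way to picture how items (2) through (4) combine into database entries is the sketch below. It is only an illustration: the `ClassSpec` structure, the field names, and the adduct ranking shown for the example class are hypothetical placeholders, not the format or empirical values used by the software system, and the mass calculation from item (1) is omitted for brevity.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ClassSpec:
    """Hypothetical description of a new molecule class."""
    name: str
    adducts_ranked: tuple   # expected adducts, most abundant first
    acyl_carbons: range
    double_bonds: range
    extra_oxygens: range


def enumerate_entries(spec):
    """Yield one database row per (sum composition, adduct) pair."""
    for c, db, ox in product(spec.acyl_carbons, spec.double_bonds, spec.extra_oxygens):
        label = f"{spec.name} {c}:{db}" + (f" +{ox}O" if ox else "")
        for rank, adduct in enumerate(spec.adducts_ranked, start=1):
            yield {"parent": label, "adduct": adduct, "rank": rank}


# Placeholder ranking for illustration only; real rankings come from empirical data.
spec = ClassSpec(
    name="PC",
    adducts_ranked=("[M+H]+", "[M+Na]+"),
    acyl_carbons=range(34, 37, 2),   # 34 and 36
    double_bonds=range(0, 2),        # 0 and 1
    extra_oxygens=range(0, 1),       # unoxidized only
)
entries = list(enumerate_entries(spec))
print(len(entries))  # 8
```

Each parent compound therefore contributes as many rows as it has expected adducts, which matches the statement that each database entry represents one adduct ion.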
Lipidomics Workflow Based on xcms, CAMERA, and the Software System.
[0049] Data is supplied to the inventive software system, step 102,
Database Assignments and Progressive Screening Using Orthogonal Criteria.
[0050] In many embodiments, after pre-processing,
[0051] First, initial compound assignments are applied to features from the database using a narrow, customizable m/z mass tolerance specified by the user, step 120,
[0052] Next, the software system screens the feature's retention time against a retention time “window” defined for the accompanying assignment's parent lipid class, steps 132 and 134. Many preferred embodiments include a set of default retention time window data (Table 6) for the chromatographic conditions described herein. Additional, optional filters can be applied in some embodiments to exclude assignments, steps 134 and 136, of specific molecules, for example: IPL, ox-IPL, FFA, and PUA that may contain one or more specific properties (e.g. an odd total number of acyl carbon atoms). Some embodiments would apply filters specific to data derived exclusively from eukaryotic origin, because non-acetogenic fatty acid synthesis is confined almost exclusively to bacteria and archaea, allowing for improved and faster analysis of the mass spectrometry data.
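The progressive screening described in paragraphs [0051] and [0052] can be sketched as three successive filters: an m/z match within a ppm tolerance, a retention time window check for the assignment's parent lipid class, and an optional even-carbon filter. The data layout, the function names, and all numeric values in the example are hypothetical and chosen only to exercise each filter; they are not taken from Table 6 or from the databases.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def screen(feature, entries, rt_windows, ppm_tol=2.5, even_only=True):
    """Return database entries that survive the m/z, retention time, and acyl carbon filters."""
    kept = []
    for entry in entries:
        if abs(ppm_error(feature["mz"], entry["mz"])) > ppm_tol:
            continue                               # outside the mass tolerance
        rt_min, rt_max = rt_windows[entry["lipid_class"]]
        if not rt_min <= feature["rt"] <= rt_max:
            continue                               # outside the class retention time window
        if even_only and entry["acyl_carbons"] % 2 != 0:
            continue                               # odd total acyl carbon number
        kept.append(entry)
    return kept


# Illustrative values only.
feature = {"mz": 760.5851, "rt": 12.0}
rt_windows = {"PC": (10.0, 14.0), "PE": (1.0, 5.0)}
entries = [
    {"name": "A", "mz": 760.5856, "lipid_class": "PC", "acyl_carbons": 34},
    {"name": "B", "mz": 760.5900, "lipid_class": "PC", "acyl_carbons": 34},
    {"name": "C", "mz": 760.5850, "lipid_class": "PE", "acyl_carbons": 34},
    {"name": "D", "mz": 760.5852, "lipid_class": "PC", "acyl_carbons": 33},
]
print([e["name"] for e in screen(feature, entries, rt_windows)])  # ['A']
```

Entry B fails the mass tolerance, C fails the retention time window, and D is removed only when the even-carbon option is enabled.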
[0053] After applying the above initial optional criteria and deciding which assignments to retain, step 138,
[0054] In one construction, the system determines whether the user elected even/odd carbon number screening, step 140,
[0055] Continuing the explanation of one construction of the present invention, after retaining the specific assignment, the system then determines if the assignment represents the most abundant adduct, step 148. More particularly, is the adduct represented by this assignment the most abundant expected adduct of the parent compound? If not, it is determined whether the most abundant expected adduct of the parent compound was also identified in its pseudo-spectrum, step 150. If not, the assignment is discarded, step 152, and the terms “C4” or “C5” may be designated as value 151 in certain constructions. The term “value” is also referred to herein as “code” or “signal”. An example of codes utilized in one construction according to the present invention is provided in Table 13. Returning to
[0056] The code or value C2a indicates that the adduct ion hierarchy for the parent compound is completely satisfied, that is, the pseudo-spectrum contains peak-groups representing every adduct ion of the compound of greater theoretical abundance than the least abundant adduct ion present. The code or value C2b indicates that the adduct ion of greatest theoretical abundance and some lesser adduct ion is present, but adduct ions of intermediate abundance are not observed. These “annotation code” values, when designated, assist the system and/or the user in evaluating assignment confidence during subsequent data analysis. In other words, in some embodiments the software system annotates the resulting assignment data using simple codes that indicate the degree to which the assignment complies with the hierarchy rules. Assignments that fail the adduct ion hierarchy screening criteria are excised from the dataset and all remaining assignments in the dataset are then pooled.
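The distinction between C2a and C2b can be expressed compactly if each adduct found in the pseudo-spectrum is represented by its rank in the hierarchy (1 being the most abundant expected adduct). The following sketch makes that assumption; the function name is hypothetical, and the label returned for a lone dominant adduct is a placeholder, since the code for that case is not defined in this passage.

```python
def hierarchy_code(observed_ranks):
    """Classify a parent compound from the set of adduct ranks observed for it.

    Rank 1 is the adduct of greatest theoretical abundance.
    """
    if 1 not in observed_ranks:
        return None                  # dominant adduct absent: assignment is discarded
    if observed_ranks == {1}:
        return "DOMINANT_ONLY"       # placeholder label, not a code from the source
    least_abundant = max(observed_ranks)
    if all(rank in observed_ranks for rank in range(1, least_abundant + 1)):
        return "C2a"                 # every more-abundant adduct is present
    return "C2b"                     # dominant and a lesser adduct present, with a gap between


print(hierarchy_code({1, 2, 3}))  # C2a
print(hierarchy_code({1, 3}))     # C2b
print(hierarchy_code({2, 3}))     # None
```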
[0057] If the system determines in step 148,
[0058] In many preferred embodiments, additional rules-based screening is performed on the pooled data to identify and annotate possible isomers and isobars, such as isomer and isobar detection and annotation represented by flowchart 100d,
[0059] In step 176,
[0060] Codes can be applied to identify positional or regio-isomers, functional structural isomers, or isobars. The software system can apply one or more codes to a given assignment as long as the criterion for each is satisfied. Upon completion of screening, some embodiments of the software system produce an annotated dataset, while other embodiments of the software system produce computer code containing the annotated dataset for additional computer software. Some embodiments of the software system will then perform statistical analysis on the final matrix of compound assignments. Some embodiments will export the final results to a common file format for external analysis.
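A rough sketch of how more than one annotation can attach to the same assignment is given below. It assumes, for illustration only, that two assignments are treated as functional structural isomers when their dominant adducts share an elemental formula, and as isobars when the formulas differ but the m/z values lie within the match tolerance. The flag names, the record layout, and the placeholder formulas are hypothetical, and regioisomer detection is not covered.

```python
from itertools import combinations


def annotate_ambiguity(assignments, ppm_tol=2.5):
    """Return {assignment name: set of flags} from pairwise comparison of dominant adducts."""
    flags = {a["name"]: set() for a in assignments}
    for a, b in combinations(assignments, 2):
        if a["formula"] == b["formula"]:
            kind = "isomer"                                   # same composition, different compound
        elif abs(a["mz"] - b["mz"]) / a["mz"] * 1e6 <= ppm_tol:
            kind = "isobar"                                   # different composition, unresolved m/z
        else:
            continue
        flags[a["name"]].add(kind)
        flags[b["name"]].add(kind)
    return flags


# Placeholder formulas and m/z values, for illustration only.
assignments = [
    {"name": "X", "formula": "F1", "mz": 800.0000},
    {"name": "Y", "formula": "F1", "mz": 800.0000},
    {"name": "Z", "formula": "F2", "mz": 800.0010},
    {"name": "W", "formula": "F3", "mz": 805.0000},
]
result = annotate_ambiguity(assignments)
print(sorted(result["X"]))  # ['isobar', 'isomer']
```

Here X and Y carry both flags, which corresponds to the doubly ambiguous case, while W carries none.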
Example 1
Model Dataset Used to Demonstrate the Software System.
[0061] Oxidative stress is an imbalance between reactive oxygen species and an organism's ability to detoxify the reactive molecules and repair any damage caused by the reactive molecules. Oxidative stress is believed to play important roles in the pathogenesis of many human diseases, including cancers, autism, infections and Parkinson's disease, to name a few. Understanding and measuring an organism's level of oxidative stress is an important step in identifying and treating human disease before it can detrimentally impact the individual.
[0062] One use of the present system is to examine the effect of oxidative stress on a model algal lipidome, providing for a better understanding of the mechanisms and effects of oxidative stress. The present software system takes mass spectrometry data collected from cultures of a mutant strain of the marine diatom Phaeodactylum tricornutum, which was designed for studies of oxidative stress. In this specific example, a strain of P. tricornutum (CCMP2561; Provasoli-Guillard National Center for Marine Algae and Microbiota) was genetically modified to express a reduction-oxidation sensitive green fluorescent protein (roGFP) at different locations within the cell. Cultures of the transformants were treated with three concentrations of H.sub.2O.sub.2 (0, 30, and 150 μmol L.sup.−1) to evaluate the effects of peroxidation. The software utilized in this Example can be found in one or both of the following code repositories, which are incorporated herein by reference: GitHub at https://github.com/vanmooylipidomics/LOBSTAHS and Bioconductor at http://bioconductor.org/packages/release/bioc/html/LOBSTAHS.html.
Sample Collection and Extraction.
[0063] In this example, duplicate samples for lipid analysis were collected from each treatment at 4 hour, 8 hour, and 24 hour timepoints. Two procedural blanks were also collected. Sample material was collected by vacuum onto 0.7 μm pore size glass fiber filters (GF/F), which were snap frozen in liquid nitrogen and then stored at −80° C. until thawed for extraction. Extraction was performed using a modified Bligh and Dyer method described in Popendorf et al.; an internal standard (dinitrophenyl-phosphatidylethanolamine, DNP-PE) and a synthetic antioxidant (butylated hydroxytoluene, BHT) were added at time of extraction. Lipid extracts were transferred to 2 mL HPLC vials, topped with argon, and stored at −80° C. prior to analysis. All chemicals used in sample extraction and chromatography were LC/MS grade or higher. Where used, water was obtained from a Milli-Q system without further treatment (EMD Millipore, Billerica, Mass., USA).
HPLC-ESI-MS Analysis.
[0064] Samples from the P. tricornutum dataset were analyzed by HPLC-ESI-MS using a modification of the method described in Hummel et al. Lipid extracts were evaporated to near dryness and reconstituted in a similar volume of 7:3 acetonitrile:isopropanol. Headspace was filled with argon to minimize further oxidation. For HPLC analysis, an Agilent 1200 system (Agilent, Santa Clara, Calif., USA) comprising temperature-controlled autosampler (4° C.), binary pump, and diode array detector, was coupled to a Thermo Exactive Plus Orbitrap mass spectrometer (ThermoFisher Scientific, Waltham, Mass., USA). Chromatographic conditions, electrospray ionization source settings, MS acquisition settings, and procedures used for calibration of the mass spectrometer are described in the Supporting Information. Using authentic standards and two independent methods for MS feature detection, we determined the average relative mass uncertainty of the Exactive was <0.2 ppm (Table 1; Table 8).
Analysis of P. tricornutum Data Using the Software System.
[0065] The software system was then used to identify and annotate lipidome components in the positive ionization mode data. In this example, the embodiment used the R package IPO to optimize settings for several xcms functions, and a 2.5 ppm mass uncertainty tolerance was used to obtain database matches in the software system. In other embodiments, the IPO functionality will be included in the software system. Using the annotated output obtained from the software system, the relative abundances of lipidome constituents present in the 0 and 150 μM H.sub.2O.sub.2 treatments at 24 h were calculated. Statistical techniques were used to identify biomarkers of oxidative stress. Unless otherwise noted, the analysis was restricted to only “high confidence” assignments; these were assignments without structural isomers or isobars given codes of C1 or C2a according to the logic in
Screening and Annotation of P. tricornutum Data in the Software System.
[0066] In this example the software system identified 21,869, or 6.4%, of the 340,991 mass spectral features initially detected in the dataset. Sequential application of the various screening criteria allowed for the exclusion of features from the dataset based on specific characteristics (Table 2). Of these initial features, 177,053, or 52%, were immediately eliminated as likely secondary isotope peaks identified by the software system. The 163,938 remaining features were then matched at 2.5 ppm against entries in the default positive mode database. The software system was then used to perform screening based on feature retention time and assignment total acyl carbon number. The software system excluded 7,792 features because the retention time fell outside the range expected for the assignment's parent lipid class. An additional 7,733 features were eliminated because the compound assignment did not contain an even total number of acyl carbon atoms; this optional restriction was applied given the known eukaryotic origin of the data. Adduct ion hierarchy screening was then applied to the remaining 52,337 features. Application of this final orthogonal filter yielded a dataset containing 2,056 compound assignments; these assignments represented 1,969 unique parent compounds (Table 2).
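The feature counts reported in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated above; the one derived quantity, the number of features that must have retained a database match before retention time screening, is an inference that assumes the two exclusions were applied sequentially to disjoint sets of features.

```python
detected = 340_991          # features initially detected
isotope_peaks = 177_053     # removed as likely secondary isotope peaks
rt_excluded = 7_792         # removed by retention time screening
odd_excluded = 7_733        # removed by the even acyl carbon restriction
into_hierarchy = 52_337     # features passed to adduct ion hierarchy screening
identified = 21_869         # features identified in the final dataset

after_isotopes = detected - isotope_peaks
# Inferred, not stated in the text: features still matched before the two exclusions.
matched = into_hierarchy + rt_excluded + odd_excluded

print(after_isotopes)                            # 163938
print(matched)                                   # 67862
print(round(100 * isotope_peaks / detected))     # 52
print(round(100 * identified / detected, 1))     # 6.4
```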
[0067] The identities of 1,163, or 57%, of these final database assignments were unique within the scope of the database, meaning the underlying features were matched in the final dataset to only one possible parent compound. 1,149 of these assignments were either IPL, ox-IPL, or TAG as shown in
Identification and Annotation of Isomers and Isobars.
[0069] The remaining 893 assignments (43.4%) were characterized by some degree of ambiguity, meaning the dataset contained at least one isobar or functional structural isomer of the underlying features (Table 9; symbols with lightest tones in
The five shading symbols listed in the upper left of each of the “A” series of
(1) “High & moderate confidence IDs.sup.a”, the darkest tones indicate high and moderate confidence IDs (identifications) for which no structural isomers or isobars were detected; these are compounds annotated with codes “C1,” “C2a,” or “C2b” in the LOBSTAHS workflow illustrated in
(2) “Functional structural isomer(s) present.sup.b”, ≧1 structural isomer of an adduct of this compound is present in dataset (
(3) “Isobars present.sup.c”, adduct ion of ≧1 other compound is an isobar of the dominant adduct of this compound; i.e., the m/z values of the adducts differ by less than the 2 ppm match tolerance used in initial assignments (
(4) “Doubly ambiguous ID.sup.d”, ≧1 structural isomer and 1 competing assignment of second type both present; and
(5) “≧regioisomers identified in dataset.sup.e”, compounds of which multiple regio-isomers were identified in single sample, indicating possible oxidation of the same parent molecule at different structural positions.
Additionally, the double, angled arrows in the lower right above the phrases “+double bond” and “+acyl.sup.f carbon” in each of
[0070] In 752 instances, the dominant adduct of the parent compound was a (functional) structural isomer of the dominant adduct of a different compound assigned from the database (Table 12, first example). In 195 cases, the dominant adduct ion of the parent compound was an isobar of the primary adduct ion of a different compound (Table 12, second example). Although the 893 ambiguous assignments represented 43.4% of all assignments in the screened dataset, they belonged to just 25% of retained features (27% of peak groups; Table 9). The difference was due to the presence of a small number of features (793) whose 54 assignments were doubly ambiguous, i.e., having both isobars and functional structural isomers (symbols with two-tone shading in
Annotation of Potential Regioisomers.
[0071] The software system also identified regioisomers for 352 unique parent compounds in the P. tricornutum lipidome (Table 9; symbols with black dots in
Evaluation of Screening and Identification Performance Using Two Methods.
[0072] As a means of validating the accuracy and reliability of the software system's approach, the software system was made to identify and annotate all species present in 5 quality control (QC) samples of known composition that were interspersed randomly with samples from the P. tricornutum dataset prior to analysis on the mass spectrometer (Table 1; Table 8). Table 12 provides examples of isomer and isobar annotation from the P. tricornutum dataset. The samples contained a mixture of authentic IPL standards that has been used extensively in other work. Because the choice of pre-processing software can have a significant impact on feature detection, the software was used in parallel with an alternative software program, MAVEN. In both cases, the inventive software system correctly identified all components of the standard mixture without ambiguity (Table 1; Table 8). As a second means of validation, the software system was run on two independent inventories of the P. tricornutum lipidome and assignments made by the software system were compared. The software system found and identified with high confidence 13 of the 16 most abundant IPL and TAG species in one inventory, and nearly all in the other. These additional tests further demonstrate the utility of the inventive software system.
Resilience of Core P. tricornutum Lipidome under Oxidative Stress.
[0073] Evidence of the effect of oxidative stress on the lipidome of P. tricornutum was observed through comparison of compounds identified in 0 and 150 μM H.sub.2O.sub.2 treatments at 24 h (
[0074] As noted above,
Differences in Degree of Remodeling Between Lipid Classes and Functional Groupings.
[0075] The software system's similarity profile analysis of the scaled data was used to place the annotated features into 181 groups of components which clustered significantly according to their behavior (
Fatty Acid Chain Elongation is an Apparent Response to Oxidative Stress in the Chloroplast.
[0076] Oxidative stress appeared to induce elongation of fatty acids throughout the P. tricornutum lipidome (Table 11). Lipid moieties upregulated by oxidative stress had longer fatty acid chains than those that were downregulated. The greatest breadth of structural change was in monogalactosyldiacylglycerol (MGDG), a lipid typically localized to the chloroplast (Table 11;
Significant Enrichment Observed in TAG.
[0077] Whereas the impact of oxidative stress within most lipid classes was confined to relatively modest changes in structural properties, treatment with 150 μM H.sub.2O.sub.2 induced a very significant enrichment in the fraction of peak area the software system identified as triacylglycerols (TAG;
Example 2
[0078] Conversion of .raw Data Files to .mzXML Format.
[0079] After acquiring data from the mass spectrometer, the software system converts all Thermo .raw files in a given dataset to the open-source .mzXML format, which is used by many chromatographic alignment and peak picking applications. It then converts the profile-mode mass spectral data in each file to a series of centroids. Finally, the software system automates the extraction of the positive and negative ion mode full scan events from each sample into separate files. In the Exactive instrument configuration described above, the full scan events from the two ion modes appeared in each data file as the first and third scan events at each time point, respectively. (The second and fourth scan events at each time point were the positive and negative mode AIF scans.) The extraction and separation of scans from the two ion modes was necessary to accomplish subsequent analysis using the pipeline.
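The separation of scan events can be illustrated with a minimal sketch that assigns scans to the four event types purely by their position in the acquisition cycle. It assumes a strict, uninterrupted four-event cycle in the order described above and treats scans as opaque objects; the function and label names are hypothetical, and a practical implementation would more likely read polarity and scan type from the file metadata.

```python
# Order of scan events within each acquisition cycle, as described in the text.
EVENT_NAMES = ("pos_full", "pos_aif", "neg_full", "neg_aif")


def split_scan_events(scans):
    """Group an acquisition-ordered list of scans by position in the 4-event cycle."""
    groups = {name: [] for name in EVENT_NAMES}
    for index, scan in enumerate(scans):
        groups[EVENT_NAMES[index % 4]].append(scan)
    return groups


groups = split_scan_events(list(range(8)))   # stand-in scan objects 0..7
print(groups["pos_full"])  # [0, 4]
print(groups["neg_full"])  # [2, 6]
```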
Sample Injection, Chromatography and ESI Source Settings.
[0080] 20 μL injections of sample were made onto a C8 Xbridge HPLC column (particle size 5 μm, length 150 mm, width 2.1 mm; Waters Corp., Milford, Mass., USA). Eluent A consisted of water with 1% 1M ammonium acetate and 0.1% acetic acid. Eluent B consisted of 70% acetonitrile, 30% isopropanol with 1% 1M ammonium acetate and 0.1% acetic acid. Gradient elution was performed with the following program (total run time 30 min) at a constant flow rate of 0.4 mL min.sup.−1: 45% A for 1 min to 35% A at 4 min, then from 25% A to 11% A at 12 min, then to 1% A at 15 min with an isocratic hold until 25 min, and finally back to 45% A for 5 min column equilibration. ESI source settings were: Spray voltage, 4.5 kV (+), 3.0 kV (−); capillary temperature, 150° C.; sheath gas and auxiliary gas, both 21 (arbitrary units); heated ESI probe temperature, 350° C.
Mass Spectrometer Acquisition Settings.
[0081] Mass data were collected on a ThermoFisher Exactive Plus Orbitrap instrument in full scan (FS) and all-ion-fragmentation modes (AIF) while alternating between positive and negative ion modes. A scan range of 150-1500 m/z was used for all modes in sequence (FT MS positive full scan, FT MS positive AIF, FT MS negative full scan, and FT MS negative AIF, respectively). The S-lens RF level was set to 85.00. Mass resolution was set to the maximum possible value of 140,000 (FWHM at m/z 200) for both FS and AIF. This mass resolution setting corresponded to an observed resolution of 75,100 at the m/z (875.5505) of the internal standard, DNP-PE. The observed resolution at m/z 1269.0952, that of the compound in the screened dataset with the highest molecular weight (TAG 76:6 +4O), was 41,100. Using these settings, between 8 and 14 MS scans across a typical peak were obtained.
Procedures Used for Weekly and Real-Time Calibration of the Exactive.
[0082] The mass spectrometer was calibrated weekly in both positive and negative ion modes by infusing calibration mixes available from ThermoFisher Scientific. Low-level eluent contaminants were also utilized as lock masses, providing real-time recalibration; C16:0 (255.23295) and C18:0 (283.26425) fatty acids were used in negative ion mode, while a polysiloxane (536.16537) and phthalate (391.28429) were used in positive ion mode. At least one of the lock masses was found during each positive and negative full scan event.
Script for Pre-Processing Data in xcms and CAMERA.
[0083] In this embodiment, the software system accepts data that have been preprocessed with the xcms and CAMERA packages in the R environment by means of the script “prepOrbidata.R.” The user can further modify the script as necessary. The R package IPO was used to optimize settings for xcms and CAMERA, obtaining the parameter values given in Table 7. These parameter values were used to obtain the results presented in the text.
Determination of Retention Time Window Data.
[0084] The retention time (RT) window data in Table 6 were obtained primarily from authentic standards for representative compounds of each parent lipid class under the chromatographic conditions described above. Observations of various lipids in environmental samples allowed the consideration of additional species. While the software system applies the retention time data contained in Table 6 as a default, detailed instructions and an example data table are included in the onboard documentation for use with retention time data for other chromatographic methods. As with the adduct ion hierarchy data, retention time data for ox-IPL are inherited from the unoxidized parent molecule. By default, the software system expands the retention time window for each lipid class by 20% of its given width to account for (1) shifts in retention time that may occur during chromatographic alignment with xcms and (2) slight variations in retention time that distinguish the different positional (i.e., regio-) isomers of the same parent lipid. This window can be narrowed or expanded with user input.
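The default window expansion can be illustrated with a minimal Python sketch. The text states only that each window is expanded by 20% of its given width; the sketch assumes that the added width is split evenly between the two ends of the window (10% of the width on each side), which is an assumption and not a statement of how the software system distributes it. The function name `expand_rt_window` and its parameters are hypothetical.

```python
from typing import Tuple


def expand_rt_window(rt_min: float, rt_max: float,
                     expansion: float = 0.20) -> Tuple[float, float]:
    """Widen a retention time window by a fraction of its own width.

    ASSUMPTION: the extra width is applied symmetrically, half at each end.
    The lower bound is clamped at zero because retention time cannot be negative.
    """
    if rt_max < rt_min:
        raise ValueError("rt_max must not be smaller than rt_min")
    pad = (rt_max - rt_min) * expansion / 2.0
    return max(0.0, rt_min - pad), rt_max + pad


# Example: a 2-minute window from 10 to 12 min gains 0.4 min in total width.
expanded_min, expanded_max = expand_rt_window(10.0, 12.0)
```

Passing a different `expansion` value corresponds to the user-adjustable narrowing or widening mentioned above; a value of 0 leaves the window unchanged.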
Analysis of Positive Ionization Mode P. tricornutum Data Using xcms, CAMERA, and One Embodiment of the Software System.
[0085] To examine the effect of oxidative stress on the P. tricornutum lipidome, the software system workflow in
Choice of Matching Tolerance.
[0086] To account for variability in performance expected from natural samples, a 2.5 ppm mass uncertainty tolerance was used when matching against the databases. This tolerance was one order of magnitude more conservative than the 0.22 ppm mass uncertainty observed with authentic standards (Table 1 and Table 8), yet considerably more restrictive than the various default standards used for matching in other recently introduced metabolomics applications. When combined with HPLC separation and the high mass resolution of the Exactive, the 2.5 ppm tolerance still allowed for the assignment of distinct identities to isobaric masses.
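The matching criterion can be illustrated with a minimal Python sketch that expresses the difference between an observed and a theoretical m/z in parts per million and compares it with the 2.5 ppm tolerance. The function names are hypothetical, and the sketch is not the matching routine of the software system; the reference m/z of 875.5505 is the value given above for the internal standard, DNP-PE, and the observed values are invented for illustration.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed m/z relative to a theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 2.5) -> bool:
    """True if the absolute mass error does not exceed the ppm tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


# Illustrative observed values only; these are not measured data.
DNP_PE_MZ = 875.5505
accepted = within_tolerance(875.5520, DNP_PE_MZ)  # error of roughly 1.7 ppm
rejected = within_tolerance(875.5530, DNP_PE_MZ)  # error of roughly 2.9 ppm
```

At m/z 875.5505, a tolerance of 2.5 ppm corresponds to an absolute window of about ±0.0022 m/z units, so the window widens in absolute terms as m/z increases.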
Statistical Analysis and Visualization of the P. tricornutum Lipidome.
[0087] The software system workflow is designed to facilitate examination of relative changes in the abundances of lipids in a given dataset, not to enable absolute quantification of specific analytes or direct comparisons between datasets. With this in mind, the annotated output from the software system was used to calculate the relative abundances of P. tricornutum lipidome constituents present in the 0 and 150 μM H₂O₂ treatments at 24 h. The analysis was performed as follows:
[0088] 1. The processed dataset was extracted using the “PtH2O2_mz-rt_plots.R” script and a subset of “high confidence” assignments to be used in all subsequent analyses (i.e., assignments annotated with codes C1 or C2a and having no identified structural isomers or isobars;
In this example, the heatmaps and dendrogram in
[0091] Mass spectrometry workflow is enhanced according to the present invention by high-throughput annotation and putative identification of ionizable molecules in high-mass-accuracy HPLC-MS data. Orthogonal, rules-based screening criteria are utilized based on adduct ion formation hierarchy patterns and other properties to accurately identify compounds. A confidence value may be generated for each assignment, and a user interface such as a display screen or a printer may present screening results such as tables, graphs, diagrams or lists to a user.
[0092] Reference throughout this specification to “one embodiment,” “an embodiment,” “one construction,” or similar language means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, appearances of the phrase “in one embodiment,” “in one construction,” and similar language throughout this specification may, but do not necessarily, all refer to the same embodiment.
[0093] Although specific features of the present invention are shown in some drawings and not in others, this is for convenience only, as each feature may be combined with any or all of the other features in accordance with the invention. While there have been shown, described, and pointed out fundamental novel features of the invention as applied to a preferred embodiment thereof, it will be understood that various omissions, substitutions, and changes in the form and details of the devices illustrated, and in their operation, may be made by those skilled in the art without departing from the spirit and scope of the invention. For example, it is expressly intended that all combinations of those elements and/or steps that perform substantially the same function, in substantially the same way, to achieve the same results be within the scope of the invention. Substitutions of elements from one described embodiment to another are also fully intended and contemplated. It is also to be understood that the drawings are not necessarily drawn to scale, but that they are merely conceptual in nature.
[0094] It is the intention, therefore, to be limited only as indicated by the scope of the claims appended hereto. Other embodiments will occur to those skilled in the art and are within the following claims.