MASS SPECTROMETRIC IDENTIFICATION OF MICROORGANISMS IN COMPLEX SAMPLES

20220412988 · 2022-12-29

    Abstract

    Microorganisms are identified as present in a complex sample or mixed culture by acquiring a mass spectrum of the sample and comparing it to combination spectra, each of which is formed by combining at least two reference mass spectra of known microorganisms. Microorganisms corresponding to the reference spectra used to form a combination spectrum are identified as present in the sample if that combination spectrum exhibits a better match with the sample mass spectrum than any one of the reference mass spectra used to form that combination spectrum. It is also possible to identify microorganisms by forming a difference spectrum by subtracting a reference mass spectrum from the sample mass spectrum and comparing the difference spectrum to the reference mass spectra.

    Claims

    1. A method for identifying microorganisms present in a sample and differentiating samples that comprise a single microorganism from those that comprise more than one microorganism, comprising: (a) acquiring a mass spectrum of the sample; (b) comparing the sample mass spectrum to each of a plurality of reference mass spectra, wherein each reference mass spectrum is a mass spectrum of a known microorganism; (c) selecting as a best set of reference mass spectra, those reference mass spectra that are found to most closely match the sample spectrum in the comparisons in step (b); (d) ascertaining whether the best set contains reference mass spectra of different species of microorganism and, if that is the case, combining reference mass spectra of microorganisms of different species in the best set of reference mass spectra to form combination spectra; (e) comparing the sample mass spectrum to the combination spectra; and (f) labeling the sample as comprising more than one microorganism if the combination spectra comparisons in step (e) result in the finding that microorganisms whose reference mass spectra have been combined to form a combination spectrum are present in the sample.

    2. The method of claim 1, wherein selected microorganisms are identified as present in the sample when a match between the sample mass spectrum and a combination spectrum formed from reference mass spectra of the selected microorganisms is better than any match between the sample spectrum and any single reference mass spectrum of the reference mass spectra combined to form the combination spectrum.

    3. The method of claim 1, wherein step (d) comprises: (d1) combining reference mass spectra with a weighting factor for each reference mass spectrum to form a combination spectrum; (d2) determining a similarity indicator between the combination spectrum formed in step (d1) and the sample mass spectrum; and (d3) repeating steps (d1) and (d2) while modifying the weighting factors until the similarity indicator is maximized.

    4. The method of claim 1, wherein selected microorganisms are identified as present in the sample when a match between the sample mass spectrum and a combination spectrum formed from reference mass spectra of the selected microorganisms is better than any match between the sample spectrum and any single reference mass spectrum in the best set of reference spectra.

    5. The method of claim 1, wherein selected microorganisms are identified as present in the sample when a similarity indicator for a match between the sample mass spectrum and a combination spectrum formed from reference mass spectra of the selected microorganisms is greater than a predetermined minimum value.

    6. The method of claim 1, wherein in step (c), reference mass spectra are selected manually to form the best set of reference mass spectra.

    7. The method of claim 1, wherein step (c) comprises ranking each reference mass spectra by closeness of a match between that reference mass spectra and the sample mass spectra and selecting a predefined number of highest ranking reference mass spectra to form the best set of reference mass spectra.

    8. The method of claim 7, wherein the predefined number is between 3 and 20.

    9. The method of claim 1, wherein step (c) comprises ranking each reference mass spectra by closeness of a match between that reference mass spectra and the sample mass spectra and selecting 0.01 to 0.1 percent of highest ranking reference mass spectra to form the best set of reference mass spectra.

    10. The method of claim 1, wherein step (c) comprises computing for each reference mass spectra a similarity factor that indicates the closeness of a match between that reference mass spectra and the sample mass spectra and selecting reference mass spectra with a similarity indicator greater than a predetermined minimum value as the best set of reference mass spectra.

    11. A method for identifying microorganisms present in a sample and differentiating samples that comprise a single microorganism from those that comprise more than one microorganism, comprising: (a) acquiring a mass spectrum of the sample; (b) obtaining a plurality of reference mass spectra, wherein each reference mass spectrum is a mass spectrum of a known microorganism; (c) combining reference mass spectra of microorganisms of different species to form combination spectra; (d) comparing the sample mass spectrum to the combination spectra; and (e) labeling the sample as comprising more than one microorganism if the combination spectra comparisons in step (d) result in the finding that microorganisms whose reference mass spectra have been combined to form a combination spectrum are present in the sample.

    12. The method of claim 11, wherein step (c) comprises combining reference mass spectra of microorganisms that are commonly found in the same location to form combination spectra.

    13. The method of claim 11, wherein step (c) comprises combining reference mass spectra that exhibit the closest matches to the sample mass spectrum.

    14. The method of claim 11, wherein selected microorganisms are identified as present in the sample when a match between the sample mass spectrum and a combination spectrum formed from reference mass spectra of the selected microorganisms is better than any match between the sample spectrum and any single reference mass spectrum.

    15. A method for identifying microorganisms present in a sample and differentiating samples that comprise a single microorganism from those that comprise more than one microorganism, comprising: (a) acquiring a mass spectrum of the sample; (b) obtaining a plurality of reference mass spectra, wherein each reference mass spectrum is a mass spectrum of a known microorganism; (c) subtracting at least one reference mass spectrum from the sample mass spectrum to form a difference spectrum; (d) comparing the difference mass spectrum to the reference mass spectra; and (e) labeling the sample as comprising more than one microorganism if the comparisons in step (d) result in a close match of the difference mass spectrum with a reference mass spectrum.

    16. The method of claim 15, wherein in step (d), a microorganism is identified as present in the sample when a similarity indicator for the match between a reference mass spectrum of that microorganism and the difference mass spectrum is greater than a predetermined minimum value.

    Description

    BRIEF DESCRIPTION OF THE DRAWINGS

    [0026] FIG. 1 is a schematic representation of a first method according to the invention for identifying bacteria in complex samples with the steps S0, S11 and S12.

    [0027] FIG. 2 is a schematic representation of a second method according to the invention with a hierarchical method sequence and the steps S0, S21, S22, S23 and S24.

    [0028] FIG. 3 shows a normalized, measured MALDI time-of-flight mass spectrum (10) of a sample under investigation and a reconstructed mass spectrum (20) derived from a peak list.

    [0029] FIGS. 4A and 4B show two reference spectra (30) and (40) of a best set (REF.sub.B), both of which match the mass spectrum (20) of the sample (also shown) well enough for the corresponding microorganisms (FIG. 4A: Pseudomonas aeruginosa; FIG. 4B: Proteus mirabilis) to be regarded as identified in the sample. The two microorganisms already differ at the level of their taxonomic order.

    [0030] FIG. 5 shows the mass spectrum of sample (20) and a combination spectrum (50) of two reference spectra of the best set (REF.sub.B), where one of the reference spectra originates from a microorganism of the species “Pseudomonas aeruginosa” and the other reference spectrum from a microorganism of the species “Proteus mirabilis”.

    [0031] FIG. 6 is a schematic representation of a third method according to the invention consisting of the steps S0, S31 and S32 and using difference spectra rather than combination spectra to identify microorganisms in complex samples.

    DETAILED DESCRIPTION

    [0032] While the invention has been shown and described with reference to a number of embodiments thereof, it will be recognized by those skilled in the art that various changes in form and detail may be made herein without departing from the spirit and scope of the invention as defined by the appended claims.

    [0033] A method according to the invention for identifying bacteria in complex samples is schematically presented in FIG. 1. In step S0, a sample under investigation is cultured, and then a MALDI time-of-flight mass spectrum of the sample (MS) is acquired. In step S11, at least two reference spectra (REF) of a database (DB) are combined to form a combination spectrum, or a combination spectrum (CS*) previously stored in the database (DB) is selected. In step S12, the microorganisms of the generated or selected combination spectrum (CS) are only regarded as identified in the sample (ID) if the combination spectrum (CS) fulfils one of the above-stated criteria, e.g. a better match with the mass spectrum of the sample (MS) than any of the reference spectra from which the combination spectrum (CS) was generated. Otherwise, the corresponding microorganisms of the combination spectrum (CS) are regarded as not identified (NO ID).
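
The combining operation of step S11 can be illustrated with a minimal sketch. The text does not specify how two reference spectra are merged, so the rule used here is an assumption: peak lists are represented as (mass, intensity) pairs, and peaks lying within a hypothetical tolerance `tol` of each other are treated as one signal whose intensities are added. The names `Peak`, `combine_peak_lists` and `tol` are illustrative only.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass in daltons, relative intensity)


def combine_peak_lists(a: List[Peak], b: List[Peak], tol: float = 2.0) -> List[Peak]:
    """Merge two peak lists into one combination peak list.

    Assumption: peaks closer than `tol` daltons count as the same signal.
    """
    merged: List[Peak] = []
    for mass, intensity in sorted(a + b):
        if merged and abs(mass - merged[-1][0]) <= tol:
            prev_mass, prev_intensity = merged[-1]
            # Keep the position of the earlier peak and add the intensities.
            merged[-1] = (prev_mass, prev_intensity + intensity)
        else:
            merged.append((mass, intensity))
    return merged


# Hypothetical reference peak lists of two different species.
ref_a = [(4435.0, 0.75), (5210.0, 1.0)]
ref_b = [(5211.0, 0.5), (6120.0, 0.25)]
combination = combine_peak_lists(ref_a, ref_b)
print(combination)
```

A weighted variant would multiply each intensity by a per-reference weighting factor before merging.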

    [0034] A particularly preferred hierarchical method for identifying bacteria in complex samples is schematically presented in FIG. 2, showing the steps S0 and S21 to S24 described below.

    [0035] In step S0, a sample under investigation is transferred onto a mass spectrometric sample support. The sample is assumed to be, with some probability, a mixture of two or more microbes, either because the colonies on an agar plate were not clearly separated, or the microbes came directly from a blood culture or a culture of other body fluids. The mass spectrometric sample support usually has a large number of spatially separated sample points, onto each of which microbe samples can be loaded. The microbe sample on the sample support is sprinkled with a solution of a conventional matrix substance for ionization by matrix-assisted laser desorption (MALDI). The organic solvent usually penetrates into the transferred cells and destroys them. The solvent subsequently evaporates and the dissolved matrix substance crystallizes, while some of the molecular cell components, particularly soluble proteins, released by the destruction of the cells are incorporated as analyte molecules into the matrix crystals.

    [0036] The matrix crystals and the analyte molecules incorporated therein are bombarded with laser light pulses in the ion source of a time-of-flight mass spectrometer, causing analyte molecules to be desorbed and ionized together with the matrix substance. The analyte ions thus produced are temporally separated in the time-of-flight mass spectrometer due to their mass-dependent time of flight, and are detected in a detector. The measured flight times of the ions are then converted into masses. The overwhelming majority of ions are protein ions which, after the ionization by the MALDI process, are present as singly charged ions (charge number z=1), which is why we can simply refer here to the mass m of the analyte ion, instead of using the correct term “charge-related mass” m/z.
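
For orientation, the conversion of flight time into mass can be written out for an idealized linear time-of-flight instrument. This is the textbook approximation, not a formula given in this document, and the symbols are assumptions introduced here: e is the elementary charge, U the acceleration voltage, L the length of the flight path, v the ion velocity and t the flight time.

```latex
z e U = \tfrac{1}{2} m v^{2}, \qquad v = \frac{L}{t}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{2 e U}{L^{2}}\, t^{2}
```

For singly charged ions (z = 1), the mass m is therefore proportional to the square of the flight time. Real instruments deviate from this idealization, which is why a calibrated conversion function is used in practice.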

    [0037] The time-of-flight mass spectrometers used for the identification of microorganisms are operated without a reflector because the detection sensitivity then is much higher (“linear operating mode”), although the mass resolution and the mass accuracy are significantly better when using a reflector. But in the reflector mode, only around a twentieth of the ion signals appear, and the detection sensitivity is one to two orders of magnitude worse. The high sensitivity is due to the fact that, in linear mode, not only the stable analyte ions, but also fragment ions and even the neutral particles from metastable decays of the analyte ions are detected. The secondary electron multipliers (SEM) used as the detector detect not only the analyte and fragment ions but even the neutral particles which are created from ion disintegrations during the time of flight, because these neutral particles also generate secondary electrons when they strike the SEM. When a singly charged molecular ion decays into five particles, then necessarily four of them are neutrals. All the fragment ions and neutral particles of one species of molecular ion possess the speed of the molecular ion from which they originated, and therefore reach the detector simultaneously with their molecular ions, creating an increased ion signal. The increased detection sensitivity is often so crucial for the identification of microorganisms that the disadvantages of the linear mode of operation must be tolerated. For this application, one even increases the energy of the desorbing and ionizing laser pulses, which leads to an increased yield of analyte ions, but also strongly increases the number of fragment particles per molecular ion, which is not a problem here for the reasons stated.

    [0038] The acquisition of a mass spectrum with a time-of-flight mass spectrometer usually requires the acquisition of many individual spectra, each generated by a single laser pulse. These are usually added together to form a sum spectrum by adding the measurement points that have the same flight time. In general, a sum spectrum consists of several hundred individual spectra, for which modern time-of-flight mass spectrometers need only a few seconds. Such a sum spectrum is usually processed further: for example, the time of flight is converted into a mass using a previously calibrated function, the background is corrected, and the noise in the mass spectrum is filtered out. A peak list is usually generated from the processed sum spectrum.
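
The point-by-point summation described above can be sketched as follows. This is a minimal illustration, assuming every individual spectrum is sampled on the same flight-time axis; the function names and the toy data are hypothetical. The normalization to a maximum value of 1 corresponds to what is described for the measured spectrum (10) in FIG. 3.

```python
from typing import List


def sum_spectrum(individual: List[List[float]]) -> List[float]:
    """Add individual spectra point by point (same index = same flight time)."""
    if not individual:
        raise ValueError("need at least one individual spectrum")
    length = len(individual[0])
    if any(len(spectrum) != length for spectrum in individual):
        raise ValueError("all individual spectra must share the same time axis")
    return [sum(points) for points in zip(*individual)]


def normalize(spectrum: List[float]) -> List[float]:
    """Scale the spectrum so that its largest signal equals 1."""
    largest = max(spectrum)
    if largest <= 0:
        raise ValueError("spectrum has no positive signal")
    return [value / largest for value in spectrum]


# Three hypothetical single-pulse spectra with four measurement points each.
shots = [
    [0.0, 1.0, 3.0, 0.0],
    [1.0, 1.0, 5.0, 0.0],
    [0.0, 2.0, 0.0, 1.0],
]
total = sum_spectrum(shots)
scaled = normalize(total)
print(total, scaled)
```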

    [0039] The upper half of FIG. 3 shows a measured MALDI time-of-flight mass spectrum (10) of a sample under investigation, which has been normalized to the value 1. The lower half of FIG. 3 shows a spectrum reconstructed from the peak list (20) of the sum spectrum (10), which contains only the significant signals of the sum spectrum (10) and thus requires much less storage space than the sum spectrum (10). The mass axis in FIG. 3 ranges from 3,000 to 12,000 daltons. At present, the time-of-flight mass spectra used for the identification of microorganisms are usually acquired in a mass range between around 2,000 daltons and 20,000 daltons. The signals in the lower mass range up to about 2,500 daltons are of little use. With ionization by the MALDI process, these signals are predominantly attributable to ions of the matrix substance and their clusters, but also to molecular cell components that vary depending on the culturing and preparation conditions, and they are therefore not suitable for a reliable identification. The best identification results are obtained if only the signals in the mass range between 3,000 and 15,000 daltons are evaluated.

    [0040] In step S21, the measured mass spectrum (MS), here the peak list on which the spectrum (20) is based, is compared with all the reference spectra (REF) which are stored as peak lists in the database (DB). Each comparison is done by calculating a similarity indicator. The reference spectra (REF) were obtained previously from pure cultures (isolates) of correctly identified microorganisms. Nowadays, validated commercial databases for the identification of microorganisms contain reference spectra of several thousand different microorganisms.
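
The comparison of a sample peak list against every reference peak list can be sketched as below. The similarity indicator actually used is not disclosed in this passage, so the sketch swaps in a much simpler stand-in: the fraction of sample peaks that have a reference peak within a hypothetical mass tolerance `tol`. The function names, the tolerance and the toy database are all assumptions.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (mass in daltons, relative intensity)


def matched_fraction(sample: List[Peak], reference: List[Peak], tol: float = 2.0) -> float:
    """Stand-in similarity: share of sample peaks found in the reference."""
    if not sample:
        return 0.0
    ref_masses = [mass for mass, _ in reference]
    hits = sum(
        1 for mass, _ in sample
        if any(abs(mass - ref_mass) <= tol for ref_mass in ref_masses)
    )
    return hits / len(sample)


def rank_references(sample: List[Peak], database: Dict[str, List[Peak]],
                    tol: float = 2.0) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from best to worst match."""
    scores = [(name, matched_fraction(sample, peaks, tol))
              for name, peaks in database.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical sample and a two-entry toy database.
sample = [(3000.0, 1.0), (4000.0, 0.5), (5000.0, 0.5), (6000.0, 0.25)]
database = {
    "organism B": [(6001.0, 1.0), (7000.0, 0.5)],
    "organism A": [(3001.0, 1.0), (4000.5, 0.5), (5001.0, 0.25)],
}
ranking = rank_references(sample, database)
print(ranking)
```

Unlike the logarithmic indicator reported in Table 1, this stand-in ranges from 0 to 1 and ignores intensities; it only illustrates the ranking step.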

    [0041] Table 1 shows the genus, species and strain designations together with the similarity indicators of the twenty microorganisms whose reference spectra best match the mass spectrum of the sample (MS), i.e. of the peak list for spectrum (20).

    TABLE 1
    No.  Microorganism (genus, species, strain)  Similarity indicator
    1    Pseudomonas aeruginosa 8147_2_CHB       2.373
    2    Pseudomonas aeruginosa DSM 50071T HAM   2.325
    3    Proteus mirabilis DSM 50903_DSM         2.257
    4    Pseudomonas aeruginosa ATCC 27853_CHB   2.208
    5    Pseudomonas aeruginosa 19955_1 CHB      2.189
    6    Pseudomonas aeruginosa ATCC 27853 THL   2.187
    7    Proteus mirabilis 13210 1_CHB           2.185
    8    Proteus mirabilis (PX) 22086112_MLD     2.090
    9    Proteus mirabilis 9482_2 CHB            2.085
    10   Proteus mirabilis DSM 18254_DSM         2.084
    11   Proteus mirabilis 22086103_MLD          2.071
    12   Proteus mirabilis DSM 30115_DSM         2.064
    13   Proteus mirabilis DSM 46227_DSM         2.059
    14   Proteus mirabilis DSM 788_DSM           2.028
    15   Pseudomonas jinjuensis LMG 21316 HAM    1.671
    16   Proteus penneri DSM 4544_DSM            1.556
    17   Pseudomonas resinovorans LMG 2274 HAM   1.426
    18   Proteus vulgaris DSM 13387 HAM          1.422
    19   Serratia rubidaea DSM 46275 DSM         1.299
    20   Proteus vulgaris (PX) 22086129_MLD      1.283
    . . .

    [0042] The similarity indicator in the right-hand column of Table 1 is a logarithmic parameter normalized to a maximum value of 3.00 for complete agreement with the peak list (20). A similarity indicator between 2.3 and 3 is a criterion for a very reliable identification of the species. With a similarity indicator between 2.0 and 2.3, a reliable identification of the genus and a probable identification of the species is assumed. The range of values between 1.7 and 2.0 still allows a probable identification of the genus, while below a value of 1.7 no reliable identification has been obtained.
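
The score ranges stated above translate directly into a small lookup. The sketch below is illustrative: the function name and the result strings are hypothetical, and the treatment of the exact boundary values (2.3, 2.0 and 1.7 are assigned to the higher category) is an assumption, since the text does not say which side the boundaries belong to.

```python
def interpret_similarity(score: float) -> str:
    """Map a similarity indicator (maximum 3.00) to the stated confidence level."""
    if score > 3.0:
        raise ValueError("similarity indicator cannot exceed 3.00")
    if score >= 2.3:
        return "very reliable species identification"
    if score >= 2.0:
        return "reliable genus, probable species identification"
    if score >= 1.7:
        return "probable genus identification"
    return "no reliable identification"


# Scores taken from rows 1, 3, 15 and 20 of Table 1.
for score in (2.373, 2.257, 1.671, 1.283):
    print(score, "->", interpret_similarity(score))
```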

    [0043] In this example embodiment, the first ten microorganisms of Table 1 are arbitrarily selected as the best set of reference spectra (REF.sub.B), which is used in steps S22 and S23 below. The remaining reference spectra (REF.sub.R) of the database are not considered further in the following steps.
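
Selecting the best set amounts to ranking the comparison results and keeping the top entries. A minimal sketch follows, assuming the results are available as (name, similarity indicator) pairs; the function name `select_best_set` and the optional `min_score` cutoff are illustrative, the latter mirroring the minimum-value variant mentioned in the claims.

```python
from typing import List, Optional, Tuple

Result = Tuple[str, float]  # (reference name, similarity indicator)


def select_best_set(results: List[Result], n: int = 10,
                    min_score: Optional[float] = None) -> List[Result]:
    """Keep the n highest-scoring references, optionally above a minimum score."""
    ranked = sorted(results, key=lambda item: item[1], reverse=True)
    if min_score is not None:
        ranked = [item for item in ranked if item[1] > min_score]
    return ranked[:n]


# A few rows of Table 1, deliberately out of order.
results = [
    ("Proteus mirabilis DSM 50903_DSM", 2.257),
    ("Serratia rubidaea DSM 46275 DSM", 1.299),
    ("Pseudomonas aeruginosa 8147_2_CHB", 2.373),
    ("Pseudomonas aeruginosa DSM 50071T HAM", 2.325),
    ("Proteus penneri DSM 4544_DSM", 1.556),
]
best_set = select_best_set(results, n=3)
print(best_set)
```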

    [0044] In step S22, the investigation focuses on whether the selected microorganisms of the best set differ in respect of their species, genus and/or a higher taxonomic hierarchy level. If all the microorganisms selected are only different strains of the same species, the procedure is halted. If not, the procedure is continued with steps S23 and S24.
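
The decision in step S22 can be sketched as a check on the number of distinct species in the best set. The representation of a reference as a (species, strain) pair and the function names are assumptions made for illustration; only the species level is checked here, although the text also allows genus or higher levels.

```python
from typing import List, Tuple

Reference = Tuple[str, str]  # (species, strain designation)


def distinct_species(best_set: List[Reference]) -> List[str]:
    """Return the sorted list of species represented in the best set."""
    return sorted({species for species, _strain in best_set})


def continue_with_combinations(best_set: List[Reference]) -> bool:
    """True if the best set holds more than one species (proceed to S23/S24)."""
    return len(distinct_species(best_set)) > 1


mixed = [
    ("Pseudomonas aeruginosa", "8147_2_CHB"),
    ("Pseudomonas aeruginosa", "DSM 50071T HAM"),
    ("Proteus mirabilis", "DSM 50903_DSM"),
]
single = [
    ("Pseudomonas aeruginosa", "8147_2_CHB"),
    ("Pseudomonas aeruginosa", "ATCC 27853_CHB"),
]
print(continue_with_combinations(mixed), continue_with_combinations(single))
```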

    [0045] The best set of reference spectra contains exactly two species of microorganisms: Pseudomonas aeruginosa and Proteus mirabilis. Both microorganisms belong to the taxonomic class “Gammaproteobacteria”, but already differ by their taxonomic orders “Pseudomonadales” and “Enterobacteriales”. The best set contains five reference spectra of microorganisms of the species “Pseudomonas aeruginosa” and five reference spectra of microorganisms of the species “Proteus mirabilis”; even the “best” microorganism from the best set (Pseudomonas aeruginosa 8147_2_CHB) has a similarity indicator which is only slightly greater than 2.3. Pseudomonas aeruginosa is a gram-negative oxidase-positive bacterium of the genus Pseudomonas (family: Pseudomonadaceae, order: Pseudomonadales). This widespread bacterium is found in humid environments and, as a pathogen, it plays a significant role in the increasing and often life-threatening hospital infections because it has multiple resistances to antibiotics due to its metabolism and the structure of its cell membrane. Proteus mirabilis is a gram-negative rod-shaped bacterium of the genus Proteus (family: Enterobacteriaceae, order: Enterobacteriales). This bacterium is a facultative pathogen which often occurs in the large intestine even of healthy people, but does not necessarily cause diseases. It can, however, cause additional clinical syndromes in immunodeficient persons, but treatment with antibiotics is usually successful.

    [0046] FIGS. 4A and 4B show the reconstructed reference spectra of “Pseudomonas aeruginosa 8147_2_CHB” (30) and “Proteus mirabilis DSM 50903_DSM” (40), each in comparison with the spectrum of peak list (20) of the sample under investigation. The two reference spectra (30) and (40) exhibit the highest similarity indicators of their respective species (see rows 1 and 3 in Table 1). In FIGS. 4A and 4B those signals of the peak list (20) which do not occur in the reference spectrum (30) or in the reference spectrum (40) are annotated with a question mark. While the signal (21) of peak list (20) appears in the reference spectrum (30) as signal (31), for example, there is nothing corresponding to signal (22) in the reference spectrum (30). The reference spectrum (40) has, in contrast, a signal (42) which corresponds to signal (22). The two reference spectra (30) and (40) appear to complement each other, at least with reference to the signals (21) and (22), which points to a mixed culture and could explain the large number of annotated signals in FIGS. 4A and 4B.

    [0047] For such a sample, where a widespread pathogen of hospital infections is contained in the results list alongside a merely facultative pathogen, rapid and certain identification is particularly important to ensure successful treatment.

    [0048] In step S23, the reference spectra of the different species of the best set (REF.sub.B) are combined in all possible pairs to form different combination spectra (CS). From the 10 reference spectra (5× “Pseudomonas aeruginosa”, 5× “Proteus mirabilis”) a total of 25 (5×5) combination spectra (CS) are formed, which are again compared with the peak list (20). For reasons of clarity, Table 2 shows only the best 10 of the 25 pairs of microorganisms with the similarity indicators of the corresponding combination spectrum.
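
The pairing in step S23 can be reproduced with a short enumeration. The sketch below takes the ten best-set entries from Table 1 as (species, strain) pairs, a representation assumed here for illustration, and keeps only pairs whose members belong to different species. The function name `cross_species_pairs` is hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

Reference = Tuple[str, str]  # (species, strain designation)

# The first ten rows of Table 1.
best_set: List[Reference] = [
    ("Pseudomonas aeruginosa", "8147_2_CHB"),
    ("Pseudomonas aeruginosa", "DSM 50071T HAM"),
    ("Proteus mirabilis", "DSM 50903_DSM"),
    ("Pseudomonas aeruginosa", "ATCC 27853_CHB"),
    ("Pseudomonas aeruginosa", "19955_1 CHB"),
    ("Pseudomonas aeruginosa", "ATCC 27853 THL"),
    ("Proteus mirabilis", "13210 1_CHB"),
    ("Proteus mirabilis", "(PX) 22086112_MLD"),
    ("Proteus mirabilis", "9482_2 CHB"),
    ("Proteus mirabilis", "DSM 18254_DSM"),
]


def cross_species_pairs(refs: List[Reference]) -> List[Tuple[Reference, Reference]]:
    """All unordered pairs of references that belong to different species."""
    return [(a, b) for a, b in combinations(refs, 2) if a[0] != b[0]]


pairs = cross_species_pairs(best_set)
print(len(pairs))  # 5 x 5 = 25 cross-species pairs
```

Of the 45 possible unordered pairs among ten references, the 20 same-species pairs are discarded, which leaves the 25 pairs stated in the text.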

    TABLE 2
    No.  First reference spectrum                Second reference spectrum            Similarity indicator
    1    Pseudomonas aeruginosa DSM 50071T HAM   + Proteus mirabilis 13210 1_CHB      2.640
    2    Pseudomonas aeruginosa DSM 50071T HAM   + Proteus mirabilis 9482_2 CHB       2.575
    3    Pseudomonas aeruginosa DSM 50071T HAM   + Proteus mirabilis DSM 50903_DSM    2.565
    4    Pseudomonas aeruginosa ATCC 27853_CHB   + Proteus mirabilis 13210 1_CHB      2.565
    5    Pseudomonas aeruginosa DSM 50071T HAM   + Proteus mirabilis DSM 18254_DSM    2.559
    6    Pseudomonas aeruginosa 8147_2_CHB       + Proteus mirabilis 13210 1_CHB      2.549
    7    Pseudomonas aeruginosa 8147_2_CHB       + Proteus mirabilis 9482_2 CHB       2.547
    8    Pseudomonas aeruginosa ATCC 27853_CHB   + Proteus mirabilis 9482_2 CHB       2.545
    9    Pseudomonas aeruginosa ATCC 27853_CHB   + Proteus mirabilis DSM 50903_DSM    2.533
    10   Pseudomonas aeruginosa 8147_2_CHB       + Proteus mirabilis DSM 50903_DSM    2.513
    . . .

    [0049] FIG. 5 shows the spectrum of the peak list (20) and the combination spectrum (50), which was combined from the reference spectra of “Pseudomonas aeruginosa DSM 50071T HAM” and “Proteus mirabilis 13210 1_CHB” (No. 1 in Table 2). The combination spectrum (50) shows the best match of all 25 combination spectra (CS) and has a similarity indicator which is significantly greater than 2.3. Correspondingly, the number of non-matching signals, which are annotated in the peak list (20) with a question mark, is significantly smaller than for the best reference spectra (30) and (40).

    [0050] In the concluding step S24 of FIG. 2, each of the 25 combination spectra (CS) is investigated to see whether it shows a better match with the peak list (20) than the two reference spectra (REF.sub.B) from which it was combined. If this is the case, the corresponding microorganisms of the combination spectrum (CS) are regarded as identified (ID) in the sample. Otherwise, the sample is labeled not identifiable (NO ID). It can be seen that here, the above-mentioned criterion for an identification is fulfilled, so both species “Pseudomonas aeruginosa” and “Proteus mirabilis” are regarded as identified. It should be noted that the similarity indicators of all the combination spectra (CS) are greater than 2.3, which additionally speaks for a very probable identification of both species. A comparison of Tables 1 and 2 makes clear that the combination spectrum formed from the best reference spectrum of each species (entries 1 and 3 in Table 1) does not show the largest similarity indicator, but is only in tenth place in the ranking.
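
The criterion of step S24 reduces to a comparison of similarity indicators. The sketch below applies it to values from Tables 1 and 2; the function name is hypothetical, and the last call uses made-up numbers solely to show a case in which the criterion fails.

```python
from typing import Sequence


def combination_identified(combination_score: float,
                           reference_scores: Sequence[float]) -> bool:
    """True if the combination spectrum matches the sample better than
    every reference spectrum from which it was formed."""
    return combination_score > max(reference_scores)


# Table 2, No. 1: P. aeruginosa DSM 50071T HAM (2.325) + P. mirabilis 13210 1_CHB (2.185).
best_pair = combination_identified(2.640, [2.325, 2.185])

# Table 2, No. 10: P. aeruginosa 8147_2_CHB (2.373) + P. mirabilis DSM 50903_DSM (2.257).
tenth_pair = combination_identified(2.513, [2.373, 2.257])

# Hypothetical values: the combination is worse than one of its references.
failing = combination_identified(2.100, [2.300, 1.900])

print(best_pair, tenth_pair, failing)
```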

    [0051] It is within the scope of the present invention that the reference spectra of the best set can be specified in other ways and combined in different ways to form a combination spectrum or several combination spectra. The criterion for the identification of the microorganisms used in this example embodiment is also only one preferred possibility.

    [0052] FIG. 6 is a schematic representation of a further preferred method as defined in the present invention. Step S0 corresponds to the one in FIGS. 1 and 2. In step S31, a difference spectrum (DS) is generated by subtracting at least one reference spectrum (REF) of a database (DB) from the mass spectrum (MS). In step S32, the difference spectrum is compared with the reference spectra (REF) and analyzed to see whether it matches one of the reference spectra (REF) sufficiently well for the corresponding microorganism to be identified with certainty.
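
Step S31 can be sketched on peak lists as follows. How the subtraction is carried out is not specified in the text, so the rule here is an assumption: for each sample peak, the intensities of all reference peaks within a hypothetical tolerance `tol` are subtracted, and peaks with no remaining positive intensity are dropped. The resulting difference peak list would then be compared with the reference spectra in step S32.

```python
from typing import List, Tuple

Peak = Tuple[float, float]  # (mass in daltons, relative intensity)


def difference_spectrum(sample: List[Peak], reference: List[Peak],
                        tol: float = 2.0) -> List[Peak]:
    """Subtract a reference peak list from a sample peak list.

    Assumption: reference peaks within `tol` daltons of a sample peak
    are treated as the same signal.
    """
    result: List[Peak] = []
    for mass, intensity in sample:
        overlap = sum(ref_int for ref_mass, ref_int in reference
                      if abs(mass - ref_mass) <= tol)
        remaining = intensity - overlap
        if remaining > 0:
            result.append((mass, remaining))
    return result


# Hypothetical sample and reference peak lists.
sample = [(4435.0, 1.0), (5210.0, 0.5), (6120.0, 0.75)]
reference = [(4436.0, 1.0), (5210.0, 0.25)]
difference = difference_spectrum(sample, reference)
print(difference)
```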

    [0053] The features according to the invention detailed in the description of the invention, in the example embodiments and in the figures can each be applied individually or as a combination of several features in order to achieve the objective.