Method for the spectrometric characterization of microorganisms
11661620 · 2023-05-30
Inventors
Cpc classification
C12Q1/04
CHEMISTRY; METALLURGY
International classification
Abstract
The invention relates to a method for the spectrometric characterization of microorganisms, comprising: providing a test microorganism; acquiring spectrometric measurement data from the test microorganism under potential exposure to variance that is not based on taxonomic classification; selecting a classifier which is trained to determine the identity of a microorganism on a second taxonomic level; and applying the classifier to the measurement data in order to determine the identity of the test microorganism on the second taxonomic level, wherein the classifier is variance-conditioned in such a way that it largely or completely masks out the effect of variance in the characterization of the test microorganism on the second taxonomic level.
Claims
1. A method for the spectrometric characterization of microorganisms, comprising: providing a test microorganism whose identity is known on a first taxonomic level; acquiring spectrometric measurement data from the test microorganism under conditions which allow the influence of at least one source of variance that is not based on a taxonomic classification of the test microorganism; selecting a classifier which is trained to determine an identity of a microorganism on a second taxonomic level which is subordinate to the first taxonomic level, where possible identities of the classifier on the second taxonomic level are assigned to the known identity of the test microorganism on the first taxonomic level, and applying the classifier to the measurement data in order to determine the identity of the test microorganism on the second taxonomic level; wherein the classifier is variance-conditioned by obtaining it through training on targetedly variance-loaded spectrometric reference data of different known reference microorganisms which exhibit the same identity as the test microorganism on the first taxonomic level and cover different identities on the second taxonomic level, where the training includes the stipulation of giving greater weighting to spectral characteristics of a first type from the reference data which promote the differentiation of the different identities on the second taxonomic level, than to spectral characteristics of a second type from the reference data which are affected by the targeted variance, in order to largely or completely mask out an effect of variance in the characterization of the test microorganism on the second taxonomic level.
2. The method according to claim 1, wherein the provision includes isolation of the test microorganism from a habitat.
3. The method according to claim 2, wherein the habitat is a biological and/or chemical matrix.
4. The method according to claim 3, wherein the isolation of the test microorganism includes the removal of the matrix.
5. The method according to claim 1, wherein the provision of the test microorganism includes a multiplication step.
6. The method according to claim 1, wherein the test microorganism is sterilized before the spectrometric measurement data are acquired.
7. The method according to claim 6, wherein the sterilization includes exposure of the test microorganism to a metabolism-inhibiting liquid or to an impact of energy.
8. The method according to claim 1, wherein the first taxonomic level and the second taxonomic level are immediately adjacent to each other.
9. The method according to claim 8, wherein the first taxonomic level corresponds to a species and the second taxonomic level corresponds to a subspecies.
10. The method according to claim 8, wherein the first taxonomic level corresponds to a species and the second taxonomic level comprises different varieties, e.g. pathogenic and non-pathogenic varieties, resistant and sensitive (susceptible) varieties, or different strains of the species.
11. The method according to claim 1, wherein the identity of the test microorganism on the first taxonomic level was determined in advance by means of at least one of the following methods: (i) mass spectrometry, (ii) infrared spectrometry, (iii) growth on selective media (“API (Analytical Profile Index) test”) and (iv) gene sequence analyses.
12. The method according to claim 1, wherein the one or more variances are of atmospheric origin.
13. The method according to claim 12, wherein the one or more variances contain different values on at least one of the following scales: temperature, humidity, pressure, and carbon dioxide content of ambient air.
14. The method according to claim 1, wherein the classifier is obtained and trained with the aid of one or more methods of machine learning.
15. The method according to claim 14, wherein said methods of machine learning comprise at least one of artificial neural networks (ANN) or linear discriminant analyses (LDA).
16. The method according to claim 1, wherein the characterization uses infrared spectrometric methods.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The invention can be better understood by referring to the following illustrations. The elements in the illustrations are not necessarily to scale, but are primarily intended to illustrate the principles of the invention (mostly schematically):
(2)
(3)
(4)
(5)
(6)
DETAILED DESCRIPTION
(7) While the invention has been illustrated and explained with reference to a number of embodiments, those skilled in the art will recognize that various changes in form and detail can be made without departing from the scope of the technical teaching, as defined in the enclosed claims.
(8)
(9) This finding suggested that the stable distinguishability, which persists even when the variance during a spectrometric measurement is considerable (10% to 80% on the humidity scale), can be used to eliminate the effect of this variance on the characterization by means of advanced evaluation methods, such as methods of machine learning, and thus to obviate the need for complex modifications of the spectrometers used for this purpose.
(10) The spectra on which the diagram in
(11) The result is shown in
(12) For an infrared spectrometric characterization measurement in transmission, a Fourier transform spectrometer (FT-IR), which provides a high resolution, can be used. See diagram of the measurement setup in
(13) The infrared spectra are based on thousands of vibrations of the functional groups and the polar bonds in the biological material; these in turn originate from all the components of the microorganism cells, such as DNA, RNA, proteins, internal structures, membranes, and cell walls, through to energy stores. There are no obvious assignments of molecules to individual characteristics in the spectra, even though certain spectral ranges can be preferentially assigned to certain molecular species: the fatty acid range from 3050 to 2800 cm⁻¹ with vibrations of the CH₂ and CH₃ groups, the amide range from 1750 to 1500 cm⁻¹ with peptide bonds, the polysaccharide range from 1200 to 900 cm⁻¹. The range from 900 to 700 cm⁻¹ is sometimes called the fingerprint range because it contains something from all molecules and is very important for differentiating between the varieties.
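The spectral ranges named above can be written down as a small lookup table. The following is only an illustrative sketch: the names `SPECTRAL_RANGES`, `assign_range` and `select_window` are hypothetical and do not appear in the original disclosure; only the wavenumber limits are taken from the text.

```python
# Wavenumber windows in cm^-1, taken from the description above.
# All identifiers are illustrative assumptions, not part of the disclosure.
SPECTRAL_RANGES = {
    "fatty acids (CH2/CH3 vibrations)": (2800.0, 3050.0),
    "amides (peptide bonds)": (1500.0, 1750.0),
    "polysaccharides": (900.0, 1200.0),
    "fingerprint": (700.0, 900.0),
}


def assign_range(wavenumber):
    """Return the names of all windows that contain the given wavenumber."""
    return [
        name
        for name, (low, high) in SPECTRAL_RANGES.items()
        if low <= wavenumber <= high
    ]


def select_window(wavenumbers, absorbances, name):
    """Keep only the (wavenumber, absorbance) pairs inside the named window."""
    low, high = SPECTRAL_RANGES[name]
    return [(w, a) for w, a in zip(wavenumbers, absorbances) if low <= w <= high]


if __name__ == "__main__":
    print(assign_range(1650.0))
    print(select_window([650.0, 750.0, 850.0, 950.0], [0.1, 0.2, 0.3, 0.4], "fingerprint"))
```

Because the windows share the boundary at 900 cm⁻¹, a value exactly on that boundary is reported for both the polysaccharide and the fingerprint range in this sketch.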
(14) In a slightly modified embodiment, the infrared spectra can also be measured in reflected light. In this case the samples are prepared on a metallically reflective substrate, made of aluminum for example. It is also possible to use Raman spectroscopy, which has the advantage that the spectra of the prepared microorganisms can also be measured in liquids and that much smaller quantities of sample material are required.
(15) The knowledge gained from
(16) (i) Prepare the Reference Microorganisms and Specify the Variance(s).
(17) The first task is to specify the classes to be distinguished. If knowledge of the species, e.g. Streptococcus pneumoniae, as the identity on the first taxonomic level is assumed, the objective can be to determine the corresponding serotypes as possible identities on the second, subordinate taxonomic level. As an example, the 23 serotypes of Streptococcus pneumoniae which are found most frequently in clinical tests can be selected. The reference biomass of these microorganisms can be obtained from the publicly operated depositories such as the Leibniz Institute DSMZ—German Collection of Microorganisms and Cell Cultures GmbH in Braunschweig.
(18) To give adequate consideration to the variance within the organism, a representative selection of microorganisms of the classes to be distinguished can be taken into account. Depending on availability, this can be three to six different strains per serotype in the example of Streptococcus pneumoniae; in the case of the 23 most common serotypes, 69 to 138 strains could be used for compiling the reference data and creating the classifier.
(19) The next task is to specify the parameter whose variance is to be imposed on the recording of the reference data and which can plausibly vary during an infrared spectrometric measurement. This can be an atmospheric variance, e.g. humidity, pressure, gas concentration or temperature. In principle, more than one variance parameter can be taken into account when recording the reference data, for example both humidity and temperature. However, broader coverage of the conceivable variances is also associated with a corresponding increase in the work required to measure the reference data, since the different representative values or reference values of the variance parameters have to be recorded in combination with each other. A list of reference points of the variance parameter(s) is selected which covers all realistic conditions during a spectrometric measurement. It should be possible to interpolate between the values of this representative selection of reference points.
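The growth in measurement effort described above can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions: the function name `measurement_plan`, the `replicates` parameter and the second temperature value of 25° C. are hypothetical; the 23 serotypes, three to six strains per serotype, and the humidity values 10%, 30%, 55% and 85% are taken from the description.

```python
from itertools import product


def measurement_plan(n_serotypes, strains_per_serotype, variance_grid, replicates=1):
    """Count strains, combined variance conditions, and reference spectra.

    variance_grid maps each variance parameter to its list of reference points;
    every combination of reference points has to be measured.
    """
    n_strains = n_serotypes * strains_per_serotype
    conditions = list(product(*variance_grid.values()))
    return {
        "strains": n_strains,
        "conditions": len(conditions),
        "spectra": n_strains * len(conditions) * replicates,
    }


if __name__ == "__main__":
    humidity_only = {"relative_humidity_percent": [10, 30, 55, 85]}
    # The 25 degree value is a hypothetical second temperature reference point.
    humidity_and_temperature = {
        "relative_humidity_percent": [10, 30, 55, 85],
        "temperature_celsius": [20, 25],
    }
    print(measurement_plan(23, 3, humidity_only))
    print(measurement_plan(23, 6, humidity_and_temperature, replicates=3))
```

Adding a second variance parameter multiplies the number of conditions rather than adding to it, which is why the reference points should be chosen sparingly and such that interpolation between them remains possible.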
(20) (ii) Recording the Reference Data
(21) First the strains of the reference microorganisms can be prepared in a standardized way. For example, after incubation on or in a suitable culture medium and, if necessary, after being sterilized to prevent biological contaminations, they can be deposited on a specimen slide for infrared spectrometry in several replicates and then introduced into the measurement chamber. The measurement chamber is maintained at a constant, predetermined value in respect of the variance parameter(s), for example 10% relative humidity at 20° C. After the specimen slide is introduced, it is preferable to wait a certain length of time, e.g. five to ten minutes, so that the prepared biomass of the reference microorganisms can become acclimatized to the preset conditions.
(22) After all the parameters have settled, the reference data of the prepared reference microorganisms can be recorded under the preset conditions. This procedure is repeated under the appropriately varied conditions, i.e. for example at 30%, 55% and 85% relative humidity and constant 20° C. Each change in the variance value should be followed by an acclimatization period of several minutes to allow the transient processes to decay and to obtain reproducible stable results.
(23) This method of recording reference data can be supplemented by measurements of subordinate variances, which result, for example, from slightly different incubation conditions (biological replicates), or preparation conditions (e.g. technical replicates, use of different batches of reagents/agents or chemicals), or from measurements taken on different spectrometers to allow for instrumental variances. The reference data thus recorded are checked for completeness, obvious outliers (e.g. using the Local Outlier Factor method, LOF), and/or plausibility, and are corrected and/or re-recorded, where necessary.
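The outlier check mentioned above can be illustrated with a compact implementation of the local outlier factor. This is a minimal sketch, not the implementation used for the reference data: the function name `local_outlier_factors`, the choice `k=3` and the two-dimensional toy points are assumptions, and ties in the neighbor ranking are resolved by simply taking the first `k` neighbors.

```python
import math


def local_outlier_factors(points, k=3):
    """Compute the local outlier factor (LOF) of every point.

    Values near 1 indicate a density similar to that of the neighbors;
    values well above 1 indicate a point in a much sparser region.
    """
    n = len(points)
    if n <= k:
        raise ValueError("need more points than neighbors")
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors, k_distance = [], []
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        neighbors.append(order[:k])                # simplified: exactly k neighbors
        k_distance.append(dist[i][order[k - 1]])   # distance to the k-th neighbor

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(1.0 / max(sum(reach) / k, 1e-12))  # guard against duplicates

    # LOF: mean density of the neighbors relative to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]


if __name__ == "__main__":
    # Toy stand-in for reduced spectral features; the last point is far away.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
    for point, score in zip(data, local_outlier_factors(data)):
        print(point, round(score, 2))
```

Spectra whose score clearly exceeds 1 would be candidates for inspection and, where necessary, re-recording; the threshold to apply is a judgment call that the description does not specify.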
(24) (iii) Training of the Variance-Conditioned Classifier
(25) It is preferable to use methods of machine learning, e.g. artificial neural networks (ANN) or linear discriminant analyses (LDA). In respect of the class affiliation, e.g. serotype #1, serotype #2, . . . , serotype #23 in the previously described example of Streptococcus pneumoniae, the training is supervised. Regarding the variance conditions, i.e. different ambient conditions (e.g. humidity) or other influencing factors (e.g. varying incubation, preparation, spectrometer), the training of the classifier is unsupervised, however. This is equivalent to the requirement to emphasize the significance of those spectral characteristics in the reference data which maximize the distinctiveness of the individual classes (here serotypes of the species Streptococcus pneumoniae), whereas those spectral characteristics which are strongly influenced by the variances have a lower weighting and are thus virtually masked out. The spectral characteristics can manifest themselves in the principal components, for example.
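The weighting principle described above can be illustrated with a per-feature Fisher ratio, i.e. the ratio of between-class scatter to within-class scatter, which is a per-feature analogue of the criterion that linear discriminant analysis optimizes. This is a simplified sketch and not the training procedure of the disclosure: the function name `fisher_weights` and the toy data are assumptions. Because the variance condition (here humidity) is not passed to the function, its effect ends up in the within-class scatter and lowers the weight of the affected feature.

```python
from statistics import fmean


def fisher_weights(spectra, labels):
    """Return one weight per spectral feature: between-class / within-class scatter.

    spectra: list of feature vectors recorded under deliberately varied conditions
    labels:  class label (e.g. serotype) of each vector; conditions are not labeled
    """
    classes = sorted(set(labels))
    weights = []
    for f in range(len(spectra[0])):
        grand_mean = fmean(row[f] for row in spectra)
        between = within = 0.0
        for c in classes:
            values = [row[f] for row, label in zip(spectra, labels) if label == c]
            class_mean = fmean(values)
            between += len(values) * (class_mean - grand_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        weights.append(between / (within + 1e-12))  # small constant avoids 0/0
    return weights


if __name__ == "__main__":
    # Feature 0 separates the two classes and barely reacts to humidity;
    # feature 1 follows the humidity (10%, 30%, 55%, 85%) identically in both classes.
    spectra = [
        [1.00, 0.10], [1.02, 0.30], [0.98, 0.55], [1.00, 0.85],  # serotype #1
        [2.00, 0.10], [2.01, 0.30], [1.99, 0.55], [2.00, 0.85],  # serotype #2
    ]
    labels = ["serotype #1"] * 4 + ["serotype #2"] * 4
    print(fisher_weights(spectra, labels))
```

In this toy example the humidity-driven feature receives a weight close to zero and is thus effectively masked out, while the class-separating feature receives a large weight.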
(26) In simple terms, and for the purpose of illustration (without any claim to strict scientific correctness), the machine learning algorithm identifies those partial volumes in a usually multi-dimensional, multivariate feature space which are each to be assigned to one of the classes distinguished (i.e. identities on the second taxonomic level). An unexpected aspect of this basically known method of taking interferences into account was that atmospheric variances such as relative humidity do not cause the spectral characteristics of one serotype/strain to overlap with those of other serotypes/strains when the humidity varies, but instead they remain separate, and thus ensure distinguishability in a space of spectral characteristics, also under such varying conditions.
(27) As is usual in such training phases which use reference data, there is the option to test the efficiency of the resulting classifier by means of a cross-validation. If appropriate, the machine learning algorithm can be adjusted on the basis of the results of the cross-validation in order to further improve the accuracy of the classifier.
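A cross-validation of the kind mentioned above can be sketched as follows. The example holds out all spectra of one strain at a time, so that the classifier is always tested on a strain it has not seen. This is an assumption-laden illustration: the nearest-centroid classifier is a deliberately simple stand-in for the ANN or LDA classifier, and the function names and toy data are hypothetical.

```python
import math
from statistics import fmean


def centroid(rows):
    """Feature-wise mean of a list of feature vectors."""
    return [fmean(column) for column in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, sample):
    """Assign the sample to the class whose centroid is closest."""
    centroids = {
        c: centroid([row for row, label in zip(train_x, train_y) if label == c])
        for c in set(train_y)
    }
    return min(centroids, key=lambda c: math.dist(centroids[c], sample))


def leave_one_strain_out_accuracy(x, y, strains):
    """Hold out each strain in turn and return the overall fraction of correct calls."""
    correct = 0
    for held_out in sorted(set(strains)):
        train = [i for i, s in enumerate(strains) if s != held_out]
        test = [i for i, s in enumerate(strains) if s == held_out]
        train_x = [x[i] for i in train]
        train_y = [y[i] for i in train]
        correct += sum(nearest_centroid_predict(train_x, train_y, x[i]) == y[i] for i in test)
    return correct / len(x)


if __name__ == "__main__":
    # Two serotypes, two strains each, each strain measured at low and high humidity.
    x = [[1.0, 0.1], [1.1, 0.8], [0.9, 0.1], [1.0, 0.8],
         [2.0, 0.1], [2.1, 0.8], [1.9, 0.1], [2.0, 0.8]]
    y = ["SV #1"] * 4 + ["SV #2"] * 4
    strains = ["a", "a", "b", "b", "c", "c", "d", "d"]
    print(leave_one_strain_out_accuracy(x, y, strains))
```

Grouping the folds by strain requires at least two strains per class; otherwise a held-out class would be missing from the training data.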
(28) (iv) Validation (Optional)
(29) When the taxonomic assignment of one or more test microorganisms is known, a validation test run can be conducted under conditions which permit the expected variance (e.g. varying relative humidity) in order to verify the efficiency on the basis of external data.
(30) This procedure for creating a classifier can be repeated to create a variance-conditioned classifier database with a very wide range of microorganisms, which in turn can be identified on different taxonomic levels. Reference data can preferably be acquired and processed from pathogens which occur in the clinical environment with the greatest frequency.
(31) After the variance-conditioned classifier is created, the method for characterizing a microorganism can be conducted as follows, see the schematic sequence in
(32) First, the identity of the test microorganism must be known or must have been determined on the first taxonomic level, e.g. the species, using a mass spectrometer such as the MALDI Biotyper® (Bruker Daltonik GmbH, Bremen, Germany). On this basis, the variance-conditioned classifier that is appropriate for the identity determined is selected. By way of example, attention is drawn in this context to the method described in EP 3 083 981 A1.
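The selection step described above amounts to a lookup keyed by the identity on the first taxonomic level. The sketch below is purely illustrative: the registry structure and the names `CLASSIFIER_REGISTRY` and `select_classifier` are hypothetical, and the entries only record which second-level identities a classifier covers (23 serotypes of Streptococcus pneumoniae and serotypes SV #1 to SV #15 of Legionella pneumophila, as named in this description), not a trained model.

```python
# Hypothetical registry: species (first level) -> description of the
# variance-conditioned classifier for the subordinate (second) level.
CLASSIFIER_REGISTRY = {
    "Streptococcus pneumoniae": {
        "second_level": "serotype",
        "classes": [f"serotype #{i}" for i in range(1, 24)],
    },
    "Legionella pneumophila": {
        "second_level": "serotype",
        "classes": [f"SV #{i}" for i in range(1, 16)],
    },
}


def select_classifier(species):
    """Return the registry entry for a species identified on the first level."""
    try:
        return CLASSIFIER_REGISTRY[species]
    except KeyError:
        raise LookupError(
            f"no variance-conditioned classifier available for {species!r}"
        ) from None


if __name__ == "__main__":
    entry = select_classifier("Streptococcus pneumoniae")
    print(entry["second_level"], len(entry["classes"]))
```

Raising an error for an unknown species, rather than falling back to some default classifier, reflects the requirement that the classifier must match the identity determined on the first taxonomic level.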
(33) To obtain sufficient biomass, the test microorganism can be incubated in a nutrient solution or on a flat nutrient medium. The microorganism cells thus grown can then be removed from the nutrient medium, for example by separating them from the nutrient solution, e.g. by centrifuging or filtering, or by sampling from an agar plate. For the purpose of sterilization, the microorganisms thus harvested can be re-suspended in an activity-inhibiting liquid such as ethanol (e.g. 70% v/v).
(34) Microorganisms react very sensitively to changes in growth conditions, such as different media, temperatures, nutrients, changes in the gas supply (oxygen and others), moisture, incubation period etc. These factors can bring about changes in cell composition and in metabolism, which can be detected with infrared spectrometry. For the purpose of incubation, the cell material of a pure single colony can be spread onto an agar plate using a spatula in order to bring about confluent growth. This technique enables the sampling of cells in a very reproducible mixture of the different growth phases which are always present in colonies. For most clinically relevant strains, the optimum incubation period is around 16 to 24 hours, and the incubation temperature frequently used for bacteria is around 35° C. to 37° C. The sample material of an incubated test microorganism can be harvested directly from the center of the cell layer e.g. using a calibrated platinum loop with a diameter of one millimeter (step A).
(35) When the test microorganism is grown on a flat nutrient medium such as agar, biomass can be sampled from one or more colonies and deposited directly on a spectrometric specimen slide. It is important to ensure uniform distribution, with the option to sterilize the biomass by irradiating it with ultraviolet light (e.g. in the case of Streptococcus pneumoniae). Alternatively, the biomass can likewise be re-suspended in a metabolism-inhibiting liquid (step B). The liquid can also be de-ionized water, which does not usually exert any metabolism-inhibiting effect. In this case also, the test microorganism can be sterilized by ultraviolet radiation or other energy source (e.g. heat) after being deposited on a test site of a specimen slide.
(36) Care must be taken that no residues of the nutrient medium, which could interfere with the measurement result, adhere to the test microorganism taken from the nutrient medium. To achieve uniform distribution of the biomass of the test microorganism in the suspension, small cylinders or beads of reaction-inert material such as steel can be added to the suspension and the sealed suspension vessel can then be shaken (step C). The suspension is then aliquoted and applied gently, e.g. by means of a pipette with a plastic tip, onto the specimen slide in replicates (step D), the number of which may vary from protocol to protocol. Uniform application with homogeneous layer thickness promises the best measurement results (step E). After all samples under investigation have been applied to the specimen slide, it is left to stand for several minutes, e.g. ten to thirty minutes, at a specified temperature, e.g. 37° C., for the suspensions to dry (step F). If the test microorganism is applied to the specimen slide as soon as it is harvested from the incubation vessel without any further re-suspension, the drying can be omitted completely, or at least it can be made much shorter.
(37) The specimen slides thus prepared can then be introduced into a measurement chamber of a spectrometer and measured sample by sample under conditions which allow the influence of at least one source of variance. Several positions on the specimen slide can also be coated with test standard biomass to check the technical performance of the spectrometer, for example in line with the applicant's method explained in EP 3 392 342 A1.
(38) After the usual processing steps, such as baseline subtraction, smoothing and calculation of the second derivative, the recorded spectra can be analyzed with the variance-conditioned classifier created in advance. As described above, only (or at least predominantly) those spectral characteristics that are not influenced by the variance, or only to a slight degree, are taken into account here, whereas those spectral characteristics which exhibit a high variance-induced variation are largely or completely masked out.
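The processing steps named above (baseline subtraction, smoothing, second derivative) can be sketched with elementary operations. This is a minimal illustration under stated assumptions: a straight line through the end points serves as the baseline, a moving average as the smoother, and a central finite difference as the derivative; in practice, Savitzky-Golay filtering is commonly used for the latter two. All function names are hypothetical.

```python
def subtract_baseline(y):
    """Subtract the straight line through the first and last data point."""
    slope = (y[-1] - y[0]) / (len(y) - 1)
    return [value - (y[0] + slope * i) for i, value in enumerate(y)]


def moving_average(y, window=5):
    """Smooth with a centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(y)):
        low, high = max(0, i - half), min(len(y), i + half + 1)
        smoothed.append(sum(y[low:high]) / (high - low))
    return smoothed


def second_derivative(y, step=1.0):
    """Central finite-difference second derivative (two points shorter than y)."""
    return [(y[i - 1] - 2.0 * y[i] + y[i + 1]) / step ** 2 for i in range(1, len(y) - 1)]


def preprocess(y, step=1.0, window=5):
    """Apply baseline subtraction, smoothing and second derivative in sequence."""
    return second_derivative(moving_average(subtract_baseline(y), window), step)


if __name__ == "__main__":
    # Synthetic absorbance values: a sloping baseline plus one bump.
    spectrum = [0.10 + 0.01 * i + (0.5 if 8 <= i <= 11 else 0.0) for i in range(20)]
    print([round(v, 3) for v in preprocess(spectrum)])
```

The second derivative suppresses slowly varying background contributions and sharpens overlapping bands, which is why it is a common input for classification of infrared spectra.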
(39) Processing the measurement data with the variance-conditioned classifier leads to the spectrum under investigation being assigned to one of the possible identities on the second taxonomic level. In the example of Streptococcus pneumoniae, this means one of the referenced serotypes. Only in rare cases is a reliable characterization not possible, for example because of unforeseen disturbances during the incubation, sample preparation or measurement, or because the identity of the test microorganism sought on the second taxonomic level is not included in the reference data (e.g. in the case of a very rare serotype which is of almost no relevance in clinical practice).
(40)
(41) The reference data of the different serotypes of the reference microorganisms Legionella pneumophila, whose underlying strains are coded with different symbols such as triangles, squares and circles, were acquired under four different relative humidities (arid 10%, semi-arid 30%, humid 55% and tropical 85%). In the diagram, this variance essentially manifests itself in the elongation of the data clouds along the principal component axis PC 2. However, it is clear that, irrespective of the variance, the data cloud belonging to serotype 1 (SV #1), is sufficiently removed from the grouped data cloud of the other serotypes SV #2 to SV #15 to ensure the distinguishability on the basis of spectral characteristics. If a clinic experiences an increase in the number of cases of diarrhea which can be ascribed in a first analysis, for example with the established mass spectrometric MALDI-TOF method, to the bacterial species Legionella pneumophila, an appropriately trained classifier can be used in the subsequent infrared spectrometric analysis of the isolated and incubated pathogen to distinguish the particularly pathogenic serotype SV #1 from the other less dangerous serotypes SV #2 to SV #15 in order to start a specific treatment in the case of a positive result. This procedure can of course be transferred to other microorganisms. The flexibility of the classifier creation described here is boundless.
(42) Starting from the afore-described methods, variance-conditioned classifiers are determined for a plurality of possible microorganisms and also for a plurality of possible sources of variance, individually and also several in combination, during a spectrometric measurement. With knowledge of the identity of a microorganism to be characterized on a first taxonomic level, a spectrometric sub-characterization of the identity on a second subordinate taxonomic level can thus be robustly and reliably carried out by selecting the appropriate variance-conditioned classifier.
(43) Further embodiments of the invention are conceivable in addition to the embodiments described by way of example. With knowledge of this disclosure, those skilled in the art can easily design further advantageous embodiments, which are to be covered by the scope of protection of the claims, including any equivalents as the case may be.