Methods and systems for detecting aerosol particles without using complex organic MALDI matrices
11996280 · 2024-05-28
Assignee
Inventors
- Michael MCLOUGHLIN (Sykesville, MD, US)
- WAYNE A. BRYDEN (Sykesville, MD, US)
- Charles J. Call (Sykesville, MD, US)
- Dapeng CHEN (Sykesville, MD, US)
Cpc classification
G01N27/64
PHYSICS
H01J49/0036
ELECTRICITY
International classification
H01J49/16
ELECTRICITY
G01N27/64
PHYSICS
Abstract
Disclosed are systems and methods for identifying the composition of single aerosol particles, particularly that of bioaerosol particles, without pre-treatment using complex organic MALDI matrices. A continuous timing laser may be used to index aerosol particles, measure particle properties, and trigger a pulse ionization laser. Ionized fragments and optionally photons associated with each particle produced by the ionization laser may be analyzed using one or more detectors including a TOF-MS detector and an optical detector. The laser pulse may comprise a simultaneous IR and UV laser pulse when fragments consist predominantly of UV chromophores. Unique spectral data associated with each indexed particle from each detector may be compiled using data fusion to generate compiled spectral data. Machine learning methods may be used to improve the prediction of composition over time.
Claims
1. A system for identifying the composition of bioaerosol particles without pre-treatment of the bioaerosol particles using complex organic MALDI matrices, the system comprising: an aerosol beam generator to generate a beam of single particles; a continuous laser generator to generate a single continuous laser, in association with a data analysis system, configured to: index each particle in the beam of single particles; optically characterize in association with one or more laser scattering devices, particle size, particle shape, and fluorescence of each indexed particle, and generate optical data; and select which indexed particle is to be ionized; a pulse ionization laser generator triggered by the continuous laser, having an ionization region of less than about 150 μm in diameter and configured to simultaneously generate an IR laser pulse and an UV laser pulse when each selected indexed particle reaches the ionization region, and is simultaneously exposed to the IR laser pulse and the UV laser pulse to produce at least one of ionized fragments of each selected indexed particle and photons associated with each selected indexed particle, wherein the molecular weight of each ionized fragment is between about 1 kDa and 150 kDa; and, a TOFMS detector to analyze the ionized fragments associated with each selected indexed particle and generate unique mass spectral data associated with each selected indexed particle, wherein the data analysis system is further configured to: compile the optical data with unique mass spectral data associated with each selected indexed particle using data fusion; and compare the compiled optical data with a training data set comprising of a knowledge base of known biological matter to predict composition of the bioaerosol particles.
2. The system of claim 1, wherein the IR laser pulse is characterized by a wavelength of between about 1.0 micrometer and about 1.2 micrometer.
3. The system of claim 1, wherein the UV laser pulse is characterized by a wavelength of between about 250 nm and about 400 nm.
4. The system of claim 1, wherein the wavelength of the IR pulse is about 1.06 micrometer.
5. The system of claim 1, wherein the wavelength of the UV pulse is about 355 nm.
6. The system of claim 1, wherein the IR laser pulse width is between about 1 ns and about 10 ns.
7. The system of claim 1, wherein the IR laser pulse repetition rate is about 1 kHz.
8. The system of claim 1, wherein the travel time of each particle from the aerosol beam generator to the ionization region of the ionization pulse laser is less than about 1 s.
9. The system of claim 1, wherein the pulse ionization laser generator includes a Nd:YAG laser generator configured to simultaneously generate an IR laser pulse and an UV laser pulse.
10. The system of claim 1, wherein the data analysis system is further configured to: improve the prediction of composition over time using machine learning methods implemented using a machine learning engine disposed in data communication with the data analysis system.
11. The system of claim 1, wherein the pulse ionization laser power density is between about 1 MW/cm² and about 20 MW/cm².
12. A method for identifying the composition of bioaerosol particles without pre-treatment of the bioaerosol particles using complex organic MALDI matrices, the method comprising: generating an aerosol particle beam of single particles using an aerosol beam generator; using a single continuous laser, in association with a data analysis system: indexing each particle in the beam of single particles; optically characterizing in association with one or more laser scattering devices, particle size, particle shape, and fluorescence of each indexed particle, and generating optical data; and selecting which indexed particle is to be ionized; triggering an ionization pulse laser generator using the continuous laser to simultaneously generate an IR laser pulse and an UV laser pulse when each selected indexed particle reaches the ionization region of the ionization pulse laser and simultaneously exposing each particle to the IR laser pulse and the UV laser pulse laser to produce at least one of ionized fragments of each selected indexed particle and photons associated with each selected indexed particle, wherein the molecular weight of each ionized fragment is between about 1 kDa and about 150 kDa; analyzing the ionized fragments of each selected indexed particle using a TOFMS detector to generate unique mass spectral data associated with each selected indexed particle; and, determining the composition of each selected indexed particle by: compiling the optical data with unique mass spectral data associated with each selected indexed particle using data fusion; and comparing the compiled data with a training data set comprising of a knowledge base of known biological matter to predict composition of the bioaerosol particles.
13. The method of claim 12, wherein selecting which indexed particle is to be ionized comprises determining whether at least one of particle size, particle shape, and fluorescence of the indexed particle meets a predetermined threshold value.
14. The method of claim 12, wherein the determining the composition step further comprises: improving the prediction of composition over time using machine learning methods implemented using a machine learning engine disposed in data communication with the data analysis system.
15. The method of claim 12, wherein the IR laser pulse is characterized by a wavelength of between about 1.0 micrometer and about 1.2 micrometer.
16. The method of claim 12, wherein the UV laser pulse is characterized by a wavelength of between about 250 nm and about 400 nm.
Description
DRAWINGS
(1) The foregoing aspects and many of the attendant advantages of this disclosure will become more readily appreciated as the same becomes better understood by reference to the following detailed description, when taken in conjunction with the accompanying drawings, wherein:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9) All reference numerals, designators and callouts in the figures are hereby incorporated by this reference as if fully set forth herein. The failure to number an element in a figure is not intended to waive any rights. Unnumbered references may also be identified by alpha characters in the figures and appendices.
(10) The following detailed description includes references to the accompanying drawings, which form a part of the detailed description. The drawings show, by way of illustration, specific embodiments in which the disclosed systems and methods may be practiced. These embodiments, which are to be understood as examples or options, are described in enough detail to enable those skilled in the art to practice the present invention. The embodiments may be combined, other embodiments may be utilized, or structural or logical changes may be made, without departing from the scope of the invention. The following detailed description is, therefore, not to be taken in a limiting sense and the scope of the invention is defined by the appended claims and their legal equivalents.
(11) In this disclosure, "aerosol" generally means a suspension of particles dispersed in air or gas. "Real-time" analysis of aerosols generally means analytical methods and devices that identify the aerosol analyte within a matter of minutes after the aerosol sample to be analyzed is introduced to the analytical device or system. The terms "a" or "an" are used to include one or more than one, and the term "or" is used to refer to a nonexclusive "or" unless otherwise indicated. In addition, it is to be understood that the phraseology or terminology employed herein, and not otherwise defined, is for the purpose of description only and not of limitation. Unless otherwise specified in this disclosure, for construing the scope of the term "about," the error bounds associated with the values (dimensions, operating conditions etc.) disclosed are ±10% of the values indicated in this disclosure. The error bounds associated with the values disclosed as percentages are ±1% of the percentages indicated. The word "substantially" used before a specific word includes the meanings "considerable in extent to that which is specified," and "largely but not wholly that which is specified."
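As a worked illustration of the "about" convention defined above, the following minimal Python sketch computes the interval implied for a disclosed value. The function name `about_bounds` is hypothetical, and the sketch assumes that the ±1% bound for percentages is read as a relative tolerance (1% of the stated percentage) rather than as one percentage point; the disclosure does not settle which reading is intended.

```python
def about_bounds(value, is_percentage=False):
    """Return the (low, high) interval implied by "about" for a disclosed value."""
    # 10% relative tolerance for ordinary values; 1% relative tolerance for
    # values that are themselves percentages (an assumed reading of the text).
    relative_tolerance = 0.01 if is_percentage else 0.10
    delta = abs(value) * relative_tolerance
    return (value - delta, value + delta)


# "about 150 micrometers" (ionization region diameter recited in claim 1)
low_um, high_um = about_bounds(150.0)
# "about 10%" (a value disclosed as a percentage)
low_pct, high_pct = about_bounds(10.0, is_percentage=True)
print(low_um, high_um, low_pct, high_pct)
```

Under these assumptions, "about 150 μm" spans roughly 135 μm to 165 μm, and "about 10%" spans roughly 9.9% to 10.1%.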
DETAILED DISCLOSURE
(12) Particular aspects of the invention are described below in considerable detail for the purpose of illustrating the compositions, principles, and operations of the disclosed methods and systems. However, various modifications may be made, and the scope of the invention is not limited to the exemplary aspects described.
(13) In exemplary system 100 (
(14) In an exemplary method 200 (
(15) Aerosol analyte particles collected from ambient air typically contain a significant amount of water. There is a strong association of water with background atmospheric particles, particularly for particles containing biological macromolecules such as proteins and DNA. In a bacterial cell, lipopolysaccharide, peptidoglycan and glycan may make up only about 10% of the dry weight of the vegetative cell. Further, many other compounds associated with biological particles contain large amounts of hydroxyl groups that will have the same strong laser interactions as water. Water (ambient humidity/moisture) associated with every particle sampled from the atmosphere may potentially be used as a laser absorbing matrix for single particle TOF-MS. The ubiquitous presence of water in atmospheric aerosol particles provides a mechanism for ion generation across a broad spectrum of masses. As previously described, in exemplary system 100, the travel time (or residence time) of a particle from beam generator 102 to being hit with laser 108 is less than about 1 s. This short residence time permits the analysis of IR chromophores in exemplary method 300. As a result of this short transit time, water that is already strongly bound to the surface of the particles 301 as a thin film (e.g., monolayer film) or contained within the particles as water or hydroxyl groups, does not evaporate but is available for strong interaction with IR laser pulses in step 302. Biological matter usually contains molecules in high concentrations that contain infrared-active hydroxyl groups. In fact, every cellular interaction in the body involves specific interactions between carbohydrate molecules that decorate the surface and exist throughout cellular material. Furthermore, normal preparations of biological materials are often contaminated with growth media such as agar. These materials (IR chromophores) strongly absorb IR radiation because of their high content of hydroxyl groups.
The IR laser pulse may have a wavelength between about 2.7 micrometers and about 3.3 micrometers. The wavelength of the IR pulse may be about 2.94 micrometers. The IR laser repetition rate is typically in the 1 kHz range and the pulse width may be between about 40 microseconds and about 100 microseconds. The IR laser power density may be between about 1 MW/cm² and about 20 MW/cm². The overlap between the infrared absorption of hydroxyl containing molecules such as water, carbohydrate and agar, and the IR laser line is also shown
(16) In the event that aerosol particles comprise non-biological particles and identifying the chemical composition of the particles is desired, analysis of these particles may be done by hard ionization to generate small ions. An IR laser pulse with laser power densities between about 20 MW/cm² and about 150 MW/cm² may be used for this purpose. Methods 200 and 300 may be modified to enable switching between hard ionization that generates fragments of molecular weight less than 1 kDa, and typically less than 500 Da, and soft ionization that generates fragments of molecular weight typically greater than 1 kDa.
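The relationship between pulse parameters and the soft and hard ionization regimes described above may be sketched as follows. This is a minimal illustration only: it assumes a rectangular pulse and a uniformly illuminated circular spot, the function names are hypothetical, and the pulse energies, the 5 ns pulse width, and the 150 μm spot diameter are illustrative inputs rather than disclosed operating points. Because the disclosed ranges meet at about 20 MW/cm², the sketch arbitrarily assigns that boundary value to the hard regime.

```python
import math

SOFT_RANGE_MW_CM2 = (1.0, 20.0)    # soft ionization range quoted in the text
HARD_RANGE_MW_CM2 = (20.0, 150.0)  # hard ionization range quoted in the text


def power_density_mw_cm2(pulse_energy_j, pulse_width_s, spot_diameter_um):
    """Peak power density, assuming a rectangular pulse and a uniform circular spot."""
    radius_cm = (spot_diameter_um * 1e-4) / 2.0   # 1 micrometer = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2
    peak_power_w = pulse_energy_j / pulse_width_s
    return peak_power_w / area_cm2 / 1e6          # W/cm^2 -> MW/cm^2


def ionization_regime(density_mw_cm2):
    """Map a power density onto the regimes named in the text."""
    if SOFT_RANGE_MW_CM2[0] <= density_mw_cm2 < SOFT_RANGE_MW_CM2[1]:
        return "soft"
    if HARD_RANGE_MW_CM2[0] <= density_mw_cm2 <= HARD_RANGE_MW_CM2[1]:
        return "hard"
    return "out of range"


# Illustrative inputs: 10 uJ and 100 uJ pulses, 5 ns wide, 150 um spot.
low_energy_density = power_density_mw_cm2(10e-6, 5e-9, 150.0)
high_energy_density = power_density_mw_cm2(100e-6, 5e-9, 150.0)
print(round(low_energy_density, 1), ionization_regime(low_energy_density))
print(round(high_energy_density, 1), ionization_regime(high_energy_density))
```

With these inputs the 10 μJ pulse lands near 11 MW/cm² (soft regime) and the 100 μJ pulse near 113 MW/cm² (hard regime), so a tenfold change in pulse energy at fixed focus is enough to switch regimes in this simplified model.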
(17) In the exemplary methods described above, each individual aerosol analyte particle is indexed prior to ionization and tracked using at least one continuous laser. Further, a continuous laser may be used to measure particle properties such as size and shape. Each individual particle is indexed and tracked to enable data fusion of mass spectral data associated with each particle and the optical properties of each particle. These optical properties may include size, shape and polarization of the particles. Indexing allows mass spectral data collected after ionization of each particle to be associated with that particle. The large amount of data related to each particle in the aerosol beam may then be filtered and analyzed using data fusion protocols in data analysis system 110 to identify the composition and type of particles in real-time and with a high accuracy, sensitivity, and specificity. Data fusion may be defined as a combination of data from multiple sources to obtain improved information, that is, information that is less expensive, of higher quality, or more relevant. A review of data fusion techniques is provided by Castanedo [2], which is incorporated by reference herein in its entirety.
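The indexing and fusion steps described above may be illustrated with the following minimal sketch. It assumes the simplest form of data fusion, namely concatenating the optical measurements and the mass spectrum recorded under the same particle index into one feature vector. The class names `ParticleRecord` and `ParticleLog`, the optical keys, and the numeric values are all hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParticleRecord:
    """All data collected for one indexed particle."""
    index: int
    optical: Dict[str, float] = field(default_factory=dict)
    mass_spectrum: Optional[List[float]] = None


class ParticleLog:
    """Assigns an index to each particle and fuses later measurements by index."""

    def __init__(self):
        self._records: Dict[int, ParticleRecord] = {}
        self._next_index = 0

    def index_particle(self, **optical: float) -> int:
        # Called when the continuous laser detects and characterizes a particle.
        idx = self._next_index
        self._next_index += 1
        self._records[idx] = ParticleRecord(index=idx, optical=dict(optical))
        return idx

    def attach_spectrum(self, idx: int, spectrum: List[float]) -> None:
        # Called after ionization, when the mass spectrum is available.
        self._records[idx].mass_spectrum = list(spectrum)

    def fused(self, idx: int, optical_keys=("size_um", "shape", "polarization")):
        # Feature-level fusion: optical features followed by spectral intensities.
        record = self._records[idx]
        if record.mass_spectrum is None:
            raise ValueError(f"particle {idx} has no mass spectrum yet")
        return [record.optical[k] for k in optical_keys] + record.mass_spectrum


log = ParticleLog()
particle = log.index_particle(size_um=1.2, shape=0.8, polarization=0.3)
log.attach_spectrum(particle, [0.0, 5.0, 2.5])
fused_vector = log.fused(particle)
print(fused_vector)
```

Keying every measurement on the particle index is what allows data from detectors that fire at different times to be combined without ambiguity; more elaborate fusion schemes, such as those surveyed by Castanedo [2], could replace the concatenation step.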
(18) In exemplary methods 200 and 300, in addition to TOF-MS mass spectral analysis, one or more optical detection methods may also be employed because when the analyte aerosol particles absorb sufficient light energy from a laser pulse, they emit characteristic photons as they transition from a high-energy state to a lower energy state and generate transient optical signatures such as high-order fluorescence, laser-induced breakdown spectroscopy (LIBS), Raman spectra and infrared spectra. Therefore, in addition to mass spectrometry, optical sensors/detectors 109 may be used to identify the composition of the aerosol particles. Measured data collected using both TOF-MS and optical sensors may be processed using data fusion techniques to provide information on the composition of the aerosol analytes. By collecting information from a variety of detectors that include one or more optical methods and mass spectrometry, it is possible to filter and analyze the data associated with each particle using data fusion protocols to rapidly (close to real-time) identify the composition and type of particles with a high accuracy, sensitivity, and specificity. For each indexed individual aerosol particle, data from each of the measurements comprising at least one of TOF-MS, LIBS, Raman spectroscopy and infrared spectroscopy, may be transferred to the sensor data fusion engine 108 where artificial intelligence tools including machine learning and deep learning may be employed to fully characterize the particles.
(19) In LIBS, a laser pulse (e.g. from a high energy Nd:YAG laser with a wavelength of about 1064 nm) is focused on the particle to ablate a small amount of the particle and generate a plasma. The analyte particle breaks down (dissociates) into ionic and atomic species. When the plasma cools, characteristic atomic emission lines of the elements may be observed using an optical detector such as a CCD detector. Another exemplary optical detection tool is Raman spectroscopy. Raman spectroscopy provides information about molecular vibrations that can be used for sample identification and quantitation. The technique involves focusing a laser beam (e.g. a UV laser source with wavelength between about 330 and about 360 nm) on a sample and detecting inelastically scattered light. The majority of the scattered light is of the same frequency as the excitation source and is known as Rayleigh or elastic scattering. A very small amount of the scattered light is shifted in energy from the laser frequency, due to interactions between the incident electromagnetic waves and the vibrational energy levels of the molecules in the sample. Plotting the intensity of this shifted light versus frequency results in a Raman spectrum of the sample. In fluorescence spectroscopy, the analyte molecules are excited by irradiation at a certain wavelength and emit radiation of a different wavelength. The emission spectrum provides information for both qualitative and quantitative analysis. When light of an appropriate wavelength is absorbed by a molecule, the electronic state of the molecule changes from the ground state to one of many vibrational levels in one of the excited electronic states. Once the molecule is in this excited state, relaxation can occur via several processes. Fluorescence is one of these processes and results in the emission of light.
By analyzing the different frequencies of light emitted in fluorescence spectroscopy, along with their relative intensities, the chemical structure associated with different vibrational levels can be determined. Certain amino acids in biological samples, for example tryptophan, have high fluorescence quantum efficiencies, which favors the use of fluorescence spectroscopy for identifying these amino acids.
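The energy shift that defines a Raman spectrum is conventionally expressed in wavenumbers as the difference between the reciprocal excitation wavelength and the reciprocal scattered wavelength. The short sketch below shows that arithmetic for a Stokes (lower-energy) line. The 355 nm excitation value matches the UV wavelength recited in the claims; the 403.7 nm scattered wavelength and the function names are illustrative assumptions.

```python
def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift in cm^-1 from excitation and scattered wavelengths in nm."""
    # 1e7 / wavelength_nm converts a wavelength in nm to a wavenumber in cm^-1.
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm, shift_cm1):
    """Wavelength at which a given Stokes shift appears for a given excitation."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)


shift = raman_shift_cm1(355.0, 403.7)
round_trip_nm = scattered_wavelength_nm(355.0, shift)
print(round(shift), round(round_trip_nm, 1))
```

In this example the shift comes out near 3400 cm⁻¹, which falls in the region usually associated with O-H stretching vibrations, so hydroxyl-rich material excited at 355 nm would be expected to scatter near 404 nm.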
(20) Machine learning (ML) techniques for analyzing collected spectral data using machine learning engine 111 offer a significant improvement over manual data processing for analyte identification, which is slow and labor intensive. Machine learning is generally a subset of artificial intelligence and comprises algorithms whose performance improves with data analysis over time. Supervised machine learning methods may be used. Supervised learning comprises the task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. Machine learning also includes deep learning methodologies, which may be used as unsupervised learning methods that can identify signatures in complex data sets without the need to a priori identify specific features. Unsupervised machine learning methods and semi-supervised methods (hybrids between supervised and unsupervised learning) may also be used. Unsupervised learning methods may comprise a type of learning that helps find previously unknown patterns in a data set without pre-existing labels. Two exemplary methods used in unsupervised learning are principal component analysis and cluster analysis. Cluster analysis is used in unsupervised learning to group, or segment, data sets with shared attributes in order to extrapolate algorithmic relationships. Cluster analysis is a branch of machine learning that groups data that has not been labelled, classified or categorized. Cluster analysis identifies commonalities in the data and reacts based on the presence or absence of such commonalities in each new piece of data. This approach helps detect anomalous data points. Unsupervised learning methods may be used for anomaly detection, which can be helpful in identifying previously unknown hazards.
For example, air samples may be analyzed at periodic intervals to measure the composition of particles in air and to identify the properties of the particles (e.g., size, shape, fluorescence) and spectra associated with particles to establish baseline data for particles in normal ambient air. Particles in ambient air after an event such as the release of biological threat agents into the atmosphere would provide particle property data and spectral data that deviate from baseline data and would highlight an anomaly (as evidenced by anomalous spectra) and provide an opportunity to take necessary remedial steps to mitigate the threat. As previously described, the compiled spectral data may be compared with a training data set comprising a knowledge base of known biological matter spectra to predict particle composition. System 110 may be in data communication with machine learning engine 111 to allow for updating the training data set knowledge base and improving the prediction of composition over time. Biological matter mass spectra cover a range that is about three orders of magnitude greater than chemical mass spectra, significantly complicating the application of automated techniques.
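The baseline comparison described above may be illustrated with a deliberately simple stand-in for the unsupervised methods named in the text. The sketch below does not implement principal component analysis or cluster analysis; it fits a per-feature mean and standard deviation to baseline particles and scores a new particle by its largest z-score. The feature choice (size and fluorescence), the numeric values, and the threshold of 3 are illustrative assumptions.

```python
import statistics


def fit_baseline(samples):
    """Per-feature mean and sample standard deviation of baseline feature vectors."""
    columns = list(zip(*samples))
    means = [statistics.fmean(column) for column in columns]
    stdevs = [statistics.stdev(column) for column in columns]
    return means, stdevs


def anomaly_score(vector, means, stdevs):
    """Largest absolute z-score across features; larger means more unusual."""
    # A tiny floor on the standard deviation avoids division by zero.
    return max(abs(v - m) / max(s, 1e-12) for v, m, s in zip(vector, means, stdevs))


# Hypothetical baseline particles: (size in micrometers, fluorescence in arbitrary units)
baseline = [(1.0, 0.10), (1.2, 0.12), (0.9, 0.09), (1.1, 0.11), (1.0, 0.10)]
means, stdevs = fit_baseline(baseline)

THRESHOLD = 3.0  # illustrative cut-off, not a disclosed value
typical_score = anomaly_score((1.05, 0.10), means, stdevs)
unusual_score = anomaly_score((1.10, 0.60), means, stdevs)
print(typical_score > THRESHOLD, unusual_score > THRESHOLD)
```

A particle resembling the baseline scores well below the threshold, while one with strongly elevated fluorescence scores far above it. In practice the fused optical and mass spectral vector for each indexed particle could take the place of the two-element vectors used here.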
(21) In addition, environmental contaminants can reduce signal strength by competing with the target during the ionization process (competitive ionization), and introduce signature components (clutter) that must be deconvolved from the target signature. Current automated methods are mostly limited to searching for very pure targets in samples with no environmental clutter. The disclosed exemplary methods eliminate competitive ionization by physically separating target analyte from clutter and eliminate ambiguities in the signature (each event is assumed to be either target or clutter). An exemplary ML schematic diagram 400 for identifying tuberculosis (TB) biomarkers using high-resolution mass spectrometry is shown in
(22) In method 400, Significance Analysis of Microarrays (SAM) techniques were also applied in step 403 to the signals extracted in step 401 to identify strongly discriminative features, and to select the most powerful features to distinguish the two classes of samples. SAM is a feature selection algorithm that is designed to process a large data set and identify the features that most strongly distinguish two classes of samples. SAM analysis returned a feature ranking list based on quantity changes, statistical significance, and false positive rates. Features identified by SAM were optimized using Support Vector Machines (SVMs) in step 404. An SVM is a supervised machine learning classifier that uses a training data set to define a separation hyperplane such that an unknown sample can be classified according to the side of the hyperplane on which it falls. The advantage of SVMs lies in their ability to process high-dimensional data, predict analyte composition, and continuously improve the knowledge base contained in the training data set.
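The two ideas in this paragraph, ranking features by how strongly they separate two classes and classifying a sample by the side of a hyperplane on which it falls, may be sketched as follows. This is a simplified illustration under stated assumptions: the ranking uses a SAM-style relative-difference score (difference of class means divided by a pooled standard error plus a small constant `s0`) without the permutation-based false positive estimation of full SAM, and the hyperplane is the perpendicular bisector between the two class means rather than a trained SVM. All function names and data values are hypothetical.

```python
import math
import statistics


def sam_style_scores(class_a, class_b, s0=0.1):
    """Per-feature relative difference d = (mean_a - mean_b) / (s + s0)."""
    n_a, n_b = len(class_a), len(class_b)
    scores = []
    for col_a, col_b in zip(zip(*class_a), zip(*class_b)):
        mean_a, mean_b = statistics.fmean(col_a), statistics.fmean(col_b)
        sum_sq = (sum((v - mean_a) ** 2 for v in col_a)
                  + sum((v - mean_b) ** 2 for v in col_b))
        pooled_se = math.sqrt((1.0 / n_a + 1.0 / n_b) * sum_sq / (n_a + n_b - 2))
        scores.append((mean_a - mean_b) / (pooled_se + s0))
    return scores


def rank_features(scores):
    """Feature indices ordered from most to least discriminative."""
    return sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)


def mean_difference_hyperplane(class_a, class_b, features):
    """Hyperplane through the midpoint of the class means (a stand-in, not an SVM)."""
    mean_a = [statistics.fmean([row[f] for row in class_a]) for f in features]
    mean_b = [statistics.fmean([row[f] for row in class_b]) for f in features]
    weights = [a - b for a, b in zip(mean_a, mean_b)]
    midpoint = [(a + b) / 2.0 for a, b in zip(mean_a, mean_b)]
    bias = -sum(w * m for w, m in zip(weights, midpoint))
    return weights, bias


def classify(sample, weights, bias, features):
    """Label a sample by the side of the hyperplane on which it falls."""
    value = sum(w * sample[f] for w, f in zip(weights, features)) + bias
    return "A" if value >= 0 else "B"


# Hypothetical peak intensities for three features in two sample classes.
class_a = [[10.0, 5.0, 1.0], [11.0, 5.2, 0.9], [9.5, 4.9, 1.1]]
class_b = [[2.0, 5.1, 1.0], [2.5, 5.0, 1.2], [1.8, 5.3, 0.8]]

ranking = rank_features(sam_style_scores(class_a, class_b))
selected = ranking[:1]  # keep only the strongest feature
weights, bias = mean_difference_hyperplane(class_a, class_b, selected)
label_high = classify([9.0, 5.0, 1.0], weights, bias, selected)
label_low = classify([3.0, 5.0, 1.0], weights, bias, selected)
print(ranking, label_high, label_low)
```

Selecting features before fitting the classifier keeps the decision boundary in a low-dimensional space, which mirrors the order of steps 403 and 404 described above. A production system would more likely use a maintained SVM implementation in place of the mean-difference hyperplane.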
(23) As an example, SAM-based feature selection with extracted signals from negative ions is shown in
(24) To identify the presence of biological threat agents in the atmosphere, air samples may be collected at predetermined time intervals and analyzed using the exemplary methods disclosed above to generate a historical data set (training data set) of background/baseline information in data analysis system 110. Analysis may be improved over time using machine learning algorithms run in engine 111. Variations in background information may be modeled to map out normal behavior of the atmosphere in a protected area. When a release of biological, biochemical, or chemical aerosol particles is suspected, sampling of air using the exemplary methods described above will result in information that deviates from historical background information. The first signature of the presence of such a threat will be a sharp deviation from the normal background. At this stage, algorithmic decisions may be made as to the composition of each individual particle. Remedial actions can therefore be taken quickly to protect human life.
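One way to model a slowly varying background and flag a sharp deviation from it is an exponentially weighted running mean and variance, as in the sketch below. The disclosure does not specify a background model, so this choice, the class name `BackgroundMonitor`, the parameter values, and the per-interval counts are all illustrative assumptions.

```python
import math


class BackgroundMonitor:
    """Tracks a running background level and flags sharp deviations from it."""

    def __init__(self, alpha=0.2, threshold=4.0, warmup=5):
        self.alpha = alpha          # weight given to the newest interval
        self.threshold = threshold  # alarm when deviation exceeds threshold * std
        self.warmup = warmup        # intervals to observe before alarming
        self.mean = None
        self.variance = 0.0
        self.count = 0

    def update(self, value):
        """Process one sampling interval; return True if it is a sharp deviation."""
        self.count += 1
        if self.mean is None:
            self.mean = float(value)
            return False
        deviation = value - self.mean
        std = math.sqrt(self.variance)
        alarm = (self.count > self.warmup and std > 0.0
                 and abs(deviation) > self.threshold * std)
        if not alarm:
            # Only non-alarming intervals are folded into the background model.
            self.mean += self.alpha * deviation
            self.variance = (1.0 - self.alpha) * (self.variance
                                                  + self.alpha * deviation ** 2)
        return alarm


# Hypothetical counts of fluorescent particles per sampling interval.
counts = [3, 4, 2, 3, 5, 4, 3, 4, 40]
monitor = BackgroundMonitor()
alarms = [monitor.update(c) for c in counts]
print(alarms)
```

Excluding alarming intervals from the update keeps a release event from being absorbed into the background, while the exponential weighting lets the model follow gradual seasonal or diurnal drift.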
(25) The exemplary methods and devices disclosed above may also be used for analysis of liquid samples. In this case, an aliquot of the sample may be aerosolized using suitable means. For example, a nebulizer may be used to aerosolize the liquid sample in air. Analyte particles may also be extracted from a swab or may be in the form of a solid sample which may be dissolved using a suitable solvent. An aliquot of the resulting solution may then be aerosolized in the same manner. In addition to bacteria, the disclosed exemplary methods and devices may be used to identify viruses and toxins in real-time. By analyzing data collected from one or more optical detectors and from mass spectrometry, the biological fingerprint of analyte particles may be obtained in real-time.
(26) The disclosed exemplary methods obviate the need for using complex sample processing steps associated with MALDI TOF-MS, while still producing large informative ions, particularly in the case of biological aerosol particles. Further, generation of large molecular fragments may also be improved by treating the aerosol particles using a spray of water or solvent-water mixture before ionization using the IR laser pulse (for example, in method 300). Organic solvents comprising at least one of methanol, ethanol, and isopropanol may be used. The MALDI process requires a sample processing step whereby another chemical (usually a complex organic molecule in a solvent) coats the sample before it is analyzed in the TOF-MS. Methods 200 and 300 may be modified to permit a MALDI matrix (simple matrices such as organic solvents) coating step. The MALDI technique coupled with high-mass-range time-of-flight (TOF) mass spectrometry may also permit direct analysis of large peptide components and complete proteins, enabling whole cell biological identification. Commonly owned International Application PCT/US2016/48395 entitled "Coating of Aerosol Particles Using an Acoustic Coater," which is incorporated by reference herein in its entirety, describes conventional MALDI TOF mass spectrometry, provides examples of complex organic MALDI matrices, and discloses methods and devices for applying a coating of a MALDI matrix solution to bioaerosol particles prior to their analysis in an aerosol time-of-flight mass spectrometer.
(27) The Abstract is provided to comply with 37 C.F.R. § 1.72(b), to allow the reader to determine quickly from a cursory inspection the nature and gist of the technical disclosure. It should not be used to interpret or limit the scope or meaning of the claims.
(28) Although the present disclosure has been described in connection with the preferred form of practicing it, those of ordinary skill in the art will understand that many modifications can be made thereto without departing from the spirit of the present disclosure. Accordingly, it is not intended that the scope of the disclosure in any way be limited by the above description.
(29) It should also be understood that a variety of changes may be made without departing from the essence of the disclosure. Such changes are also implicitly included in the description. They still fall within the scope of this disclosure. It should be understood that this disclosure is intended to yield a patent covering numerous aspects of the disclosure both independently and as an overall system and in both method and apparatus modes.
(30) Further, each of the various elements of the disclosure and claims may also be achieved in a variety of manners. This disclosure should be understood to encompass each such variation, be it a variation of an implementation of any apparatus implementation, a method or process implementation, or even merely a variation of any element of these.
(31) Particularly, it should be understood that the words for each element may be expressed by equivalent apparatus terms or method terms even if only the function or result is the same. Such equivalent, broader, or even more generic terms should be considered to be encompassed in the description of each element or action. Such terms can be substituted where desired to make explicit the implicitly broad coverage to which this disclosure is entitled. It should be understood that all actions may be expressed as a means for taking that action or as an element which causes that action. Similarly, each physical element disclosed should be understood to encompass a disclosure of the action which that physical element facilitates.
(32) In addition, as to each term used it should be understood that unless its utilization in this application is inconsistent with such interpretation, common dictionary definitions should be understood as incorporated for each term and all definitions, alternative terms, and synonyms such as contained in at least one of a standard technical dictionary recognized by artisans and the Random House Webster's Unabridged Dictionary, latest edition are hereby incorporated by reference.
(33) Further, the transitional phrase "comprising" is used to maintain the open-ended claims herein, according to traditional claim interpretation. Thus, unless the context requires otherwise, it should be understood that variations such as "comprises" or "comprising" are intended to imply the inclusion of a stated element or step or group of elements or steps, but not the exclusion of any other element or step or group of elements or steps. Such terms should be interpreted in their most expansive forms so as to afford the applicant the broadest coverage legally permissible.
REFERENCES
(34)
1. Wan G-H, Wu C-L, Chen Y-F, Huang S-H, Wang Y-L, et al. (2014), Particle Size Concentration Distribution and Influences on Exhaled Breath Particles in Mechanically Ventilated Patients, PLoS ONE 9(1): e87088.
2. Castanedo, F., A Review of Data Fusion Techniques, The Scientific World Journal, 2013.
3. Warschat, C. et al., Mass Spectrometry of Levitated Droplets by Thermally Unconfined Infrared-Laser Desorption, Anal. Chem. 2015, 87, 8323-8327.