Method for Increased Throughput
20230207299 · 2023-06-29
Inventors
- Thomas R Covey (Newmarket, CA)
- Gordana Ivosev (Etobicoke, CA)
- Peter Kovarik (Markham, CA)
- Chang Liu (Richmond Hill, CA)
Cpc classification
H01J49/0454
ELECTRICITY
H01J49/0036
ELECTRICITY
H01J49/0418
ELECTRICITY
International classification
Abstract
A trace of intensity versus time values is received for a series of samples produced by a mass spectrometer. Also, a series of ejection times corresponding to the series of samples produced by a sample introduction system is received. A series of expected peak times corresponding to the series of ejection times is calculated using a known delay time from ejection to mass analysis. At least one isolated peak of the trace is identified using the series of expected peak times. A peak profile is calculated by fitting a mixture of at least two different distribution functions to the at least one isolated peak. For at least one time of the series of expected peak times, an area of a peak at the one time is calculated by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile.
Claims
1. A system for calculating the area of a sample peak of a trace produced using high-throughput sample introduction coupled mass spectrometry, comprising: a sample introduction system that ejects each sample of a series of samples at an ejection time, producing a series of ejection times corresponding to the series of samples, and ionizes each ejected sample of the series of samples, producing an ion beam; a mass spectrometer that receives the ion beam and mass analyzes the ion beam over time, producing a trace of intensity versus time values for one or more mass-to-charge ratio (m/z) values for the series of samples; and a processor that receives the trace and the series of ejection times, calculates a series of expected peak times corresponding to the series of ejection times using a known delay time from ejection to mass analysis, identifies at least one isolated peak of the trace using the series of expected peak times, calculates a peak profile by fitting a mixture of at least two different distribution functions to the at least one isolated peak, and for at least one time of the series of expected peak times, calculates an area of a peak at the one time by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile.
2. The system of claim 1, wherein the processor identifies at least one isolated peak of the trace using the series of expected peak times by identifying one or more peaks that have a minimum overlap with adjacent peaks by calculating intensities at midpoints between peaks using the series of expected peak times and selecting each peak that has an intensity at each midpoint with an adjacent peak that is less than a threshold intensity value, and identifying a peak of the one or more peaks that has a minimum overlap and that has the highest intensity as the at least one isolated peak.
3. The system of claim 1, wherein the at least two different distribution functions comprise a Gaussian distribution function.
4. The system of claim 1, wherein the at least two different distribution functions comprise a Weibull distribution function.
5. The system of claim 1, wherein the sample introduction system comprises a surface analysis system.
6. The system of claim 5, wherein the surface analysis system comprises a matrix-assisted laser desorption ionization (MALDI) device.
7. The system of claim 5, wherein the surface analysis system comprises a laser diode thermal desorption (LDTD) device.
8. The system of claim 1, wherein the sample introduction system comprises a flow injection device and an ion source device.
9. The system of claim 8, wherein the flow injection device comprises a timed valve device that injects sample into a flowing stream through a valve at each ejection time of the series of ejection times and wherein the ion source device ionizes samples of the flowing stream, producing the ion beam.
10. The system of claim 8, wherein the flow injection device comprises a droplet dispenser that ejects the series of samples as droplets into a flowing stream at each ejection time of the series of ejection times and wherein the ion source device ionizes samples of the flowing stream, producing the ion beam.
11. The system of claim 10, wherein the droplet dispenser comprises an acoustic droplet ejection (ADE) device that ejects the series of samples as droplets into an inlet of a tube of an open port interface (OPI), wherein the OPI mixes the droplets of the series of samples with a solvent in the tube to form a series of analyte-solvent dilutions and transfers the series of dilutions to an outlet of the tube of the OPI, and wherein the ion source device receives the series of dilutions and ionizes samples of the series of dilutions, producing the ion beam.
12. The system of claim 1, wherein each time of the series of expected peak times comprises a time at which an apex of a peak is expected.
13. The system of claim 1, wherein the mixture of at least two different distribution functions produces an asymmetric peak that has a larger leading edge gradient than a trailing edge gradient.
14. A method for calculating the area of a sample peak of a trace produced using high-throughput sample introduction coupled mass spectrometry, comprising: receiving a trace of intensity versus time values for one or more mass-to-charge ratio (m/z) values for a series of samples produced by a mass spectrometer and a series of ejection times corresponding to the series of samples produced by a sample introduction system using a processor; calculating a series of expected peak times corresponding to the series of ejection times using a known delay time from ejection to mass analysis using the processor; identifying at least one isolated peak of the trace using the series of expected peak times using the processor; calculating a peak profile by fitting a mixture of at least two different distribution functions to the at least one isolated peak using the processor; and for at least one time of the series of expected peak times, calculating an area of a peak at the one time by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile using the processor.
15. A computer program product, comprising a non-transitory and tangible computer-readable storage medium whose contents include a program with instructions being executed on a processor so as to perform a method for calculating the area of a sample peak of a trace produced using high-throughput sample introduction coupled mass spectrometry, the method comprising: providing a system, wherein the system comprises one or more distinct software modules, and wherein the distinct software modules comprise an analysis module; receiving a trace of intensity versus time values for one or more mass-to-charge ratio (m/z) values for a series of samples produced by a mass spectrometer and a series of ejection times corresponding to the series of samples produced by a sample introduction system using the analysis module; calculating a series of expected peak times corresponding to the series of ejection times using a known delay time from ejection to mass analysis using the analysis module; identifying at least one isolated peak of the trace using the series of expected peak times using the analysis module; calculating a peak profile by fitting a mixture of at least two different distribution functions to the at least one isolated peak using the analysis module; and for at least one time of the series of expected peak times, calculating an area of a peak at the one time by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile using the analysis module.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0058] The skilled artisan will understand that the drawings, described below, are for illustration purposes only. The drawings are not intended to limit the scope of the present teachings in any way.
[0071] Before one or more embodiments of the present teachings are described in detail, one skilled in the art will appreciate that the present teachings are not limited in their application to the details of construction, the arrangements of components, and the arrangement of steps set forth in the following detailed description or illustrated in the drawings. Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting.
DESCRIPTION OF VARIOUS EMBODIMENTS
Computer-Implemented System
[0072]
[0073] Computer system 200 may be coupled via bus 202 to a display 212, such as a cathode ray tube (CRT) or liquid crystal display (LCD), for displaying information to a computer user. An input device 214, including alphanumeric and other keys, is coupled to bus 202 for communicating information and command selections to processor 204. Another type of user input device is cursor control 216, such as a mouse, a trackball or cursor direction keys for communicating direction information and command selections to processor 204 and for controlling cursor movement on display 212. This input device typically has two degrees of freedom in two axes, a first axis (i.e., x) and a second axis (i.e., y), that allows the device to specify positions in a plane.
[0074] A computer system 200 can perform the present teachings. Consistent with certain implementations of the present teachings, results are provided by computer system 200 in response to processor 204 executing one or more sequences of one or more instructions contained in memory 206. Such instructions may be read into memory 206 from another computer-readable medium, such as storage device 210. Execution of the sequences of instructions contained in memory 206 causes processor 204 to perform the process described herein. Alternatively, hard-wired circuitry may be used in place of or in combination with software instructions to implement the present teachings. Thus, implementations of the present teachings are not limited to any specific combination of hardware circuitry and software.
[0075] In various embodiments, computer system 200 can be connected to one or more other computer systems, like computer system 200, across a network to form a networked system. The network can include a private network or a public network such as the Internet. In the networked system, one or more computer systems can store and serve the data to other computer systems. The one or more computer systems that store and serve the data can be referred to as servers or the cloud, in a cloud computing scenario. The one or more computer systems can include one or more web servers, for example. The other computer systems that send and receive data to and from the servers or the cloud can be referred to as client or cloud devices, for example.
[0076] The term “computer-readable medium” as used herein refers to any media that participates in providing instructions to processor 204 for execution. Such a medium may take many forms, including but not limited to, non-volatile media, volatile media, and transmission media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device 210. Volatile media includes dynamic memory, such as memory 206. Transmission media includes coaxial cables, copper wire, and fiber optics, including the wires that comprise bus 202.
[0077] Common forms of computer-readable media or computer program products include, for example, a floppy disk, a flexible disk, hard disk, magnetic tape, or any other magnetic medium, a CD-ROM, digital video disc (DVD), a Blu-ray Disc, any other optical medium, a thumb drive, a memory card, a RAM, PROM, and EPROM, a FLASH-EPROM, any other memory chip or cartridge, or any other tangible medium from which a computer can read.
[0078] Various forms of computer readable media may be involved in carrying one or more sequences of one or more instructions to processor 204 for execution. For example, the instructions may initially be carried on the magnetic disk of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system 200 can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector coupled to bus 202 can receive the data carried in the infra-red signal and place the data on bus 202. Bus 202 carries the data to memory 206, from which processor 204 retrieves and executes the instructions. The instructions received by memory 206 may optionally be stored on storage device 210 either before or after execution by processor 204.
[0079] In accordance with various embodiments, instructions configured to be executed by a processor to perform a method are stored on a computer-readable medium. The computer-readable medium can be a device that stores digital information. For example, a computer-readable medium includes a compact disc read-only memory (CD-ROM) as is known in the art for storing software. The computer-readable medium is accessed by a processor suitable for executing instructions configured to be executed.
[0080] The following descriptions of various implementations of the present teachings have been presented for purposes of illustration and description. They are not exhaustive and do not limit the present teachings to the precise form disclosed. Modifications and variations are possible in light of the above teachings or may be acquired from practicing of the present teachings. Additionally, the described implementation includes software, but the present teachings may be implemented as a combination of hardware and software or in hardware alone. The present teachings may be implemented with both object-oriented and non-object-oriented programming systems.
Calculating Area of High-Throughput Sample Peaks
[0081] High-throughput sample analysis is critical to the drug discovery process. Mass spectrometry (MS) based methods can achieve label-free, universal mass detection of a wide range of analytes with exceptional sensitivity, selectivity, and specificity. Recently, a number of sample introduction systems for MS-based analysis have been improved to provide higher throughput.
[0082] For some of these technologies, such as acoustic ejection mass spectrometry (AEMS), the sample is delivered to the mass spectrometer quite fast (multiple samples per second). The limiting factor for the throughput, however, is the ability of the data processing algorithm to accurately integrate the area of a peak when the signals from adjacent samples partially overlap.
[0083] Calculating or integrating the peak area of AEMS peaks is especially challenging when a peak of lower intensity immediately follows a peak of much higher signal intensity. Essentially, the lower intensity peak becomes part of or is convolved with the higher intensity peak.
[0084] Conventionally, many algorithms are available to integrate interfering chromatographic peaks. Unfortunately, AEMS peaks do not have the same shape as chromatographic peaks. Consequently, the algorithms used to integrate convolved chromatographic peaks cannot be used to integrate AEMS peaks.
[0085]
[0086] For example, first AEMS peak 310 is a convolved peak. A second peak is convolved with peak 310 in trailing edge 312 of peak 310. The conventional chromatographic peak integrating algorithm can detect the convolved peak, sometimes referred to as a shoulder peak.
[0087] In order to re-create the second peak, the peak integrating algorithm creates a peak profile or peak profile model. This peak profile is created by fitting a mixture of three Gaussian distribution functions to an isolated peak (not shown) of the AEMS trace. Generally, the most intense isolated peak is used. Once the peak profile is created, it can be used to re-create the second peak.
[0088] In plot 300, the peak profile is fitted to the AEMS trace to create modeled second peak 320. A comparison of trailing edge 312 of peak 310 and second peak 320 shows that modeled second peak 320 does not fit trailing edge 312 of peak 310 particularly well. In addition, second peak 320 lacks the asymmetry of an AEMS peak, which is characterized by a larger gradient for the leading edge than for the trailing edge. In other words, second peak 320 is a symmetric peak.
[0089] AEMS peak 330 further highlights the difficulty the conventional chromatographic peak integrating algorithm has with non-convolved AEMS peaks. The integrating algorithm creates peak 340 to model actual peak 330. However, leading edge 341 of modeled peak 340 cannot match the faster rising leading edge 331 of actual peak 330. In addition, trailing edge 342 of modeled peak 340 cannot match the longer trailing edge 332 of actual peak 330.
[0090]
[0091] In general, any peak shape can be modeled as a mixture of Gaussian distributions. The problem, however, with using increasing numbers of Gaussian distributions is the increasing number of parameters needed. Each Gaussian distribution has a set of parameters. Using multiple Gaussian distributions then requires specifying multiple sets of parameters. Unfortunately, however, an AEMS trace only provides a limited number of points across each peak. For example, it may not be possible to use a mixture of distribution functions that requires more than nine parameters if there are only 20 available points across a peak.
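The parameter budget discussed above can be tallied with a short sketch. The per-component counts are an assumption made for illustration (amplitude, center, and width for a Gaussian; amplitude, onset, shape, and scale for a Weibull); the source does not specify a parametrization, and the names `PARAMS_PER_COMPONENT` and `parameter_count` are hypothetical.

```python
# Assumed parametrization, for illustration only:
#   Gaussian: amplitude, center, width            -> 3 parameters
#   Weibull:  amplitude, onset, shape, scale      -> 4 parameters
PARAMS_PER_COMPONENT = {"gaussian": 3, "weibull": 4}


def parameter_count(components):
    """Return the total number of free parameters in a mixture."""
    return sum(PARAMS_PER_COMPONENT[name] for name in components)


three_gaussians = parameter_count(["gaussian"] * 3)
six_gaussians = parameter_count(["gaussian"] * 6)
gaussian_plus_weibull = parameter_count(["gaussian", "weibull"])

print(three_gaussians, six_gaussians, gaussian_plus_weibull)
```

Under these assumed counts, a six-Gaussian mixture needs nearly as many parameters as the roughly 20 points available across a peak, while a two-function mixture of different types stays well below that.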
[0092]
[0093] Again, first AEMS peak 510 is a convolved peak. A second peak is convolved with peak 510 in trailing edge 512 of peak 510. The conventional chromatographic peak integrating algorithm can detect the convolved peak.
[0094] In order to re-create the second peak, the peak integrating algorithm creates a peak profile or peak profile model. This peak profile is created by fitting a mixture of six Gaussian distribution functions to an isolated peak (not shown) of the AEMS trace.
[0095] In plot 500, the peak profile is fitted to the AEMS trace to create modeled second peak 520. A comparison of trailing edge 512 of peak 510 and second peak 520 shows that modeled second peak 520 fits trailing edge 512 of peak 510 quite well.
[0096] However, the shape of modeled second peak 520 is still not correct. The shape lacks the asymmetry of an AEMS peak, which is characterized by a larger gradient for the leading edge than for the trailing edge. In other words, second peak 520 is still a symmetric peak.
[0097] AEMS peak 530 further highlights the difficulty the conventional chromatographic peak integrating algorithm has with non-convolved AEMS peaks. The integrating algorithm creates peak 540 to model actual peak 530. By fitting a mixture of six Gaussian distribution functions, trailing edge 542 of modeled peak 540 now matches the longer trailing edge 532 of actual peak 530. However, leading edge 541 of modeled peak 540 still cannot match the more sharply rising leading edge 531 of actual peak 530.
[0098]
[0099] However,
[0100] In various embodiments, AEMS peak area calculation or integration is improved by using the ejection timing data provided by the acoustic droplet ejection (ADE) device. Expected AEMS peak times corresponding to the ADE ejection times are calculated using a known delay time from the ejection of a sample to its mass analysis. These expected AEMS peak times are then used by the AEMS peak integrating algorithm to fit the peak profile to the AEMS trace.
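The calculation of expected peak times can be sketched as a constant shift of each ejection time. This is a minimal sketch, assuming a single fixed delay for all samples; the function name `expected_peak_times` and the numeric values are hypothetical.

```python
def expected_peak_times(ejection_times, delay):
    """Shift each ejection time by the known ejection-to-analysis delay."""
    return [t + delay for t in ejection_times]


# Illustrative values only: one ejection per second, 2.5 s transit delay.
ejection_times = [0.0, 1.0, 2.0, 3.0]
peak_times = expected_peak_times(ejection_times, delay=2.5)
print(peak_times)
```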
[0101] No conventional chromatographic peak integrating algorithm has used sample ejection times because the delay time through a chromatographic column is dependent on the particular sample being analyzed. In other words, the elution of samples through a chromatographic column can vary widely.
[0102] Also, in various embodiments, AEMS peak area calculation or integration is improved by using at least two different distribution functions. As should be understood, two different distribution functions can include two functions of the same type, such as two Gaussian functions, that have different parameters. As described above, using multiple distributions of the same type of distribution function can require more parameters to adjust. Using at least two distribution functions of different types, however, can provide the peak shape asymmetry using fewer parameters.
[0103] In general, an AEMS peak has a stable shape. An AEMS peak has a small peak width variation and a consistent delay with respect to the known ejection or injection time. The coefficient of variation (CV) for the area of an AEMS peak is 3-8%.
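For reference, a coefficient of variation such as the one quoted above can be computed from replicate peak areas as follows. This is a sketch only: the source does not state whether the sample or population standard deviation is used (the sample form is assumed here), and the area values are made up for illustration.

```python
import statistics


def cv_percent(areas):
    """Coefficient of variation in percent (sample standard deviation assumed)."""
    return 100.0 * statistics.stdev(areas) / statistics.mean(areas)


# Made-up replicate peak areas, arbitrary units.
replicate_areas = [100.0, 104.0, 96.0, 102.0, 98.0]
cv = cv_percent(replicate_areas)
print(round(cv, 2))
```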
[0104] In various embodiments, an AEMS peak is first modeled using a peak profile. The AEMS peak profile has an analytical curve or shape able to handle strong rising and long tailing signals. The peak profile is able to handle first derivative singularity points in a numerical optimization. The peak profile is created from an optimum mixture model that deviates from a Gaussian distribution by including at least one additional distribution function.
[0105] The peak profile is then fitted to the AEMS trace using the ADE ejection times as input to constrain the optimization. The ADE ejection times can also be used to create the peak profile. They can be used to identify an isolated AEMS peak from which the peak profile is created.
[0106]
[0107] In order to re-create the second peak, the AEMS peak integrating algorithm creates a peak profile or peak profile model. This peak profile is created by fitting a mixture of a Gaussian distribution function and a Weibull distribution function to an isolated peak (not shown) of the AEMS trace.
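A peak profile of this kind can be sketched as a weighted sum of a Gaussian density and a shifted Weibull density. The weighted-sum form, the parameter names (`w`, `mu`, `sigma`, `onset`, `shape`, `scale`), and all numeric values are assumptions for illustration; the source does not give the functional form of the mixture.

```python
import math


def gaussian_pdf(t, mu, sigma):
    """Unit-area Gaussian density."""
    return math.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def weibull_pdf(t, onset, shape, scale):
    """Unit-area Weibull density that starts at t = onset and is zero before it."""
    x = (t - onset) / scale
    if x <= 0:
        return 0.0
    return (shape / scale) * x ** (shape - 1) * math.exp(-(x ** shape))


def peak_profile(t, w, mu, sigma, onset, shape, scale):
    """Assumed mixture: w * Gaussian + (1 - w) * Weibull, so total area is 1."""
    return w * gaussian_pdf(t, mu, sigma) + (1 - w) * weibull_pdf(t, onset, shape, scale)


# Illustrative parameters, not taken from the source.
params = dict(w=0.5, mu=1.0, sigma=0.15, onset=0.7, shape=1.5, scale=0.8)
step = 0.001
ts = [i * step for i in range(4001)]            # 0 .. 4 s
ys = [peak_profile(t, **params) for t in ts]

apex = max(range(len(ys)), key=ys.__getitem__)  # index of the highest point
level = 0.1 * ys[apex]                          # 10 % of apex height
lead = next(i for i in range(apex, -1, -1) if ys[i] < level)
trail = next(i for i in range(apex, len(ys)) if ys[i] < level)

lead_width = ts[apex] - ts[lead]                # apex back to 10 % on the rising side
trail_width = ts[trail] - ts[apex]              # apex forward to 10 % on the tail
area = sum(ys) * step                           # simple Riemann sum

print(lead_width < trail_width, round(area, 2))
```

With these parameters the profile rises over a shorter interval than it decays, which is the kind of asymmetry a mixture of symmetric Gaussians struggles to reproduce with few parameters.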
[0108] In plot 700, the peak profile is fitted to the AEMS trace to create modeled second peak 720. This fitting now uses the known ejection time of the sample producing the second peak. In other words, from the known ejection of the sample producing the second peak, the expected time of the second peak is calculated. The expected time of the second peak is then used to fit the peak profile to the AEMS trace. Modeled second peak 720 is now well fitted to trailing edge 712 of peak 710.
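One simplified way to use the expected peak times in the fit is to hold each peak position fixed at its expected time and solve for the peak amplitudes jointly by linear least squares, so that a small peak on the tail of a large one is separated from it. This is a sketch under stated assumptions, not the method of the source: the profile shape and its parameters are hypothetical, positions are held fixed, and because the assumed profile has unit area each fitted amplitude equals the peak area.

```python
import math


def profile(tau, w=0.5, sigma=0.15, onset=-0.3, shape=1.5, scale=0.8):
    """Hypothetical unit-area Gaussian + Weibull profile; tau is time from the expected peak."""
    g = math.exp(-0.5 * (tau / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    x = (tau - onset) / scale
    wb = (shape / scale) * x ** (shape - 1) * math.exp(-(x ** shape)) if x > 0 else 0.0
    return w * g + (1 - w) * wb


def solve(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_areas(times, values, peak_times):
    """Least-squares amplitudes for one profile placed at each expected peak time."""
    cols = [[profile(t - tp) for t in times] for tp in peak_times]
    ata = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    aty = [sum(u * y for u, y in zip(ci, values)) for ci in cols]
    return solve(ata, aty)


# Synthetic, noise-free trace: a small peak 0.6 s after a peak five times larger.
times = [i * 0.05 for i in range(121)]
true_areas = [10.0, 2.0]
peak_times = [2.0, 2.6]
trace = [sum(a * profile(t - tp) for a, tp in zip(true_areas, peak_times)) for t in times]

areas = fit_areas(times, trace, peak_times)
print([round(a, 3) for a in areas])
```

On real, noisy data the recovered areas would only approximate the true values, and the profile parameters would first be estimated from an isolated peak.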
[0109] In related embodiments, it is also possible to adjust individual peak times using constrained time-parameter optimization. In such embodiments, the peak position can be optimized because the position is known only to a certain precision: there is some randomness in the exact elution time relative to the injection time, which is a specifically known parameter.
[0110] The figure was not created by constrained optimization, but in general it could be.
[0111] Fitting the peak profile is not limited to fitting intensities alone, that is, to scaling a peak profile placed at a predetermined time position.
[0112] Rather, fitting the profile can mean fitting all of the profile parameters, including intensity and position, with the predetermined time used in that fitting operation.
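A constrained fit of peak position and intensity around the expected time can be sketched with a bounded grid search. This is only an illustration of the idea of a time-constrained fit: the grid search, the window of plus or minus 0.1 s, the profile shape, and the name `fit_peak` are all assumptions, and a real implementation might use a bounded numerical optimizer instead.

```python
import math


def profile(tau, w=0.5, sigma=0.15, onset=-0.3, shape=1.5, scale=0.8):
    """Hypothetical unit-area Gaussian + Weibull profile; tau is time from the peak position."""
    g = math.exp(-0.5 * (tau / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
    x = (tau - onset) / scale
    wb = (shape / scale) * x ** (shape - 1) * math.exp(-(x ** shape)) if x > 0 else 0.0
    return w * g + (1 - w) * wb


def fit_peak(times, values, expected_time, window=0.1, steps=21):
    """Search positions within expected_time +/- window; amplitude is solved in closed form."""
    best = None
    for s in range(steps):
        shift = -window + 2 * window * s / (steps - 1)
        p = [profile(t - expected_time - shift) for t in times]
        amp = sum(v * y for v, y in zip(p, values)) / sum(v * v for v in p)
        sse = sum((y - amp * v) ** 2 for v, y in zip(p, values))
        if best is None or sse < best[0]:
            best = (sse, expected_time + shift, amp)
    return best[1], best[2]


# Synthetic, noise-free peak that arrives 0.03 s later than expected.
times = [i * 0.05 for i in range(121)]
trace = [5.0 * profile(t - 2.03) for t in times]

position, amplitude = fit_peak(times, trace, expected_time=2.0)
print(round(position, 2), round(amplitude, 3))
```

The window keeps the fitted position close to the expected time, so a neighboring peak cannot pull the fit away from the sample it belongs to.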
[0113] In addition, due to using a Gaussian distribution function and a Weibull distribution function, second peak 720 now has the correct AEMS peak shape. Specifically, second peak 720 now includes a larger gradient for the leading edge than for the trailing edge. Second peak 720 is now an asymmetric peak.
[0114] Modeled peak 730 for actual AEMS peak 710 is also improved. The leading edge of modeled peak 730 now shows only a slight deviation from the leading edge of actual peak 710. In addition, this deviation can be compensated for by adjusting the parameters of modeled peak 730.
[0115] The peak area calculation or integration shown in
System for Calculating the Area of a Sample Peak
[0116]
[0117] Sample introduction system 801 ejects each sample of a series of samples 811 at an ejection time. A series of ejection times 812 corresponding to series of samples 811 is produced. Sample introduction system 801 also ionizes each ejected sample of series of samples 811, producing an ion beam 831.
[0118] Mass spectrometer 802 receives ion beam 831 and mass analyzes ion beam 831 over time. A trace 841 of intensity versus time values for one or more m/z values for series of samples 811 is produced.
[0119] Processor 803 receives trace 841 and series of ejection times 812. Processor 803 calculates a series of expected peak times corresponding to series of ejection times 812 using a known delay time from ejection to mass analysis. Processor 803 identifies at least one isolated peak 842 of trace 841 using the series of expected peak times. Processor 803 calculates a peak profile 843 by fitting a mixture of at least two different distribution functions to at least one isolated peak 842. Finally, for at least one time of the series of expected peak times, processor 803 calculates an area of a peak at the one time by fitting peak profile 843 to trace 841 at the one time and calculating an area of fitted peak profile 844.
[0120]
[0121] In various embodiments, processor 803 identifies at least one isolated peak 842 of trace 841 using the series of expected peak times by using the series of expected peak times to determine if there is overlap between peaks. Specifically, processor 803 identifies one or more peaks that have a minimum overlap with adjacent peaks. This is done, for example, by calculating intensities at midpoints between peaks using the series of expected peak times. Then each peak that has an intensity at each midpoint with an adjacent peak that is less than a threshold intensity value is selected. Finally, a peak of the one or more peaks that has a minimum overlap and that has the highest intensity is selected as at least one isolated peak 842.
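The midpoint test described above can be sketched as follows. The linear interpolation of the trace, the function names `interp` and `find_isolated_peak`, and the example values are assumptions for illustration; the source does not say how the intensity at a midpoint is obtained.

```python
from bisect import bisect_left


def interp(times, values, t):
    """Linearly interpolate the trace at time t (times must be sorted ascending)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_left(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def find_isolated_peak(times, values, peak_times, threshold):
    """Return the index of the most intense peak whose neighboring midpoints are below threshold."""
    # Trace intensity halfway between each pair of adjacent expected peak times.
    mids = [interp(times, values, 0.5 * (a + b))
            for a, b in zip(peak_times, peak_times[1:])]
    candidates = []
    for k, tp in enumerate(peak_times):
        left_ok = k == 0 or mids[k - 1] < threshold
        right_ok = k == len(peak_times) - 1 or mids[k] < threshold
        if left_ok and right_ok:
            candidates.append((interp(times, values, tp), k))
    if not candidates:
        return None
    return max(candidates)[1]  # highest intensity among the minimally overlapped peaks


# Made-up trace: the peak at t = 2 is isolated, the peaks at t = 6 and t = 8 overlap.
times = list(range(11))
values = [0, 5, 50, 5, 0, 20, 100, 60, 80, 10, 0]
peak_times = [2, 6, 8]

isolated = find_isolated_peak(times, values, peak_times, threshold=10)
print(isolated)
```

In this example the most intense peak (at t = 6) is rejected because the trace does not return to baseline before the next peak, so the smaller but well-separated peak at t = 2 is selected.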
[0122] In various embodiments, expected peak times are for a peak apex. Specifically, each time of the series of expected peak times includes a time at which an apex of a peak is expected.
[0123] In various embodiments, the mixture of at least two different distribution functions is used to model an asymmetric peak. Specifically, the mixture of at least two different distribution functions produces an asymmetric peak that has a larger leading edge gradient than a trailing edge gradient.
[0124] In various embodiments, the at least two different distribution functions comprise a Gaussian distribution function. In various embodiments, the at least two different distribution functions comprise a Weibull distribution function.
[0125] In various embodiments, sample introduction system 801 includes a surface analysis system. In various embodiments, the surface analysis system can be, but is not limited to, a matrix-assisted laser desorption/ionization (MALDI) device or a laser diode thermal desorption (LDTD) device.
[0126] In various embodiments, sample introduction system 801 includes a flow injection device and an ion source device. For example, the flow injection device can be a timed valve device that injects sample into a flowing stream through a valve at each ejection time of series of ejection times 812 and the ion source device ionizes samples of the flowing stream, producing ion beam 831.
[0127] In various embodiments, the flow injection device can be a droplet dispenser that ejects series of samples 811 as droplets into a flowing stream at each ejection time of the series of ejection times and the ion source device ionizes samples of the flowing stream, producing ion beam 831.
[0128] In various embodiments, and as shown in
[0129] Mass spectrometer 802 can perform MS or MS/MS. Mass spectrometer 802 can be any type of mass spectrometer. Mass spectrometer 802 is shown as including a time-of-flight (TOF) mass analyzer, but mass spectrometer 802 can include any type of mass analyzer, including a triple quadrupole mass analyzer.
[0130] In various embodiments, processor 803 is used to send and receive instructions, control signals, and data to and from sample introduction system 801 and mass spectrometer 802. Processor 803 controls or provides instructions by, for example, controlling one or more voltage, current, or pressure sources (not shown). Processor 803 can be a separate device as shown in
[0131] Note that terms “eject,” “ejection,” “ejection times,” and the like are used throughout this written description in reference to a sample introduction system. One of ordinary skill in the art can appreciate that other terms can also be used to describe the movement of sample from the sample introduction system, such as, but not limited to, terms like “inject,” “injection,” and “injection times.”
Method for Calculating the Area of a Sample Peak
[0132]
[0133] In step 910 of method 900, a trace of intensity versus time values for one or more m/z values is received for a series of samples produced by a mass spectrometer using a processor. Also, a series of ejection times corresponding to the series of samples produced by a sample introduction system is received using the processor.
[0134] In step 920, a series of expected peak times corresponding to the series of ejection times is calculated using a known delay time from ejection to mass analysis using the processor.
[0135] In step 930, at least one isolated peak of the trace is identified using the series of expected peak times using the processor.
[0136] In step 940, a peak profile is calculated by fitting a mixture of at least two different distribution functions to the at least one isolated peak using the processor.
[0137] In step 950, for at least one time of the series of expected peak times, an area of a peak at the one time is calculated by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile using the processor.
Computer Program Product for Calculating the Area of a Sample Peak
[0138] In various embodiments, computer program products include a tangible computer-readable storage medium whose contents include a program with instructions being executed on a processor so as to perform a method for calculating the area of a sample peak of a trace produced using high-throughput sample introduction coupled mass spectrometry. This method is performed by a system that includes one or more distinct software modules.
[0139]
[0140] Analysis module 1010 receives a trace of intensity versus time values for one or more m/z values for a series of samples produced by a mass spectrometer. Analysis module 1010 also receives a series of ejection times corresponding to the series of samples produced by a sample introduction system.
[0141] Analysis module 1010 calculates a series of expected peak times corresponding to the series of ejection times using a known delay time from ejection to mass analysis. Analysis module 1010 identifies at least one isolated peak of the trace using the series of expected peak times.
[0142] Analysis module 1010 calculates a peak profile by fitting a mixture of at least two different distribution functions to the at least one isolated peak. Finally, for at least one time of the series of expected peak times, analysis module 1010 calculates an area of a peak at the one time by fitting the peak profile to the trace at the one time and calculating an area of the fitted peak profile.
[0143] Further, in describing various embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not rely on the particular order of steps set forth herein, the method or process should not be limited to the particular sequence of steps described. As one of ordinary skill in the art would appreciate, other sequences of steps may be possible. Therefore, the particular order of the steps set forth in the specification should not be construed as limitations on the claims. In addition, the claims directed to the method and/or process should not be limited to the performance of their steps in the order written, and one skilled in the art can readily appreciate that the sequences may be varied and still remain within the spirit and scope of the various embodiments.