Method and device for processing data

11321425 · 2022-05-03

Abstract

A first reference line that is a regression line obtained from data within a predetermined range including a starting point of a peak detected from data of a graph showing changes in intensity with respect to a parameter, a second reference line that is a regression line obtained from data within a predetermined range including an ending point of the peak, a third reference line connecting the starting and ending points, and one or more intermediate control points in a triangle defined by the first, second, and third reference lines are determined; and a Bezier curve between the starting point and the ending point is created and determined to be a baseline of the peak, the Bezier curve being determined by control points of the starting point, the one or more intermediate control points, and the ending point in order on a parameter axis.

Claims

1. A data processing method comprising: detecting a starting point and an ending point of a peak from data of a chromatogram or a spectrogram, that is obtained from a chromatograph or a spectrometer, showing changes in intensity with respect to a parameter; determining a first reference line that is a regression line obtained from data within a predetermined range including the starting point of the peak, a second reference line that is a regression line obtained from data within a predetermined range including the ending point of the peak, and a third reference line that is a straight line connecting the starting point of the peak and the ending point of the peak, determining one or more intermediate control points in a triangle defined by the first reference line, the second reference line and the third reference line; creating a baseline of the peak by creating a Bezier curve between the starting point of the peak and the ending point of the peak to be the baseline of the peak, the Bezier curve being determined by the starting point of the peak, the one or more intermediate control points of the peak, and the ending point of the peak in order on a parameter axis; creating a baseline-subtracted chromatogram or spectrogram by removing the baseline from the chromatogram or spectrogram; and displaying the baseline-subtracted chromatogram or spectrogram on a display for analyzing at least one component of a sample provided in the chromatograph or the spectrometer and said sample is represented in the baseline-subtracted chromatogram or spectrogram.

2. The data processing method according to claim 1, wherein the intermediate control point is an intersection of the first reference line and the second reference line.

3. The data processing method according to claim 1, wherein the intermediate control point is a point that is different from an intersection of the first reference line and the second reference line and has a same value of the parameter as the intersection.

4. A system comprising: a display; and a data processing device comprising at least one processor that implements: a peak detector that detects a starting point and an ending point of a peak from data of a chromatogram or spectrogram, that is obtained from a chromatograph or a spectrometer, showing changes in intensity with respect to a parameter; an intermediate control point determiner that determines a first reference line that is a regression line obtained from data within a predetermined range including the starting point of the peak, a second reference line that is a regression line obtained from data within a predetermined range including the ending point of the peak, a third reference line that is a straight line connecting the starting point of the peak and the ending point of the peak, and one or more intermediate control points in a triangle defined by the first reference line, the second reference line and the third reference line; a baseline determiner that creates a baseline of the peak by creating a Bezier curve between the starting point of the peak and the ending point of the peak to be the baseline of the peak, the Bezier curve being determined by the starting point of the peak, the one or more intermediate control points, and the ending point of the peak in order on a parameter axis; and a baseline-subtracted chromatogram or spectrogram creator that creates a baseline-subtracted chromatogram or spectrogram by removing the baseline from the chromatogram or spectrogram, and that causes the display to display the baseline-subtracted chromatogram or spectrogram for analyzing at least one component of a sample provided in the chromatograph or the spectrometer and said sample is represented in the baseline-subtracted chromatogram or spectrogram.

Description

BRIEF DESCRIPTION OF DRAWINGS

(1) FIG. 1 is a schematic configuration diagram showing one embodiment of a data processing device according to the present invention.

(2) FIG. 2 is a flowchart showing a basic operation in a data processing method according to the present invention.

(3) FIG. 3 is a schematic diagram showing control points that include a starting point and an ending point of one of the peaks of a chromatogram, and an intermediate control point determined based on the two points, and showing a baseline as a Bezier curve created from the control points.

(4) FIG. 4 is a schematic diagram showing another example of a method for determining the intermediate control point.

(5) FIG. 5 is a flowchart showing an example of a specific operation in step S4 in one embodiment of a data processing method according to the present invention.

(6) FIG. 6 is a flowchart showing an example of a specific operation in steps S5 and S7 in one embodiment of the data processing method according to the present invention.

(7) FIG. 7 is a schematic diagram showing another example of a method for determining the intermediate control point.

(8) FIG. 8 is a schematic diagram showing another example of a method for determining the intermediate control point.

(9) FIGS. 9A-9C are diagrams showing an example of a conventional method for determining a baseline in a chromatogram.

DESCRIPTION OF EMBODIMENTS

(10) Embodiments of a method and a device for processing data according to the present invention will be described with reference to FIG. 1 to FIG. 8.

(11) A data processing device 10 of the present embodiment is used together with a data recording unit 1, a display device 2, and an input device 3. The data recording unit 1 is a device for recording data obtained during measurement by a detector included in a liquid chromatograph, a gas chromatograph, or the like, and is composed of a hard disk, a memory, and the like. In the example shown in FIG. 1, the data recording unit 1 is provided outside the data processing device 10, but it may instead be provided in the data processing device 10. The display device 2 is a display for displaying information during data processing by the data processing device 10 and the results of the data processing. The input device 3 is a device for allowing a user to input necessary information to the data processing device 10, and is composed of a keyboard, a mouse, and the like.

(12) The data processing device 10 includes a chromatogram creator 11, a peak detector 12, an intermediate control point determiner 13, a baseline determiner 14, and a baseline-subtracted chromatogram creator 15. These components are implemented by hardware, such as a CPU and a memory of a computer, and software. Hereinafter, with reference to the flowchart in FIG. 2 and the schematic diagrams of a peak of a chromatogram in FIG. 3 and FIG. 4, one embodiment of a data processing method according to the present invention will be described, and the functions of each component of the data processing device 10 will also be described.

(13) First, the chromatogram creator 11 obtains data from the data recording unit 1, and creates a chromatogram in a conventional manner (step S1). In a case where a spectrum is processed, operations to create a chromatogram are unnecessary, and it is only necessary to obtain data from the data recording unit 1.

(14) Next, the peak detector 12 detects a peak from the chromatogram created by the chromatogram creator 11 (step S2). The detection of the peak may be performed in a conventional manner (for example, see Patent Literature 2).

(15) Subsequently, the intermediate control point determiner 13 determines, among control points of a Bezier curve used to determine a baseline for the section of the peak detected by the peak detector 12, an intermediate control point other than a starting point and an ending point, as follows (see FIG. 3).

(16) First, a starting point 211 and an ending point 212 are obtained from a peak 21 detected by the peak detector 12 (step S3).

(17) Next, a first reference line 231 that is a regression line obtained from data within a first range 221 that is a predetermined range including the starting point 211, and a second reference line 232 that is a regression line obtained from data within a second range 222 that is a predetermined range including the ending point 212 are obtained (step S4). In the example of a chromatogram 20 shown in FIG. 3, the first range 221 is set as a range outside the peak 21 from the starting point 211, and the second range 222 is set as a range outside the peak 21 from the ending point 212. However, each of the ranges may instead be set so as to span the outside and the inside of the peak 21 while including the starting point 211 or the ending point 212, or may be set inside the peak 21.
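As an illustration of step S4, a regression line over a range may be sketched as follows. This is a minimal sketch, assuming an ordinary least-squares fit of a straight line; the function name fit_reference_line and its parameters t_lo and t_hi are hypothetical and do not appear in the embodiment.

```python
import numpy as np

def fit_reference_line(t, y, t_lo, t_hi):
    """Fit y = slope * t + intercept to the samples with t_lo <= t <= t_hi.

    Assumption: the regression line is an ordinary least-squares straight line.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (t >= t_lo) & (t <= t_hi)
    if mask.sum() < 2:
        raise ValueError("need at least two samples in the range")
    # np.polyfit returns coefficients from the highest degree down.
    slope, intercept = np.polyfit(t[mask], y[mask], 1)
    return float(slope), float(intercept)
```

For the example of FIG. 3, the function would be called once with the bounds of the first range 221 and once with the bounds of the second range 222.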

(18) Then, the intersection of the first reference line 231 and the second reference line 232 is determined to be an intermediate control point 213 (step S5). In the example of FIG. 3, the intermediate control point 213 is the intersection of the first reference line 231 and the second reference line 232, but an intermediate control point has only to be within a triangle defined by the first reference line 231, the second reference line 232, and a third reference line 233 that is a straight line connecting the starting point 211 and the ending point 212. In an example of a chromatogram 20A shown in FIG. 4, a perpendicular 26 is drawn from an intersection 25 of a first reference line 231A obtained from a starting point 211A of a peak 21A and a second reference line 232A obtained from an ending point 212A down to a parameter axis (the horizontal axis of retention time in the chromatogram), and a point on the perpendicular 26 is obtained as an intermediate control point 213A.
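The intersection used in step S5 can be illustrated as follows. This is a sketch only, assuming each reference line is represented as a hypothetical (slope, intercept) pair; the function name intersect_lines and the tolerance eps are assumptions introduced for illustration.

```python
def intersect_lines(line_a, line_b, eps=1e-12):
    """Return the intersection (t, y) of two lines given as (slope, intercept).

    Returns None when the lines are parallel within the tolerance eps,
    in which case no intersection can be used as a control point.
    """
    (m1, b1), (m2, b2) = line_a, line_b
    if abs(m1 - m2) < eps:
        return None
    # Solve m1 * t + b1 == m2 * t + b2 for t.
    t = (b2 - b1) / (m1 - m2)
    return t, m1 * t + b1
```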

(19) If the chromatogram has a plurality of peaks, the operations in steps S3 to S5 described above are also performed on the other peaks. Accordingly, in step S6, it is checked whether the determination of the intermediate control points for all the peaks detected in step S2 is completed, and if it is YES (completed), then the process proceeds to step S7; if it is NO, then the process returns to step S3.

(20) In step S7, the baseline determiner 14 determines a Bezier curve that has, in order on the parameter axis, the starting point 211 as the first control point, the intermediate control point 213 as the second control point, and the ending point 212 as the third (the last) control point, and determines the Bezier curve to be a partial baseline 24 between the starting point 211 and the ending point 212 (see FIG. 3). Similarly, in the example of FIG. 4, a Bezier curve determined by control points consisting of the starting point 211A, the intermediate control point 213A, and the ending point 212A is determined to be a partial baseline 24A for the section of the peak 21A. The baseline determiner 14 also performs the same operations as described above on the sections between the starting points and the ending points of the other peaks. Through the operations described above, the baseline determiner 14 determines the Bezier curve obtained by the above method to be the baseline for the section between the starting point and the ending point of each peak, and determines the curve of the chromatogram itself to be the baseline for the other sections.
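With three control points, the Bezier curve of step S7 is a quadratic Bezier curve, B(u) = (1 − u)² P0 + 2u(1 − u) P1 + u² P2 for 0 ≤ u ≤ 1. A minimal sketch of its evaluation is given below; the function name quadratic_bezier and the parameter num are hypothetical, and each control point is assumed to be a (retention time, intensity) pair.

```python
import numpy as np

def quadratic_bezier(p0, p1, p2, num=50):
    """Evaluate a quadratic Bezier curve at num evenly spaced values of u.

    p0, p1, p2 are (t, y) control points: starting point, intermediate
    control point, and ending point. Returns an array of shape (num, 2).
    """
    u = np.linspace(0.0, 1.0, num)[:, None]
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    return (1 - u) ** 2 * p0 + 2 * u * (1 - u) * p1 + u ** 2 * p2
```

The curve passes through the starting and ending points and is tangent there to the lines toward the intermediate control point, which is why placing that point on the reference lines yields a smooth connection with the surrounding data.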

(21) The baseline-subtracted chromatogram creator 15 subtracts the baseline determined by the baseline determiner 14 from the chromatogram to create a baseline-subtracted chromatogram (step S8), and displays the baseline-subtracted chromatogram on the display of the display device 2. Thus, the operation of the data processing method of the present embodiment is finished.
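Step S8 may be sketched as follows, assuming the chromatogram is held as an array of intensities and each partial baseline has already been sampled on the same points as the chromatogram. The function name subtract_baseline and the tuple layout of peak_sections are hypothetical.

```python
import numpy as np

def subtract_baseline(y, peak_sections):
    """Build the full baseline and subtract it from the chromatogram.

    peak_sections is a list of (i_start, i_end, partial_baseline), where
    partial_baseline holds i_end - i_start + 1 intensity values.
    Outside the peak sections the chromatogram itself is the baseline,
    so the subtracted signal is zero there.
    """
    y = np.asarray(y, dtype=float)
    baseline = y.copy()
    for i_start, i_end, partial in peak_sections:
        baseline[i_start:i_end + 1] = np.asarray(partial, dtype=float)
    return y - baseline, baseline
```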

(22) According to the data processing method of the present embodiment, it is possible to determine a baseline to provide a smooth connection at the front and back of the starting point 211 and the ending point 212 of the peak 21.

(23) Next, an example of a specific operation in steps S4, S5, and S7 in the data processing method of the present embodiment will be described with reference to FIG. 5, FIG. 6, and also FIG. 3 described above. Here, a case where the chromatogram has a single peak is described, and accordingly, explanation of step S6 is omitted. However, when the chromatogram has a plurality of peaks and step S6 is therefore performed, the same operation as the following can be applied to each peak.

(24) First, the operations up to step S3 are performed. Then, in step S4-1, the retention time of the starting point of the first range 221 is defined as ls and the retention time of its ending point as le; likewise, the retention time of the starting point of the second range 222 is defined as rs and the retention time of its ending point as re (see FIG. 3). Here, the starting point 211 of the peak is set as the ending point (retention time le) of the first range 221, and the ending point 212 of the peak is set as the starting point (retention time rs) of the second range 222. That is, the first range 221 and the second range 222 are both set outside the peak. At this stage, the retention time ls of the starting point of the first range 221 and the retention time re of the ending point of the second range 222 may be any value.

(25) Then, the value of a natural number n is set to 1 (step S4-2). The value n is a counter used to stop repeating the operations of the following steps S4-3 to S4-5 once they have been repeated a predetermined maximum number n_max of times.

(26) Subsequently, a regression line is obtained from data of the chromatogram within the first range 221 (between the retention times ls and le) (step S4-3). Then, it is determined whether the residual sum of squares of the regression line and the chromatogram within the first range 221 is equal to or greater than a predetermined threshold value (step S4-4).

(27) If this residual sum of squares is equal to or greater than the predetermined threshold value (YES in step S4-4), the obtained regression line may not be appropriate as the first reference line. In this case, first, in step S4-5, it is checked whether n has reached the maximum number n_max. If n has reached the maximum number n_max (YES in step S4-5), then the process proceeds to step S4-7, and the regression line obtained in step S4-3 is determined to be the first reference line 231. On the other hand, if n has not reached the maximum number n_max (NO in step S4-5), then n is incremented by one, the retention time ls of the starting point of the first range 221 is replaced with a value of “le−(le−ls)/2” (step S4-6), and the process returns to step S4-3. The length of the first range 221 along the retention time axis after the replacement in step S4-6 is half the length before the replacement.

(28) If the residual sum of squares is less than the predetermined threshold value (NO in step S4-4), then the process proceeds to step S4-7 as it is, and the regression line obtained in step S4-3 is determined to be the first reference line 231.

(29) After step S4-7, the same operations as steps S4-2 to S4-7 are performed on the second range 222 (these operations are collectively referred to as step S4-8 in FIG. 5 and their details are omitted) to determine the second reference line 232.
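The loop of steps S4-2 to S4-7 for the first range 221 can be sketched as follows. This is an illustrative sketch only: the function name first_reference_line and the parameter name rss_threshold are hypothetical, the regression is assumed to be an ordinary least-squares line, and raising an error when fewer than two samples remain in the range is an added assumption that the embodiment does not address.

```python
import numpy as np

def first_reference_line(t, y, ls, le, rss_threshold, n_max):
    """Fit the first reference line, halving the range while the fit is poor.

    Returns ((slope, intercept), (ls, le)) with the range actually used.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = 1                                            # step S4-2
    while True:
        mask = (t >= ls) & (t <= le)
        if mask.sum() < 2:                           # assumption, see lead-in
            raise ValueError("too few samples left in the range")
        slope, intercept = np.polyfit(t[mask], y[mask], 1)   # step S4-3
        residual = y[mask] - (slope * t[mask] + intercept)
        rss = float(np.sum(residual ** 2))
        # Step S4-4 (fit is good enough) or step S4-5 (n_max reached).
        if rss < rss_threshold or n >= n_max:
            return (float(slope), float(intercept)), (ls, le)  # step S4-7
        n += 1
        ls = le - (le - ls) / 2.0                    # step S4-6: halve the range
```

Because le is fixed at the starting point of the peak, each halving discards the samples farthest from the peak, so the line is fitted to the data closest to the starting point 211.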

(30) After step S4-8, the process proceeds to step S5-1 (FIG. 6). In step S5-1, the starting point 211 of the peak is determined to be the first control point of the Bezier curve. Subsequently, in step S5-2, an intersection of the first reference line 231 and the second reference line 232 is obtained.

(31) Then, in step S5-3, it is determined whether the retention time of the intersection is between the retention time of the starting point 211 and the retention time of the ending point 212. If this determination is YES, in step S5-4, the third reference line 233 connecting the starting point and the ending point of the peak is obtained. Then, in step S5-5, a point (which may be the intersection itself) that lies on the perpendicular drawn from the intersection down to the parameter axis, on the line segment between the intersection and the third reference line 233, is determined to be the second control point (the intermediate control point 213).

(32) On the other hand, if the determination in step S5-3 is NO, the intersection is not appropriate to be used to determine the control point, and accordingly, one or more control points are determined by other methods (an example of which will be described later with reference to FIG. 7) in step S5-6.

(33) After step S5-5 or step S5-6, the ending point 212 is determined to be the last control point (step S5-7).
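Steps S5-1 to S5-7 may be illustrated by the sketch below. The function name choose_control_points and the parameter fraction are hypothetical: fraction = 1 selects the intersection itself and fraction = 0 selects the point on the third reference line 233, which is one possible way of picking a point on the segment described in step S5-5. Treating "between" as a strict inequality is also an assumption.

```python
def choose_control_points(start, end, intersection, fraction=1.0):
    """Return [start, intermediate, end], or None when step S5-3 is NO.

    start, end, and intersection are (t, y) pairs; intersection may be None
    when the reference lines are parallel.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    (t_s, y_s), (t_e, y_e) = start, end
    if intersection is None:
        return None
    t_x, y_x = intersection
    if not (t_s < t_x < t_e):            # step S5-3: NO
        return None                      # caller uses another method (S5-6)
    # Step S5-4: value of the third reference line (the chord) at t_x.
    y_chord = y_s + (y_e - y_s) * (t_x - t_s) / (t_e - t_s)
    # Step S5-5: a point on the vertical segment from chord to intersection.
    y_c = y_chord + fraction * (y_x - y_chord)
    return [start, (t_x, y_c), end]      # steps S5-1 and S5-7
```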

(34) Next, in step S7-1, a Bezier curve is created based on the three (or more) control points obtained in step S5, using twice the number of sampling points that lie between the starting point and the ending point of the peak (i.e., at half the sampling interval). In step S7-2, the Bezier curve thus created is resampled through linear interpolation to have the same number of sampling points (the same interval) as the chromatogram, and the resulting Bezier curve is determined to be the baseline. Matching the baseline to the chromatogram in the number of sampling points (interval) in this way makes it easy to subtract the baseline from the chromatogram in the next step S8. After that, the operation in step S8 is performed as described above, and the series of operations is thus finished.
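Steps S7-1 and S7-2 can be sketched as follows, assuming the Bezier curve is evaluated in its Bernstein form at evenly spaced values of the curve parameter u and that "twice the number of sampling points" means twice the number of chromatogram samples from the starting point to the ending point inclusive. The function name bezier_baseline and the argument t_peak are hypothetical.

```python
from math import comb

import numpy as np

def bezier_baseline(control_points, t_peak):
    """Sample a Bezier curve densely, then resample it on the chromatogram grid.

    control_points: (t, y) pairs ordered along the parameter axis.
    t_peak: increasing retention times of the chromatogram samples from the
    starting point to the ending point of the peak.
    """
    pts = np.asarray(control_points, dtype=float)
    t_peak = np.asarray(t_peak, dtype=float)
    n = len(pts) - 1
    # Step S7-1: twice as many samples as the chromatogram has in the peak.
    u = np.linspace(0.0, 1.0, 2 * len(t_peak))
    basis = np.stack(
        [comb(n, k) * u ** k * (1 - u) ** (n - k) for k in range(n + 1)],
        axis=1,
    )
    curve = basis @ pts                  # shape (2 * len(t_peak), 2)
    # Step S7-2: linear interpolation onto the chromatogram's retention times.
    # np.interp needs increasing x values, which holds when the control
    # points are ordered along the parameter axis.
    return np.interp(t_peak, curve[:, 0], curve[:, 1])
```

The dense sampling is needed because evenly spaced values of u do not, in general, give evenly spaced retention times.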

(35) The present invention is not limited to the above embodiment.

(36) For example, as shown in FIG. 7, when a peak 21B of a chromatogram 20B is affected by drift, the intersection of a first reference line 231B that is a regression line obtained from data within a predetermined range including a starting point 211B of the peak 21B and a second reference line 232B that is similarly obtained for an ending point 212B of the peak 21B may be far out of the range of the peak. Since it is not appropriate to obtain the intermediate control point based on the intersection that is far out of the range of the peak 21B in this way, instead, any point in a triangle defined by the first reference line 231B to the third reference line (not shown in FIG. 7 for clarity) in the vicinity of the starting point 211B is preferably determined to be the intermediate control point. In the example of FIG. 7, a point determined to be a first intermediate control point 213B is on the first reference line 231B at the position shifted from the starting point 211B to the ending point 212B side along the parameter axis by one tenth of the distance (rs−le) between the starting point 211B and the ending point 212B. Since it is not possible to successfully determine a Bezier curve closer to the ending point 212B only using the first intermediate control point 213B, the starting point 211B, and the ending point 212B, in the example of FIG. 7, additionally, a point determined to be a second intermediate control point 214B is on the second reference line 232B at the position shifted from the ending point 212B to the starting point 211B side along the parameter axis by one tenth of the distance (rs−le). Note that the second intermediate control point 214B is not in the triangle. The starting point 211B, the first intermediate control point 213B, the second intermediate control point 214B, and the ending point 212B are used as control points in this order to create a Bezier curve, and it is thus possible to determine the resulting Bezier curve to be a partial baseline 24B for the section of the peak 21B.
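The placement of the two intermediate control points in the example of FIG. 7 may be sketched as follows. The function name drift_control_points and the (slope, intercept) representation of the reference lines are hypothetical; the default fraction of 0.1 corresponds to the one tenth of the distance (rs−le) mentioned above.

```python
def drift_control_points(start, end, first_line, second_line, fraction=0.1):
    """Return four control points for a peak affected by drift.

    start and end are (t, y) pairs; first_line and second_line are
    (slope, intercept) pairs for the first and second reference lines.
    """
    (t_s, _), (t_e, _) = start, end
    shift = fraction * (t_e - t_s)       # one tenth of (rs - le) by default
    m1, b1 = first_line
    m2, b2 = second_line
    t1 = t_s + shift                     # moved toward the ending point
    t2 = t_e - shift                     # moved toward the starting point
    first_cp = (t1, m1 * t1 + b1)        # on the first reference line
    second_cp = (t2, m2 * t2 + b2)       # on the second reference line
    return [start, first_cp, second_cp, end]
```

The four points returned in this order define a cubic Bezier curve.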

(37) Further, for example, as shown in FIG. 8, two or more intermediate control points may be set in a triangle defined by the first reference line to the third reference line. In the example of FIG. 8, for the peak 21A of the same chromatogram 20A as in FIG. 4, a first intermediate control point 213C is set on the first reference line 231A closer to the ending point 212A by a time δt from the starting point 211A, and a second intermediate control point 214C is set on the second reference line 232A closer to the starting point 211A by a time δt from the ending point 212A. The first intermediate control point 213C and the second intermediate control point 214C are both arranged within (on the sides of) the triangle defined by the first reference line 231A, the second reference line 232A, and the third reference line 233A. Then, the starting point 211A, the first intermediate control point 213C, the second intermediate control point 214C, and the ending point 212A are used as control points in this order to create a Bezier curve, and it is thus possible to determine the resulting Bezier curve to be a partial baseline 24C for the section of the peak 21A. Furthermore, in the example of FIG. 8, one or more intermediate control points may be further provided between the first intermediate control point 213C and the second intermediate control point 214C in the triangle.
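Because the number of intermediate control points may vary, a Bezier curve with an arbitrary number of control points can be evaluated with de Casteljau's algorithm, which is named here as one standard technique and is not stated in the embodiment. The sketch below is illustrative only; the function name de_casteljau is hypothetical.

```python
def de_casteljau(control_points, u):
    """Evaluate a Bezier curve of any degree at parameter u in [0, 1].

    control_points: (t, y) pairs in order along the parameter axis.
    Repeatedly interpolates between neighboring points until one remains.
    """
    pts = [(float(t), float(y)) for t, y in control_points]
    while len(pts) > 1:
        pts = [
            ((1 - u) * t0 + u * t1, (1 - u) * y0 + u * y1)
            for (t0, y0), (t1, y1) in zip(pts, pts[1:])
        ]
    return pts[0]
```

With the four control points of FIG. 8 the result is a point on a cubic Bezier curve; adding further intermediate control points raises the degree without changing the procedure.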

REFERENCE SIGNS LIST

(38) 1 . . . Data Recording Unit
2 . . . Display Device
3 . . . Input Device
10 . . . Data Processing Device
11 . . . Chromatogram Creator
12 . . . Peak Detector
13 . . . Intermediate Control Point Determiner
14 . . . Baseline Determiner
15 . . . Baseline-subtracted Chromatogram Creator
20, 20A, 20B . . . Chromatogram
21, 21A, 21B . . . Peak
211, 211A, 211B, 92 . . . Starting Point
212, 212A, 212B, 93 . . . Ending Point
213, 213A . . . Intermediate Control Point
213B, 213C . . . First Intermediate Control Point
214B, 214C . . . Second Intermediate Control Point
221 . . . First Range
222 . . . Second Range
231, 231A, 231B . . . First Reference Line
232, 232A, 232B . . . Second Reference Line
233, 233A . . . Third Reference Line
24, 24A, 24B, 24C, 941, 942 . . . Partial Baseline
25 . . . Intersection of First Reference Line and Second Reference Line
26 . . . Perpendicular drawn from Intersection 25 down to Parameter Axis
91 . . . Peak Top