Analysis method for supporting classification
10401275 · 2019-09-03
Assignee
Inventors
- Barbara Kavsek (Sankt Radegund bei Graz, AT)
- Peter Lederer (Graz, AT)
- Peter Taal (HK Oost-Souburg, NL)
- Jan van den Boogaart (AG Someren, NL)
Cpc classification
G01N15/1456
PHYSICS
G01N2021/4769
PHYSICS
International classification
Abstract
The invention relates to an analysis method for supporting classification, a determination method for determining analysis parameters Y.sub.s, E.sub.i, l.sub.i, σ.sub.i for the analysis method, a computer program product, and an optical analysis system for supporting classification, with which system analysis parameters Y.sub.s, E.sub.i, l.sub.i, σ.sub.i can be defined on the basis of first and second calibration data. The parameters provide classification support according to the discriminant analysis and on the basis of measured values P.sub.i of optical characteristics i, in particular of organic dispersions, and the information content thereof for classification, in particular the diagnosis of disease; and permit a classification proposal or a diagnosis proposal in comparison with a threshold Y.sub.s.
Claims
1. An analysis method for classification assistance having the following steps: focusing an analysis beam from a light source into a test dispersion to be classified; sensing the analysis beam extending through the test dispersion for optical features (i) of the test dispersion; ascertaining measured values (P.sub.i) for the optical features (i) of the test dispersion to be classified, wherein the test dispersion is formed from a dispersion medium comprising a dispersed phase, and the dispersed phase has cells or cell components of organic materials, calculating a classification index (Y), defined by
Y(n=2)=l.sub.1*(P.sub.1−E.sub.1)/σ.sub.1+l.sub.2*(P.sub.2−E.sub.2)/σ.sub.2, wherein n=2 and a first optical feature is a measure of optical scattering in a transverse X direction in relation to the analysis beam, and a second optical feature is a measure of optical scattering in a Y direction, which is perpendicular to the X direction and also perpendicular to the analysis beam, and wherein the analysis beam extends through the test dispersion; and providing a diagnosis proposal based on the classification index that indicates whether the test dispersion originates from a sick patient, a healthy patient, or cannot be determined via a computer applying reduced computing power in response to usage of the significance parameter (l.sub.i) of the respective optical feature (i).
2. The analysis method as claimed in claim 1, wherein the mean value (E.sub.i) of the optical feature (i) and the standard deviation (σ.sub.i) of the optical feature (i) were ascertained on the basis of first calibration data, wherein the first calibration data are derived from dispersions having a negative classification, the negative classification indicating a negative diagnosis, and second calibration data, wherein the second calibration data are derived from dispersions having a positive classification, the positive classification indicating a deficiency, a parasitic infestation, or a nonnormal state.
3. The analysis method as claimed in claim 2, wherein the significance parameters (l.sub.i) are derived from the first calibration data and from the second calibration data by means of a discrimination analysis, wherein a Bayesian theory is used in the discrimination analysis.
4. The analysis method as claimed in claim 3, wherein, based on a first mean classification index (Y.sub.1) of the first calibration data and a second mean classification index (Y.sub.2) of the second calibration data, a threshold value (Y.sub.S) is usable for the classification assistance, wherein Y>Y.sub.S is assigned to positive classification and Y<Y.sub.S is assigned to negative classification.
5. The analysis method as claimed in claim 4, wherein a positive classification indicates a presence of a deficiency, a parasitic infestation, or a nonnormal state, wherein the deficiency is an anemia.
6. The analysis method as claimed in claim 5 wherein the deficiency is a Mediterranean anemia or a sickle-cell anemia.
7. The analysis method as claimed in claim 5 wherein the parasitic infestation is a leishmaniasis.
8. The analysis method as claimed in claim 4 wherein the first mean classification index (Y.sub.1) comprises a first focal point, and the second mean classification index (Y.sub.2) comprises a second focal point.
9. The analysis method as claimed in claim 2 including an ascertaining step for determining analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i), wherein for an established number (n) of optical features (i), the analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) of the standard deviation (σ.sub.i), the mean value (E.sub.i), and the significance parameters (l.sub.i) are ascertained based on the first and the second calibration data.
10. The analysis method as claimed in claim 9, wherein at least one first control parameter, comprising a minimum or a fit parameter, is calculated based on the analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) to evaluate the classification assistance.
11. The analysis method as claimed in claim 10, wherein at least a part of the analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) is adapted after an evaluation by the first control parameter, wherein an adaptation of the at least one analysis parameter (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) and the evaluation of the classification assistance are executable alternately until an improvement is no longer possible or desirable.
12. The analysis method as claimed in claim 11, wherein the at least one part of the analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) is the significance parameters (l.sub.i) or the threshold value (Y.sub.S).
13. The analysis method as claimed in claim 11, wherein as soon as an improvement is no longer possible or desired, the analysis method is executable again using a reduced number of optical features (i), wherein the analysis parameters (Y.sub.S, E.sub.i, l.sub.i, σ.sub.i) of one nonsignificant optical feature (i) or multiple nonsignificant features (i) are no longer taken into consideration, wherein a lack of significance of an optical feature (i) is determined on the basis of the significance parameter (l.sub.i).
14. The analysis method as claimed in claim 13, wherein a lack of significance exists if the influence of the optical feature (i) on the classification index (Y) is small.
15. A non-transitory computer readable storage medium comprising instructions configured to cause the computer to execute the analysis method as claimed in claim 9.
16. An optical analysis system for carrying out the analysis method as claimed in claim 9, the optical analysis system comprising: a light source for providing an analysis beam directed at an organic dispersion; a plurality of lenses for focusing the analysis beam before and after the organic dispersion; a beam splitter for splitting the analysis beam or a mirror for reflecting the analysis beam or both; one or more spectral sensors positioned to receive the analysis beam from the beam splitter, the mirror, or both; and the computer.
17. A non-transitory computer readable storage medium comprising instructions configured to cause the computer to execute the analysis method as claimed in claim 1.
18. An optical analysis system for carrying out the analysis method as claimed in claim 1, the optical analysis system comprising: a light source for providing an analysis beam directed at an organic dispersion; a plurality of lenses for focusing the analysis beam before and after the organic dispersion; a beam splitter for splitting the analysis beam or a mirror for reflecting the analysis beam or both; one or more spectral sensors positioned to receive the analysis beam from the beam splitter, the mirror, or both; and the computer.
19. The analysis method as claimed in claim 1 wherein the dispersion medium comprises blood plasma.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The invention is described and explained in greater detail hereafter on the basis of the exemplary embodiments illustrated in the figures. In the figures:
(2)
(3)
(4)
(5)
(6)
(7)
(8)
(9)
(10)
DETAILED DESCRIPTION
(11)
(12)
(13) The classification index Y can be output, for example, together with a threshold value Y.sub.S, so that the operator of the optical analysis system 10 can see how a classification can be performed. However, this is only a classification assistance, in which the analysis method outputs a classification proposal. The system may thus establish a classification proposal, specifically if the classification index Y is greater than or less than the threshold Y.sub.S. If, for example, an illness is to be diagnosed, it always remains incumbent upon the physician of the patient who provided the organic dispersion in the form of a blood sample to diagnose whether or not an illness exists. If the classification index Y is precisely at the threshold value, Y=Y.sub.S, the optical analysis system does not provide any classification assistance.
(14)
(15) The diagram shown in
(16)
(17) The malaria diagnosis assistance 20 is based on a training phase 28 and a subsequent test phase 31. In this case, the test phase 31 enables the standardization 27 of the training phase 28.
(18) Firstly, the training 22 is begun on the basis of a value allocation 25 for the analysis parameters Y.sub.s, E.sub.i, l.sub.i, σ.sub.i, wherein a probability distribution of the classification index is generated on the basis of calibration data and a scaling is performed on the basis of this probability distribution, which enables a threshold Y.sub.s to be specified. If the scaling 29 was successful, the value allocation of the analysis parameters Y.sub.s, E.sub.i, l.sub.i, σ.sub.i is retained, to be used in the analysis method.
(19) In a second phase, the prediction 23, a test is performed on the basis of test data, which preferably are not part of the calibration data, but rather have been taken from another data set or another examination or the like.
(20) On the basis of the found set of analysis parameters Y.sub.s, E.sub.i, l.sub.i, σ.sub.i, which in particular include the significance parameters l.sub.i, classifications of test dispersions are performed, for which unambiguous classifications are already provided on the basis of the calibration data.
(21) The classification 32 is the classification which can be assigned to the analysis method.
(22) In the exemplary embodiment, the sensitivity and specificity are observed, to establish the classification accuracy, or in the case of a malaria classification assistance, the diagnosis accuracy, respectively.
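(23) The specificity is defined as the probability of a negative classification, supposing that the negative classification is correct, i.e., that the dispersion is known to be negative.
(24) The sensitivity is defined as the probability of a positive classification, supposing that the positive classification is correct, i.e., that the dispersion is known to be positive.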
(25) The correctness is established, in the case of specificity and also in the case of sensitivity, by a comparison with the calibration data.
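The two quality measures can be illustrated with a minimal Python sketch. It assumes the usual reading of the definitions above (sensitivity as the fraction of known positive dispersions that receive a positive proposal, specificity as the fraction of known negative dispersions that receive a negative proposal). The function name and the example data are hypothetical and are not taken from the disclosure.

```python
def sensitivity_specificity(known, proposed):
    """Return (sensitivity, specificity) for two equal-length boolean sequences.

    known[k] is the classification known from the calibration data,
    proposed[k] is the classification proposal; True means "positive".
    """
    pairs = list(zip(known, proposed))
    true_pos = sum(1 for k, p in pairs if k and p)
    false_neg = sum(1 for k, p in pairs if k and not p)
    true_neg = sum(1 for k, p in pairs if not k and not p)
    false_pos = sum(1 for k, p in pairs if not k and p)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("both classes must occur in the known classifications")
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical example: 4 known positives, 5 known negatives.
known = [True, True, True, True, False, False, False, False, False]
proposed = [True, True, True, False, False, False, False, False, True]
sens, spec = sensitivity_specificity(known, proposed)
print(sens, spec)
```

In this example one of the four known positives and one of the five known negatives are misclassified.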
(26) In addition, other criteria are also to be taken into consideration, which can possibly be used alternatively or optionally, namely a canonical correlation 34, Wilks's lambda 35, fit parameter chi.sup.2 36, or a P-value 37 observation.
(27) After the quality control 24, it can be decided whether the analysis method can offer an efficient classification rule, which meets the quality standards of medical analysis systems.
(28)
(29) The processing sequence 50 having the steps of resampling 21, training 22, prediction 23, and the quality control 24 can be repeated several hundred or even thousands of times, typically 1000 times if personal computers are used.
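The repeated processing sequence can be sketched in Python as follows. This is only an illustration under stated assumptions: the resampling is taken to be drawing with replacement, and the training step is replaced by a toy single-feature rule (midpoint between the class means) rather than the discrimination analysis of the exemplary embodiment. All names and data values are hypothetical.

```python
import random


def processing_sequence(calibration, train, evaluate, repeats=1000, seed=0):
    """Repeat resampling, training, and prediction/quality control `repeats` times."""
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        # Resampling (assumed here: draw with replacement from the calibration data).
        resampled = [rng.choice(calibration) for _ in calibration]
        rule = train(resampled)          # training on the resampled data
        results.append(evaluate(rule))   # prediction and quality control
    return results


# Hypothetical data: (measured value, known classification is positive).
calibration = [(0.2, False), (0.4, False), (0.5, False),
               (1.6, True), (1.9, True), (2.3, True)]
test_data = [(0.3, False), (2.0, True)]  # kept separate from the calibration data


def toy_train(sample):
    """Toy stand-in for training: threshold midway between the class means."""
    neg = [v for v, positive in sample if not positive]
    pos = [v for v, positive in sample if positive]
    if not neg or not pos:
        return None  # a resample may contain only one class
    return (sum(neg) / len(neg) + sum(pos) / len(pos)) / 2


def toy_evaluate(threshold):
    """Fraction of correctly classified test dispersions, or None without a rule."""
    if threshold is None:
        return None
    hits = sum((v > threshold) == positive for v, positive in test_data)
    return hits / len(test_data)


results = processing_sequence(calibration, toy_train, toy_evaluate)
print(len(results))
```

Each pass yields one candidate rule and one quality figure, so the collected results form a distribution that the quality control can examine.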
(30) If the analysis parameters found, in particular the significance parameters l.sub.i, cannot be optimized further, the inner loop thus terminates with the processing sequence 50. The inner loop can propose a classification rule upon each execution. This is accordingly true upon the execution of the outer loop 52.
(31) A query 51 is then performed as to whether a significance parameter l.sub.i is sufficiently small that it cannot contribute usable information to the classification index. If such a significance parameter l.sub.i is found, all analysis parameters which are associated with the respective optical measured value are not considered further. These include the standard deviation σ.sub.i, the mean value E.sub.i, and the significance parameter l.sub.i of the irrelevant optical feature.
(32) If it was recognized, for example, that the number of fragments in the dispersion is unimportant for the malaria diagnosis, the significance parameter, which can be l.sub.28, for example, could thus have assumed a value which is very close to zero. Therefore, the number of fragments is subsequently no longer used for the malaria diagnosis, in that the standard deviation σ.sub.28, the mean value E.sub.28, and the significance parameter l.sub.28 are no longer taken into consideration. This can be achieved in a computer program in that l.sub.28 is set to zero (l.sub.28=0). The number n is reduced by one, whereby proceeding from n=29, for example, only n=28 optical features are still taken into consideration. In this case, the optical feature i=29 would receive the counter number 28, whereby the renewed start 52 of the outer loop can now be continued using 28 optical measured values.
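The removal of a nonsignificant feature and the renumbering of the remaining features can be sketched as follows. The cutoff `min_abs_l` is an assumption made for this illustration, since the text only requires that the significance parameter be "very close to zero". The feature names and parameter values are hypothetical.

```python
def prune_features(params, min_abs_l):
    """Drop features whose significance parameter is close to zero and renumber.

    params maps the feature counter i to a dict with the keys
    "name", "l" (significance parameter), "E" (mean), and "sigma" (std. deviation).
    """
    kept = [p for _, p in sorted(params.items()) if abs(p["l"]) >= min_abs_l]
    # Remaining features receive consecutive counter numbers 1..n again.
    return {counter: p for counter, p in enumerate(kept, start=1)}


# Hypothetical three-feature example; feature 2 carries almost no information.
params = {
    1: {"name": "feature_a", "l": 1.2, "E": 10.0, "sigma": 2.0},
    2: {"name": "feature_b", "l": 0.0004, "E": 0.5, "sigma": 0.1},
    3: {"name": "feature_c", "l": 0.7, "E": 3.0, "sigma": 1.5},
}
reduced = prune_features(params, min_abs_l=0.01)
print(sorted(reduced))            # counters after renumbering
print(reduced[2]["name"])         # former feature 3 now has counter 2
```

Dropping the entry entirely has the same effect on the classification index as setting the significance parameter to zero, but it also shortens the sum that has to be evaluated.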
(33)
(34) Furthermore, the knowledge of the actually provided classification is incorporated into the diagram insofar as known positive classifications are identified in the distribution using wide bars and negative classifications are shown in a distribution using narrow bars.
(35) To go from the upper diagram to the lower diagram, the two distributions of different classifications were considered as a probability density distribution, and scaled accordingly therefrom. This scaling can set the surface area of the respective distributions to one, for example. Other scalings are possibly also advantageous. The scalings ensure that the different number of classifications in the calibration data does not have any influence on the classification rule to be determined.
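A minimal sketch of such a scaling, assuming histograms with equal bin widths and a scaling to unit area, is given below. The counts and the bin width are hypothetical.

```python
def scale_to_unit_area(counts, bin_width):
    """Scale histogram counts so that the area under the histogram equals one."""
    area = sum(counts) * bin_width
    if area == 0:
        raise ValueError("histogram is empty")
    return [c / area for c in counts]


# Hypothetical histograms of the classification index for both classes.
# The negative class has many more samples than the positive class.
negative_counts = [2, 10, 30, 8, 0]
positive_counts = [0, 1, 3, 5, 1]
bin_width = 0.5

neg_density = scale_to_unit_area(negative_counts, bin_width)
pos_density = scale_to_unit_area(positive_counts, bin_width)
print(sum(neg_density) * bin_width, sum(pos_density) * bin_width)
```

After the scaling both distributions enclose the same area, so the class with more calibration samples no longer dominates the comparison.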
(36) Ideally, the probability density distributions D shown would be clearly separated. A threshold Y.sub.s could then be determined very easily, specifically by establishing Y.sub.s between the value ranges of the two distributions. In the present case of the malaria diagnosis, an incorrect classification is possible in a range around the established threshold Y.sub.s, because the classification index Y has only limited informative power in this range. The closer the classification index Y of a test dispersion is to the threshold value Y.sub.s, the more probable an incorrect classification is.
(37) A discrimination analysis of the distributions provided in
(38) TABLE 1
i    l.sub.i               E.sub.i               σ.sub.i
1    0.0243143292327641    3.85878279704797      0.996350482855477
2    1.05744699221905      86.2085734326568      12.953846866348
3    1.3620780210087       31.3808859769373      2.65337438754902
4    0.298681341952128     16.7073148570111      3.05622820486547
5    0.039685909568696     3.18916825922509      0.610392795099183
6    0.802606052864478     22.4169387250923      6.67725822054997
7    0.013290099049454     1.80846317896679      0.490486418856065
8    0.172647099054505     0.602234248154981     0.216930342394397
9    0.0796067332070757    7.38290569464945      11.6539118822964
10   0.197381907473051     4.55739805811808      10.0180223771838
11   0.399165319578081     19.4079367287823      22.1428759582116
12   0.0494342161138479    1.24748103597786      2.31647351118736
13   1.44026210168956      26.9894740940959      5.0938073590873
14   0.520893299030793     4.28650641605166      1.07368357499902
15   0.284413335456635     16.4872902721402      12.5135939166568
16   0.531892835621228     7.23708487084871      8.95211789259242
17   0.698987246767777     336.37508399262       233.481906156752
18   0.488427163694765     9.19492226291513      2.00576799720274
19   0.683216666590792     53.4022117638376      11.7120398800053
20   2.39463588229883      23.307653449262       2.81445425488391
21   0.493291518472633     5.742889099631        0.731852460724153
22   0.199742256744743     1.95712033394834      0.25494636006376
23   1.76951528599063      0.839687189114391     0.161239146887579
24   0.0918169689792613    1.37246023431734      0.00515681120954237
25   2.99867547853134      34.8957316577491      5.81915230603462
26   2.46145277256995      17.5466706845018      3.58836148561811
27   0.0213275666681148    8.33807254059041      7.74938961818539
28   0.477404018065088     0.0352406872693727    0.136793871627217
29   0272433630134861      0.016729979704797     0.0197033898754757
The above table lists the analysis parameters l.sub.i, E.sub.i, σ.sub.i, wherein both the associated measured value P.sub.i and also the listed analysis parameters l.sub.i, E.sub.i, σ.sub.i are assigned to the respective optical feature i, and wherein the parameter i is to be understood as the number of the respective optical feature.
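How the tabulated parameters enter the classification index can be illustrated with a short Python sketch. It assumes that the index is the sum over the features of l.sub.i*(P.sub.i−E.sub.i)/σ.sub.i, as in the two-feature formula of claim 1. Only rows 25 and 26 of Table 1 are used, the measured values are hypothetical, and the resulting partial sum must not be compared with a threshold derived for all 29 features.

```python
def classification_index(measured, params):
    """Sum of l_i * (P_i - E_i) / sigma_i over all features present in params.

    measured maps feature number i to the measured value P_i,
    params maps i to a tuple (l_i, E_i, sigma_i).
    """
    return sum(l * (measured[i] - mean) / sigma
               for i, (l, mean, sigma) in params.items())


# Rows 25 and 26 of Table 1: (l_i, E_i, sigma_i).
TABLE_1_SUBSET = {
    25: (2.99867547853134, 34.8957316577491, 5.81915230603462),
    26: (2.46145277256995, 17.5466706845018, 3.58836148561811),
}

# Hypothetical measured values: P_25 equals its mean, P_26 lies one
# standard deviation above its mean.
measured = {
    25: 34.8957316577491,
    26: 17.5466706845018 + 3.58836148561811,
}

y_partial = classification_index(measured, TABLE_1_SUBSET)
print(y_partial)
```

With these inputs the first term vanishes and the second term contributes approximately l.sub.26, which shows that each significance parameter weights the deviation of its feature in units of the standard deviation.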
(39) TABLE 2
i    P.sub.i     ADVIA parameter/    Definition of the measured value P.sub.i
                 optical feature     for the respective feature i
1    P.sub.1     RBC                 Count of the red blood cells from the RBC/PLT channel
2    P.sub.2     MCV                 Mean cell volume of the counted red blood cells (RBC)
3    P.sub.3     CHCM                Mean hemoglobin concentration (g/dL) of the cells (from RBC)
4    P.sub.4     RDW                 Density distribution breadth of the red blood cells (from RBC)
5    P.sub.5     HDW                 Breadth of the hemoglobin density distribution (g/dL)
6    P.sub.6     plat_mode           PLT mode
7    P.sub.7     mu_fit              μ-FIT
8    P.sub.8     sig_fit             σ-FIT
9    P.sub.9     micro_pcnt          Number of the RBC cells in percent having a volume of less than 60 fL
10   P.sub.10    macro_pcnt          Number of the RBC cells in percent having a volume of more than 120 fL
11   P.sub.11    hypo_pcnt           Number of the red blood cells in percent having a hemoglobin concentration of less than 28 g/dL
12   P.sub.12    hyper_pcnt          Number of the red blood cells in percent having a hemoglobin concentration of more than 41 g/dL
13   P.sub.13    H_mean              Mean value of the hemoglobin concentration in picograms
14   P.sub.14    H_deviation         Breadth of the hemoglobin content distribution in picograms
15   P.sub.15    VHC_covar           Covariance of the hemoglobin density distribution
16   P.sub.16    MN_PMN_valley       The BASO MN/PMN minimum as a depression between the MN and PMN clusters of the BASO cytogram
17   P.sub.17    PLT                 Count of the recognized blood platelets
18   P.sub.18    MPV                 Mean blood platelet volume
19   P.sub.19    PDW                 Blood platelet distribution breadth
20   P.sub.20    MPC                 Mean blood platelet component concentration
21   P.sub.21    PCDW                Distribution breadth of the blood platelet component concentration
22   P.sub.22    MPM                 Mean blood platelet dry mass
23   P.sub.23    PMDW                Distribution breadth of the blood platelet dry mass
24   P.sub.24    PLT_Mean_n          Mean of a refractive index of the blood platelets (from PLT measurement)
25   P.sub.25    PLT_Mean_X          Mean value of the diffraction deflection in X direction
26   P.sub.26    PLT_Mean_Y          Mean value of the diffraction deflection in Y direction
27   P.sub.27    Large_PLT           Large blood platelets (number)
28   P.sub.28    RBC_Fragments       Number of the counted fragments in the PLT diffraction cytogram
29   P.sub.29    RBC_Ghosts          Number of the events in the PLT diffraction cytogram without assignment
ADVIA is a hematological analysis system having automated measured value ascertainment. The measured values P.sub.i are indicated with the respective abbreviations used in the so-called ADVIA export files, and are defined in the last column.
(40) At this point, a reference is established to WO 00/58727 (PCT/US00/08512), the content of the disclosure of which is to be integrated by explicit reference into this intellectual property, in particular insofar as it relates to the measured values and optical features of the disclosed hematological analysis system used therein.
(41) In the exemplary embodiment of the tables, a threshold Y.sub.s=1.822 is proposed for the malaria diagnosis assistance. A diagnosis proposal can therefore be provided for each ascertained classification index of a blood sample with a small residual risk, namely with Y>1.822 a positive malaria finding and with Y<1.822 a negative malaria finding. Y=1.822 does not provide any assistance.
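The comparison with the threshold can be written as a small three-way decision. The sketch below uses the threshold Y.sub.s=1.822 of the exemplary embodiment. The function name and the returned labels are hypothetical, and the result is only a proposal for the physician, not a diagnosis.

```python
Y_S = 1.822  # threshold proposed in the exemplary embodiment


def diagnosis_proposal(y, threshold=Y_S):
    """Return a proposal for a classification index y, or None at the threshold."""
    if y > threshold:
        return "positive malaria finding"
    if y < threshold:
        return "negative malaria finding"
    return None  # Y equal to the threshold: no assistance is provided


for y in (2.4, 0.7, 1.822):
    print(y, diagnosis_proposal(y))
```

Returning no proposal at exactly Y=Y.sub.s mirrors the statement that the system does not provide any assistance in that case.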
(42) During the quality control 24 of the malaria diagnosis assistance, the specificity distribution 41 and the sensitivity distribution 40 were ascertained and plotted against one another in a two-dimensional graph. If the distributions are divided into 4 equal parts, the two middle quarters can thus be taken into consideration for this observation. In other words, the IQR 42 (interquartile range) of the distribution 40 and the IQR 43 of the distribution 41 are taken into consideration. To achieve the best results, it is recommended that classification rules be selected which are established within the intersection range S shown, to meet the highest medical requirements.
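The interquartile ranges and the check against the intersection range can be sketched as follows. This is an interpretation made for illustration: a classification rule is taken to lie in the intersection range S if its sensitivity falls within the IQR of the sensitivity distribution and its specificity falls within the IQR of the specificity distribution. The run results are hypothetical, and the quartiles follow the default method of Python's `statistics.quantiles`.

```python
import statistics


def iqr_bounds(values):
    """Return the first and third quartile (the bounds of the two middle quarters)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return q1, q3


def in_intersection_range(sensitivity, specificity, sens_iqr, spec_iqr):
    """True if a rule lies within both interquartile ranges (assumed meaning of S)."""
    return (sens_iqr[0] <= sensitivity <= sens_iqr[1]
            and spec_iqr[0] <= specificity <= spec_iqr[1])


# Hypothetical sensitivities and specificities from repeated processing sequences.
sens_runs = [0.91, 0.93, 0.94, 0.95, 0.96, 0.97, 0.98]
spec_runs = [0.95, 0.96, 0.97, 0.975, 0.98, 0.985, 0.99]

sens_iqr = iqr_bounds(sens_runs)
spec_iqr = iqr_bounds(spec_runs)
print(sens_iqr, spec_iqr)
print(in_intersection_range(0.95, 0.97, sens_iqr, spec_iqr))
```

Restricting the selection to the middle quarters discards rules whose quality figures are outliers in either direction, which may be unrepresentative of the resampled calibration data.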
(43) In the case of a simple mean value calculation, the sensitivity reaches 96.25%, and upon an observation of the IQR, it reaches a range from 59% to 97.5%. The specificity reaches an average of 98.29% and upon an observation of the IQR reaches a range from 97.98% to 89.60%. A correct classification is therefore extremely probable, wherein incorrect classification proposals can be excluded with only a small residual risk.
(44)
(45) In summary, the invention relates to an analysis method for classification assistance, an ascertainment method for determining analysis parameters Y.sub.S, E.sub.i, l.sub.i, σ.sub.i for the analysis method, a computer program product, and an optical analysis system for classification assistance, in which, based on first and second calibration data, analysis parameters Y.sub.S, E.sub.i, l.sub.i, σ.sub.i can be established, which provide a classification assistance according to rules of discrimination analysis and which, on the basis of measured values P.sub.i of optical features i, in particular of organic dispersions, and the information content thereof for the classification, in particular for illness diagnoses, enable a classification proposal or diagnosis proposal in comparison to a threshold Y.sub.S.