METHOD FOR RAPIDLY DETERMINING GRADE OF BLACK TEA
20220291181 · 2022-09-15
Inventors
- Huimei Cai (Hefei, CN)
- Mengying Shuai (Hefei, CN)
- Xiaochun Wan (Hefei, CN)
- Daxiang Li (Hefei, CN)
- Chuanyi Peng (Hefei, CN)
CPC classification
- G01N30/8675 (PHYSICS)
- G01N30/8682 (PHYSICS)
International classification
Abstract
A method for determining a grade of black tea by HPLC detection belongs to the field of tea grade determination. The specific steps are as follows: adding known black tea powder samples of different grades into boiling water at 95-100° C. for extraction, and filtering with a filter membrane with a pore size in a range of 0.20-0.25 μm to obtain black tea sample liquid; measuring contents of ten components by a peak area normalization method; standardizing data of the contents of the ten components in a black tea sample solution; carrying out unsupervised principal component analysis; carrying out supervised partial least squares discriminant analysis; carrying out hierarchical clustering analysis on the basis of the partial least squares discriminant analysis; and finally establishing a tea grade discrimination model based on HPLC. The method is simple, accurate and efficient, and its effectiveness is not affected by the variety of black tea.
Claims
1. A method for black tea grade determination, comprising the following steps: (1) preparing high-performance liquid chromatography (HPLC) standard solutions of ten kinds of components with different concentrations according to a gradient, and plotting standard curves; (2) adding black tea powder samples of different known grades individually into boiling water at 95-100° C. for extraction, cooling to room temperature after the extraction, followed by centrifugation, and then performing filtration of a supernatant with a filter membrane with a pore size in a range of 0.20-0.25 μm to obtain black tea sample solutions; (3) separating and identifying components of the black tea sample solutions of the known grades by HPLC, and measuring contents of the ten kinds of components by a peak area normalization method; (4) standardizing data of the contents of the ten kinds of components in the black tea sample solutions, then carrying out unsupervised principal component analysis, and subsequently carrying out supervised partial least squares discriminant analysis; (5) carrying out hierarchical clustering analysis on the basis of the partial least squares discriminant analysis to obtain re-classification charts of different grades of different kinds of black teas, and establishing a tea grade discrimination model based on HPLC; and (6) processing a black tea powder sample to be tested by step (2) and step (3) once to obtain data associated with the black tea powder sample to be tested, and importing the obtained data into the tea grade discrimination model in step (5) to determine a grade of the black tea to be tested; wherein the ten kinds of components include epigallocatechin, epicatechin, epigallocatechin gallate, epicatechin gallate, theaflavin, theaflavin-3-gallate, theaflavin-3′-gallate, theaflavin-bis-gallate, theobromine and thein.
2. The method according to claim 1, wherein during separating and identifying components of the black tea sample solutions of the known grades by HPLC, a mobile phase of HPLC separation is acetonitrile and ultrapure water, and a flow rate thereof is 0.6-1.0 mL/min.
3. The method according to claim 1, wherein the standardizing data of the contents of the ten kinds of components in the black tea sample solutions and then carrying out unsupervised principal component analysis specifically comprises: (1) setting up n samples and p indexes to obtain a data matrix X = (x_ij)_{n×p}, i = 1, 2, …, n, j = 1, 2, …, p, where x_ij represents the j-th index value of the i-th sample; (2) performing a standardized transformation of the data with the Z-score method: Z_ij = (x_ij − x̄_j)/S_j; (3) finding the correlation matrix R of the index data: R = (r_jk)_{p×p}, j = 1, 2, …, p, k = 1, 2, …, p, where r_jk is the correlation coefficient between index j and index k; (4) finding the eigenvectors of the eigenvalues of the correlation matrix R to determine the principal components: obtaining p eigenvalues λ_g (g = 1, 2, …, p) from the characteristic equation |λI_p − R| = 0, ranking them in order of magnitude as λ_1 ≥ λ_2 ≥ … ≥ λ_p ≥ 0, wherein λ_g is the variance of the g-th principal component and its magnitude describes the role of that principal component in describing the evaluated object; according to the characteristic equation, each eigenvalue corresponds to an eigenvector L_g = (l_g1, l_g2, …, l_gp), g = 1, 2, …, p; the standardized index variables are transformed into the principal components as F_g = l_g1 Z_1 + l_g2 Z_2 + … + l_gp Z_p (g = 1, 2, …, p), where F_1 is called the first principal component, F_2 the second principal component, …, and F_p the p-th principal component; and (5) calculating a variance contribution rate and determining a number of the principal components: the number of principal components equals the number of original indexes, and the more original indexes there are, the more troublesome comprehensive evaluation becomes; principal component analysis therefore selects as few as possible, k (k < p), principal components for comprehensive evaluation while losing as little information as possible.
4. The method according to claim 1, wherein an algorithm of the partial least squares discriminant analysis specifically comprises: (1) a modeling method: setting up n samples, with q dependent variables and p independent variables; forming data tables X and Y for the independent and dependent variables; using partial least squares regression to extract components t and u from X and Y respectively, with t and u carrying as much information as possible about the variance in their respective data tables and being correlated to the maximum extent possible; after the first component has been extracted, implementing partial least squares regression of X on t and of Y on t, respectively; if the regression equation has reached a target accuracy, the algorithm terminates; otherwise, a second round of component extraction is performed using the residual information from the interpretation of X by t and of Y by t, and this is repeated until the target accuracy is achieved; if multiple components are eventually extracted for X, partial least squares regression is then performed by regressing y_k on these components of X, which is finally expressed as a regression equation of y_k on the original independent variables; (2) marking the data matrix obtained from X after standardization as E_0 = (E_01, …, E_0p)_{n×p} and the matrix corresponding to Y as F_0 = (F_01, …, F_0q)_{n×q}; noting that t_1 is a first component of E_0, t_1 = E_0 w_1, where w_1 is a first axis of E_0 and is a unit vector, i.e., ∥w_1∥ = 1; marking u_1 as a first component of F_0, u_1 = F_0 c_1, where c_1 is a first axis of F_0 and ∥c_1∥ = 1; then solving a following optimization problem, noting that θ_1 = w_1′ E_0′ F_0 c_1 is precisely the objective function value of the optimization problem; using the Lagrange multiplier method, obtaining E_0′ F_0 F_0′ E_0 w_1 = θ_1² w_1 and F_0′ E_0 E_0′ F_0 c_1 = θ_1² c_1; therefore w_1 is the unit eigenvector corresponding to the maximum eigenvalue θ_1² of the matrix E_0′ F_0 F_0′ E_0, and c_1 is the unit eigenvector corresponding to the maximum eigenvalue θ_1² of the matrix F_0′ E_0 E_0′ F_0; the components t_1 = E_0 w_1 and u_1 = F_0 c_1 are obtained after finding the axes w_1 and c_1; then finding the regression equations of E_0 and F_0 on t_1: E_0 = t_1 p_1′ + E_1 and F_0 = t_1 r_1′ + F_1, where the regression coefficient vectors are p_1 = E_0′ t_1/∥t_1∥² and r_1 = F_0′ t_1/∥t_1∥², and E_1 and F_1 are residual matrices of the two regression equations respectively; (3) replacing E_0 and F_0 with the residual matrices E_1 and F_1, and then finding second axes w_2 and c_2 and second components t_2 and u_2, where t_2 = E_1 w_2, u_2 = F_1 c_2, and θ_2 = ⟨t_2, u_2⟩ = w_2′ E_1′ F_1 c_2; w_2 is the unit eigenvector corresponding to the maximum eigenvalue of the matrix E_1′ F_1 F_1′ E_1, while c_2 is the unit eigenvector corresponding to the maximum eigenvalue θ_2² of the matrix F_1′ E_1 E_1′ F_1; calculating the regression coefficients p_2 = E_1′ t_2/∥t_2∥² and r_2 = F_1′ t_2/∥t_2∥², which give the regression equations E_1 = t_2 p_2′ + E_2 and F_1 = t_2 r_2′ + F_2; in this way, if a rank of X is A, then E_0 = t_1 p_1′ + … + t_A p_A′ and F_0 = t_1 r_1′ + … + t_A r_A′ + F_A; and (4) cross-validity: extracting one more component is worthwhile when the prediction error sum of squares in the case with one more component and one sample left out, divided by the error sum of squares in the case with one fewer component, is less than 0.95²; otherwise it is not worthwhile.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
DETAILED DESCRIPTION OF EMBODIMENTS
[0038] For a better understanding of the technical features, objectives and beneficial effects of the present application, a further description of the application is given below in connection with specific embodiments, but the application is not limited to the present embodiments.
[0039] In all the embodiments of the application, the tea grade is determined according to the Chinese national standard (GB/T 23776-2009).
[0040] Tanyang Gongfu black tea (Special-grade: 1, 2, 3; Grade I: 4, 5, 6; Grade II: 7, 8, 9; Grade III: 10, 11, 12; Grade IV: 13, 14, 15);
[0041] Keemun black tea (Special-grade: 1, 2, 3; Grade I: 4, 5, 6; Grade II: 7, 8, 9; Grade III: 10, 11, 12; Grade IV: 13, 14, 15; Grade V: 16, 17, 18; Grade VI: 19, 20, 21);
[0042] Sichuan Leshan black tea (Special-grade: 1, 2, 3; Grade I: 4, 5, 6; Grade II: 7, 8, 9).
Embodiment 1
[0043] Weighing 0.200 g of evenly ground Tanyang Gongfu black tea powder sample into a centrifuge tube, adding 10 mL of distilled water at 80° C., mixing evenly, immediately moving the tube into boiling water at 95-100° C. for extraction for 10 min, stirring once every 3-5 min, letting it cool to room temperature after extraction, and performing centrifugation; taking the supernatant, filtering it through a filter membrane with a pore size of 0.20 μm, and putting the filtrate into a liquid vial; using HPLC to separate and identify the components of tea, where the mobile phase is acetonitrile and ultrapure water with a flow rate of 0.6-1.0 mL/min, and the contents are measured by the peak area normalization method; importing the data into SIMCA software, standardizing the data and performing unsupervised principal component analysis, which makes the classification result more objective because no pre-classification is required; then carrying out supervised partial least squares discriminant analysis, which gives rather accurate results by enlarging the differences between groups and reducing the differences within groups, and yields relevant data such as the contribution rate and predictive ability, with a cumulative predictive ability (Q²) and a cumulative variance contribution rate (R²Y) close to 1.0 indicating a good model; further, carrying out hierarchical cluster analysis on the basis of the partial least squares discriminant analysis, where hierarchical cluster analysis classifies subjects according to their properties, with larger differences in properties corresponding to larger distances; it can be seen directly that the cluster analysis gives a re-classification chart of different grades of different kinds of black tea; the VIP value in partial least squares discriminant analysis quantifies the contribution of each variable to the classification, and a VIP value greater than 1 indicates that the variable differs significantly between black teas of different classes and grades; and finally, establishing a tea grade discrimination model based on HPLC; the VIP values show that theobromine, thein and EGC can be used to classify Tanyang Gongfu black tea into Special-grade, Grade I, Grade II and Grade III; the results are consistent with those determined by GB/T 23776-2009, proving that this method for grading black tea is accurate and valid.
[0044] Importing the data into the multivariate analysis software SPSS, standardizing the data and then performing unsupervised principal component analysis, which makes the classification result more objective because no pre-classification is required; then carrying out supervised partial least squares discriminant analysis, which gives rather accurate results by enlarging the differences between groups and reducing the differences within groups, and yields relevant data such as the contribution rate and predictive ability, with a cumulative predictive ability (Q²) and a cumulative variance contribution rate (R²Y) close to 1.0 indicating a good model; further, carrying out hierarchical clustering analysis on the basis of the partial least squares discriminant analysis, where hierarchical clustering analysis classifies subjects according to their properties, with larger differences in properties corresponding to larger distances; it can be seen directly that the clustering analysis gives the re-classification charts of different grades and kinds of black tea; the VIP values in partial least squares discriminant analysis can be used to quantify the contribution of each variable to the classification, with a VIP value greater than 1 indicating significant differences among different types and grades of black tea; finally, establishing a tea grade discrimination model based on HPLC.
[0045] The data standardization and unsupervised principal component analysis methods of the multivariate analysis software are as follows:
[0046] (1) standardization of original index data
[0047] setting up n samples and p indicators (also referred to as indexes), and obtaining the data matrix X = (x_ij)_{n×p}, where i = 1, 2, …, n indexes the n samples, j = 1, 2, …, p indexes the p indicators, and x_ij stands for the j-th index value of the i-th sample;
[0048] (2) standardized transformation of the data with the Z-score method: Z_ij = (x_ij − x̄_j)/S_j, where x̄_j and S_j are the mean and standard deviation of the j-th indicator;
[0049] (3) finding the correlation matrix of the index data: R = (r_jk)_{p×p}, j = 1, 2, …, p, k = 1, 2, …, p, where r_jk is the correlation coefficient between index j and index k;
[0050] (4) finding the eigenvectors of the eigenvalues of the correlation matrix R to determine the principal components: obtaining the p characteristic roots λ_g (g = 1, 2, …, p) from the characteristic equation |λI_p − R| = 0, ranking them in order of magnitude as λ_1 ≥ λ_2 ≥ … ≥ λ_p ≥ 0, where λ_g is the variance of the g-th principal component and its magnitude describes the role of that principal component in describing the evaluated object; according to the characteristic equation, each characteristic root corresponds to a characteristic vector L_g = (l_g1, l_g2, …, l_gp), g = 1, 2, …, p; the standardized index variables are transformed into the principal components as F_g = l_g1 Z_1 + l_g2 Z_2 + … + l_gp Z_p (g = 1, 2, …, p), where F_1 is called the first principal component, F_2 the second principal component, …, and F_p the p-th principal component;
[0051] (5) finding the variance contribution rate and determining the number of principal components: in general the number of principal components equals the number of original indicators, and the more original indicators there are, the more troublesome comprehensive evaluation becomes; principal component analysis therefore selects as few as possible, k (k < p), principal components for comprehensive evaluation while keeping the amount of lost information as low as possible.
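The five steps above can be sketched in Python with NumPy; the 0.85 cumulative-contribution cutoff and the function name are illustrative choices, not taken from the patent.

```python
import numpy as np

def pca_grade_indices(X, var_threshold=0.85):
    """PCA per steps (1)-(5): Z-score, correlation matrix,
    eigendecomposition, variance contribution rate.

    X: data matrix (n samples x p indicators)."""
    # (2) Z-score standardization: Z_ij = (x_ij - mean_j) / s_j
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # (3) correlation matrix R of the standardized indicators
    R = np.corrcoef(Z, rowvar=False)
    # (4) eigenvalues/eigenvectors of R, sorted in descending order
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # principal components F_g = l_g1 Z_1 + ... + l_gp Z_p
    F = Z @ eigvecs
    # (5) variance contribution rate; keep the first k components
    contrib = eigvals / eigvals.sum()
    k = int(np.searchsorted(np.cumsum(contrib), var_threshold) + 1)
    return F[:, :k], contrib[:k]
```

With 15 samples and the ten measured components, `pca_grade_indices` would return the score matrix used for the subsequent discriminant analysis.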
[0052] The steps in the algorithm for partial least squares discriminant analysis are as follows:
[0053] (1) modeling method: setting up n samples, with q dependent variables and p independent variables; forming data tables X and Y for the independent and dependent variables; using partial least squares regression to extract components t and u from X and Y respectively, with t and u carrying as much information as possible about the variance in their respective data tables and being correlated to the maximum extent possible; after the first component has been extracted, implementing partial least squares regression of X on t and of Y on t, respectively; if the regression equation has reached satisfactory accuracy, the algorithm terminates; otherwise, a second round of component extraction is performed using the residual information from the interpretation of X by t and of Y by t, and this is repeated until satisfactory accuracy is achieved; if multiple components are eventually extracted for X, partial least squares regression is then performed by regressing y_k on these components of X, which is finally expressed as a regression equation of y_k on the original independent variables;
[0054] (2) marking the data matrix obtained from X after standardization as E_0 = (E_01, …, E_0p)_{n×p} and the matrix corresponding to Y as F_0 = (F_01, …, F_0q)_{n×q}; noting that t_1 is the first component of E_0, t_1 = E_0 w_1, where w_1 is the first axis of E_0 and is a unit vector, i.e., ∥w_1∥ = 1; marking u_1 as the first component of F_0, u_1 = F_0 c_1, where c_1 is the first axis of F_0 and ∥c_1∥ = 1; then solving the following optimization problem, noting that θ_1 = w_1′ E_0′ F_0 c_1 is precisely the objective function value of the optimization problem; using the Lagrange multiplier method, obtaining E_0′ F_0 F_0′ E_0 w_1 = θ_1² w_1 and F_0′ E_0 E_0′ F_0 c_1 = θ_1² c_1; therefore w_1 is the unit eigenvector corresponding to the maximum eigenvalue θ_1² of the matrix E_0′ F_0 F_0′ E_0, and c_1 is the unit eigenvector corresponding to the maximum eigenvalue θ_1² of the matrix F_0′ E_0 E_0′ F_0; the components t_1 = E_0 w_1 and u_1 = F_0 c_1 can be obtained after finding the axes w_1 and c_1; then finding the regression equations of E_0 and F_0 on t_1: E_0 = t_1 p_1′ + E_1 and F_0 = t_1 r_1′ + F_1, where the regression coefficient vectors are p_1 = E_0′ t_1/∥t_1∥² and r_1 = F_0′ t_1/∥t_1∥², and E_1 and F_1 are the residual matrices of the two equations respectively;
[0055] (3) replacing E_0 and F_0 with the residual matrices E_1 and F_1, and then finding the second axes w_2 and c_2 and the second components t_2 and u_2, where t_2 = E_1 w_2, u_2 = F_1 c_2, and θ_2 = ⟨t_2, u_2⟩ = w_2′ E_1′ F_1 c_2; w_2 is the unit eigenvector corresponding to the maximum eigenvalue of the matrix E_1′ F_1 F_1′ E_1, while c_2 is the unit eigenvector corresponding to the maximum eigenvalue θ_2² of the matrix F_1′ E_1 E_1′ F_1; calculating the regression coefficients p_2 = E_1′ t_2/∥t_2∥² and r_2 = F_1′ t_2/∥t_2∥², which give the regression equations E_1 = t_2 p_2′ + E_2 and F_1 = t_2 r_2′ + F_2; in this way, if the rank of X is A, then E_0 = t_1 p_1′ + … + t_A p_A′ and F_0 = t_1 r_1′ + … + t_A r_A′ + F_A;
[0056] (4) cross-validity: extracting one more component is worthwhile if the prediction error sum of squares (all dependent variables and predicted samples combined) in the case with one more component and one sample left out, divided by the error sum of squares (all dependent variables and samples combined) in the case with one fewer component, is less than 0.95².
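The component extraction in steps (2)-(3) can be sketched with NumPy as follows, assuming standardized X and Y blocks (the function name and return layout are hypothetical; the cross-validity test of step (4) is noted but not implemented):

```python
import numpy as np

def pls_extract_components(X, Y, n_components=2):
    """Extract PLS components per steps (2)-(3): each round takes the
    axis w as the dominant unit eigenvector of E'FF'E, forms t = E w,
    regresses both blocks on t, and replaces E, F with the residuals.
    (Step (4) would stop adding components once PRESS/SS >= 0.95**2.)"""
    E = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # E_0
    F = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)   # F_0
    T, W, P, R = [], [], [], []
    for _ in range(n_components):
        M = E.T @ F @ F.T @ E
        eigvals, eigvecs = np.linalg.eigh(M)           # ascending order
        w = eigvecs[:, -1]                             # unit eigenvector of max eigenvalue
        t = E @ w                                      # component t = E w
        tt = t @ t                                     # ||t||^2
        p = E.T @ t / tt                               # p = E' t / ||t||^2
        r = F.T @ t / tt                               # r = F' t / ||t||^2
        E = E - np.outer(t, p)                         # residual E_h
        F = F - np.outer(t, r)                         # residual F_h
        T.append(t); W.append(w); P.append(p); R.append(r)
    return (np.column_stack(T), np.column_stack(W),
            np.column_stack(P), np.column_stack(R))
```

Because each deflation removes the part of E explained by t, successive components t_1, t_2, … come out mutually orthogonal, which is what lets the final model be written as E_0 = t_1 p_1′ + … + t_A p_A′.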
Embodiment 2
[0057] Weighing 0.200 g of evenly ground Keemun black tea powder sample into a centrifuge tube, adding 10 mL of distilled water at 80° C., mixing well, immediately moving the tube into boiling water at 95-100° C. for extraction for 10 min, stirring once every 3-5 min, letting it cool to room temperature after extraction, followed by centrifugation; taking the supernatant, filtering it through a filter membrane with a pore size of 0.22 μm, and putting the filtrate into a liquid vial; using HPLC to separate and identify the components of tea, wherein the mobile phase is acetonitrile and ultrapure water, the flow rate is 0.6-1.0 mL/min, and the contents are measured by the peak area normalization method; importing the data into SIMCA software, standardizing the data and performing unsupervised principal component analysis, which makes the classification result more objective because no pre-classification is required; then carrying out supervised partial least squares discriminant analysis, which gives rather accurate results by enlarging the differences between groups and reducing the differences within groups, and yields relevant data such as the contribution rate and predictive ability, with a cumulative predictive ability (Q²) and a cumulative variance contribution rate (R²Y) close to 1.0 indicating a good model; further, carrying out hierarchical clustering analysis on the basis of the partial least squares discriminant analysis, wherein hierarchical clustering analysis classifies subjects according to their properties, with larger differences in properties corresponding to larger distances; it can be seen directly that the clustering analysis gives the re-classification charts of different grades and kinds of black tea; the VIP values in partial least squares discriminant analysis can be used to quantify the contribution of each variable to the classification, with a VIP value greater than 1 indicating significant differences among different types and grades of black tea; finally, establishing a tea grade discrimination model based on HPLC, wherein the VIP values show that thein and ECG can be used to classify Keemun black tea into high grades (Special-grade, Grade I, Grade II and Grade III) and low grades (Grade IV, Grade V and Grade VI); the results are consistent with those determined by GB/T 23776-2009, proving that this method for grading black tea is accurate and effective.
[0058] Among them, the data standardization of the multivariate analysis software, the unsupervised principal component analysis method and the algorithm of the partial least squares discriminant analysis are the same as those in Embodiment 1.
Embodiment 3
[0059] Weighing 0.200 g of evenly ground Sichuan Leshan black tea powder sample into a centrifuge tube, adding 10 mL of distilled water at 80° C., mixing well, immediately moving the tube into boiling water at 95-100° C. for extraction for 10 min, stirring once every 3-5 min, letting it cool to room temperature after extraction, followed by centrifugation; taking the supernatant, filtering it through a filter membrane with a pore size of 0.25 μm, and putting the filtrate into a liquid vial; separating and identifying the components of tea by HPLC, wherein mobile phase A is ultrapure water, mobile phase B is 1% formic acid, mobile phase C is acetonitrile, the flow rate is 0.8 mL/min, the elution duration is 70 min, the ultraviolet detection wavelength is 280 nm, and the column temperature is 30° C.; substituting the peak area into the standard curve to measure the content of each compound; importing the data into SIMCA software, standardizing the data and performing unsupervised principal component analysis, which makes the classification result more objective because no pre-classification is required; then carrying out supervised partial least squares discriminant analysis, which gives rather accurate results by enlarging the differences between groups and reducing the differences within groups, and yields relevant data such as the contribution rate and predictive ability, with a cumulative predictive ability (Q²) and a cumulative variance contribution rate (R²Y) close to 1.0 indicating a good model; further, carrying out hierarchical clustering analysis on the basis of the partial least squares discriminant analysis, wherein hierarchical clustering analysis classifies subjects according to their properties, with larger differences in properties corresponding to larger distances; it can be seen directly that the clustering analysis gives the re-classification charts of different grades and kinds of black tea; finally, establishing a tea grade discrimination model based on HPLC, wherein the VIP value in partial least squares discriminant analysis can be used to quantify the contribution of each variable to the classification, with a VIP value greater than 1 indicating significant differences among different types and grades of black tea; the VIP values show that thein, ECG, theobromine and EGCG can be used to classify Sichuan Leshan black tea into Special-grade, Grade I and Grade II; the grade evaluation results are consistent with those of GB/T 23776-2009, which proves that the method for evaluating the grade of black tea is accurate and effective.
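The calibration step in this embodiment (substituting the peak area into the standard curve to obtain the compound content) amounts to a linear fit and its inverse; the concentration gradient and peak areas below are illustrative placeholders, not values from the patent.

```python
import numpy as np

def fit_standard_curve(conc, area):
    """Fit a linear standard curve: area = slope * conc + intercept."""
    slope, intercept = np.polyfit(conc, area, 1)
    return slope, intercept

def content_from_area(area, slope, intercept):
    """Invert the standard curve to get the content from a peak area."""
    return (area - intercept) / slope

# illustrative calibration gradient (mg/mL) and measured peak areas
conc = np.array([0.02, 0.05, 0.10, 0.20, 0.40])
area = np.array([10.1, 25.3, 50.0, 99.8, 200.5])
slope, intercept = fit_standard_curve(conc, area)
```

The same fit would be repeated for each of the ten components using its own standard solutions from step (1) of claim 1.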
[0060] The data standardization of the multivariate analysis software, the unsupervised principal component analysis method and the algorithm of the partial least squares discriminant analysis are the same as those in Embodiment 1.