Spectral preprocessing method and device suitable for fruit near-infrared nondestructive detection, and computer-readable medium
12085503 · 2024-09-10
Abstract
The present disclosure discloses a spectral preprocessing method and device suitable for fruit near-infrared nondestructive detection. The method includes the following steps: calculating the overall noise distribution of the spectral data; performing adaptive threshold segmentation on the spectral data to form a plurality of segmentation intervals, calculating the noise of each segmentation interval, and performing cyclic adaptive-parameter SG filtering on each interval of the original spectrum to find the optimal filtering parameters; and performing sliding polynomial fitting on the spectral data within the respective intervals and taking the derivative of the fitted polynomial at each current point.
Claims
1. A spectral preprocessing method suitable for fruit near-infrared nondestructive detection, comprising: Step 1: measuring absorbance, and calculating overall noise distribution of spectral data by using second-order difference; Step 2: performing adaptive threshold segmentation on the spectral data according to the noise distribution, and forming a plurality of segmentation intervals, calculating a noise of each interval in each segmentation interval, performing cyclic adaptive parameter SG (Savitzky-Golay) filtering on each interval of an original spectrum with a residual less than a noise threshold as an optimization objective, and finding optimal filtering parameters; Step 3: using the optimal filter parameters to perform sliding window polynomial fitting on the spectral data within the respective intervals, and deriving a derivative of a fitting polynomial of each point at a current point; Step 4: calculating derivatives of all points of absorbance data point by point; the measuring absorbance in the Step 1 comprises the following process: turning off a light source, putting calibrated white balls and fruit samples respectively, turning on a spectrometer to acquire spectral data i.sub.w and i.sub.0 on backs of the white balls and the fruit samples as background signals of the spectrometer, then turning on a light to preheat for a preset time, acquiring transmission near-infrared spectra I.sub.w and I.sub.0 of the white balls and the fruit samples respectively, and calculating an absorbance sequence A with a calculation formula as follows:
q.sub.1*m.sub.1+q.sub.2*m.sub.2+q.sub.3*m.sub.3=m.sub.0
q.sub.1+q.sub.2+q.sub.3=1; the formula of the interval between-class variance u.sup.2 is:
u.sup.2=q.sub.1*(m.sub.1−m.sub.0).sup.2+q.sub.2*(m.sub.2−m.sub.0).sup.2+q.sub.3*(m.sub.3−m.sub.0).sup.2; setting an upper limit of the number of cycles, adjusting the segmentation points t.sub.1 and t.sub.2 (580<t.sub.1<t.sub.2<1000) randomly and repeatedly, and calculating a new f(t.sub.1,t.sub.2) circularly, and a maximal between-class variance found at the end of circulation corresponds to (t.sub.1,t.sub.2):
s.sub.1=√6*σ.sub.A1, s.sub.2=√6*σ.sub.A2, s.sub.3=
A(i)=p.sub.0+p.sub.1*i+p.sub.2*i.sup.2+p.sub.3*i.sup.3+ . . . +p.sub.g-1*i.sup.g-1;
A′(i)=p.sub.1+2*p.sub.2*i+3*p.sub.3*i.sup.2+ . . . +(g−1)*p.sub.g-1*i.sup.g-2;
A″(i)=2*p.sub.2+6*p.sub.3*i+ . . . +(g−2)*(g−1)*p.sub.g-1*i.sup.g-3; i=0 at the center of the window;
A′(0)=p.sub.1, A″(0)=2*p.sub.2; sliding the window to calculate a derivative value of the center point of the window point by point according to the above formula, and directly obtaining the filtered first derivative A′(i) and second derivative A″(i).
2. A spectral preprocessing device suitable for fruit near-infrared nondestructive detection, and comprising: an absorbance calculating module, configured to measure absorbance; a noise distribution calculating module, configured to calculate overall noise distribution of spectral data by using second-order difference; an optimal filter parameter calculating module, configured to perform adaptive threshold segmentation on spectral data according to the noise distribution to form a plurality of segmentation intervals, calculate noise of each interval in each segmentation interval, perform cyclic adaptive parameter SG filtering on each interval of an original spectrum with a residual less than a noise threshold as an optimization objective to find optimal filtering parameters; a polynomial fitting module, configured to use the optimal filter parameters to perform sliding window polynomial fitting on the spectral data within the respective intervals, and derive a derivative of a fitting polynomial of each point at a current point; an all-point derivative calculating module, configured to calculate derivatives of all points of absorbance data point by point; the spectral preprocessing device performs the spectral preprocessing method according to claim 1.
3. A computer-readable medium, wherein the computer-readable medium stores a program code executable by a processor, and the program code causes the processor to perform the spectral preprocessing method according to claim 1.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
(1) The present disclosure can be better understood with reference to the following descriptions given in conjunction with the accompanying drawings, in which the same or similar reference numerals are used to indicate the same or similar parts throughout the accompanying drawings. The accompanying drawings, together with the detailed description hereinafter, are included and form a part of this specification, and serve to further illustrate preferred embodiments of the present disclosure and explain the principles and advantages of the present disclosure. In the accompanying drawings:
(2)
(3)
DETAILED DESCRIPTION OF THE EMBODIMENTS
(4) Embodiments of the present disclosure will be described hereinafter with reference to the accompanying drawings. Elements and features described in one drawing or one embodiment of the present disclosure may be combined with elements and features shown in one or more other drawings or embodiments. It should be noted that for the sake of clarity, representations and descriptions of components and processes that are not related to the present disclosure and are known to those skilled in the art are omitted from the accompanying drawings and descriptions.
(5) The embodiment of the present disclosure provides a spectral preprocessing method suitable for fruit near-infrared nondestructive detection. The flow chart is shown in the accompanying drawings and includes the following processes:
(6) Process 1: measuring absorbance of the sample to be tested;
(7) Process 1: calculating the second-order difference of the absorbance;
(8) Process 2: selecting the optimal interval segmentation according to the second-order difference result;
(9) Process 3: calculating the overall noise of each interval;
(10) Process 4: searching for the optimal filtering parameters with noise as the threshold;
(11) Process 5: fitting a polynomial function in a sliding window using the optimal parameters for the absorbance data;
(12) Process 6: deriving the derivative of the function at the center point of the window according to the polynomial coefficient;
(13) Process 7: calculating the derivatives of all points of absorbance data point by point;
(14) Process 8: ending.
(15) Processes 2 to 4 select the optimal interval segmentation according to the second-order difference result, calculate the overall noise of each interval, and search for the optimal filtering parameters with the noise as the threshold. Specifically, adaptive threshold segmentation is performed on the spectral data according to the noise distribution to form a plurality of segmentation intervals, the noise of each segmentation interval is calculated, and cyclic adaptive-parameter SG filtering is performed on each interval of the original spectrum, with a residual less than the noise threshold as the optimization objective, to find the optimal filtering parameters.
(16) Process 5 specifically includes using the optimal filter parameters to perform sliding polynomial fitting on the spectral data within the respective intervals, and, for each point, taking the derivative of the fitted polynomial at that point.
(17) The method of dividing the intervals and finding the optimal parameters improves on existing SG filtering, and the SG filtering and the derivative calculation are integrated and completed in one step, so that the data quality after derivation is effectively improved.
(18) In a specific embodiment, the spectral preprocessing includes the following steps:
(19) Step 1: the light source is turned off, the calibrated white balls and the fruit samples are placed in position in turn, and the spectrometer is turned on to acquire spectral data i.sub.w and i.sub.0 on the back of the white balls and the fruit samples as background signals of the spectrometer. The light is then turned on and preheated for 30 minutes, the transmission near-infrared spectra I.sub.w and I.sub.0 of the white balls and the fruit samples are acquired respectively, and the absorbance sequence A is calculated with the following formula:
(20)
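The absorbance formula of Step 1 does not appear in this text. As a minimal illustrative sketch only, the following assumes the common dark-corrected form A = −log10((I.sub.0 − i.sub.0)/(I.sub.w − i.sub.w)); this form, the function name `absorbance` and the guard value `eps` are assumptions for illustration and are not taken from the embodiment.

```python
import numpy as np

def absorbance(I_w, I_0, i_w, i_0, eps=1e-12):
    """Dark-corrected absorbance (ASSUMED form, not the embodiment's formula).

    I_w, I_0: spectra of the white reference and the fruit sample (light on)
    i_w, i_0: background spectra of the same objects (light off)
    """
    I_w, I_0, i_w, i_0 = (np.asarray(a, dtype=float) for a in (I_w, I_0, i_w, i_0))
    reference = np.clip(I_w - i_w, eps, None)  # guard against zero or negative counts
    sample = np.clip(I_0 - i_0, eps, None)
    return -np.log10(sample / reference)

# Hypothetical single-channel example: sample transmits 10% of the reference
A = absorbance([110.0], [20.0], [10.0], [10.0])
```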
(21) Step 2: interval interception is performed on the absorbance sequence A: the band from 580 nm to 1000 nm is taken to generate a new absorbance sequence A.sub.0, and a second-order difference operation is performed on A.sub.0 by sliding convolution of the operator [1, −2, 1] over A.sub.0. This generates the second-order difference absorbance sequence A, which reflects the noise distribution of the absorbance sequence A.sub.0 (as shown in the accompanying drawings).
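The second-order difference of Step 2 can be sketched as follows. The function name `second_difference` and the choice to drop the two edge points (`mode="valid"`) are assumptions for illustration; the embodiment does not state how the ends of the sequence are handled.

```python
import numpy as np

def second_difference(a0):
    """Slide the operator [1, -2, 1] over the absorbance sequence a0.

    The kernel is symmetric, so convolution and correlation coincide.
    The result has len(a0) - 2 points (edge points dropped, an assumption).
    """
    a0 = np.asarray(a0, dtype=float)
    return np.convolve(a0, [1.0, -2.0, 1.0], mode="valid")

# A smooth quadratic trend gives a constant second difference,
# and a straight line gives zero, so what remains is mostly noise.
d2_quadratic = second_difference([0.0, 1.0, 4.0, 9.0, 16.0])
d2_linear = second_difference([1.0, 2.0, 3.0, 4.0])
```

For independent noise of standard deviation σ, this operator yields a standard deviation of √6*σ, since 1.sup.2+2.sup.2+1.sup.2=6.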
(22) Step 3: the optimal interval segmentation of the noise is found according to the absorbance second-order difference sequence A. According to the characteristics of fruit near-infrared transmission spectral data, two segmentation points t.sub.1: 650 nm and t.sub.2: 870 nm are preset to form three segmentation intervals. Because a grating-type near-infrared spectrometer is less sensitive in the near-infrared region than in the visible region, the present disclosure calculates the noise interval by interval. In addition, because the absorption bands of the main components of fruits lie in the interval of 800 nm to 950 nm, it is especially necessary to improve the signal quality in this interval. Multiple intervals can be set; however, the more intervals there are, the greater the amount of calculation. Based on a large number of experiments and the characteristics of the fruit near-infrared spectrum, the applicant found that setting three intervals is most suitable, with the two segmentation points set to t.sub.1: 650 nm and t.sub.2: 870 nm. Of course, the values of t.sub.1 and t.sub.2 are not necessarily 650 nm and 870 nm; t.sub.1 can be in the range of 650 nm ± 50 nm, and t.sub.2 can be in the range of 870 nm ± 50 nm.
(23) The numbers of spectral points in the three segmentation intervals are respectively denoted as n.sub.1, n.sub.2 and n.sub.3, spectral average values in the three intervals are respectively denoted as m.sub.1, m.sub.2 and m.sub.3, an overall spectral average value is m.sub.0, an interval inter-class variance f(t.sub.1,t.sub.2)=u.sup.2 at this time is calculated, an arithmetic square root of the inter-class variance u.sup.2 is an inter-class standard deviation u, and the calculation process of the inter-class variance u.sup.2 is as follows:
(24) The probability that the spectrum falls into three intervals is respectively denoted as q.sub.1, q.sub.2 and q.sub.3, and:
(25)
(26) The following formulas are satisfied:
q.sub.1*m.sub.1+q.sub.2*m.sub.2+q.sub.3*m.sub.3=m.sub.0
q.sub.1+q.sub.2+q.sub.3=1
(27) The formula of inter-class variance u.sup.2 is:
u.sup.2=q.sub.1*(m.sub.1−m.sub.0).sup.2+q.sub.2*(m.sub.2−m.sub.0).sup.2+q.sub.3*(m.sub.3−m.sub.0).sup.2.
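The inter-class variance above can be sketched as follows. The sketch assumes q.sub.k = n.sub.k/(n.sub.1+n.sub.2+n.sub.3), which is the choice that satisfies both constraints above when m.sub.0 is the overall average. The function name, the argument names and the half-open interval boundaries are hypothetical.

```python
import numpy as np

def inter_class_variance(wavelengths, values, t1, t2):
    """u^2 = sum_k q_k * (m_k - m_0)^2 over the three intervals split at t1 < t2.

    ASSUMPTION: q_k = n_k / (n_1 + n_2 + n_3), the fraction of points in interval k.
    """
    wl = np.asarray(wavelengths, dtype=float)
    v = np.asarray(values, dtype=float)
    masks = [wl < t1, (wl >= t1) & (wl < t2), wl >= t2]
    m0 = v.mean()                      # overall average m_0
    u2 = 0.0
    for mask in masks:
        n_k = int(mask.sum())
        if n_k == 0:                   # an empty interval contributes nothing
            continue
        q_k = n_k / v.size
        m_k = v[mask].mean()           # interval average m_k
        u2 += q_k * (m_k - m0) ** 2
    return u2

# Hypothetical toy data: four points split into intervals of 1, 2 and 1 points
u2 = inter_class_variance([600, 700, 800, 900], [0.0, 0.0, 1.0, 3.0], 650, 850)
```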
(28) Using an optimization algorithm (such as a simulated annealing algorithm) with the upper limit of the number of cycles set to 1000, the segmentation points t.sub.1 and t.sub.2 (580<t.sub.1<t.sub.2<1000) are randomly adjusted repeatedly and a new f(t.sub.1,t.sub.2) is calculated each time; when the cycle ends, the (t.sub.1,t.sub.2) corresponding to the maximal inter-class variance is taken:
f(t.sub.1,t.sub.2)=u.sup.2;
(29)
(30) At this time, the segmentation points t.sub.1 and t.sub.2 are the optimal segmentation points, and the absorbance second-order difference sequence A is segmented into three parts: A.sub.1, A.sub.2 and A.sub.3.
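The search for the segmentation points can be sketched as follows. This is a plain random-perturbation search that keeps the best pair found, not simulated annealing; the function name, the step size of 20 nm, the fixed random seed and the generic `objective` callable are assumptions for illustration.

```python
import numpy as np

def search_split_points(objective, t1=650.0, t2=870.0, cycles=1000,
                        step=20.0, lo=580.0, hi=1000.0, seed=0):
    """Randomly adjust (t1, t2) and keep the pair with the largest objective.

    objective(t1, t2) plays the role of f(t1, t2) = u^2.
    Candidates violating lo < t1 < t2 < hi are skipped.
    """
    rng = np.random.default_rng(seed)
    best = (float(t1), float(t2))
    best_f = objective(*best)
    for _ in range(cycles):
        c1 = best[0] + rng.uniform(-step, step)
        c2 = best[1] + rng.uniform(-step, step)
        if not (lo < c1 < c2 < hi):
            continue
        f = objective(c1, c2)
        if f > best_f:                 # keep only improvements
            best, best_f = (c1, c2), f
    return best, best_f

# Toy objective with a known maximum at (700, 900), for demonstration only
toy = lambda a, b: -((a - 700.0) ** 2) - ((b - 900.0) ** 2)
(t1_opt, t2_opt), f_opt = search_split_points(toy)
```

In practice the objective would wrap the inter-class variance computed on the second-order difference sequence for a given pair of segmentation points.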
(31) Step 4: the average noises s.sub.1, s.sub.2 and s.sub.3 of A.sub.1, A.sub.2 and A.sub.3 are calculated respectively. Taking A.sub.1 as an example: A.sub.1 is segmented equally into L intervals Seg.sub.i (i=0, 1, 2 . . . L−1), each containing n data points; the average value v and the standard deviation σ of the data in each interval are calculated; the interval Seg.sub.i with the minimal standard deviation is found, with average value v.sub.i and standard deviation σ.sub.i; the interval Seg.sub.i is covered with a window of width n, and the window slides to the right for detection, moving one data point at a time; each data point is checked for whether it belongs to the interval [v.sub.i−3σ.sub.i, v.sub.i+3σ.sub.i]; if so, the point is kept unchanged; if not, the point is replaced with the average value of the data in the current window; these steps are repeated until all the data on the right have been detected, and the data on the left are detected using the same method; finally, the standard deviation σ.sub.A1 of the processed A.sub.1 is calculated, and the standard deviations σ.sub.A2 and σ.sub.A3 of A.sub.2 and A.sub.3 are calculated using the same method;
s.sub.1=√6*σ.sub.A1, s.sub.2=√6*σ.sub.A2, s.sub.3=
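The noise estimate of Step 4 can be sketched for one segment as follows. Several details are assumptions for illustration: the function name, the default n=10, ignoring trailing points that do not fill a whole sub-interval when selecting the quietest one, and taking the "current window" to be the n points adjacent to the point under test on the side already processed.

```python
import numpy as np

def interval_noise(d2_segment, n=10):
    """Return sqrt(6) * sigma_A for one segment of the second-order difference."""
    x = np.asarray(d2_segment, dtype=float).copy()
    L = len(x) // n
    if L < 1:
        raise ValueError("segment shorter than one sub-interval")
    blocks = x[:L * n].reshape(L, n)
    stds = blocks.std(axis=1)
    k = int(np.argmin(stds))                 # quietest sub-interval Seg_k
    v, sigma = blocks[k].mean(), stds[k]
    low, high = v - 3.0 * sigma, v + 3.0 * sigma
    start, stop = k * n, (k + 1) * n         # Seg_k covers x[start:stop]
    for j in range(stop, len(x)):            # slide to the right, one point at a time
        if not (low <= x[j] <= high):
            x[j] = x[j - n:j].mean()         # assumed window: the n points before j
    for j in range(start - 1, -1, -1):       # then slide to the left
        if not (low <= x[j] <= high):
            x[j] = x[j + 1:j + 1 + n].mean() # assumed window: the n points after j
    return np.sqrt(6.0) * x.std()

# Hypothetical data: alternating +1/-1 with one large spike at index 70
data = np.tile([1.0, -1.0], 50)
data[70] = 50.0
s = interval_noise(data, n=10)
```

The spike is replaced by the local window mean before the standard deviation is taken, so it does not inflate the noise estimate.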
(32) Step 5: the absorbance sequence A.sub.0 is segmented into intervals A.sub.1, A.sub.2 and A.sub.3 by the segmentation points t.sub.1 and t.sub.2, with n.sub.1, n.sub.2 and n.sub.3 spectral points in the three intervals respectively. SG filtering is applied separately in each interval with initial parameters window w=5 and order g=3; the filtered data are A.sub.1*, A.sub.2* and A.sub.3*, and the residuals are δ.sub.1, δ.sub.2 and δ.sub.3 respectively, calculated using the following formulas:
(33)
(34)
(35)
(36) The parameters are traversed: the order is fixed first and the window is traversed, with 2 added to the window w.sub.1 (5≤w.sub.1≤n.sub.1/2, w.sub.1=2i+1, i∈Z) each time and 1 added to the order g.sub.1 (3≤g.sub.1, g.sub.1∈Z) each time. New residuals are calculated iteratively, and the iteration stops when δ.sub.1 becomes greater than s.sub.1; the parameters w.sub.1 and g.sub.1 obtained at this time are the optimal parameters of the SG filtering. w.sub.2, g.sub.2, w.sub.3 and g.sub.3 are obtained using the same method.
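One possible reading of the parameter traversal is sketched below. The residual is assumed to be the root-mean-square difference between the raw and filtered data, because the residual formulas do not appear in this text. The function names, the edge padding, the upper bound `g_max`, and the rule of keeping the last window whose residual stays within the threshold are also assumptions. Here g counts the polynomial coefficients p.sub.0 to p.sub.g-1.

```python
import numpy as np

def sg_smooth(y, w, g):
    """SG smoothing: least-squares polynomial with g coefficients in a window of w points."""
    half = w // 2
    i = np.arange(-half, half + 1, dtype=float)
    V = np.vander(i, g, increasing=True)       # columns i^0 ... i^(g-1)
    c = np.linalg.pinv(V)[0]                   # weights that give p_0 at the window center
    ypad = np.pad(np.asarray(y, dtype=float), half, mode="edge")  # assumed edge handling
    return np.convolve(ypad, c[::-1], mode="valid")

def find_sg_params(y, s, w0=5, g0=3, g_max=6):
    """Grow the window by 2 while the residual stays <= s; raise g by 1 if none fits."""
    y = np.asarray(y, dtype=float)
    w_max = len(y) // 2
    for g in range(g0, g_max + 1):
        w = w0
        while w <= g:                          # need more points than coefficients
            w += 2
        best = None
        while w <= w_max:
            resid = np.sqrt(np.mean((y - sg_smooth(y, w, g)) ** 2))  # assumed RMS residual
            if resid > s:
                break                          # residual exceeds the noise threshold
            best = w
            w += 2
        if best is not None:
            return best, g
    raise ValueError("no (w, g) keeps the residual within the threshold")

# Hypothetical example on a smooth ramp with threshold s = 0.05
ramp = np.linspace(0.0, 1.0, 101)
w_opt, g_opt = find_sg_params(ramp, s=0.05)
```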
(37) Step 6: the optimal filter parameters obtained in Step 5 are used to perform sliding polynomial fitting within the respective intervals A.sub.1, A.sub.2 and A.sub.3, in which the fitting coefficients at each point are p=[p.sub.0, p.sub.1 . . . p.sub.g-2, p.sub.g-1], that is:
A(i)=p.sub.0+p.sub.1*i+p.sub.2*i.sup.2+p.sub.3*i.sup.3+ . . . +p.sub.g-1*i.sup.g-1; in which
(38)
(39) The first and second derivatives of the polynomial at a center point of the window are calculated according to the fitting coefficient, and the calculation formula is:
A′(i)=p.sub.1+2*p.sub.2*i+3*p.sub.3*i.sup.2+ . . . +(g−1)*p.sub.g-1*i.sup.g-2;
A″(i)=2*p.sub.2+6*p.sub.3*i+ . . . +(g−2)*(g−1)*p.sub.g-1*i.sup.g-3; i=0 at the center of the window;
A′(0)=p.sub.1, A″(0)=2*p.sub.2.
(40) A sliding window is used to calculate the derivative value at the center point of the window point by point according to the above formulas, and the filtered first derivative A′(i) and second derivative A″(i) are directly obtained.
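The point-by-point derivative calculation can be sketched as follows, using A′(0)=p.sub.1 and A″(0)=2*p.sub.2 at each window center. The function name, the unit spacing between samples (derivatives are per sample index, not per nm) and leaving the edge points as NaN are assumptions for illustration.

```python
import numpy as np

def sg_derivatives(y, w, g):
    """First and second derivatives at each window center from a local polynomial fit.

    w: odd window width; g: number of coefficients p_0..p_(g-1), g >= 3.
    Points closer than w//2 to either end are left as NaN (an assumption).
    """
    y = np.asarray(y, dtype=float)
    half = w // 2
    i = np.arange(-half, half + 1, dtype=float)   # i = 0 at the window center
    V = np.vander(i, g, increasing=True)          # columns i^0 ... i^(g-1)
    P = np.linalg.pinv(V)                         # maps window data to [p_0, ..., p_(g-1)]
    d1 = np.full(len(y), np.nan)
    d2 = np.full(len(y), np.nan)
    for k in range(half, len(y) - half):
        p = P @ y[k - half:k + half + 1]          # least-squares coefficients
        d1[k] = p[1]                              # A'(0)  = p_1
        d2[k] = 2.0 * p[2]                        # A''(0) = 2 * p_2
    return d1, d2

# Check on a known quadratic: y = 0.5*x^2 + 3*x + 1 gives y' = x + 3 and y'' = 1
x = np.arange(20, dtype=float)
d1, d2 = sg_derivatives(0.5 * x**2 + 3.0 * x + 1.0, w=5, g=3)
```

Because the coefficient matrix depends only on w and g, it is computed once per interval and reused for every window position.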
(41) According to the scheme, the present disclosure has the following advantages: 1. The optimal intervals are divided according to the signal-to-noise ratio in combination with the transmission spectrum characteristics of fruits, and the optimal filtering parameters are searched within each interval with the average noise of that interval as the threshold, so that first and second derivatives with a high signal-to-noise ratio are obtained. The noise added by differential derivation is therefore effectively reduced, and the data quality is further improved. SG filtering is a very common filtering method in the prior art. The present disclosure optimizes SG filtering by adding the optimal interval segmentation step, calculating the noise of each interval, and adjusting the filtering parameters with the noise as the optimization threshold, which ensures that the optimal parameters are used in each interval. 2. The prior art generally performs the derivative operation by direct differencing, which increases noise. Each additional order of derivative adds noise, resulting in a low signal-to-noise ratio for the second derivative. In the present disclosure, the differencing operation is replaced by differentiation of a fitted function, thereby further reducing the noise added by differential derivation. 3. The present disclosure integrates the optimized SG filtering with the derivative calculation, effectively reducing the extra noise of ordinary differential derivation, improving the data quality and providing a good foundation for subsequent modeling.
(42) The embodiment of the present disclosure further provides a computer-readable medium, wherein the computer-readable medium stores a program code executable by a processor, and the program code causes the processor to perform the spectral preprocessing method described above.
(43) The embodiment of the present disclosure further provides a spectral preprocessing device suitable for fruit near-infrared nondestructive detection, including: an absorbance calculating module, which is configured to measure absorbance; a noise distribution calculating module, which is configured to calculate overall noise distribution of spectral data by using second-order difference; an optimal filter parameter calculating module, which is configured to perform adaptive threshold segmentation on the spectral data according to the noise distribution to form a plurality of segmentation intervals, calculate noise of each interval in each segmentation interval, perform cyclic adaptive parameter SG filtering on each interval of an original spectrum with a residual less than a noise threshold as an optimization objective to find optimal filtering parameters; a polynomial fitting module, which is configured to use the optimal filter parameters to perform sliding polynomial fitting on the spectral data within the respective intervals, and derive a derivative of a fitting polynomial of each point at a current point; an all-point derivative calculating module, which is configured to calculate the derivatives of all points of absorbance data point by point.
(44) In addition, the method of the present disclosure is not limited to being executed in the time sequence described in the specification, but can also be executed in other time sequences, in parallel or independently. Therefore, the execution order of the method described in this specification does not limit the technical scope of the present disclosure.
(45) Although the present disclosure has been disclosed by describing specific embodiments thereof, it should be understood that all the above embodiments and examples are illustrative and not restrictive. Those skilled in the art can design various modifications, improvements or equivalents to the present disclosure within the spirit and scope of the appended claims. These modifications, improvements or equivalents should also be considered as included in the protection scope of the present disclosure.