Method for fault diagnosis of an aero-engine rolling bearing based on random forest of power spectrum entropy

11333575 · 2022-05-17

Abstract

The present invention belongs to the technical field of fault diagnosis of aero-engines, and provides a method for fault diagnosis of an aero-engine rolling bearing based on random forest of power spectrum entropy. To address defects existing in the prior art, measured test data for an aero-engine rolling bearing provided by a research institute are first used to establish a training dataset and a test dataset. Then, following the idea of fault feature extraction, time-domain statistical analysis and frequency-domain analysis are conducted on the originally collected data by means of wavelet analysis, thereby realizing effective fault diagnosis from the perspective of engineering application.

Claims

1. A method for fault diagnosis of an aero-engine rolling bearing based on random forest of power spectrum entropy, comprising the following steps:

step 1: preprocessing aero-engine rolling bearing fault data, comprising:

(1) preprocessing rolling bearing experimental measured data comprising eight groups of parameters: rotational speed $n_1$, vibration acceleration of driven end $a_1$, vibration acceleration of fan end $a_2$, fault diameter $d$, number of balls $Z$, inner radius $r_1$, outer radius $r_2$ and contact angle $\alpha_2$;

(2) combining and storing the rolling bearing experimental measured data collected at multiple experiment sites to establish a rolling bearing fault database;

(3) analyzing feature data extracted from the rolling bearing fault database and adopting a linear resampling method to resample the feature data;

(4) normalizing the resampled data in order to eliminate an order of magnitude difference between data of different dimensions and avoid a large prediction error caused by the order of magnitude difference between input and output data, by using Min-Max scaling defined as follows:

$$\text{normalized } X_{nor} = \frac{X_{nor} - X_{min}}{X_{max} - X_{min}},$$

where $X_{nor}$ is a data series to be normalized, $X_{min}$ is a minimum number in the data series, and $X_{max}$ is a maximum number in the data series; and

(5) conducting visualization processing on the normalized data, and conducting simple clustering and cleaning on the normalized data;

step 2: extracting a feature vector of rolling bearing data by processing the clustered and cleaned data to generate the feature vector to characterize conditions of vibration fault features, the feature vector comprising time-domain parameters and power spectrum entropy defined as:

(1) the time-domain feature parameters of a vibration signal comprising dimensional time-domain vibration parameters and dimensionless time-domain vibration parameters, wherein data of the vibration signal are set as $\{X_t\}_{t=1}^{N}$, and $N$ is experimental observation time, and wherein the dimensional time-domain vibration parameters are defined as follows:

mean value: $$\bar{X} = \frac{1}{N}\sum_{t=1}^{N} |X_t|,$$

variance: $$S^2 = \frac{1}{N}\sum_{t=1}^{N} (X_t - \bar{X})^2,$$

root mean square value: $$X_{RMS} = \sqrt{\frac{1}{N}\sum_{t=1}^{N} X_t^2},$$

peak value: $$X_p = \max(|X_t|),$$

and wherein the dimensionless time-domain vibration parameters are defined as follows:

crest factor: $$C_f = \frac{X_p}{X_{RMS}},$$

skewness index: $$X_{SKE} = \frac{\frac{1}{N}\sum_{t=1}^{N} (X_t - \bar{X})^3}{\left(\sqrt{S^2}\right)^3},$$

kurtosis value: $$X_{KUR} = \frac{\frac{1}{N}\sum_{t=1}^{N} (X_t - \bar{X})^4}{\left(\sqrt{S^2}\right)^4},$$

impulse factor: $$I = \frac{X_p}{\bar{X}},$$

shape factor: $$X_{SHA} = \frac{X_{RMS}}{\bar{X}},$$

clearance factor: $$X_{CLE} = \frac{X_p}{\left(\frac{1}{N}\sum_{t=1}^{N} \sqrt{|X_t|}\right)^2},$$

and wherein $N$ is experimental observation time, that is, original channel length; and

(2) the power spectrum entropy, derived from the following steps:

decomposing and reconstructing the vibration signal by a wavelet toolbox, and before decomposing and reconstructing, selecting and determining a wavelet basis function, a wavelet order and a wavelet packet decomposition level, wherein the wavelet basis function is selected to be a Daubechies wavelet, the wavelet order is selected to be 1, and the wavelet packet decomposition level is selected to be 3;

after orthogonal decomposition of wavelet packets, monitoring energy of a frequency band obtained by each of the wavelet packets, and monitoring all components of the vibration signal that comprises harmonic components, wherein a calculation formula of the components of the vibration signal is:

$$E_{ij} = \int |S_{ij}(t)|^2 \, dt = \left(\sum_{u=1}^{n} |x_{ij}(u)|^2\right)^{1/2},$$

where $E_{ij}$ is energy, $S_{ij}(t)$ is reconstruction signal, $i$ is a number of layers of wavelet decomposition, and $j$ is a node of the $i$th layer, $j = 1, \ldots, 2^i$; $u = 1, 2, \ldots, n$, $n \in \mathbb{Z}$, and $n$ is a number of discrete points of the reconstruction signal; and

calculating the power spectrum entropy, wherein the power spectrum describes how the power of a finite-power signal per unit frequency band varies with frequency; after the $j$th layer wavelet packet decomposition is conducted on the signal, a wavelet packet decomposition sequence $S(j, m)$ is obtained, where $m$ takes 0 to $2^j - 1$, and the wavelet decomposition of the signal herein is regarded as a division, and a measure of the division is defined as:

$$P_{(j,m)}(i) = \frac{S_{F(j,m)}(i)}{\sum_{i=1}^{N} S_{F(j,m)}(i)},$$

where $S_{F(j,m)}(i)$ is the $i$th value of the Fourier transform sequence of $S(j, m)$, and $N$ is original channel length; and based on a basic theory of information entropy, the power spectrum entropy on wavelet packet space is defined as:

$$H_{(j,m)} = -\sum_{i=1}^{n} P_{(j,m)}(i) \log P_{(j,m)}(i),$$

wherein, based on analysis, the time-domain parameters and the power spectrum entropy are selected as input attributes of the feature parameters in a random forest method; and

step 3: establishing a training database, wherein a sample size of an $r$ category of fault is set as $G(r)$, and a set of samples after sparse representation is $\{X(1), X(2), \ldots, X(G(r))\}$, wherein $X(h) = \{x_h(1), x_h(2), \ldots, x_h(dim)\}$ is a multidimensional feature vector corresponding to each of the samples in $dim$ dimensions, $dim > 1$; and $\{y(1), y(2), \ldots, y(G(r))\}$ is set as a corresponding multi-category fault label, and inputs of a random forest model are $\{X(h)\}$ and outputs of the random forest model are $\{y(h)\}$; and

step 4: building a rolling bearing vibration fault classification model based on the random forest model using two parameters of random forest, respectively: ntree, that is the number of generated decision trees; and mtry, that is the feature number of regression trees, wherein the random forest model is trained as follows:

1) first giving a training set Train, a test set Test and dimensions of feature F, and determining the number of decision trees ntree, depth of each tree depth and the feature number of regression trees mtry;

2) for the ith tree, i=1: ntree, extracting a training set Train(i) with the same size as Train, by sampling with replacement from Train, as a sample of a root node;

3) if reaching an end condition on a current node, that is, a minimum number of samples s on the node and a minimum information gain g on the node, setting the current node as a leaf node and continuing to train other nodes in sequence; if not reaching an end condition on the current node, randomly selecting mtry-dimensional features fmtry from the F-dimensional features, wherein mtry<<F; and using the mtry-dimensional features, seeking the one-dimensional feature w with the best classification effect and a threshold thereof, and continuing to train other nodes;

4) repeating the steps 2) and 3) until all nodes are trained or labeled as leaf nodes;

5) repeating the steps 2), 3) and 4) until all decision trees are trained;

6) for the sample in the test set Test, starting at a root node, according to the threshold of the current node, judging whether to enter the left node or right node until a certain leaf node is reached, and outputting classification labels; and

7) according to the data in the test set, conducting statistics on accuracy rate of classification, and evaluating the classification effect of the random forest model.

Description

DESCRIPTION OF DRAWINGS

(1) FIG. 1 is a flow chart of establishing a method for fault diagnosis of an aero-engine rolling bearing.

(2) FIG. 2 is a display diagram of time domain vibration signal of a rolling bearing.

(3) FIG. 3 is a display diagram of analysis results of power spectrum entropy (frequency band energy spectrum).

DETAILED DESCRIPTION

(4) Specific embodiments of the present invention are further described below in combination with accompanying drawings and the technical solution.

(5) The data used in the method are 320 groups of rolling bearing test data provided by a research institute.

(6) Step 1: Preprocessing on Aero-Engine Rolling Bearing Fault Data

(7) (1) rolling bearing original data comprise eight groups of parameters: rotational speed $n_1$, vibration acceleration of driven end $a_1$, vibration acceleration of fan end $a_2$, fault diameter $d$, number of balls $Z$, inner radius $r_1$, outer radius $r_2$ and contact angle $\alpha_2$;

(8) (2) data integration: the rolling bearing experimental data, which comprise data collected at multiple experiment sites, are combined and stored to establish a rolling bearing fault database;

(9) after the fault features are extracted, the processing in steps (3), (4) and (5) is conducted on the feature data:

(10) (3) resampling: the data are analyzed and, because the sampling time intervals differ, a linear resampling method is used to resample the aero-engine performance parameter data for the convenience of subsequent rolling prediction;
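The linear resampling in step (3) can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name `linear_resample`, the uniform target grid and the sample values are hypothetical, and the patent does not specify how points outside the original time range are handled (here they are clamped to the end values).

```python
from bisect import bisect_right

def linear_resample(times, values, new_times):
    """Linearly interpolate (times, values) onto new_times.

    Assumes `times` is strictly increasing. Targets outside the original
    range are clamped to the first or last value (an assumption).
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            k = bisect_right(times, t)      # times[k - 1] <= t < times[k]
            t0, t1 = times[k - 1], times[k]
            v0, v1 = values[k - 1], values[k]
            out.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
    return out

# Hypothetical irregularly sampled signal, resampled onto a uniform 0.5 s grid.
times = [0.0, 0.4, 1.0, 2.0]
values = [0.0, 4.0, 10.0, 20.0]
uniform_grid = [0.0, 0.5, 1.0, 1.5, 2.0]
resampled = linear_resample(times, values, uniform_grid)
```

Because the hypothetical values lie on a straight line, the resampled series lies on the same line, which makes the sketch easy to check by hand.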

(11) (4) normalization: the resampled data are normalized with Min-Max scaling, which converts the data into a fixed range, in order to eliminate the order of magnitude difference between data of each dimension and avoid a large prediction error caused by the order of magnitude difference between input and output data;
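As a minimal sketch of the Min-Max scaling in step (4), each value is mapped to (x - min) / (max - min), so the series ends up in the range 0 to 1. The function name, the sample rotational speeds and the handling of a constant series are assumptions made for illustration only.

```python
def min_max_scale(series):
    """Scale a data series to [0, 1] with Min-Max scaling."""
    x_min, x_max = min(series), max(series)
    if x_max == x_min:
        # Assumption: a constant series is mapped to all zeros.
        return [0.0 for _ in series]
    span = x_max - x_min
    return [(x - x_min) / span for x in series]

# Hypothetical rotational speeds (rpm); not taken from the patent's data.
speeds = [1730.0, 1750.0, 1772.0, 1797.0]
scaled = min_max_scale(speeds)
```

Scaling each parameter separately in this way puts quantities with very different magnitudes, such as the variance and the crest factor, on a comparable footing.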

(12) (5) data filtering and cleaning: simple clustering and cleaning are conducted on the fault data.

(13) Step 2: Extracting the Feature Vector of Rolling Bearing Data

(14) The collected bearing vibration data are processed into a feature vector that characterizes the vibration fault features. The feature vector mainly comprises time-domain parameters and power spectrum entropy.

(15) (1) Time-Domain Parameter

(16) The change of time-domain parameters of vibration signal often reflects the change of working state of the equipment, and some time-domain parameters of the signal are used as feature parameters. The time-domain feature parameters during vibration are usually divided into dimensional parameters and dimensionless parameters.
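The ten time-domain parameters named in claim 1 can be computed directly from a sampled signal. The Python sketch below is an illustration, not the inventors' implementation. It assumes the reading of the claim formulas in which the mean value is the mean of the absolute values, the RMS value carries a square root, and the clearance factor uses the square root of the absolute values; the function name and the test signal are hypothetical, and the signal is assumed to be non-constant so that no denominator is zero.

```python
import math

def time_domain_features(x):
    """Four dimensional and six dimensionless time-domain parameters."""
    n = len(x)
    abs_x = [abs(v) for v in x]
    mean = sum(abs_x) / n                          # mean of |X_t| (assumed reading)
    var = sum((v - mean) ** 2 for v in x) / n      # variance about that mean
    rms = math.sqrt(sum(v * v for v in x) / n)     # root mean square value
    peak = max(abs_x)                              # peak value
    std = math.sqrt(var)
    sqrt_abs_mean = sum(math.sqrt(a) for a in abs_x) / n
    return {
        "mean": mean,
        "variance": var,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "skewness": sum((v - mean) ** 3 for v in x) / n / std ** 3,
        "kurtosis": sum((v - mean) ** 4 for v in x) / n / std ** 4,
        "impulse_factor": peak / mean,
        "shape_factor": rms / mean,
        "clearance_factor": peak / sqrt_abs_mean ** 2,
    }

# Hypothetical vibration samples used only to exercise the function.
signal = [1.0, -1.0, 1.0, -1.0, 3.0, -3.0]
features = time_domain_features(signal)
```

Together with the power spectrum entropy, these ten values would form the 11-dimensional feature vector described in the text.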

(17) (2) Power Spectrum Entropy

(18) The vibration signal is decomposed and reconstructed using the wavelet toolbox in MATLAB. Before decomposition and reconstruction, a suitable wavelet basis function, wavelet order and wavelet packet decomposition level are selected: the wavelet basis function is the Daubechies (Db) wavelet, the wavelet order is 1, and the wavelet packet decomposition level is 3.
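The patent performs this step with the MATLAB wavelet toolbox. As a rough stand-in, the Python sketch below hand-codes a three-level wavelet packet decomposition using the Haar wavelet, which is the Daubechies wavelet of order 1. Several points are assumptions: the function names and the test signal are hypothetical, the signal length is assumed to be a multiple of 8, the band energy is taken as the sum of squared coefficients (for an orthonormal wavelet this equals the energy of the band's reconstructed signal), and the terminal nodes are returned in natural rather than frequency order.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar (db1) transform: (approximation, detail)."""
    s = math.sqrt(2.0)
    approx = [(x[k] + x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    detail = [(x[k] - x[k + 1]) / s for k in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet(x, level=3):
    """Return the 2**level terminal nodes of a full Haar wavelet packet tree."""
    nodes = [list(x)]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_step(node))   # split every node, not only the approximation
        nodes = next_nodes
    return nodes

def band_energies(nodes):
    """Energy of each frequency band as the sum of squared coefficients."""
    return [sum(c * c for c in node) for node in nodes]

# Hypothetical two-tone test signal with 64 samples.
signal = [math.sin(2 * math.pi * 5 * t / 64) + 0.5 * math.sin(2 * math.pi * 20 * t / 64)
          for t in range(64)]
nodes = wavelet_packet(signal, level=3)
energies = band_energies(nodes)
```

At level 3 the tree has eight terminal nodes, so each signal yields eight band energies whose sum equals the energy of the original signal.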

(19) Based on the basic theory of information entropy, the power spectrum entropy on wavelet packet space is defined and calculated at the same time.
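The entropy calculation can be sketched in a few lines of Python. This is an illustration under assumptions, not the patented implementation: the spectrum value for each frequency bin is taken as the squared magnitude of the discrete Fourier transform, the logarithm is natural, an all-zero sequence is given zero entropy, and the function names and test sequences are hypothetical. The direct transform used here is quadratic in the sequence length and is meant only for short sequences.

```python
import cmath
import math

def power_spectrum(x):
    """Squared magnitude of the DFT of x (direct O(N^2) evaluation)."""
    n = len(x)
    spec = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spec.append(abs(acc) ** 2)
    return spec

def power_spectrum_entropy(x):
    """Entropy of the normalized power spectrum, H = -sum(P * log(P))."""
    spec = power_spectrum(x)
    total = sum(spec)
    if total == 0.0:
        return 0.0  # assumption: an all-zero sequence has zero entropy
    probs = [s / total for s in spec]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical sequences: an impulse has a flat spectrum (maximum entropy),
# while a pure tone concentrates its power in two spectral lines.
impulse = [1.0] + [0.0] * 15
tone = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
h_impulse = power_spectrum_entropy(impulse)
h_tone = power_spectrum_entropy(tone)
```

A signal whose power is spread evenly across frequencies gives a high entropy, whereas one dominated by a few frequency components gives a low entropy, which is why the quantity is useful as a fault feature.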

(20) To sum up, based on the analysis, 11 parameters are selected as input attributes of the feature parameters in the random forest method: the ten time-domain parameters (mean value, variance, root mean square value, peak value, crest factor, skewness index, kurtosis value, impulse factor, shape factor and clearance factor) and the power spectrum entropy. The extracted feature parameters are shown in Table 1 (the extraction results of 10 samples are listed). The computational analysis for the power spectrum entropy is shown in FIG. 2.

(21) TABLE 1 Feature Parameter Extraction (Time-domain Parameter and Power Spectrum Entropy)

| Mean value | Variance | RMS | Peak value | Crest Factor | Skewness Index | Kurtosis Value | Impulse Factor | Shape Factor | Clearance Factor | Energy Spectrum Entropy |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 61.2789 | 4.50E+06 | 2.1938 | 0.2061 | 0.094 | −0.0289 | 8.33E−04 | 0.0034 | 0.0358 | 3.29E−06 | 1.5176 |
| 334.9924 | 1.35E+08 | 20.8191 | 5.2024 | 0.2499 | −0.0289 | 8.33E−04 | 0.0155 | 0.0621 | 2.04E−05 | 1.6545 |
| 319.8547 | 1.23E+08 | 18.5822 | 4.5145 | 0.2429 | −0.0289 | 8.33E−04 | 0.0141 | 0.0581 | 1.76E−05 | 1.5884 |
| 304.4911 | 1.11E+08 | 17.3633 | 3.1545 | 0.1817 | −0.0289 | 8.33E−04 | 0.0104 | 0.057 | 1.32E−05 | 1.7279 |
| 474.1136 | 2.70E+08 | 22.2932 | 3.2394 | 0.1453 | −0.0289 | 8.33E−04 | 0.0068 | 0.047 | 7.99E−06 | 1.3964 |
| 431.0915 | 2.23E+08 | 20.1841 | 2.6692 | 0.1322 | −0.0289 | 8.33E−04 | 0.0062 | 0.0468 | 7.19E−06 | 1.4035 |
| 365.4315 | 1.60E+08 | 23.2048 | 5.6954 | 0.2454 | −0.0289 | 8.33E−04 | 0.0156 | 0.0635 | 2.08E−05 | 1.6565 |
| 112.5617 | 1.52E+07 | 4.4339 | 0.7173 | 0.1618 | −0.0289 | 8.33E−04 | 0.0064 | 0.0394 | 6.46E−06 | 1.8967 |
| 128.653 | 1.99E+07 | 5.6969 | 0.9463 | 0.1661 | −0.0289 | 8.33E−04 | 0.0074 | 0.0443 | 7.92E−06 | 1.489 |
| 426.9896 | 2.19E+08 | 19.8143 | 3.0444 | 0.1536 | −0.0289 | 8.33E−04 | 0.0071 | 0.0464 | 8.10E−06 | 1.4184 |

(22) Step 3: Establishing a Training Database

(23) There are 320 groups of data in this experiment. The extracted feature vector has 11 dimensions, so the number of dimensions of the input data is 11. The output data are the corresponding fault types, and there are 10 categories of faults in total. Of the 320 groups of data, 200 groups are selected as training data, and the remaining 120 groups are used as test data.
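One way to obtain a 200/120 split with 12 test samples per fault category, as reported later in Table 2, is a per-class random split. The Python sketch below illustrates the idea only. It assumes 32 samples in each of the 10 categories, which the text does not state explicitly, and the function name, label encoding and random seed are hypothetical.

```python
import random

def stratified_split(labels, n_test_per_class, seed=0):
    """Randomly hold out a fixed number of test samples from every class."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        idxs = by_class[label][:]
        rng.shuffle(idxs)
        test_idx.extend(idxs[:n_test_per_class])
        train_idx.extend(idxs[n_test_per_class:])
    return train_idx, test_idx

# Assumption: 10 fault categories with 32 samples each (320 in total).
labels = [category for category in range(10) for _ in range(32)]
train_idx, test_idx = stratified_split(labels, n_test_per_class=12)
```

Holding out the same number of samples from every category keeps the test set balanced, so the per-category accuracies are directly comparable.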

(24) Step 4: Building a Rolling Bearing Vibration Fault Classification Model Based on Random Forest

(25) The random forest has two important parameters, ntree and mtry, wherein ntree is the number of generated decision trees, and mtry is the feature number of the regression trees, that is, the number of features randomly selected as split candidates at a node.

(26) the training steps of the random forest model are as follows:

(27) (1) first giving a training set Train, a test set Test and the dimensions of feature F, and determining the number of decision trees ntree, depth of each tree and the feature number of regression trees mtry;

(28) (2) for the ith tree (i=1: ntree), extracting a training set Train(i) with the same size as Train, by sampling with replacement from Train, as a sample of a root node;

(29) (3) checking whether the current node has reached the termination condition: if so, setting the current node as a leaf node; if not, determining the best split feature and the corresponding threshold for the current node from the randomly selected features;

(30) (4) repeating the above steps until all the decision trees are trained, thereby establishing a diagnostic model.

(31) (6) for the sample in the test set Test, starting at a root node, according to the threshold of the current node, judging whether to enter the left node or right node until a certain leaf node is reached, and outputting classification labels;

(32) (7) conducting statistics on the accuracy rate of classification. The classification effects of the 10 categories of faults are shown in Table 2.

(33) TABLE 2 Classification Effect Statistics of Each Fault Type of Rolling Bearing Based on Random Forest Model

| Fault Type | Test Samples (piece) | Correctly Classified Samples (piece) | Classification Accuracy Rate (%) |
| --- | --- | --- | --- |
| Normal | 12 | 11 | 91.7 |
| inner ring fault diameter 0.07 inch | 12 | 12 | 100 |
| inner ring fault diameter 0.14 inch | 12 | 11 | 91.7 |
| inner ring fault diameter 0.21 inch | 12 | 12 | 100 |
| outer ring fault diameter 0.07 inch | 12 | 12 | 100 |
| outer ring fault diameter 0.14 inch | 12 | 12 | 100 |
| outer ring fault diameter 0.21 inch | 12 | 12 | 100 |
| ball fault diameter 0.07 inch | 12 | 12 | 100 |
| ball fault diameter 0.14 inch | 12 | 11 | 91.7 |
| ball fault diameter 0.21 inch | 12 | 12 | 100 |
| Total | 120 | 117 | 97.5 |
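The accuracy rates in Table 2 are plain ratios of correctly classified samples to test samples. The short Python check below recomputes them from the counts in the table; the dictionary keys are shortened labels chosen for this sketch.

```python
# (test samples, correctly classified samples) per fault type, from Table 2.
results = {
    "normal": (12, 11),
    "inner ring 0.07 inch": (12, 12),
    "inner ring 0.14 inch": (12, 11),
    "inner ring 0.21 inch": (12, 12),
    "outer ring 0.07 inch": (12, 12),
    "outer ring 0.14 inch": (12, 12),
    "outer ring 0.21 inch": (12, 12),
    "ball 0.07 inch": (12, 12),
    "ball 0.14 inch": (12, 11),
    "ball 0.21 inch": (12, 12),
}

# Per-category accuracy in percent, rounded to one decimal place.
per_class = {name: round(100.0 * correct / tested, 1)
             for name, (tested, correct) in results.items()}

total_tested = sum(tested for tested, _ in results.values())
total_correct = sum(correct for _, correct in results.values())
overall = 100.0 * total_correct / total_tested
```

Three categories have one misclassified sample each, which accounts for the 117 correct results out of 120.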

(34) The statistical results in Table 2 show that the random forest fault diagnosis model classifies the faults well: the overall fault diagnosis rate reaches 97.5% (117 of 120 test samples), and no fault category falls below 91.7%. The experimental results also indicate that the time-domain parameters and the power spectrum entropy are effective fault features for characterizing the original sensor signal. To sum up, the method for fault diagnosis of an aero-engine rolling bearing based on the random forest of power spectrum entropy proposed herein achieves a good application effect.
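The training and testing procedure of Step 4 can be sketched in plain Python as follows. This is a simplified illustration, not the model used in the patent: it is run on hypothetical, well-separated synthetic data with three classes and four features rather than on the bearing dataset, thresholds are taken as midpoints between observed values, class labels are assumed not to be tuples, and all function names and default parameter values are assumptions.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(X, y, feature_ids):
    """Best (information gain, feature, threshold) over the candidate features."""
    parent, best = entropy(y), (0.0, None, None)
    for f in feature_ids:
        values = sorted({row[f] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0                       # midpoint threshold
            left = [t for row, t in zip(X, y) if row[f] <= thr]
            right = [t for row, t in zip(X, y) if row[f] > thr]
            child = (len(left) * entropy(left) + len(right) * entropy(right)) / len(y)
            if parent - child > best[0]:
                best = (parent - child, f, thr)
    return best

def majority(y):
    return Counter(y).most_common(1)[0][0]

def build_tree(X, y, mtry, depth, min_samples, min_gain, rng):
    # End condition: depth limit, too few samples, or a pure node -> leaf.
    if depth == 0 or len(y) < min_samples or len(set(y)) == 1:
        return majority(y)
    # Randomly pick mtry of the F features as split candidates for this node.
    gain, f, thr = best_split(X, y, rng.sample(range(len(X[0])), mtry))
    if f is None or gain < min_gain:
        return majority(y)
    left = [(row, t) for row, t in zip(X, y) if row[f] <= thr]
    right = [(row, t) for row, t in zip(X, y) if row[f] > thr]
    grow = lambda part: build_tree([r for r, _ in part], [t for _, t in part],
                                   mtry, depth - 1, min_samples, min_gain, rng)
    return (f, thr, grow(left), grow(right))

def predict_tree(node, row):
    while isinstance(node, tuple):                      # internal nodes are tuples
        f, thr, left, right = node
        node = left if row[f] <= thr else right
    return node

def train_forest(X, y, ntree=25, mtry=2, depth=6, min_samples=2, min_gain=1e-9, seed=0):
    rng = random.Random(seed)
    forest = []
    for _ in range(ntree):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]   # sample with replacement
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 mtry, depth, min_samples, min_gain, rng))
    return forest

def predict_forest(forest, row):
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]

def make_samples(n_per_class, rng):
    """Hypothetical data: class c is centred at 5*c in each of 4 features."""
    X, y = [], []
    for label in range(3):
        for _ in range(n_per_class):
            X.append([5.0 * label + rng.uniform(-1.0, 1.0) for _ in range(4)])
            y.append(label)
    return X, y

data_rng = random.Random(42)
X_train, y_train = make_samples(20, data_rng)
X_test, y_test = make_samples(10, data_rng)
forest = train_forest(X_train, y_train)
accuracy = sum(predict_forest(forest, r) == t for r, t in zip(X_test, y_test)) / len(y_test)
```

In this sketch `ntree` and `mtry` play the roles described in the text, while `min_samples` and `min_gain` correspond to the minimum number of samples s and the minimum information gain g named in claim 1. The final label for a test sample is the majority vote over all trees.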