TRANSMISSION LINE DEFECT IDENTIFICATION METHOD BASED ON SALIENCY MAP AND SEMANTIC-EMBEDDED FEATURE PYRAMID
20230360390 · 2023-11-09
Inventors
- Qiang Yang (Hangzhou, CN)
- Chao Su (Hangzhou, CN)
- Yuan Cao (Hangzhou, CN)
- Di Jiang (Hangzhou, CN)
- Hao Xu (Hangzhou, CN)
- Kaidi Qiu (Hangzhou, CN)
CPC classification
G06V10/774
PHYSICS
G06T2207/20016
PHYSICS
G06V10/7715
PHYSICS
Y04S10/50
GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
G06T3/4053
PHYSICS
G06V10/464
PHYSICS
International classification
G06T3/40
PHYSICS
G06V10/46
PHYSICS
G06V10/77
PHYSICS
Abstract
The present disclosure provides a transmission line defect identification method based on a saliency map and a semantic-embedded feature pyramid, including the following steps: step 1: cleaning and classifying a dataset; step 2: generating a super-resolution image for a small target of a transmission line by using an Electric Line-Enhanced Super-Resolution Generative Adversarial Network (EL-ESRGAN) model; step 3: performing image saliency detection on the dataset by constructing a U.sup.2-Net; step 4: performing data augmentation on the dataset by using GridMask and random cutout algorithms based on a saliency map, and generating a classified dataset; and step 5: performing image classification on a normal set and a defect set by using a ResNet34 classification algorithm and a deep semantic embedding (DSE)-based feature pyramid classification network.
Claims
1. A transmission line defect identification method based on a saliency map and a semantic-embedded feature pyramid, the method comprising the following steps: 1) taking a target image of a transmission line as a dataset, labeling, based on whether the transmission line has a defect, the dataset as a normal set or a defect set, and classifying the dataset as a small target set or a non-small target set based on a size of the target image and a given threshold; 2) performing image super-resolution expansion on the small target set by using an Electric Line-Enhanced Super-Resolution Generative Adversarial Network (EL-ESRGAN) algorithm, combining the non-small target set and the small target set obtained after image super-resolution expansion, compressing a combined set based on a size of the small target set, and dividing the combined set into a training set and a test set; 3) generating the saliency map of an image in the training set by using a nested saliency detection network (U.sup.2-Net), ensuring integrity of a key region of a detection target by using a morphological expansion algorithm, generating a cutout region randomly for a part whose saliency score is less than a threshold, and padding a pixel randomly to form a data-augmented image set; 4) inputting a data-augmented image and its label into a deep semantic embedding (DSE)-based feature pyramid classification network to perform training to obtain a trained classifier; and 5) obtaining image data of an inspected target of the transmission line in real time, and taking the image data as an input of the trained classifier to output an identification result.
2. The method according to claim 1, wherein performing the image super-resolution expansion on the small target further comprises: defining loss functions of a generator and a discriminator of an EL-ESRGAN model, wherein formulas of the loss functions are as follows:
L_G^Ra = −E_{x_r}[log(1 − D_Ra(x_r, x_f))] − E_{x_f}[log(D_Ra(x_f, x_r))]
L_D^Ra = −E_{x_r}[log(D_Ra(x_r, x_f))] − E_{x_f}[log(1 − D_Ra(x_f, x_r))]
wherein x_r and x_f denote a real image and a generated image respectively, D_Ra(x_r, x_f) = σ(C(x_r) − E_{x_f}[C(x_f)]) is the relativistic average discriminator built from the raw discriminator output C(·), σ denotes a sigmoid function, and E[·] denotes taking an average over the images in a mini-batch.
3. The method according to claim 1, further comprising: building a residual U-block (RSU) network based on a residual block network structure; building, by stacking the RSU network, the U.sup.2-Net composed of 11 stages; generating a saliency score of the target image of the transmission line by using the U.sup.2-Net, ensuring integrity of the key region of the detection target by using the morphological expansion algorithm, and generating an image mask region; and randomly selecting GridMask and random cutout algorithms to perform a cutout operation randomly in the image mask region, and padding the pixel randomly.
4. The method according to claim 1, further comprising a DSE-based feature pyramid classification network comprising: a residual network (ResNet) feature extraction module, wherein an input of the ResNet feature extraction module is the target image of the transmission line, and an output of the ResNet feature extraction module is features of different scales of the image; an enhanced feature pyramid network (EFPN) module, wherein an input of the EFPN module is the features of the different scales that are generated by the ResNet feature extraction module, and an output of the EFPN module is a feature obtained by fusing features of adjacent scales; a DSE module, wherein an input of the DSE module is the fused feature generated by the EFPN module, and an output of the DSE module is a low-resolution feature with rich semantic information and a high-resolution feature with rich position information; a deep feature fusion (DFF) module, wherein an input of the DFF module is the low-resolution feature and the high-resolution feature generated by the DSE module, and an output of the DFF module is a feature vector obtained by performing convolution and pooling operations on the high-resolution feature and the low-resolution feature; and an image object classification network (OC), wherein an input of the OC is the feature vector generated by the DFF module for the high-resolution feature and the low-resolution feature, and an output of the OC is a classification result indicating whether the inspected target of the transmission line is faulty.
5. The method according to claim 1, wherein the dataset is an insulator self-explosion dataset of the transmission line.
Description
BRIEF DESCRIPTION OF THE DRAWINGS
[0042] In order to describe the technical solutions in the embodiments of the present disclosure more clearly, the accompanying drawings required for describing the embodiments are briefly described below. Obviously, the accompanying drawings in the following description show merely some embodiments of the present disclosure, and a person of ordinary skill in the art can further derive other accompanying drawings from these accompanying drawings without creative efforts.
DETAILED DESCRIPTION OF THE EMBODIMENTS
[0054] The technical solutions of the embodiments of the present disclosure are clearly and completely described below with reference to the accompanying drawings. Apparently, the described embodiments are merely a part rather than all of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present disclosure without creative efforts shall fall within the protection scope of the present disclosure.
[0055] As shown in
[0061] In the step 2), a main structure of a generator G in an EL-ESRGAN is shown in
[0062] As shown in
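The adversarial objectives of the EL-ESRGAN in claim 2 follow the relativistic average GAN formulation used by ESRGAN. The following NumPy sketch is illustrative only: the function name is not from the disclosure, and `d_real`/`d_fake` are assumed to be raw discriminator scores C(x) for batches of real and generated images.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def el_esrgan_ragan_losses(d_real, d_fake):
    """Relativistic average GAN losses (ESRGAN-style).

    d_real, d_fake: raw discriminator scores C(x_r), C(x_f).
    D_Ra(x_r, x_f) = sigmoid(C(x_r) - E[C(x_f)]), and symmetrically
    D_Ra(x_f, x_r) = sigmoid(C(x_f) - E[C(x_r)]).
    """
    d_ra_real = sigmoid(d_real - d_fake.mean())   # D_Ra(x_r, x_f)
    d_ra_fake = sigmoid(d_fake - d_real.mean())   # D_Ra(x_f, x_r)
    # Discriminator: push real scores relatively up, fake scores relatively down.
    loss_d = -np.mean(np.log(d_ra_real)) - np.mean(np.log(1.0 - d_ra_fake))
    # Generator: the symmetric adversarial counterpart.
    loss_g = -np.mean(np.log(1.0 - d_ra_real)) - np.mean(np.log(d_ra_fake))
    return loss_g, loss_d
```

When the discriminator already separates real from generated images well, loss_d is near zero while loss_g is large, which is the gradient signal that drives the generator.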
[0063] In the step 3), the U.sup.2-Net is used to generate the saliency map of the image. A structure of the U.sup.2-Net is shown in
[0064] A structure of the RSU network is shown in
[0065] In
[0066] In the step 3), the saliency map of the U.sup.2-Net is used to guide image augmentation. A corresponding algorithm is implemented according to the following steps:
[0067] a) Take the target image img and the saliency grayscale map generated by the U.sup.2-Net as inputs of the algorithm.
[0068] b) Calculate the median grayscale value θ of the grayscale map and the saliency value grayscale.sub.ij of each pixel (i, j), and binarize the map according to the formula binary={grayscale.sub.ij≥θ}, i∈width, j∈height, to determine whether an identified target is present.
[0069] c) Perform two morphological expansion calculations S=(binary⊕B)⊕B on the identified target, where B represents a 5×5 structural element of the expanded protection region, and ⊕ represents the morphological expansion operation.
[0070] d) Randomly select between the GridMask and random cutout algorithms for a non-empty mask set: if the GridMask algorithm is selected, randomly select a grid number d of the GridMask algorithm and an information retention ratio r.sub.g within a grid, and generate the mask set; if the random cutout algorithm is selected, randomly select an area ratio s of the region to be cut out and an aspect ratio r.sub.c of the cutout region, and generate the mask set.
[0071] e) Cut out the pixel information of the mask set from the original image, and randomly pad the pixels to obtain the data-augmented image img.
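Steps a) to e) above can be sketched in NumPy. The sketch below covers only the random cutout branch for a grayscale image; the function names, the 100-iteration retry loop, and the test pattern in the usage example are illustrative assumptions, not part of the disclosure.

```python
import numpy as np

def dilate(mask, k=5, iterations=2):
    """Naive binary morphological expansion with a k x k structural element B.
    Two iterations implement S = (binary ⊕ B) ⊕ B from step c)."""
    pad = k // 2
    out = mask.astype(bool)
    for _ in range(iterations):
        padded = np.pad(out, pad)
        acc = np.zeros_like(out)
        for di in range(k):
            for dj in range(k):
                acc |= padded[di:di + out.shape[0], dj:dj + out.shape[1]]
        out = acc
    return out

def saliency_guided_cutout(img, saliency, rng, s=0.02, r_c=1.0):
    """Cut out a random region only outside the dilated salient region."""
    theta = np.median(saliency)                 # median grayscale value θ, step b)
    protected = dilate(saliency >= theta)       # binarize, then expand, step c)
    h, w = img.shape[:2]
    ch = int(round(np.sqrt(s * h * w * r_c)))   # cutout height from area ratio s
    cw = int(round(np.sqrt(s * h * w / r_c)))   # cutout width from aspect ratio r_c
    out = img.copy()
    for _ in range(100):                        # retry until the box misses the target
        i = int(rng.integers(0, h - ch))
        j = int(rng.integers(0, w - cw))
        if not protected[i:i + ch, j:j + cw].any():
            out[i:i + ch, j:j + cw] = rng.integers(0, 256, (ch, cw))  # random padding, step e)
            break
    return out

# Usage: with a left-to-right saliency ramp, the cutout only ever lands
# in the low-saliency left half of the image.
rng = np.random.default_rng(0)
img = np.zeros((64, 64), dtype=np.uint8)
sal = np.tile(np.arange(64, dtype=np.uint8), (64, 1))
aug = saliency_guided_cutout(img, sal, rng)
```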
[0072] In the step 4), the DSE-based enhanced feature pyramid classification network is shown in
[0073] In the DFF module, feature processing of a high-level feature map is completed by two residual blocks and a bypass connection. A configuration of the residual block is shown in Table 1. After convolution, each layer is connected to one batch normalization layer and one ReLU activation layer of nonlinear transformation.
TABLE-US-00001
TABLE 1
  Residual block   Layer No.   Convolution    Pixels padded   Quantity of      Quantity of
                               kernel size    at an edge      input channels   output channels
  Main channel     Conv1       1 × 1          0               256              64
                   Conv2       3 × 3          1               64               64
                   Conv3       3 × 3          1               64               256
  Bypass channel   Conv4       1 × 1          0               256              256
[0074] Feature processing of a low-level feature map has a similar structure to that of the high-level feature map, except that an atrous convolution residual block instead of the original residual block is used. A configuration of the atrous convolution residual block is shown in Table 2. After convolution, each layer is connected to one batch normalization layer and one ReLU activation layer of nonlinear transformation.
TABLE-US-00002
TABLE 2
  Residual block   Layer No.       Convolution    Dilation   Quantity of      Quantity of
                                   kernel size    rate       input channels   output channels
  Main channel     Atrous-Conv1    1 × 1          1          256              64
                   Atrous-Conv2    3 × 3          3          64               64
                   Atrous-Conv3    3 × 3          5          64               256
  Bypass channel   Atrous-Conv4    1 × 1          1          256              256
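The channel flow and size preservation in Tables 1 and 2 can be checked with the standard convolution output-size formula. The sketch below assumes the atrous layers use "same" padding of d·(k−1)/2, since Table 2 does not give a padding column.

```python
def conv_out(n, k, pad, dilation=1, stride=1):
    """Spatial output size of a convolution (standard formula)."""
    return (n + 2 * pad - dilation * (k - 1) - 1) // stride + 1

# (kernel, explicit pad or None, dilation, in_ch, out_ch) per Tables 1 and 2
residual_block = [(1, 0, 1, 256, 64), (3, 1, 1, 64, 64), (3, 1, 1, 64, 256)]
atrous_block   = [(1, None, 1, 256, 64), (3, None, 3, 64, 64), (3, None, 5, 64, 256)]

def check(block, n=56):
    ch = block[0][3]                        # channels entering the main path
    for k, pad, d, c_in, c_out in block:
        if pad is None:                     # assumed 'same' padding for atrous layers
            pad = d * (k - 1) // 2
        assert c_in == ch                   # channel flow 256 -> 64 -> 64 -> 256
        assert conv_out(n, k, pad, d) == n  # spatial size is preserved
        ch = c_out
    return ch

assert check(residual_block) == 256  # matches the 1x1 bypass (256 -> 256)
assert check(atrous_block) == 256
```

Because both paths preserve the spatial size and end at 256 channels, the main-channel output can be summed element-wise with the 1 × 1 bypass, which is what makes the residual addition well-defined.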
[0075] The ResNet34 is taken as a benchmark to carry out an ablation experiment for each module in the present disclosure. Experimental results are shown in
[0076] In defect identification of the transmission line, it is necessary to improve the recall rate on the premise of ensuring accuracy, so as to find as many faults as possible and reduce potential risks to transmission safety. Therefore, an F-Score is introduced as an evaluation indicator that combines the accuracy P and the recall rate R, and is defined as follows:

F_β = (1 + β²)·P·R/(β²·P + R)

wherein β = 1 yields the F1-Score, which weights accuracy and recall equally, and β = 2 yields the F2-Score, which weights the recall rate more heavily.
[0077] In the present disclosure, the accuracy, recall rate, and F-Scores of each model are shown in Table 3:
TABLE-US-00003
TABLE 3
  EFPN   DSE   DFF   Accuracy   Recall rate   F1-Score   F2-Score
  ✓      ✓     ✓     0.9619     0.9469        0.9544     0.9499
                     0.9610     0.9054        0.9324     0.9160
  ✓                  0.9665     0.9369        0.9515     0.9427
         ✓           0.9545     0.9269        0.9405     0.9323
               ✓     0.9721     0.9088        0.9394     0.9208
  ✓      ✓           0.9554     0.9405        0.9479     0.9435
  ✓            ✓     0.9610     0.9341        0.9473     0.9393
         ✓     ✓     0.9619     0.9358        0.9487     0.9409
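The F1 and F2 entries in Table 3 can be reproduced to rounding from the accuracy and recall columns using the standard F.sub.β formula; a short sketch, treating the "Accuracy" column as the precision term P:

```python
def f_score(p, r, beta):
    """F_beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)

# First row of Table 3: all three modules (EFPN, DSE, DFF) enabled.
p, r = 0.9619, 0.9469
assert abs(f_score(p, r, 1) - 0.9544) < 1e-3   # F1-Score column
assert abs(f_score(p, r, 2) - 0.9499) < 1e-3   # F2-Score column
```

The small residual differences come from the table reporting P and R rounded to four decimal places.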
[0078] In order to detect all defects of the transmission line as far as possible and avoid potential power failure risks, the F2-Score, which weights the recall rate more heavily and thus tends to uncover as many potential risks as possible, is taken as the evaluation indicator. The DSE-based enhanced feature pyramid classification network proposed in the present disclosure can therefore find more potential risks.
[0079] Experimental results of the ablation of the data augmentation described in the step 2) and the step 3) of the present disclosure are shown in Table 4:
TABLE-US-00004
TABLE 4
  Model                                  Accuracy     Accuracy of      Accuracy of
                                                      the normal set   the defect set
  ResNet34                               0.8801       0.9545           0.6507
                                         (+0.0000)    (+0.0000)        (+0.0000)
  ResNet34 + data augmentation           0.8947       0.9610           0.6905
                                         (+0.0146)    (+0.0065)        (+0.0398)
  DSE-based enhanced feature pyramid     0.9228       0.9610           0.7991
  classification network                 (+0.0427)    (+0.0065)        (+0.1484)
  Feature pyramid classification         0.9305       0.9619           0.8338
  network + data augmentation            (+0.0504)    (+0.0074)        (+0.1831)
[0080] It can be seen from Table 4 that the data augmentation method improves the accuracy of the defect set most effectively, because the augmentation decouples more background factors from the identified target and thereby improves the subset whose classification accuracy is low.
[0081] The foregoing embodiments are only used to explain the technical solutions of the present disclosure, and are not intended to limit the same. Although the present disclosure is described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions described in the foregoing embodiments, or make equivalent substitutions on some technical features therein. These modifications or substitutions do not make the essence of the corresponding technical solutions deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure. The present disclosure is not limited to the above-mentioned optional implementations, and anyone can derive other products in various forms under the enlightenment of the present disclosure. The above-mentioned specific implementations should not be construed as limiting the protection scope of the present disclosure, and the protection scope of the present disclosure should be defined by the claims. Moreover, the description can be used to interpret the claims.
[0082] The preferred embodiments of the present disclosure disclosed above are only used to help illustrate the present disclosure. The preferred embodiments neither describe all the details in detail, nor limit the present disclosure to the specific implementations described. Obviously, many modifications and changes may be made based on the content of the present specification. In the present specification, these embodiments are selected and specifically described to better explain the principle and practical application of the present disclosure, so that a person skilled in the art can well understand and use the present disclosure. The present disclosure is only limited by the claims and a full scope and equivalents thereof.