On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-03-17
Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, Yonghui HUANG. Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images[J]. Journal of Zhejiang University Science A, 2023, 24(3): 243-256.
@article{title="Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images",
author="Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, Yonghui HUANG",
journal="Journal of Zhejiang University Science A",
volume="24",
number="3",
pages="243-256",
year="2023",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A2200175"
}
%0 Journal Article
%T Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images
%A Zonghan MU
%A Yong QIN
%A Chongchong YU
%A Yunpeng WU
%A Zhipeng WANG
%A Huaizhi YANG
%A Yonghui HUANG
%J Journal of Zhejiang University SCIENCE A
%V 24
%N 3
%P 243-256
%@ 1673-565X
%D 2023
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A2200175
TY - JOUR
T1 - Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images
A1 - Zonghan MU
A1 - Yong QIN
A1 - Chongchong YU
A1 - Yunpeng WU
A1 - Zhipeng WANG
A1 - Huaizhi YANG
A1 - Yonghui HUANG
JO - Journal of Zhejiang University Science A
VL - 24
IS - 3
SP - 243
EP - 256
SN - 1673-565X
Y1 - 2023
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2200175
ER -
Abstract: Bridges are an important part of railway infrastructure and require regular inspection and maintenance. Inspecting railway infrastructure with unmanned aerial vehicles (UAVs) is an active research topic. However, UAV images are large, and variations in flight distance and height cause the object scale to change dramatically. At the same time, the elements of interest on railway bridges, such as bolts and corrosion, are small and densely distributed, and the sample data set is severely imbalanced, which makes accurate defect detection very challenging. This paper proposes an adaptive cropping shallow attention network (ACSANet), which combines an adaptive cropping strategy for large UAV images with a shallow attention network for small-object detection from limited samples. To improve accuracy and generalization, the shallow attention network integrates a coordinate attention (CA) module and an alpha intersection over union (α-IOU) loss function; the model is then used to detect defects in the bolts, steel surfaces, and railings of railway bridges. Test results show that ACSANet outperforms a YOLOv5s model that uses the adaptive cropping strategy by 5% in total mean average precision (mAP) and by 30% in missing-bolt mAP. Compared with a YOLOv5s model that uses a common cropping strategy, total mAP and missing-bolt mAP improve by 10% and 60%, respectively; compared with a YOLOv5s model without any cropping strategy, they improve by 40% and 67%, respectively.
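As a rough illustration of the α-IOU loss named in the abstract, the sketch below computes the basic power form, loss = 1 − IOU^α, for a pair of axis-aligned boxes. It is not the authors' implementation: the abstract does not say which α-IOU variant or which value of α ACSANet uses, so the function names, the (x1, y1, x2, y2) box format, and the default α = 3 are all assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic alpha-IOU loss: 1 - IOU**alpha (alpha=3.0 is an assumed default).

    With alpha = 1 this reduces to the ordinary IOU loss, 1 - IOU.
    """
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 0.0, 3.0, 2.0)  # overlaps pred with IOU = 1/3
    print(alpha_iou_loss(pred, target))             # alpha = 3
    print(alpha_iou_loss(pred, target, alpha=1.0))  # plain IOU loss
```

For the two boxes in the usage example the IOU is 1/3, so the loss is 1 − 1/27 ≈ 0.963 with α = 3 and 2/3 ≈ 0.667 with α = 1. Raising the IOU to a power greater than one reweights the loss and its gradient toward boxes that already overlap the target well, which is the general motivation for power-IOU losses in bounding-box regression.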