JZUS - Journal of Zhejiang University SCIENCE

Journal of Zhejiang University SCIENCE A

Accepted manuscript available online (unedited version)

Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images

Author(s): Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, Yonghui HUANG
Affiliation(s): State Key Lab of Rail Traffic Control & Safety, Beijing Jiaotong University, Beijing 100091, China
Corresponding email(s): yqin@bjtu.edu.cn
Key Words: Railway; Bridge; Unmanned aerial vehicle (UAV) image; Small object detection; Defect detection


Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, Yonghui HUANG. Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images[J]. Journal of Zhejiang University Science A, in press. https://doi.org/10.1631/jzus.A2200175

@article{title="Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images",
author="Zonghan MU, Yong QIN, Chongchong YU, Yunpeng WU, Zhipeng WANG, Huaizhi YANG, Yonghui HUANG",
journal="Journal of Zhejiang University Science A",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/jzus.A2200175"
}

%0 Journal Article
%T Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images
%A Zonghan MU
%A Yong QIN
%A Chongchong YU
%A Yunpeng WU
%A Zhipeng WANG
%A Huaizhi YANG
%A Yonghui HUANG
%J Journal of Zhejiang University SCIENCE A
%P 243-256
%@ 1673-565X
%D in press
%I Zhejiang University Press & Springer
%R https://doi.org/10.1631/jzus.A2200175

TY - JOUR
T1 - Adaptive cropping shallow attention network for defect detection of bridge girder steel using unmanned aerial vehicle images
A1 - Zonghan MU
A1 - Yong QIN
A1 - Chongchong YU
A1 - Yunpeng WU
A1 - Zhipeng WANG
A1 - Huaizhi YANG
A1 - Yonghui HUANG
JO - Journal of Zhejiang University Science A
SP - 243
EP - 256
SN - 1673-565X
Y1 - in press
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2200175
ER -


Abstract: Bridges are an important part of railway infrastructure and need regular inspection and maintenance. Using unmanned aerial vehicle (UAV) technology to inspect railway infrastructure is an active research topic. However, because UAV images are large and flight distance and height vary, the object scale changes dramatically. At the same time, the elements of interest in railway bridges, such as bolts and corrosion, are small and dense objects, and the sample data set is seriously unbalanced, which makes accurate defect detection very challenging. In this paper, an adaptive cropping shallow attention network (ACSANet) is proposed, which includes an adaptive cropping strategy for large UAV images and a shallow attention network for small object detection with limited samples. To enhance the accuracy and generalization of the model, the shallow attention network integrates a coordinate attention (CA) mechanism module and an alpha intersection over union (α-IOU) loss function, and then carries out defect detection on the bolts, steel surfaces, and railings of railway bridges. The test results show that the ACSANet model outperforms the YOLOv5s model using the adaptive cropping strategy in terms of total mean average precision (mAP) and missing bolt mAP by 5% and 30%, respectively. Compared with the YOLOv5s model that adopts the common cropping strategy, the total mAP and missing bolt mAP are improved by 10% and 60%, respectively. Compared with the YOLOv5s model without any cropping strategy, the total mAP and missing bolt mAP are improved by 40% and 67%, respectively.
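To make the α-IOU loss named in the abstract concrete, here is a minimal sketch of its basic form, 1 − IoU^α, for two axis-aligned boxes. The function names, the (x1, y1, x2, y2) box layout, and the default α = 3 are illustrative assumptions; the abstract does not state which α-IOU variant or α value ACSANet uses, so this should not be read as the authors' implementation.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def alpha_iou_loss(pred, target, alpha=3.0):
    """Basic alpha-IoU loss: 1 - IoU**alpha (alpha=3.0 is an assumed default)."""
    return 1.0 - iou(pred, target) ** alpha


if __name__ == "__main__":
    pred = (0.0, 0.0, 2.0, 2.0)
    target = (1.0, 1.0, 3.0, 3.0)
    print(iou(pred, target))
    print(alpha_iou_loss(pred, target))
```

With α > 1 the loss of a poorly overlapping box sits closer to 1 than the plain IoU loss would put it, while a perfect match still gives 0; setting `alpha=1.0` recovers the ordinary IoU loss.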

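Adaptive cropping shallow attention network for defect detection in UAV images of railway bridge steel structures

Authors: Zonghan MU¹,², Yong QIN¹, Chongchong YU³, Yunpeng WU⁴, Zhipeng WANG¹, Huaizhi YANG⁵, Yonghui HUANG⁵
Affiliations: ¹State Key Lab of Rail Traffic Control & Safety, Beijing Jiaotong University, Beijing 100091, China; ²School of Traffic and Transportation, Beijing Jiaotong University, Beijing 100091, China; ³School of Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China; ⁴School of Safety Engineering and Emergency Management, Shijiazhuang Tiedao University, Hebei 050043, China; ⁵Beijing-Shanghai High Speed Railway Co., Ltd., Beijing 100038, China
Objective: Bridge steel structures and the high-strength bolts on them are exposed to wind and rain over long periods, so corrosion and missing parts are common, while manual inspection is inefficient, dangerous, and leaves many visual blind spots. This paper aims to use UAV photography to identify and detect the targets contained in images of railway bridge steel structures (normal nut, normal bolt, missing bolt, missing nut, steel surface corrosion, and steel railing corrosion), so as to improve the accuracy and efficiency of railway bridge inspection.
Innovations: 1. An adaptive image cropping method is proposed that adjusts the tile size and the area of the overlap between crops according to the specific image, which removes the negative effects of variable UAV shooting distance and focal length and improves small-object detection; 2. Based on the characteristics of the objects to be detected on railway bridge steel structures, a shallow attention network is proposed that makes the model focus more on the shallow features of these objects, making corroded regions easier to detect; 3. A coordinate attention (CA) module is integrated into the shallow attention network to help the network locate defect regions in wide-area UAV scenes; 4. The alpha intersection over union (α-IOU) loss function is integrated into the shallow attention network to improve training and test accuracy on the small railway bridge steel structure dataset.
Methods: 1. An adaptive image cropping strategy is proposed to process large UAV images into smaller images in which the network can more easily detect defect targets; 2. The YOLO network is modified to obtain a shallow attention network that focuses more on shallow features, improving detection accuracy for corrosion and missing parts; 3. The CA attention mechanism and the α-IOU loss function are integrated into the shallow attention network to improve detection accuracy.
Conclusions: 1. On a small dataset, the ratio of the input image to the target to be detected has a clear effect on the final detection result. In the dataset used in this study, for image-to-main-target ratios between 20:1 and 80:1, 50:1 acts as a boundary: above 50:1 the accuracy varies considerably while the training time is essentially unchanged, and below 50:1 the accuracy is essentially unchanged while the training time varies considerably. There is therefore a critical point in training at which training efficiency and test results are best. 2. Deeper networks interfere with the detection accuracy for small targets with few samples and simple features; comparing results obtained with the same strategies but different network structures, ACSANet improves missing-bolt accuracy by nearly 10% over ACNet+CA+α-IOU. 3. Because different attention mechanisms attend in different directions, they do not necessarily improve detection accuracy; a suitable attention mechanism and loss function give better detection of targets in UAV images of railway bridge steel structures, whereas an unsuitable attention mechanism degrades detection.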
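The cropping idea in the summary (choose the tile size from the image-to-target ratio and let neighbouring tiles overlap) can be sketched as follows. This is only an illustration under stated assumptions: the ratio is read as a side-length ratio, the typical target size in pixels is assumed to be known in advance, and the names `tile_origins`, `adaptive_crop_boxes`, `target_px`, and `overlap_frac` are hypothetical. The summary does not specify how ACSANet actually derives the tile size or the overlap.

```python
def tile_origins(length, tile, overlap):
    """Start offsets of tiles of size `tile` that cover [0, length).

    Neighbouring tiles share at least `overlap` pixels; the last tile is
    aligned with the far edge so no pixels are left uncovered.
    """
    if tile >= length:
        return [0]
    stride = tile - overlap
    if stride <= 0:
        raise ValueError("overlap must be smaller than the tile size")
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)
    return origins


def adaptive_crop_boxes(img_w, img_h, target_px, ratio=50, overlap_frac=0.2):
    """Crop boxes (x1, y1, x2, y2) whose side is about `ratio` times the target.

    ratio=50 mirrors the 50:1 boundary reported in the conclusions; treating
    it as a side-length ratio is an assumption of this sketch.
    """
    tile = int(round(ratio * target_px))
    tile_w, tile_h = min(tile, img_w), min(tile, img_h)
    xs = tile_origins(img_w, tile_w, int(tile_w * overlap_frac))
    ys = tile_origins(img_h, tile_h, int(tile_h * overlap_frac))
    return [(x, y, x + tile_w, y + tile_h) for y in ys for x in xs]


if __name__ == "__main__":
    # Hypothetical 4000x3000 UAV image with bolts roughly 20 px across.
    for box in adaptive_crop_boxes(4000, 3000, target_px=20, overlap_frac=0.25):
        print(box)
```

Aligning the last tile with the image border keeps every tile the same size, at the cost of a larger overlap on the final row and column.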

Key words: Railway; Bridge; Unmanned aerial vehicle (UAV) image; Small object detection; Defect detection


