JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot

Author(s): Manjun TIAN, Xiali LI, Shihan KONG, Licheng WU, Junzhi YU
Affiliation(s): First Research Institute of the Ministry of Public Security of PRC, Beijing 100048, China
Corresponding email(s): tianmanjun2018@163.com, xiaer_li@163.com, kongshihan@pku.edu.cn, wulicheng@tsinghua.edu.cn, junzhi.yu@ia.ac.cn
Key Words: Object detection; Aquatic environment; Garbage cleaning robot; Modified YOLOv4


Manjun TIAN, Xiali LI, Shihan KONG, Licheng WU, Junzhi YU. A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2100473

@article{title="A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot",
author="Manjun TIAN, Xiali LI, Shihan KONG, Licheng WU, Junzhi YU",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.2100473"
}

%0 Journal Article
%T A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot
%A Manjun TIAN
%A Xiali LI
%A Shihan KONG
%A Licheng WU
%A Junzhi YU
%J Frontiers of Information Technology & Electronic Engineering
%P 1217-1228
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%U https://doi.org/10.1631/FITEE.2100473

TY - JOUR
T1 - A modified YOLOv4 detection method for a vision-based underwater garbage cleaning robot
A1 - Manjun TIAN
A1 - Xiali LI
A1 - Shihan KONG
A1 - Licheng WU
A1 - Junzhi YU
JO - Frontiers of Information Technology & Electronic Engineering
SP - 1217
EP - 1228
SN - 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2100473
ER -


Abstract: To tackle the problem of aquatic environment pollution, a vision-based autonomous underwater garbage cleaning robot has been developed in our laboratory. We propose a garbage detection method based on a modified YOLOv4 that allows high-speed, high-precision object detection. Specifically, the YOLOv4 algorithm is chosen as the basic neural network framework for object detection. To further improve detection accuracy, YOLOv4 is extended into a four-scale detection method. To improve detection speed, model pruning is applied to the new model. With the improved detection method, the robot can collect garbage autonomously. Experimental results show that the improved YOLOv4 reaches a detection speed of up to 66.67 frames/s with a mean average precision (mAP) of 95.099%, so it performs well in both speed and accuracy.
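To put the figures in the abstract in concrete terms, the following minimal sketch converts the reported frame rate into a per-frame time budget and lists the detection grid sizes for three-scale and four-scale heads. The 416x416 input size and the stride-4 fourth head are assumptions made for illustration only; the abstract does not state the input resolution or where the fourth scale is attached. Stock YOLOv4 predicts at strides 32, 16, and 8.

```python
from typing import List, Sequence


def frame_time_ms(fps: float) -> float:
    """Per-frame time budget in milliseconds for a given frame rate."""
    return 1000.0 / fps


def grid_sizes(input_size: int, strides: Sequence[int]) -> List[int]:
    """Side length of the square detection grid at each stride."""
    return [input_size // s for s in strides]


# Assumed values (not given in the abstract):
INPUT_SIZE = 416              # hypothetical square input resolution
THREE_SCALE = (32, 16, 8)     # strides of the stock YOLOv4 heads
FOUR_SCALE = (32, 16, 8, 4)   # hypothetical extra stride-4 head

if __name__ == "__main__":
    # 66.67 frames/s is the speed reported in the abstract.
    print(f"time per frame: {frame_time_ms(66.67):.2f} ms")
    print("three-scale grids:", grid_sizes(INPUT_SIZE, THREE_SCALE))
    print("four-scale grids:", grid_sizes(INPUT_SIZE, FOUR_SCALE))
```

Under these assumptions, the extra head adds a finer grid, which is the usual motivation for an additional scale when targets are small; the added computation is what pruning would then need to offset.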

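A visual detection algorithm for an underwater garbage cleaning robot based on an improved YOLOv4

Manjun TIAN¹,², Xiali LI², Shihan KONG³, Licheng WU², Junzhi YU³,⁴
¹First Research Institute of the Ministry of Public Security of PRC, Beijing 100048, China
²School of Information Engineering, Minzu University of China, Beijing 100081, China
³Department of Advanced Manufacturing and Robotics, College of Engineering, Peking University, Beijing 100871, China
⁴State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
Abstract: To address water pollution, and building on a vision-based autonomous underwater garbage cleaning robot, a garbage detection method based on an improved YOLOv4 is proposed that achieves high-speed, high-precision object detection. Specifically, the YOLOv4 algorithm is chosen as the basic neural network framework for object detection. To further improve detection accuracy, the conventional YOLOv4 is modified into a four-scale detection algorithm; to improve detection speed, model pruning is applied to the new model. The proposed method is also deployed on the underwater robot, enabling autonomous garbage collection. The detection speed reaches 66.67 frames/s and the mean average precision (mAP) reaches 95.099%; experimental results show that the improved YOLOv4 algorithm performs well in both detection speed and accuracy.

Key words: Object detection; Aquatic environment; Garbage cleaning robot; Improved YOLOv4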


