CLC number: TP391.4
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-04-09
Citations: Bibtex RefMan EndNote GB/T7714
Xiaofei QIN, Wenkai HU, Chen XIAO, Changxiang HE, Songwen PEI, Xuedian ZHANG. Attention-based efficient robot grasp detection network[J]. Frontiers of Information Technology & Electronic Engineering, 2023, 24(10): 1430-1444.
@article{title="Attention-based efficient robot grasp detection network",
author="Xiaofei QIN and Wenkai HU and Chen XIAO and Changxiang HE and Songwen PEI and Xuedian ZHANG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="24",
number="10",
pages="1430-1444",
year="2023",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200502"
}
%0 Journal Article
%T Attention-based efficient robot grasp detection network
%A Xiaofei QIN
%A Wenkai HU
%A Chen XIAO
%A Changxiang HE
%A Songwen PEI
%A Xuedian ZHANG
%J Frontiers of Information Technology & Electronic Engineering
%V 24
%N 10
%P 1430-1444
%@ 2095-9184
%D 2023
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2200502
TY - JOUR
T1 - Attention-based efficient robot grasp detection network
A1 - Xiaofei QIN
A1 - Wenkai HU
A1 - Chen XIAO
A1 - Changxiang HE
A1 - Songwen PEI
A1 - Xuedian ZHANG
JO - Frontiers of Information Technology & Electronic Engineering
VL - 24
IS - 10
SP - 1430
EP - 1444
SN - 2095-9184
Y1 - 2023
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2200502
ER -
Abstract: To balance the inference speed and detection accuracy of a grasp detection algorithm, both of which are important for robot grasping tasks, we propose an encoder-decoder structured pixel-level grasp detection neural network named the attention-based efficient robot grasp detection network (AE-GDN). Three spatial attention modules are introduced in the encoder stages to enhance the detailed information, and three channel attention modules are introduced in the decoder stages to extract more semantic information. Several lightweight and efficient DenseBlocks are used to connect the encoder and decoder paths to improve the feature modeling capability of AE-GDN. A high intersection over union (IoU) value between the predicted grasp rectangle and the ground truth does not necessarily mean a high-quality grasp configuration, and may even correspond to a grasp that causes a collision. This is because traditional IoU loss calculation methods treat the center part of the predicted rectangle as having the same importance as the area around the grippers. We design a new IoU loss calculation method based on an hourglass box matching mechanism, which creates good correspondence between high IoUs and high-quality grasp configurations. AE-GDN achieves accuracies of 98.9% and 96.6% on the Cornell and Jacquard datasets, respectively. The inference speed reaches 43.5 frames per second with only about 1.2×10^6 parameters. The proposed AE-GDN has also been deployed on a practical robotic arm grasping system and performs grasping well. Code is available at https://github.com/robvincen/robot_gradet.
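To make the IoU quantity discussed in the abstract concrete, the following is a minimal sketch of the plain (traditional) IoU between two oriented grasp rectangles. It is not the hourglass box matching loss proposed in the paper, which the abstract does not specify in enough detail to reproduce. The parameterization (center, width, height, angle in radians) and the function names `grasp_corners`, `clip`, and `grasp_iou` are assumptions made for illustration only.

```python
import math

def grasp_corners(cx, cy, w, h, theta):
    """Corners (counter-clockwise) of a w-by-h rectangle centered at (cx, cy), rotated by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]

def polygon_area(pts):
    """Shoelace formula: absolute area of a simple polygon (0 for an empty list)."""
    acc = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        acc += x1 * y2 - x2 * y1
    return abs(acc) / 2.0

def clip(subject, clipper):
    """Sutherland-Hodgman clipping of `subject` against a convex, counter-clockwise `clipper`."""
    output = list(subject)
    for i in range(len(clipper)):
        if not output:
            break
        a, b = clipper[i], clipper[(i + 1) % len(clipper)]

        def side(p):
            # Positive when p lies to the left of the directed edge a -> b (inside).
            return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

        inp, output = output, []
        for j in range(len(inp)):
            p, q = inp[j - 1], inp[j]
            sp, sq = side(p), side(q)
            if (sp < 0) != (sq < 0):
                # The segment p -> q crosses the edge line; add the crossing point.
                t = sp / (sp - sq)
                output.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
            if sq >= 0:
                output.append(q)
    return output

def grasp_iou(rect_a, rect_b):
    """Plain IoU of two grasp rectangles given as (cx, cy, w, h, theta) tuples."""
    poly_a, poly_b = grasp_corners(*rect_a), grasp_corners(*rect_b)
    inter = polygon_area(clip(poly_a, poly_b))
    union = polygon_area(poly_a) + polygon_area(poly_b) - inter
    return inter / union if union > 0 else 0.0

if __name__ == "__main__":
    truth = (0.0, 0.0, 4.0, 2.0, 0.0)
    shifted = (1.0, 0.0, 4.0, 2.0, 0.0)
    print(grasp_iou(truth, shifted))
```

Because this score weights every pixel of the overlap equally, a prediction can overlap the ground truth heavily in the center while its ends, where the gripper plates would sit, are displaced. That is the limitation of traditional IoU that the abstract describes.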