
Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Attention-based efficient robot grasp detection network

Abstract: To balance the inference speed and detection accuracy of a grasp detection algorithm, both of which are important for robot grasping tasks, we propose an encoder–decoder structured pixel-level grasp detection neural network named the attention-based efficient robot grasp detection network (AE-GDN). Three spatial attention modules are introduced in the encoder stages to enhance detailed information, and three channel attention modules are introduced in the decoder stages to extract more semantic information. Several lightweight and efficient DenseBlocks are used to connect the encoder and decoder paths to improve the feature modeling capability of AE-GDN. A high intersection over union (IoU) value between the predicted grasp rectangle and the ground truth does not necessarily mean a high-quality grasp configuration, and might even cause a collision, because traditional IoU loss calculation methods treat the center part of the predicted rectangle as being as important as the area around the grippers. We design a new IoU loss calculation method based on an hourglass box matching mechanism, which creates a good correspondence between high IoU values and high-quality grasp configurations. AE-GDN achieves accuracies of 98.9% and 96.6% on the Cornell and Jacquard datasets, respectively. The inference speed reaches 43.5 frames per second with only about 1.2×10^6 parameters. The proposed AE-GDN has also been deployed on a practical robotic arm grasping system and performs grasping well. The code is available at https://github.com/robvincen/robot_gradet.
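The hourglass box matching idea can be illustrated with a short sketch. The snippet below is a minimal, hedged interpretation rather than the authors' implementation: overlap is scored only on the two end regions of the oriented grasp rectangle, where the gripper jaws act, so a prediction that merely covers the center of the labeled rectangle no longer receives a high score. The (center, width, height, angle) parameterization, the plate_frac fraction, the helper names, and the use of shapely are illustrative assumptions.

# Sketch of an hourglass-style IoU between grasp rectangles (assumption-based,
# not the paper's code). Only the two "gripper plate" end strips of each
# oriented rectangle contribute to the overlap score.
from shapely.geometry import Polygon
from shapely.affinity import rotate, translate


def grasp_rect(cx, cy, w, h, angle_deg):
    """Oriented grasp rectangle: w along the gripper-closing axis, h is the jaw size."""
    box = Polygon([(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)])
    return translate(rotate(box, angle_deg, origin=(0, 0)), cx, cy)


def gripper_regions(cx, cy, w, h, angle_deg, plate_frac=0.25):
    """Keep only the two end strips of the rectangle, i.e., where the jaws close."""
    pw = plate_frac * w
    left = Polygon([(-w / 2, -h / 2), (-w / 2 + pw, -h / 2), (-w / 2 + pw, h / 2), (-w / 2, h / 2)])
    right = Polygon([(w / 2 - pw, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (w / 2 - pw, h / 2)])
    region = left.union(right)
    return translate(rotate(region, angle_deg, origin=(0, 0)), cx, cy)


def rect_iou(pred, gt):
    """Conventional IoU over the full rectangles, for comparison."""
    p, g = grasp_rect(*pred), grasp_rect(*gt)
    return p.intersection(g).area / p.union(g).area


def hourglass_iou(pred, gt, plate_frac=0.25):
    """IoU computed over the gripper-plate regions only."""
    p = gripper_regions(*pred, plate_frac=plate_frac)
    g = gripper_regions(*gt, plate_frac=plate_frac)
    union = p.union(g).area
    return p.intersection(g).area / union if union > 0 else 0.0


if __name__ == "__main__":
    gt = (50.0, 50.0, 60.0, 20.0, 30.0)            # (cx, cy, w, h, angle in degrees)
    good = (51.0, 49.0, 58.0, 20.0, 32.0)          # jaws land where the label's jaws land
    centred_only = (50.0, 50.0, 30.0, 20.0, 30.0)  # covers the center, misses the jaw areas
    print(rect_iou(good, gt), hourglass_iou(good, gt))                  # both high
    print(rect_iou(centred_only, gt), hourglass_iou(centred_only, gt))  # ~0.5 vs. ~0.0

Under this scoring, the second prediction collapses to a near-zero value even though its plain rectangle IoU with the label is moderate, which is the kind of mismatch between IoU and grasp quality the abstract describes.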

Key words: Robot grasp detection; Attention mechanism; Encoder–decoder; Neural network

Chinese Summary (translated): Attention-based efficient robot grasp detection network

QIN Xiaofei1, HU Wenkai1, XIAO Chen2, HE Changxiang2, PEI Songwen1,3,4, ZHANG Xuedian1,3,4,5
1 School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
2 College of Science, University of Shanghai for Science and Technology, Shanghai 200093, China
3 Shanghai Key Laboratory of Modern Optical Systems, Shanghai 200093, China
4 Key Laboratory of Medical Optical Technology and Instruments, Ministry of Education, Shanghai 200093, China
5 Shanghai Institute of Intelligent Science and Technology, Tongji University, Shanghai 201210, China
Abstract: To balance the inference speed and detection accuracy of grasp detection algorithms, this paper proposes an encoder–decoder structured pixel-level grasp detection neural network, called the attention-based efficient robot grasp detection network (AE-GDN). Three spatial attention modules are introduced in the encoder stages to enhance detailed information, and three channel attention modules are introduced in the decoder stages to extract more semantic information. Several lightweight and efficient DenseBlocks are used to connect the encoder and decoder paths to improve the feature modeling capability of AE-GDN. A high intersection over union (IoU) value between the predicted grasp rectangle and the labeled grasp rectangle does not necessarily imply a high-quality grasp configuration, and may even lead to a collision, because traditional IoU loss calculation methods treat the pixels at the center of the predicted rectangle as being as important as those near the gripper jaws. This paper designs a new IoU loss calculation method based on an hourglass-shaped matching mechanism, which establishes a good correspondence between high IoU values and high-quality grasp configurations. AE-GDN achieves accuracies of 98.9% and 96.6% on the Cornell and Jacquard datasets, respectively. The inference speed reaches 43.5 frames per second with only about 1.2×10^6 parameters. The proposed AE-GDN has been deployed on a real robotic arm grasping system and achieves good grasping performance. The code is available at https://github.com/robvincen/robot_gradet.

Key words: Robot grasp detection; Attention mechanism; Encoder–decoder; Neural network



DOI: 10.1631/FITEE.2200502

CLC number: TP391.4


On-line Access: 2023-10-27

Received: 2022-10-23

Revision Accepted: 2023-10-27

Crosschecked: 2023-04-09

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000–present, Journal of Zhejiang University-SCIENCE