JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2022 Vol.23 No.8 P.1174-1188

Behavioral control task supervisor with memory based on reinforcement learning for human–multi-robot coordination systems

Jie HUANG, Zhibin MO, Zhenyi ZHANG, Yutao CHEN

School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China; 5G+ Industrial Internet Institute, Fuzhou University, Fuzhou 350108, China; Key Laboratory of Industrial Automation Control Technology and Information Processing of Fujian Province, Fuzhou University, Fuzhou 350108, China

yutao.chen@fzu.edu.cn

Abstract: In this study, a novel reinforcement learning task supervisor (RLTS) with memory in a behavioral control framework is proposed for human–multi-robot coordination systems (HMRCSs). Existing HMRCSs suffer from high decision-making time cost and large task tracking errors caused by repeated human intervention, which restricts the autonomy of multi-robot systems (MRSs). Moreover, existing task supervisors in the null-space-based behavioral control (NSBC) framework need to formulate many priority-switching rules manually, which makes it difficult to realize an optimal behavioral priority adjustment strategy in the case of multiple robots and multiple tasks. The proposed RLTS with memory provides a detailed integration of the deep Q-network (DQN) and long short-term memory (LSTM) knowledge base within the NSBC framework, to achieve an optimal behavioral priority adjustment strategy in the presence of task conflict and to reduce the frequency of human intervention. Specifically, the proposed RLTS with memory begins by memorizing human intervention history when the robot systems are not confident in emergencies, and then reloads the history information when encountering the same situation that has been tackled by humans previously. Simulation results demonstrate the effectiveness of the proposed RLTS. Finally, an experiment using a group of mobile robots subject to external noise and disturbances validates the effectiveness of the proposed RLTS with memory in uncertain real-world environments.

Key words: Human–multi-robot coordination systems; Null-space-based behavioral control; Task supervisor; Reinforcement learning; Knowledge base
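
The abstract describes a supervisor that adjusts behavioral priorities when tasks conflict. A minimal sketch of what a priority order means in null-space-based behavioral control may help here: the lower-priority task velocity is projected into the null space of the higher-priority task's Jacobian before the two are summed. This is an illustration under stated assumptions, not the authors' implementation. The two scalar tasks, their Jacobians, and all function names are hypothetical, and the priority order is chosen by hand rather than by a DQN.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def min_norm_velocity(jacobian, task_rate):
    # For a single-row Jacobian J, the pseudo-inverse is J^T / (J J^T).
    norm_sq = dot(jacobian, jacobian)
    return [j * task_rate / norm_sq for j in jacobian]


def project_into_null_space(jacobian, velocity):
    # Apply (I - J^+ J): drop the component that would disturb the task.
    scale = dot(jacobian, velocity) / dot(jacobian, jacobian)
    return [v - scale * j for v, j in zip(velocity, jacobian)]


def compose(high, low):
    # Each argument is a hypothetical (jacobian, desired_task_rate) pair.
    v_high = min_norm_velocity(*high)
    v_low = project_into_null_space(high[0], min_norm_velocity(*low))
    return [a + b for a, b in zip(v_high, v_low)]


# Two conflicting scalar tasks for a planar robot (assumed values).
task_a = ([1.0, 0.0], 1.0)
task_b = ([1.0, 1.0], -2.0)

v_ab = compose(task_a, task_b)  # task_a holds the higher priority
v_ba = compose(task_b, task_a)  # priorities swapped

# The higher-priority task rate is met exactly in each ordering.
print(v_ab, dot(task_a[0], v_ab))
print(v_ba, dot(task_b[0], v_ba))
```

Swapping the two arguments of `compose` is the kind of decision a task supervisor makes. In this sketch the swap is manual, whereas the abstract describes learning that decision and reusing remembered human interventions.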

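Chinese Summary (translated): Behavioral control task supervisor with memory based on reinforcement learning for human–multi-robot coordination systems

Jie HUANG^1,2,3, Zhibin MO^1,2,3, Zhenyi ZHANG^1,2,3, Yutao CHEN^1,2,3
¹School of Electrical Engineering and Automation, Fuzhou University, Fuzhou 350108, China
²5G+ Industrial Internet Institute, Fuzhou University, Fuzhou 350108, China
³Key Laboratory of Industrial Automation Control Technology and Information Processing of Fujian Province, Fuzhou University, Fuzhou 350108, China
Abstract: A reinforcement learning task supervisor (RLTS) with memory, built on the behavioral control framework, is proposed for human–multi-robot coordination systems. Because of repeated human intervention, existing human–multi-robot coordination systems incur high decision-making time costs and large task tracking errors, which limits the autonomy of multi-robot systems. In addition, task supervisors based on the null-space-based behavioral control framework rely on manually formulated priority-switching rules, which makes it difficult to achieve an optimal behavioral priority adjustment strategy with multiple robots and multiple tasks. The proposed RLTS with memory integrates a deep Q-network and a long short-term memory neural network knowledge base within the null-space-based behavioral control framework, achieving an optimal behavioral priority adjustment strategy under task conflict and reducing the frequency of human intervention. When the robots lack confidence in an emergency, the proposed RLTS with memory memorizes the human intervention history, and it reloads the historical control signals when the same human-intervention situation is encountered again. Simulation results verify the effectiveness of the method. Finally, an experiment with a group of mobile robots subject to external noise and disturbances verifies the effectiveness of the proposed RLTS with memory in uncertain real-world environments.

Key words: Human–multi-robot coordination systems; Null-space-based behavioral control; Task supervisor; Reinforcement learning; Knowledge base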

DOI: 10.1631/FITEE.2100280

CLC number: TP18

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-02-15

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE
