JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2022 Vol.23 No.11 P.1673-1683

A deep Q-learning network based active object detection model with a novel training algorithm for service robots

Shaopeng LIU, Guohui TIAN, Yongcheng CUI, Xuyang SHAO

School of Control Science and Engineering, Shandong University, Jinan 250061, China

shaopeng.liu66@mail.sdu.edu.cn, g.h.tian@sdu.edu.cn

Abstract: This paper addresses the problem of active object detection (AOD). AOD is important for service robots completing tasks in home environments: it leads the robot to approach the target object by taking appropriate moving actions. Most current AOD methods are based on reinforcement learning and suffer from low training efficiency and low testing accuracy. Therefore, an AOD model based on a deep Q-learning network (DQN) with a novel training algorithm is proposed in this paper. The DQN model is designed to fit the Q-values of the various actions, and comprises a state space, feature extraction, and a multilayer perceptron. In contrast to existing research, a novel memory-based training algorithm is designed for the proposed DQN model to improve training efficiency and testing accuracy. In addition, a method of generating the end state is presented to judge when to stop the AOD task during training. Comparison experiments and ablation studies on an AOD dataset show that the presented method performs better than comparable methods and that the proposed training algorithm is more effective than the raw training algorithm.

Key words: Active object detection; Deep Q-learning network; Training method; Service robots
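
The abstract describes a DQN that fits the Q-values of the robot's moving actions and is trained with the help of a memory. The sketch below illustrates only the generic, textbook building blocks of deep Q-learning (a replay memory, epsilon-greedy action selection, and the one-step Q-learning target). It is not the authors' network or their memory-based training algorithm, whose details are not given on this page. All names (`ReplayMemory`, `select_action`, `td_target`) and the default discount factor are assumptions made for illustration.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-size buffer of (state, action, reward, next_state, done) tuples.

    Hypothetical helper: a standard experience-replay buffer, not the
    memory-based training scheme proposed in the paper.
    """

    def __init__(self, capacity):
        # Oldest transitions are discarded once capacity is reached.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement.
        return rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


def select_action(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over a list of per-action Q-values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def td_target(reward, next_q_values, done, gamma=0.9):
    """One-step Q-learning target.

    Returns r for an end state, otherwise r + gamma * max_a' Q(s', a').
    The value gamma=0.9 is an assumed default, not taken from the paper.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)
```

In a full training loop, the Q-network (for example, a multilayer perceptron on top of extracted features) would be regressed toward `td_target` on mini-batches drawn from the memory. The `done` flag plays the role of an end-state signal that tells the agent when the episode stops.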

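Summary (translated from Chinese): A deep Q-learning network based active object detection model with a new training algorithm for service robots

Shaopeng LIU, Guohui TIAN, Yongcheng CUI, Xuyang SHAO
School of Control Science and Engineering, Shandong University, Jinan 250061, China
Abstract: This paper studies the problem of active object detection (AOD). AOD is an important part of how service robots complete service tasks in home environments; it guides the robot toward the target object through appropriate moving actions. Current reinforcement learning based AOD models suffer from low training efficiency and poor testing accuracy. This paper therefore proposes an AOD model based on a deep Q-learning network and designs a new training algorithm for it. The model aims to fit the Q-values of the various actions and comprises a state space, feature extraction, and a multilayer perceptron. Unlike existing research, a memory-based training algorithm is designed for the proposed AOD model to improve training efficiency and testing accuracy. In addition, a method for generating the final state is proposed to decide when the AOD task should stop during training. Comparison and ablation experiments were carried out on an AOD dataset. The results show that the proposed method outperforms comparable methods and that the designed training algorithm is more efficient than the original training algorithm.

Key words: Active object detection; Deep Q-learning network; Training algorithm; Service robots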


DOI: 10.1631/FITEE.2200109

CLC number: TP242


On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-07-29

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE
