JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2025 Vol.26 No.3 P.385-399

Significance extraction based on data augmentation for reinforcement learning

Yuxi HAN, Dequan LI, Yang YANG

Faculty of Artificial Intelligence, Anhui University of Science and Technology, Huainan 232000, China

hanyuxi0712@163.com, leedqcpp@126.com

Abstract: Deep reinforcement learning has shown remarkable capabilities in visual tasks, but it generalizes poorly when the input images contain interference signals, so trained agents are hard to apply in new environments. Data augmentation and auxiliary networks have proven effective at helping agents distinguish noise signals from important pixels in images. We introduce a novel algorithm, saliency-extracted Q-value by augmentation (SEQA), which encourages the agent to explore unknown states more comprehensively and to focus its attention on important information. Specifically, SEQA masks out interfering features, extracts salient features, and then updates the mask decoder network with critic losses, which encourages the agent to focus on important features and make correct decisions. We evaluate our algorithm on the DeepMind Control generalization benchmark (DMControl-GB), and the experimental results show that it greatly improves training efficiency and stability. It also outperforms state-of-the-art reinforcement learning methods in sample efficiency and generalization on most DMControl-GB tasks.

Key words: Deep reinforcement learning; Visual tasks; Generalization; Data augmentation; Significance; DeepMind Control generalization benchmark
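
The idea of updating a mask with a critic loss, as described in the abstract, can be illustrated with a toy sketch. This is not the authors' SEQA implementation. It assumes a four-"pixel" observation, a fixed linear critic, one learnable mask logit per pixel in place of a mask decoder network, and a hand-derived gradient. All function names are hypothetical. Pixels 0 and 1 carry the signal that determines the target Q-value. Pixels 2 and 3 are distractors that are resampled at random on every step, which loosely stands in for data augmentation.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def masked_q(obs, mask_logits, critic_w):
    """Q estimate of a linear critic applied to the soft-masked observation."""
    return sum(w * sigmoid(l) * x for w, l, x in zip(critic_w, mask_logits, obs))


def critic_loss(obs, mask_logits, critic_w, target_q):
    """Squared error between the masked Q estimate and the target Q-value."""
    return (masked_q(obs, mask_logits, critic_w) - target_q) ** 2


def mask_update(obs, mask_logits, critic_w, target_q, lr):
    """One gradient-descent step on the mask logits using the critic loss."""
    err = masked_q(obs, mask_logits, critic_w) - target_q
    new_logits = []
    for w, l, x in zip(critic_w, mask_logits, obs):
        m = sigmoid(l)
        grad = 2.0 * err * w * x * m * (1.0 - m)  # d(loss) / d(logit)
        new_logits.append(l - lr * grad)
    return new_logits


def train_mask(steps=3000, seed=0, lr=0.5):
    """Train the toy mask and return the final per-pixel mask values."""
    rng = random.Random(seed)
    critic_w = [1.0, 1.0, 1.0, 1.0]     # fixed toy critic (assumption)
    mask_logits = [0.0, 0.0, 0.0, 0.0]  # every mask value starts at 0.5
    for _ in range(steps):
        # Pixels 0-1: signal. Pixels 2-3: distractors, resampled every step.
        obs = [rng.uniform(-1.0, 1.0) for _ in range(4)]
        target_q = obs[0] + obs[1]      # the target depends on signal pixels only
        mask_logits = mask_update(obs, mask_logits, critic_w, target_q, lr)
    return [sigmoid(l) for l in mask_logits]


if __name__ == "__main__":
    print([round(m, 3) for m in train_mask()])
```

In this toy setting the critic loss alone is enough to push the mask values of the signal pixels toward 1 and those of the distractor pixels toward 0, because any weight left on a distractor adds error that is uncorrelated with the target.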

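Chinese Summary: Saliency extraction based on data augmentation for reinforcement learning

Yuxi HAN, Dequan LI, Yang YANG
Faculty of Artificial Intelligence, Anhui University of Science and Technology, Huainan 232000, China
Abstract: Deep reinforcement learning has demonstrated remarkable capabilities in visual tasks, but its generalization ability is weak when input images are subject to interference signals, which makes it difficult to apply well-trained agents in new environments. Data augmentation techniques and the establishment of auxiliary networks are effective ways to let agents distinguish noise signals from important pixels in images. A new algorithm is proposed, namely saliency-extracted Q-value by augmentation (SEQA), which encourages the agent to explore unknown states comprehensively and to focus its attention on important information. Specifically, SEQA masks out interfering features, extracts salient features, and updates the mask decoder network with critic losses, thereby prompting the agent to attend to important features and make correct decisions. The algorithm is evaluated on the DeepMind Control generalization benchmark, and the experimental results show that it greatly improves training efficiency and stability. In addition, on most DeepMind Control generalization benchmark tasks, our algorithm outperforms state-of-the-art reinforcement learning methods in sample efficiency and generalization.

Key words: Deep reinforcement learning; Visual tasks; Generalization; Data augmentation; Saliency; DeepMind Control generalization benchmark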

DOI: 10.1631/FITEE.2400406

CLC number: TP391.4

On-line Access: 2025-04-03

Received: 2024-05-17

Revision Accepted: 2024-09-18

Crosschecked: 2025-04-07

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE