JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2024 Vol.25 No.6 P.763-790

Transformer in reinforcement learning for decision-making: a survey

Weilin YUAN, Jiaxing CHEN, Shaofei CHEN, Dawei FENG, Zhenzhen HU, Peng LI, Weiwei ZHAO

College of Information and Communication, National University of Defense Technology, Wuhan 430014, China; College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410072, China; Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410072, China

yuanweilin12@nudt.edu.cn, zhaozww@163.com

Abstract: Reinforcement learning (RL) has become a dominant decision-making paradigm and has achieved notable success in many real-world applications. Deep neural networks, in particular, play a crucial role in unlocking RL’s potential in large-scale decision-making tasks. Inspired by the recent major success of the Transformer in natural language processing and computer vision, researchers have overcome numerous bottlenecks by combining the Transformer with RL for decision-making. This paper presents a multi-angle systematic survey of Transformer-based RL (TransRL) models applied to decision-making tasks, covering basic models, advanced algorithms, representative implementation instances, typical applications, and known challenges. Our work aims to provide insights into problems that inherently arise with current RL approaches and examines how better TransRL models can address them. To our knowledge, this is the first comprehensive review of recent Transformer research developments in RL for decision-making. We hope that this survey provides a thorough basis for discussing TransRL models and inspires the RL community in its pursuit of future directions. To keep track of the rapid TransRL developments in decision-making domains, we summarize the latest papers and their open-source implementations at https://github.com/williamyuanv0/Transformer-in-Reinforcement-Learning-for-Decision-Making-A-Survey.

Key words: Transformer; Reinforcement learning (RL); Decision-making (DM); Deep neural network (DNN); Multi-agent reinforcement learning (MARL); Meta-reinforcement learning (Meta-RL)

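Chinese Summary: Application of Transformer-based reinforcement learning methods in intelligent decision-making: a survey

Weilin YUAN¹, Jiaxing CHEN², Shaofei CHEN², Dawei FENG³, Zhenzhen HU², Peng LI², Weiwei ZHAO¹
¹College of Information and Communication, National University of Defense Technology, Wuhan 430014, China
²College of Intelligence Science and Technology, National University of Defense Technology, Changsha 410072, China
³National Key Laboratory of Parallel and Distributed Computing, National University of Defense Technology, Changsha 410072, China
Abstract: Reinforcement learning has become a dominant decision-making paradigm and has achieved remarkable results in many real-world applications. In large-scale decision-making scenarios, deep neural networks are the key to unlocking the great potential of reinforcement learning. Inspired by advanced Transformer methods in the natural language and vision fields, the combination of Transformer and reinforcement learning has broken through many bottlenecks in intelligent decision-making. This paper summarizes Transformer-based reinforcement learning methods (TransRL) in terms of basic models, advanced algorithms, representative examples, typical applications, and an analysis of challenges. It aims to analyze in depth the pain points of current reinforcement learning methods and to discuss how TransRL can overcome the limitations of the reinforcement learning paradigm. To our knowledge, this is the first survey to systematically review the progress of Transformer-based reinforcement learning methods in intelligent decision-making. We hope it provides a comprehensive basis for discussing TransRL and promotes the application of reinforcement learning in this field. To make it easier to follow frontier progress in TransRL, we have compiled the latest related papers and their open-source projects at https://github.com/williamyuanv0/Transformer-in-Reinforcement-Learning-for-Decision-Making-A-Survey.

Key words: Transformer; Reinforcement learning; Intelligent decision-making; Deep neural network; Multi-agent reinforcement learning; Meta-reinforcement learning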

DOI: 10.1631/FITEE.2300548

CLC number: TP18

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2023-11-24

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE
