JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

2023 Vol.24 No.1 P.131-140

Stochastic pedestrian avoidance for autonomous vehicles using hybrid reinforcement learning

Huiqian LI, Jin HUANG, Zhong CAO, Diange YANG, Zhihua ZHONG

School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China; Chinese Academy of Engineering, Beijing 100088, China

lihq20@mails.tsinghua.edu.cn, huangjin@tsinghua.edu.cn, caoc15@mails.tsinghua.edu.cn, ydg@tsinghua.edu.cn

Abstract: Ensuring the safety of pedestrians is essential and challenging when autonomous vehicles are involved. Classical pedestrian avoidance strategies cannot handle uncertainty, and learning-based methods lack performance guarantees. In this paper we propose a hybrid reinforcement learning (HRL) approach that lets autonomous vehicles interact safely with pedestrians whose behavior is uncertain. The method integrates a rule-based strategy with a reinforcement learning strategy. The confidence of each strategy is evaluated using the data recorded during training, and an activation function then selects the strategy with the higher confidence as the final policy. In this way, we can guarantee that the final policy performs no worse than the rule-based policy. To demonstrate the effectiveness of the proposed method, we validate it in simulation, using an accelerated testing technique to generate stochastic pedestrians. The results indicate that it increases the success rate for pedestrian avoidance to 98.8%, compared with 94.4% for the baseline method.

Key words: Pedestrian; Hybrid reinforcement learning; Autonomous vehicles; Decision-making
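
The confidence-based selection described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how confidence is computed, so everything here is an assumption: confidence is approximated by a normal-approximation lower bound on the mean recorded return, and the names `lower_confidence_bound`, `activate`, and the parameter `z` are hypothetical, not the authors' implementation.

```python
import math
from statistics import mean, stdev


def lower_confidence_bound(returns, z=1.96):
    """Lower bound on the mean return of recorded samples.

    Assumption: a normal approximation stands in for the paper's
    (unspecified) confidence measure.
    """
    n = len(returns)
    if n < 2:
        # Too little recorded data: treat the estimate as having no confidence.
        return -math.inf
    return mean(returns) - z * stdev(returns) / math.sqrt(n)


def activate(rule_returns, rl_returns, z=1.96):
    """Hypothetical activation function.

    Picks the RL policy only when its lower confidence bound exceeds the
    rule-based policy's mean recorded return; otherwise falls back to the
    rule-based policy. rule_returns must be non-empty.
    """
    rule_estimate = mean(rule_returns)
    if lower_confidence_bound(rl_returns, z) > rule_estimate:
        return "rl"
    return "rule"


if __name__ == "__main__":
    # Plenty of consistent RL data that beats the rule-based record.
    print(activate([0.5, 0.6, 0.4, 0.5], [0.9, 0.95, 0.9, 0.92, 0.93]))
    # A single RL sample is not enough evidence to switch.
    print(activate([0.9, 0.9, 0.9, 0.9], [1.0]))
```

Falling back to the rule-based policy whenever the evidence for the learned policy is weak is what would give the "no worse than rule-based" behavior in this sketch; the paper's actual guarantee rests on its own confidence evaluation.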

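Chinese Summary (translated): Pedestrian collision avoidance method for autonomous vehicles based on hybrid reinforcement learning

Huiqian LI¹, Jin HUANG¹, Zhong CAO¹, Diange YANG¹, Zhihua ZHONG²
¹School of Vehicle and Mobility, Tsinghua University, Beijing 100084, China
²Chinese Academy of Engineering, Beijing 100088, China
Abstract: Ensuring pedestrian safety is critical for autonomous vehicles and is also challenging. Classical pedestrian collision avoidance strategies cannot cope with uncertainty, while learning-based methods lack explicit performance guarantees. This paper proposes a pedestrian collision avoidance method based on hybrid reinforcement learning, so that autonomous vehicles can interact safely with pedestrians whose behavior is uncertain. The method integrates a rule-based strategy and a reinforcement learning strategy, and an activation function is designed to select whichever has the higher confidence as the final policy, which guarantees that the final policy performs no worse than the rule-based strategy. To demonstrate the effectiveness of the proposed method, an accelerated testing method is used to generate pedestrians with random behavior for simulation validation. The results show that the method raises the success rate in the test scenarios to 98.8%, compared with 94.4% for the baseline method.

Key words: Pedestrian; Hybrid reinforcement learning; Autonomous vehicles; Decision-making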

DOI: 10.1631/FITEE.2200128

CLC number: TP18; U495

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-08-10

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE
