Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2024 Vol.25 No.4 P.569-584
An anti-collision algorithm for robotic search-and-rescue tasks in unknown dynamic environments
Abstract: This paper addresses search-and-rescue tasks in which a mobile robot must reach multiple targets of interest in an unknown dynamic environment. The problem is challenging because the robot must search for multiple targets and avoid obstacles at the same time. To ensure that the robot avoids obstacles properly, we propose a mixed-strategy Nash equilibrium based Dyna-Q (MNDQ) algorithm. First, a multi-objective layered structure is introduced to simplify the representation of multiple objectives and reduce computational complexity. This structure divides the overall task into subtasks, including searching for targets and avoiding obstacles. Second, a risk-monitoring mechanism is proposed based on the relative positions of dynamic risks. This mechanism helps the robot avoid potential collisions and unnecessary detours. Third, to improve sampling efficiency, MNDQ is presented, which combines Dyna-Q with mixed-strategy Nash equilibrium. With a mixed-strategy Nash equilibrium, the agent makes decisions in the form of probabilities, maximizing the expected reward and improving the overall performance of the Dyna-Q algorithm. Finally, a series of simulations is conducted to verify the effectiveness of the proposed method. The results show that MNDQ performs well and is robust, providing a competitive solution for future autonomous robot navigation tasks.
Key words: Search and rescue; Reinforcement learning; Game theory; Collision avoidance; Decision-making
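The two ingredients named in the abstract, Dyna-Q and mixed-strategy Nash equilibrium, can be illustrated with a minimal Python sketch. This is not the MNDQ algorithm itself: the abstract does not say how the game is constructed or how the equilibrium enters the update, so everything here is an assumption made for illustration. The sketch shows a standard tabular Dyna-Q step (a direct Q-learning update followed by planning updates replayed from a learned model) and the closed-form equilibrium mix for a 2x2 zero-sum game. The function names `mixed_nash_2x2` and `dyna_q_step`, the payoff matrix, and all parameter values are hypothetical.

```python
import random
from collections import defaultdict


def mixed_nash_2x2(a):
    """Row player's equilibrium mix for a 2x2 zero-sum game.

    a[i][j] is the row player's payoff when row i meets column j.
    Returns (p0, p1), the probabilities of playing row 0 and row 1.
    """
    row_mins = [min(a[0]), min(a[1])]
    col_maxs = [max(a[0][0], a[1][0]), max(a[0][1], a[1][1])]
    # If maximin equals minimax, a pure-strategy equilibrium exists.
    if max(row_mins) == min(col_maxs):
        best = row_mins.index(max(row_mins))
        return (1.0, 0.0) if best == 0 else (0.0, 1.0)
    # Otherwise mix so that the column player is indifferent.
    denom = a[0][0] - a[0][1] - a[1][0] + a[1][1]
    p = (a[1][1] - a[1][0]) / denom
    return (p, 1.0 - p)


def dyna_q_step(Q, model, s, a, r, s2, actions,
                alpha=0.1, gamma=0.95, n_planning=5, rng=random):
    """One tabular Dyna-Q step: learn from (s, a, r, s2), then plan."""
    # Direct reinforcement learning update from the real transition.
    target = r + gamma * max(Q[(s2, b)] for b in actions)
    Q[(s, a)] += alpha * (target - Q[(s, a)])
    # Model learning: remember the observed (deterministic) outcome.
    model[(s, a)] = (r, s2)
    # Planning: replay transitions sampled from the learned model.
    for _ in range(n_planning):
        (ps, pa), (pr, ps2) = rng.choice(list(model.items()))
        target = pr + gamma * max(Q[(ps2, b)] for b in actions)
        Q[(ps, pa)] += alpha * (target - Q[(ps, pa)])


if __name__ == "__main__":
    actions = ["left", "right"]
    # Hypothetical robot-versus-obstacle payoffs (rows: robot actions).
    mix = mixed_nash_2x2([[1, -1], [-1, 1]])
    # Decide "in the form of probabilities": sample from the mix.
    action = random.choices(actions, weights=mix)[0]
    Q, model = defaultdict(float), {}
    dyna_q_step(Q, model, s=0, a=action, r=1.0, s2=1, actions=actions)
    print(mix, action, Q[(0, action)])
```

Sampling the action from the equilibrium mix, rather than always taking the greedy action, is one simple way to read "decisions in the form of probabilities"; how MNDQ derives its payoffs from the learned values is described in the full paper, not here.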
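1School of Computer Science, Peking University, Beijing 100871, China
2Tianjin (Binhai) Artificial Intelligence Innovation Center, Tianjin 300457, China
3Intelligent Game and Decision Laboratory, Beijing 100071, China
4College of Computer Science, National University of Defense Technology, Changsha 410073, China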
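Abstract (translated from the Chinese version): This paper studies the search-and-rescue task of a mobile robot with multiple targets of interest in an unknown dynamic environment. The problem is challenging because the robot must search for and rescue multiple targets while avoiding obstacles. To ensure that the robot avoids collisions properly, a mixed-strategy Nash equilibrium based Dyna-Q algorithm (MNDQ) is proposed. First, a multi-objective layered structure is introduced to simplify the problem; it divides the overall task into subtasks, including searching for targets and avoiding obstacles. Second, a risk-monitoring mechanism based on the relative positions of dynamic risks is proposed so that the robot avoids potential collisions and detours. In addition, to improve sampling efficiency, a reinforcement learning method (MNDQ) that combines Dyna-Q with mixed-strategy Nash equilibrium is proposed. Under the mixed-strategy Nash equilibrium, the agent makes decisions in the form of probabilities so as to maximize the expected return, which improves the overall performance of the Dyna-Q algorithm. Finally, simulation experiments verify the effectiveness of the proposed method. The results show that the method performs well and offers a solution approach for future autonomous robot navigation tasks.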
DOI: 10.1631/FITEE.2300151
CLC number: TP183
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2023-08-25