
Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Cooperative channel assignment for VANETs based on multiagent reinforcement learning

Abstract: Dynamic channel assignment (DCA) plays a key role in extending vehicular ad-hoc network capacity and mitigating congestion. However, channel assignment in vehicle-to-vehicle direct communication scenarios faces challenges including the mutual influence of large-scale nodes, the lack of centralized coordination, and unknown global state information. To solve this problem, a multiagent reinforcement learning (RL) based cooperative DCA (RL-CDCA) mechanism is proposed. Specifically, each vehicular node can learn proper strategies for channel selection and backoff adaptation from real-time channel state information (CSI) using two cooperative RL models. In addition, neural networks are constructed as nonlinear Q-function approximators, which facilitates mapping the continuously sensed input to the mixed policy output. Nodes are driven to locally share and incorporate their individual rewards so that they can optimize their policies in a distributed collaborative manner. Simulation results show that, compared with four existing mechanisms, the proposed multiagent RL-CDCA reduces the one-hop packet delay by no less than 73.73%, improves the average packet delivery ratio by no less than 12.66% even in a highly dense situation, and better ensures fairness of global network resource allocation.

Key words: Vehicular ad-hoc networks, Reinforcement learning, Dynamic channel assignment, Multichannel
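To make the abstract's idea of a neural-network Q-function approximator concrete, the sketch below maps a continuously sensed CSI vector to per-channel Q-values and selects a channel epsilon-greedily. This is an illustrative toy, not the paper's model: the CSI dimension, layer sizes, channel count, and epsilon value are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N_CHANNELS = 4   # assumed number of selectable channels
CSI_DIM = 8      # assumed dimension of the sensed CSI vector

# Single-hidden-layer Q-network: continuous CSI in, one Q-value per channel out.
W1 = rng.normal(scale=0.1, size=(CSI_DIM, 16))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(16, N_CHANNELS))
b2 = np.zeros(N_CHANNELS)

def q_values(csi):
    """Nonlinear Q-function approximation of per-channel action values."""
    h = np.tanh(csi @ W1 + b1)
    return h @ W2 + b2

def select_channel(csi, epsilon=0.1):
    """Epsilon-greedy channel selection over the approximated Q-values."""
    if rng.random() < epsilon:
        return int(rng.integers(N_CHANNELS))   # explore a random channel
    return int(np.argmax(q_values(csi)))       # exploit the best estimate

csi_sample = rng.normal(size=CSI_DIM)
channel = select_channel(csi_sample)
```

In the full mechanism each node would also adapt its backoff with a second RL model and fold neighbors' shared rewards into its own update, which this sketch omits.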

Chinese title: 基于多智能体强化学习的车载自组织网络协作信道分配

Yunpeng WANG, Kunxian ZHENG, Daxin TIAN, Xuting DUAN, Jianshan ZHOU
School of Transportation Science and Engineering, Beihang University, and Beijing Advanced Innovation Center for Big Data and Brain Computing, Beijing 100191, China



DOI: 10.1631/FITEE.1900308
CLC number: U495


On-line Access: 2020-07-10
Received: 2019-06-21
Revision Accepted: 2020-01-03
Crosschecked: 2020-06-11

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000–present, Journal of Zhejiang University-SCIENCE