CLC number: U495
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-06-11
Cited: 0
Clicked: 6095
Yun-peng Wang, Kun-xian Zheng, Da-xin Tian, Xu-ting Duan, Jian-shan Zhou. Cooperative channel assignment for VANETs based on multiagent reinforcement learning[J]. Frontiers of Information Technology & Electronic Engineering, 2020, 21(7): 1047-1058.
@article{title="Cooperative channel assignment for VANETs based on multiagent reinforcement learning",
author="Yun-peng Wang, Kun-xian Zheng, Da-xin Tian, Xu-ting Duan, Jian-shan Zhou",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="21",
number="7",
pages="1047-1058",
year="2020",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.1900308"
}
%0 Journal Article
%T Cooperative channel assignment for VANETs based on multiagent reinforcement learning
%A Yun-peng Wang
%A Kun-xian Zheng
%A Da-xin Tian
%A Xu-ting Duan
%A Jian-shan Zhou
%J Frontiers of Information Technology & Electronic Engineering
%V 21
%N 7
%P 1047-1058
%@ 2095-9184
%D 2020
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900308
TY - JOUR
T1 - Cooperative channel assignment for VANETs based on multiagent reinforcement learning
A1 - Yun-peng Wang
A1 - Kun-xian Zheng
A1 - Da-xin Tian
A1 - Xu-ting Duan
A1 - Jian-shan Zhou
JO - Frontiers of Information Technology & Electronic Engineering
VL - 21
IS - 7
SP - 1047
EP - 1058
SN - 2095-9184
Y1 - 2020
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.1900308
ER -
Abstract: Dynamic channel assignment (DCA) plays a key role in extending vehicular ad-hoc network capacity and mitigating congestion. However, channel assignment in vehicular direct-communication scenarios faces the mutual influence of large-scale nodes, the lack of centralized coordination, unknown global state information, and other challenges. To address these problems, a multiagent reinforcement learning (RL) based cooperative DCA (RL-CDCA) mechanism is proposed. Specifically, each vehicular node can learn proper channel-selection and backoff-adaptation strategies from real-time channel state information (CSI) using two cooperative RL models. In addition, neural networks are constructed as nonlinear Q-function approximators, which facilitates mapping the continuously sensed input to a mixed policy output. Nodes are driven to locally share and incorporate their individual rewards so that they can optimize their policies in a distributed, collaborative manner. Simulation results show that the proposed multiagent RL-CDCA reduces the one-hop packet delay by no less than 73.73%, improves the packet delivery ratio by no less than 12.66% on average in a highly dense situation, and improves the fairness of global network resource allocation.
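To make the cooperative idea in the abstract concrete, the following is a minimal, self-contained sketch of distributed channel selection with locally shared rewards. It uses a tabular (bandit-style) Q update rather than the paper's neural Q-function approximators, omits backoff adaptation, and all constants and names (`NodeAgent`, `NUM_CHANNELS`, the 0.5/0.5 reward mix) are illustrative assumptions, not the authors' implementation.

```python
import random

NUM_AGENTS = 4       # vehicular nodes (illustrative)
NUM_CHANNELS = 2     # available service channels (illustrative)
ALPHA, EPS = 0.1, 0.1  # learning rate and exploration rate

class NodeAgent:
    """One vehicular node learning a channel-selection policy."""
    def __init__(self):
        self.q = [0.0] * NUM_CHANNELS  # Q-value per channel

    def act(self, rng):
        # Epsilon-greedy channel selection
        if rng.random() < EPS:
            return rng.randrange(NUM_CHANNELS)
        return max(range(NUM_CHANNELS), key=lambda c: self.q[c])

    def update(self, ch, reward):
        # Stateless Q update toward the observed reward
        self.q[ch] += ALPHA * (reward - self.q[ch])

def step(agents, rng):
    choices = [a.act(rng) for a in agents]
    # Individual reward: 1 if the chosen channel is collision-free
    local = [1.0 if choices.count(c) == 1 else 0.0 for c in choices]
    # Cooperative reward: mix own reward with the locally shared mean,
    # mimicking the paper's local reward sharing among neighbors
    shared = sum(local) / len(local)
    for a, c, r in zip(agents, choices, local):
        a.update(c, 0.5 * r + 0.5 * shared)
    return choices, local

rng = random.Random(0)
agents = [NodeAgent() for _ in range(NUM_AGENTS)]
for _ in range(2000):
    step(agents, rng)
```

Because each node updates only its own Q-table from its own choice plus the shared reward signal, no centralized coordinator or global state is needed, which is the property the abstract emphasizes.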
[1]Ahmed SAM, Ariffin SHS, Fisal N, 2013. Overview of wireless access in vehicular environment (WAVE) protocols and standards. Ind J Sci Technol, 7(6):4994-5001.
[2]Ahmed T, Le Moullec Y, 2017. A QoS optimization approach in cognitive body area networks for healthcare applications. Sensors, 17(4):780.
[3]Ahmed T, Ahmed F, Le Moullec Y, 2017. Optimization of channel allocation in wireless body area networks by means of reinforcement learning. IEEE Asia Pacific Conf on Wireless and Mobile, p.120-123.
[4]Almohammedi AA, Noordin NK, Sali A, et al., 2017. An adaptive multi-channel assignment and coordination scheme for IEEE 802.11p/1609.4 in vehicular ad-hoc networks. IEEE Access, 6:2781-2802.
[5]Arulkumaran K, Deisenroth MP, Brundage M, et al., 2017. A brief survey of deep reinforcement learning. IEEE Signal Process Mag, 34(6):26-38.
[6]Atallah R, Assi C, Khabbaz M, 2017. Deep reinforcement learning-based scheduling for roadside communication networks. 15th Int Symp on Modeling and Optimization in Mobile, Ad Hoc, and Wireless Networks, p.1-8.
[7]Audhya GK, Sinha K, Ghosh SC, et al., 2011. A survey on the channel assignment problem in wireless networks. Wirel Commun Mob Comput, 11(5):583-609.
[8]Sutton RS, Barto AG, 1998. Reinforcement Learning: an Introduction. MIT Press, Cambridge, MA, USA.
[9]Cheeneebash J, Lozano JA, Rughooputh HCS, 2012. A survey on the algorithms used to solve the channel assignment problem. Rec Pat Telecommun, 1(1):54-71.
[10]He Y, Zhao N, Yin HX, 2017. Integrated networking, caching, and computing for connected vehicles: a deep reinforcement learning approach. IEEE Trans Veh Technol, 67(1):44-55.
[11]Jain RK, Chiu DMW, Hawe WR, 1998. A Quantitative Measure of Fairness and Discrimination for Resource Allocation in Shared Computer Systems. CoRR, cs.NI/9809099, DEC, Hudson, MA, USA.
[12]Kaelbling LP, Littman ML, Moore AW, 1996. Reinforcement learning: a survey. J Artif Intell Res, 4(1):237-285.
[13]Li L, Lv YS, Wang FY, 2016. Traffic signal timing via deep reinforcement learning. IEEE/CAA J Autom Sin, 3(3):247-254.
[14]Li XH, Hu BJ, Chen HB, et al., 2015. An RSU-coordinated synchronous multi-channel MAC scheme for vehicular ad hoc networks. IEEE Access, 3:2794-2802.
[15]Liu N, Li Z, Xu JL, et al., 2017. A hierarchical framework of cloud resource allocation and power management using deep reinforcement learning. IEEE 37th Int Conf on Distributed Computing Systems, p.372-382.
[16]Liu SJ, Hu X, Wang WD, 2018. Deep reinforcement learning based dynamic channel allocation algorithm in multibeam satellite systems. IEEE Access, 6:15733-15742.
[17]Louta M, Sarigiannidis P, Misra S, et al., 2014. RLAM: a dynamic and efficient reinforcement learning-based adaptive mapping scheme in mobile WiMAX networks. Mob Inform Syst, 10(2):173-196.
[18]Maddison CJ, Huang A, Sutskever I, et al., 2014. Move evaluation in go using deep convolutional neural networks. https://arxiv.org/abs/1412.6564
[19]Mao HZ, Alizadeh M, Menache I, et al., 2016. Resource management with deep reinforcement learning. Proc 15th ACM Workshop on Hot Topics in Networks, p.50-56.
[20]Mnih V, Kavukcuoglu K, Silver D, et al., 2013. Playing Atari with deep reinforcement learning. https://arxiv.org/abs/1312.5602
[21]Mnih V, Kavukcuoglu K, Silver D, et al., 2015. Human-level control through deep reinforcement learning. Nature, 518(7540):529-533.
[22]Nie JH, Haykin S, 1999. A dynamic channel assignment policy through Q-learning. IEEE Trans Neur Netw, 10(6):1443-1455.
[23]Ouyous M, Zytoune O, Aboutajdine D, 2017. Multi-channel coordination based MAC protocols in vehicular ad hoc networks (VANETs): a survey. In: El-Azouzi R, Menasche D, Sabir E, et al. (Eds.), Advances in Ubiquitous Networking 2. Springer, Singapore.
[24]Qiu CR, Hu Y, Chen Y, et al., 2019. Deep deterministic policy gradient (DDPG)-based energy harvesting wireless communications. IEEE Int Things J, 6(5):8577-8588.
[25]Seah MWM, Tham CK, Srinivasan V, et al., 2007. Achieving coverage through distributed reinforcement learning in wireless sensor networks. 3rd Int Conf on Intelligent Sensors, Sensor Networks and Information, p.425-430.
[26]Silver D, Schrittwieser J, Simonyan K, et al., 2017. Mastering the game of go without human knowledge. Nature, 550(7676):354-359.
[27]Wang Q, Leng S, Fu HR, et al., 2012. An IEEE 802.11p-based multichannel MAC scheme with channel coordination for vehicular ad hoc networks. IEEE Trans Intell Trans Syst, 13(2):449-458.
[28]Wang W, Kwasinski A, Niyato D, et al., 2017. A survey on applications of model-free strategy learning in cognitive wireless networks. IEEE Commun Surv Tutor, 18(3):1717-1757.
[29]Xu ZY, Wang YZ, Tang J, et al., 2017. A deep reinforcement learning based framework for power-efficient resource allocation in cloud RANs. IEEE Int Conf on Communications, p.1-6.
[30]Yau KLA, Komisarczuk P, Paul DT, 2010. Enhancing network performance in distributed cognitive radio networks using single-agent and multi-agent reinforcement learning. IEEE Local Computer Network Conf, p.152-159.
[31]Ye H, Li GY, Juang BHF, 2018. Deep reinforcement learning based resource allocation for V2V communications. IEEE Int Conf on Communications, p.1-6.