On-line Access: 2022-01-24
Received: 2021-07-05
Revision Accepted: 2022-04-22
Crosschecked: 2021-11-15
https://orcid.org/0000-0003-3293-0803
Xiaoyu LIU, Chi XU, Haibin YU, Peng ZENG. Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(1): 47-60.
@article{FITEE.2100331,
title="Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks",
author="Xiaoyu LIU and Chi XU and Haibin YU and Peng ZENG",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="1",
pages="47-60",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2100331"
}
%0 Journal Article
%T Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks
%A Xiaoyu LIU
%A Chi XU
%A Haibin YU
%A Peng ZENG
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 1
%P 47-60
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2100331
TY  - JOUR
T1  - Multi-agent deep reinforcement learning for end–edge orchestrated resource allocation in industrial wireless networks
A1  - Xiaoyu LIU
A1  - Chi XU
A1  - Haibin YU
A1  - Peng ZENG
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 23
IS  - 1
SP  - 47
EP  - 60
SN  - 2095-9184
Y1  - 2022
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2100331
ER  -
Abstract: Edge artificial intelligence will enable traditionally simple industrial wireless networks (IWNs) to support complex and dynamic tasks by collaboratively exploiting the computation and communication resources of both machine-type devices (MTDs) and edge servers. In this paper, we propose a multi-agent deep reinforcement learning based resource allocation (MADRL-RA) algorithm for end–edge orchestrated IWNs to support computation-intensive and delay-sensitive applications. First, we present the system model of IWNs, wherein each MTD is regarded as a self-learning agent. Then, we apply the Markov decision process to formulate a minimum system overhead problem with joint optimization of delay and energy consumption. Next, we employ MADRL to cope with the explosive state space and learn an effective resource allocation policy with respect to computing decision, computation capacity, and transmission power. To break the time correlation of training data while accelerating the learning process of MADRL-RA, we design a weighted experience replay to store and sample experiences categorically. Furthermore, we propose a step-by-step
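The abstract describes a weighted experience replay that stores and samples experiences categorically. The abstract does not specify the categories or the weights, so the following is only a minimal sketch under stated assumptions: the class name `CategoricalReplay`, the category labels, and the per-category sampling weights are all hypothetical and are not taken from the paper. It illustrates the general idea of keeping one buffer per category and drawing each training sample by first picking a category in proportion to its weight and then picking an experience uniformly at random within it, which also decorrelates consecutive samples in time.

```python
import random
from collections import deque


class CategoricalReplay:
    """Replay buffer that keeps experiences in named categories.

    Hypothetical sketch: the category names and sampling weights are
    assumptions for illustration, not the scheme defined in the paper.
    """

    def __init__(self, weights, capacity_per_category=10_000, seed=None):
        # weights: dict mapping category name -> relative sampling weight
        self.weights = dict(weights)
        # One bounded FIFO buffer per category; old entries are evicted first.
        self.buffers = {c: deque(maxlen=capacity_per_category) for c in weights}
        self.rng = random.Random(seed)

    def store(self, category, experience):
        """Append an experience to the buffer of its category."""
        self.buffers[category].append(experience)

    def sample(self, batch_size):
        """Draw a batch, choosing a category by weight for every slot."""
        # Only categories that currently hold data can be drawn from.
        cats = [c for c, buf in self.buffers.items() if buf]
        if not cats:
            raise ValueError("replay buffer is empty")
        w = [self.weights[c] for c in cats]
        picks = self.rng.choices(cats, weights=w, k=batch_size)
        # Uniform choice inside the category breaks temporal ordering.
        return [self.rng.choice(self.buffers[c]) for c in picks]
```

With weights such as `{"good": 3.0, "bad": 1.0}` (again, illustrative labels), experiences stored under `"good"` would be drawn about three times as often as those under `"bad"` once both buffers hold data.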