Journal of Zhejiang University

Frontiers of Information Technology & Electronic Engineering 2024 Vol.25 No.12 P.1651-1663

Deep reinforcement learning for near-field wideband beamforming in STAR-RIS networks

Author(s): Ji WANG, Jiayi SUN, Wei FANG, Zhao CHEN, Yue LIU, Yuanwei LIU
Affiliation(s): 1. Department of Electronics and Information Engineering, College of Physical Science and Technology, Central China Normal University, Wuhan 430079, China
Corresponding email(s): jiwang@ccnu.edu.cn, zhao_chen@tsinghua.edu.cn
Key Words: Deep reinforcement learning, Near-field beamforming, Simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS), Wideband beam split


Ji WANG, Jiayi SUN, Wei FANG, Zhao CHEN, Yue LIU, Yuanwei LIU. Deep reinforcement learning for near-field wideband beamforming in STAR-RIS networks[J]. Frontiers of Information Technology & Electronic Engineering, 2024, 25(12): 1651-1663.

@article{FITEE.2400364,
title="Deep reinforcement learning for near-field wideband beamforming in STAR-RIS networks",
author="Ji WANG and Jiayi SUN and Wei FANG and Zhao CHEN and Yue LIU and Yuanwei LIU",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="25",
number="12",
pages="1651--1663",
year="2024",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2400364"
}

%0 Journal Article
%T Deep reinforcement learning for near-field wideband beamforming in STAR-RIS networks
%A Ji WANG
%A Jiayi SUN
%A Wei FANG
%A Zhao CHEN
%A Yue LIU
%A Yuanwei LIU
%J Frontiers of Information Technology & Electronic Engineering
%V 25
%N 12
%P 1651-1663
%@ 2095-9184
%D 2024
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2400364

TY - JOUR
T1 - Deep reinforcement learning for near-field wideband beamforming in STAR-RIS networks
A1 - Ji WANG
A1 - Jiayi SUN
A1 - Wei FANG
A1 - Zhao CHEN
A1 - Yue LIU
A1 - Yuanwei LIU
JO - Frontiers of Information Technology & Electronic Engineering
VL - 25
IS - 12
SP - 1651
EP - 1663
SN - 2095-9184
Y1 - 2024
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2400364
ER -


Abstract: A simultaneously transmitting and reflecting reconfigurable intelligent surface (STAR-RIS) assisted multi-user near-field wideband communication system is investigated, in which a robust deep reinforcement learning (DRL) based algorithm is proposed to enhance the users’ achievable rate by jointly optimizing the active beamforming at the base station (BS) and passive beamforming at the STAR-RIS. To mitigate the beam split issue, the delay-phase hybrid precoding structure is introduced to facilitate wideband beamforming. Considering the coupled nature of the STAR-RIS phase-shift model, the passive beamforming design is formulated as a problem of hybrid continuous and discrete phase-shift control, and the proposed algorithm controls the high-dimensional continuous action through hybrid action mapping. Additionally, to address the issue of biased estimation encountered by existing DRL algorithms, a softmax operator is introduced into the algorithm to mitigate this bias. Simulation results illustrate that the proposed algorithm outperforms existing algorithms and overcomes the issues of overestimation and underestimation.
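The beam split issue mentioned in the abstract can be illustrated with a minimal sketch. It assumes a far-field uniform linear array with half-wavelength spacing at the carrier frequency, which is a simplification of the near-field STAR-RIS setting, and it uses the idealized case of one time delay per antenna rather than any specific delay-phase architecture. The function name `array_gain` and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def array_gain(n_ant, f, fc, theta0, use_delay):
    """Normalized array gain toward theta0 at frequency f (far-field ULA sketch)."""
    n = np.arange(n_ant)
    # Plane-wave response of a ULA with half-wavelength spacing at fc.
    steer = np.exp(-1j * np.pi * (f / fc) * n * np.sin(theta0))
    # Phase shifters keep the phase designed at fc; ideal time delays
    # produce a phase that scales with f.
    scale = f / fc if use_delay else 1.0
    weights = np.exp(-1j * np.pi * scale * n * np.sin(theta0))
    return float(np.abs(np.vdot(weights, steer)) / n_ant)

fc = 100e9            # hypothetical carrier frequency (Hz)
f_edge = 1.1 * fc     # hypothetical band-edge subcarrier
theta0 = np.pi / 3    # hypothetical user direction (rad)

gain_ps_center = array_gain(256, fc, fc, theta0, use_delay=False)
gain_ps_edge = array_gain(256, f_edge, fc, theta0, use_delay=False)
gain_ttd_edge = array_gain(256, f_edge, fc, theta0, use_delay=True)
print(gain_ps_center, gain_ps_edge, gain_ttd_edge)
```

Under these assumptions, the phase-only beam keeps its full gain at the carrier but loses most of it at the band edge, while frequency-dependent delays keep the beam aligned across the band.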

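Deep reinforcement learning based beamforming for STAR-RIS assisted near-field wideband communication systems
Ji WANG¹, Jiayi SUN¹, Wei FANG¹, Zhao CHEN², Yue LIU³, Yuanwei LIU⁴,⁵
¹Department of Electronics and Information Engineering, College of Physical Science and Technology, Central China Normal University, Wuhan 430079, China
²Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
³Faculty of Applied Sciences, Macao Polytechnic University, Macao SAR, China
⁴School of Electronic Engineering and Computer Science, Queen Mary University of London, London E1 4NS, UK
⁵Department of Electrical and Electronic Engineering, The University of Hong Kong, Hong Kong SAR 999077, China
Summary: This paper studies a STAR-RIS assisted multi-user near-field wideband communication system and proposes a robust algorithm based on deep reinforcement learning. The users' achievable rate is improved by jointly optimizing the active beamforming at the base station and the passive beamforming at the STAR-RIS. To mitigate the beam split problem in wideband communication, a delay-phase hybrid precoding structure is introduced to enable efficient wideband beamforming. Considering the coupling in the STAR-RIS phase-shift model, the passive beamforming design is cast as a hybrid control problem over continuous and discrete phase shifts, and hybrid action mapping is used to handle the control of high-dimensional continuous actions. In addition, to address the estimation bias in existing deep reinforcement learning algorithms, a softmax operator is introduced, which effectively reduces this bias. Simulation results show that the proposed algorithm outperforms existing algorithms in overcoming overestimation and underestimation.

Key words: Deep reinforcement learning; Near-field beamforming; STAR-RIS; Wideband beam split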
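The abstract states that a softmax operator is used to reduce value-estimation bias but does not give its form. The sketch below shows the generic Boltzmann softmax operator over a set of sampled action values, under the assumption that the critic's estimates are corrupted by zero-mean noise. The function name `softmax_value`, the inverse temperature `beta`, and the noise model are illustrative choices, not the authors' implementation.

```python
import numpy as np

def softmax_value(q_values, beta):
    """Boltzmann softmax operator: average of q_values weighted by exp(beta * q)."""
    q = np.asarray(q_values, dtype=float)
    # Subtract the maximum before exponentiating for numerical stability.
    w = np.exp(beta * (q - q.max()))
    return float(np.sum(w * q) / np.sum(w))

rng = np.random.default_rng(0)
true_q = np.zeros(50)                          # assumed: every sampled action is truly worth 0
noisy_q = true_q + rng.normal(0.0, 1.0, 50)    # assumed: noisy critic estimates

mean_est = float(noisy_q.mean())               # plain average of the estimates
soft_est = softmax_value(noisy_q, beta=1.0)    # softmax-weighted estimate
max_est = float(noisy_q.max())                 # max operator, prone to overestimation
print(mean_est, soft_est, max_est)
```

For any non-negative `beta`, the softmax estimate lies between the plain average (`beta = 0`) and the maximum (large `beta`), so `beta` controls how far the target moves toward the optimistic max estimate.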


