
CLC number: TN92
On-line Access: 2026-01-09
Received: 2025-01-03
Revision Accepted: 2025-05-28
Crosschecked: 2026-01-11
Zhongyang MAO, Zhilin ZHANG, Faping LU, Xiguo LIU, Zhichao XU, Yaozong PAN, Jiafang KANG, Yang YOU. Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(12): 2672-2687.
@article{FITEE.2500007,
title="Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding",
author="Zhongyang MAO and Zhilin ZHANG and Faping LU and Xiguo LIU and Zhichao XU and Yaozong PAN and Jiafang KANG and Yang YOU",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="26",
number="12",
pages="2672--2687",
year="2025",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2500007"
}
ISSN: 2095-9184
Abstract: As human exploration of the ocean expands, demand for continuous, high-quality, and ubiquitous maritime communication is steadily increasing. However, the dynamic marine environment and tight resource constraints make it difficult for traditional heuristic resource allocation methods to balance communication quality against limited network resources; such methods yield suboptimal system throughput and rely heavily on specific problem structures. To address these issues, we propose a joint resource allocation method based on knowledge embedding. The method includes an action distribution alignment module that improves resource utilization by preventing unreasonable action-output combinations. Furthermore, by integrating knowledge embedding with meta-reinforcement learning, we formulate a physical guidance loss function that reduces the number of samples required for model training and enhances the algorithm’s generalization capability. Simulation results show that, across various channel environments, the proposed method increases average system throughput by 31.19% compared with the model-agnostic meta-learning proximal policy optimization (MAML-PPO) algorithm and by 80.91% compared with the RL² algorithm.