CLC number: TN92

On-line Access: 2026-01-09

Received: 2025-01-03

Revision Accepted: 2025-05-28

Crosschecked: 2026-01-11

 ORCID:

Zhongyang MAO

https://orcid.org/0000-0001-6279-1627

Zhilin ZHANG

https://orcid.org/0009-0006-1442-3735

Frontiers of Information Technology & Electronic Engineering  2025 Vol.26 No.12 P.2672-2687

http://doi.org/10.1631/FITEE.2500007


Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding


Author(s):  Zhongyang MAO, Zhilin ZHANG, Faping LU, Xiguo LIU, Zhichao XU, Yaozong PAN, Jiafang KANG, Yang YOU

Affiliation(s):  Naval Aviation University, Yantai 264001, China; more

Corresponding email(s):   freedom_mzy@163.com, zzl19970811@163.com

Key Words:  Marine wireless communication, Resource allocation, Knowledge embedding, Meta-reinforcement learning


Zhongyang MAO, Zhilin ZHANG, Faping LU, Xiguo LIU, Zhichao XU, Yaozong PAN, Jiafang KANG, Yang YOU. Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(12): 2672-2687.

@article{Mao2025FITEE,
title="Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding",
author="Zhongyang MAO and Zhilin ZHANG and Faping LU and Xiguo LIU and Zhichao XU and Yaozong PAN and Jiafang KANG and Yang YOU",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="26",
number="12",
pages="2672-2687",
year="2025",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.2500007"
}

%0 Journal Article
%T Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding
%A Zhongyang MAO
%A Zhilin ZHANG
%A Faping LU
%A Xiguo LIU
%A Zhichao XU
%A Yaozong PAN
%A Jiafang KANG
%A Yang YOU
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 12
%P 2672-2687
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2500007

TY  - JOUR
T1  - Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding
A1  - Zhongyang MAO
A1  - Zhilin ZHANG
A1  - Faping LU
A1  - Xiguo LIU
A1  - Zhichao XU
A1  - Yaozong PAN
A1  - Jiafang KANG
A1  - Yang YOU
JO  - Frontiers of Information Technology & Electronic Engineering
VL  - 26
IS  - 12
SP  - 2672
EP  - 2687
SN  - 2095-9184
Y1  - 2025
PB  - Zhejiang University Press & Springer
DO  - 10.1631/FITEE.2500007
ER  -


Abstract: 
As human exploration of the ocean expands, demand for continuous, high-quality, and ubiquitous maritime communication is steadily increasing. However, the dynamic marine environment and tight resource constraints make it difficult for traditional heuristic resource allocation methods to balance communication quality against limited network resources, which leads to suboptimal system throughput and over-reliance on specific problem structures. To address these issues, we introduce a joint resource allocation method based on knowledge embedding. The method includes an action distribution alignment module that improves resource utilization by preventing unreasonable combinations of action outputs. Furthermore, by integrating knowledge embedding with meta-reinforcement learning, we formulate a physical guidance loss function that reduces the number of samples required for model training and thereby improves the algorithm's generalization. Simulation results show that, averaged across various channel environments, the proposed method increases total system throughput by 31.19% relative to the model-agnostic meta-learning proximal policy optimization (MAML-PPO) algorithm and by 80.91% relative to the RL² algorithm.
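
The abstract names two mechanisms without giving their form: rejecting unreasonable action-output combinations, and a loss that embeds physical knowledge. The following is only an illustrative sketch of how such ideas are commonly expressed, not the authors' implementation. Every name and rule in it is an assumption made for illustration: the Shannon-rate throughput model, the three feasibility rules (one user per subcarrier, no power on unused subcarriers, a total power budget `p_max`), and the quadratic penalty weighted by `weight`.

```python
import numpy as np

def sum_throughput(assign, power, gain, bandwidth_hz, noise_w):
    """Shannon sum rate in bit/s (assumed throughput model).

    assign: (users, subcarriers) 0/1 matrix
    power:  (subcarriers,) transmit power in watts
    gain:   (users, subcarriers) linear channel power gains
    """
    snr = assign * gain * power[None, :] / noise_w
    return float(bandwidth_hz * np.log2(1.0 + snr).sum())

def is_feasible(assign, power, p_max):
    """Hypothetical check for 'unreasonable' joint actions."""
    users_per_subcarrier = assign.sum(axis=0)
    one_user_each = np.all(users_per_subcarrier <= 1)
    # Power placed on a subcarrier nobody uses is wasted.
    no_wasted_power = np.all(power[users_per_subcarrier == 0] == 0.0)
    within_budget = power.sum() <= p_max + 1e-12
    return bool(one_user_each and no_wasted_power and within_budget)

def guided_loss(policy_loss, power, p_max, weight=1.0):
    """Policy loss plus an assumed penalty for exceeding the power budget."""
    violation = max(0.0, float(power.sum()) - p_max)
    return policy_loss + weight * violation ** 2

# Two users, two subcarriers, unit bandwidth and unit noise power.
assign = np.array([[1, 0], [0, 1]])
power = np.array([1.0, 1.0])
gain = np.array([[3.0, 1.0], [1.0, 7.0]])
rate = sum_throughput(assign, power, gain, bandwidth_hz=1.0, noise_w=1.0)
feasible = is_feasible(assign, power, p_max=2.0)
clash = is_feasible(np.array([[1, 0], [1, 0]]), power, p_max=2.0)
penalised = guided_loss(0.5, np.array([2.0, 1.0]), p_max=2.0, weight=2.0)
```

In this toy case the two users see SNRs of 3 and 7, so the rate is log2(4) + log2(8) = 5 bit/s. The second assignment is rejected because both users claim the same subcarrier, and the penalised loss adds 2 × 1² to the base value of 0.5.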

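Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding

Zhongyang MAO1,2, Zhilin ZHANG1,2, Faping LU1,2, Xiguo LIU1,2, Zhichao XU1,2, Yaozong PAN1,2, Jiafang KANG1,2, Yang YOU3
1Naval Aviation University, Yantai 264001, China
2Shandong Provincial Key Laboratory of Sea-Air Information Perception and Processing Technology, Yantai 264001, China
3Unit 91001 of the Chinese People's Liberation Army, Beijing 100000, China
Abstract: As human exploration of the ocean continues to expand, the demand for high-quality maritime communication at all times and in all areas is gradually rising. However, the maritime environment is highly dynamic and resource-constrained, so traditional heuristic resource allocation methods struggle to balance high-quality communication against limited network resources, and they suffer from low system throughput and strong dependence on problem structure. This paper therefore proposes a joint resource allocation method based on knowledge embedding and designs an action distribution alignment module that improves resource utilization by avoiding unreasonable combinations of action outputs. In addition, knowledge embedding and meta-reinforcement learning are introduced to construct a knowledge-embedded physical guidance loss function, which effectively reduces the number of training samples and improves the generalization of the algorithm. Simulation results show that, in terms of average total system throughput across multiple channel environments, the proposed method improves on the MAML-PPO and RL² algorithms by 31.19% and 80.91%, respectively.

Key words: Maritime wireless communication; Resource allocation; Knowledge embedding; Meta-reinforcement learning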
