JZUS - Journal of Zhejiang University SCIENCE

Frontiers of Information Technology & Electronic Engineering

Accepted manuscript available online (unedited version)

Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding

Author(s): Zhongyang MAO¹,², Zhilin ZHANG¹,², Faping LU¹,², Xiguo LIU¹,², Zhichao XU¹,², Yaozong PAN¹,², Jiafang KANG¹,², Yang YOU³
Affiliation(s): ¹Naval Aeronautic University, Yantai 264001, China; ²Shandong Key Laboratory of Sea and Air Information Perception and Processing Technology, Naval Aeronautic University, Yantai 264001, China; ³PLA 91001 Unit, Beijing 100000, China
Corresponding email(s): freedom_mzy@163.com, zzl19970811@163.com
Key Words: Marine wireless communication; Resource allocation; Knowledge embedding; Meta-reinforcement learning


Zhongyang MAO, Zhilin ZHANG, Faping LU, Xiguo LIU, Zhichao XU, Yaozong PAN, Jiafang KANG, Yang YOU. Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding[J]. Frontiers of Information Technology & Electronic Engineering, in press. https://doi.org/10.1631/FITEE.2500007

@article{title="Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding",
author="Zhongyang MAO and Zhilin ZHANG and Faping LU and Xiguo LIU and Zhichao XU and Yaozong PAN and Jiafang KANG and Yang YOU",
journal="Frontiers of Information Technology & Electronic Engineering",
year="in press",
publisher="Zhejiang University Press & Springer",
doi="https://doi.org/10.1631/FITEE.2500007"
}

%0 Journal Article
%T Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding
%A Zhongyang MAO
%A Zhilin ZHANG
%A Faping LU
%A Xiguo LIU
%A Zhichao XU
%A Yaozong PAN
%A Jiafang KANG
%A Yang YOU
%J Frontiers of Information Technology & Electronic Engineering
%P
%@ 2095-9184
%D in press
%I Zhejiang University Press & Springer
%R https://doi.org/10.1631/FITEE.2500007

TY - JOUR
T1 - Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding
A1 - Zhongyang MAO
A1 - Zhilin ZHANG
A1 - Faping LU
A1 - Xiguo LIU
A1 - Zhichao XU
A1 - Yaozong PAN
A1 - Jiafang KANG
A1 - Yang YOU
JO - Frontiers of Information Technology & Electronic Engineering
SP -
EP -
SN - 2095-9184
Y1 - in press
PB - Zhejiang University Press & Springer
DO - https://doi.org/10.1631/FITEE.2500007
ER -


Abstract: As human exploration of the ocean expands, the demand for continuous, high-quality, and ubiquitous maritime communication is steadily increasing. However, the dynamic nature of the marine environment and resource constraints present significant challenges for traditional heuristic resource allocation methods, complicating the balance between high-quality communication and limited network resources. This results in suboptimal system throughput and an over-reliance on specific problem structures. To address these issues, in this paper we introduce a joint resource allocation method based on knowledge embedding. The proposed approach includes an action distribution alignment module designed to improve resource utilization by preventing unreasonable action-output combinations. Furthermore, by integrating knowledge embedding with meta-reinforcement learning techniques, a physical guidance loss function is formulated, which effectively reduces the sample size required for model training, thereby enhancing the algorithm's generalization capabilities. Simulation results show that the proposed method achieves an increase in average system throughput of 31.19% compared to the MAML-PPO algorithm and 80.91% compared to the RL2 algorithm, across various channel environments.
