Frontiers of Information Technology & Electronic Engineering

ISSN 2095-9184 (print), ISSN 2095-9230 (online)

Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding

Abstract: As human exploration of the ocean expands, the demand for continuous, high-quality, and ubiquitous maritime communication is steadily increasing. However, the marine environment is highly dynamic and resource-constrained, so traditional heuristic resource allocation methods struggle to balance high-quality communication against limited network resources; they yield low system throughput and depend heavily on specific problem structures. To address these issues, this paper introduces a joint resource allocation method based on knowledge embedding. The proposed approach includes an action distribution alignment module that improves resource utilization by preventing unreasonable combinations of action outputs. Furthermore, by integrating knowledge embedding with meta-reinforcement learning, a physical guidance loss function is formulated, which reduces the number of samples required for model training and improves the algorithm's generalization. Simulation results show that, averaged across various channel environments, the proposed method increases total system throughput by 31.19% compared with the model-agnostic meta-learning proximal policy optimization (MAML-PPO) algorithm and by 80.91% compared with the RL2 algorithm.

Key words: Marine wireless communication; Resource allocation; Knowledge embedding; Meta-reinforcement learning
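
The abstract does not specify how the action distribution alignment module rules out unreasonable action-output combinations. As a rough illustration of the general idea only, the sketch below uses plain invalid-action masking, a common stand-in technique and not the authors' module: joint (channel, power) actions that violate a constraint receive probability zero and the remaining probabilities are renormalized. The action space, the constraints, the logits, and the names `masked_softmax` and `valid_mask` are all hypothetical.

```python
import math

def masked_softmax(logits, valid):
    """Softmax over logits, with invalid entries forced to probability 0."""
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    peak = max(l for l, ok in zip(logits, valid) if ok)
    weights = [math.exp(l - peak) if ok else 0.0 for l, ok in zip(logits, valid)]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical joint action space: (channel index, transmit power in watts).
CHANNELS = [0, 1]
POWERS_W = [0.5, 1.0, 2.0]
ACTIONS = [(c, p) for c in CHANNELS for p in POWERS_W]

def valid_mask(busy_channels, power_budget_w):
    """An action is valid if its channel is free and its power fits the budget."""
    return [c not in busy_channels and p <= power_budget_w for c, p in ACTIONS]

logits = [0.2, 1.5, 3.0, 0.1, 0.4, 2.2]  # made-up policy outputs, one per action
mask = valid_mask(busy_channels={1}, power_budget_w=1.0)
probs = masked_softmax(logits, mask)
```

With channel 1 busy and a 1.0 W budget, only the first two actions keep nonzero probability, so a sampled action can never pair an occupied channel with a transmission or exceed the budget, even though the raw logits favored the 2.0 W action.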

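Chinese Summary: Dynamic joint resource allocation in maritime wireless communication networks: a meta-reinforcement learning approach based on knowledge embedding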

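Zhongyang MAO1,2, Zhilin ZHANG1,2, Faping LU1,2, Xiguo LIU1,2, Zhichao XU1,2, Yaozong PAN1,2, Jiafang KANG1,2, Yang YOU3
1 Naval Aviation University, Yantai 264001, China
2 Shandong Provincial Key Laboratory of Sea-Air Information Perception and Processing Technology, Yantai 264001, China
3 Unit 91001 of the Chinese People's Liberation Army, Beijing 100000, China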
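Abstract: As human exploration of the ocean continues to expand, the demand for all-time, all-domain, high-quality maritime communication is gradually rising. However, the maritime environment is highly dynamic and resource-constrained, which makes it difficult for traditional heuristic resource allocation methods to balance high-quality communication against limited network resources; these methods suffer from low system throughput and strong dependence on problem structure. To this end, this paper proposes a joint resource allocation method based on knowledge embedding and designs an action distribution alignment module that improves resource utilization by avoiding unreasonable combinations of action outputs. In addition, knowledge embedding and meta-reinforcement learning are introduced to construct a knowledge-embedding-based physical guidance loss function, which effectively reduces the number of training samples required and improves the generalization of the algorithm. Simulation results show that, in terms of average total system throughput across multiple channel environments, the proposed method improves on the MAML-PPO and RL2 algorithms by 31.19% and 80.91%, respectively.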

Key words: Maritime wireless communication; Resource allocation; Knowledge embedding; Meta-reinforcement learning


DOI: 10.1631/FITEE.2500007

CLC number: TN92


On-line Access: 2026-01-09

Received: 2025-01-03

Revision Accepted: 2025-05-28

Crosschecked: 2026-01-11

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE