Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2025 Vol.26 No.2 P.278-292
Adaptive layer splitting for wireless large language model inference in edge computing: a model-based reinforcement learning approach
Abstract: Optimizing the deployment of large language models (LLMs) in edge computing environments is critical for enhancing privacy and computational efficiency. Toward efficient wireless LLM inference in edge computing, this study comprehensively analyzes the impact of different splitting points in mainstream open-source LLMs. It then introduces a framework, inspired by model-based reinforcement learning (MBRL), that determines the optimal splitting point across the edge server and user equipment (UE). By incorporating a reward surrogate model, the approach significantly reduces the computational cost of frequent performance evaluations. Extensive simulations demonstrate that the method effectively balances inference performance and computational load under varying network conditions, providing a robust solution for LLM deployment in decentralized settings.
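The core idea (evaluate candidate split layers through a cheap learned surrogate of the reward rather than by running the full edge/UE pipeline each time) can be illustrated with a minimal sketch. Everything below is a hypothetical toy, not the paper's implementation: the layer count, the hand-written reward balancing quality against device and transmission cost, the tabular surrogate, and the ε-greedy selection are all illustrative assumptions.

```python
import random

NUM_LAYERS = 32  # assumed layer count of an open-source LLM

def true_reward(split: int, bandwidth: float) -> float:
    """Stand-in for an expensive end-to-end evaluation: later splits keep
    more layers on the UE (better privacy/quality proxy here) but raise
    device load; poor bandwidth penalizes transmitting activations."""
    quality = split / NUM_LAYERS
    device_cost = (split / NUM_LAYERS) ** 2
    comm_cost = (1.0 - bandwidth) * 0.5
    return quality - 0.6 * device_cost - comm_cost

class RewardSurrogate:
    """Tabular surrogate: running mean of observed rewards per split point."""
    def __init__(self):
        self.sums = [0.0] * (NUM_LAYERS + 1)
        self.counts = [0] * (NUM_LAYERS + 1)

    def update(self, split: int, reward: float) -> None:
        self.sums[split] += reward
        self.counts[split] += 1

    def predict(self, split: int) -> float:
        # Cheap estimate replacing a full inference run
        if self.counts[split] == 0:
            return 0.0
        return self.sums[split] / self.counts[split]

def choose_split(surrogate: RewardSurrogate, epsilon: float) -> int:
    if random.random() < epsilon:                 # explore a random split
        return random.randint(1, NUM_LAYERS)
    return max(range(1, NUM_LAYERS + 1),          # exploit surrogate's best
               key=surrogate.predict)

random.seed(0)
surrogate = RewardSurrogate()
for _ in range(2000):
    bw = random.uniform(0.2, 1.0)                 # varying network condition
    s = choose_split(surrogate, epsilon=0.2)
    surrogate.update(s, true_reward(s, bw))

best = max(range(1, NUM_LAYERS + 1), key=surrogate.predict)
print("best split layer:", best)
```

Because the surrogate only stores running averages, each candidate split is scored in constant time; the expensive `true_reward` call is invoked once per interaction step rather than once per candidate per step.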
Key words: Large language models (LLMs); Edge computing; Model-based reinforcement learning (MBRL); Split inference; Transformer
1 College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou 310027, China
2 Zhejiang Lab, Hangzhou 310012, China
DOI: 10.1631/FITEE.2400468
CLC number: TP391
On-line Access: 2025-03-07
Received: 2024-06-01
Revision Accepted: 2024-09-13
Crosschecked: 2025-03-07