Journal of Zhejiang University SCIENCE A 2026 Vol.27 No.4 P.365-383

http://doi.org/10.1631/jzus.A2500144


Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning


Author(s):  Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG

Affiliation(s):  1. Department of Missile Engineering, Rocket Force University of Engineering, Xi’an 710025, China

Corresponding email(s):   lishaopeng.2021@tsinghua.org.cn

Key Words:  Hypersonic glide vehicles (HGVs), Entry guidance, Reinforcement learning, Time coordination, Deep neural network (DNN)


Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG. Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning[J]. Journal of Zhejiang University Science A, 2026, 27(4): 365-383.

@article{title="Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning",
author="Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG",
journal="Journal of Zhejiang University Science A",
volume="27",
number="4",
pages="365-383",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A2500144"
}

%0 Journal Article
%T Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning
%A Zhenyu LIU
%A Gang LEI
%A Yong XIAN
%A Leliang REN
%A Shaopeng LI
%A Daqiao ZHANG
%J Journal of Zhejiang University SCIENCE A
%V 27
%N 4
%P 365-383
%@ 1673-565X
%D 2026
%I Zhejiang University Press & Springer
%R 10.1631/jzus.A2500144

TY - JOUR
T1 - Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning
A1 - Zhenyu LIU
A1 - Gang LEI
A1 - Yong XIAN
A1 - Leliang REN
A1 - Shaopeng LI
A1 - Daqiao ZHANG
JO - Journal of Zhejiang University Science A
VL - 27
IS - 4
SP - 365
EP - 383
SN - 1673-565X
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2500144
ER -


Abstract: 
To meet the requirement of simultaneous arrival for multiple hypersonic glide vehicles (HGVs), we propose a time control entry guidance (TCEG) method leveraging deep reinforcement learning. First, the entry guidance problem is solved with a reinforcement learning framework based on a designed reference flight profile. By appropriately designing the observation space and training environment, the well-trained agent demonstrates robust guidance performance under varying widths of the heading error corridor. Then, a novel method for predicting the remaining flight time is established, which consists of two main components. The first component estimates the remaining flight time using an analytical formula, while the second component employs a deep neural network (DNN) to predict the residual error between the estimated and the true value. Subsequently, based on the predicted terminal time error, the threshold of the heading error and the observation vector are corrected in real time, thereby guiding the agent to dynamically adjust its output actions. This enables precise control of the terminal time. Since the generation of guidance commands only requires forward computations by the neural network, the proposed method exhibits excellent real-time performance. Finally, the effectiveness and robustness of the method are demonstrated through numerical simulations in various scenarios.
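The remaining-flight-time predictor described above combines an analytical estimate with a DNN that learns the residual error. The paper's exact formula and network are not reproduced in this abstract; the sketch below is a minimal illustration under assumed simplifications (average-speed time-to-go, an untrained toy MLP standing in for the residual DNN, and a hypothetical feature normalization):

```python
import numpy as np

def analytical_time_to_go(range_to_go_m, velocity_mps, terminal_velocity_mps):
    # Assumption: quasi-steady glide with near-linear velocity decay, so the
    # average speed is roughly (V + V_f) / 2. This is a stand-in, not the
    # paper's analytical formula.
    v_avg = 0.5 * (velocity_mps + terminal_velocity_mps)
    return range_to_go_m / v_avg

class ResidualMLP:
    """Tiny MLP standing in for the residual-error DNN (weights untrained)."""
    def __init__(self, in_dim, hidden=16, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.b1 = np.zeros(hidden)
        self.W2 = rng.normal(0.0, 0.1, (hidden, 1))
        self.b2 = np.zeros(1)

    def __call__(self, x):
        h = np.tanh(x @ self.W1 + self.b1)          # hidden layer
        return float(h @ self.W2 + self.b2)         # scalar residual [s]

def predict_time_to_go(state, mlp):
    # state: (range-to-go [m], velocity [m/s], terminal velocity [m/s], altitude [m])
    r, v, vf, h = state
    t_est = analytical_time_to_go(r, v, vf)
    features = np.array([r / 1e6, v / 1e3, vf / 1e3, h / 1e5])  # crude scaling
    return t_est + mlp(features)   # analytical estimate + learned correction
```

In the paper the residual network is trained offline on trajectory data, so the correction term captures the systematic error of the analytical formula rather than the noise of an untrained net as here.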

Chinese summary (translated):

Title: Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning

Authors: Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG
Affiliation: Department of Missile Engineering, Rocket Force University of Engineering, Xi'an 710025, China
Objective: To meet the simultaneous-arrival (time-coordination) requirement of multiple hypersonic glide vehicles (HGVs) in cooperative strike missions.
Innovations: 1. A time control entry guidance (TCEG) framework based on deep reinforcement learning (DRL) is proposed. 2. A hybrid remaining-flight-time prediction method combining an analytical formula with a deep neural network (DNN) is designed. 3. A TCEG method that adaptively adjusts the heading error corridor is proposed, effectively integrating learning-based control with mission-level objectives in a modular way.
Method: 1. A reinforcement learning environment and observation space are designed around a reference flight profile, and the agent is trained to deliver robust guidance under heading error corridors of varying width. 2. The remaining flight time is estimated with an analytical formula, a DNN predicts the residual between the estimate and the true value, and the two are combined to improve prediction accuracy. 3. Based on the predicted arrival time error, the heading error threshold and the observation vector are corrected in real time, guiding the agent to adjust its output actions dynamically and thereby control the arrival time.
Conclusions: 1. The proposed DRL-based time control entry guidance method predicts the remaining flight time in real time and adaptively adjusts the heading error, achieving high-precision, highly robust entry guidance without complex reference trajectory design. 2. Guidance commands are generated by forward passes of the neural network, which greatly reduces the computational demand compared with conventional entry guidance methods and yields excellent real-time performance.

Keywords: Hypersonic glide vehicles; entry guidance; reinforcement learning; time coordination; deep neural network
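The real-time correction step, in which the predicted arrival time error adjusts the heading error threshold and augments the agent's observation, is not specified in closed form here. A minimal sketch, assuming a saturated linear correction law and a hypothetical observation layout (gains, bounds, and scales are all illustrative):

```python
def corrected_heading_threshold(base_deg, time_error_s,
                                gain_deg_per_s=0.5, lo=2.0, hi=15.0):
    # Sign convention (assumption): time_error_s > 0 means the vehicle is
    # predicted to arrive early, so the corridor is widened; a wider corridor
    # permits larger lateral excursions, lengthening the ground track and
    # hence the flight time. The threshold is clipped to a feasible band.
    return max(lo, min(hi, base_deg + gain_deg_per_s * time_error_s))

def corrected_observation(base_obs, time_error_s, threshold_deg,
                          t_scale=100.0, a_scale=15.0):
    # Append the normalized time error and corridor threshold so the trained
    # agent can adapt its bank commands online (feature layout is hypothetical).
    return list(base_obs) + [time_error_s / t_scale, threshold_deg / a_scale]
```

Because both corrections feed only the inputs of an already-trained policy, the online cost stays at one network forward pass per guidance cycle, consistent with the real-time claim above.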





On-line Access: 2026-04-18

Received: 2025-04-27

Revision Accepted: 2025-11-07

Crosschecked: 2026-04-20


Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2026 Journal of Zhejiang University-SCIENCE