
Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG. Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning[J]. Journal of Zhejiang University Science A, 2026, 27(4): 365-383.
@article{title="Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning",
author="Zhenyu LIU, Gang LEI, Yong XIAN, Leliang REN, Shaopeng LI, Daqiao ZHANG",
journal="Journal of Zhejiang University Science A",
volume="27",
number="4",
pages="365-383",
year="2026",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A2500144"
}
%0 Journal Article
%T Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning
%A Zhenyu LIU
%A Gang LEI
%A Yong XIAN
%A Leliang REN
%A Shaopeng LI
%A Daqiao ZHANG
%J Journal of Zhejiang University SCIENCE A
%V 27
%N 4
%P 365-383
%@ 1673-565X
%D 2026
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A2500144
TY - JOUR
T1 - Time control entry guidance method for hypersonic glide vehicles based on deep reinforcement learning
A1 - Zhenyu LIU
A1 - Gang LEI
A1 - Yong XIAN
A1 - Leliang REN
A1 - Shaopeng LI
A1 - Daqiao ZHANG
JO - Journal of Zhejiang University Science A
VL - 27
IS - 4
SP - 365
EP - 383
SN - 1673-565X
Y1 - 2026
PB - Zhejiang University Press & Springer
DO - 10.1631/jzus.A2500144
ER -
Abstract: To meet the requirement of simultaneous arrival for multiple hypersonic glide vehicles (HGVs), we propose a time control entry guidance (TCEG) method based on deep reinforcement learning. First, the entry guidance problem is solved with a reinforcement learning framework built on a designed reference flight profile. With an appropriately designed observation space and training environment, the trained agent demonstrates robust guidance performance under varying widths of the heading error corridor. Then, a novel method for predicting the remaining flight time is established, which consists of two components. The first component estimates the remaining flight time using an analytical formula, while the second employs a deep neural network (DNN) to predict the residual error between this estimate and the true value. Subsequently, based on the predicted terminal time error, the heading error threshold and the observation vector are corrected in real time, guiding the agent to dynamically adjust its output actions and enabling precise control of the terminal time. Because generating guidance commands requires only forward computations by the neural network, the proposed method exhibits excellent real-time performance. Finally, the effectiveness and robustness of the method are demonstrated through numerical simulations in various scenarios.
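The two-component time prediction and the threshold correction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the analytical formula, the network, or the correction law, so everything here is an assumption made for illustration: the `State` fields, the range-over-speed estimate in `analytic_time_to_go`, the `residual_model` callable standing in for the trained DNN, the proportional rule in `corrected_heading_threshold`, and its gain and limits. The sign convention (widening the heading error corridor when the vehicle would arrive early) is likewise assumed.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class State:
    """Hypothetical reduced vehicle state used only for this sketch."""
    range_to_go_m: float  # remaining range to the target (m)
    speed_mps: float      # current speed (m/s)


def analytic_time_to_go(state: State) -> float:
    """Stand-in analytical estimate: range-to-go divided by current speed.

    Assumption: the paper's actual formula is not stated in the abstract.
    """
    return state.range_to_go_m / state.speed_mps


def predict_time_to_go(
    state: State,
    residual_model: Callable[[Tuple[float, float]], float],
) -> float:
    """Analytical estimate plus a learned correction for its residual error.

    residual_model stands in for the trained DNN; it maps features to the
    predicted difference (true time-to-go minus analytical estimate).
    """
    features = (state.range_to_go_m, state.speed_mps)
    return analytic_time_to_go(state) + residual_model(features)


def corrected_heading_threshold(
    base_threshold_deg: float,
    elapsed_s: float,
    time_to_go_s: float,
    desired_arrival_s: float,
    gain_deg_per_s: float = 0.05,  # hypothetical gain
    min_deg: float = 1.0,          # hypothetical lower limit
    max_deg: float = 30.0,         # hypothetical upper limit
) -> float:
    """Shift the heading error threshold in proportion to the time error.

    Assumed convention: a positive error (predicted arrival earlier than
    desired) widens the corridor so the trajectory can lengthen.
    """
    time_error_s = desired_arrival_s - (elapsed_s + time_to_go_s)
    threshold = base_threshold_deg + gain_deg_per_s * time_error_s
    return max(min_deg, min(max_deg, threshold))
```

In a guidance loop, `predict_time_to_go` would be evaluated at each step and its output passed to `corrected_heading_threshold`, so each command costs only a few arithmetic operations and one forward pass of the residual model.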
On-line Access: 2026-04-18
Received: 2025-04-27
Revision Accepted: 2025-11-07
Crosschecked: 2026-04-20