
CLC number: TP242.6;TP18
On-line Access: 2025-10-13
Received: 2024-12-16
Revision Accepted: 2025-05-12
Crosschecked: 2025-10-13
Zhicheng WANG, Xin ZHAO, Meng Yee (Michael) CHUAH, Zhibin LI, Jun WU, Qiuguo ZHU. Efficient learning of robust multigait quadruped locomotion for minimizing the cost of transport[J]. Frontiers of Information Technology & Electronic Engineering, 2025, 26(9): 1679-1691.
@article{Wang2025multigait,
title="Efficient learning of robust multigait quadruped locomotion for minimizing the cost of transport",
author="Zhicheng WANG, Xin ZHAO, Meng Yee (Michael) CHUAH, Zhibin LI, Jun WU, Qiuguo ZHU",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="26",
number="9",
pages="1679-1691",
year="2025",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2401070"
}
%0 Journal Article
%T Efficient learning of robust multigait quadruped locomotion for minimizing the cost of transport
%A Zhicheng WANG
%A Xin ZHAO
%A Meng Yee (Michael) CHUAH
%A Zhibin LI
%A Jun WU
%A Qiuguo ZHU
%J Frontiers of Information Technology & Electronic Engineering
%V 26
%N 9
%P 1679-1691
%@ 2095-9184
%D 2025
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.2401070
TY - JOUR
T1 - Efficient learning of robust multigait quadruped locomotion for minimizing the cost of transport
A1 - Zhicheng WANG
A1 - Xin ZHAO
A1 - Meng Yee (Michael) CHUAH
A1 - Zhibin LI
A1 - Jun WU
A1 - Qiuguo ZHU
JO - Frontiers of Information Technology & Electronic Engineering
VL - 26
IS - 9
SP - 1679
EP - 1691
SN - 2095-9184
Y1 - 2025
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2401070
ER -
Abstract: Quadruped robots can exhibit a range of gaits, each with its own traversability and energy-efficiency characteristics. By actively coordinating between gaits in different scenarios, energy-efficient and adaptive locomotion can be achieved. This study investigates the performance of learned energy-efficient policies for quadrupedal gaits under different commands. We propose a training–synthesizing framework that integrates learned gait-conditioned locomotion policies into a single efficient multiskill locomotion policy. The resulting control policy achieves low-cost, smooth gait switching and controllable gaits. Our results demonstrate that the learned multiskill policy performs seamless gait transitions while maintaining energy optimality across all commands.
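The synthesis step described in the abstract — folding several gait-conditioned policies into one command-conditioned policy — can be illustrated with a minimal distillation sketch. This is not the paper's method: the linear "teacher" policies, the dimensions, and the least-squares student below are all hypothetical stand-ins for the learned neural policies; only the idea (a single student imitating per-gait teachers, selected by a gait command) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM, N_GAITS = 8, 4, 3  # hypothetical sizes

# Hypothetical gait-conditioned "teacher" policies: one linear map per gait.
teachers = [rng.standard_normal((ACT_DIM, OBS_DIM)) for _ in range(N_GAITS)]

def teacher_action(gait, obs):
    return teachers[gait] @ obs

# Distillation dataset: the student sees gait-conditioned features
# (Kronecker product of a one-hot gait command with the observation),
# and regresses onto the corresponding teacher's action.
X, Y = [], []
for _ in range(2000):
    gait = rng.integers(N_GAITS)
    obs = rng.standard_normal(OBS_DIM)
    X.append(np.kron(np.eye(N_GAITS)[gait], obs))
    Y.append(teacher_action(gait, obs))
X, Y = np.array(X), np.array(Y)

# One student for all gaits, fitted by least squares (behavior cloning).
W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # shape (N_GAITS*OBS_DIM, ACT_DIM)

def student_action(gait, obs):
    return np.kron(np.eye(N_GAITS)[gait], obs) @ W

# The single student reproduces each gait-conditioned teacher.
obs = rng.standard_normal(OBS_DIM)
err = max(np.max(np.abs(student_action(g, obs) - teacher_action(g, obs)))
          for g in range(N_GAITS))
```

Because the gait command gates which block of the student's weights is active, switching gaits is just a change of the conditioning input — a toy analogue of the smooth, command-driven gait switching the abstract describes.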