CLC number: TP13
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2022-03-07
Hongyang LI, Qinglai WEI. Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(7): 1010-1019.
Abstract: This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. The Nash equilibrium is then obtained by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without knowledge of the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the presented method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation results demonstrate the effectiveness of the presented method.
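The key device in this line of work is handling input saturation through a nonquadratic input energy term: bounding the control as u = -λ·tanh(w/λ) and penalizing it by 2λR∫₀ᵘ tanh⁻¹(v/λ)dv, which yields a saturated control law of the form u = -λ·tanh((1/(2λR))·gᵀ∇V). The following is a minimal single-agent, scalar sketch of that idea only — the dynamics (a, b), the weights (Q, R, λ), and the brute-force search over a quadratic critic V(x) = p·x² are all hypothetical stand-ins for the paper's actor–critic NN solution of the coupled HJB equations:

```python
import numpy as np

# Toy scalar system dx/dt = a*x + b*u with saturation |u| <= lam,
# enforced by the bounded policy u = -lam * tanh(s), s = b*(dV/dx)/(2*lam*R).
# For V(x) = p*x**2 we have dV/dx = 2*p*x, so s = b*p*x/(lam*R).
a, b = 0.5, 1.0            # hypothetical (unstable) open-loop dynamics
lam, R, Q = 1.0, 1.0, 1.0  # saturation bound and cost weights (assumed)
dt, T = 0.01, 10.0         # Euler step and rollout horizon

def simulate(p, x0=1.0):
    """Roll out the saturated policy for critic parameter p and return
    (accumulated cost J, final state). The input penalty uses the
    closed form of 2*lam*R*integral of atanh(v/lam):
    W = 2*R*lam**2 * (s*tanh(s) - log(cosh(s))) when u = -lam*tanh(s)."""
    x, J = x0, 0.0
    for _ in range(int(T / dt)):
        s = b * p * x / (lam * R)
        u = -lam * np.tanh(s)                       # saturated control
        W = 2.0 * R * lam**2 * (s * np.tanh(s) - np.log(np.cosh(s)))
        J += (Q * x * x + W) * dt                   # state + nonquadratic input cost
        x += (a * x + b * u) * dt                   # Euler integration
    return J, x

# Crude stand-in for policy iteration on the HJB equation: sweep the
# critic parameter p and keep the rollout-cost minimizer.
ps = np.linspace(0.1, 5.0, 50)
costs = [simulate(p)[0] for p in ps]
p_star = float(ps[int(np.argmin(costs))])
J_star, x_final = simulate(p_star)
print(f"best p = {p_star:.2f}, cost = {J_star:.3f}, final x = {x_final:.4f}")
```

The control never leaves [-λ, λ] by construction, and the nonquadratic penalty grows steeply as |u| approaches the bound, which is what makes the resulting HJB solution respect the saturation constraint; the paper solves the coupled multi-agent version of these equations model-free with actor–critic NNs rather than by the parameter sweep used here.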