
CLC number: TP13

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2022-03-07

ORCID:

Hongyang LI

https://orcid.org/0000-0001-5891-134X

Qinglai WEI

https://orcid.org/0000-0001-7002-9800


Frontiers of Information Technology & Electronic Engineering  2022 Vol.23 No.7 P.1010-1019

http://doi.org/10.1631/FITEE.2200010


Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game


Author(s):  Hongyang LI, Qinglai WEI

Affiliation(s):  School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; Institute of Systems Engineering, Macau University of Science and Technology, Macau 999078, China

Corresponding email(s):   lihongyang2019@ia.ac.cn, qinglai.wei@ia.ac.cn

Key Words:  Optimal synchronization control, Multi-agent systems, Nonzero-sum game, Adaptive dynamic programming, Input saturation, Off-policy reinforcement learning, Policy iteration


Hongyang LI, Qinglai WEI. Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game[J]. Frontiers of Information Technology & Electronic Engineering, 2022, 23(7): 1010-1019.

@article{FITEE2200010,
title="Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game",
author="Hongyang LI, Qinglai WEI",
journal="Frontiers of Information Technology & Electronic Engineering",
volume="23",
number="7",
pages="1010-1019",
year="2022",
publisher="Zhejiang University Press & Springer",
doi="10.1631/FITEE.2200010"
}

%0 Journal Article
%T Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game
%A Hongyang LI
%A Qinglai WEI
%J Frontiers of Information Technology & Electronic Engineering
%V 23
%N 7
%P 1010-1019
%@ 2095-9184
%D 2022
%I Zhejiang University Press & Springer
%R 10.1631/FITEE.2200010

TY - JOUR
T1 - Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game
A1 - Hongyang LI
A1 - Qinglai WEI
J0 - Frontiers of Information Technology & Electronic Engineering
VL - 23
IS - 7
SP - 1010
EP - 1019
SN - 2095-9184
Y1 - 2022
PB - Zhejiang University Press & Springer
DO - 10.1631/FITEE.2200010
ER -


Abstract: 
This paper presents a novel optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. The Nash equilibrium can then be achieved by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. A novel off-policy reinforcement learning method is presented to obtain the Nash equilibrium solution without the system models, and critic neural networks (NNs) and actor NNs are introduced to implement the presented method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium. Simulation results demonstrate the good performance of the presented method.
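The nonquadratic input energy term mentioned in the abstract is the standard device for encoding an input saturation bound in the cost functional: instead of a quadratic penalty, the integrand uses an inverse hyperbolic tangent so that the resulting optimal control law is automatically bounded. The paper's exact formulation is not reproduced here; the following is a minimal sketch of the commonly used scalar form, with the bound λ, weight r, and all numerical values chosen for illustration only:

```python
import math

def saturated_input_cost(u, lam=1.0, r=1.0, n=100_000):
    """Nonquadratic input energy W(u) = 2*lam*r * ∫_0^u atanh(v/lam) dv,
    evaluated with a midpoint rule (valid for |u| < lam)."""
    h = u / n
    total = sum(math.atanh((i + 0.5) * h / lam) for i in range(n))
    return 2.0 * lam * r * h * total

def saturated_input_cost_closed_form(u, lam=1.0, r=1.0):
    """Closed form of the same integral:
    2*lam*r * (u*atanh(u/lam) + (lam/2)*ln(1 - (u/lam)^2))."""
    return 2.0 * lam * r * (u * math.atanh(u / lam)
                            + 0.5 * lam * math.log(1.0 - (u / lam) ** 2))

numeric = saturated_input_cost(0.5)
closed = saturated_input_cost_closed_form(0.5)
print(numeric, closed)  # both ≈ 0.2616
```

Note that the penalty grows without bound as |u| approaches λ, which is what keeps the optimal control inside the saturation limit.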

Optimal synchronization control for multi-agent systems with input saturation: a nonzero-sum game approach

Hongyang LI1,2, Qinglai WEI1,2,3
1School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
2State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China
3Institute of Systems Engineering, Macau University of Science and Technology, Macau 999078, China
Abstract: This paper proposes an optimal synchronization control method for multi-agent systems with input saturation. Multi-agent game theory is introduced to transform the optimal synchronization control problem into a multi-agent nonzero-sum game. The Nash equilibrium is then achieved by solving the coupled Hamilton–Jacobi–Bellman (HJB) equations with nonquadratic input energy terms. An off-policy reinforcement learning method is proposed to obtain the Nash equilibrium solution when the system models are unknown, and critic and actor neural networks are introduced to implement the proposed method. Theoretical analysis shows that the iterative control laws converge to the Nash equilibrium, and simulation results verify the effectiveness of the proposed method.

Key words: Optimal synchronization control; Multi-agent systems; Nonzero-sum game; Adaptive dynamic programming; Input saturation; Off-policy reinforcement learning; Policy iteration




Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952783; E-mail: cjzhang@zju.edu.cn
Copyright © 2000 - 2024 Journal of Zhejiang University-SCIENCE