CLC number: TP13

On-line Access: 2021-02-01

Received: 2019-11-11

Revision Accepted: 2020-03-27

Crosschecked: 2020-09-28

Frontiers of Information Technology & Electronic Engineering  2021 Vol.22 No.2 P.155-169


Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems

Author(s):  Haiyun Zhang, Deyuan Meng, Jin Wang, Guodong Lu

Affiliation(s):  State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China

Corresponding email(s):   gray_sun@zju.edu.cn, tinydreams@126.com, dwjcom@zju.edu.cn, lugd@zju.edu.cn

Key Words:  Indirect adaptive optimal control, Hamilton-Jacobi-Bellman equation, Fuzzy-regulated critic, Adaptive optimal control actor, Actor-critic structure, Unknown nonlinear systems

Haiyun Zhang, Deyuan Meng, Jin Wang, Guodong Lu. Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems[J]. Frontiers of Information Technology & Electronic Engineering, 2021, 22(2): 155-169.

%0 Journal Article
%T Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems
%A Haiyun Zhang
%A Deyuan Meng
%A Jin Wang
%A Guodong Lu
%J Frontiers of Information Technology & Electronic Engineering
%V 22
%N 2
%P 155-169
%@ 2095-9184
%D 2021
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1900610

We present a novel indirect adaptive fuzzy-regulated optimal control scheme for continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances. First, the Hamilton-Jacobi-Bellman (HJB) equation associated with the performance function is derived for the original nonlinear system. Unlike existing adaptive dynamic programming (ADP) approaches, the scheme uses a special non-quadratic variable performance function as the reinforcement medium in the actor-critic architecture. An adaptive fuzzy-regulated critic is constructed to configure the weighting matrix of the performance function so as to approximate and balance the HJB equation, and a concurrent self-organizing learning technique adaptively updates the critic weights. Based on this critic, an adaptive optimal feedback controller is developed as the actor, using a new form of augmented Riccati equation to optimize the fuzzy-regulated variable performance function in real time. The result is an online indirect adaptive optimal control mechanism, implemented as an actor-critic structure, that adapts both the optimal cost and the optimal control policy in continuous time. Convergence and closed-loop stability of the proposed scheme are proved. Simulation examples and comparisons show the effectiveness and advantages of the proposed method.
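For orientation only, the link between the HJB equation and a Riccati equation can be checked in the textbook scalar linear-quadratic case. The sketch below is not the fuzzy-regulated scheme described in the abstract: it assumes known dynamics dx/dt = a*x + b*u and a fixed quadratic cost, and the names `scalar_lqr`, `hjb_residual`, and `simulate_cost` are illustrative, not taken from the paper.

```python
import math


def scalar_lqr(a, b, q, r):
    """Solve the scalar Riccati equation 2*a*p - (b*p)**2 / r + q = 0.

    Assumed plant: dx/dt = a*x + b*u; assumed cost: integral of q*x^2 + r*u^2.
    Returns (p, k): value function V(x) = p*x^2 and feedback law u = -k*x.
    """
    if b == 0 or q <= 0 or r <= 0:
        raise ValueError("need b != 0, q > 0, r > 0")
    # Positive (stabilizing) root of the quadratic in p.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r
    return p, k


def hjb_residual(a, b, q, r, p, x):
    """Left-hand side of the HJB equation at state x for V(x) = p*x^2."""
    dV = 2.0 * p * x                 # gradient of the candidate value function
    u = -0.5 * b * dV / r            # minimizer of the Hamiltonian over u
    return q * x * x + r * u * u + dV * (a * x + b * u)


def simulate_cost(a, b, q, r, k, x0, dt=1e-3, t_end=10.0):
    """Forward-Euler rollout of the closed loop; returns the accumulated cost."""
    x, cost = x0, 0.0
    for _ in range(int(t_end / dt)):
        u = -k * x
        cost += (q * x * x + r * u * u) * dt
        x += (a * x + b * u) * dt
    return cost


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # open-loop unstable example (a > 0)
    p, k = scalar_lqr(a, b, q, r)
    print("p =", p, "k =", k)
    print("HJB residual at x = 0.7:", hjb_residual(a, b, q, r, p, 0.7))
    print("simulated cost from x0 = 1:", simulate_cost(a, b, q, r, k, 1.0))
```

With the Riccati solution in hand, the HJB residual vanishes for every state and the simulated closed-loop cost from x0 approaches p*x0^2. An adaptive scheme such as the one summarized above has to learn the corresponding quantities online, without knowing a and b.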







