Frontiers of Information Technology & Electronic Engineering
ISSN 2095-9184 (print), ISSN 2095-9230 (online)
2021 Vol.22 No.2 P.155-169
Indirect adaptive fuzzy-regulated optimal control for unknown continuous-time nonlinear systems
Abstract: We present a novel indirect adaptive fuzzy-regulated optimal control scheme for continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances. First, the Hamilton-Jacobi-Bellman (HJB) equation associated with the performance function is derived for the original nonlinear system. Unlike existing adaptive dynamic programming (ADP) approaches, this scheme uses a special non-quadratic variable performance function as the reinforcement medium in the actor-critic architecture. An adaptive fuzzy-regulated critic structure is constructed to configure the weighting matrix of the performance function so as to approximate and balance the HJB equation. A concurrent self-organizing learning technique is designed to adaptively update the critic weights. Based on this critic, an adaptive optimal feedback controller is developed as the actor, with a new form of augmented Riccati equation, to optimize the fuzzy-regulated variable performance function in real time. The result is an online indirect adaptive optimal control mechanism implemented as an actor-critic structure, which involves continuous-time adaptation of both the optimal cost and the optimal control policy. The convergence and closed-loop stability of the proposed system are proved. Simulation examples and comparisons show the effectiveness and advantages of the proposed method.
Key words: Indirect adaptive optimal control, Hamilton-Jacobi-Bellman equation, Fuzzy-regulated critic, Adaptive optimal control actor, Actor-critic structure, Unknown nonlinear systems
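As a point of reference for the HJB and Riccati terminology in the abstract, the sketch below works through the simplest textbook special case: a scalar linear system with known dynamics and a fixed quadratic cost. It is not the fuzzy-regulated, non-quadratic scheme of the paper; the system xdot = a*x + b*u, the cost integral of q*x^2 + r*u^2, and the names scalar_lqr and hjb_residual are all assumptions introduced for illustration. In this case the value function is V(x) = p*x^2, the HJB equation reduces to the scalar Riccati equation 2*a*p - (b*p)^2/r + q = 0, and the optimal policy is the linear feedback u = -(b*p/r)*x.

```python
import math


def scalar_lqr(a, b, q, r):
    """Solve the scalar Riccati equation for xdot = a*x + b*u.

    Cost: integral of q*x^2 + r*u^2 (q > 0, r > 0, b != 0 assumed).
    Returns (p, k) with value function V(x) = p*x^2 and policy u = -k*x.
    """
    # Positive root of 2*a*p - (b*p)**2 / r + q = 0.
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)
    k = b * p / r  # optimal feedback gain
    return p, k


def hjb_residual(a, b, q, r, p, x):
    """Evaluate q*x^2 + r*u^2 + V'(x)*(a*x + b*u) under u = -(b*p/r)*x.

    The residual vanishes for all x when p solves the Riccati equation.
    """
    u = -(b * p / r) * x
    dV = 2.0 * p * x  # derivative of V(x) = p*x^2
    return q * x * x + r * u * u + dV * (a * x + b * u)


# Illustrative (hypothetical) parameters: an unstable open-loop plant.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p, k = scalar_lqr(a, b, q, r)
closed_loop_pole = a - b * k  # negative means the closed loop is stable
```

In the known-dynamics case the critic (the value function V) and the actor (the gain k) follow from one algebraic equation. The abstract describes the harder setting in which the dynamics are unknown, so both have to be adapted online.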
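Haiyun ZHANG1,2, Deyuan MENG2, Jin WANG1, Guodong LU1
1State Key Laboratory of Fluid Power and Mechatronic Systems, Zhejiang University, Hangzhou 310027, China
2Department of Mechatronic Engineering, China University of Mining and Technology, Xuzhou 221116, China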
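Abstract (translated from Chinese): For continuous-time nonlinear systems with unknown dynamics, mismatches, and disturbances, a new indirect adaptive fuzzy-regulated optimal control scheme is proposed. First, the Hamilton-Jacobi-Bellman (HJB) equation of the nonlinear system and its matching performance function are established. Unlike existing adaptive dynamic programming (ADP) methods, the proposed scheme uses a special non-quadratic variable performance function as the reinforcement medium within the actor-critic architecture. An adaptive fuzzy-regulated critic structure is constructed to configure the weighting matrix of the performance function so as to approximate and balance the nonlinear HJB equation. A concurrent self-organizing learning technique is designed to adaptively update the critic weights. On this basis, an adaptive optimal feedback controller with a new form of augmented Riccati equation is proposed as the actor to optimize the fuzzy-regulated performance function in real time. The resulting actor-critic architecture yields an online indirect adaptive optimal control mechanism that adapts both the optimal cost function and the optimal control policy continuously in real time. The convergence and closed-loop stability of the method are proved. Finally, simulations and comparisons demonstrate the effectiveness and reliability of the proposed scheme.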
DOI: 10.1631/FITEE.1900610
CLC number: TP13
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2020-09-28