JZUS - Journal of Zhejiang University SCIENCE

Journal of Zhejiang University SCIENCE C

ISSN 1869-1951 (Print), 1869-196X (Online), Monthly

2014 Vol.15 No.1 P.43-50

Adaptive dynamic programming for linear impulse systems

Xiao-hua Wang, Juan-juan Yu, Yao Huang, Hua Wang, Zhong-hua Miao

School of Mechatronics Engineering and Automation, Shanghai University, Shanghai 200072, China; Shanghai Key Laboratory of Power Station Automation Technology, Shanghai University, Shanghai 200072, China

x.wang@shu.edu.cn, zhhmiao@shu.edu.cn

Abstract: We investigate the optimization of linear impulse systems using the reinforcement-learning-based adaptive dynamic programming (ADP) method. For linear impulse systems, the optimal objective function is shown to be a quadratic form of the pre-impulse states. The ADP method provides solutions that converge iteratively to the optimal objective function: if the initial guess of the pre-impulse objective function is chosen as a quadratic form of the pre-impulse states, the ADP iterates converge to the optimal one. Although the quadratic objective function of the states can in theory be used directly within the ADP method, the matrix inversion this requires may become numerically singular as the system dimension increases. A neural-network-based ADP method can circumvent this problem. A neural network with polynomial activation functions is selected to approximate the pre-impulse objective function and is trained iteratively using the ADP method to achieve optimal control. After successful training, the optimal impulse control can be derived. Simulations are presented for illustrative purposes.
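The direct quadratic-form iteration mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract gives no system model, so the dynamics (free flow dx/dt = A x between impulses, jump x+ = x + B u at instants spaced T apart), the cost (integral of x'Qx over each interval plus u'Ru per impulse), the numerical values, and all function names are hypothetical. The iteration starts from the zero quadratic form and uses a linear solve where the text refers to matrix inversion.

```python
import numpy as np

def expm_series(M, terms=30):
    """Truncated Taylor series for exp(M); adequate when the norm of M is small."""
    out = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        out = out + term
    return out

def interval_terms(A, Q, T, steps=400):
    """Return Phi = exp(A T) and W ~ integral_0^T exp(A's) Q exp(A s) ds (trapezoid rule)."""
    dt = T / steps
    step = expm_series(A * dt)
    E = np.eye(A.shape[0])
    W = np.zeros(Q.shape, dtype=float)
    for k in range(steps + 1):
        weight = 0.5 if k in (0, steps) else 1.0
        W += weight * dt * (E.T @ Q @ E)
        E = step @ E
    return np.linalg.matrix_power(step, steps), W

def bellman_update(P, Phi, W, B, R):
    """One ADP step on V(x) = x'Px, with x the pre-impulse state.

    Assumed pre-impulse map: x_next = Phi (x + B u).
    Assumed stage cost: (x + B u)' W (x + B u) + u' R u.
    """
    M = W + Phi.T @ P @ Phi
    # The linear solve below plays the role of the matrix inversion in the text.
    G = np.linalg.solve(R + B.T @ M @ B, B.T @ M)  # minimizing impulse: u = -G x
    P_next = M - M @ B @ G
    return 0.5 * (P_next + P_next.T), G

def adp_value_iteration(A, B, Q, R, T, iters=500, tol=1e-10):
    """Iterate the quadratic form from P = 0 until successive iterates agree."""
    Phi, W = interval_terms(A, Q, T)
    P = np.zeros(A.shape)
    for _ in range(iters):
        P_next, _ = bellman_update(P, Phi, W, B, R)
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    _, G = bellman_update(P, Phi, W, B, R)
    return P, G

# Hypothetical example: an open-loop unstable plant with one impulsive input.
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
T = 0.5

P, G = adp_value_iteration(A, B, Q, R, T)
print("P =", P)
print("impulse gain G =", G)
```

In this sketch the iteration needs no initial stabilizing controller: it starts from P = 0 and relies only on the assumed pre-impulse map and cost.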

Key words: Adaptive dynamic programming (ADP), Impulse system, Optimal control, Neural network

Chinese Summary: Adaptive dynamic programming method for linear impulse systems

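Objective: For the optimal control of linear impulse systems, a recursive method based on adaptive dynamic programming is studied. The optimal objective function is approximated by a neural network, from which the optimal control law is obtained. The solution procedure applies to general impulse systems and requires no initial stabilizing controller, providing a theoretical basis for the optimal control of such systems.
Innovation: Existing research on adaptive dynamic programming is largely confined to continuous and discrete systems, and impulse systems have received little attention. This paper studies the optimal control problem for linear impulse systems, applies the adaptive dynamic programming approach, completes the related theoretical proofs for impulse systems, and confirms the convergence of the method. The optimal objective function is approximated by a neural network; once the iteration has settled, the network parameters are stable and the optimal impulse control law is obtained at the same time.
Method highlights: The optimal objective function of a linear impulse system is a quadratic form of the states, but the matrix P in it jumps in an impulsive manner. Based on this finding, the adaptive dynamic programming method, which rests on iterative learning, is suitable for solving the optimal impulse problem. The proposed method avoids the matrix inversion required by direct iteration and greatly reduces the computational load.
Conclusions: The optimal objective function of a linear impulse system is a quadratic form of the states and can be solved iteratively by the adaptive dynamic programming method, with a stable solution process. Approximating the optimal objective function with a neural network avoids matrix inversion and greatly reduces the computational load.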
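The approximation of the objective function by a network with polynomial activations can be sketched as below. This is a hedged illustration only: the pre-impulse map `F`, the interval cost matrix `W`, the input direction `b`, the weight `r`, the sampling region, and all function names are assumptions introduced here. The weights are fitted by a batch least-squares regression, which is swapped in for whatever training rule the paper uses. With a single impulsive input, the minimization over u in this sketch reduces to a scalar division.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pre-impulse model (all values are assumptions for illustration):
#   x_next = F @ (x + b * u),  stage cost = (x + b u)' W (x + b u) + r u^2
T = 0.5
F = np.array([[np.cosh(T), np.sinh(T)],
              [np.sinh(T), np.cosh(T)]])   # exp(A T) for A = [[0, 1], [1, 0]]
W = np.array([[0.6, 0.2], [0.2, 0.6]])     # assumed positive definite interval cost
b = np.array([0.0, 1.0])                   # single impulsive input direction
r = 1.0

def features(x):
    """Polynomial activations: the quadratic monomials of a 2-D state."""
    return np.array([x[0] ** 2, x[0] * x[1], x[1] ** 2])

def weights_to_matrix(w):
    """Symmetric matrix P such that w . features(x) == x' P x."""
    return np.array([[w[0], w[1] / 2.0], [w[1] / 2.0, w[2]]])

def target(x, w):
    """One-step lookahead: min over u of stage cost + V_w(next pre-impulse state)."""
    M = W + F.T @ weights_to_matrix(w) @ F
    u = -(b @ M @ x) / (r + b @ M @ b)     # scalar division, exact since V_w is quadratic
    z = x + b * u                          # post-impulse state
    return z @ W @ z + r * u ** 2 + w @ features(F @ z)

# Training states sampled once; the same batch is reused at every ADP iteration.
X = rng.uniform(-1.0, 1.0, size=(50, 2))
Phi_mat = np.array([features(x) for x in X])

w = np.zeros(3)                            # initial guess V_0 = 0
for _ in range(500):
    d = np.array([target(x, w) for x in X])
    w_next, *_ = np.linalg.lstsq(Phi_mat, d, rcond=None)
    converged = np.max(np.abs(w_next - w)) < 1e-10
    w = w_next
    if converged:
        break

print("trained weights:", w)
print("equivalent P matrix:", weights_to_matrix(w))
```

Because the targets are exactly quadratic in the sampled states, the regression reproduces the quadratic-form iteration in weight space; a richer feature set would be needed if the objective function were not quadratic.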

Key words: Impulse system; Adaptive dynamic programming; Optimal control; Neural network

DOI: 10.1631/jzus.C1300145

CLC number: TP273.1

On-line Access: 2024-08-27

Received: 2023-10-17

Revision Accepted: 2024-05-08

Crosschecked: 2013-12-19

