|
Journal of Zhejiang University SCIENCE C
ISSN 1869-1951(Print), 1869-196x(Online), Monthly
2011 Vol.12 No.1 P.17-24
Convergence analysis of an incremental approach to online inverse reinforcement learning
Abstract: Interest in inverse reinforcement learning (IRL) has recently increased, that is, interest in the problem of recovering the reward function underlying a Markov decision process (MDP) given the dynamics of the system and the behavior of an expert. This paper deals with an incremental approach to online IRL. First, the convergence property of the incremental method for the IRL problem was investigated, and the bounds of both the mistake number during the learning process and regret were provided by using a detailed proof. Then an online algorithm based on incremental error correcting was derived to deal with the IRL problem. The key idea is to add an increment to the current reward estimate each time an action mismatch occurs. This leads to an estimate that approaches a target optimal value. The proposed method was tested in a driving simulation experiment and found to be able to efficiently recover an adequate reward function.
Key words: Incremental approach, Reward recovering, Online learning, Inverse reinforcement learning, Markov decision process
References:
Open peer comments: Debate/Discuss/Question/Opinion
<1>
DOI:
10.1631/jzus.C1010010
CLC number:
TP181
Download Full Text:
Downloaded:
3162
Clicked:
8186
Cited:
0
On-line Access:
2024-08-27
Received:
2023-10-17
Revision Accepted:
2024-05-08
Crosschecked:
2010-12-06