Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

Al-Tamimi, A.; Lewis, Frank L.; Abu-Khalaf, Murad

doi:10.1109/tsmcb.2008.926614

Article · IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics) · Jul 24, 2008 · Closed access

A. Al-Tamimi · Frank L. Lewis · Murad Abu-Khalaf

Hashemite University · The University of Texas at Arlington · +1 more institution

Indexed in: Crossref, PubMed

Abstract

Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven in the case of general nonlinear systems. That is, it is shown that HDP converges to the optimal control and the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be exactly solved. The following two standard neural networks (NN) are used: a critic NN is used to approximate the value function, whereas an action network is used to approximate the optimal control policy. It is stressed that this approach allows the implementation of HDP…
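The value-iteration structure described in the abstract can be illustrated on a deliberately simple case. The sketch below is not the paper's neural-network implementation: it assumes a scalar linear system x_{k+1} = a x_k + b u_k with stage cost q x^2 + r u^2, so that each iterate has the exact form V_i(x) = p_i x^2 and the action and value updates can be solved in closed form, in line with the abstract's assumption that both updates are solved exactly. The function name `hdp_scalar_lqr` and the parameters `a`, `b`, `q`, `r` are hypothetical names chosen for illustration. Starting from V_0 = 0, the recursion V_{i+1}(x) = min_u [q x^2 + r u^2 + V_i(a x + b u)] then reduces to a scalar Riccati difference equation.

```python
def hdp_scalar_lqr(a, b, q, r, iterations=200):
    """Value iteration for x_{k+1} = a*x + b*u with stage cost q*x^2 + r*u^2.

    Assumes V_i(x) = p_i * x^2 (exact for this linear-quadratic case) and
    V_0(x) = 0. Returns the final p, the last feedback gain k (u = -k*x),
    and the list of all p_i iterates.
    """
    p = 0.0          # V_0(x) = 0
    k = 0.0
    history = [p]
    for _ in range(iterations):
        # Action update: u_i(x) = argmin_u [q x^2 + r u^2 + p (a x + b u)^2]
        # Setting the derivative in u to zero gives u = -k x with:
        k = (a * b * p) / (r + b * b * p)
        # Value update: V_{i+1}(x) = q x^2 + r u_i^2 + V_i(a x + b u_i)
        p = q + r * k * k + p * (a - b * k) ** 2
        history.append(p)
    return p, k, history


if __name__ == "__main__":
    # Open-loop unstable example (|a| > 1); the numbers are illustrative only.
    p, k, history = hdp_scalar_lqr(a=1.2, b=1.0, q=1.0, r=1.0)
    print("p =", p, "k =", k, "closed-loop pole =", 1.2 - 1.0 * k)
```

In this special case the limit of p_i should satisfy the discrete-time algebraic Riccati equation, which plays the role of the HJB equation for linear-quadratic problems, and the iterates p_i should be nondecreasing when started from zero. For general nonlinear dynamics no such closed form exists, which is where the critic and action networks mentioned in the abstract come in.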

Citation impact

Total citations: 1,064

FWCI: 37.66
Percentile: 100%
References: 42

Authors: 3

Topics & keywords

Keywords

Hamilton–Jacobi–Bellman equation
Convergence (economics)
Nonlinear system
Dynamic programming
Computer science
Mathematical optimization
Discrete time and continuous time
Mathematics
