Discrete-Time Nonlinear HJB Solution Using Approximate Dynamic Programming: Convergence Proof

Hashemite University · The University of Texas at Arlington · +1 more institution

Indexed in: Crossref, PubMed

Abstract

Convergence of the value-iteration-based heuristic dynamic programming (HDP) algorithm is proven for general nonlinear systems. That is, it is shown that HDP converges to the optimal control and to the optimal value function that solves the Hamilton-Jacobi-Bellman equation appearing in infinite-horizon discrete-time (DT) nonlinear optimal control. It is assumed that, at each iteration, the value and action update equations can be solved exactly. Two standard neural networks (NNs) are used: a critic NN approximates the value function, and an action NN approximates the optimal control policy. It is stressed that this approach allows the implementation of HDP…
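
The value-iteration structure described in the abstract can be illustrated on a case where both updates have closed forms. The sketch below is not taken from the paper: it assumes scalar linear dynamics x_{k+1} = a*x_k + b*u_k with stage cost q*x^2 + r*u^2 (r > 0), so that V_i(x) = p_i*x^2 and no neural networks are needed. The function name and parameters are hypothetical, chosen only for this illustration.

```python
def hdp_value_iteration(a, b, q, r, iterations=200, tol=1e-12):
    """Value-iteration HDP sketch for x_{k+1} = a*x_k + b*u_k (assumed scalar, linear).

    Stage cost is q*x^2 + r*u^2 with r > 0. Because V_i(x) = p_i * x^2,
    the action and value updates are solved exactly at each iteration.
    """
    p = 0.0  # start from V_0(x) = 0
    for _ in range(iterations):
        # Action update: u_i(x) = argmin_u [q x^2 + r u^2 + p (a x + b u)^2] = -gain * x
        gain = a * b * p / (r + b * b * p)
        # Value update: V_{i+1}(x) = q x^2 + r u_i(x)^2 + V_i(a x + b u_i(x))
        closed_loop = a - b * gain
        p_next = q + r * gain * gain + p * closed_loop * closed_loop
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    # Greedy feedback gain for the final value function
    gain = a * b * p / (r + b * b * p)
    return p, gain


if __name__ == "__main__":
    p, gain = hdp_value_iteration(a=1.0, b=1.0, q=1.0, r=1.0)
    print(p, gain)
```

For a = b = q = r = 1 the fixed point of this recursion satisfies p^2 = p + 1, so the iterates approach the positive root (1 + sqrt(5))/2. In the general nonlinear setting of the abstract, the closed-form updates would be replaced by the critic and action approximators.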

Citation impact

Total citations: 1,064
FWCI: 37.66
Percentile: 100%
References: 42

Authors

3

Topics & keywords

Keywords
  • Hamilton–Jacobi–Bellman equation
  • Convergence (economics)
  • Nonlinear system
  • Dynamic programming
  • Computer science
  • Mathematical optimization
  • Discrete time and continuous time
  • Mathematics