Actor-critic algorithms

Konda, Vijay R.; Tsitsiklis, John N.

Book, DSpace@MIT (Massachusetts Institute of Technology), Jan 1, 2002, Green OA

Abstract

In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with Polish state and action spaces. We state and prove two results regarding their convergence.
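The structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm as stated in the paper: it assumes a hypothetical two-state, two-action toy problem, a softmax actor, an average-reward formulation, and a TD(0) critic without eligibility traces, all chosen for brevity. The names `REWARD`, `step`, `features`, and `actor_critic`, as well as the step-size exponents, are illustrative assumptions. What the sketch does show is the critic's linear architecture built on the features psi(s, a) = grad_theta log pi(a | s), which are determined by the actor's parameterization, and a critic step size that decays more slowly than the actor's so that the two updates run on different time scales.

```python
import math
import random

N_STATES, N_ACTIONS = 2, 2
REWARD = [[0.0, 0.2], [0.0, 1.0]]  # hypothetical reward table R[s][a]


def policy(theta, s):
    """Softmax action probabilities in state s (the actor)."""
    m = max(theta[s])
    exps = [math.exp(t - m) for t in theta[s]]
    z = sum(exps)
    return [e / z for e in exps]


def features(theta, s, a):
    """Critic features psi(s, a) = grad_theta log pi(a | s), flattened."""
    psi = [0.0] * (N_STATES * N_ACTIONS)
    probs = policy(theta, s)
    for b in range(N_ACTIONS):
        psi[s * N_ACTIONS + b] = (1.0 if b == a else 0.0) - probs[b]
    return psi


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def step(rng, s, a):
    """Toy dynamics (assumed): action a leads to state a with probability 0.9."""
    s_next = a if rng.random() < 0.9 else 1 - a
    return REWARD[s][a], s_next


def actor_critic(n_steps=20000, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor parameters
    r = [0.0] * (N_STATES * N_ACTIONS)                    # critic weights
    rho = 0.0                                             # average-reward estimate
    s = 0
    a = rng.choices(range(N_ACTIONS), weights=policy(theta, s))[0]
    for k in range(1, n_steps + 1):
        beta = 1.0 / k ** 0.6   # critic step size (fast time scale)
        alpha = 1.0 / k ** 0.9  # actor step size (slow); alpha / beta -> 0
        reward, s_next = step(rng, s, a)
        a_next = rng.choices(range(N_ACTIONS), weights=policy(theta, s_next))[0]
        psi = features(theta, s, a)
        psi_next = features(theta, s_next, a_next)
        q = dot(r, psi)  # linear critic: Q(s, a) is approximated by r . psi(s, a)
        delta = reward - rho + dot(r, psi_next) - q  # average-reward TD error
        rho += beta * (reward - rho)
        r = [ri + beta * delta * p for ri, p in zip(r, psi)]
        # Actor: approximate gradient step using the critic's estimate q.
        for b in range(N_ACTIONS):
            theta[s][b] += alpha * q * psi[s * N_ACTIONS + b]
        s, a = s_next, a_next
    return theta, r, rho
```

Calling `actor_critic()` returns the actor parameters, the critic weights, and the running average-reward estimate; `policy(theta, s)` then gives the learned action probabilities in state `s`. Because the features are score functions of the policy, their expectation under the policy in any state is zero, which is a quick sanity check on `features`.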

Citation impact

Total citations: 1,818

FWCI: 8.23
Percentile: 100%
References: 17

Topics & keywords

Keywords

Markov decision process
Bellman equation
Dynamic programming
Reinforcement learning
Mathematical optimization
Computer science
Stochastic control
Optimal control

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
