On Actor-Critic Algorithms

Konda, Vijay R.; Tsitsiklis, John N.

doi:10.1137/s0363012901385691

Article | SIAM Journal on Control and Optimization | Jan 1, 2003 | Closed access

Indexed in Crossref

Abstract

In this article, we propose and analyze a class of actor-critic algorithms. These are two-time-scale algorithms in which the critic uses temporal difference learning with a linearly parameterized approximation architecture, and the actor is updated in an approximate gradient direction, based on information provided by the critic. We show that the features for the critic should ideally span a subspace prescribed by the choice of parameterization of the actor. We study actor-critic algorithms for Markov decision processes with Polish state and action spaces. We state and prove two results regarding their convergence.
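
The structure described in the abstract can be illustrated with a minimal sketch; it is not the authors' algorithm or analysis. Everything concrete here is an assumption made for illustration: the two-state toy problem in `step`, the softmax policy, the reward-maximizing sign convention, the TD(0) critic with an average-reward baseline `rho`, and the constant step sizes `alpha` (critic, fast) and `beta` (actor, slow) that stand in for the two time scales. The critic is linear in the features `psi = grad_theta log pi(a | s)`, which is one way to make the critic's features span a subspace determined by the actor's parameterization.

```python
import math
import random

N_STATES, N_ACTIONS = 2, 2  # hypothetical toy problem, for illustration only


def policy(theta, s):
    """Softmax policy over actions in state s (an assumed parameterization)."""
    m = max(theta[s])
    exps = [math.exp(t - m) for t in theta[s]]
    total = sum(exps)
    return [e / total for e in exps]


def compat_features(theta, s, a):
    """psi(s, a) = grad_theta log pi(a | s), flattened over (state, action)."""
    probs = policy(theta, s)
    psi = [0.0] * (N_STATES * N_ACTIONS)
    for b in range(N_ACTIONS):
        psi[s * N_ACTIONS + b] = (1.0 if b == a else 0.0) - probs[b]
    return psi


def step(s, a):
    """Assumed dynamics: action 1 toggles the state and pays 1; action 0 stays and pays 0."""
    next_state = 1 - s if a == 1 else s
    return next_state, float(a)


def sample_action(theta, s, rng):
    return rng.choices(range(N_ACTIONS), weights=policy(theta, s))[0]


def q_value(r, psi):
    """Linearly parameterized critic: Q_r(s, a) = r . psi(s, a)."""
    return sum(r_i * f_i for r_i, f_i in zip(r, psi))


def train(n_steps=20000, alpha=0.05, beta=0.005, seed=0):
    rng = random.Random(seed)
    theta = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor parameters
    r = [0.0] * (N_STATES * N_ACTIONS)                    # critic parameters
    rho = 0.0                                             # running average reward
    s = 0
    a = sample_action(theta, s, rng)
    for _ in range(n_steps):
        s2, reward = step(s, a)
        a2 = sample_action(theta, s2, rng)
        psi = compat_features(theta, s, a)
        psi2 = compat_features(theta, s2, a2)
        q = q_value(r, psi)
        # Temporal-difference error for the average-reward setting.
        delta = reward - rho + q_value(r, psi2) - q
        # Critic update on the fast time scale (step size alpha).
        rho += alpha * (reward - rho)
        r = [r_i + alpha * delta * f_i for r_i, f_i in zip(r, psi)]
        # Actor update on the slow time scale (step size beta), using the
        # critic's estimate q in place of the true action value.
        for b in range(N_ACTIONS):
            theta[s][b] += beta * q * psi[s * N_ACTIONS + b]
        s, a = s2, a2
    return theta


if __name__ == "__main__":
    learned = train()
    for state in range(N_STATES):
        print(state, policy(learned, state))
```

Keeping `beta` well below `alpha` lets the critic roughly track the current policy before the actor moves much. A TD(0) critic is used only to keep the sketch short; other temporal-difference variants fit the same outline.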

Citation impact

Total citations: 708

FWCI: 25.30
Percentile: 100%
References: 20

Topics & keywords

Keywords

Parameterized complexity
Mathematics
Convergence (economics)
Markov decision process
Subspace topology
Class (philosophy)
Algorithm
State (computer science)

UN Sustainable Development Goals

Peace, Justice and Strong Institutions
