A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients

Grondman, I.; Buşoniu, Lucian; Lopes, Gabriel A. D.; Babuška, Robert

doi:10.1109/tsmcc.2012.2218595

Article · IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews) · Nov 1, 2012 · Green OA

I. Grondman · Lucian Buşoniu · Gabriel A. D. Lopes · Robert Babuška

Delft University of Technology · Technical University of Cluj-Napoca · +2 more institutions

Indexed in Crossref

Abstract

Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their ability to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, none is dedicated specifically to actor-critic algorithms. This paper, therefore, describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation to deal with continuous state and action spaces. After…

Citation impact

1,036

total citations

FWCI: 21.66
Percentile: 100%
References: 107

Authors

4

Topics & keywords

Keywords

Reinforcement learning
Computer science
Artificial intelligence
Natural (archaeology)
Robotics
Function (biology)
Variance (accounting)
Machine learning
