A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients
Delft University of Technology · Technical University of Cluj-Napoca · +2 more institutions
Abstract
Policy-gradient-based actor-critic algorithms are among the most popular algorithms in the reinforcement learning framework. Their ability to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, none is dedicated specifically to actor-critic algorithms. This paper therefore describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation to deal with continuous state and action spaces. After…
Citation impact
- FWCI: 21.66
- Percentile: 100%
- References: 107
Authors: 4
Topics & keywords
- Reinforcement learning
- Computer science
- Artificial intelligence
- Natural (archaeology)
- Robotics
- Function (biology)
- Variance (accounting)
- Machine learning