Continuous control with deep reinforcement learning
Google (United States) · Google DeepMind (United Kingdom)
Abstract
We adapt the ideas underlying the success of Deep Q-Learning to the continuous action domain. We present an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces. Using the same learning algorithm, network architecture and hyper-parameters, our algorithm robustly solves more than 20 simulated physics tasks, including classic problems such as cartpole swing-up, dexterous manipulation, legged locomotion and car driving. Our algorithm is able to find policies whose performance is competitive with those found by a planning algorithm with full access to the dynamics of the domain and its derivatives. We further demonstrate that for…
Citation impact
- FWCI: 503.28
- Percentile: 100%
- References: 31
Authors (8)
- Timothy Lillicrap (corresponding author)
  Google (United States), Google DeepMind (United Kingdom)
- Jonathan J. Hunt
  Google (United States), Google DeepMind (United Kingdom)
- Alexander Pritzel
  Google (United States), Google DeepMind (United Kingdom)
- Nicolas Heess
  Google (United States), Google DeepMind (United Kingdom)
- Tom Erez
  Google (United States), Google DeepMind (United Kingdom)
Topics & keywords
- Reinforcement learning
- Computer science
- Domain (mathematical analysis)
- Artificial intelligence
- Action (physics)
- Control (management)
- Swing
- Architecture