Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Indexed in: arXiv, DataCite
Abstract
We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual interactions are used to fill a replay buffer, and the sampling ratio between demonstrations and transitions is automatically tuned via a prioritized replay mechanism. Typically, carefully engineered shaping rewards are required to enable the agents to efficiently explore on high-dimensional control problems such as robotics. They are also required for model-based acceleration methods relying on local solvers such as iLQG (e.g., Guided Policy Search and Normalized Advantage…
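The replay mechanism the abstract describes can be illustrated with a toy sketch. This is not the authors' implementation: the class name `MixedReplayBuffer`, the proportional priority rule `(|TD error| + eps) ** alpha`, and the default values of `alpha` and `eps` are all assumptions chosen for illustration. The point it shows is that demonstration and agent transitions share one buffer, so the fraction of demonstrations in a batch is not fixed by hand but follows from the current priorities.

```python
import random


class MixedReplayBuffer:
    """Toy buffer holding demonstration and agent transitions together (hypothetical)."""

    def __init__(self, alpha=0.6, eps=1e-3):
        self.alpha = alpha    # how strongly priorities skew sampling (assumed value)
        self.eps = eps        # small constant added to every priority (assumed value)
        self.items = []       # list of (transition, is_demo) pairs
        self.priorities = []  # one sampling weight per stored item

    def _weight(self, td_error):
        # Proportional prioritization: larger |TD error| means sampled more often.
        return (abs(td_error) + self.eps) ** self.alpha

    def add(self, transition, is_demo, td_error=1.0):
        self.items.append((transition, is_demo))
        self.priorities.append(self._weight(td_error))

    def sample(self, batch_size, rng=random):
        # Draw indices with probability proportional to the stored weights.
        idx = rng.choices(range(len(self.items)), weights=self.priorities, k=batch_size)
        return idx, [self.items[i] for i in idx]

    def update_priority(self, index, td_error):
        # Called after a learning step with the freshly computed TD error.
        self.priorities[index] = self._weight(td_error)

    def demo_fraction(self, indices):
        # Share of a sampled batch that came from demonstrations.
        return sum(self.items[i][1] for i in indices) / len(indices)
```

In a training loop one would call `update_priority` for each sampled index after computing its TD error; `demo_fraction` then makes it easy to watch how the demonstration share of each batch drifts as learning progresses.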
Citation impact
- Total citations: 510
- FWCI: n/a
- Percentile: n/a
- References: 20
Authors (10)
- Mel Vecerik (corresponding author)
- Todd Hester
- Jonathan Scholz
- Fumin Wang
- Olivier Pietquin
Topics & keywords
Keywords
- Robotics
- Reinforcement learning
- Artificial intelligence
- Computer science
- Robot
- Task (project management)
- Object (grammar)
- Function (biology)