Preprint | arXiv (Cornell University) | Jul 27, 2017 | Green OA

Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards

Mel Vecerik, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin
Indexed in: arXiv, DataCite

Abstract

We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual interactions are used to fill a replay buffer and the sampling ratio between demonstrations and transitions is automatically tuned via a prioritized replay mechanism. Typically, carefully engineered shaping rewards are required to enable the agents to efficiently explore on high dimensional control problems such as robotics. They are also required for model-based acceleration methods relying on local solvers such as iLQG (e.g. Guided Policy Search and Normalized Advantage…
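The replay mechanism described in the abstract can be illustrated with a minimal sketch. It assumes a generic proportional prioritized replay rule (priority = |TD error| + a small constant, sampling probability proportional to priority raised to `alpha`); the class name `MixedReplayBuffer`, the fields of `Transition`, and the values of `alpha` and `eps` are hypothetical and are not taken from the paper. The point it shows is that demonstrations and agent transitions share one buffer, so the demonstration share of each batch follows from the priorities instead of being fixed by hand.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    """One (s, a, r, s') tuple; is_demo marks demonstration data."""
    state: tuple
    action: tuple
    reward: float
    next_state: tuple
    is_demo: bool


class MixedReplayBuffer:
    """Single buffer for demonstrations and agent transitions (illustrative)."""

    def __init__(self, alpha=0.6, eps=1e-3, seed=0):
        self.alpha = alpha          # assumed prioritization exponent
        self.eps = eps              # assumed constant keeping priorities positive
        self.items = []
        self.priorities = []
        self.rng = random.Random(seed)

    def add(self, transition, td_error=1.0):
        self.items.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size):
        # Proportional prioritization: P(i) is proportional to p_i ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        return self.rng.choices(range(len(self.items)), weights=weights, k=batch_size)

    def update_priority(self, index, td_error):
        # Called after a learning step with the freshly computed TD error.
        self.priorities[index] = abs(td_error) + self.eps

    def demo_fraction(self, indices):
        # Share of demonstration transitions in a sampled batch.
        return sum(self.items[i].is_demo for i in indices) / len(indices)


# Usage: demonstrations start with large TD errors, agent data with small ones.
buffer = MixedReplayBuffer()
for _ in range(10):
    buffer.add(Transition((0.0,), (1.0,), 1.0, (1.0,), is_demo=True), td_error=10.0)
    buffer.add(Transition((0.0,), (0.0,), 0.0, (0.0,), is_demo=False), td_error=0.0)
batch = buffer.sample(64)
print("demo share of batch:", buffer.demo_fraction(batch))
```

As learning lowers the TD error on demonstrations and raises it on newly collected transitions, calls to `update_priority` shift the sampled mix toward agent data without any explicit ratio parameter.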

Citation impact

Total citations: 510
References: 20

Authors

10 authors

Topics & keywords

Keywords
  • Robotics
  • Reinforcement learning
  • Artificial intelligence
  • Computer science
  • Robot
  • Task (project management)
  • Object (grammar)
  • Function (biology)