Preprint | arXiv (Cornell University) | Jun 8, 2020 | Green OA

Conservative Q-Learning for Offline Reinforcement Learning

University of California, Berkeley

Indexed in: arXiv, DataCite

Abstract

Effectively leveraging large, previously collected datasets in reinforcement learning (RL) is a key challenge for large-scale real-world applications. Offline RL algorithms promise to learn effective policies from previously collected, static datasets without further interaction. However, in practice, offline RL presents a major challenge, and standard off-policy RL methods can fail due to overestimation of values induced by the distributional shift between the dataset and the learned policy, especially when training on complex and multi-modal data distributions. In this paper, we propose conservative Q-learning (CQL), which aims to address these limitations by learning a conservative Q-function such that the…
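
The abstract is truncated above and does not state the training objective, so the following is only a minimal sketch of one widely described form of a conservative regularizer for discrete actions, not the authors' implementation. It penalizes the gap between a soft maximum of Q over all actions and the Q-value of the action actually present in the dataset, which pushes down values for actions the data does not support. The function names, the list-based inputs, and the weight `alpha` are illustrative assumptions.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def conservative_penalty(q_rows, dataset_actions):
    """Mean gap between the soft maximum of Q over all actions and the
    Q-value of the action that appears in the dataset (hypothetical helper).

    q_rows: one list of Q-values per state, indexed by action.
    dataset_actions: the action index logged for each state.
    """
    gaps = [logsumexp(q) - q[a] for q, a in zip(q_rows, dataset_actions)]
    return sum(gaps) / len(gaps)


def conservative_q_loss(q_rows, dataset_actions, td_targets, alpha=1.0):
    """Squared TD error on dataset actions plus alpha times the penalty.

    td_targets are assumed to be precomputed Bellman backup targets;
    alpha is an assumed trade-off weight.
    """
    td_errors = [
        0.5 * (q[a] - y) ** 2
        for q, a, y in zip(q_rows, dataset_actions, td_targets)
    ]
    td_loss = sum(td_errors) / len(td_errors)
    return alpha * conservative_penalty(q_rows, dataset_actions) + td_loss
```

In this sketch the penalty is never negative, because the soft maximum is at least as large as any single Q-value, and it grows when an action absent from the data is assigned a high value.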

Citation impact

Total citations: 537
References: 60

Authors

4

Topics & keywords

Keywords
  • Reinforcement learning
  • Computer science
  • Function (biology)
  • Modal
  • Artificial intelligence
  • Machine learning
  • Key (lock)
  • Value (mathematics)