PILCO: A Model-Based and Data-Efficient Approach to Policy Search

Deisenroth, Marc Peter; Rasmussen, Carl Edward

Article · Cambridge University Engineering Department Publications Database · Jun 28, 2011 · Green OA

Marc Peter Deisenroth · Carl Edward Rasmussen

University of Washington · University of Cambridge

Abstract

In this paper, we introduce PILCO, a practical, data-efficient model-based policy search method. PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way. By learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning, PILCO can cope with very little data and facilitates learning from scratch in only a few trials. Policy evaluation is performed in closed form using state-of-the-art approximate inference. Furthermore, policy gradients are computed analytically for policy improvement. We report unprecedented learning efficiency on challenging and high-dimensional control tasks.
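As a rough illustration of what "a probabilistic dynamics model" can mean in practice, the sketch below fits a Gaussian process to one-step state changes and reports a predictive variance alongside the mean. The Gaussian process, the squared-exponential kernel, the fixed hyperparameters, the toy linear system, and all function names (`rbf_kernel`, `gp_predict`) are assumptions made for this example; none of them is stated in the abstract. The sketch covers only the one-step model and does not attempt the closed-form policy evaluation or analytic policy gradients the abstract mentions.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / lengthscale**2)


def gp_predict(X_train, y_train, X_test, noise_var=1e-2,
               lengthscale=1.0, signal_var=1.0):
    """GP posterior mean and variance of the latent function at X_test."""
    K = rbf_kernel(X_train, X_train, lengthscale, signal_var)
    K += noise_var * np.eye(len(X_train))
    K_s = rbf_kernel(X_train, X_test, lengthscale, signal_var)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = signal_var - np.sum(v**2, axis=0)
    return mean, var


# Hypothetical toy data: inputs are (state, action) pairs and the target
# is the observed change in state after one step.
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(15, 2))
y = -0.5 * X[:, 0] + 0.3 * X[:, 1] + 0.05 * rng.standard_normal(15)

# Query one point inside the visited region and one far outside it.
X_near = np.array([[0.0, 0.0]])
X_far = np.array([[5.0, 5.0]])
mean_near, var_near = gp_predict(X, y, X_near)
mean_far, var_far = gp_predict(X, y, X_far)

print("variance near the data:", var_near[0])
print("variance far from the data:", var_far[0])
```

The predictive variance grows away from the observed data, which is the kind of signal a planner can use to avoid trusting the model where it has seen few or no transitions.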

Citation impact

Total citations: 1,078

FWCI: 37.86
Percentile: 100%
References: 25

Topics & keywords

Keywords

Computer science
Reinforcement learning
Inference
Probabilistic logic
Key (lock)
Artificial intelligence
Machine learning
Policy learning
