Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning
Indexed in: arXiv, DataCite
Abstract
You are a robot and you live in a Markov decision process (MDP) with a finite or an infinite number of transitions from state-action to next states. You got brains and so you plan before you act. Luckily, your roboparents equipped you with a generative model to do some Monte-Carlo planning. The world is waiting for you and you have no time to waste. You want your planning to be efficient. Sample-efficient. Indeed, you want to exploit the possible structure of the MDP by exploring only a subset of states reachable by following near-optimal policies. You want guarantees on sample complexity that depend on a measure of the quantity of near-optimal states. You want something, that is an extension of Monte-Carlo…
Citation impact
- Total citations: 8
- FWCI: not available
- Percentile: not available
- References: 17
Authors: 3
Topics & keywords
Keywords
- Monte Carlo method
- Computer science
- Markov chain Monte Carlo
- Markov decision process
- Exploit
- Sample (material)
- Mathematical optimization
- Path (computing)
UN Sustainable Development Goals
- Peace, Justice and Strong Institutions