Preprint | arXiv.org | Apr 16, 2026 | Green OA

Blazing the trails before beating the path: Sample-efficient Monte-Carlo planning

Indexed in: arXiv, DataCite

Abstract

You are a robot and you live in a Markov decision process (MDP) with a finite or an infinite number of transitions from state-action to next states. You got brains and so you plan before you act. Luckily, your roboparents equipped you with a generative model to do some Monte-Carlo planning. The world is waiting for you and you have no time to waste. You want your planning to be efficient. Sample-efficient. Indeed, you want to exploit the possible structure of the MDP by exploring only a subset of states reachable by following near-optimal policies. You want guarantees on sample complexity that depend on a measure of the quantity of near-optimal states. You want something, that is an extension of Monte-Carlo…

Citation impact

Total citations: 8
References: 17

Authors

3

Topics & keywords

Keywords
  • Monte Carlo method
  • Computer science
  • Markov chain Monte Carlo
  • Markov decision process
  • Exploit
  • Sample (material)
  • Mathematical optimization
  • Path (computing)
UN Sustainable Development Goals
  • Peace, Justice and Strong Institutions
