Monte-Carlo Planning in Large POMDPs

IIT@MIT · UNSW Sydney

Abstract

This paper introduces a Monte-Carlo algorithm for online planning in large POMDPs. The algorithm combines a Monte-Carlo update of the agent’s belief state with a Monte-Carlo tree search from the current belief state. The new algorithm, POMCP, has two important properties. First, Monte-Carlo sampling is used to break the curse of dimensionality both during belief state updates and during planning. Second, only a black box simulator of the POMDP is required, rather than explicit probability distributions. These properties enable POMCP to plan effectively in significantly larger POMDPs than has previously been possible. We demonstrate its effectiveness in three large POMDPs. We scale up a well-known benchmark…
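The Monte-Carlo belief update described in the abstract can be illustrated with a small particle filter that only ever calls a black-box simulator. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation: the two-state "tiger" simulator, the function names `tiger_simulator` and `update_belief`, the 0.85 listening accuracy, and the rejection-sampling loop with a `max_tries` cap are all hypothetical choices made for this example.

```python
import random


def tiger_simulator(state, action, rng):
    """Hypothetical black-box generative model: (state, action) -> (next_state, obs, reward).

    The planner never sees transition or observation probabilities;
    it can only draw samples by calling this function.
    """
    if action == "listen":
        # Assumed: listening reports the true tiger side 85% of the time.
        correct = rng.random() < 0.85
        other = "left" if state == "right" else "right"
        obs = state if correct else other
        return state, obs, -1.0
    # Opening a door ends the round and the tiger is placed at random again.
    reward = -100.0 if action == "open-" + state else 10.0
    return rng.choice(["left", "right"]), "none", reward


def update_belief(particles, action, observation, simulator, rng,
                  n_particles, max_tries=100_000):
    """Monte-Carlo belief update by rejection sampling.

    Draw a state from the current particle set, push it through the
    simulator, and keep the successor only if the simulated observation
    matches the observation actually received.
    """
    new_particles = []
    tries = 0
    while len(new_particles) < n_particles and tries < max_tries:
        tries += 1
        state = rng.choice(particles)
        next_state, obs, _reward = simulator(state, action, rng)
        if obs == observation:
            new_particles.append(next_state)
    return new_particles


rng = random.Random(0)
# Start from a roughly uniform belief over the tiger's position.
belief = [rng.choice(["left", "right"]) for _ in range(1000)]
posterior = update_belief(belief, "listen", "left", tiger_simulator, rng, 1000)
# For a uniform prior, Bayes' rule puts this fraction near 0.85.
frac_left = posterior.count("left") / len(posterior)
print(f"fraction of particles with tiger on the left: {frac_left:.2f}")
```

Because the update needs only samples, its cost depends on the number of particles rather than on the size of the state space, which is the sense in which sampling sidesteps explicit probability distributions. Rejection sampling becomes wasteful when the received observation is rare, hence the `max_tries` cap in this sketch.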

Citation impact

  • Total citations: 852
  • FWCI: 41.98
  • Percentile: 100%
  • References: 16

Authors: 2

Topics & keywords

Keywords
  • Monte Carlo method
  • Computer science
  • Monte Carlo tree search
  • Benchmark (surveying)
  • Curse of dimensionality
  • Partially observable Markov decision process
  • Mathematical optimization
  • Observable

UN Sustainable Development Goals
  • Sustainable cities and communities