preprintarXiv (Cornell University)Apr 17, 2026GREEN OA

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

Meta (Israel) · Afterschool Alliance · +1 more institution

Indexed inarxivdatacite

Abstract

We study the sample complexity of learning an $ε$-optimal policy in the Stochastic Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner has access to a generative model. We show that there exists a worst-case SSP instance with $S$ states, $A$ actions, minimum cost $c_{\min}$, and maximum expected cost of the optimal policy over all states $B_{\star}$, where any algorithm requires at least $Ω(SAB_{\star}^3/(c_{\min}ε^2))$ samples to return an $ε$-optimal policy with high probability. Surprisingly, this implies that whenever $c_{\min} = 0$ an SSP problem may not be learnable, thus revealing that learning in SSPs is strictly harder than in the finite-horizon and discounted…

Citation impact

8
total citations
FWCI
Percentile
References
13
Citations per year

Authors

4

Topics & keywords

Keywords
  • Regret
  • Shortest path problem
  • Mathematical optimization
  • Generalization
  • Computer science
  • Sample (material)
  • Oracle
  • Variance reduction
No related works found for this paper.