Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

Tarbouriech, Jean; Pirotta, Matteo; Vaľko, Michal; Lazaric, Alessandro

doi:10.48550/arxiv.2604.16111

preprintarXiv (Cornell University)Apr 17, 2026GREEN OA

Sample Complexity Bounds for Stochastic Shortest Path with a Generative Model

JTJean Tarbouriech MPMatteo Pirotta MVMichal Vaľko ALAlessandro Lazaric

Meta (Israel) · Afterschool Alliance · +1 more institution

Indexed inarxivdatacite

Abstract

We study the sample complexity of learning an $ε$-optimal policy in the Stochastic Shortest Path (SSP) problem. We first derive sample complexity bounds when the learner has access to a generative model. We show that there exists a worst-case SSP instance with $S$ states, $A$ actions, minimum cost $c_{\min}$, and maximum expected cost of the optimal policy over all states $B_{\star}$, where any algorithm requires at least $Ω(SAB_{\star}^3/(c_{\min}ε^2))$ samples to return an $ε$-optimal policy with high probability. Surprisingly, this implies that whenever $c_{\min} = 0$ an SSP problem may not be learnable, thus revealing that learning in SSPs is strictly harder than in the finite-horizon and discounted…

Citation impact

8

total citations

FWCI: —
Percentile: —
References: 13

Citations per year

Authors

4

Topics & keywords

Topics

Keywords

Regret
Shortest path problem
Mathematical optimization
Generalization
Computer science
Sample (material)
Oracle
Variance reduction

No related works found for this paper.