Preprint · arXiv (Cornell University) · Apr 23, 2026 · Green OA

A single algorithm for both restless and rested rotting bandits

Afterschool Alliance · Meta (Israel) · +1 more institution

Indexed in: arXiv, DataCite

Abstract

In many application domains (e.g., recommender systems, intelligent tutoring systems), the rewards associated with the actions tend to decrease over time. This decay is caused either by the actions executed in the past (e.g., a user may get bored when songs of the same genre are recommended over and over) or by an external factor (e.g., content becomes outdated). These two situations can be modeled as specific instances of the rested and restless bandit settings, respectively, where arms are rotting (i.e., their values decrease over time). These problems were thought to be significantly different, since Levine et al. (2017) showed that state-of-the-art algorithms for restless bandits perform poorly in the rested rotting…
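
The distinction the abstract draws between the two kinds of decay can be illustrated with a small sketch. This is not the paper's algorithm or reward model: the linear decay, the noise-free rewards, and the names `expected_reward` and `play` are all assumptions made for illustration. In the rested case an arm's value depends on how many times that arm has been pulled; in the restless case it depends only on how many rounds have elapsed.

```python
from typing import List


def expected_reward(base: float, decay: float, count: int) -> float:
    """Hypothetical linear rotting: the value is non-increasing in `count`."""
    return base - decay * count


def play(schedule: List[int], n_arms: int, mode: str,
         base: float = 1.0, decay: float = 0.25) -> List[float]:
    """Return the noise-free reward collected at each round of `schedule`.

    mode="rested":   an arm decays with the number of times it was pulled.
    mode="restless": every arm decays with the round index, pulled or not.
    """
    if mode not in ("rested", "restless"):
        raise ValueError(f"unknown mode: {mode!r}")
    pulls = [0] * n_arms  # per-arm pull counters
    rewards = []
    for t, arm in enumerate(schedule):
        count = pulls[arm] if mode == "rested" else t
        rewards.append(expected_reward(base, decay, count))
        pulls[arm] += 1
    return rewards


# Pull arm 0 three times, then switch to the untouched arm 1.
schedule = [0, 0, 0, 1]
rested = play(schedule, n_arms=2, mode="rested")
restless = play(schedule, n_arms=2, mode="restless")
```

With this schedule, the rested run recovers the full value when it switches to the fresh arm, while the restless run does not, because the unused arm has decayed in the meantime.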

Citation impact

Total citations: 6
References: 0

Authors

4

Topics & keywords

Keywords
  • Regret
  • Computer science
  • Recommender system
  • Constant (computer programming)
  • Contrast (vision)
  • Value (mathematics)
  • Bounded function
  • Artificial intelligence