booknow publishers, Inc. eBooksJan 1, 2012Closed access

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems

Princeton University

Indexed incrossref

Abstract

A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations. The name bandit refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem is obtained when the player is facing many slot machines at once (a "multi-armed bandit"), and must repeatedly choose where to insert the next coin. Multi-armed bandit problems are the most basic examples of sequential decision problems with an…

Citation impact

1,541
total citations
FWCI
27.04
Percentile
100%
References
192
Citations per year

Authors

1

Topics & keywords

Keywords
  • Multi-armed bandit
  • Stochastic game
  • Regret
  • Computer science
  • Mathematical optimization
  • Simple (philosophy)
  • Mathematical economics
  • Operations research
No related works found for this paper.