Paper #206

Title:
Minimax lower bounds for the two-armed bandit problem
Authors:
Sanjeev R. Kulkarni and Gábor Lugosi
Date:
February 1997
Abstract:
We obtain minimax lower bounds on the regret for the classical two-armed bandit problem. We provide a finite-sample minimax version of the well-known $\log n$ asymptotic lower bound of Lai and Robbins. Also, in contrast to the $\log n$ asymptotic results on the regret, we show that the minimax regret is achieved by mere random guessing under fairly mild conditions on the set of allowable configurations of the two arms. That is, we show that for every allocation rule and for every $n$, there is a configuration such that the regret at time $n$ is at least $1 - \epsilon$ times the regret of random guessing, where $\epsilon$ is any small positive constant.
Keywords:
Bandit problem, minimax lower bounds
JEL codes:
C12, C73
Research area:
Microeconomics
Published in:
IEEE Transactions on Automatic Control
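
The random-guessing baseline mentioned in the abstract can be illustrated with a minimal sketch. It rests on assumptions not stated in this record: the arms are Bernoulli with success probabilities p1 and p2, and regret is the usual expected shortfall relative to always pulling the better arm. Under those assumptions, choosing an arm uniformly at random on each of n pulls gives expected regret n * |p1 - p2| / 2. The function names below are hypothetical and are not taken from the paper.

```python
import random


def random_guess_regret(n, p1, p2):
    """Exact expected regret of uniform random guessing over n pulls.

    Assumes Bernoulli arms with means p1 and p2 (an illustrative choice).
    Each pull picks the worse arm with probability 1/2 and then loses
    |p1 - p2| in expectation.
    """
    return n * abs(p1 - p2) / 2


def simulate_random_guessing(n, p1, p2, trials=2000, seed=0):
    """Monte Carlo estimate of the same quantity, as a sanity check."""
    rng = random.Random(seed)
    best_mean = max(p1, p2)
    total_regret = 0.0
    for _ in range(trials):
        reward = 0
        for _ in range(n):
            # Pick one of the two arms with equal probability.
            p = p1 if rng.random() < 0.5 else p2
            # Draw a Bernoulli reward from the chosen arm.
            if rng.random() < p:
                reward += 1
        # Regret of this run: best achievable mean reward minus realized reward.
        total_regret += n * best_mean - reward
    return total_regret / trials
```

Comparing random_guess_regret(100, 0.6, 0.4) with simulate_random_guessing(100, 0.6, 0.4) should give two nearby values. The abstract's result says that, for every allocation rule and every n, some configuration of the arms forces regret of at least $1 - \epsilon$ times this baseline.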