ASYMPTOTIC BAYES ANALYSIS FOR THE FINITE-HORIZON ONE-ARMED-BANDIT PROBLEM
Published online by Cambridge University Press: 07 January 2003
Abstract
The multiarmed-bandit problem is often taken as a basic model for the trade-off between the exploration and exploitation required for efficient optimization under uncertainty. In this article, we study the situation in which the unknown performance of a new bandit is to be evaluated and compared with that of a known one over a finite horizon. We assume that the bandits represent random variables with distributions from the one-parameter exponential family. When the objective is to maximize the Bayes expected sum of outcomes over a finite horizon, it is shown that optimal policies tend to simple limits when the length of the horizon is large.
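The finite-horizon Bayes problem the abstract describes can be illustrated with a minimal sketch, under assumptions not taken from the paper: a Bernoulli special case of the exponential family, a known arm paying `LAM` per pull, and a Beta posterior on the unknown arm, solved by backward induction (the names `value` and `optimal_first_action` are hypothetical).

```python
from functools import lru_cache

# Hypothetical illustration: one-armed Bernoulli bandit over a finite
# horizon.  The known arm yields LAM per pull; the unknown arm's success
# probability carries a Beta(a, b) posterior.
LAM = 0.6  # known arm's expected reward (assumed value)

@lru_cache(maxsize=None)
def value(a, b, t):
    """Bayes expected sum of outcomes over the remaining t pulls."""
    if t == 0:
        return 0.0
    p = a / (a + b)                      # posterior mean of unknown arm
    # Retiring to the known arm is absorbing, so its value is LAM * t.
    stay_known = LAM * t
    # Pulling the unknown arm updates the Beta posterior on each outcome.
    pull_unknown = p * (1.0 + value(a + 1, b, t - 1)) \
                 + (1.0 - p) * value(a, b + 1, t - 1)
    return max(stay_known, pull_unknown)

def optimal_first_action(a, b, horizon):
    """'unknown' if experimenting is Bayes-optimal on the first pull."""
    p = a / (a + b)
    pull = p * (1.0 + value(a + 1, b, horizon - 1)) \
         + (1.0 - p) * value(a, b + 1, horizon - 1)
    return "unknown" if pull >= LAM * horizon else "known"
```

Even when the posterior mean of the unknown arm is below `LAM`, a long enough horizon can make experimentation worthwhile, which is the exploration/exploitation trade-off whose large-horizon limit the article analyzes.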
- Type: Research Article
- Journal: Probability in the Engineering and Informational Sciences, Volume 17, Issue 1, January 2003, pp. 53-82
- Copyright: © 2003 Cambridge University Press