On the evaluation of suboptimal strategies for families of alternative bandit processes

Published online by Cambridge University Press:  14 July 2016

K. D. Glazebrook*
Affiliation: University of Newcastle upon Tyne
* Postal address: Department of Statistics, The University, Newcastle upon Tyne, NE1 7RU, U.K.

Abstract

Families of alternative bandit processes have been used as models for problems in a variety of areas. Optimal strategies for these decision processes are determined by dynamic allocation indices. These indices are here shown to play an important role in the evaluation of suboptimal strategies.

Type: Short Communications
Copyright © Applied Probability Trust 1982
