Index policies for a class of discounted restless bandits

K. D. Glazebrook; J. Niño-Mora; P. S. Ansell

doi:10.1239/aap/1037990952

Published online by Cambridge University Press: 01 July 2016

Affiliations

K. D. Glazebrook: University of Newcastle upon Tyne
J. Niño-Mora∗∗: Universitat Pompeu Fabra
P. S. Ansell∗∗∗: University of Newcastle upon Tyne

∗∗ Postal address: Department of Economics and Business, Universitat Pompeu Fabra, E-08005, Barcelona, Spain.
∗∗∗ Postal address: School of Mathematics and Statistics, University of Newcastle upon Tyne, Newcastle upon Tyne NE1 7RU, UK.

Abstract

The paper concerns a class of discounted restless bandit problems which possess an indexability property. Conservation laws yield an expression for the reward suboptimality of a general policy. These results are utilised to study the closeness to optimality of an index policy for a special class of simple and natural dual speed restless bandits for which indexability is guaranteed. The strong performance of the index policy is confirmed by a computational study.

Keywords

Conservation laws; indexability; index policy; restless bandit; suboptimality bound

MSC classification

Primary: 90B36: Scheduling theory, stochastic

Secondary: 93E20: Optimal stochastic control

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 34, Issue 4, December 2002, pp. 754–774

DOI: https://doi.org/10.1239/aap/1037990952
Copyright: © Applied Probability Trust 2002

Footnotes

∗ Current address: School of Management, University of Edinburgh, William Robertson Building, 50 George Square, Edinburgh EH8 9JY, UK. Email address: [email protected]
