Spinning plates and squad systems: policies for bi-directional restless bandits

K. D. Glazebrook; C. Kirkbride; D. Ruiz-Hernandez

doi:10.1239/aap/1143936142

Part of: Numerical methods in calculus of variations and optimal control; Hamilton-Jacobi theories, including dynamic programming

Published online by Cambridge University Press: 01 July 2016

K. D. Glazebrook∗, Lancaster University
C. Kirkbride∗∗, Lancaster University
D. Ruiz-Hernandez∗∗∗, Universitat Pompeu Fabra

∗ Postal address: Department of Mathematics and Statistics, Lancaster University, Lancaster LA1 4YF, UK. Email address: [email protected]
∗∗ Postal address: Department of Management Science, Lancaster University, Lancaster LA1 4YX, UK.
∗∗∗ Department of Economics and Business, Universitat Pompeu Fabra, Barcelona, E-08005, Spain.

Abstract

This paper concerns two families of Markov decision problem that fall within the family of (bi-directional) restless bandits, an intractable class of decision processes introduced by Whittle. The spinning plates problem concerns the optimal management of a portfolio of reward-generating assets whose yields grow with investment but otherwise tend to decline. In the model of asset exploitation called the squad system, the yield from an asset tends to decline when it is used but will recover when the asset is at rest. In all cases, simply stated conditions are given that guarantee indexability of the problem, together with conditions necessary and sufficient for its strict indexability. The index heuristics for asset activation that emerge from the analysis are assessed numerically and found to perform very strongly.
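
The spinning plates setting described above can be illustrated with a toy simulation. The sketch below is only an assumption-laden illustration, not the paper's model or its index: the state space, the reward vector, the transition probabilities `p_up` and `p_down`, and the function names are all hypothetical. In particular, `placeholder_index` is a simple one-step (myopic) stand-in and is not the Whittle index derived in the paper. It shows only the general shape of an index heuristic: compute one number per asset from its current state, then activate the `m` assets with the largest values.

```python
import random


def placeholder_index(state, rewards, p_up, p_down):
    """Hypothetical myopic index, NOT the paper's Whittle index.

    Expected next-period yield if the asset receives investment,
    minus the expected next-period yield if it is left alone.
    """
    top = len(rewards) - 1
    up, down = min(state + 1, top), max(state - 1, 0)
    if_active = p_up * rewards[up] + (1 - p_up) * rewards[state]
    if_passive = p_down * rewards[down] + (1 - p_down) * rewards[state]
    return if_active - if_passive


def step(states, m, rewards, p_up, p_down, rng):
    """Advance all assets one period under the index heuristic.

    Assumed dynamics: an invested asset moves up one state with
    probability p_up; a neglected asset moves down one state with
    probability p_down. Yield is collected from the current states.
    """
    top = len(rewards) - 1
    ranked = sorted(
        range(len(states)),
        key=lambda i: placeholder_index(states[i], rewards, p_up, p_down),
        reverse=True,
    )
    active = set(ranked[:m])  # invest in the m highest-index assets
    collected = sum(rewards[s] for s in states)
    next_states = []
    for i, s in enumerate(states):
        if i in active and rng.random() < p_up:
            s = min(s + 1, top)
        elif i not in active and rng.random() < p_down:
            s = max(s - 1, 0)
        next_states.append(s)
    return next_states, collected, active


def simulate(n_assets=5, m=2, horizon=50, seed=0):
    """Average per-period yield of the heuristic on the toy model."""
    rng = random.Random(seed)
    rewards = [0.0, 1.0, 3.0, 4.0]  # illustrative yields, increasing in state
    states = [0] * n_assets
    total = 0.0
    for _ in range(horizon):
        states, collected, _ = step(states, m, rewards, 0.7, 0.3, rng)
        total += collected
    return total / horizon
```

Calling `simulate()` returns the average yield per period for one seeded run. Swapping `placeholder_index` for a different index function is the only change needed to compare heuristics on the same toy dynamics.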

Keywords

Index policy; Lagrangian method; Markov decision problem; restless bandit; stochastic dynamic programming

MSC classification

Primary: 90C40: Markov and semi-Markov decision processes

Secondary: 49L20: Dynamic programming method; 90C39: Dynamic programming; 49M20: Methods of relaxation type

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 38, Issue 1, March 2006, pp. 95–115

DOI: https://doi.org/10.1239/aap/1143936142
Copyright: Copyright © Applied Probability Trust 2006

References

Ansell, P. S., Glazebrook, K. D., Niño-Mora, J. and O'Keeffe, M. (2003). Whittle's index policy for a multi-class queueing system with convex holding costs. Math. Meth. Operat. Res. 57, 21–39.

Gittins, J. C. (1979). Bandit processes and dynamic allocation indices. With discussion. J. R. Statist. Soc. Ser. B 41, 148–177.

Gittins, J. C. (1989). Multi-Armed Bandit Allocation Indices. John Wiley, Chichester.

Glazebrook, K. D., Lumley, R. R. and Ansell, P. S. (2003). Index heuristics for multi-class M/G/1 systems with non-preemptive service and convex holding costs. Queueing Systems 45, 81–111.

Glazebrook, K. D., Niño-Mora, J. and Ansell, P. S. (2002). Index policies for a class of discounted restless bandits. Adv. Appl. Prob. 34, 754–774.

Niño-Mora, J. (2001a). PCL-indexable restless bandits: diminishing marginal returns, optimal marginal reward rate index characterization, and a tiring–recovery model. Unpublished manuscript.

Niño-Mora, J. (2001b). Restless bandits, partial conservation laws and indexability. Adv. Appl. Prob. 33, 76–98.

Niño-Mora, J. (2002). Dynamic allocation indices for restless projects and queueing admission control: a polyhedral approach. Math. Program. 93, 361–413.

Papadimitriou, C. H. and Tsitsiklis, J. N. (1999). The complexity of optimal queueing network control. Math. Operat. Res. 24, 293–305.

Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.

Tijms, H. C. (1994). Stochastic Models: An Algorithmic Approach. John Wiley, New York.

Weber, R. R. and Weiss, G. (1990). On an index policy for restless bandits. J. Appl. Prob. 27, 637–648.

Weber, R. R. and Weiss, G. (1991). Addendum to ‘On an index policy for restless bandits’. Adv. Appl. Prob. 23, 429–430.

Whittle, P. (1988). Restless bandits: activity allocation in a changing world. In A Celebration of Applied Probability (J. Appl. Prob. Spec. Vol. 25A), Applied Probability Trust, Sheffield, pp. 287–298.
