The performance of index-based policies for bandit problems with stochastic machine availability

R. T. Dunn; K. D. Glazebrook

doi:10.1017/S0001867800010843

The performance of index-based policies for bandit problems with stochastic machine availability

Part of: Computer system organization Operations research and management science

Published online by Cambridge University Press: 01 July 2016

R. T. Dunn and

K. D. Glazebrook

Show author details

R. T. Dunn*: Affiliation:
University of Newcastle upon Tyne
K. D. Glazebrook*: Affiliation:
University of Newcastle upon Tyne
*: ∗ Postal address: Department of Statistics, University of Newcastle upon Tyne, Newcastle upon Tyne, NE1 7RU, UK.
∗∗ Email address: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

We consider generalisations of two classical stochastic scheduling models, namely the discounted branching bandit and the discounted multi-armed bandit, to the case where the collection of machines available for processing is itself a stochastic process. Under rather mild conditions on the machine availability process we obtain performance guarantees for a range of controls based on Gittins indices. Various forms of asymptotic optimality are established for index-based limit policies as the discount rate approaches 0.

Keywords

average-overtaking optimal average-reward optimal branching bandit discounted rewards Gittins index machine breakdowns multi-armed bandit problem parallel machines suboptimality bound

MSC classification

Primary: 90B36: Scheduling theory, stochastic

Secondary: 68M20: Performance evaluation; queueing; scheduling 90B22: Queues and service

Type: General Applied Probability
Information: Advances in Applied Probability , Volume 33 , Issue 2 , June 2001 , pp. 365 - 390

DOI: https://doi.org/10.1017/S0001867800010843 [Opens in a new window]
Copyright: Copyright © Applied Probability Trust 2001

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Bertsimas, D. and Niño-Mora, J. (1996). Conservation laws, extended polymatroids and multi-armed bandit problems: a polyhedral approach to indexable systems. Math. Operat. Res. 21, 257–306.CrossRef Google Scholar

Birge, J., Frenk, J. B. G., Mittenthal, J. and Rinnooy Kan, A. H. G. (1990). Single machine scheduling subject to stochastic breakdowns. Naval Res. Logist. 37, 660–677.3.0.CO;2-3>CrossRef Google Scholar

Blackwell, D. (1962). Discrete dynamic programming. Ann. Math. Statist. 33, 719–726.CrossRef Google Scholar

Dacre, M., Glazebrook, K. D. and Niño-Mora, J. (1999). The achievable region approach to the optimal control of stochastic systems (with discussion). J. R. Statist. Soc. B 61, 747–791.CrossRef Google Scholar

Denardo, E. V. and Miller, B. L. (1968). An optimality criterion for discrete dynamic programming with no discounting. Ann. Math. Statist. 39, 1220–1227.CrossRef Google Scholar

Gittins, J. C. and Jones, D. M. (1974). A dynamic allocation index for the sequential design of experiments. In Progress in Statistics (European Meeting of Statisticians, Budapest, 1972), eds Gani, J., Sarkadi, K. and Vince, I. North-Holland, Amsterdam, pp. 241–266.Google Scholar

Glazebrook, K. D. (1976). Stochastic scheduling with order constraints. Int. J. Systems Sci. 7, 657–666.CrossRef Google Scholar

Glazebrook, K. D. (1984). Scheduling stochastic jobs on a single machine subject to breakdowns. Naval Res. Logist. Quart. 31, 251–264.CrossRef Google Scholar

Glazebrook, K. D. (1987). Evaluating the effects of machine breakdowns in stochastic scheduling problems. Naval Res. Logist. 34, 319–335.3.0.CO;2-5>CrossRef Google Scholar

Glazebrook, K. D. and Garbe, R. (1999). Almost optimal policies for stochastic systems which almost satisfy conservation laws. Ann. Operat. Res. 92, 19–43.CrossRef Google Scholar

Glazebrook, K. D. and Niño-Mora, J. (2001). Scheduling multiclass queueing networks on parallel servers: approximate and heavy-traffic optimality of Klimov's rule. To appear in Operat. Res. Google Scholar

Glazebrook, K. D. and Wilkinson, D. J. (2000). Index-based policies for discounted multi-armed bandits on parallel machines. Ann. Appl. Prob. 10, 877–896.CrossRef Google Scholar

Katehakis, M. N. and Veinott, A. F. (1987). The multi-armed bandit problem: decomposition and computation. Math. Operat. Res. 12, 262–268.CrossRef Google Scholar

Pinedo, M. and Rammouz, E. (1988). A note on stochastic scheduling on a single machine subject to breakdown and repair. Prob. Eng. Inf. Sci. 2, 41–49.CrossRef Google Scholar

Puterman, M. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York.CrossRef Google Scholar

Veinott, A. F. Jr. (1966). On finding optimal policies in discrete dynamic programming with no discounting. Ann. Math. Statist. 37, 1284–1294.CrossRef Google Scholar

Weber, R. R. (1982). Scheduling jobs with stochastic processing requirements on parallel machines to minimize makespan or flowtime. J. Appl. Prob. 19, 167–182.CrossRef Google Scholar

Weber, R. R., Varaiya, P. and Walrand, J. (1986). Scheduling jobs with stochastically ordered processing times on parallel machines to minimize expected flowtime. J. Appl. Prob. 23, 841–847.CrossRef Google Scholar

Weiss, G. (1988). Branching bandit processes. Prob. Eng. Inf. Sci. 2, 269–278.CrossRef Google Scholar

Weiss, G. (1990). Approximation results in parallel machines stochastic scheduling. Ann. Operat. Res. 26, 195–242.CrossRef Google Scholar

Weiss, G. (1992). Turnpike optimality of Smith's rule in parallel machines stochastic scheduling. Math. Operat. Res. 17, 255–270.CrossRef Google Scholar

Weiss, G. (1995). On almost optimal priority rules for preemptive scheduling of stochastic jobs on parallel machines. Adv. Appl. Prob. 27, 821–839.CrossRef Google Scholar

Whittle, P. (1981). Arm acquiring bandits. Ann. Prob. 9, 284–292.CrossRef Google Scholar

Article contents

The performance of index-based policies for bandit problems with stochastic machine availability

Abstract

Keywords

MSC classification

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests