
APPROXIMATE DYNAMIC PROGRAMMING TECHNIQUES FOR THE CONTROL OF TIME-VARYING QUEUING SYSTEMS APPLIED TO CALL CENTERS WITH ABANDONMENTS AND RETRIALS

Published online by Cambridge University Press:  21 December 2009

Dennis Roubos
Affiliation: VU University Amsterdam, Faculty of Sciences, 1081 HV Amsterdam, The Netherlands
E-mail: [email protected]; [email protected]

Sandjai Bhulai
Affiliation: VU University Amsterdam, Faculty of Sciences, 1081 HV Amsterdam, The Netherlands
E-mail: [email protected]; [email protected]

Abstract

In this article we develop techniques for applying Approximate Dynamic Programming (ADP) to the control of time-varying queuing systems. First, we show that the classical state space representation in queuing systems leads to approximations that can be improved significantly by increasing the dimensionality of the state space through state disaggregation. Second, we deal with time-varying parameters by adding them to the state space, with an ADP parameterization over the augmented state. We demonstrate these techniques on optimal admission control in a retrial queue with abandonments and time-varying parameters. Numerical experiments show that our techniques achieve near-optimal performance.
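To make the general approach concrete, below is a minimal Python/NumPy sketch of the two ideas the abstract describes: approximating a relative value function with a low-dimensional fit, and augmenting the state with the arrival rate as a stand-in for time variation. This is not the authors' exact method; the model (an admission-control queue with abandonments, retrials omitted for brevity), the feature map, and all parameter values (S, mu, beta, the costs, the rate grid) are illustrative assumptions.

```python
# A minimal sketch, not the authors' exact method: approximate the
# relative value function of an admission-control queue with abandonments
# by a low-dimensional least-squares fit, keeping the arrival rate as an
# extra state feature. Retrials are omitted; all parameters are assumed.
import numpy as np

S = 3                # number of servers (assumed)
mu = 1.0             # service rate per server (assumed)
beta = 0.5           # abandonment rate per waiting customer (assumed)
N = 50               # truncation level for the exact computation
reject_cost = 5.0    # cost for rejecting an arrival (assumed)
abandon_cost = 10.0  # cost per abandonment (assumed)

def relative_values(lam, iters=5000):
    """Relative value function on the truncated chain, computed by
    uniformized relative value iteration; serves as the fitting target."""
    gamma = lam + S * mu + N * beta          # uniformization constant
    V = np.zeros(N + 1)
    for _ in range(iters):
        Vn = np.empty_like(V)
        for x in range(N + 1):
            accept = V[x + 1] if x < N else np.inf   # blocked at truncation
            reject = reject_cost + V[x]
            arrival = lam * min(accept, reject)      # optimal admission choice
            service = min(x, S) * mu * V[max(x - 1, 0)]
            abandon = max(x - S, 0) * beta * (abandon_cost + V[max(x - 1, 0)])
            idle = (gamma - lam - min(x, S) * mu - max(x - S, 0) * beta) * V[x]
            Vn[x] = (arrival + service + abandon + idle) / gamma
        V = Vn - Vn[0]                       # normalize: relative values
    return V

# Feature map: augmenting the state with the arrival rate lets one fitted
# function cover all rates, mirroring the state-augmentation idea above.
def phi(x, lam):
    return np.array([1.0, x, x * x, lam, lam * x])

# Fit the weights by least squares over several arrival rates.
rates = [2.0, 3.0, 4.0]                      # stand-in for a time-varying rate
X, y = [], []
for lam in rates:
    V = relative_values(lam)
    for x in range(N + 1):
        X.append(phi(x, lam))
        y.append(V[x])
theta, *_ = np.linalg.lstsq(np.array(X), np.array(y), rcond=None)

def admit(x, lam):
    """One-step improved admission decision from the fitted values."""
    accept = theta @ phi(x + 1, lam)
    reject = reject_cost + theta @ phi(x, lam)
    return accept <= reject

# Example: admission decisions at an intermediate, unseen arrival rate.
print([int(admit(x, 3.5)) for x in range(12)])
```

The quadratic-in-queue-length features are a common choice because relative value functions of birth-death-type queues are often well approximated by polynomials; the cross term lam * x is what lets the fitted policy adapt its admission threshold as the arrival rate changes.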

Type: Research Article

Copyright © Cambridge University Press 2009

