Constrained Discounted Markov Decision Chains

Linn I. Sennott

doi:10.1017/S0269964800002230

Abstract

A Markov decision chain with countable state space incurs two types of costs: an operating cost and a holding cost. The objective is to minimize the expected discounted operating cost, subject to a constraint on the expected discounted holding cost. The existence of an optimal randomized simple policy is proved; this is a policy that randomizes between two stationary policies that differ in at most one state. Several examples from the control of discrete-time queueing systems are discussed.
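As a rough illustration of what a randomized simple policy looks like, the Python sketch below builds a three-state toy queue and mixes two stationary policies that differ only in state 1. Everything in it is an assumption made for illustration and not taken from the paper: the transition probabilities, the cost values, the discount factor, the bound `alpha`, and the reading of "randomizes" as an independent coin flip at each visit to state 1. It only shows that the mixing probability can be tuned so the discounted holding cost meets the bound; it does not verify optimality.

```python
# Toy instance (all numbers are illustrative assumptions, not from the paper).
# States 0, 1, 2 are queue lengths; action 0 = slow service, 1 = fast service.
BETA = 0.9  # discount factor

# P[s][a][t] = probability of moving from state s to state t under action a
P = [
    [[0.5, 0.5, 0.0], [0.5, 0.5, 0.0]],  # state 0: service rate is irrelevant
    [[0.3, 0.3, 0.4], [0.7, 0.2, 0.1]],  # state 1
    [[0.0, 0.3, 0.7], [0.0, 0.7, 0.3]],  # state 2
]
OPERATING = [[0.0, 1.0]] * 3                  # fast service costs 1 per period
HOLDING = [[float(s)] * 2 for s in range(3)]  # holding cost equals queue length


def discounted_cost(policy, cost, iters=500):
    """Evaluate a stationary randomized policy by successive approximation.

    policy[s][a] is the probability of choosing action a in state s.
    Returns the expected discounted cost from each initial state.
    """
    n = len(P)
    v = [0.0] * n
    for _ in range(iters):
        v = [
            sum(
                policy[s][a]
                * (cost[s][a] + BETA * sum(P[s][a][t] * v[t] for t in range(n)))
                for a in range(2)
            )
            for s in range(n)
        ]
    return v


def mixed_policy(p):
    """Slow in state 0, fast in state 2, fast with probability p in state 1."""
    return [[1.0, 0.0], [1.0 - p, p], [0.0, 1.0]]


# p = 0 and p = 1 give two stationary policies that differ only in state 1.
hold_f = discounted_cost(mixed_policy(0.0), HOLDING)[0]
hold_g = discounted_cost(mixed_policy(1.0), HOLDING)[0]
alpha = 0.5 * (hold_f + hold_g)  # hypothetical holding-cost bound

# Bisect on p until the holding cost from state 0 meets the bound.
lo, hi = 0.0, 1.0
for _ in range(50):
    mid = 0.5 * (lo + hi)
    if discounted_cost(mixed_policy(mid), HOLDING)[0] > alpha:
        lo = mid  # constraint violated: serve fast more often
    else:
        hi = mid
p_star = hi
hold_mix = discounted_cost(mixed_policy(p_star), HOLDING)[0]
oper_mix = discounted_cost(mixed_policy(p_star), OPERATING)[0]
print(p_star, hold_mix, oper_mix)
```

Bisection is used here because, in this toy model, serving fast more often in state 1 lowers the discounted holding cost, so the cost from state 0 decreases as `p` grows. In a general model that monotonicity would need to be checked before relying on the same search.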

References

Altman, E. (1990). Denumerable constrained Markov decision problems and finite approximations. Preprint.

Bertsekas, D. (1987). Dynamic programming: Deterministic and stochastic models. Englewood Cliffs, NJ: Prentice-Hall.

Beutler, F.J. & Ross, K.W. (1985). Optimal policies for controlled Markov chains with a constraint. Journal of Mathematical Analysis and Applications 112: 236–252.

Beutler, F.J. & Ross, K.W. (1986). Time-average optimal constrained semi-Markov decision processes. Advances in Applied Probability 18: 341–359.

Borkar, V. (1990). Controlled Markov chains with constraints. Preprint.

Chitgopekar, S.S. (1975). Denumerable state Markovian sequential control processes: On randomizations of optimal policies. Naval Research Logistics Quarterly 22: 567–573.

Frid, E.B. (1972). On optimal strategies in control problems with constraints. Theory of Probability and Its Applications 17: 188–192.

Hewitt, E. & Stromberg, K. (1965). Real and abstract analysis. New York: Springer-Verlag.

Hordijk, A. (1974). Dynamic programming and Markov potential theory. Mathematical Centre Tracts 51. Amsterdam: CWI.

Hordijk, A. & Kallenberg, L.C.M. (1984). Constrained undiscounted stochastic dynamic programming. Mathematics of Operations Research 9: 276–289.

Kallenberg, L.C.M. (1980). Linear programming and finite Markovian control problems. Mathematical Centre Tracts 148. Amsterdam: CWI.

Ma, D.-J., Makowski, A.M., & Shwartz, A. (1986). Estimation and optimal control for constrained Markov chains. Proceedings of the 25th Conference on Decision and Control, Athens, Greece, pp. 994–999.

Sennott, L.I. (1989). Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs. Operations Research 37: 626–633.

Sennott, L.I. (1989). Average cost semi-Markov decision processes and the control of queueing systems. Probability in the Engineering and Informational Sciences 3: 247–272.

Sennott, L.I. (1991). Value iteration in countable state average cost Markov decision processes with unbounded costs. Annals of Operations Research 28: 261–272.
