Absorbing Continuous-Time Markov Decision Processes with Total Cost Criteria

Xianping Guo; Mantas Vykertas; Yi Zhang

doi:10.1239/aap/1370870127

Part of: Markov processes

Published online by Cambridge University Press: 22 February 2016

Xianping Guo∗: Sun Yat-Sen University
Mantas Vykertas∗∗: Open University
Yi Zhang∗∗∗: University of Liverpool
∗ Postal address: School of Mathematics and Computational Science, Sun Yat-Sen University, Guangzhou, P. R. China.
∗∗ Postal address: Department of Mathematics and Statistics, Open University, Milton Keynes, MK7 6AA, UK.
∗∗∗ Postal address: Department of Mathematical Sciences, University of Liverpool, Liverpool, L69 7ZL, UK.

Abstract

In this paper we study absorbing continuous-time Markov decision processes in Polish state spaces with unbounded transition and cost rates, and history-dependent policies. The performance measure is the expected total undiscounted cost. For the unconstrained problem, we show the existence of a deterministic stationary optimal policy, whereas for the constrained problem with N constraints, we show the existence of a mixed stationary optimal policy, where the mixture is over no more than N+1 deterministic stationary policies. Furthermore, strong duality is established for the associated linear programs.
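
As a rough illustration of the unconstrained result, the sketch below works through a finite toy model; it is not the paper's setting (Polish state spaces, unbounded rates) and not the authors' method. The states, action names, jump rates, and cost rates are all invented for the example. For a fixed deterministic stationary policy f, the expected total cost V from each transient state is assumed to solve the linear system q_x V(x) - sum over transient y of q(y|x, f(x)) V(y) = c(x, f(x)), where q_x is the total exit rate and V is zero at the absorbing state. The code enumerates every such policy and picks the one with the lowest cost from state 0.

```python
from itertools import product

# Hypothetical toy model: states 0 and 1 are transient, state 2 is absorbing
# and cost-free. Each entry maps (state, action) -> (jump rates, cost rate).
MODEL = {
    (0, "slow"): ({1: 1.0, 2: 1.0}, 2.0),
    (0, "fast"): ({1: 1.0, 2: 3.0}, 5.0),
    (1, "slow"): ({0: 1.0, 2: 1.0}, 1.0),
    (1, "fast"): ({0: 0.5, 2: 4.0}, 6.0),
}
TRANSIENT = [0, 1]  # must be numbered 0..n-1 so they can index rows
ACTIONS = ["slow", "fast"]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def total_cost(policy):
    """Expected total cost from each transient state under a fixed policy."""
    a, b = [], []
    for x in TRANSIENT:
        rates, cost = MODEL[(x, policy[x])]
        row = [-rates.get(y, 0.0) for y in TRANSIENT]
        row[x] = sum(rates.values())  # total exit rate from state x
        a.append(row)
        b.append(cost)
    return solve(a, b)


# Enumerate all deterministic stationary policies (one action per state).
values = {p: total_cost(p) for p in product(ACTIONS, repeat=len(TRANSIENT))}
best = min(values, key=lambda p: values[p][0])
print(best, values[best])
```

In this toy instance the policy selected for initial state 0 also attains the lowest cost from state 1, which matches the flavour of the existence result. Brute-force enumeration is only workable here because the model is tiny.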

Keywords

CTMDP; total cost; constrained optimality; linear program

MSC classification

Primary: 90C40: Markov and semi-Markov decision processes

Secondary: 60J25: Continuous-time Markov processes on general state spaces; 60J75: Jump processes

Type: General Applied Probability
Information: Advances in Applied Probability, Volume 45, Issue 2, June 2013, pp. 490–519

DOI: https://doi.org/10.1239/aap/1370870127
Copyright: © Applied Probability Trust
