Discounted Continuous-Time Controlled Markov Chains: Convergence of Control Models

Tomás Prieto-Rumeau; Onésimo Hernández-Lerma

doi:10.1239/jap/1354716658

Published online by Cambridge University Press: 30 January 2018

Affiliations

Tomás Prieto-Rumeau (∗): Universidad Nacional de Educación a Distancia
Onésimo Hernández-Lerma (∗∗): CINVESTAV-IPN

∗ Postal address: Departamento de Estadística, Facultad de Ciencias, Universidad Nacional de Educación a Distancia, Calle Senda del Rey 9, 28040, Madrid, Spain. Email address: [email protected]
∗∗ Postal address: Departamento de Matemáticas, CINVESTAV-IPN, México D.F. 07000, México.

Abstract

We are interested in continuous-time, denumerable state controlled Markov chains (CMCs), with compact Borel action sets, and possibly unbounded transition and reward rates, under the discounted reward optimality criterion. For such CMCs, we propose a definition of a sequence of control models {ℳn} converging to a given control model ℳ, which ensures that the discount optimal reward and policies of ℳn converge to those of ℳ. As an application, we propose a finite-state and finite-action truncation technique of the original control model ℳ, which is illustrated by approximating numerically the optimal reward and policy of a controlled population system with catastrophes. We study the corresponding convergence rates.
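
As a rough illustration of the finite-state, finite-action truncation idea, the sketch below builds a truncated model of a population system with catastrophes and solves its discounted optimality equation αV(i) = max_a { r(i,a) + Σ_j q_ij(a) V(j) }. Everything specific here is an assumption made for illustration and is not taken from the article: the birth, death, and catastrophe rates, the reward function, the action grid, the rule of dropping births at the cutoff state, and the function names `build_truncated_model` and `solve_discounted`. The solver is standard uniformization followed by value iteration, which is one common way to handle such a finite model and is not necessarily the authors' numerical method.

```python
import numpy as np

def build_truncated_model(n_states, actions, birth=1.0, death=0.5,
                          cat=2.0, gain=1.0, cost=3.0):
    """Generator Q[a, i, j] and reward rates r[a, i] on states 0..n_states-1.

    All rates and rewards are hypothetical, chosen only for illustration.
    """
    n = n_states
    Q = np.zeros((len(actions), n, n))
    r = np.zeros((len(actions), n))
    for k, a in enumerate(actions):
        for i in range(1, n):                  # state 0 is absorbing
            if i + 1 < n:                      # births past the cutoff are dropped
                Q[k, i, i + 1] += birth * i
            Q[k, i, i - 1] += death * i        # natural deaths
            Q[k, i, 0] += cat * (1.0 - a)      # catastrophe: jump to 0
            Q[k, i, i] = -Q[k, i].sum()        # each row sums to zero
        r[k] = gain * np.arange(n) - cost * a  # income minus protection cost
    return Q, r

def solve_discounted(Q, r, alpha, tol=1e-10, max_iter=200_000):
    """Value iteration on the uniformized model; returns (V, policy indices)."""
    n = Q.shape[1]
    # Uniformization rate: at least as large as every exit rate -q_ii(a).
    lam = max(1.0, float(np.max(-np.diagonal(Q, axis1=1, axis2=2))))
    P = Q / lam + np.eye(n)                    # stochastic matrices, one per action
    beta = lam / (alpha + lam)                 # contraction factor, < 1
    V = np.zeros(n)
    for _ in range(max_iter):
        values = r / (alpha + lam) + beta * (P @ V)   # shape (actions, states)
        V_new = values.max(axis=0)
        done = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if done:
            break
    return V, values.argmax(axis=0)

actions = np.linspace(0.0, 1.0, 5)             # finite grid on a compact action set
Q, r = build_truncated_model(30, actions)
V, policy = solve_discounted(Q, r, alpha=1.0)

# Compare the value at a fixed state as the truncation level grows.
for n_states in (10, 20, 40):
    Qn, rn = build_truncated_model(n_states, actions)
    Vn, _ = solve_discounted(Qn, rn, alpha=1.0)
    print(n_states, Vn[5])
```

Printing the value at a fixed state for increasing truncation levels gives an informal check of whether the truncated optimal rewards appear to stabilize; it does not by itself establish the convergence rates studied in the article.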

Keywords

Continuous-time controlled Markov chain; approximation of control models; discount optimality

MSC classification

Primary: 90C40 (Markov and semi-Markov decision processes); 60J27 (Continuous-time Markov processes on discrete state spaces)

Type: Research Article
Information: Journal of Applied Probability, Volume 49, Issue 4, December 2012, pp. 1072–1090

DOI: https://doi.org/10.1239/jap/1354716658
Copyright: © Applied Probability Trust
