On the total reward variance for continuous-time Markov reward chains

Nico M. Van Dijk; Karel Sladký

doi:10.1239/jap/1165505206

Part of: Markov processes

Published online by Cambridge University Press: 14 July 2016

Nico M. Van Dijk*
Affiliation: University of Amsterdam
Karel Sladký**
Affiliation: Institute of Information Theory and Automation, Prague
*Postal address: Department of Economic Sciences and Econometrics, University of Amsterdam, Roetersstraat 11, 1018 WB Amsterdam, The Netherlands. Email address: [email protected]
**Postal address: Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, PO Box 18, Pod Vodárenskou věží 4, 182 08 Prague 8, Czech Republic. Email address: [email protected]

Abstract

As an extension of the discrete-time case, this note investigates the variance of the total cumulative reward for continuous-time Markov reward chains with finite state spaces. The results correspond to discrete-time results. In particular, the variance growth rate is shown to be asymptotically linear in time. Expressions are provided to compute this growth rate.
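
To make the idea of a linear variance growth rate concrete, the following is a minimal sketch in plain Python, not the expressions derived in the paper. It assumes an irreducible finite-state chain with generator matrix `Q`, state-dependent reward rates `r` only (no transition rewards), and uses the standard Poisson-equation characterisation of the asymptotic variance rate. The function names `solve`, `stationary_distribution` and `mean_and_variance_rate` are hypothetical and chosen for illustration.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Bring the row with the largest entry in this column into pivot position.
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def stationary_distribution(Q):
    """Solve pi Q = 0 with sum(pi) = 1 (assumes an irreducible chain)."""
    n = len(Q)
    # Take n - 1 balance equations (rows of Q transposed) plus the normalisation row.
    A = [[Q[j][i] for j in range(n)] for i in range(n - 1)] + [[1.0] * n]
    b = [0.0] * (n - 1) + [1.0]
    return solve(A, b)


def mean_and_variance_rate(Q, r):
    """Return (g, sigma2): long-run mean reward rate and asymptotic variance rate.

    Assumption: rewards accrue at rate r[i] while the chain is in state i.
    With h solving (Pi - Q) h = r - g, where every row of Pi equals pi,
    sigma2 = 2 * sum_i pi[i] * (r[i] - g) * h[i].
    """
    n = len(Q)
    pi = stationary_distribution(Q)
    g = sum(pi[i] * r[i] for i in range(n))
    centred = [r[i] - g for i in range(n)]
    A = [[pi[j] - Q[i][j] for j in range(n)] for i in range(n)]
    h = solve(A, centred)
    sigma2 = 2.0 * sum(pi[i] * centred[i] * h[i] for i in range(n))
    return g, sigma2


if __name__ == "__main__":
    # Two-state chain: rate 1 from state 0 to 1, rate 3 from state 1 to 0.
    Q = [[-1.0, 1.0], [3.0, -3.0]]
    r = [0.0, 1.0]  # reward accrues only in state 1
    print(mean_and_variance_rate(Q, r))
```

For a two-state chain with rates a (0 to 1) and b (1 to 0) and reward rate 1 in state 1 only, the result can be checked against the closed forms g = a/(a+b) and σ² = 2ab/(a+b)³. Under these assumptions the variance of the cumulative reward over [0, t] behaves like σ²·t for large t.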

Keywords

Continuous-time Markov reward chain; variance of cumulative reward; asymptotic behaviour; uniformization

MSC classification

Primary: 90C47: Minimax problems

Secondary: 60J27: Continuous-time Markov processes on discrete state spaces

Type: Research Papers
Information: Journal of Applied Probability, Volume 43, Issue 4, December 2006, pp. 1044-1052

DOI: https://doi.org/10.1239/jap/1165505206
Copyright: © Applied Probability Trust 2006
