Optimal stationary strategies in leavable Markov decision processes

Published online by Cambridge University Press:  14 July 2016

Abstract

This paper establishes the existence of an optimal stationary strategy in a leavable Markov decision process with countable state space and undiscounted total reward criterion.

Besides assumptions of boundedness and continuity, a further assumption is imposed on the model: the mean recurrence times must be continuous on a subset of the stationary strategies, the so-called ‘good strategies’. For practical applications it is important that this assumption is implied by an assumption about the cost structure and the transition probabilities. In the last part we point out that our results cannot, in general, be deduced from related works on bias-optimality by Dekker and Hordijk, Wijngaard, or Mann.

Type
Research Papers
Copyright
Copyright © Applied Probability Trust 1990 

References

[1] Chung, K. (1967) Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin.
[2] Dekker, R. and Hordijk, A. (1989) Average, sensitive and Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards. Math. Operat. Res.
[3] Deppe, H. (1985) Continuity of mean recurrence times in denumerable semi-Markov processes. Z. Wahrscheinlichkeitsth. 69, 581–592.
[4] Dubins, L. E. and Savage, L. J. (1965) How to Gamble if You Must. McGraw-Hill, New York.
[5] Federgruen, A., Hordijk, A. and Tijms, H. C. (1979) Denumerable state semi-Markov processes with unbounded cost, average cost criterion. Stoch. Proc. Appl. 9, 223–235.
[6] Hordijk, A. (1974) Dynamic Programming and Markov Potential Theory. Mathematical Centre Tracts 51, Amsterdam.
[7] Mann, E. (1985) Optimality equations and sensitive optimality in bounded Markov decision processes. Optimization 16, 767–781.
[8] Schäl, M. (1989) On stochastic dynamic programming: A bridge between Markov decision processes and gambling.
[9] Schweitzer, P. J. (1968) Perturbation theory and finite Markov chains. J. Appl. Prob. 5, 401–413.
[10] Sudderth, W. D. (1969) On the existence of good stationary strategies. Trans. Amer. Math. Soc. 135, 399–414.
[11] Wijngaard, J. (1976) Sensitive optimality in stationary Markovian decision problems on a general state space. Proc. Advanced Seminar on Markov Decision Theory, Amsterdam 1976, 58–93.