Optimal stationary strategies in leavable Markov decision processes

Matthias Fassbender

doi:10.2307/3214601

Published online by Cambridge University Press: 14 July 2016

Abstract

This paper establishes the existence of an optimal stationary strategy in a leavable Markov decision process with countable state space and undiscounted total reward criterion.

Besides assumptions of boundedness and continuity, an assumption is imposed on the model which demands the continuity of the mean recurrence times on a subset of the stationary strategies, the so-called ‘good strategies’. For practical applications it is important that this assumption is implied by an assumption about the cost structure and the transition probabilities. In the last part we point out that our results in general cannot be deduced from related works on bias-optimality by Dekker and Hordijk, Wijngaard or Mann.
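As a rough illustration of the quantity named in the abstract, the sketch below computes the mean recurrence time of a state in a small finite Markov chain by first-step analysis. It is a toy example only: the paper works with a countable state space, and the function name `mean_recurrence_time`, the two-state matrix `P`, and the parameters `a` and `b` are assumptions introduced here, not taken from the paper.

```python
from fractions import Fraction


def mean_recurrence_time(P, i):
    """Expected return time to state i in a finite irreducible chain.

    P is a square transition matrix (list of rows of Fractions).
    Hypothetical helper for illustration; not the paper's construction.
    """
    n = len(P)
    others = [j for j in range(n) if j != i]
    m = len(others)
    # Augmented system (I - Q) h = 1, where Q is P restricted to states != i
    # and h[j] is the expected number of steps to reach i from j.
    A = [
        [(Fraction(1) if r == c else Fraction(0)) - P[others[r]][others[c]]
         for c in range(m)] + [Fraction(1)]
        for r in range(m)
    ]
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(m):
        pivot = next(r for r in range(col, m) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        pv = A[col][col]
        A[col] = [x / pv for x in A[col]]
        for r in range(m):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    h = {others[r]: A[r][m] for r in range(m)}
    # One step out of i, then the expected time to come back.
    return 1 + sum(P[i][j] * h[j] for j in others)


# Two-state chain: leave state 0 with probability a, leave state 1 with probability b.
a, b = Fraction(1, 4), Fraction(1, 2)
P = [[1 - a, a], [b, 1 - b]]
m00 = mean_recurrence_time(P, 0)
```

For this two-state chain the result equals 1 + a/b, which varies continuously with a and b as long as b stays positive. Roughly speaking, the abstract's assumption asks for this kind of continuous dependence of mean recurrence times on the chosen stationary strategy, in a far more general setting.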

Keywords

Dynamic programming; optimal stopping and control; countable state space; stationary policy; undiscounted total reward criterion

Type: Research Papers
Information: Journal of Applied Probability, Volume 27, Issue 1, March 1990, pp. 134–145

DOI: https://doi.org/10.2307/3214601
Copyright: Copyright © Applied Probability Trust 1990

References

[1] Chung, K. (1967) Markov Chains with Stationary Transition Probabilities. Springer-Verlag, Berlin.

[2] Dekker, R. and Hordijk, A. (1989) Average, sensitive and Blackwell optimal policies in denumerable Markov decision chains with unbounded rewards. Math. Operat. Res.

[3] Deppe, H. (1985) Continuity of mean recurrence times in denumerable semi-Markov processes. Z. Wahrscheinlichkeitsth. 69, 581–592.

[4] Dubins, L. E. and Savage, L. J. (1965) How to Gamble if You Must. McGraw-Hill, New York.

[5] Federgruen, A., Hordijk, A. and Tijms, H. C. (1979) Denumerable state semi-Markov processes with unbounded cost, average cost criterion. Stoch. Proc. Appl. 9, 223–235.

[6] Hordijk, A. (1974) Dynamic Programming and Markov Potential Theory. Mathematical Centre Tracts 51, Amsterdam.

[7] Mann, E. (1985) Optimality equations and sensitive optimality in bounded Markov decision processes. Optimization 16, 767–781.

[8] Schäl, M. (1989) On stochastic dynamic programming: A bridge between Markov decision processes and gambling.

[9] Schweitzer, P. J. (1968) Perturbation theory and finite Markov chains. J. Appl. Prob. 5, 401–413.

[10] Sudderth, W. D. (1969) On the existence of good stationary strategies. Trans. Amer. Math. Soc. 135, 399–414.

[11] Wijngaard, J. (1976) Sensitive optimality in stationary Markovian decision problems on a general state space. Proc. Advanced Seminar on Markov Decision Theory, Amsterdam 1976, 58–93.
