
Optimal decision procedures for finite Markov chains. Part I: Examples

Published online by Cambridge University Press:  01 July 2016

John Bather
University of Sussex

Abstract

A Markov process in discrete time with a finite state space is controlled by choosing the transition probabilities from a prescribed set depending on the state occupied at any time. Given the immediate cost for each choice, it is required to minimise the expected cost over an infinite future, without discounting. Various techniques are reviewed for the case when there is a finite set of possible transition matrices and an example is given to illustrate the unpredictable behaviour of policy sequences derived by backward induction. Further examples show that the existing methods may break down when there is an infinite family of transition matrices. A new approach is suggested, based on the idea of classifying the states according to their accessibility from one another.
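The backward-induction procedure mentioned above can be illustrated with a minimal sketch. This is not the paper's example: the two-state process, its transition rows, and its costs below are invented for illustration. Each state offers a finite set of actions, each action carries an immediate cost and a transition-probability row, and the finite-horizon optimal policy at each stage is found by stepping backwards from terminal values of zero.

```python
import numpy as np

# Hypothetical 2-state controlled Markov chain (not from the paper):
# P[s][a] = transition probabilities out of state s under action a
# c[s][a] = immediate cost of choosing action a in state s
P = {
    0: [np.array([0.9, 0.1]), np.array([0.2, 0.8])],
    1: [np.array([0.5, 0.5]), np.array([1.0, 0.0])],
}
c = {
    0: [2.0, 0.5],
    1: [1.0, 3.0],
}

def backward_induction(horizon):
    """Finite-horizon backward induction with zero terminal cost.

    Returns the policy chosen at each stage (last stage first) and the
    optimal expected total cost over the horizon from each state.
    """
    n_states = len(P)
    v = np.zeros(n_states)      # terminal values
    policies = []               # one policy (action per state) per stage
    for _ in range(horizon):
        new_v = np.empty(n_states)
        policy = []
        for s in range(n_states):
            # expected total cost of each action: immediate + expected future
            q = [c[s][a] + P[s][a] @ v for a in range(len(P[s]))]
            a_star = int(np.argmin(q))
            policy.append(a_star)
            new_v[s] = q[a_star]
        v = new_v
        policies.append(policy)
    return policies, v

policies, values = backward_induction(horizon=20)
```

The paper's point is precisely that the sequence of policies produced this way need not settle down as the horizon grows, so the sketch should be read as the procedure under study, not as a guaranteed route to an average-cost-optimal policy.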

Type: Research Article
Copyright © Applied Probability Trust 1973

