Hostname: page-component-586b7cd67f-t7czq Total loading time: 0 Render date: 2024-11-30T17:08:56.777Z Has data issue: false hasContentIssue false

The Average Cost Optimality Equation and Critical Number Policies

Published online by Cambridge University Press:  27 July 2009

Linn I. Sennott
Affiliation:
Department of Mathematics, Illinois State University, Normal, Illinois 61761

Abstract

We consider a Markov decision chain with countable state space, finite action sets, and nonnegative costs. Conditions for the average cost optimality inequality to be an equality are derived. This extends work of Cavazos-Cadena [8]. It is shown that an optimal stationary policy must satisfy the optimality equation at all positive recurrent states. Structural results on the chain induced by an optimal stationary policy are derived. The results are employed in two examples to prove that any optimal stationary policy must be of critical number form.

Type
Research Article
Copyright
Copyright © Cambridge University Press 1993

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

References

1.Apostol, T. (1974). Mathematical analysis, 2nd ed.Reading, MA: Addison-Wesley.Google Scholar
2.Bertsekas, D. (1987). Dynamic programming: Deterministic and stochastic models. Englewood Cliffs, NJ: Prentice-Hall.Google Scholar
3.Borkar, V. (1984). On minimum cost per unit time control of Markov chains. SIAM Journal on Control and Optimization 22: 965978.CrossRefGoogle Scholar
4.Borkar, V. (1989). Control of Markov chains with long-run average cost criterion: The dynamic programming equations. SIAM Journal on Control and Optimization 27: 642657.CrossRefGoogle Scholar
5.Cavazos-Cadena, R. (1989). Weak conditions for the existence of optimal stationary policies in average Markov decision chains with unbounded cost. Kybernetika 25: 145156.Google Scholar
6.Cavazos-Cadena, R. (1991). A counterexample on the optimality equation in Markov decision chains with the average cost criterion. Systems and Control Letters 16: 387392.CrossRefGoogle Scholar
7.Cavazos-Cadena, R. (1991). Recent results on conditions for the existence of average optimal stationary policies. Annals of Operations Research 28: 328.CrossRefGoogle Scholar
8.Cavazos-Cadena, R. (1991). Solution to the optimality equation in a class of Markov decision chains with the average cost criterion. Kybernetika 27: 2337.Google Scholar
9.Cavazos-Cadena, R. & Sennott, L. (1992). Comparing recent assumptions for the existence of average optimal stationary policies. Operations Research Letters 11: 3337.CrossRefGoogle Scholar
10.Chung, K.L. (1967). Markov chains with stationary transition probabilities, 2nd ed.New York: Springer-Verlag.Google Scholar
11.Derman, C. & Veinott, A. Jr. (1967). A solution to a countable system of equations arising in Markovian decision processes. Annals of Mathematical Statistics 38: 582584.CrossRefGoogle Scholar
12.Pakes, A. (1969). Some conditions for ergodicity and recurrence of Markov chains. Operations Research 17: 10581061.CrossRefGoogle Scholar
13.Ritt, R. & Sennott, L. (to appear). Optimal stationary policies in general state space Markov decision chains with finite action sets. Mathematics of Operations Research.Google Scholar
14.Ross, S. (1983). Introduction to stochastic dynamic programming. New York: Academic Press.Google Scholar
15.Schal, M. (to appear). Average optimality in dynamic programming with general state space. Mathematics of Operations Research.Google Scholar
16.Sennott, L. (1989). Average cost optimal stationary policies in infinite state Markov decision processes with unbounded costs. Operations Research 37: 626633.CrossRefGoogle Scholar
17.Sennott, L. (1989). Average cost semi-Markov decision processes and the control of queueing systems. Probability in the Engineering and Informational Sciences 3: 247272.CrossRefGoogle Scholar
18.Sennott, L., Humblet, P., & Tweedie, R. (1983). Mean drifts and the non-ergodicity of Markov chains. Operations Research 31: 783789.CrossRefGoogle Scholar
19.Shwartz, A. & Makowski, A. (submitted). On the Poisson equation for Markov chains: Existence of solutions and parameter dependence by probabilistic methods.Google Scholar
20.Stidham, S. Jr. & Weber, R. (1989). Monotonic and insensitive optimal policies for control of queues with undiscounted costs. Operations Research 37: 611625.CrossRefGoogle Scholar