
On N-person stochastic games with denumerable state space

Published online by Cambridge University Press: 01 July 2016

A. Federgruen
Affiliation: Mathematisch Centrum, Amsterdam

Abstract

This paper considers non-cooperative N-person stochastic games with a countable state space and compact metric action spaces. We concentrate on the average return per unit time criterion, for which the existence of an equilibrium policy is established under a number of recurrency conditions on the transition probability matrices associated with the stationary policies. These results are obtained by establishing the existence of total discounted return equilibrium policies for each discount factor α ∈ [0, 1), and by showing that, under each of the aforementioned recurrency conditions, average return equilibrium policies arise as limits of sequences of discounted return equilibrium policies as the discount factor tends to one.

Finally, we review and extend the results that are known for the case where both the state space and the action spaces are finite.
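
The vanishing-discount argument sketched in the abstract can be illustrated in the simplest setting, the single-player (N = 1) finite case, where a stochastic game reduces to a Markov decision process: compute the α-discounted optimal stationary policy for a sequence of discount factors tending to one and observe that it stabilizes. The sketch below is a minimal illustration under that assumption only; the MDP data, the function name, and the iteration counts are hypothetical and do not come from the paper.

```python
import numpy as np

# Hypothetical 2-state, 2-action single-player example (an MDP, i.e. the
# N = 1 special case).  P[a] is the transition matrix under action a,
# r[a] the one-step returns; none of these numbers are from the paper.
P = {0: np.array([[0.9, 0.1],
                  [0.2, 0.8]]),
     1: np.array([[0.5, 0.5],
                  [0.7, 0.3]])}
r = {0: np.array([1.0, 0.0]),
     1: np.array([0.8, 0.5])}

def discounted_optimal_policy(alpha, iters=5000):
    """Value iteration for the total alpha-discounted return criterion."""
    v = np.zeros(2)
    for _ in range(iters):
        q = np.array([r[a] + alpha * P[a] @ v for a in (0, 1)])
        v = q.max(axis=0)           # optimal discounted value per state
    return tuple(q.argmax(axis=0))  # stationary policy: best action per state

# Discounted-return optimal policies along a sequence of discount factors
# tending to one; under suitable recurrency conditions the limit policy is
# optimal for the average return per unit time criterion.
for alpha in (0.5, 0.9, 0.99, 0.999):
    print(alpha, discounted_optimal_policy(alpha))
```

For genuine games (N ≥ 2) the discounted equilibrium policies are obtained by fixed-point arguments rather than by value iteration, but the limiting step as the discount factor tends to one is the same in spirit.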

Type: Research Article
Copyright: © Applied Probability Trust 1978
