On zero-sum two-person undiscounted semi-Markov games with a multichain structure

Prasenjit Mondal

doi:10.1017/apr.2017.23

On zero-sum two-person undiscounted semi-Markov games with a multichain structure

Part of: Stochastic processes Game theory

Published online by Cambridge University Press: 08 September 2017

Prasenjit Mondal

Show author details

Prasenjit Mondal*: Affiliation:
Government General Degree College, Ranibandh
*: * Postal address: Mathematics Department, Government General Degree College, Ranibandh, Bankura, 722135, India. Email address: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

Zero-sum two-person finite undiscounted (limiting ratio average) semi-Markov games (SMGs) are considered with a general multichain structure. We derive the strategy evaluation equations for stationary strategies of the players. A relation between the payoff in the multichain SMG and that in the associated stochastic game (SG) obtained by a data-transformation is established. We prove that the multichain optimality equations (OEs) for an SMG have a solution if and only if the associated SG has optimal stationary strategies. Though the solution of the OEs may not be optimal for an SMG, we establish the significance of studying the OEs for a multichain SMG. We provide a nice example of SMGs in which one player has no optimal strategy in the stationary class but has an optimal semistationary strategy (that depends only on the initial and current state of the game). For an SMG with absorbing states, we prove that solutions in the game where all players are restricted to semistationary strategies are solutions for the unrestricted game. Finally, we prove the existence of stationary optimal strategies for unichain SMGs and conclude that the unichain condition is equivalent to require that the game satisfies some recurrence/ergodicity/weakly communicating conditions.

Keywords

Semi-Markov game undiscounted payoff multichain structure optimality equation data transformation stationary and semistationary strategies

MSC classification

Primary: 91A15: Stochastic games

Secondary: 60G99: None of the above, but in this section

Type: Research Article
Information: Advances in Applied Probability , Volume 49 , Issue 3 , September 2017 , pp. 826 - 849

DOI: https://doi.org/10.1017/apr.2017.23 [Opens in a new window]
Copyright: Copyright © Applied Probability Trust 2017

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

[1] Bertsekas, D. P. and Shreve, S. E. (1978). Stochastic Optimal Control: The Discrete Time Case. Academic Press, New York. Google Scholar

[2] Beutler, F. J. and Ross, K. W. (1987). Uniformization for semi-Markov decision processes under stationary policies. J. Appl. Prob. 24, 644–656. Google Scholar

[3] Bewley, T. and Kohlberg, E. (1976). The asymptotic theory of stochastic games. Math. Operat. Res. 1, 197–208. CrossRef Google Scholar

[4] Blackwell, D. (1962). Discrete dynamic programming. Ann. Math. Statist. 33, 719–726. Google Scholar

[5] Blackwell, D. and Ferguson, T. S. (1968). The big match. Ann. Math. Statist. 39, 159–163. CrossRef Google Scholar

[6] Denardo, E. V. and Fox, B. L. (1968). Multichain Markov renewal programs. SIAM J. Appl. Math. 16, 468–487. Google Scholar

[7] Federgruen, A. and Tijms, H. C. (1978). The optimality equation in average cost denumerable state semi-Markov decision problems: recurrency conditions and algorithms. J. Appl. Prob. 15, 356–373. Google Scholar

[8] Federgruen, A., Hordijk, A. and Tijms, H. C. (1978). A note on simultaneous recurrence conditions on a set of denumerable stochastic matrices. J. Appl. Prob. 15, 842–847. Google Scholar

[9] Filar, J. and Vrieze, K. (1997). Competitive Markov Decision Processes. Springer, New York. Google Scholar

[10] Fox, B. (1966). Markov renewal programming by linear fractional programming. SIAM J. Appl. Math. 14, 1418–1432. Google Scholar

[11] Jaśkiewicz, A. (2002). Zero-sum semi-Markov games. SIAM J. Control Optimization 41, 723–739. Google Scholar

[12] Jaśkiewicz, A. (2009). Zero-sum ergodic semi-Markov games with weakly continuous transition probabilities. J. Optimization Theory Appl. 141, 321–347. Google Scholar

[13] Jianyong, L. and Xiaobo, Z. (2004). On average reward semi-Markov decision processes with a general multichain structure. Math. Operat. Res. 29, 339–352. Google Scholar

[14] Kemeny, J. G. and Snell, J. L. (1976). Finite Markov Chains. Springer, New York. Google Scholar

[15] Kirman, A. P. and Sobel, M. J. (1974). Dynamic oligopoly with inventories. Econometrica 42, 279–287. CrossRef Google Scholar

[16] Kuhn, H. W. (1953). Extensive games and the problem of information. In Contributions to the Theory of Games, Vol. II. Princeton University Press, pp. 193–216. Google Scholar

[17] Lal, A. K. and Sinha, S. (1992). Zero-sum two-person semi-Markov games. J. Appl. Prob. 29, 56–72. Google Scholar

[18] Loomis, L. H. (1946). On a theorem of von Neumann. Proc. Nat. Acad. Sci. USA 32, 213–215. CrossRef Google Scholar PubMed

[19] Luque-Vásquez, F. (2002). Zero-sum semi-Markov games in Borel spaces: discounted and average payoff. Bol. Soc. Mat. Mexicana (3) 8, 227–241. Google Scholar

[20] Mertens, J.-F. and Neyman, A. (1981). Stochastic games. Internat. J. Game Theory 10, 53–66. Google Scholar

[21] Mondal, P. (2015). Linear programming and zero-sum two-person undiscounted semi-Markov games. Asia-Pacific J. Operat. Res. 32, 1550043. Google Scholar

[22] Mondal, P. (2016). On undiscounted semi-Markov decision processes with absorbing states. Math. Meth. Operat. Res. 83, 161–177. Google Scholar

[23] Mondal, P. and Sinha, S. (2013). Ordered field property in a subclass of finite SeR-SIT semi-Markov games. Internat. Game Theory Rev. 15, 1340026. Google Scholar

[24] Mondal, P. and Sinha, S. (2015). Ordered field property for semi-Markov games when one player controls transition probabilities and transition times. Internat. Game Theory Rev. 17, 1540022. Google Scholar

[25] Parthasarathy, T. and Raghavan, T. E. S. (1981). An orderfield property for stochastic games when one player controls transition probabilities. J. Optimization Theory Appl. 33, 375–392. Google Scholar

[26] Puterman, M. L. (1994). Markov Decision Processes: Discrete Stochastic Dynamic Programming. John Wiley, New York. Google Scholar

[27] Raut, L. K. (2006). Two-sided altruism, Lindahl equilibrium, and Pareto optimality in overlapping generations models. Econom. Theory 27, 729–736. CrossRef Google Scholar

[28] Ross, S. M. (1970). Applied Probability Models with Optimization Applications. Holden-Day, San Francisco, CA. Google Scholar

[29] Schäl, M. (1992). On the second optimality equation for semi-Markov decision models. Math. Operat. Res. 17, 470–486. CrossRef Google Scholar

[30] Schweitzer, P. J. (1971). Iterative solution of the functional equations of undiscounted Markov renewal programming. J. Math. Anal. Appl. 34, 495–501. Google Scholar

[31] Shapley, L. S. (1953). Stochastic games. Proc. Nat. Acad. Sci. USA 39, 1095–1100. CrossRef Google Scholar PubMed

[32] Vega-Amaya, O. (2003). Zero-sum average semi-Markov games: fixed-point solutions of the Shapley equation. SIAM J. Control Optimization 42, 1876–1894. Google Scholar

[33] Vrieze, O. J. (2003). Stochastic games and stationary strategies. In Stochastic Games and Applications (NATO Sci. Ser. C Math. Phys. Sci. 570), Kluwer, Dordrecht, pp. 37–50. CrossRef Google Scholar

Article contents

On zero-sum two-person undiscounted semi-Markov games with a multichain structure

Abstract

Keywords

MSC classification

Access options

Article purchase

Temporarily unavailable

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests