A utility-based analysis of equilibria in multi-objective normal-form games

Roxana Rădulescu; Patrick Mannion; Yijie Zhang; Diederik M. Roijers; Ann Nowé

doi:10.1017/S0269888920000351

Part of: Adaptive Learning Agents 2019

Published online by Cambridge University Press: 30 June 2020

Roxana Rădulescu: Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, Brussels 1050, Belgium
Patrick Mannion: School of Computer Science, National University of Ireland Galway, Galway H91 TK33, Ireland
Yijie Zhang: Universiteit van Amsterdam, Amsterdam, The Netherlands
Diederik M. Roijers: Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, Brussels 1050, Belgium; Microsystems Technology, HU University of Applied Sciences Utrecht, Heidelberglaan 15, 3584 CS Utrecht, The Netherlands
Ann Nowé: Artificial Intelligence Lab, Vrije Universiteit Brussel, Pleinlaan 2, Brussels 1050, Belgium

Abstract

In multi-objective multi-agent systems (MOMASs), agents explicitly consider the possible trade-offs between conflicting objective functions. We argue that compromises between competing objectives in MOMAS should be analyzed on the basis of the utility that these compromises have for the users of a system, where an agent’s utility function maps their payoff vectors to scalar utility values. This utility-based approach naturally leads to two different optimization criteria for agents in a MOMAS: expected scalarized returns (ESRs) and scalarized expected returns (SERs). In this article, we explore the differences between these two criteria using the framework of multi-objective normal-form games (MONFGs). We demonstrate that the choice of optimization criterion (ESR or SER) can radically alter the set of equilibria in a MONFG when nonlinear utility functions are used.
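As a rough illustration of why the two criteria can diverge, the sketch below evaluates one mixed outcome distribution under both. The payoff vectors, the probabilities, and the product utility function are all hypothetical values chosen for this example, not taken from the article. SER applies the utility function to the expected payoff vector, while ESR takes the expectation of the utility of each payoff vector.

```python
# Hypothetical two-objective payoff vectors for two equally likely outcomes.
# These numbers are assumptions for illustration, not from the article.
outcomes = [[4.0, 0.0], [0.0, 4.0]]
probs = [0.5, 0.5]


def product_utility(payoff):
    """Hypothetical nonlinear utility: the product of both objectives."""
    return payoff[0] * payoff[1]


def linear_utility(payoff):
    """Hypothetical linear utility: an equally weighted sum."""
    return 0.5 * payoff[0] + 0.5 * payoff[1]


def expected_vector(probs, outcomes):
    """Expected payoff vector, computed per objective."""
    n_objectives = len(outcomes[0])
    return [
        sum(p * o[i] for p, o in zip(probs, outcomes))
        for i in range(n_objectives)
    ]


def ser(probs, outcomes, utility):
    """Scalarized expected returns: utility of the expected payoff vector."""
    return utility(expected_vector(probs, outcomes))


def esr(probs, outcomes, utility):
    """Expected scalarized returns: expectation of per-outcome utility."""
    return sum(p * utility(o) for p, o in zip(probs, outcomes))


ser_nonlinear = ser(probs, outcomes, product_utility)  # utility of [2.0, 2.0]
esr_nonlinear = esr(probs, outcomes, product_utility)  # each outcome has utility 0
ser_linear = ser(probs, outcomes, linear_utility)
esr_linear = esr(probs, outcomes, linear_utility)

print("nonlinear:", ser_nonlinear, esr_nonlinear)
print("linear:   ", ser_linear, esr_linear)
```

Under the product utility the two criteria disagree on the same distribution, whereas under the linear utility they coincide, which is consistent with the abstract's restriction to nonlinear utility functions.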

Type: Adaptive and Learning Agents
Information: The Knowledge Engineering Review, Volume 35, 2020, e32

DOI: https://doi.org/10.1017/S0269888920000351
Copyright: © The Author(s), 2020. Published by Cambridge University Press

Footnotes

This article extends an earlier unpublished paper (Rădulescu et al., 2019) that was originally presented at the Adaptive and Learning Agents Workshop 2019.

References

Arifovic, J., Boitnott, J. F. & Duffy, J. 2016. Learning correlated equilibria: an evolutionary approach. Journal of Economic Behavior & Organization 157, 171–190.

Aumann, R. J. 1974. Subjectivity and correlation in randomized strategies. Journal of Mathematical Economics 1(1), 67–96.

Aumann, R. J. 1987. Correlated equilibrium as an expression of bayesian rationality. Econometrica: Journal of the Econometric Society 1, 1–18.

Bergstresser, K. & Yu, P. 1977. Domination structures and multicriteria problems in n-person games. Theory and Decision 8(1), 5–48.

Blackwell, D. 1956. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics 6(1), 1–8.

Borm, P., Tijs, S. & van den Aarssen, J. 1990. Pareto equilibria in multi-objective games. Methods of Operations Research 60, 303–312.

Colby, M. & Tumer, K. 2015. An evolutionary game theoretic analysis of difference evaluation functions. In Proceedings of the 2015 Annual Conference on Genetic and Evolutionary Computation, 1391–1398. ACM.

Devlin, S. & Kudenko, D. 2011. Theoretical considerations of potential-based reward shaping for multi-agent systems. In Proceedings of the 10th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 225–232.

Foster, D. P. & Vohra, R. 1999. Regret in the on-line decision problem. Games and Economic Behavior 29(1–2), 7–35.

Fudenberg, D. & Kreps, D. M. 1993. Learning mixed equilibria. Games and Economic Behavior 5(3), 320–367. ISSN 0899-8256.

Hart, S. & Schmeidler, D. 1989. Existence of correlated equilibria. Mathematics of Operations Research 14(1), 18–25.

Igarashi, A. & Roijers, D. M. 2017. Multi-criteria coalition formation games. In International Conference on Algorithmic Decision Theory, 197–213. Springer.

Jensen, J. L. W. V. 1906. Sur les fonctions convexes et les inégalités entre les valeurs moyennes. Acta Mathematica 30, 175–193.

Lozan, V. & Ungureanu, V. 2013. Computing the pareto-nash equilibrium set in finite multi-objective mixed-strategy games. Computer Science Journal of Moldova 21(2).

Lozovanu, D., Solomon, D. & Zelikovsky, A. 2005. Multiobjective games and determining pareto-nash equilibria. Buletinul Academiei de Ştiinţe a Republicii Moldova. Matematica (3), 115–122.

Mannion, P., Devlin, S., Duggan, J. & Howley, E. 2018. Reward shaping for knowledge-based multi-objective multi-agent reinforcement learning. The Knowledge Engineering Review 33, e23.

Mannion, P., Devlin, S., Mason, K., Duggan, J. & Howley, E. 2017a. Policy invariance under reward transformations for multi-objective reinforcement learning. Neurocomputing 263, 60–73.

Mannion, P., Duggan, J. & Howley, E. 2016a. An experimental review of reinforcement learning algorithms for adaptive traffic signal control. In Autonomic Road Transport Support Systems, McCluskey, L. T., Kotsialos, A., Müller, P. J., Klügl, F., Rana, O. & Schumann, R. (eds), 47–66. Springer International Publishing.

Mannion, P., Duggan, J. & Howley, E. 2017b. A theoretical and empirical analysis of reward transformations in multi-objective stochastic games. In Proceedings of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2017.

Mannion, P., Mason, K., Devlin, S., Duggan, J. & Howley, E. 2016b. Multi-objective dynamic dispatch optimisation using multi-agent reinforcement learning. In Proceedings of the 15th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), May 2016.

Mossalam, H., Assael, Y. M., Roijers, D. M. & Whiteson, S. 2016. Multi-objective deep reinforcement learning. In NIPS Workshop on Deep Reinforcement Learning.

Nash, J. 1950. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences 36(1), 48–49. ISSN 0027-8424.

Nash, J. 1951. Non-cooperative games. Annals of Mathematics 54(2), 286–295.

Papadimitriou, C. H. & Roughgarden, T. 2008. Computing correlated equilibria in multi-player games. Journal of the ACM (JACM) 55(3), 14.

Rădulescu, R., Legrand, M., Efthymiadis, K., Roijers, D. M. & Nowé, A. 2018. Deep multi-agent reinforcement learning in a homogeneous open population. In Proceedings of the 30th Benelux Conference on Artificial Intelligence (BNAIC 2018), 177–191.

Rădulescu, R., Mannion, P., Roijers, D. & Nowé, A. 2019. Equilibria in multi-objective games: a utility-based perspective. In Adaptive and Learning Agents Workshop (at AAMAS 2019), May 2019.

Rădulescu, R., Mannion, P., Roijers, D. M. & Nowé, A. 2020. Multi-objective multi-agent decision making: a utility-based analysis and survey. Autonomous Agents and Multi-Agent Systems 34(10).

Reymond, M., Patyn, C., Rădulescu, R., Deconinck, G. & Nowé, A. 2018. Reinforcement learning for demand response of domestic household appliances. In Proceedings of the Adaptive and Learning Agents Workshop at FAIM 2018.

Roijers, D. M. 2016. Multi-Objective Decision-Theoretic Planning. PhD thesis, University of Amsterdam.

Roijers, D. M., Steckelmacher, D. & Nowé, A. 2018. Multi-objective reinforcement learning for the expected utility of the return. In Proceedings of the Adaptive and Learning Agents Workshop at FAIM 2018.

Roijers, D. M., Vamplew, P., Whiteson, S. & Dazeley, R. 2013. A survey of multi-objective sequential decision-making. Journal of Artificial Intelligence Research 48, 67–113.

Roijers, D. M. & Whiteson, S. 2017. Multi-objective decision making. Synthesis Lectures on Artificial Intelligence and Machine Learning 11(1), 1–129.

Shapley, L. S. & Rigby, F. D. 1959. Equilibrium points in games with vector payoffs. Naval Research Logistics Quarterly 6(1), 57–61.

Talpaert, V., Sobh, I., Kiran, B. R., Mannion, P., Yogamani, S., El-Sallab, A. & Perez, P. 2019. Exploring applications of deep reinforcement learning for real-world autonomous driving systems. In International Conference on Computer Vision Theory and Applications (VISAPP), February 2019.

Vamplew, P., Dazeley, R., Berry, A., Issabekov, R. & Dekker, E. 2011. Empirical evaluation methods for multiobjective reinforcement learning algorithms. Machine Learning 84(1–2), 51–80.

Van Moffaert, K. & Nowé, A. 2014. Multi-objective reinforcement learning using sets of pareto dominating policies. The Journal of Machine Learning Research 15(1), 3483–3512.

Virtanen, P., Gommers, R., Oliphant, T. E., Haberland, M., Reddy, T., Cournapeau, D., Burovski, E., Peterson, P., Weckesser, W., Bright, J., van der Walt, S. J., Brett, M., Wilson, J., Jarrod Millman, K., Mayorov, N., Nelson, A. R. J., Jones, E., Kern, R., Larson, E., Carey, C., Polat, İ., Feng, Y., Moore, E. W., VanderPlas, J., Laxalde, D., Perktold, J., Cimrman, R., Henriksen, I., Quintero, E. A., Harris, C. R., Archibald, A. M., Ribeiro, A. H., Pedregosa, F., van Mulbregt, P. & Contributors, S. 2019. SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python. arXiv e-prints, art. arXiv:1907.10121, July 2019.

Voorneveld, M., Vermeulen, D. & Borm, P. 1999. Axiomatizations of pareto equilibria in multicriteria games. Games and Economic Behavior 28(1), 146–154.

Walraven, E. & Spaan, M. T. J. 2016. Planning under uncertainty for aggregated electric vehicle charging with renewable energy supply. In Proceedings of the European Conference on Artificial Intelligence, 904–912.

Watkins, C. J. C. H. 1989. Learning from Delayed Rewards. PhD thesis, King’s College, Cambridge, UK.

Wierzbicki, A. P. 1995. Multiple criteria games – theory and applications. Journal of Systems Engineering and Electronics 6(2), 65–81.

Wiggers, A. J., Oliehoek, F. A. & Roijers, D. M. 2016. Structure in the value function of two-player zero-sum games of incomplete information. In Proceedings of the Twenty-second European Conference on Artificial Intelligence, 1628–1629. IOS Press.

Yliniemi, L., Agogino, A. K. & Tumer, K. 2015. Simulation of the introduction of new technologies in air traffic management. Connection Science 27(3), 269–287.

Yliniemi, L. & Tumer, K. 2016. Multi-objective multiagent credit assignment in reinforcement learning and nsga-ii. Soft Computing 20(10), 3869–3887.

Zhang, Y., Rădulescu, R., Mannion, P., Roijers, D. M. & Nowé, A. 2020. Opponent modelling for reinforcement learning in multi-objective normal form games. In Proceedings of the 19th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2020), May 2020.

Zinkevich, M., Greenwald, A. & Littman, M. L. 2006. Cyclic equilibria in markov games. In Advances in Neural Information Processing Systems, 1641–1648.

Zintgraf, L. M., Kanters, T. V., Roijers, D. M., Oliehoek, F. A. & Beau, P. 2015. Quality assessment of MORL algorithms: a utility-based approach. In Benelearn 2015: Proceedings of the Twenty-Fourth Belgian-Dutch Conference on Machine Learning.

Zintgraf, L. M., Roijers, D. M., Linders, S., Jonker, C. M. & Nowé, A. 2018. Ordered preference elicitation strategies for supporting multi-objective decision making. In Proceedings of the 17th International Conference on Autonomous Agents and Multi-Agent Systems, 1477–1485. International Foundation for Autonomous Agents and Multiagent Systems.
