What is the simplest model that can account for high-fidelity imitation?

Joel Z. Leibo; Raphael Köster; Alexander Sasha Vezhnevets; Edgar A. Duéñez-Guzmán; John P. Agapiou; Peter Sunehag

doi:10.1017/S0140525X22001364

Published online by Cambridge University Press: 10 November 2022

Affiliation (all authors): DeepMind, London EC4A 3TW, UK
Website: www.jzleibo.com

Abstract

What inductive biases must be incorporated into multi-agent artificial intelligence models to get them to capture high-fidelity imitation? We think very little is needed. In the right environments, both instrumental- and ritual-stance imitation can emerge from generic learning mechanisms operating on non-deliberative decision architectures. In this view, imitation emerges from trial-and-error learning and does not require explicit deliberation.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 45, 2022, e261

DOI: https://doi.org/10.1017/S0140525X22001364
Copyright: © The Author(s), 2022. Published by Cambridge University Press
