Meta-learned models of cognition

Marcel Binz; Ishita Dasgupta; Akshay K. Jagadish; Matthew Botvinick; Jane X. Wang; Eric Schulz

doi:10.1017/S0140525X23003266

Meta-learned models of cognition

Published online by Cambridge University Press: 23 November 2023

Jane X. Wang and

Marcel Binz*: Affiliation:
Max Planck Institute for Biological Cybernetics, Tübingen, Germany [email protected] [email protected] Helmholtz Institute for Human-Centered AI, Munich, Germany [email protected]
Ishita Dasgupta: Affiliation:
Google DeepMind, London, UK [email protected] [email protected] [email protected]
Akshay K. Jagadish: Affiliation:
Max Planck Institute for Biological Cybernetics, Tübingen, Germany [email protected] [email protected] Helmholtz Institute for Human-Centered AI, Munich, Germany [email protected]
Matthew Botvinick: Affiliation:
Google DeepMind, London, UK [email protected] [email protected] [email protected]
Jane X. Wang: Affiliation:
Google DeepMind, London, UK [email protected] [email protected] [email protected]
Eric Schulz: Affiliation:
Max Planck Institute for Biological Cybernetics, Tübingen, Germany [email protected] [email protected] Helmholtz Institute for Human-Centered AI, Munich, Germany [email protected]
*: Corresponding author: Marcel Binz; Email: [email protected]

Article contents

Abstract
References

Get access

Rights & Permissions

Abstract

Psychologists and neuroscientists extensively rely on computational models for studying and analyzing the human mind. Traditionally, such computational models have been hand-designed by expert researchers. Two prominent examples are cognitive architectures and Bayesian models of cognition. Although the former requires the specification of a fixed set of computational structures and a definition of how these structures interact with each other, the latter necessitates the commitment to a particular prior and a likelihood function that – in combination with Bayes' rule – determine the model's behavior. In recent years, a new framework has established itself as a promising tool for building models of human cognition: the framework of meta-learning. In contrast to the previously mentioned model classes, meta-learned models acquire their inductive biases from experience, that is, by repeatedly interacting with an environment. However, a coherent research program around meta-learned models of cognition is still missing to date. The purpose of this article is to synthesize previous work in this field and establish such a research program. We accomplish this by pointing out that meta-learning can be used to construct Bayes-optimal learning algorithms, allowing us to draw strong connections to the rational analysis of cognition. We then discuss several advantages of the meta-learning framework over traditional methods and reexamine prior work in the context of these new insights.

Keywords

Bayesian inference cognitive modeling meta-learning neural networks rational analysis

Type: Target Article
Information: Behavioral and Brain Sciences , Volume 47 , 2024 , e147

DOI: https://doi.org/10.1017/S0140525X23003266 [Opens in a new window]
Copyright: Copyright © The Author(s), 2023. Published by Cambridge University Press

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Article purchase

Temporarily unavailable

References

Aitchison, J. (1975). Goodness of prediction fit. Biometrika, 62(3), 547–554.Google Scholar

Anderson, J. R. (2013a). The adaptive character of thought. Psychology Press.Google Scholar

Anderson, J. R. (2013b). The architecture of cognition. Psychology Press.Google Scholar

Ashby, F. G., & Maddox, W. T. (2005). Human category learning. Annual Review of Psychology, 56(1), 149–178.Google Scholar

Bates, C. J., & Jacobs, R. A. (2020). Efficient data compression in perception and perceptual memory. Psychological Review, 127(5), 891.Google Scholar

Baxter, J. (1998). Theoretical models of learning to learn. In Thrun, S. & Pratt, L. (Eds.), Learning to learn (pp. 71–94). Springer.Google Scholar

Bellec, G., Salaj, D., Subramoney, A., Legenstein, R., & Maass, W. (2018). Long short-term memory and learning-to-learn in networks of spiking neurons. Advances in Neural Information Processing Systems, 31, 795–805.Google Scholar

Bengio, Y., Bengio, S., & Cloutier, J. (1991). Learning a synaptic learning rule. In IJCNN-91-Seattle International Joint Conference on Neural Networks, Seattle, WA, USA (Vol. 2, p. 969).Google Scholar

Benjamin, D. J. (2019). Errors in probabilistic reasoning and judgment biases. In Bernheim, B., DellaVigna, S., & Laibson, D. (Eds.), Handbook of behavioral economics: Applications and foundations (Vol. 2, pp. 69–186). North-Holland.Google Scholar

Binmore, K. (2007). Rational decisions in large worlds. Annales d'Economie et de Statistique, No. 86, 25–41.Google Scholar

Binz, M., Gershman, S. J., Schulz, E., & Endres, D. (2022). Heuristics from bounded meta-learned inference. Psychological Review, 129(5), 1042–1077.Google Scholar

Binz, M., & Schulz, E. (2022a). Modeling human exploration through resource-rational reinforcement learning. In Oh, A. H., Agarwal, A., Belgrave, D., & Cho, K. (Eds.), Advances in neural information processing systems (pp. 31755–31768). Curran Associates, Inc. https://openreview.net/forum?id=W1MUJv5zaXP.Google Scholar

Binz, M., & Schulz, E. (2022b). Reconstructing the Einstellung effect. Computational Brain & Behavior, 6, 1–17.Google Scholar

Binz, M., & Schulz, E. (2023). Using cognitive psychology to understand GPT-3. Proceedings of the National Academy of Sciences of the United States of America, 120(6), e2218523120.Google Scholar

Bishop, C. M. (2006). Pattern recognition and machine learning. Springer.Google Scholar

Blundell, C., Uria, B., Pritzel, A., Li, Y., Ruderman, A., Leibo, J. Z., … Hassabis, D. (2016). Model-free episodic control. arXiv preprint arXiv:1606.04460.Google Scholar

Botvinick, M. M., Braver, T. S., Barch, D. M., Carter, C. S., & Cohen, J. D. (2001). Conflict monitoring and cognitive control. Psychological Review, 108(3), 624.Google Scholar

Botvinick, M. M., & Cohen, J. D. (2014). The computational and neural basis of cognitive control: Charted territory and new frontiers. Cognitive Science, 38(6), 1249–1285.Google Scholar

Bowers, J. S., & Davis, C. J. (2012). Bayesian just-so stories in psychology and neuroscience. Psychological Bulletin, 138(3), 389.Google Scholar

Bowers, J. S., Malhotra, G., Dujmović, M., Montero, M. L., Tsvetkov, C., Biscione, V., … Blything, R. (2022). Deep problems with neural network models of human vision. Behavioral and Brain Sciences, 46, 1–74.Google Scholar

Bramley, N. R., Dayan, P., Griffiths, T. L., & Lagnado, D. A. (2017). Formalizing Neurath's ship: Approximate algorithms for online causal learning. Psychological Review, 124(3), 301.Google Scholar

Brändle, F., Binz, M., & Schulz, E. (2022a). Exploration beyond bandits. In Cogliati Dezza, I., Schulz, E., & Wu, C. M. (Eds.), The drive for knowledge: The science of human information seeking (pp. 147–168). Cambridge University Press. doi:10.1017/9781009026949.008Google Scholar

Brändle, F., Stocks, L. J., Tenenbaum, J. B., Gershman, S. J., & Schulz, E. (2022b). Intrinsically motivated exploration as empowerment. PsyArXiv. January 14.Google Scholar

Brighton, H., & Gigerenzer, G. (2012). Are rational actor models “rational” outside small worlds. In Okasha, S. & Binmore, K. (Eds.), Evolution and rationality: Decisions, co-operation, and strategic behavior (pp. 84–109). Cambridge University Press.Google Scholar

Bromberg-Martin, E. S., Matsumoto, M., Hong, S., & Hikosaka, O. (2010). A pallidus–habenula–dopamine pathway signals inferred stimulus values. Journal of Neurophysiology, 104(2), 1068–1076.Google Scholar

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J. D., Dhariwal, P., … Amodei, D. (2020). Language models are few-shot learners. Advances in Neural Information Processing Systems, 33, 1877–1901.Google Scholar

Chaitin, G. J. (1969). On the simplicity and speed of programs for computing infinite sets of natural numbers. Journal of the ACM (JACM), 16(3), 407–422.Google Scholar

Chater, N., & Oaksford, M. (1999). Ten years of the rational analysis of cognition. Trends in Cognitive Sciences, 3(2), 57–65.Google Scholar

Chater, N., & Vitányi, P. (2003). Simplicity: A unifying principle in cognitive science? Trends in Cognitive Sciences, 7(1), 19–22.Google Scholar

Chomsky, N. (2014). Aspects of the theory of syntax (Vol. 11). MIT Press.Google Scholar

Chowdhery, A., Narang, S., Devlin, J., Bosma, M., Mishra, G., Roberts, A., … Fiedel, N. (2022). Palm: Scaling language modeling with pathways. arXiv preprint arXiv:2204.02311.Google Scholar

Cohen, J. D. (2017). Cognitive control: Core constructs and current considerations. In Egner, T. (Ed.), The Wiley handbook of cognitive control (pp. 1–28). Wiley Blackwell.Google Scholar

Colas, C., Karch, T., Moulin-Frier, C., & Oudeyer, P.-Y. (2022). Language and culture internalization for human-like autotelic AI. Nature Machine Intelligence, 4(12), 1068–1076.Google Scholar

Collins, A. G., & Frank, M. J. (2013). Cognitive control over learning: Creating, clustering, and generalizing task-set structure. Psychological Review, 120(1), 190.Google Scholar

Corner, A., & Hahn, U. (2013). Normative theories of argumentation: Are some norms better than others? Synthese, 190(16), 3579–3610.Google Scholar

Courville, A. C., & Daw, N. D. (2008). The rat as particle filter. In Platt, J., Koller, D., Singer, Y., & Roweis, S. (Eds.), Advances in neural information processing systems (pp. 369–376). Curran Associates, Inc.Google Scholar

Cover, T. M. (1999). Elements of information theory. John Wiley.Google Scholar

Cranmer, K., Brehmer, J., & Louppe, G. (2020). The frontier of simulation-based inference. Proceedings of the National Academy of Sciences of the United States of America, 117(48), 30055–30062.Google Scholar

Czerlinski, J., Gigerenzer, G., & Goldstein, D. G. (1999). How good are simple heuristics. In Gigerenzer, G. & Todd, P. M. (Eds.), Simple heuristics that make us smart (pp. 97–118). Oxford University Press.Google Scholar

Dasgupta, I., & Gershman, S. J. (2021). Memory as a computational resource. Trends in Cognitive Sciences, 25(3), 240–251.Google Scholar

Dasgupta, I., Lampinen, A. K., Chan, S. C., Creswell, A., Kumaran, D., McClelland, J. L., & Hill, F. (2022). Language models show human-like content effects on reasoning. arXiv preprint arXiv:2207.07051.Google Scholar

Dasgupta, I., Schulz, E., & Gershman, S. J. (2017). Where do hypotheses come from? Cognitive Psychology, 96, 1–25.Google Scholar

Dasgupta, I., Schulz, E., Tenenbaum, J. B., & Gershman, S. J. (2020). A theory of learning to infer. Psychological Review, 127(3), 412.Google Scholar

Dasgupta, I., Wang, J., Chiappa, S., Mitrovic, J., Ortega, P., Raposo, D., … Kurth-Nelson, Z. (2019). Causal reasoning from meta-reinforcement learning. arXiv preprint arXiv:1901.08162.Google Scholar

Daw, N. D., Courville, A. C., & Dayan, P. (2008). Semi-rational models of conditioning: The case of trial order. In Chater, N. & Oaksford, M. (Eds.), The probabilistic mind (pp. 431–452). Oxford Academic.Google Scholar

Daw, N. D., Gershman, S. J., Seymour, B., Dayan, P., & Dolan, R. J. (2011). Model-based influences on humans’ choices and striatal prediction errors. Neuron, 69(6), 1204–1215.Google Scholar

Dayan, P., & Kakade, S. (2000). Explaining away in weight space. Advances in Neural Information Processing Systems, 13, 430–436.Google Scholar

Dobs, K., Martinez, J., Kell, A. J., & Kanwisher, N. (2022). Brain-like functional specialization emerges spontaneously in deep neural networks. Science Advances, 8(11), eabl8913.Google Scholar

Doya, K. (2002). Metalearning and neuromodulation. Neural Networks, 15(4–6), 495–506.Google Scholar

Duan, Y., Schulman, J., Chen, X., Bartlett, P. L., Sutskever, I., & Abbeel, P. (2016). RL2: Fast reinforcement learning via slow reinforcement learning. arXiv preprint arXiv:1611.02779.Google Scholar

Dubey, R., Grant, E., Luo, M., Narasimhan, K., & Griffiths, T. (2020). Connecting context-specific adaptation in humans to meta-learning. arXiv preprint arXiv:2011.13782.Google Scholar

Duff, M. O. (2003). Optimal learning: Computational procedures for Bayes-adaptive Markov decision processes [Unpublished PhD thesis]. University of Massachusetts Amherst.Google Scholar

Farquhar, G., Rocktäschel, T., Igl, M., & Whiteson, S. (2017). TreeQN and ATreeC: Differentiable tree-structured models for deep reinforcement learning. arXiv preprint arXiv:1710.11417.Google Scholar

Feldman, J. (2016). The simplicity principle in perception and cognition. Wiley Interdisciplinary Reviews: Cognitive Science, 7(5), 330–340.Google Scholar

Feurer, M., & Hutter, F. (2019). Hyperparameter optimization. In Hutter, F., Kotthoff, L., & Vanschoren, J. (Eds.), Automated machine learning (pp. 3–33). Springer.Google Scholar

Finn, C., Abbeel, P., & Levine, S. (2017). Model-agnostic meta-learning for fast adaptation of deep networks. In International conference on machine learning (pp. 1126–1135).Google Scholar

Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.Google Scholar

Gauvrit, N., Zenil, H., Delahaye, J.-P., & Soler-Toscano, F. (2014). Algorithmic complexity for short binary strings applied to psychology: A primer. Behavior Research Methods, 46(3), 732–744.Google Scholar

Gauvrit, N., Zenil, H., & Tegnér, J. (2017). The information-theoretic and algorithmic approach to human, animal, and artificial cognition. In Dodig-Crnkovic, G. & Giovagnoli, R. (Eds.), Representation and reality in humans, other living organisms and intelligent machines (pp. 117–139). Springer.Google Scholar

Geman, S., & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images. IEEE Transactions on Pattern Analysis and Machine Intelligence, (6), 721–741.Google Scholar

Gershman, S. J. (2015). A unifying probabilistic view of associative learning. PLoS Computational Biology, 11(11), e1004567.Google Scholar

Gershman, S. J. (2018). Deconstructing the human algorithms for exploration. Cognition, 173, 34–42.Google Scholar

Gershman, S. J., & Daw, N. D. (2017). Reinforcement learning and episodic memory in humans and animals: An integrative framework. Annual Review of Psychology, 68, 101.Google Scholar

Gerstenberg, T., Goodman, N. D., Lagnado, D. A., & Tenenbaum, J. B. (2021). A counterfactual simulation model of causal judgments for physical events. Psychological Review, 128(5), 936.Google Scholar

Gigerenzer, G., & Gaissmaier, W. (2011). Heuristic decision making. Annual Review of Psychology, 62, 451–482.Google Scholar

Gittins, J. C. (1979). Bandit processes and dynamic allocation indices. Journal of the Royal Statistical Society. Series B: Methodological, 41(2), 148–177.Google Scholar

Goyal, A., & Bengio, Y. (2022). Inductive biases for deep learning of higher-level cognition. Proceedings of the Royal Society A, 478(2266), 20210068.Google Scholar

Grant, E., Finn, C., Levine, S., Darrell, T., & Griffiths, T. (2018). Recasting gradient-based meta-learning as hierarchical Bayes. In 6th international conference on learning representations, ICLR 2018.Google Scholar

Griffiths, T. L., Callaway, F., Chang, M. B., Grant, E., Krueger, P. M., & Lieder, F. (2019). Doing more with less: Meta-reasoning and meta-learning in humans and machines. Current Opinion in Behavioral Sciences, 29, 24–30.Google Scholar

Griffiths, T. L., Chater, N., Kemp, C., Perfors, A., & Tenenbaum, J. B. (2010). Probabilistic models of cognition: Exploring representations and inductive biases. Trends in Cognitive Sciences, 14(8), 357–364.Google Scholar

Griffiths, T. L., Chater, N., Norris, D., & Pouget, A. (2012). How the Bayesians got their beliefs (and what those beliefs actually are): Comment on Bowers and Davis (2012). Psychological Bulletin, 138(3), 415–422.Google Scholar

Griffiths, T. L., Daniels, D., Austerweil, J. L., & Tenenbaum, J. B. (2018). Subjective randomness as statistical inference. Cognitive Psychology, 103, 85–109.Google Scholar

Griffiths, T. L., Kemp, C., & Tenenbaum, J. B. (2008). Bayesian models of cognition. In Sun, R. (Ed.), The Cambridge handbook of computational psychology (pp. 59–100). Cambridge University Press.Google Scholar

Griffiths, T. L., & Tenenbaum, J. B. (2006). Optimal predictions in everyday cognition. Psychological Science, 17(9), 767–773.Google Scholar

Harlow, H. F. (1949). The formation of learning sets. Psychological Review, 56(1), 51.Google Scholar

Harrison, P., Marjieh, R., Adolfi, F., van Rijn, P., Anglada-Tort, M., Tchernichovski, O., … Jacoby, N. (2020). Gibbs sampling with people. Advances in Neural Information Processing Systems, 33, 10659–10671.Google Scholar

Hill, F., Lampinen, A., Schneider, R., Clark, S., Botvinick, M., McClelland, J. L., & Santoro, A. (2020). Environmental drivers of systematicity and generalization in a situated agent. In International conference on learning representations. Retrieved from https://openreview.net/forum?id=SklGryBtwr Google Scholar

Hinton, G. E., & Van Camp, D. (1993). Keeping the neural networks simple by minimizing the description length of the weights. In Proceedings of the 6th annual conference on computational learning theory (pp. 5–13).Google Scholar

Hinton, G. E., & Zemel, R. (1993). Autoencoders, minimum description length and Helmholtz free energy. Advances in Neural Information Processing Systems, 6, 3–10.Google Scholar

Hochreiter, S., Younger, A. S., & Conwell, P. R. (2001). Learning to learn using gradient descent. In International conference on artificial neural networks (pp. 87–94).Google Scholar

Hoppe, D., & Rothkopf, C. A. (2016). Learning rational temporal eye movement strategies. Proceedings of the National Academy of Sciences of the United States of America, 113(29), 8332–8337.Google Scholar

Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.Google Scholar

Jagadish, A. K., Binz, M., Saanum, T., Wang, J. X., & Schulz, E. (2023). Zero-shot compositional reinforcement learning in humans. https://doi.org/10.31234/osf.io/ymve5.Google Scholar

Jensen, K. T., Hennequin, G., & Mattar, M. G. (2023). A recurrent network model of planning explains hippocampal replay and human behavior. bioRxiv, 2023-01.Google Scholar

Jones, M., & Love, B. C. (2011). Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition. Behavioral and Brain Sciences, 34(4), 169.Google Scholar

Jordan, M. I., Ghahramani, Z., Jaakkola, T. S., & Saul, L. K. (1999). An introduction to variational methods for graphical models. Machine Learning, 37(2), 183–233.Google Scholar

Kahneman, D., & Tversky, A. (1973). On the psychology of prediction. Psychological Review, 80(4), 237.Google Scholar

Kanwisher, N., Khosla, M., & Dobs, K. (2023). Using artificial neural networks to ask “why” questions of minds and brains. Trends in Neurosciences, 46(3), 240–254.Google Scholar

Kingma, D. P., & Welling, M. (2013). Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114.Google Scholar

Kirsch, L., & Schmidhuber, J. (2021). Meta learning backpropagation and improving it. Advances in Neural Information Processing Systems, 34, 14122–14134.Google Scholar

Knill, D. C., & Richards, W. (1996). Perception as Bayesian inference. Cambridge University Press.Google Scholar

Kolmogorov, A. N. (1965). Three approaches to the quantitative definition of information. Problems of Information Transmission, 1(1), 1–7.Google Scholar

Kool, W., Cushman, F. A., & Gershman, S. J. (2016). When does model-based control payoff? PLoS Computational Biology, 12(8), e1005090.Google Scholar

Körding, K. P., & Wolpert, D. M. (2004). Bayesian integration in sensorimotor learning. Nature, 427(6971), 244–247.Google Scholar

Kruschke, J. (1990). Alcove: A connectionist model of human category learning. Advances in Neural Information Processing Systems, 3, 649–655.Google Scholar

Kumar, S., Correa, C. G., Dasgupta, I., Marjieh, R., Hu, M., Hawkins, R. D., … Griffiths, T. L. (2022a). Using natural language and program abstractions to instill human inductive biases in machines. In Oh, A. H., Agarwal, A., Belgrave, D., & Cho, K. (Eds.), Advances in neural information processing systems (pp. 167–180). Curran Associates, Inc. https://openreview.net/forum?id=buXZ7nIqiwE.Google Scholar

Kumar, S., Dasgupta, I., Cohen, J., Daw, N., & Griffiths, T. (2020b). Meta-learning of structured task distributions in humans and machines. In International conference on learning representations.Google Scholar

Kumar, S., Dasgupta, I., Cohen, J. D., Daw, N. D., & Griffiths, T. L. (2020a). Meta-learning of structured task distributions in humans and machines. arXiv preprint arXiv:2010.02317.Google Scholar

Kumar, S., Dasgupta, I., Marjieh, R., Daw, N. D., Cohen, J. D., & Griffiths, T. L. (2022b). Disentangling abstraction from statistical pattern matching in human and machine learning. arXiv preprint arXiv:2204.01437.Google Scholar

Lake, B. M. (2019). Compositional generalization through meta sequence-to-sequence learning. Advances in Neural Information Processing Systems, 32, 9791–9801.Google Scholar

Lake, B. M., & Baroni, M. (2023). Human-like systematic generalization through a meta-learning neural network. Nature, 623, 1–7.Google Scholar

Lange, R. T., & Sprekeler, H. (2020). Learning not to learn: Nature versus nurture in silico. arXiv preprint arXiv:2010.04466.Google Scholar

Lengyel, M., & Dayan, P. (2007). Hippocampal contributions to control: The third way. Advances in Neural Information Processing Systems, 20, 889–896.Google Scholar

Lewis, D. (1999). Why conditionalize? In Lewis, D. (Ed.) , Papers in metaphysics and epistemology (Vol. 2, pp. 403–407). Cambridge University Press. doi:10.1017/CBO9780511625343.024Google Scholar

Li, Z., Zhou, F., Chen, F., & Li, H. (2017). Meta-SGD: Learning to learn quickly for few-shot learning. arXiv preprint arXiv:1707.09835.Google Scholar

Lieder, F., & Griffiths, T. L. (2017). Strategy selection as rational metareasoning. Psychological Review, 124(6), 762.Google Scholar

Liu, Z., Luo, P., Wang, X., & Tang, X. (2015, December). Deep learning face attributes in the wild. In Proceedings of international conference on computer vision (ICCV).Google Scholar

Lucas, C. G., Griffiths, T. L., Williams, J. J., & Kalish, M. L. (2015). A rational model of function learning. Psychonomic Bulletin & Review, 22(5), 1193–1215.Google Scholar

Lueckmann, J.-M., Boelts, J., Greenberg, D., Goncalves, P., & Macke, J. (2021). Benchmarking simulation-based inference. In International conference on artificial intelligence and statistics (pp. 343–351).Google Scholar

Marr, D. (2010). Vision: A computational investigation into the human representation and processing of visual information. MIT Press.Google Scholar

McClelland, J. L., Botvinick, M. M., Noelle, D. C., Plaut, D. C., Rogers, T. T., Seidenberg, M. S., & Smith, L. B. (2010). Letting structure emerge: Connectionist and dynamical systems approaches to cognition. Trends in Cognitive Sciences, 14(8), 348–356.Google Scholar

McClelland, J. L., Hill, F., Rudolph, M., Baldridge, J., & Schütze, H. (2020). Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models. Proceedings of the National Academy of Sciences of the United States of America, 117(42), 25966–25974.Google Scholar

McClelland, J. L., McNaughton, B. L., & O'Reilly, R. C. (1995). Why there are complementary learning systems in the hippocampus and neocortex: Insights from the successes and failures of connectionist models of learning and memory. Psychological Review, 102(3), 419.Google Scholar

McCoy, R. T., Grant, E., Smolensky, P., Griffiths, T. L., & Linzen, T. (2020). Universal linguistic inductive biases via meta-learning. arXiv preprint arXiv:2006.16324.Google Scholar

Mercier, H., & Sperber, D. (2017). The enigma of reason. Harvard University Press.Google Scholar

Miconi, T. (2023). A large parametrized space of meta-reinforcement learning tasks. arXiv preprint arXiv:2302.05583.Google Scholar

Miconi, T., Rawal, A., Clune, J., & Stanley, K. O. (2020). Backpropamine: Training self-modifying neural networks with differentiable neuromodulated plasticity. arXiv preprint arXiv:2002.10585.Google Scholar

Mikulik, V., Delétang, G., McGrath, T., Genewein, T., Martic, M., Legg, S., & Ortega, P. (2020). Meta-trained agents implement Bayes-optimal agents. In Larochelle, H., Ranzato, M., Hadsell, R., Balcan, M. F., & Lin, H. (Eds.), Advances in neural information processing systems (Vol. 33, pp. 18691–18703). Curran. Retrieved from https://proceedings.neurips.cc/paper/2020/file/d902c3ce47124c66ce615d5ad9ba304f-Paper.pdf Google Scholar

Mitchell, T. M. (1997). Machine learning (Vol. 1). McGraw Hill.Google Scholar

Molano-Mazon, M., Barbosa, J., Pastor-Ciurana, J., Fradera, M., Zhang, R.-Y., Forest, J., … Yang, G. R. (2022). Neurogym: An open resource for developing and sharing neuroscience tasks. https://doi.org/10.31234/osf.io/aqc9n.Google Scholar

Montello, D. R. (2005). Navigation. Cambridge University Press.Google Scholar

Moskovitz, T., Miller, K., Sahani, M., & Botvinick, M. M. (2022). A unified theory of dual-process control. arXiv preprint arXiv:2211.07036.Google Scholar

Müller, S., Hollmann, N., Arango, S. P., Grabocka, J., & Hutter, F. (2021). Transformers can do Bayesian inference. arXiv preprint arXiv:2112.10510.Google Scholar

Murphy, K. P. (2012). Machine learning: A probabilistic perspective. MIT Press.Google Scholar

Newell, A. (1992). Unified theories of cognition and the role of soar. In Soar: A cognitive architecture in perspective (pp. 25–79). Springer.Google Scholar

Newell, A., & Simon, H. A. (1972). Human problem solving (Vol. 104, No. 9). Prentice Hall.Google Scholar

Nichol, A., Achiam, J., & Schulman, J. (2018). On first-order meta-learning algorithms. arXiv preprint arXiv:1803.02999.Google Scholar

Nosofsky, R. M. (2011). The generalized context model: An exemplar model of classification. In Pothos, E. M. & Wills, A. J. (Eds.), Formal approaches in categorization (pp. 18–39). Cambridge University Press.Google Scholar

Oaksford, M., & Chater, N. (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford University Press.Google Scholar

Ortega, P. A., Braun, D. A., Dyer, J., Kim, K.-E., & Tishby, N. (2015). Information-theoretic bounded rationality. arXiv preprint arXiv:1512.06789.Google Scholar

Ortega, P. A., Wang, J. X., Rowland, M., Genewein, T., Kurth-Nelson, Z., Pascanu, R., … Legg, S. (2019). Meta-learning of sequential strategies. arXiv preprint arXiv:1905.03030.Google Scholar

Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., … Chintala, S. (2019). Pytorch: An imperative style, high-performance deep learning library. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., & Garnett, R. (Eds.), Advances in neural information processing systems 32 (pp. 8024–8035). Curran. Retrieved from http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf Google Scholar

Pearl, J. (2009). Causality. Cambridge University Press.Google Scholar

Piloto, L. S., Weinstein, A., Battaglia, P., & Botvinick, M. (2022). Intuitive physics learning in a deep-learning model inspired by developmental psychology. Nature Human Behaviour, 6(9), 1257–1267.Google Scholar

Pritzel, A., Uria, B., Srinivasan, S., Badia, A. P., Vinyals, O., Hassabis, D., … Blundell, C. (2017). Neural episodic control. In International conference on machine learning (pp. 2827–2836).Google Scholar

Rabinowitz, N. C. (2019). Meta-learners’ learning dynamics are unlike learners’. arXiv preprint arXiv:1905.01320.Google Scholar

Rahwan, I., Cebrian, M., Obradovich, N., Bongard, J., Bonnefon, J.-F., Breazeal, C., … Wellman, M. (2019). Machine behaviour. Nature, 568(7753), 477–486.Google Scholar

Ratcliff, R., & McKoon, G. (2008). The diffusion decision model: Theory and data for two-choice decision tasks. Neural Computation, 20(4), 873–922.Google Scholar

Reed, S., Zolna, K., Parisotto, E., Colmenarejo, S. G., Novikov, A., Barth-Maron, G., … de Freitas, N. (2022). A generalist agent. arXiv preprint arXiv:2205.06175.Google Scholar

Rescorla, M. (2020). An improved Dutch book theorem for conditionalization. Erkenntnis, 87, 1–29.Google Scholar

Rescorla, R. A. (1972). A theory of Pavlovian conditioning: Variations in the effectiveness of reinforcement and nonreinforcement. In Black, A. H. & Prokasy, W. F. (Eds.), Current research and theory (pp. 64–99). Appleton-Century-Crofts.Google Scholar

Rich, A. S., & Gureckis, T. M. (2019). Lessons for artificial intelligence from the study of natural stupidity. Nature Machine Intelligence, 1(4), 174–180.Google Scholar

Ritter, S., Barrett, D. G., Santoro, A., & Botvinick, M. M. (2017). Cognitive psychology for deep neural networks: A shape bias case study. In International conference on machine learning (pp. 2940–2949).Google Scholar

Ritter, S., Wang, J., Kurth-Nelson, Z., Jayakumar, S., Blundell, C., Pascanu, R., & Botvinick, M. (2018). Been there, done that: Meta-learning with episodic recall. In International conference on machine learning (pp. 4354–4363).Google Scholar

Riveland, R., & Pouget, A. (2022). Generalization in sensorimotor networks configured with natural language instructions. bioRxiv, 2022-02.Google Scholar

Rosenkrantz, R. D. (1992). The justification of induction. Philosophy of Science, 59(4), 527–539.Google Scholar

Rumelhart, D. E., McClelland, J. L., & PDP Research Group, . (1988). Parallel distributed processing (Vol. 1). IEEE.Google Scholar

Sanborn, A., & Griffiths, T. (2007). Markov chain Monte Carlo with people. Advances in Neural Information Processing Systems, 20, 1265–1272.Google Scholar

Sanborn, A. N. (2017). Types of approximation for probabilistic cognition: Sampling and variational. Brain and Cognition, 112, 98–101.Google Scholar

Sanborn, A. N., Griffiths, T. L., & Navarro, D. J. (2010). Rational approximations to rational models: Alternative algorithms for category learning. Psychological Review, 117(4), 1144.Google Scholar

Sanborn, A. N., & Silva, R. (2013). Constraining bridges between levels of analysis: A computational justification for locally Bayesian learning. Journal of Mathematical Psychology, 57(3–4), 94–106.Google Scholar

Sancaktar, C., Blaes, S., & Martius, G. (2022). Curious exploration via structured world models yields zero-shot object manipulation. In Oh, A. H., Agarwal, A., Belgrave, D., & Cho, K. (Eds.) , Advances in neural information processing systems (pp. 24170–24183). Curran Associates, Inc. https://openreview.net/forum?id=NnuYZ1el24C.Google Scholar

Santoro, A., Bartunov, S., Botvinick, M., Wierstra, D., & Lillicrap, T. (2016). Meta-learning with memory-augmented neural networks. In International conference on machine learning (pp. 1842–1850).Google Scholar

Savage, L. J. (1972). The foundations of statistics. Courier.Google Scholar

Schaul, T., & Schmidhuber, J. (2010). Metalearning. Scholarpedia, 5(6), 4650 (revision #91489). doi:10.4249/scholarpedia.4650Google Scholar

Schlag, I., Irie, K., & Schmidhuber, J. (2021). Linear transformers are secretly fast weight programmers. In International conference on machine learning (pp. 9355–9366).Google Scholar

Schmidhuber, J. (1987). Evolutionary principles in self-referential learning, or on learning how to learn: The meta-meta-… hook. Unpublished doctoral dissertation, Technische Universität München.Google Scholar

Schramowski, P., Turan, C., Andersen, N., Rothkopf, C. A., & Kersting, K. (2022). Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3), 258–268.Google Scholar

Schulz, E., Bhui, R., Love, B. C., Brier, B., Todd, M. T., & Gershman, S. J. (2019). Structured, uncertainty-driven exploration in real-world consumer choice. Proceedings of the National Academy of Sciences of the United States of America, 116(28), 13903–13908.Google Scholar

Schulz, E., & Dayan, P. (2020). Computational psychiatry for computers. iScience, 23(12), 101772.Google Scholar

Schulz, E., & Gershman, S. J. (2019). The algorithmic architecture of exploration in the human brain. Current Opinion in Neurobiology, 55, 7–14.Google Scholar

Schulz, E., Tenenbaum, J. B., Duvenaud, D., Speekenbrink, M., & Gershman, S. J. (2017). Compositional inductive biases in function learning. Cognitive Psychology, 99, 44–79. doi:10.1016/j.cogpsych.2017.11.002Google Scholar

Schulze Buschoff, L. M., Schulz, E., & Binz, M. (2023). The acquisition of physical knowledge in generative neural networks. In Proceedings of the 40th international conference on machine learning (pp. 30321–30341).Google Scholar

Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237(4820), 1317–1323.Google Scholar

Snell, J., Swersky, K., & Zemel, R. (2017). Prototypical networks for few-shot learning. Advances in Neural Information Processing Systems, 30, 4080–4090.Google Scholar

Solomonoff, R. J. (1964). A formal theory of inductive inference. Part I. Information and Control, 7(1), 1–22.Google Scholar

Stopper, C. M., Maric, T., Montes, D. R., Wiedman, C. R., & Floresco, S. B. (2014). Overriding phasic dopamine signals redirects action selection during risk/reward decision making. Neuron, 84(1), 177–189.Google Scholar

Strouse, D., McKee, K., Botvinick, M., Hughes, E., & Everett, R. (2021). Collaborating with humans without human data. Advances in Neural Information Processing Systems, 34, 14502–14515.Google Scholar

Tamar, A., Wu, Y., Thomas, G., Levine, S., & Abbeel, P. (2016). Value iteration networks. Advances in Neural Information Processing Systems, 29, 2154–2162.Google Scholar

Tauber, S., Navarro, D. J., Perfors, A., & Steyvers, M. (2017). Bayesian models of cognition revisited: Setting optimality aside and letting data drive psychological theory. Psychological Review, 124(4), 410.Google Scholar

Team, A. A., Bauer, J., Baumli, K., Baveja, S., Behbahani, F., Bhoopchand, A., … Zhang, L. (2023). Human-timescale adaptation in an open-ended task space. arXiv preprint arXiv:2301.07608.Google Scholar

Team, O. E. L., Stooke, A., Mahajan, A., Barros, C., Deck, C., Bauer, J., … Czarnecki, W. M. (2021). Open-ended learning leads to generally capable agents. arXiv preprint arXiv:2107.12808.Google Scholar

Tenenbaum, J. (2021). Joshua Tenenbaum's homepage. Retrieved from http://web.mit.edu/cocosci/josh.html Google Scholar

Thrun, S., & Pratt, L. (1998). Learning to learn: Introduction and overview. In Thrun, S. & Pratt, L. (Eds.), Learning to learn (pp. 3–17). Springer.Google Scholar

Todorov, E., Erez, T., & Tassa, Y. (2012). Mujoco: A physics engine for model-based control. In 2012 IEEE/RSJ international conference on intelligent robots and systems (pp. 5026–5033).Google Scholar

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., … Polosukhin, I. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 6000–6010.Google Scholar

Vinyals, O., Blundell, C., Lillicrap, T., & Wierstra, D. (2016). Matching networks for one shot learning. Advances in Neural Information Processing Systems, 29, 3637–3645.Google Scholar

Wang, J. X. (2021). Meta-learning in natural and artificial intelligence. Current Opinion in Behavioral Sciences, 38, 90–95.Google Scholar

Wang, J. X., King, M., Porcel, N., Kurth-Nelson, Z., Zhu, T., Deck, C., … Botvinick, M. (2021). Alchemy: A structured task distribution for meta-reinforcement learning. CoRR, abs/2102.02926. Retrieved from https://arxiv.org/abs/2102.02926 Google Scholar

Wang, J. X., Kurth-Nelson, Z., Kumaran, D., Tirumala, D., Soyer, H., Leibo, J. Z., … Botvinick, M. (2018). Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 21(6), 860–868.Google Scholar

Wang, J. X., Kurth-Nelson, Z., Tirumala, D., Soyer, H., Leibo, J. Z., Munos, R., … Botvinick, M. (2016). Learning to reinforcement learn. arXiv preprint arXiv:1611.05763.Google Scholar

Wilson, R. C., Geana, A., White, J. M., Ludvig, E. A., & Cohen, J. D. (2014). Humans use directed and random exploration to solve the explore–exploit dilemma. Journal of Experimental Psychology: General, 143(6), 2074.Google Scholar

Wolpert, D. H. (1996). The lack of a priori distinctions between learning algorithms. Neural Computation, 8(7), 1341–1390.Google Scholar

Wolpert, D. H., & Macready, W. G. (1997). No free lunch theorems for optimization. IEEE Transactions on Evolutionary Computation, 1(1), 67–82.Google Scholar

Wu, C. M., Schulz, E., Speekenbrink, M., Nelson, J. D., & Meder, B. (2018). Generalization guides human exploration in vast decision spaces. Nature Human Behaviour, 2(12), 915–924.Google Scholar

Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T., & Wang, X.-J. (2019). Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience, 22(2), 297–306.Google Scholar

Yang, Y., & Piantadosi, S. T. (2022). One model for the learning of language. Proceedings of the National Academy of Sciences of the United States of America, 119(5), e2021865119.Google Scholar

Yu, T., Quillen, D., He, Z., Julian, R., Hausman, K., Finn, C., & Levine, S. (2020). Meta-world: A benchmark and evaluation for multi-task and meta reinforcement learning. In Conference on robot learning (pp. 1094–1100).Google Scholar

Zednik, C., & Jäkel, F. (2016). Bayesian reverse-engineering considered as a research strategy for cognitive science. Synthese, 193(12), 3951–3985.Google Scholar

Zenil, H., Marshall, J. A., & Tegnér, J. (2015). Approximations of algorithmic and structural complexity validate cognitive-behavioural experimental results. arXiv preprint arXiv:1509.06338.Google Scholar