Is human compositionality meta-learned?
Published online by Cambridge University Press: 23 September 2024
Abstract
Recent studies suggest that meta-learning may provide an original solution to an enduring puzzle about whether neural networks can explain compositionality – in particular, by raising the prospect that compositionality can be understood as an emergent property of an inner-loop learning algorithm. We elaborate on this hypothesis and consider its empirical predictions regarding the neural mechanisms and development of human compositionality.
- Type: Open Peer Commentary
- Copyright © The Author(s), 2024. Published by Cambridge University Press
Target article
Meta-learned models of cognition