Meta-learning in active inference

Published online by Cambridge University Press:  23 September 2024

O. Penacchio*
Affiliation:
Computer Science Department, Autonomous University of Barcelona, and School of Psychology and Neuroscience, University of St Andrews, Barcelona, Spain https://openacchio.github.io/
A. Clemente
Affiliation:
Department of Cognitive Neuropsychology, Max Planck Institute for Empirical Aesthetics, Frankfurt am Main, Germany https://www.aesthetics.mpg.de/institut/mitarbeiterinnen/ana-clemente.html
*Corresponding author.

Abstract

Binz et al. propose meta-learning as a promising avenue for modelling human cognition. They provide an in-depth reflection on the advantages of meta-learning over other computational models of cognition, including a sound discussion on how their proposal can accommodate neuroscientific insights. We argue that active inference presents similar computational advantages while offering greater mechanistic explanatory power and biological plausibility.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2024. Published by Cambridge University Press

Binz et al. provide a meritorious survey of the prospects offered by meta-learning for building models of human cognition. Among the main assets of meta-learning discussed by Binz et al. is the capacity to learn inductive bias from experience independently of the constraints enforced by the modeller. Further, Binz et al. showcase the capacity of meta-learning algorithms to approximate Bayesian inference, the gold standard for modelling rational analysis. Finally, they claim that meta-learning offers an unequalled framework for constructing rational models of human cognition that incorporate insights from neuroscience. We propose that an alternative theory of cognition, active inference, shares the same strengths as Binz et al.'s proposal while establishing precise and empirically validated connections to neurobiological mechanisms underlying cognition.

Learning from experience has become a benchmark in all fields aiming to understand and emulate natural intelligence and might be the next driver of developments in artificial intelligence (Zador et al., 2023). In this regard, meta-learning joins other frameworks with the potential to advance the understanding of human cognition, as it allows learning algorithms to adapt to experience beyond the modeller's intervention. However, active inference provides a distinct advantage because it is expressly designed to model and explain how agents engage with their environment. In active inference, cognitive agents – or algorithms – learn through experience by continually refining their internal model of the environment or the task at hand (Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2016).

Another critical aspect of Binz et al.'s proposal is their insistent reference to Bayes' optimality. The concept has well-grounded theoretical and empirical foundations (Clark, 2013) that make it a good standard, justifying Binz et al.'s eagerness to probe their approach against Bayes' optimality. Their algorithm approximates Bayes' optimality, with the mathematical consequence that any cognitive phenomenon accounted for by Bayesian inference can, in theory, be accounted for by meta-learning. However, the flexibility of Binz et al.'s approach to meta-learning comes at the cost of reduced interpretability of the resulting models. In active inference, posterior distributions are inferred using the free-energy principle, a variational approach to Bayesian inference that also approximates intractable computations but within a fully interpretable architecture (see below).
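For readers less familiar with the variational approach mentioned above, the standard form of the variational free energy can be sketched as follows. The notation is an assumption made here for illustration (o for observations, s for hidden states, q(s) for the approximate posterior, p(o, s) for the generative model) and is not taken from Binz et al.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big] \\
     &= D_{\mathrm{KL}}\big[\,q(s)\,\|\,p(s \mid o)\,\big] - \ln p(o)
\end{aligned}
```

Because the Kullback-Leibler divergence is non-negative, F is an upper bound on the surprise, -ln p(o). Minimising F with respect to q therefore brings q(s) closer to the exact posterior p(s | o) without ever computing the intractable evidence p(o) directly.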

We also commend Binz et al. for describing meta-learning's capacity to incorporate insights from neuroscience, a requisite for a computational understanding of cognition (Kriegeskorte & Douglas, 2018). Yet, the biologically inspired elements introduced are ad hoc and case-dependent, as explicitly stated in their conclusion. Their meta-learning models are not motivated by a fundamental biological principle but are conceived as a powerful tool to enhance learning. By contrast, active inference directly translates into neural mechanisms (Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2017) and originates from a single unifying principle: the imperative for organisms to avoid surprising states, implemented by a continuous loop between drawing hypotheses on hidden states (e.g., mean length of an insect species) and observations (e.g., length of a particular specimen) (Friston, 2010). This principle aligns with the Helmholtzian perspective of perception as inference and subsequent Bayesian brain theories. The variational inferential dynamic when receiving new observations can be naturally cast into a constant bidirectional message passing with direct neural implementation as ascending prediction errors and descending predictions (Pezzulo, Parr, & Friston, 2024). Importantly, these mechanisms are common to all active inference models and enjoy empirical support (e.g., Bastos et al., 2012; Schwartenbeck, FitzGerald, Mathys, Dolan, & Friston, 2015).
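The loop between hypotheses and observations described above can be illustrated with a deliberately minimal sketch. It assumes a Gaussian prior over the hidden state (the mean length of the species) and Gaussian observation noise with known precisions; the function names, parameters and default values are hypothetical and do not come from any of the cited models. The belief is revised by precision-weighted prediction errors until it settles.

```python
def infer_hidden_state(observations, prior_mean, prior_precision,
                       obs_precision, step_size=0.05, n_steps=500):
    """Gradient descent on a Gaussian free energy (illustrative assumption).

    F(mu) = 0.5 * obs_precision * sum((o - mu)^2)
          + 0.5 * prior_precision * (mu - prior_mean)^2
    The step size must satisfy
    step_size * (prior_precision + n * obs_precision) < 2 to converge.
    """
    mu = prior_mean  # start from the prior hypothesis
    for _ in range(n_steps):
        # Ascending signals: mismatch between each observation and the prediction.
        sensory_errors = [o - mu for o in observations]
        # Mismatch between the current belief and the prior expectation.
        prior_error = mu - prior_mean
        # Precision-weighted errors push the belief in opposite directions.
        gradient = obs_precision * sum(sensory_errors) - prior_precision * prior_error
        mu += step_size * gradient
    return mu


def exact_posterior_mean(observations, prior_mean, prior_precision, obs_precision):
    """Closed-form Bayesian posterior mean for the same Gaussian model."""
    n = len(observations)
    weighted_sum = prior_precision * prior_mean + obs_precision * sum(observations)
    return weighted_sum / (prior_precision + n * obs_precision)
```

Under these assumptions the iterative scheme converges to the same value as the closed-form Bayesian posterior mean, which is the sense in which error-driven message passing can implement inference in this simple case.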

Learning is a central construct in active inference. Agents constantly update their generative models based on observations and prediction errors, with the imperative of reducing prediction errors. Generative models represent alternative hypotheses about task execution and associated outcomes (e.g., estimating the average length of a species). These hypotheses, and possibly all components of the generative models, are tested and refined through the agent's experience (Friston et al., 2016). This experience-dependent plasticity has two fundamental assets.

First, learning in active inference is directly and naturally interpreted in terms of biologically plausible neuronal mechanisms. The updates of all components of the generative models are driven by co-occurrences between predicted outcomes (in postsynaptic units in the neuronal interpretation sketched above) and (presynaptic) observational inputs in a process reminiscent of Hebbian learning (Friston et al., 2016). Consequently, active inference is cast as a process theory that can draw specific empirical predictions on neuronal dynamics (Whyte & Smith, 2021).
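A co-occurrence-driven update of this kind can be sketched for a discrete model, assuming that the mapping from hidden states to outcomes is parameterised by a table of counts (Dirichlet concentration parameters). The function names, the table layout and the `learning_rate` parameter are hypothetical choices made for illustration, not the authors' implementation.

```python
def update_likelihood_counts(counts, outcome_index, state_beliefs, learning_rate=1.0):
    """Hebbian-like update of counts[o][s] (illustrative assumption).

    counts[o][s] grows when outcome o is observed while the agent
    believes, with probability state_beliefs[s], that it is in state s.
    """
    new_counts = [row[:] for row in counts]  # leave the input table untouched
    for s, belief in enumerate(state_beliefs):
        # Co-occurrence of the observed outcome and the inferred state.
        new_counts[outcome_index][s] += learning_rate * belief
    return new_counts


def expected_likelihood(counts):
    """Normalise each column to obtain the expected P(outcome | state)."""
    n_outcomes = len(counts)
    n_states = len(counts[0])
    totals = [sum(counts[o][s] for o in range(n_outcomes)) for s in range(n_states)]
    return [[counts[o][s] / totals[s] for s in range(n_states)]
            for o in range(n_outcomes)]
```

Starting from uniform counts, repeatedly observing one outcome while believing in one state strengthens that entry only, much as a synapse would strengthen when pre- and postsynaptic activity coincide.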

Second, and crucially, agents in active inference learn the reliability of inputs and prediction errors. This precision estimation is akin to learning meta-parameters (as per Binz et al.) as it entails a weighting process that prioritises reliable sources over uninformative inputs. The balance between exploration and exploitation (central constructs in cognition reflecting epistemic affordances and pragmatic value, respectively) rests upon mechanisms with direct neurobiological substrate in terms of dopamine release, with important implications for rational decision-making – for example, in two-armed bandit tasks (Schwartenbeck et al., 2019), maze navigation (Kaplan & Friston, 2018) and computational psychiatry (Smith, Badcock, & Friston, 2021). Another essential strength of active inference for implementing meta-learning is its natural hierarchical extension. Upper levels can control parameters of lower levels, enabling inference at different timescales whereby learning at lower levels is optimised over time by top-down adjustments from upper levels, which has direct neuronal interpretation in multi-scale hierarchical brain organisation (Pezzulo, Rigoli, & Friston, 2018).
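The precision estimation mentioned at the start of this paragraph can be illustrated with a small sketch. It assumes that each input source has produced a history of prediction errors and that reliability is estimated as the inverse of the mean squared error; this estimator and the function names are hypothetical simplifications rather than the scheme used in the cited studies.

```python
def estimate_precision(prediction_errors):
    """Estimate a source's precision as the inverse mean squared error.

    Illustrative assumption: errors are zero-mean, so the mean squared
    error approximates the variance. Raises on an empty or all-zero history.
    """
    if not prediction_errors:
        raise ValueError("need at least one prediction error")
    mean_squared_error = sum(e * e for e in prediction_errors) / len(prediction_errors)
    if mean_squared_error == 0.0:
        raise ValueError("zero error variance gives unbounded precision")
    return 1.0 / mean_squared_error


def precision_weighted_estimate(values, precisions):
    """Combine the sources' current values, weighting each by its precision."""
    total_precision = sum(precisions)
    return sum(p * v for p, v in zip(precisions, values)) / total_precision
```

A source whose past errors were small receives a high precision and dominates the combined estimate, whereas a noisy source is largely discounted. In this sense the learned precisions behave like meta-parameters that govern how strongly each new input updates the agent's beliefs.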

Model preference depends on performance and, primarily, on the scientific question at hand. To understand cognition and its mechanistic underpinnings, models whose components and articulations can be directly interpreted in terms of neural mechanisms are essential. Active inference is a principled, biologically plausible and fully interpretable model of cognition with promising applications to artificial intelligence that accounts for neurobiological and psychological phenomena. We contend that it provides a comprehensive model for understanding biological systems and improving artificial cognition.

Financial support

O. P. was funded by a Maria Zambrano Fellowship for the attraction of international talent for the requalification of the Spanish university system—Next Generation EU.

Competing interests

None.

References

Bastos, A. M., Usrey, W. M., Adams, R. A., Mangun, G. R., Fries, P., & Friston, K. J. (2012). Canonical microcircuits for predictive coding. Neuron, 76(4), 695–711. https://doi.org/10.1016/j.neuron.2012.10.038
Clark, A. (2013). Whatever next? Predictive brains, situated agents, and the future of cognitive science. Behavioral and Brain Sciences, 36(3), 181–204. https://doi.org/10.1017/S0140525X12000477
Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138. https://doi.org/10.1038/nrn2787
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2016). Active inference and learning. Neuroscience & Biobehavioral Reviews, 68, 862–879. https://doi.org/10.1016/j.neubiorev.2016.06.022
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2017). Active inference: A process theory. Neural Computation, 29(1), 1–49. https://doi.org/10.1162/NECO_a_00912
Kaplan, R., & Friston, K. J. (2018). Planning and navigation as active inference. Biological Cybernetics, 112(4), 323–343. https://doi.org/10.1007/s00422-018-0753-2
Kriegeskorte, N., & Douglas, P. K. (2018). Cognitive computational neuroscience. Nature Neuroscience, 21(9), 1148–1160. https://doi.org/10.1038/s41593-018-0210-5
Pezzulo, G., Parr, T., & Friston, K. (2024). Active inference as a theory of sentient behavior. Biological Psychology, 186, 108741. https://doi.org/10.1016/j.biopsycho.2023.108741
Pezzulo, G., Rigoli, F., & Friston, K. J. (2018). Hierarchical active inference: A theory of motivated control. Trends in Cognitive Sciences, 22(4), 294–306. https://doi.org/10.1016/j.tics.2018.01.009
Schwartenbeck, P., FitzGerald, T. H., Mathys, C., Dolan, R., & Friston, K. (2015). The dopaminergic midbrain encodes the expected certainty about desired outcomes. Cerebral Cortex, 25(10), 3434–3445. https://doi.org/10.1093/cercor/bhu159
Schwartenbeck, P., Passecker, J., Hauser, T. U., FitzGerald, T. H. B., Kronbichler, M., & Friston, K. J. (2019). Computational mechanisms of curiosity and goal-directed exploration. eLife, 8, e41703. https://doi.org/10.7554/eLife.41703
Smith, R., Badcock, P., & Friston, K. J. (2021). Recent advances in the application of predictive coding and active inference models within clinical neuroscience. Psychiatry and Clinical Neurosciences, 75(1), 3–13. https://doi.org/10.1111/pcn.13138
Whyte, C. J., & Smith, R. (2021). The predictive global neuronal workspace: A formal active inference model of visual consciousness. Progress in Neurobiology, 199, 101918. https://doi.org/10.1016/j.pneurobio.2020.101918
Zador, A., Escola, S., Richards, B., Ölveczky, B., Bengio, Y., Boahen, K., Tsao, D. (2023). Catalyzing next-generation Artificial Intelligence through NeuroAI. Nature Communications, 14(1), 1597. https://doi.org/10.1038/s41467-023-37180-x