Meta-learning: Bayesian or quantum?

Antonio Mastrogiorgio

doi:10.1017/S0140525X24000220

Published online by Cambridge University Press: 23 September 2024

Affiliation: Department of Psychological and Social Sciences, John Cabot University, Rome, Italy. www.johncabot.edu; https://sites.google.com/site/mastrogiorgioantonio/
Corresponding author: Antonio Mastrogiorgio.

Abstract

Abundant experimental evidence illustrates violations of Bayesian models across various cognitive processes. Quantum cognition capitalizes on the limitations of Bayesian models, providing a compelling alternative. We suggest that a generalized quantum approach in meta-learning is simultaneously more robust and flexible, as it retains all the advantages of the Bayesian framework while avoiding its limitations.

Type: Open Peer Commentary
Information: Behavioral and Brain Sciences, Volume 47, 2024, e154

DOI: https://doi.org/10.1017/S0140525X24000220
Copyright: © The Author(s), 2024. Published by Cambridge University Press

The use of the Bayesian framework in meta-learning represents an elegant way to bypass the strictures of an ex-ante specification of models of cognition. As proposed by Binz et al. (hereafter the Authors), Bayesian inference, building upon unconstrained interactions with the environment, represents a viable alternative to more traditional hand-designed learning algorithms.

However, Bayesian models come with inherent limitations, which researchers are rarely aware of. While scholars normally endorse Bayesian models because of their unconstrained features, they rarely consider that such models are actually “constrained” to the Kolmogorovian assumptions of classical probability theory.

Abundant experimental evidence illustrates violations of Bayesian models across various cognitive processes, including probability judgment errors, memory recognition, semantic spaces, information processing, learning, concept combination, and perception (e.g., Pothos & Busemeyer, 2022). These violations stem from the fact that many cognitive phenomena do not adhere to the law of total probability, or to the distributivity axiom, both of which are assumed in classical Kolmogorovian probability. Consequently, they do not admit a Bayesian operationalization.

Let us consider a stylized meta-learning process (similar to those discussed by the Authors) that can be operationalized through a Bayesian model. Suppose we hypothesize that learning performance can assume the (mutually exclusive) states P1 or P2, conditioned on meta-experience, which can assume the (mutually exclusive) states E1 or E2. Let's aim to predict the total probability, p, of the state P2. Consistent with a Bayesian framework, we can consider two cases: One, which we'll refer to as the “unconditioned case,” where we only observe p(P2); and the other, referred to as the “conditioned case,” where we observe the conditioned probabilities, p(P2|E1) and p(P2|E2).

Now, suppose that in experimental settings, the observed data reveal that in the “unconditioned case,” the probability is p(P2) = 0.29, while in the “conditioned case,” the probability is p(E1)⋅p(P2|E1) + p(E2)⋅p(P2|E2) = 0.59, where p(P2|E2) = 0.63.

We quickly realize that these experimental results are incompatible with a Bayesian framework: The total probability of P2 in the “unconditioned case” is inconsistent with that of the “conditioned case” (0.29 vs. 0.59), and the total probability of the “unconditioned case” is also lower than the probability in the “conditioned case” restricted to E2 (0.29 vs. 0.63). When evidence violates the law of total probability, Bayesian models prove inadequate (for further discussion, see Busemeyer & Wang, 2015; Busemeyer, Wang, & Lambert-Mogiliansky, 2009).
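The comparison above can be sketched as a small consistency check. This is only an illustration: the three probabilities are the ones quoted in the example, while `P_E1 = 0.5`, the tolerance `tol`, and all function names are assumptions introduced here, since the commentary does not report p(E1) or p(P2|E1).

```python
def total_probability(p_e1, p_p2_given_e1, p_p2_given_e2):
    """Classical prediction for p(P2) via the law of total probability."""
    p_e2 = 1.0 - p_e1  # E1 and E2 are mutually exclusive and exhaustive
    return p_e1 * p_p2_given_e1 + p_e2 * p_p2_given_e2


def violates_total_probability(p_unconditioned, p_conditioned_total, tol=0.01):
    """True if the two estimates of p(P2) differ by more than `tol`.

    `tol` is a hypothetical allowance for sampling noise, not a value
    taken from the commentary.
    """
    return abs(p_unconditioned - p_conditioned_total) > tol


# Values quoted in the example.
P_UNCOND = 0.29       # p(P2) when E is not observed
P_COND_TOTAL = 0.59   # p(E1)p(P2|E1) + p(E2)p(P2|E2) when E is observed
P_P2_GIVEN_E2 = 0.63  # p(P2|E2)

# ASSUMPTION: p(E1) is not reported; 0.5 is chosen purely for illustration.
P_E1 = 0.5
# Back out the p(P2|E1) implied by the reported total under that assumption.
p_p2_given_e1 = (P_COND_TOTAL - (1.0 - P_E1) * P_P2_GIVEN_E2) / P_E1

classical_prediction = total_probability(P_E1, p_p2_given_e1, P_P2_GIVEN_E2)
gap = P_UNCOND - classical_prediction  # what a classical model cannot absorb
is_violation = violates_total_probability(P_UNCOND, classical_prediction)
```

Whatever value is assumed for p(E1), the classical prediction is pinned to the reported 0.59, so the gap to the observed 0.29 remains; the assumption only affects the implied p(P2|E1).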

Such situations are paradoxical: They are implausible to the extent that they do not admit a Bayesian formalization, yet they result from experimental evidence. When it comes to understanding such evidence, quantum cognition comes to the fore as a burgeoning field of research that capitalizes on the limitations of Bayesian models, providing a compelling alternative. A fundamental difference between Bayesian and quantum models, relevant for the present critique, lies in the fact that quantum models can account for such evidence precisely because they do not adhere to the law of total probability (for a comprehensive comparison between Bayesian and quantum models in cognition, refer to Bruza, Wang, & Busemeyer, 2015).

Continuing with the example discussed above, what differs between the “unconditioned case” and the “conditioned case” is whether the conditioning variable (E) is observed, and this observation is not neutral for the final probability. In other words, the two cases are incompatible and do not admit a mutually consistent classical formalization.

In a quantum framework, violations of the law of total probability are due to interference effects, which occur when the conditioning variables are not observed. In our example, the interference between the two conditions (E1 and E2) implies that they behave like waves, where the interference can be either destructive (canceling) or constructive (resonating), affecting the final probability p(P2). By contrast, when E is observed, the result is compatible with a classical framework, as the act of measuring the mutually exclusive states of experience eliminates their interference (for an overview of the role of interference effects, see Busemeyer & Bruza, 2012).
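For readers who want the interference effect in symbols, the following is a sketch in the form commonly used in the quantum cognition literature, not a derivation given in this commentary. The amplitudes $\psi_{E_i}$, the transition amplitudes $\langle P_2 \mid E_i \rangle$, and the relative phase $\theta$ are notation assumed here, with $|\psi_{E_i}|^2 = p(E_i)$ and $|\langle P_2 \mid E_i \rangle|^2 = p(P_2 \mid E_i)$.

```latex
\begin{align}
p_{\text{observed } E}(P_2)
  &= p(E_1)\,p(P_2 \mid E_1) + p(E_2)\,p(P_2 \mid E_2), \\
p_{\text{unobserved } E}(P_2)
  &= \left| \psi_{E_1}\,\langle P_2 \mid E_1 \rangle
          + \psi_{E_2}\,\langle P_2 \mid E_2 \rangle \right|^2 \\
  &= p_{\text{observed } E}(P_2)
     + \underbrace{2\sqrt{p(E_1)\,p(P_2 \mid E_1)\,p(E_2)\,p(P_2 \mid E_2)}\,\cos\theta}_{\text{interference term } \delta}.
\end{align}
```

With the numbers of the example, $0.29 = 0.59 + \delta$, so $\delta = -0.30$: the interference is destructive. When $\cos\theta = 0$ the interference term vanishes and the classical law of total probability is recovered.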

The Authors wisely avoid claiming that meta-learning is the ultimate solution to every modeling problem, and they contemplate, in what they call “Intricate training processes,” the possibility that “the resulting model [of meta-learning] does not fit the observed data.”

We think that such situations are not just due, as implicitly suggested by the Authors, to the complexity of the scenarios, but to the fact that the tacit probabilistic assumptions of Bayesian models are somehow too restrictive.

In more detail, the plausibility of neurocognitive Bayesian foundations of meta-learning would require stronger justification. Indeed, assuming that prefrontal circuits may constitute a meta-reinforcement learning system (cf. Wang et al., 2018) or, in general terms, that the brain is a Bayesian machine, matching top-down prediction with bottom-up experience (cf. Friston, 2010), would also imply assuming that Kolmogorovian probability applies to the biological realm.

The employment of Bayesian models is institutionalized to such an extent that their foundational assumptions are rarely contested in scientific debates. However, there are situations, like the one discussed here, in which Bayesian models reveal their limitations: Despite aspiring to provide a generalized framework for meta-learning, they inherently harbor very restrictive assumptions in probability theory.

By contrast, quantum models exhibit greater flexibility, are more robust, and can offer a more sophisticated view of the neurocognitive mechanisms involved in human learning (cf. Mastrogiorgio, 2022).

However, quantum cognition is not a tout court alternative to Bayesian models but rather a generalization applicable to cases where the law of total probability is violated. This implies that Bayesian models represent a special case within the broader quantum framework: Quantum models reduce to Bayesian models when experimental evidence aligns with the requirements of the distributivity axiom and the law of total probability.
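The special-case relation can be illustrated with a minimal two-path sketch. It uses the standard interference form from the quantum cognition literature rather than any model specified in this commentary; `p_e1 = 0.5`, the implied `p(P2|E1) = 0.55`, and all names are assumptions made here for illustration, while 0.59, 0.63, and 0.29 come from the example above.

```python
import math


def quantum_probability(p_e1, p_p2_given_e1, p_p2_given_e2, phase):
    """Two-path probability of P2 when E is not observed.

    Classical law of total probability plus an interference term
    2 * sqrt(path1 * path2) * cos(phase). Not every phase is admissible:
    in a full model, unitarity restricts it so the result stays in [0, 1].
    """
    p_e2 = 1.0 - p_e1
    path1 = p_e1 * p_p2_given_e1  # weight of the route through E1
    path2 = p_e2 * p_p2_given_e2  # weight of the route through E2
    classical = path1 + path2
    interference = 2.0 * math.sqrt(path1 * path2) * math.cos(phase)
    return classical + interference


# ASSUMPTION: p(E1) = 0.5 is hypothetical; p(P2|E1) = 0.55 is then the value
# implied by the reported conditioned total of 0.59 and p(P2|E2) = 0.63.
P_E1, P_P2_E1, P_P2_E2 = 0.5, 0.55, 0.63

# Phase pi/2: cos(phase) = 0, no interference, the Bayesian case is recovered.
classical_limit = quantum_probability(P_E1, P_P2_E1, P_P2_E2, math.pi / 2)

# Phase that reproduces the observed unconditioned probability of 0.29.
path_product = (P_E1 * P_P2_E1) * ((1.0 - P_E1) * P_P2_E2)
fitted_phase = math.acos((0.29 - 0.59) / (2.0 * math.sqrt(path_product)))
with_interference = quantum_probability(P_E1, P_P2_E1, P_P2_E2, fitted_phase)
```

Under these assumptions, `classical_limit` matches the conditioned total and `with_interference` matches the unconditioned observation, so a single phase parameter separates the Bayesian special case from the interference case.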

Precisely because we support the Authors’ proposal of employing unconstrained logics in meta-learning, we also believe that a generalized quantum approach is simultaneously more robust and flexible, as it retains all the advantages of the Bayesian framework while avoiding its limitations.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interest

None.

References

Bruza, P. D., Wang, Z., & Busemeyer, J. R. (2015). Quantum cognition: A new theoretical approach to psychology. Trends in Cognitive Sciences, 19, 383–393.

Busemeyer, J. R., & Bruza, P. D. (2012). Quantum models of cognition and decision. Cambridge University Press.

Busemeyer, J. R., & Wang, Z. (2015). What is quantum cognition, and how is it applied to psychology? Current Directions in Psychological Science, 24(3), 163–169.

Busemeyer, J. R., Wang, Z., & Lambert-Mogiliansky, A. (2009). Empirical comparison of Markov and quantum models of decision making. Journal of Mathematical Psychology, 53, 423–433.

Friston, K. (2010). The free-energy principle: A unified brain theory? Nature Reviews Neuroscience, 11(2), 127–138.

Mastrogiorgio, A. (2022). A quantum predictive brain: Complementarity between top-down predictions and bottom-up evidence. Frontiers in Psychology, 13, 869894.

Pothos, E. M., & Busemeyer, J. R. (2022). Quantum cognition. Annual Review of Psychology, 73, 749–778.

Wang, J. X., Kurth-Nelson, Z., Kumaran, D., Tirumala, D., Soyer, H., Leibo, J. Z., … Botvinick, M. (2018). Prefrontal cortex as a meta-reinforcement learning system. Nature Neuroscience, 21(6), 860–868.