
“Switching” between fast and slow processes is just reward-based branching

Published online by Cambridge University Press:  18 July 2023

George Ainslie*
Affiliation:
Department of Veterans Affairs, Coatesville, PA, USA [email protected] www.picoeconomics.org

Abstract

Shortcuts to goals are rewarded by faster attainment and punished by more frequent failure, so selection of the various kinds – heuristics, cached sequences (habits or macros), gut instincts – depends on reward history just like other kinds of choice. The speeds of shortcuts lie on continua along with speeds of deliberation, and these continua have no obvious separation points.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

This target article (TA) follows on De Neys's recent proposal “that trying to answer the core single vs dual process model debate is pointless for empirical scientists” so it is “time to move on” (De Neys, 2021, Introduction). What he proposes in the TA is “a more viable dual process architecture” (target article Abstract), which is “orthogonal [to] whether the difference between the two types of processing should be conceived as merely quantitative or qualitative.” Nevertheless, he argues for two qualitatively different processes, perhaps characterized by the 14 different properties he listed in the earlier article (fast, effortless, affective, automatic… vs. slow, effortful, affectless, controlled…; De Neys, 2021, p. 4), which he calls here simply fast and slow. He demonstrates the flaws in dual-process theories' usual assumptions: that the two processes must operate separately (“exclusivity”), and that there must be a “switch feature… by which a reasoner can decide to shift between more intuitive and deliberate processing”; but he pulls back from other authors' proposal that “we simply abandon the dual process enterprise.” His refutation of the authors who have favored a single, quantitatively based decision process is just to point out that “some responses require more deliberation than others,” which would not seem to require a dichotomy.

Dual-process models have admittedly been popular over the years, beginning with Plato's wild versus well-behaved chariot horses. In addition to De Neys's fast versus slow examples, choice making has been described as passionate versus reasonable, impulsive versus reflective, myopic versus far-sighted, hot versus cool, and model-free versus model-based, among others. De Neys also includes as fast the products of “automatization,” by which repeated sequences of choices “will be elicited intuitively.” In addition, brain imaging has found evidence for steep-discounting versus shallow-discounting brain centers (McClure et al., 2004; van den Bos & McClure, 2013).

However, as De Neys himself concludes, there is no operation that system 2 can perform that system 1 cannot, and “thinking always involves a continuous interaction between system 1 and system 2 application” (target article, sect. 3.4, para. 5). Other authors have pointed out obvious problems with the dual approach: If there are distinct systems, there must be more than two of them, because the properties attributed to the two systems do not reliably occur together (Zbrodoff & Logan, 1986); in particular, automatic processes may or may not be affectively arousing (Ainslie, 2021). Furthermore, the listed properties such as effort, affect, and speed itself are themselves continua. If the two whole lists of properties really constitute discrete systems, there should be natural breaks in the continua from fast to slow, and the breaks should occur at equivalent levels in the n dimensions. As a negative example, the only obvious break in transparency would be too-fast-to-introspect versus not-too-fast-to-introspect, which would not define different kinds.

Most “type 1” processing in humans comprises sequences that have been automatized, macros (or habits) that call up other macros. In language, a squiggly line is interpreted as a letter, a sequence of letters is interpreted as a word, a series of words forms a concept (or cliché). All highly automatized; but if I was to find an anomaly – no, it should be “were to find an anomaly” – my ear would be quick to re-set it. This should not require a distinct system. Even if I stopped to ponder the use of the subjunctive, I would just be trying out sequences I had previously automatized. Likewise, as my calculation proceeds from 2 + 2 through, say 8 + 8, to 64 + 64, and so forth, at some points my mind will pause to find component automatizations; but is there a point where the pause divides two systems?
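
The arithmetic example can be made concrete with a toy sketch. It is only an illustration under stated assumptions, not a model proposed in the target article or in this commentary: it assumes that single-digit sums are the automatized components, that larger sums are assembled column by column from those components, and that a rehearsed whole problem can itself become a cached macro. The function name `add_by_macros` and the step count are hypothetical.

```python
def add_by_macros(a, b, cache):
    """Add two non-negative integers, counting retrievals of cached sums.

    `cache` maps (a, b) pairs to sums that are treated as already automatized.
    It is assumed to contain at least every single-digit pair.
    Returns (sum, number_of_retrievals).
    """
    if (a, b) in cache:
        return cache[(a, b)], 1          # the whole problem is one macro
    # No whole-problem macro: work column by column, reusing cached facts.
    total, place, carry, steps = 0, 1, 0, 0
    while a or b or carry:
        column = cache[(a % 10, b % 10)] + carry
        steps += 1                       # one pause to fetch a component
        total += (column % 10) * place
        carry = column // 10
        a, b, place = a // 10, b // 10, place * 10
    return total, steps


# Assumed starting repertoire: all single-digit sums are automatized.
facts = {(i, j): i + j for i in range(10) for j in range(10)}

for n in (2, 8, 64, 512):
    print(n, add_by_macros(n, n, facts))

# After rehearsal, 64 + 64 is itself cached and needs a single retrieval.
rehearsed = dict(facts)
rehearsed[(64, 64)] = 128
print(add_by_macros(64, 64, rehearsed))
```

In this sketch the number of pauses grows by degrees (one retrieval for 2 + 2 and 8 + 8, three for 64 + 64, four for 512 + 512) and falls back to one once a problem has been rehearsed, with no step at which a different mechanism takes over.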

The strongest case for separate processes might be based on the activities of separate sites in the brain, but even here true separation is doubtful. The dorsolateral striatum (putamen) is differentially active when repeated connections have been cached to form macros, whereas the dorsomedial striatum (caudate) is more active during flexible behavior; but their functioning has been observed to be integrally combined (Dolan & Dayan, 2013; Keramati et al., 2016). Similarly, the existence of separate steep and shallow reward discount centers in the brain is controversial (Kable & Glimcher, 2007; Lempert et al., 2019). If there do exist anatomically separate response-selection systems in the brain, the best candidates would be those for motivational salience and (supposedly separate) reward, governing the attraction of attention and behavioral approach/avoidance, respectively (Berridge & Robinson, 1998). But even here, salience and behavior selection are correlated with activity in mostly the same brain regions (Kim et al., 2021); and when even threatening stimuli are voluntarily gated out, attention to them must have been weighed in the common marketplace of reward (see Ainslie, 2009).

The professed scope of the TA's model is universal, but except for its reference to cupcakes its examples are cognitive searches for correct solutions to puzzles, rather than choices among competing rewards. Accordingly, “the peak activation strength of an intuition reflects how automatized or instantiated the underlying knowledge structures are (i.e., how strongly it is tied to its eliciting stimulus).” This rather Pavlovian convention hampers the model's application to goal-directed activities. By contrast, it is feasible to model the selection of all learnable processes which can replace each other using the amount and timing of their contingent reward (Ainslie, 2017). The sources of reward – consumption goods, ethical goods, social cues, puzzle solutions, signal detections, emotions, the satisfaction of urges – as well as their speeds of onset, are miscellaneous. It should not matter that some of their subroutines involve particular parts of the brain (for instance the amygdala – Aquino et al., 2020 – or hippocampus – Gauthier & Tank, 2018), as long as their weights are ultimately comparable to each other. Likewise, the weighing process may or may not involve a specific site, such as the orbitofrontal cortex (Bartra et al., 2013; Levy & Glimcher, 2012), a set of interacting sites (Krönke et al., 2020), or no identifiable dwelling place (Dohmatob, Dumas, & Bzdok, 2020). In reward research, the adoption of millisecond-specific electroencephalography (for instance, Sambrook et al., 2018) promises to give precise evidence about branching to fast, slow, and intermediate processes.
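
Selection by the amount and timing of contingent reward, as described in this paragraph, can be sketched in a few lines. The sketch is illustrative only and is not the model of Ainslie (2017): it assumes the standard hyperbolic discount form, value = amount / (1 + k × delay), weighted by each process's historical success rate. The class and function names, the discount parameter `k`, and every numeric value are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    name: str
    amount: float      # reward if the process succeeds (arbitrary units)
    delay: float       # time until the reward arrives (arbitrary units)
    p_success: float   # assumed historical success rate of this process


def discounted_value(proc, k):
    """Expected reward, hyperbolically discounted: p * A / (1 + k * D)."""
    return proc.p_success * proc.amount / (1.0 + k * proc.delay)


def choose(processes, k):
    """Pick whichever process currently has the highest discounted value."""
    return max(processes, key=lambda p: discounted_value(p, k))


# Hypothetical candidates spread along a continuum of speeds:
# faster processes arrive sooner but fail more often.
candidates = [
    Process("gut instinct", amount=10.0, delay=0.1, p_success=0.50),
    Process("heuristic", amount=10.0, delay=0.5, p_success=0.70),
    Process("cached habit", amount=10.0, delay=1.0, p_success=0.85),
    Process("deliberation", amount=10.0, delay=5.0, p_success=0.99),
]

# k is varied here only to show how the preferred process shifts.
for k in (0.01, 0.3, 1.0, 5.0):
    print(k, choose(candidates, k).name)
```

With these made-up numbers the winner moves from deliberation through the cached habit and the heuristic to the gut instinct as `k` rises, so the "switch" is just the outcome of a single comparison of discounted rewards rather than a separate mechanism.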

Financial support

This material is the result of work supported with resources and the use of facilities at the Department of Veterans Affairs Medical Center, Coatesville, PA, USA. The opinions expressed are not those of the Department of Veterans Affairs or of the US Government.

Competing interest

None.

References

Ainslie, G. (2009). Pleasure and aversion: Challenging the conventional dichotomy. Inquiry: A Journal of Medical Care Organization, Provision and Financing, 52(4), 357–377. http://dx.doi.org/10.1080/00201740903087342
Ainslie, G. (2017). De gustibus disputare: Hyperbolic delay discounting integrates five approaches to choice. Journal of Economic Methodology, 24(2), 166–189. http://dx.doi.org/10.1080/1350178X.2017.1309748
Ainslie, G. (2021). Reply to commentaries to “willpower with and without effort.” Behavioral and Brain Sciences, 44, E57. https://doi.org/10.1017/s0140525x21000029
Aquino, T. G., Minxha, J., Dunne, S., Ross, I. B., Mamelak, A. N., Rutishauser, U., & O'Doherty, J. P. (2020). Value-related neuronal responses in the human amygdala during observational learning. Journal of Neuroscience, 40(24), 4761–4772. https://doi.org/10.1523/JNEUROSCI.2897-19.2020
Bartra, O., McGuire, J. T., & Kable, J. W. (2013). The valuation system: A coordinate-based meta-analysis of BOLD fMRI experiments examining neural correlates of subjective value. NeuroImage, 76, 412–427. https://doi.org/10.1016/j.neuroimage.2013.02.063
Berridge, K. C., & Robinson, T. (1998). What is the role of dopamine in reward: Hedonic impact, reward learning, or incentive salience. Brain Research Reviews, 28, 309–369. https://doi.org/10.1016/S0165-0173(98)00019-8
De Neys, W. (2021). On dual- and single-process models of thinking. Perspectives on Psychological Science, 16(6), 1412–1427. https://doi.org/10.1177/1745691620964172
Dohmatob, E., Dumas, G., & Bzdok, D. (2020). Dark control: The default mode network as a reinforcement learning agent. Human Brain Mapping, 41(12), 3318–3341. https://doi.org/10.1002/hbm.25019
Dolan, R. J., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–325. https://doi.org/10.1016/j.neuron.2013.09.007
Gauthier, J. L., & Tank, D. W. (2018). A dedicated population for reward coding in the hippocampus. Neuron, 99(1), 179–193. https://doi.org/10.1016/j.neuron.2018.06.008
Kable, J. W., & Glimcher, P. W. (2007). The neural correlates of subjective value during intertemporal choice. Nature Neuroscience, 10, 1625–1633. https://doi.org/10.1038/nn2007
Keramati, M., Smittenaar, P., Dolan, R. J., & Dayan, P. (2016). Adaptive integration of habits into depth-limited planning defines a habitual-goal-directed spectrum. Proceedings of the National Academy of Sciences, 113(45), 12868–12873. https://doi.org/10.1073/pnas.1609094113
Kim, H., Nanavaty, N., Ahmed, H., Mathur, V. A., & Anderson, B. A. (2021). Motivational salience guides attention to valuable and threatening stimuli: Evidence from behavior and functional magnetic resonance imaging. Journal of Cognitive Neuroscience, 33(12), 2440–2460. https://doi.org/10.1162/jocn_a_01769
Krönke, K. M., Wolff, M., Shi, Y., Kräplin, A., Smolka, M. N., Bühringer, G., & Goschke, T. (2020). Functional connectivity in a triple-network saliency model is associated with real-life self-control. Neuropsychologia, 149, 107667. https://doi.org/10.1016/j.neuropsychologia.2020.107667
Lempert, K. M., Steinglass, J. E., Pinto, A., Kable, J. W., & Simpson, H. B. (2019). Can delay discounting deliver on the promise of RDoC? Psychological Medicine, 49(2), 190–199. https://doi.org/10.1017/S0033291718001770
Levy, D. J., & Glimcher, P. W. (2012). The root of all value: A neural common currency for choice. Current Opinion in Neurobiology, 22(6), 1027–1038. https://doi.org/10.1016/j.conb.2012.06.001
McClure, S. M., Laibson, D. I., Loewenstein, G., & Cohen, J. D. (2004). The grasshopper and the ant: Separate neural systems value immediate and delayed monetary rewards. Science (New York, N.Y.), 306, 503–507. https://doi.org/10.1126/science.1100907
Sambrook, T. D., Hardwick, B., Wills, A. J., & Goslin, J. (2018). Model-free and model-based reward prediction errors in EEG. NeuroImage, 178, 162–171. https://doi.org/10.1016/j.neuroimage.2018.05.023
van den Bos, W., & McClure, S. M. (2013). Towards a general model of temporal discounting. Journal of the Experimental Analysis of Behavior, 99(1), 58–73. https://doi.org/10.1002/jeab.6
Zbrodoff, N. J., & Logan, G. D. (1986). On the autonomy of mental processes: A case study of arithmetic. Journal of Experimental Psychology: General, 115(2), 118. https://doi.org/10.1037/0096-3445.115.2.118