A deep new look at color
Published online by Cambridge University Press: 06 December 2023
Abstract
Bowers et al. argue against deep neural networks (DNNs) as good models of human visual perception. From our color perspective, we feel their view rests on three misconceptions: a misrepresentation of the state of the art in color perception; a mistaken view of the type of model required to move the field forward; and the attribution to DNN research of shortcomings that are already being resolved.
Type: Open Peer Commentary
Copyright © The Author(s), 2023. Published by Cambridge University Press
References
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94(2), 115–147. https://doi.org/10.4324/9781351156288-24
Conway, B. R. (2018). The organization and operation of inferior temporal cortex. Annual Review of Vision Science, 4(1), 381–402. https://doi.org/10.1146/annurev-vision-091517-034202
de Vries, J. P., Akbarinia, A., Flachot, A., & Gegenfurtner, K. R. (2022). Emergent color categorization in a neural network trained for object recognition. eLife, 11, e76472. https://doi.org/10.7554/eLife.76472
Flachot, A., Akbarinia, A., Schütt, H. H., Fleming, R. W., Wichmann, F. A., & Gegenfurtner, K. R. (2022). Deep neural models for color classification and color constancy. Journal of Vision, 22(4), 1–24. https://doi.org/10.1167/jov.22.4.17
Flachot, A., & Gegenfurtner, K. R. (2018). Processing of chromatic information in a deep convolutional neural network. Journal of the Optical Society of America A, 35(4), B334. https://doi.org/10.1364/josaa.35.00b334
Flachot, A., & Gegenfurtner, K. R. (2021). Color for object recognition: Hue and chroma sensitivity in the deep features of convolutional neural networks. Vision Research, 182, 89–100. https://doi.org/10.1016/j.visres.2020.09.010
Garg, A. K., Li, P., Rashid, M. S., & Callaway, E. M. (2019). Color and orientation are jointly coded and spatially organized in primate primary visual cortex. Science, 364(6447), 1275–1279. https://doi.org/10.1126/science.aaw5868
Gegenfurtner, K. R. (2003). Cortical mechanisms of colour vision. Nature Reviews Neuroscience, 4(7), 563–572. https://doi.org/10.1038/nrn1138
Gegenfurtner, K. R., & Kiper, D. C. (2003). Annual Review of Neuroscience, 26, 181–206. https://doi.org/10.1146/annurev.neuro.26.041002.131116
Hansen, T., & Gegenfurtner, K. R. (2009). Independence of color and luminance edges in natural scenes. Visual Neuroscience, 26(1), 35–49. https://doi.org/10.1017/S0952523808080796
Hansen, T., & Gegenfurtner, K. R. (2017). Color contributes to object-contour perception in natural scenes. Journal of Vision, 17(3), 1–19. https://doi.org/10.1167/17.3.14
Krakauer, J. W., Ghazanfar, A. A., Gomez-Marin, A., MacIver, M. A., & Poeppel, D. (2017). Neuroscience needs behavior: Correcting a reductionist bias. Neuron, 93(3), 480–490. https://doi.org/10.1016/j.neuron.2016.12.041
Livingstone, M. S., & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. Journal of Neuroscience, 7(11), 3416–3468. https://doi.org/10.1523/jneurosci.07-11-03416.1987
Ponting, S., Morimoto, T., & Smithson, H. (2023). Modelling surface color discrimination under different lighting environments using image chromatic statistics and convolutional neural networks. Journal of the Optical Society of America A, 40(3), A149–A159. https://doi.org/10.1364/josaa.479986
Rafegas, I., & Vanrell, M. (2018). Color encoding in biologically-inspired convolutional neural networks. Vision Research, 151, 7–17. https://doi.org/10.1016/j.visres.2018.03.010
Rafegas, I., Vanrell, M., Alexandre, L. A., & Arias, G. (2020). Understanding trained CNNs by indexing neuron selectivity. Pattern Recognition Letters, 136, 318–325.
Shamay-Tsoory, S. G., & Mendelsohn, A. (2019). Real-life neuroscience: An ecological approach to brain and behavior research. Perspectives on Psychological Science, 14(5), 841–859. https://doi.org/10.1177/1745691619856350
Shapley, R., & Hawken, M. J. (2011). Color in the cortex: Single- and double-opponent cells. Vision Research, 51(7), 701–717. https://doi.org/10.1016/j.visres.2011.02.012
Siuda-Krzywicka, K., & Bartolomeo, P. (2020). What cognitive neurology teaches us about our experience of color. Neuroscientist, 26(3), 252–265. https://doi.org/10.1177/1073858419882621
Witzel, C., & Gegenfurtner, K. R. (2018). Color perception: Objects, constancy, and categories. Annual Review of Vision Science, 4, 475–499.
Wright, A. A., & Cumming, W. W. (1971). Color-naming functions for the pigeon. Journal of the Experimental Analysis of Behavior, 15(1), 7–17.
One of the main arguments that Bowers et al. put forth is that deep neural networks (DNNs) classify objects in a fundamentally different manner from humans. However, what Bowers et al. promote as the state of the art in color processing, namely a strict segregation of visual streams for color and shape (Livingstone & Hubel, 1987), is outdated and has repeatedly been rejected (see Garg, Li, Rashid, & Callaway, 2019; Gegenfurtner & Kiper, 2003; Shapley & Hawken, 2011). The fact that line drawings can be recognized quickly does not imply that object processing in humans does not rely on color. On the contrary, boundaries defined by color appear essential for image segmentation in humans (Hansen & Gegenfurtner, 2009, 2017; Shapley & Hawken, 2011). Moreover, the view on how color is represented in the brain has evolved from one of a single color center to one where color-biased regions are found throughout the ventral stream (e.g., Conway, 2018; Gegenfurtner, 2003). While classical algebraic models of color vision have been highly successful in explaining the processing in the cones and color-opponent stages in the eye, higher level cortical processing is still not well understood (for a recent review, see Siuda-Krzywicka & Bartolomeo, 2020). All the evidence points toward an integral role for color in extracting objects, and this perfectly matches the emphasis that DNNs place on objects rather than isolated features.
Bowers et al. emphasize the lack of experimental rigor in testing DNNs compared to testing humans. While we largely agree, it is also important to consider the limitations of a myopic drive to constrain experiments to single-feature manipulations. Reductionist experimental methodology in human research typically diverges greatly from our natural experiences. It biases research toward investigating cognitive functions not necessarily at the core of how our system operates in daily life (e.g., Shamay-Tsoory & Mendelsohn, 2019). While Biederman's (1987) geons may be sufficient for recognizing isolated objects in stereotypical configurations, in daily scenes objects appear in countless varying states (e.g., a cat curled up in a ball on the couch) and other features will gain importance. To avoid a reductionist bias, neuroscientific models need to be grounded in behavior natural to the organism (Krakauer, Ghazanfar, Gomez-Marin, MacIver, & Poeppel, 2017). DNNs may seem misplaced within a classical approach of model-based hypothesis testing, where strongly reductive process models are defined explicitly to test a single hypothesis. However, DNNs are highly suitable for studying how behavior shapes underlying mechanisms. By observing emerging properties in the context of learning specific tasks and manipulating input, we can develop improved hypotheses on why colors are represented the way that they are in the human visual system. Subsequently, as DNNs allow for comparisons at many levels, from single trials in psychophysics or electrophysiology experiments up to derived mental representations in a human observer, they make it possible to study how these mechanisms may be implemented.
Naturally, without considerable overlap with the human visual system we would not consider DNNs adequate models of human vision. However, Bowers et al. focus strongly on discrepancies between humans and DNNs, but neglect important overlap. For color, properties of artificial neurons show great overlap with those in primate visual cortex: Many neurons exhibit double-opponent receptive fields (Flachot & Gegenfurtner, 2018, 2021; Rafegas, Vanrell, Alexandre, & Arias, 2020; Rafegas & Vanrell, 2018), and a moderate functional segregation between color and achromatic information was found at the early stages, corresponding to retino-geniculate processing (Flachot & Gegenfurtner, 2018). On a higher level, DNNs were also shown to outperform classical models in identifying regions of the objects that are highly predictive of human behavioral patterns when discriminating color of naturalistic objects (Ponting, Morimoto, & Smithson, 2023). Moreover, qualitative similarities have been uncovered between DNNs and human participants in color constancy experiments where individual cues known to affect human color constancy were manipulated (Flachot et al., 2022). Finally, in our efforts to uncover why humans adopt a categorical representation of color, we found that DNNs trained specifically for object recognition incorporate a categorical representation of color that is highly similar to that of humans (de Vries, Akbarinia, Flachot, & Gegenfurtner, 2022).
In that study, we strongly focused on translating psychophysical methods to DNNs. We created a match-to-sample task inspired by work on color categorization in pigeons (Wright & Cumming, 1971) using controlled stimuli and, in secondary experiments, validated their use in the DNN. We also translated the concept of categorical color perception (where colors from different categories are distinguished faster than those from the same category) to the DNN. Our study on color constancy, mentioned above, also included the typical manipulations found in psychophysical studies on the issue. Finally, the studies on neural color tuning translated methods from single-cell recordings in nonhuman primates to DNNs. Together, this introduces important tools to move beyond purely correlational human–DNN comparisons and to investigate where the DNN is similar to the human visual system and where it deviates. Carefully designed experiments allow for collecting response patterns from DNNs through which richer human–DNN comparisons are possible. Notably, our findings are not purely correlational in nature. For example, to establish whether object recognition was important to finding a categorical representation of color, we contrasted an object-trained DNN with one trained to distinguish natural from man-made scenes. Importantly, human-like color categories were only found for the object task, indicating that object learning may be crucial in shaping human color categories.
Color and object processing are intricately connected (e.g., Conway, 2018; Witzel & Gegenfurtner, 2018), and understanding color perception will require a model that takes objects into account. DNNs enable us to investigate under what circumstances color phenomena arise and to inspect how they are implemented. Naturally, shortcomings, such as a reliance on correlation-based comparisons and strong divergences from human object processing, should be addressed. However, we believe these shortcomings will prove predominantly temporary in nature, as they are already being taken into account in several recent studies. As such, where Bowers et al. take issue with using object-trained DNNs, we see opportunity.
Financial support
This study is funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – project number 222641018 – SFB/TRR 135 TP C2, and European Research Council Advanced Grant Color 3.0 (884116). A. F. is funded by a VISTA postdoctoral fellowship. T. M. is supported by a Sir Henry Wellcome Postdoctoral Fellowship from Wellcome Trust and a Junior Research Fellowship from Pembroke College, University of Oxford.
Competing interest
None.