Modeling Vision

doi:10.1017/9781108755610.039

34 - Modeling Vision

from Part IV - Computational Modeling in Various Cognitive Fields

Published online by Cambridge University Press: 21 April 2023

Lukas Vogelsang and Pawan Sinha

Edited by Ron Sun (Rensselaer Polytechnic Institute, New York)


Summary

Vision is one of the most complex proficiencies we possess, but its underpinnings are still shrouded in mystery. Many great scientific minds have been engaged in the enterprise of modeling vision. This chapter takes a look at some of the history of this effort, stretching from the times of the ancient Greeks to recent developments in neural networks, and discusses how current techniques may play a role in furthering our understanding of vision.

Keywords

visual system; history of vision; development; modeling; deep neural networks

Type: Chapter
Information: The Cambridge Handbook of Computational Cognitive Sciences, pp. 1113–1134

DOI: https://doi.org/10.1017/9781108755610.039

Publisher: Cambridge University Press

Print publication year: 2023

