The Philosophy of Exploratory Data Analysis

Published online by Cambridge University Press: 01 April 2022

I. J. Good*
Affiliation: Statistics Department, Virginia Polytechnic Institute and State University

Abstract

This paper attempts to define Exploratory Data Analysis (EDA) more precisely than usual, and to produce the beginnings of a philosophy of this topical and somewhat novel branch of statistics.

A data set is, roughly speaking, a collection of k-tuples for some k. In both descriptive statistics and EDA, these k-tuples, or functions of them, are represented in a manner matched to human and computer abilities with a view to finding patterns that are not “kinkera”. A kinkus is a pattern that has a negligible probability of being even partly potentially explicable. A potentially explicable pattern is one for which there probably exists a hypothesis of adequate “explicativity”, which is another technical probabilistic concept. A pattern can be judged to be probably potentially explicable even if we cannot find an explanation. The theory of probability understood here is one of partially ordered (interval-valued), subjective (personal) probabilities. Among other topics relevant to a philosophy of EDA are the “reduction” of data; Francis Bacon's philosophy of science; the automatic formulation of hypotheses; successive deepening of hypotheses; neurophysiology; and rationality of type II.

Type
Research Article
Copyright
Copyright © 1983 by the Philosophy of Science Association

Footnotes

I am grateful to John W. Pratt for some useful criticisms. This work was supported in part by N.I.H. Grant R01-GM18770.
