Exploratory Data Analysis: Data Visualization or Torture?

Published online by Cambridge University Press: 02 January 2015

Mark A. Shelly*
Affiliation:
Highland Hospital and the University of Rochester, Rochester, NY
* Highland Hospital of Rochester, 1000 South Ave, Box 45, Rochester, NY 14620

Abstract

Exploratory Data Analysis offers a set of graphical and statistical tools for drawing the full meaning out of data sets. With these tools, the user visualizes, analyzes, and transforms data distributions. Graphs reveal relationships between variables; the residuals left after fitting data show the adequacy of the model. Without this careful examination and understanding of the data, rote analysis using standard statistical tests can give misleading results. Exploratory Data Analysis has its own pitfalls and must be used together with confirmatory statistics and studies. The increasing power and resolution of personal computers enable modern statistical software to make these methods widely accessible. Because the analyst can move easily between data and their graphic representation, analysis can be comprehensive without being tedious. Exploratory Data Analysis can add an exciting and useful tool to the epidemiologist's repertoire. This article illustrates several tools from an evolving list.

Type
Statistics for Hospital Epidemiology
Copyright
Copyright © The Society for Healthcare Epidemiology of America 1996
