Book contents
- Frontmatter
- Contents
- List of Figures
- List of Tables
- List of Boxes
- Acknowledgments
- 1 Introduction
- PART I R AND BASIC STATISTICS
- 2 Introduction to R
- 3 Looking at Data – Numerical Summaries
- 4 Looking at Data – Tables
- 5 Looking at Data – Graphs
- 6 Transformations
- 7 Missing Values
- 8 Confidence Intervals and Hypothesis Testing
- 9 Relating Variables
- PART II MULTIVARIATE METHODS
- PART III ARCHAEOLOGICAL APPROACHES TO DATA
- References
- Index
8 - Confidence Intervals and Hypothesis Testing
from PART I - R AND BASIC STATISTICS
Published online by Cambridge University Press: 22 July 2017
Summary
When we plotted the dart point lengths, some types seemed similar to one another and some seemed different. Under most circumstances, we assume that the data we are analyzing is a sample of a larger population to which we do not have access. When we computed the mean and the standard deviation of length for different point types, we were computing sample statistics, values that characterize the distribution of values in the sample. If we increased the sample by adding more points, the statistics would change somewhat. If we could collect all of the points of a particular type and compute the mean and the standard deviation, those values would be parameters: they would describe the entire population of that point type.
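The distinction between sample statistics and population parameters can be illustrated with a small simulation. The sketch below is in Python using only the standard library; the population of lengths is simulated, and the mean of 50 mm, the standard deviation of 8 mm, and the name `sample_statistics` are all assumptions made for illustration, not values or functions from the chapter.

```python
import random
import statistics

# Hypothetical population: 10,000 simulated dart point lengths in mm.
# These are NOT real measurements; the mean and sd are assumed for illustration.
rng = random.Random(42)
population = [rng.gauss(50.0, 8.0) for _ in range(10_000)]

# Parameters describe the entire population.
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)


def sample_statistics(n, rng):
    """Return the mean and standard deviation of a random sample of size n."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample), statistics.stdev(sample)


# Statistics computed from samples vary from draw to draw and with sample size.
for n in (10, 40, 160):
    mean, sd = sample_statistics(n, rng)
    print(f"n={n:4d}  sample mean={mean:6.2f}  sample sd={sd:5.2f}")

print(f"population mean={pop_mean:6.2f}  population sd={pop_sd:5.2f}")
```

Running the loop with different seeds shows the sample values shifting around the fixed population values, with larger samples tending to land closer to them.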
Of course, we cannot hope to find all of the points of a particular type, or all of the pots of a particular type. We never have more than a sample to work with, but we would like to estimate the population parameters on the basis of a sample. Since two samples yield only estimates of the population parameters, it also makes sense to ask whether the two samples are part of the same population or come from two different populations. It is only when we are looking at a part of the whole that we have to consider whether the statistics computed from the sample are representative of the population as a whole. One of the goals of inferential statistics is to formalize the concepts of similar and dissimilar in terms of probability, and this leads to the concepts of confidence intervals and hypothesis testing.
Confidence intervals provide a range around a statistic, constructed so that if we drew many samples, a specified percentage of the confidence intervals computed from those samples would include the population value. Hypothesis testing allows us to assign a probability to observing a difference as large as the one between two samples if they were in fact drawn from the same population (or from populations with the same parameters).
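The "many samples" interpretation of a confidence interval can be checked by simulation. The following is a minimal sketch, assuming a normally distributed population with a known mean and a large-sample normal approximation for the interval (not the t-based interval usually used for small samples). The function names and all numeric settings are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def mean_confidence_interval(sample, level=0.95):
    """Large-sample (normal approximation) confidence interval for a mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5   # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided critical value
    return mean - z * se, mean + z * se


def coverage(true_mean, true_sd, n, trials, level=0.95, seed=1):
    """Fraction of simulated samples whose interval contains the true mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n)]
        low, high = mean_confidence_interval(sample, level)
        if low <= true_mean <= high:
            hits += 1
    return hits / trials


# Assumed population values; the printed fraction should sit near the chosen level.
print(coverage(true_mean=50.0, true_sd=8.0, n=100, trials=2000))
```

Any single interval either contains the population mean or does not; the stated percentage describes how often the procedure succeeds across repeated samples.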
Classical inferential statistics often depends upon the normal or Gaussian distribution to determine those probabilities. These methods are generally referred to as parametric statistics.
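As one illustration of how the normal distribution supplies these probabilities, the sketch below compares the means of two samples with a large-sample z-test. This is an assumed, simplified stand-in for the parametric tests the chapter covers. The function name `two_sample_z_test` and the simulated lengths for two point types are hypothetical.

```python
import random
import statistics
from statistics import NormalDist


def two_sample_z_test(a, b):
    """Return (z, two-sided p-value) for a difference in means.

    Uses a large-sample normal approximation with separate variances.
    """
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


# Simulated lengths (mm) for two hypothetical point types; not real data.
rng = random.Random(7)
type_a = [rng.gauss(50.0, 8.0) for _ in range(60)]
type_b = [rng.gauss(60.0, 8.0) for _ in range(60)]

z, p = two_sample_z_test(type_a, type_b)
print(f"z = {z:.2f}, p = {p:.4g}")
```

A small p-value indicates that a difference this large would rarely arise if both samples came from populations with the same mean. The approximation is reasonable only when both samples are fairly large.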
- Quantitative Methods in Archaeology Using R, pp. 159-189
- Publisher: Cambridge University Press
- Print publication year: 2017