Book contents
- Frontmatter
- Contents
- Acknowledgments
- 1 Introduction
- 2 Programming and statistical concepts
- 3 Choosing a test statistic
- 4 Random variables and distributions
- 5 More programming and statistical concepts
- 6 Parametric distributions
- 7 Linear model
- 8 Fitting distributions
- 9 Dependencies
- 10 How to get away with peeking at data
- 11 Contingency
- References
- Index
7 - Linear model
Published online by Cambridge University Press: 05 June 2012
Summary
Linear model
Recognizing and quantifying distinctions is fundamental to observing phenomena, hypothesizing explanations, and arguing their differential credibility. Are things different? If so, how different are they? One example is the question: can we distinguish, by their weight, the two groups of fish caught on different dry flies? We have discussed several test statistics relevant to this question. Now we will examine an approach used by classical statistics to address the same basic question. This approach to hypothesis formation is widely used in the published literature of every natural science, and is not uncommon in the more quantitative publications of social science. It will be important for you to understand it, and possibly even use it, in your own work.
The approach is to hypothesize a non-probabilistic causal mechanism; any variation in the observed data that this mechanism does not account for is attributed to error. This error is construed as random and is quantified in a particular way that was convenient when people had to compute without computers. The non-probabilistic hypothesis is actually one member of a whole family of such hypotheses, called the linear model. From this family, we choose the particular member that minimizes the unexplained variability in the data, that is, the variability attributed to error. The result is a description of the variability in your data, and several candidates for test statistics.
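As a minimal sketch of choosing the member of the family that minimizes unexplained variability, the Python below fits a straight line to the fish example by least squares. It assumes the error is quantified as the sum of squared residuals (the usual least-squares convention, which the summary does not name explicitly). The weights are made-up illustrative numbers, not data from the book, and the names `fit_line` and `residual_sum_of_squares` are hypothetical helpers introduced only for this illustration.

```python
def fit_line(x, y):
    """Return the intercept b0 and slope b1 of the least-squares line y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


def residual_sum_of_squares(x, y, b0, b1):
    """Variability left unexplained by the line y = b0 + b1*x."""
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))


# Made-up fish weights (illustrative only), one list per dry fly.
fly_a = [1.0, 2.0, 3.0]
fly_b = [3.0, 4.0, 5.0]

# Code the group as 0 (fly A) or 1 (fly B); the "mechanism" is then
# weight = b0 + b1 * group, and everything else is attributed to error.
x = [0] * len(fly_a) + [1] * len(fly_b)
y = fly_a + fly_b

b0, b1 = fit_line(x, y)
unexplained = residual_sum_of_squares(x, y, b0, b1)

# Total variability about the overall mean, for comparison.
mean_y = sum(y) / len(y)
total = sum((yi - mean_y) ** 2 for yi in y)

print("intercept (mean weight, fly A):", b0)
print("slope (difference between group means):", b1)
print("unexplained sum of squares:", unexplained)
print("fraction of variability explained:", 1 - unexplained / total)
```

With a 0/1 group code, the fitted intercept is the mean weight of the first group and the slope is the difference between the two group means. Any other choice of intercept and slope gives a larger residual sum of squares, which is the sense in which this member of the family is the one chosen.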
- Type: Chapter
- Publisher: Cambridge University Press
- Print publication year: 2011