Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- Content-how the chapters fit together
- 1 A brief introduction to R
- 2 Styles of data analysis
- 3 Statistical models
- 4 A review of inference concepts
- 5 Regression with a single predictor
- 6 Multiple linear regression
- 7 Exploiting the linear model framework
- 8 Generalized linear models and survival analysis
- 9 Time series models
- 10 Multi-level models and repeated measures
- 11 Tree-based classification and regression
- 12 Multivariate data exploration and discrimination
- 13 Regression on principal component or discriminant scores
- 14 The R system – additional topics
- 15 Graphs in R
- Epilogue
- References
- Index of R symbols and functions
- Index of terms
- Index of authors
- Plate Section
12 - Multivariate data exploration and discrimination
Published online by Cambridge University Press: 05 October 2013
Summary
Earlier chapters have made extensive use of exploratory graphs that examine variables and their pairwise relationships, as a preliminary to regression modeling. Scatterplot matrices have been an important tool, and will be used in this chapter also. The focus now moves from regression to methods that seek insight into the pattern presented by multiple variables. While the methodology has applications in a regression context, this is not a primary focus.
There are a number of methods that project the data onto a low-dimensional space, commonly two dimensions, suggesting “views” of the data that may be insightful. In the absence of other sources of guidance, it is reasonable to begin with the views that these methods suggest. One of the most widely used methods for projecting onto a low-dimensional space is principal components analysis (PCA).
The PCA form of mathematical representation has applications in many contexts beyond those discussed here. As used here, PCA is a special case of a much wider class of multidimensional scaling (MDS) methods. Subsection 12.1.3 is a brief introduction to this wider class of methods.
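One concrete sense in which PCA sits inside the MDS family can be sketched in R. The example below is only an illustration under stated assumptions: it uses the built-in `USArrests` data as a stand-in (not a dataset taken from the chapter) and compares classical metric scaling, via `cmdscale()` on Euclidean distances, with the first two principal component scores from `prcomp()`. The two sets of coordinates should agree up to the sign of each axis.

```r
# Illustrative sketch: USArrests is a stand-in dataset, not one from the chapter.
X <- scale(USArrests)                      # centre and scale each variable

# Scores on the first two principal components.
pc_scores <- prcomp(X)$x[, 1:2]

# Classical (metric) MDS on Euclidean distances between the rows of X.
mds_coords <- cmdscale(dist(X), k = 2)

# Axis signs are arbitrary, so compare absolute values of the coordinates.
max_diff <- max(abs(abs(unname(pc_scores)) - abs(unname(mds_coords))))
print(max_diff)                            # largest absolute discrepancy
```

Other distance measures, or non-metric forms of scaling, would in general give configurations that differ from the PCA view.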
Principal components analysis replaces the input variables by new derived variables, called principal components. The analysis orders the principal components according to the amounts that they contribute to the total of the variances of the original variables.
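The ordering of components by their contribution to the total variance can be sketched with `prcomp()`. This is a minimal illustration rather than the chapter's own analysis: the built-in `USArrests` data is an assumed stand-in, and the choice to scale the variables is made here only so that each contributes unit variance to the total.

```r
# Illustrative sketch: USArrests is a stand-in dataset, not one from the chapter.
pca <- prcomp(USArrests, scale. = TRUE)    # centre and scale each variable

# Variance carried by each principal component, in the order prcomp returns them.
comp_var <- pca$sdev^2
prop_var <- comp_var / sum(comp_var)       # share of the total variance
print(round(prop_var, 3))

# Total of the variances of the (scaled) original variables, for comparison.
total_var <- sum(apply(scale(USArrests), 2, var))

# Scores on the first two components give a two-dimensional view of the data.
scores <- pca$x[, 1:2]
print(head(scores))
```

Calling `plot(scores)` would then display the two-dimensional view, and `summary(pca)` reports the same proportions of variance in tabular form.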
- Type: Chapter
- Information: Data Analysis and Graphics Using R: An Example-Based Approach, pp. 377-409. Publisher: Cambridge University Press. Print publication year: 2010