Book contents
- Frontmatter
- Contents
- Preface
- List of abbreviations
- 1 Basic notions in classical data analysis
- 2 Linear multivariate statistical analysis
- 3 Basic time series analysis
- 4 Feed-forward neural network models
- 5 Nonlinear optimization
- 6 Learning and generalization
- 7 Kernel methods
- 8 Nonlinear classification
- 9 Nonlinear regression
- 10 Nonlinear principal component analysis
- 11 Nonlinear canonical correlation analysis
- 12 Applications in environmental sciences
- Appendices
- References
- Index
8 - Nonlinear classification
Published online by Cambridge University Press: 04 May 2010
Summary
So far, we have used NN and kernel methods for nonlinear regression. However, when the output variables are discrete rather than continuous, the regression problem turns into a classification problem. For instance, instead of a prediction of the wind speed, the public may be more interested in a forecast of either ‘storm’ or ‘no storm’. There can be more than two classes for the outcome – for seasonal temperature, forecasts are often issued as one of three classes: ‘warm’, ‘normal’ and ‘cool’ conditions. Issuing just a forecast of one of the three conditions (e.g. ‘cool conditions for next season’) is often not as informative to the public as issuing posterior probability forecasts. For instance, a ‘25% chance of warm, 30% normal and 45% cool’ forecast is quite different from a ‘5% warm, 10% normal and 85% cool’ forecast, even though both are forecasting cool conditions. We will examine methods which choose one out of k classes, and methods which issue posterior probabilities over k classes. In cases where a class is not precisely defined, clustering methods are used to group the data points into clusters.
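To illustrate the distinction between a hard class choice and posterior probabilities, here is a minimal sketch: a softmax function converts the raw outputs of a model for the three seasonal classes into posterior probabilities, from which a single-class forecast can also be read off. The logit values are made up for illustration.

```python
import numpy as np

def softmax(z):
    """Convert raw model outputs (logits) into posterior probabilities."""
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

classes = ["warm", "normal", "cool"]

# Hypothetical raw outputs of a model for one seasonal forecast
logits = np.array([0.2, 0.4, 1.2])

probs = softmax(logits)
for c, p in zip(classes, probs):
    print(f"{c}: {p:.2f}")

# Hard forecast: simply pick the most probable class
print("forecast:", classes[int(np.argmax(probs))])
```

Both kinds of forecast come from the same probabilities; the probabilistic one simply retains the information that the hard choice discards.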
In Section 1.6, we introduced the two main approaches to classification – the first by discriminant functions, the second by posterior probability. A discriminant function is simply a function which tells us which class we should assign to a given predictor data point x (also called a feature vector).
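A discriminant-function classifier can be sketched in a few lines: one function per class is evaluated at the feature vector x, and the class with the largest value is assigned. The linear form and the numerical weights below are assumptions chosen for illustration, not taken from the text.

```python
import numpy as np

# Hypothetical linear discriminant functions y_k(x) = w_k . x + b_k,
# one per class; the assigned class is the one with the largest y_k(x).
W = np.array([[1.0, -0.5],
              [0.2, 0.8],
              [-1.0, 0.3]])  # one weight row per class (assumed values)
b = np.array([0.1, -0.2, 0.0])

def classify(x):
    scores = W @ x + b  # evaluate every discriminant function at x
    return int(np.argmax(scores))

x = np.array([0.5, 1.0])  # a predictor data point (feature vector)
print("assigned class:", classify(x))
```

Nonlinear classifiers generalize this by replacing the linear functions with, e.g., neural network or kernel discriminant functions, while the argmax assignment rule stays the same.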
- Machine Learning Methods in the Environmental Sciences: Neural Networks and Kernels, pp. 170–195. Publisher: Cambridge University Press. Print publication year: 2009.