Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- 2 Binary Regression: The Logit Model
- 3 Generalized Linear Models
- 4 Modeling of Binary Data
- 5 Alternative Binary Regression Models
- 6 Regularization and Variable Selection for Parametric Models
- 7 Regression Analysis of Count Data
- 8 Multinomial Response Models
- 9 Ordinal Response Models
- 10 Semi- and Non-Parametric Generalized Regression
- 11 Tree-Based Methods
- 12 The Analysis of Contingency Tables: Log-Linear and Graphical Models
- 13 Multivariate Response Models
- 14 Random Effects Models and Finite Mixtures
- 15 Prediction and Classification
- A Distributions
- B Some Basic Tools
- C Constrained Estimation
- D Kullback-Leibler Distance and Information-Based Criteria of Model Fit
- E Numerical Integration and Tools for Random Effects Modeling
- List of Examples
- Bibliography
- Author Index
- Subject Index
15 - Prediction and Classification
Published online by Cambridge University Press: 05 June 2012
Summary
In prediction problems one considers a new observation (y, x). While the predictor value x is observed, y is unknown and is to be predicted. In general, the unknown y may be from any distribution, continuous or discrete, depending on the prediction problem. When the unknown value is categorical we will often denote it by Y, with Y taking values from {1, …, k}. Then prediction means to find the true underlying value from the set {1, …, k}. The problem is strongly related to the common classification problem where one wants to find the true class from which the observation stems. When the numbers 1, …, k denote the underlying classes, the classification problem has the same structure as the prediction problem. Classification problems are basically diagnostic problems. In medical applications one wants to identify the type of disease, in pattern recognition one might aim at recognizing handwritten characters, and in credit scoring (Example 1.7) one wants to identify risk clients. Sometimes the distinction between prediction and classification is philosophical. In credit scoring, where one wants to find out if a client is a risk client, one might argue that it is a prediction problem since the classification lies in the future. Nevertheless, it is mostly seen as a classification problem, implying that the client is already a risk client or not.
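The classification setup described above can be sketched in code. The following is a minimal, hypothetical illustration (not from the book): a new observation x is assigned to the class in {1, …, k} with the largest posterior probability, assuming Gaussian class-conditional densities with made-up parameters. All function names and numbers here are illustrative assumptions.

```python
import numpy as np

# Hypothetical sketch: prediction of a categorical Y in {1, ..., k}.
# We classify x to the class with the largest posterior probability,
# P(Y = r | x) proportional to prior_r * f_r(x), assuming Gaussian
# class-conditional densities f_r with known (made-up) parameters.

def gaussian_density(x, mean, sd):
    # Univariate normal density f_r(x)
    return np.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def classify(x, priors, means, sds):
    # Unnormalized posteriors; the normalizing constant does not
    # affect the argmax, so it is omitted.
    scores = np.array([p * gaussian_density(x, m, s)
                       for p, m, s in zip(priors, means, sds)])
    return int(np.argmax(scores)) + 1  # classes labeled 1, ..., k

# Toy credit-scoring-style example: class 1 = "no risk", class 2 = "risk"
priors = [0.8, 0.2]
means = [0.0, 2.0]
sds = [1.0, 1.0]

print(classify(0.1, priors, means, sds))  # -> 1
print(classify(3.0, priors, means, sds))  # -> 2
```

This is the Bayes rule under the stated (assumed) densities; in practice the posteriors would be estimated from data, e.g., via a regression model for P(Y = r | x).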
- Regression for Categorical Data, pp. 429–484. Publisher: Cambridge University Press. Print publication year: 2011