Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- 1 Introduction
- Part 1 Foundations
- Part 2 From Theory to Algorithms
- Part 3 Additional Learning Models
- 21 Online Learning
- 22 Clustering
- 23 Dimensionality Reduction
- 24 Generative Models
- 25 Feature Selection and Generation
- Part 4 Advanced Theory
- Appendix A Technical Lemmas
- Appendix B Measure Concentration
- Appendix C Linear Algebra
- References
- Index
24 - Generative Models
from Part 3 - Additional Learning Models
Published online by Cambridge University Press: 05 July 2014
Summary
We started this book with a distribution-free learning framework; namely, we did not impose any assumptions on the underlying distribution over the data. Furthermore, we followed a discriminative approach in which our goal is not to learn the underlying distribution but rather to learn an accurate predictor. In this chapter we describe a generative approach, in which we assume that the underlying distribution over the data has a specific parametric form and our goal is to estimate the parameters of the model. This task is called parametric density estimation.
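For concreteness, here is a minimal sketch of parametric density estimation (our own illustration, not code from the book): we assume the data are drawn from a one-dimensional Gaussian and estimate its mean and variance by maximum likelihood; the function name `fit_gaussian_mle` is ours.

```python
import numpy as np

def fit_gaussian_mle(x):
    """Maximum-likelihood estimates for a 1-D Gaussian model N(mu, sigma^2)."""
    mu = x.mean()                    # MLE of the mean: the sample average
    sigma2 = ((x - mu) ** 2).mean()  # MLE of the variance (divides by n, not n - 1)
    return mu, sigma2

# Estimate the parameters from a synthetic sample.
rng = np.random.default_rng(0)
sample = rng.normal(loc=2.0, scale=1.5, size=1000)
mu_hat, sigma2_hat = fit_gaussian_mle(sample)
print(f"mu_hat = {mu_hat:.3f}, sigma2_hat = {sigma2_hat:.3f}")  # roughly 2.0 and 2.25
```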
The discriminative approach has the advantage of directly optimizing the quantity of interest (the prediction accuracy) instead of learning the underlying distribution. Vladimir Vapnik phrased this as follows in his principle for solving problems using a restricted amount of information:
When solving a given problem, try to avoid a more general problem as an intermediate step.
Of course, if we succeed in learning the underlying distribution accurately, we are considered to be “experts” in the sense that we can predict by using the Bayes optimal classifier. The problem is that it is usually more difficult to learn the underlying distribution than to learn an accurate predictor. However, in some situations, it is reasonable to adopt the generative learning approach. For example, sometimes it is easier (computationally) to estimate the parameters of the model than to learn a discriminative predictor.
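To illustrate that last point, the sketch below (again our own, with illustrative names) fits a generative model with per-class, per-feature Gaussians by closed-form maximum likelihood and then predicts with the plug-in Bayes rule, argmax over y of P(y) p(x | y). The per-feature independence is a naive Bayes-style simplification we adopt here to keep the estimates closed-form; it is what makes the parameter estimation require no iterative optimization at all.

```python
import numpy as np

def fit_generative(X, y):
    """Closed-form MLE: a class prior and per-feature Gaussians for each label."""
    params = {}
    for c in np.unique(y):
        Xc = X[y == c]
        params[c] = (
            len(Xc) / len(X),       # class prior P(y = c)
            Xc.mean(axis=0),        # per-feature means
            Xc.var(axis=0) + 1e-9,  # per-feature variances (small floor for stability)
        )
    return params

def predict(params, x):
    """Plug-in Bayes rule: argmax over classes of log P(y) + log p(x | y)."""
    def log_posterior(c):
        prior, mu, var = params[c]
        log_lik = -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
        return np.log(prior) + log_lik
    return max(params, key=log_posterior)

# Toy usage: two well-separated classes in the plane.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 2)), rng.normal(4.0, 1.0, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = fit_generative(X, y)
print(predict(model, np.array([3.8, 4.1])))  # expected: 1
```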
Understanding Machine Learning: From Theory to Algorithms, pp. 295-308. Publisher: Cambridge University Press. Print publication year: 2014.