Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- 1 Introduction
- Part 1 Foundations
- Part 2 From Theory to Algorithms
- 9 Linear Predictors
- 10 Boosting
- 11 Model Selection and Validation
- 12 Convex Learning Problems
- 13 Regularization and Stability
- 14 Stochastic Gradient Descent
- 15 Support Vector Machines
- 16 Kernel Methods
- 17 Multiclass, Ranking, and Complex Prediction Problems
- 18 Decision Trees
- 19 Nearest Neighbor
- 20 Neural Networks
- Part 3 Additional Learning Models
- Part 4 Advanced Theory
- Appendix A Technical Lemmas
- Appendix B Measure Concentration
- Appendix C Linear Algebra
- References
- Index
10 - Boosting
from Part 2 - From Theory to Algorithms
Published online by Cambridge University Press: 05 July 2014
Summary
Boosting is an algorithmic paradigm that grew out of a theoretical question and became a very practical machine learning tool. The boosting approach uses a generalization of linear predictors to address two major issues that have been raised earlier in the book. The first is the bias-complexity tradeoff. We have seen (in Chapter 5) that the error of an ERM learner can be decomposed into a sum of approximation error and estimation error. The more expressive the hypothesis class the learner is searching over, the smaller the approximation error is, but the larger the estimation error becomes. A learner is thus faced with the problem of picking a good tradeoff between these two considerations. The boosting paradigm allows the learner to have smooth control over this tradeoff. The learning starts with a basic class (that might have a large approximation error), and as it progresses the class that the predictor may belong to grows richer.
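As a reminder of the Chapter 5 result referred to above, the decomposition can be written as follows; the notation (the risk L_D, the ERM output h_S, the class H) is the standard one used throughout the book and is shown here only as a restatement, not new material.

```latex
% Chapter 5 decomposition: the risk of the ERM output h_S over class H splits into
% an approximation term (how good the best hypothesis in H is) and an estimation term.
\[
L_{\mathcal{D}}(h_S)
  = \underbrace{\min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)}_{\varepsilon_{\mathrm{app}}}
  + \underbrace{\Bigl( L_{\mathcal{D}}(h_S) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) \Bigr)}_{\varepsilon_{\mathrm{est}}}
\]
% Enlarging H can only shrink the approximation term, but it typically inflates the
% estimation term; boosting grows the effective class gradually to trade these off.
```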
The second issue that boosting addresses is the computational complexity of learning. As seen in Chapter 8, for many interesting concept classes the task of finding an ERM hypothesis may be computationally infeasible. A boosting algorithm amplifies the accuracy of weak learners. Intuitively, one can think of a weak learner as an algorithm that uses a simple “rule of thumb” to output a hypothesis that comes from an easy-to-learn hypothesis class and performs just slightly better than a random guess.
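To make the "amplifying weak learners" idea concrete, below is a minimal sketch (not the book's pseudocode) of AdaBoost, the canonical boosting algorithm, using one-dimensional decision stumps as the easy-to-learn base class. The function and variable names are illustrative only.

```python
import numpy as np

def stump_weak_learner(X, y, weights):
    """Weak learner: return the decision stump (feature, threshold, sign)
    with the smallest weighted error. X is (m, d), labels y are in {-1, +1}."""
    m, d = X.shape
    best = (0, 0.0, 1, np.inf)  # (feature, threshold, sign, weighted error)
    for j in range(d):
        for theta in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = sign * np.where(X[:, j] <= theta, 1, -1)
                err = np.sum(weights[pred != y])
                if err < best[3]:
                    best = (j, theta, sign, err)
    return best

def adaboost(X, y, T=20):
    """Run T rounds of boosting; return a list of (weight, stump) pairs."""
    m = X.shape[0]
    D = np.full(m, 1.0 / m)                 # distribution over training examples
    ensemble = []
    for _ in range(T):
        j, theta, sign, eps = stump_weak_learner(X, y, D)
        eps = max(eps, 1e-12)               # guard against division by zero
        w = 0.5 * np.log((1 - eps) / eps)   # weight of this weak hypothesis
        pred = sign * np.where(X[:, j] <= theta, 1, -1)
        D = D * np.exp(-w * y * pred)       # up-weight misclassified examples
        D /= D.sum()
        ensemble.append((w, (j, theta, sign)))
    return ensemble

def predict(ensemble, X):
    """Predict by the sign of the weighted vote of all stumps."""
    score = np.zeros(X.shape[0])
    for w, (j, theta, sign) in ensemble:
        score += w * sign * np.where(X[:, j] <= theta, 1, -1)
    return np.sign(score)
```

Calling `ensemble = adaboost(X, y)` on data with labels in {-1, +1} and then `predict(ensemble, X)` evaluates the weighted-majority vote; the chapter's analysis shows that if each weak hypothesis achieves weighted error at most 1/2 - γ, the training error of this vote decreases exponentially fast in the number of rounds.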
- Type: Chapter
- Information: Understanding Machine Learning: From Theory to Algorithms, pp. 101-113
- Publisher: Cambridge University Press
- Print publication year: 2014