Book contents
- Frontmatter
- Contents
- List of code fragments
- Preface
- Part I Basic concepts
- Part II Pattern analysis algorithms
- 5 Elementary algorithms in feature space
- 6 Pattern analysis using eigen-decompositions
- 7 Pattern analysis using convex optimisation
- 8 Ranking, clustering and data visualisation
- Part III Constructing kernels
- Appendix A Proofs omitted from the main text
- Appendix B Notational conventions
- Appendix C List of pattern analysis methods
- Appendix D List of kernels
- References
- Index
7 - Pattern analysis using convex optimisation
from Part II - Pattern analysis algorithms
Published online by Cambridge University Press: 29 March 2011
Summary
This chapter presents a number of algorithms for particular pattern analysis tasks such as novelty detection, classification and regression. We consider criteria for choosing particular pattern functions, in many cases derived from stability analysis of the corresponding tasks they aim to solve. The optimisation of the derived criteria can be cast in the framework of convex optimisation, either as linear or convex quadratic programs. This ensures that, as with the algorithms of the previous chapter, the methods developed here do not suffer from the problem of local minima. They include such celebrated methods as support vector machines for both classification and regression.
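As a concrete illustration of casting such a criterion as a convex quadratic program, the sketch below sets up the 1-norm soft-margin SVM dual over a kernel matrix with cvxpy. The library choice, the function name `svm_dual`, the small regularising ridge and the default value of C are assumptions made here for illustration; this is not the chapter's own formulation or code.

```python
import numpy as np
import cvxpy as cp

def svm_dual(K, y, C=1.0):
    """Sketch of the 1-norm soft-margin SVM dual as a convex QP:
        maximise   sum(alpha) - 0.5 * alpha' (yy' * K) alpha
        subject to 0 <= alpha_i <= C  and  sum_i y_i alpha_i = 0,
    where K is the kernel (Gram) matrix and y the +/-1 labels."""
    m = len(y)
    # Hadamard product yy' * K; a tiny ridge keeps it numerically PSD
    P = np.outer(y, y) * K + 1e-9 * np.eye(m)
    alpha = cp.Variable(m)
    objective = cp.Maximize(cp.sum(alpha) - 0.5 * cp.quad_form(alpha, P))
    constraints = [alpha >= 0, alpha <= C, y @ alpha == 0]
    cp.Problem(objective, constraints).solve()
    return alpha.value
```

Because the objective is concave and the constraints are linear, any local optimum is global, which is the sense in which these methods avoid the problem of local minima.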
We start, however, by describing how to find the smallest hypersphere containing the training data in the embedding space, together with the use and analysis of this algorithm for detecting anomalous or novel data. The techniques introduced for this problem are easily adapted to the task of finding the maximal margin hyperplane, or support vector solution, that separates two sets of points, again possibly allowing some fraction of points to be treated as exceptions. This in turn leads to algorithms for regression.
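The following is a minimal sketch of the smallest enclosing hypersphere in kernel form, again assuming cvxpy and hypothetical helper names. The dual maximises sum_i alpha_i K_ii - alpha' K alpha over the simplex, and a test point is scored by its distance to the centre expressed purely through kernel evaluations.

```python
import numpy as np
import cvxpy as cp

def smallest_hypersphere(K):
    """Dual of the smallest hypersphere containing the training data:
        maximise   sum_i alpha_i K_ii - alpha' K alpha
        subject to alpha >= 0, sum(alpha) = 1."""
    m = K.shape[0]
    alpha = cp.Variable(m)
    P = K + 1e-9 * np.eye(m)          # ridge for numerical PSD-ness
    objective = cp.Maximize(np.diag(K) @ alpha - cp.quad_form(alpha, P))
    cp.Problem(objective, [alpha >= 0, cp.sum(alpha) == 1]).solve()
    return alpha.value

def novelty_score(K_train, K_test_train, k_test_diag, alpha):
    """Squared distance of test points to the centre c = sum_i alpha_i phi(x_i),
    computed from kernel evaluations only:
        k(x, x) - 2 sum_i alpha_i k(x, x_i) + alpha' K alpha."""
    return k_test_diag - 2.0 * K_test_train @ alpha + alpha @ K_train @ alpha
```

In the hard version of the problem, any training point with non-zero alpha_i lies on the sphere's surface, so the squared radius can be read off as that point's score; test points scoring above it are flagged as novel.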
An important feature of many of these systems is that, while enforcing the learning biases suggested by the stability analysis, they also produce ‘sparse’ dual representations of the hypothesis, resulting in efficient algorithms for both training and test point evaluation. This is a result of the Karush–Kuhn–Tucker conditions, which play a crucial role in the practical implementation and analysis of these algorithms.
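To see that sparsity in practice, the fragment below (an illustrative sketch using scikit-learn and synthetic toy data, not the chapter's code) trains a kernel SVM and counts how many training points actually enter the decision function.

```python
import numpy as np
from sklearn.svm import SVC

# Two synthetic Gaussian clusters, labelled -1 and +1 (toy data for illustration).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)

clf = SVC(kernel="rbf", C=1.0).fit(X, y)

# The KKT conditions drive most dual variables to zero, so the decision function
#   f(x) = sum_{i in SV} y_i alpha_i k(x, x_i) + b
# involves only the support vectors, making evaluation on test points cheap.
print(f"support vectors: {len(clf.support_)} of {len(X)} training points")
```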
- Type: Chapter
- Information: Kernel Methods for Pattern Analysis, pp. 195-251
- Publisher: Cambridge University Press
- Print publication year: 2004