Book contents
- Frontmatter
- Contents
- List of code fragments
- Preface
- Part I Basic concepts
- 1 Pattern analysis
- 2 Kernel methods: an overview
- 3 Properties of kernels
- 4 Detecting stable patterns
- Part II Pattern analysis algorithms
- Part III Constructing kernels
- Appendix A Proofs omitted from the main text
- Appendix B Notational conventions
- Appendix C List of pattern analysis methods
- Appendix D List of kernels
- References
- Index
3 - Properties of kernels
from Part I - Basic concepts
Published online by Cambridge University Press: 29 March 2011
Summary
As we have seen in Chapter 2, the use of kernel functions provides a powerful and principled way of detecting nonlinear relations using well-understood linear algorithms in an appropriate feature space. The approach decouples the design of the algorithm from the specification of the feature space. This inherent modularity not only increases the flexibility of the approach, but also makes both the learning algorithms and the kernel design more amenable to formal analysis. Regardless of which pattern analysis algorithm is being used, the theoretical properties of a given kernel remain the same. It is the purpose of this chapter to introduce the properties that characterise kernel functions.
We present the fundamental properties of kernels, thus formalising the intuitive concepts introduced in Chapter 2. We provide a characterisation of kernel functions, derive their properties, and discuss methods for designing them. We also discuss the role of prior knowledge in kernel-based learning machines, showing that a universal machine is not possible, and that kernels must be chosen for the problem at hand with a view to capturing our prior belief of the relatedness of different examples. We also give a framework for quantifying the match between a kernel and a learning task.
Given a kernel and a training set, we can form the matrix known as the kernel matrix, or Gram matrix: the matrix whose entries are the evaluations of the kernel function on all pairs of training points.
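The construction of a Gram matrix can be sketched as follows. The Gaussian (RBF) kernel and the four-point toy dataset are illustrative choices, not taken from the text; the symmetry and positive semi-definiteness checks at the end reflect the properties a valid kernel's Gram matrix must satisfy.

```python
import numpy as np

def gaussian_kernel(x, z, sigma=1.0):
    # Gaussian (RBF) kernel: k(x, z) = exp(-||x - z||^2 / (2 sigma^2)).
    return np.exp(-np.sum((x - z) ** 2) / (2.0 * sigma ** 2))

def gram_matrix(X, kernel):
    # Evaluate the kernel on all pairs of rows of X: K[i, j] = k(x_i, x_j).
    n = X.shape[0]
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            K[i, j] = kernel(X[i], X[j])
    return K

# A toy training set of four points in R^2 (hypothetical data for illustration).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = gram_matrix(X, gaussian_kernel)

# The Gram matrix of a valid kernel is symmetric and positive semi-definite.
assert np.allclose(K, K.T)
assert np.all(np.linalg.eigvalsh(K) >= -1e-10)
```

Note that the diagonal entries equal k(x, x) = 1 for the Gaussian kernel, since the squared distance of a point to itself is zero.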
- Type: Chapter
- Kernel Methods for Pattern Analysis, pp. 47-84
- Publisher: Cambridge University Press
- Print publication year: 2004