Book contents
- Frontmatter
- Contents
- List of code fragments
- Preface
- Part I Basic concepts
- 1 Pattern analysis
- 2 Kernel methods: an overview
- 3 Properties of kernels
- 4 Detecting stable patterns
- Part II Pattern analysis algorithms
- Part III Constructing kernels
- Appendix A Proofs omitted from the main text
- Appendix B Notational conventions
- Appendix C List of pattern analysis methods
- Appendix D List of kernels
- References
- Index
4 - Detecting stable patterns
from Part I - Basic concepts
Published online by Cambridge University Press: 29 March 2011
Summary
As discussed in Chapter 1, perhaps the most important property of a pattern analysis algorithm is that it should identify statistically stable patterns. A stable relation is one that reflects some property of the source generating the data, and is therefore not a chance feature of the particular dataset. Proving that a given pattern is indeed significant is the concern of ‘learning theory’, a body of principles and methods that estimate the reliability of pattern functions under appropriate assumptions about the way in which the data was generated. The most common assumption is that the individual training examples are generated independently according to a fixed distribution, which is also the distribution under which the expected value of the pattern function is small. Statistical analysis of the problem can therefore make use of the law of large numbers through the ‘concentration’ of certain random variables.
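As a hedged illustration of what ‘concentration’ means here, one standard result of this kind is Hoeffding's inequality. It is offered only as a familiar example and is not necessarily the inequality the chapter itself relies on. The symbols $X_i$, $m$, $a$, $b$ and $\epsilon$ are introduced for this illustration: $X_1, \dots, X_m$ are independent, identically distributed random variables taking values in $[a, b]$.

```latex
% Hoeffding's inequality: the empirical mean of m i.i.d. bounded
% variables deviates from its expectation by at least epsilon
% with probability that decays exponentially in m.
\[
  P\left( \left| \frac{1}{m} \sum_{i=1}^{m} X_i - \mathbb{E}[X_1] \right| \ge \epsilon \right)
  \;\le\; 2 \exp\left( - \frac{2 m \epsilon^{2}}{(b - a)^{2}} \right)
\]
```

For a single, fixed pattern function evaluated on each training example, a bound of this form says that its average over the sample is close to its expected value with high probability once $m$ is large.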
Concentration would be all that we need if we were to consider only one pattern function. Pattern analysis algorithms, however, typically search for pattern functions over whole classes of functions, choosing the function that best fits the particular training sample. We must therefore be able to prove stability not of a pre-defined pattern, but of one deliberately chosen for its fit to the data.
Clearly, the more pattern functions we have at our disposal, the more likely it is that the chosen function is a spurious pattern. The critical factor that controls how much our choice may have compromised the stability of the resulting pattern is the ‘capacity’ of the function class.
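The effect of choosing among many functions can be seen in a small simulation. The following is a minimal sketch, not taken from the book: the function name `running_best_accuracy`, the sample size of 20, and the use of purely random labels and random classifiers are all assumptions made for illustration. Because the labels are random, no function has a true accuracy above one half, so any higher training accuracy is a spurious pattern.

```python
import random


def running_best_accuracy(n_functions, n_samples, seed=0):
    """Track the best training accuracy seen so far while scanning
    n_functions random classifiers on n_samples random labels.

    Hypothetical helper for illustration: both labels and predictions
    are fair coin flips, so every function's expected accuracy is 0.5.
    """
    rng = random.Random(seed)
    labels = [rng.choice((-1, 1)) for _ in range(n_samples)]
    best = 0.0
    history = []
    for _ in range(n_functions):
        # A "pattern function" here is just a random labelling of the sample.
        predictions = [rng.choice((-1, 1)) for _ in range(n_samples)]
        accuracy = sum(p == y for p, y in zip(predictions, labels)) / n_samples
        # Selecting the best-fitting function so far.
        best = max(best, accuracy)
        history.append(best)
    return history


history = running_best_accuracy(n_functions=1000, n_samples=20)
for k in (1, 10, 100, 1000):
    print(f"best of {k:4d} random functions: training accuracy {history[k - 1]:.2f}")
```

The best training accuracy can only grow as more candidate functions are examined, even though none of them captures anything about the source. This is the sense in which a larger class of functions makes the selected fit less trustworthy.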
- Type: Chapter
- Information: Kernel Methods for Pattern Analysis, pp. 85-108
- Publisher: Cambridge University Press
- Print publication year: 2004