Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- Part one Pattern Classification with Binary-Output Neural Networks
- Part two Pattern Classification with Real-Output Networks
- Part three Learning Real-Valued Functions
- Part four Algorithmics
- 22 Efficient Learning
- 23 Learning as Optimization
- 24 The Boolean Perceptron
- 25 Hardness Results for Feed-Forward Networks
- 26 Constructive Learning Algorithms for Two-Layer Networks
- Appendix 1 Useful Results
- Bibliography
- Author index
- Subject index
23 - Learning as Optimization
Published online by Cambridge University Press: 26 February 2010
Summary
Introduction
The previous chapter demonstrated that efficient SEM and approximate-SEM algorithms for graded classes F = ∪ₙFₙ give rise to efficient learning algorithms, provided the expressive power of Fₙ grows polynomially with n (in, respectively, the binary classification and real prediction learning models). In this chapter we show that randomized SEM and approximate-SEM algorithms suffice, and that a converse result also holds: if efficient learning is possible, then there must exist an efficient randomized approximate-SEM algorithm. (Hence, for the case of a binary function class, there must be an efficient randomized SEM algorithm.) This will establish that, in both models of learning, efficient learning is intimately related to the optimization problem of finding a hypothesis with small sample error.
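To make the reduction concrete, here is a minimal sketch in Python. The names are hypothetical placeholders introduced for illustration, not from the text: `m` is a sample-size function large enough for uniform convergence, `draw_example` samples one labelled example from the unknown distribution, and `sem_oracle` stands for the assumed efficient (possibly randomized) SEM algorithm. The learner simply draws a sufficiently large sample and returns whatever hypothesis the SEM algorithm outputs on it.

```python
def learn_by_sem(epsilon, delta, m, draw_example, sem_oracle):
    """Sketch: learning reduces to sample error minimization.

    m(epsilon, delta) -- a sample size large enough that sample error
        is uniformly close to true error (such a polynomial bound
        exists when the expressive power of F_n grows polynomially);
    draw_example()    -- one labelled example (x, y) drawn from the
        unknown distribution;
    sem_oracle(S)     -- a hypothesis from F_n with (approximately)
        minimal error on the sample S.
    """
    sample = [draw_example() for _ in range(m(epsilon, delta))]
    # With probability at least 1 - delta, a hypothesis with small
    # sample error has true error within epsilon of the best in the class.
    return sem_oracle(sample)
```

The chapter's converse runs the reduction in the other direction: an efficient learner, fed a suitable sample, can be used as an efficient randomized approximate-SEM algorithm.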
Randomized Algorithms
For our purposes, a randomized algorithm has available to it a random number generator that produces a sequence of independent, uniformly distributed bits. We shall assume that examining one bit of this random sequence takes one unit of time. (It is sometimes convenient to assume that the algorithm has access to a sequence of independent uniformly distributed integers in the set {0, 1, …, I}, for some I ≥ 1; it is easy to construct such a sequence from a sequence of random bits.) The randomized algorithm A uses these random bits as part of its input, but it is useful to think of this input as somehow ‘internal’ to the algorithm, and to think of the algorithm as defining a mapping from an ‘external’ input to a probability distribution over outputs.
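As an illustration of that parenthetical remark, one standard construction is rejection sampling: read just enough bits to cover {0, 1, …, I} and retry whenever the resulting integer is too large. The sketch below is not from the text; `uniform_int` and `next_bit` are names introduced here for illustration.

```python
import random

def uniform_int(I, next_bit=lambda: random.getrandbits(1)):
    """Draw an integer uniformly from {0, 1, ..., I} using only a
    source of independent, unbiased random bits (rejection sampling)."""
    k = I.bit_length()  # k = ceil(log2(I+1)) bits cover 0..I
    while True:
        value = 0
        for _ in range(k):
            value = (value << 1) | next_bit()  # read one random bit
        if value <= I:  # reject values outside {0, ..., I}
            return value
```

Since 2^(k−1) ≤ I, each attempt succeeds with probability (I+1)/2^k > 1/2, so the expected number of bits examined is less than 2k; under the unit-cost assumption on bit examinations, the construction preserves polynomial running time.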
Neural Network Learning: Theoretical Foundations, pp. 307–315. Cambridge University Press. Print publication year: 1999.