Book contents
- Frontmatter
- Contents
- Acknowledgements
- List of contributors
- Foreword
- 1 Introduction
- 2 On-line Learning and Stochastic Approximations
- 3 Exact and Perturbation Solutions for the Ensemble Dynamics
- 4 A Statistical Study of On-line Learning
- 5 On-line Learning in Switching and Drifting Environments with Application to Blind Source Separation
- 6 Parameter Adaptation in Stochastic Optimization
- 7 Optimal On-line Learning in Multilayer Neural Networks
- 8 Universal Asymptotics in Committee Machines with Tree Architecture
- 9 Incorporating Curvature Information into On-line Learning
- 10 Annealed On-line Learning in Multilayer Neural Networks
- 11 On-line Learning of Prototypes and Principal Components
- 12 On-line Learning with Time-Correlated Examples
- 13 On-line Learning from Finite Training Sets
- 14 Dynamics of Supervised Learning with Restricted Training Sets
- 15 On-line Learning of a Decision Boundary with and without Queries
- 16 A Bayesian Approach to On-line Learning
- 17 Optimal Perceptron Learning: an On-line Bayesian Approach
10 - Annealed On-line Learning in Multilayer Neural Networks
Published online by Cambridge University Press: 28 January 2010
Abstract
In this article we examine on-line learning with an annealed learning rate. Annealing the learning rate is necessary if on-line learning is to reach the optimal solution: with a fixed learning rate, the system only approximates the best solution up to fluctuations whose size is proportional to the learning rate. It has been shown that optimal annealing can make on-line learning asymptotically efficient, meaning that asymptotically it learns as fast as possible. Until now, these results have been realized only in very simple networks, such as single-layer perceptrons (section 3). Even the simplest multilayer network, the soft committee machine, exhibits an additional difficulty that makes straightforward annealing ineffective: at the beginning of learning, the committee machine is attracted to a metastable, suboptimal solution (section 4). The system stays in this metastable solution for a long time and can only escape it if the learning rate is not too small, which delays the start of annealing considerably. Here we show that a non-local or matrix update can prevent the system from becoming trapped in the metastable phase, allowing annealing to start much earlier (section 5). Remarks on the influence of the initial conditions and a possible candidate for theoretical support are given in section 6. The paper ends with a summary of future tasks and a conclusion.
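As a rough numerical illustration of the fixed-rate versus annealed-rate contrast described above, the sketch below runs on-line gradient learning in a toy linear teacher-student setup, not the chapter's soft committee machine. The dimension N, the noise level, and the annealing constants eta0 and t0 are illustrative assumptions; the point is only that a fixed rate leaves a residual error proportional to eta, while a 1/t schedule keeps improving.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 50                                              # input dimension (assumption)
w_teacher = rng.standard_normal(N) / np.sqrt(N)     # fixed teacher weights

def train(schedule, steps=100_000, noise=0.1):
    """On-line learning on fresh examples; schedule(t) is the learning rate at step t."""
    w = np.zeros(N)
    for t in range(1, steps + 1):
        x = rng.standard_normal(N)                          # fresh input example
        y = w_teacher @ x + noise * rng.standard_normal()   # noisy teacher label
        err = w @ x - y                                     # student's prediction error
        w -= schedule(t) * err * x / N                      # gradient step (1/N keeps it stable)
    return float(np.sum((w - w_teacher) ** 2))              # squared distance to teacher

# Fixed rate: the residual error plateaus at a level proportional to eta.
for eta in (0.5, 0.1):
    print(f"fixed eta={eta}: final error {train(lambda t, eta=eta: eta):.2e}")

# Annealed rate eta(t) = eta0/(t0 + t): the error keeps shrinking like 1/t.
eta0, t0 = 50.0, 100.0
print(f"annealed:      final error {train(lambda t: eta0 / (t0 + t)):.2e}")
```

In this toy run, halving the fixed rate roughly halves the residual error floor, while the annealed schedule drives the error well below both floors, consistent with the asymptotic efficiency result the abstract cites.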
Introduction
One of the most attractive properties of artificial neural networks is their ability to learn from examples and to generalize the acquired knowledge to unknown data.
- On-Line Learning in Neural Networks, pp. 209-230. Publisher: Cambridge University Press. Print publication year: 1999.