Incorporating Curvature Information into On-line Learning

doi:10.1017/CBO9780511569920.010

9 - Incorporating Curvature Information into On-line Learning

Published online by Cambridge University Press: 28 January 2010

Magnus Rattray and David Saad

Edited by

David Saad


Magnus Rattray: Affiliation:
Neural Computing Research Group, Aston University, Birmingham B4 7ET, UK
David Saad: Affiliation:
Neural Computing Research Group, Aston University, Birmingham B4 7ET, UK
David Saad: Affiliation:
Aston University

Abstract

We analyse the dynamics of a number of second-order on-line learning algorithms training multi-layer neural networks, using the methods of statistical mechanics. We first consider on-line Newton's method, which is known to provide optimal asymptotic performance. We determine the asymptotic generalization error decay for a soft committee machine, which is shown to compare favourably with the result for standard gradient descent. Matrix momentum provides a practical approximation to this method by allowing an efficient inversion of the Hessian. We consider an idealized matrix momentum algorithm which requires access to the Hessian and find close correspondence with the dynamics of on-line Newton's method. In practice, the Hessian will not be known on-line and we therefore consider matrix momentum using a single-example approximation to the Hessian. In this case good asymptotic performance may still be achieved, but the algorithm is now sensitive to parameter choice because of noise in the Hessian estimate. On-line Newton's method is not appropriate during the transient learning phase, since a suboptimal unstable fixed point of the gradient descent dynamics becomes stable for this algorithm. A principled alternative is to use Amari's natural gradient learning algorithm and we show how this method provides a significant reduction in learning time when compared to gradient descent, while retaining the asymptotic performance of on-line Newton's method.

Introduction

On-line learning is a popular method for training multi-layer feed-forward neural networks, especially for large systems and for problems requiring rapid and adaptive data processing. Under the on-line learning framework, network parameters are updated according to only the latest in a sequence of training examples.
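The on-line update described above can be illustrated with a minimal sketch of standard on-line gradient descent, the baseline against which the second-order methods are compared. It is not the chapter's own code. It assumes a teacher-student setting in which a student soft committee machine learns from examples labelled by a teacher of the same architecture, with an erf activation, Gaussian inputs, squared error and a learning rate scaled by the input dimension. All of these choices, together with the names `online_step` and `run`, are assumptions made for illustration only.

```python
import math
import random


def g(x):
    """Hidden-unit activation (assumed): erf(x / sqrt(2))."""
    return math.erf(x / math.sqrt(2.0))


def g_prime(x):
    """Derivative of the assumed activation."""
    return math.sqrt(2.0 / math.pi) * math.exp(-0.5 * x * x)


def fields(weights, xi):
    """Activation (weight vector dotted with input) of each hidden unit."""
    return [sum(w * x for w, x in zip(row, xi)) for row in weights]


def committee_output(weights, xi):
    """Soft committee machine: unweighted sum of hidden-unit outputs."""
    return sum(g(h) for h in fields(weights, xi))


def online_step(student, xi, target, eta):
    """Update the student in place using only the latest example.

    Returns the squared error on that example, measured before the update.
    """
    n = len(xi)
    h = fields(student, xi)
    err = target - sum(g(v) for v in h)
    for row, v in zip(student, h):
        delta = g_prime(v) * err
        for j in range(n):
            # Gradient descent on 0.5 * err**2, learning rate scaled by 1/n.
            row[j] += (eta / n) * delta * xi[j]
    return 0.5 * err * err


def run(n_inputs=20, n_hidden=2, eta=0.5, steps=10000, seed=0):
    """Train a student on a stream of teacher-labelled examples."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_inputs)  # keeps hidden-unit fields of order one
    teacher = [[rng.gauss(0.0, scale) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    student = [[rng.gauss(0.0, 0.1 * scale) for _ in range(n_inputs)]
               for _ in range(n_hidden)]
    errors = []
    for _ in range(steps):
        xi = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]  # fresh example
        target = committee_output(teacher, xi)
        errors.append(online_step(student, xi, target, eta))
    return errors


if __name__ == "__main__":
    errs = run()
    print("mean error, first 200 examples:", sum(errs[:200]) / 200)
    print("mean error, last 1000 examples:", sum(errs[-1000:]) / 1000)
```

Each example is drawn, used for one update and then discarded, which is the defining property of the on-line framework. Under the same assumptions, a second-order variant would replace the plain gradient step in `online_step` with one preconditioned by curvature information.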

Type: Chapter
Information: On-Line Learning in Neural Networks, pp. 183-208

DOI: https://doi.org/10.1017/CBO9780511569920.010

Publisher: Cambridge University Press

Print publication year: 1999
