Book contents
- Frontmatter
- Contents
- Preface
- 1 Introduction
- Part one Pattern Classification with Binary-Output Neural Networks
- 2 The Pattern Classification Problem
- 3 The Growth Function and VC-Dimension
- 4 General Upper Bounds on Sample Complexity
- 5 General Lower Bounds on Sample Complexity
- 6 The VC-Dimension of Linear Threshold Networks
- 7 Bounding the VC-Dimension using Geometric Techniques
- 8 Vapnik-Chervonenkis Dimension Bounds for Neural Networks
- Part two Pattern Classification with Real-Output Networks
- Part three Learning Real-Valued Functions
- Part four Algorithmics
- Appendix 1 Useful Results
- Bibliography
- Author index
- Subject index
7 - Bounding the VC-Dimension using Geometric Techniques
Published online by Cambridge University Press: 26 February 2010
Summary
Introduction
Results in the previous chapter show that the VC-dimension of the class of functions computed by a network of linear threshold units with W parameters is no larger than a constant times W log W. These results cannot immediately be extended to networks of sigmoid units (with continuous activation functions), since the proofs involve counting the number of distinct outputs of all linear threshold units in the network as the input varies over m patterns, and a single sigmoid unit has an infinite number of output values. In this chapter and the next we derive bounds on the VC-dimension of certain sigmoid networks, including networks of units having the standard sigmoid activation function σ(α) = 1/(1 + e^(−α)). Before we begin this derivation, we study an example that shows that the form of the activation function is crucial.
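To make the obstacle concrete, the following Python sketch (not from the text; the pattern dimension, parameter ranges, and sample counts are arbitrary choices for illustration) compares a single linear threshold unit with a single standard sigmoid unit on a fixed set of m input patterns. As the parameters vary, the threshold unit realizes only a small finite set of output vectors, which is what the counting arguments of the previous chapter exploit, whereas the sigmoid unit produces an essentially new output vector for every parameter setting.

```python
# Illustration (not from the text): on m fixed patterns, a threshold unit
# realizes finitely many output vectors as its parameters vary; a sigmoid
# unit's output vector ranges over a continuum.
import math
import random

random.seed(0)
m, n = 5, 2                                        # m input patterns in R^n
patterns = [[random.gauss(0, 1) for _ in range(n)] for _ in range(m)]

def preactivation(w, b, x):
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def threshold(w, b, x):
    return 1 if preactivation(w, b, x) >= 0 else 0

def sigmoid(w, b, x):
    return 1.0 / (1.0 + math.exp(-preactivation(w, b, x)))

threshold_outputs, sigmoid_outputs = set(), set()
for _ in range(20000):                             # sample many parameter settings
    w = [random.gauss(0, 3) for _ in range(n)]
    b = random.gauss(0, 3)
    threshold_outputs.add(tuple(threshold(w, b, x) for x in patterns))
    sigmoid_outputs.add(tuple(sigmoid(w, b, x) for x in patterns))

print(len(threshold_outputs), "distinct threshold output vectors on", m, "patterns")
print(len(sigmoid_outputs), "distinct sigmoid output vectors on", m, "patterns")
```

The threshold unit's count stays bounded by the growth function of halfspaces, while the sigmoid unit's count grows with the number of parameter draws, so the argument that bounds the VC-dimension by counting distinct outputs does not carry over directly.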
The Need for Conditions on the Activation Functions
One might suspect that if we construct networks of sigmoid units with a well-behaved activation function, they will have finite VC-dimension. For instance, perhaps it suffices if the activation function is sufficiently smooth, bounded, and monotonically increasing. Unfortunately, the situation is not so simple. The following result shows that there is an activation function that has all of these properties, and even has its derivative monotonically increasing to the left of zero and decreasing to the right (so it is convex to the left of zero and concave to the right), and yet is such that a two-layer network having only two computation units in the first layer, each with this activation function, has infinite VC-dimension.
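The actual construction appears later in the chapter. As a hedged illustration of how an oscillating ingredient can already force infinite VC-dimension, the Python sketch below checks the classical fact that the one-parameter class of functions x ↦ sgn(sin(wx)) shatters the points 2^(−1), …, 2^(−m) for every m. This is not the chapter's activation function (which is smooth, bounded, and increasing), but it shows the kind of behaviour that mild regularity conditions alone cannot rule out.

```python
# Hedged sketch (not the chapter's construction): the class
# {x -> sgn(sin(w x)) : w in R} shatters {2^-1, ..., 2^-m} for any m.
import itertools
import math

m = 8                                              # number of points to shatter
xs = [2.0 ** (-i) for i in range(1, m + 1)]        # the points x_i = 2^{-i}

for labels in itertools.product((0, 1), repeat=m):
    # Choose the frequency w so that sin(w * xs[i]) < 0 exactly when labels[i] == 1.
    w = math.pi * (1 + sum(y * 2 ** (i + 1) for i, y in enumerate(labels)))
    realized = tuple(1 if math.sin(w * x) < 0 else 0 for x in xs)
    assert realized == labels                      # every labelling is realized

print(f"all {2 ** m} labellings of the {m} points are realized by sgn(sin(w x))")
```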
- Type: Chapter
- Information: Neural Network Learning: Theoretical Foundations, pp. 86–107
- Publisher: Cambridge University Press
- Print publication year: 1999