Book contents
- Frontmatter
- Dedication
- Contents
- Preface
- Part I Machine learning and kernel vector spaces
- Part II Dimension-reduction: PCA/KPCA and feature selection
- Part III Unsupervised learning models for cluster analysis
- 5 Unsupervised learning for cluster discovery
- 6 Kernel methods for cluster analysis
- Part IV Kernel ridge regressors and variants
- Part V Support vector machines and variants
- Part VI Kernel methods for green machine learning technologies
- Part VII Kernel methods and statistical estimation theory
- Part VIII Appendices
- References
- Index
6 - Kernel methods for cluster analysis
from Part III - Unsupervised learning models for cluster analysis
Published online by Cambridge University Press: 05 July 2014
Summary
Introduction
The raw data encountered in real-world applications fall into two main categories: vectorial and nonvectorial. For vectorial data, the Euclidean distance or inner product is often used as the similarity measure for the training vectors {x_i, i = 1, …, N}. This leads to the conventional K-means or SOM clustering methods. This chapter extends these methods to kernel-based cluster discovery and then to nonvectorial clustering applications, such as sequence analysis (e.g. protein sequences and signal motifs) and graph partition problems (e.g. molecular interactions, social networks). The fundamental unsupervised learning theory will be systematically extended to nonvectorial data analysis.
This chapter will cover the following kernel-based unsupervised learning models for cluster discovery.
Section 6.2 explores kernel K-means in intrinsic space. In this basic kernel K-means learning model, the original vectors are first mapped to the basis functions for the intrinsic vector space H, and the mapped vectors are then partitioned into clusters by the conventional K-means. Because the intrinsic-space approach is not implementable for some vectorial applications and for all nonvectorial applications, alternative representations need to be pursued. According to Theorem 1.1, the LSP condition holds for K-means. According to Eq. (1.20), this means that the problem formulation may be fully and uniquely characterized by the kernel matrix K associated with the training vectors, i.e. without a specific vector space being explicitly defined. In short, the original vector-based clustering criterion is converted to a vector-free clustering criterion.
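As a rough illustration of what a vector-free clustering criterion can look like, the sketch below runs a Lloyd-style kernel K-means that reads only the entries of a kernel matrix K. It is not the chapter's own algorithm listing. It relies on the standard identity that the squared distance from a mapped point to a cluster centroid in feature space equals K_ii - (2/|C|) Σ_{j∈C} K_ij + (1/|C|²) Σ_{j,l∈C} K_jl. The function name `kernel_kmeans`, the toy data, the linear kernel and the initial labels are all assumptions made for this example.

```python
def kernel_kmeans(K, labels, n_clusters, max_iter=100):
    """Lloyd-style kernel K-means driven only by the kernel matrix.

    K      : N x N symmetric kernel (Gram) matrix as a list of lists
    labels : initial cluster index (0 .. n_clusters-1) for each item
    Returns the final list of cluster indices.
    """
    n = len(K)
    labels = list(labels)
    for _ in range(max_iter):
        # Indices of the items currently assigned to each cluster.
        members = [[i for i in range(n) if labels[i] == c]
                   for c in range(n_clusters)]
        # Squared norm of each centroid: (1/|C|^2) * sum_{j,l in C} K[j][l].
        centroid_norm = [
            sum(K[j][l] for j in m for l in m) / (len(m) ** 2) if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster has no centroid
                # Inner product of item i with the centroid of cluster c.
                cross = sum(K[i][j] for j in m) / len(m)
                d = K[i][i] - 2.0 * cross + centroid_norm[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy data (hypothetical): two well-separated groups on the real line.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
# Linear kernel, chosen only to keep the example easy to check by hand.
K = [[a * b for b in points] for a in points]
labels = kernel_kmeans(K, [0, 1, 0, 1, 0, 1], n_clusters=2)
print(labels)
```

Once K has been built, the clustering loop never touches `points`. The same function could therefore be applied to a kernel matrix built from pairwise similarities of nonvectorial objects such as sequences or graph nodes, provided the matrix is a valid (positive semidefinite) kernel matrix.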
- Type: Chapter
- Information: Kernel Methods and Machine Learning, pp. 178-218
- Publisher: Cambridge University Press
- Print publication year: 2014