
Comparative Evaluation of Two Superior Stopping Rules for Hierarchical Cluster Analysis

Published online by Cambridge University Press:  01 January 2025

Robert S. Atlas
Affiliation:
The University of Texas Medical School at Houston
John E. Overall*
Affiliation:
The University of Texas Medical School at Houston
* Reprints may be requested from John E. Overall, University of Texas-Houston Medical School, Department of Psychiatry and Behavioral Science, P.O. Box 20708, Houston, TX 77225.

Abstract

A split-sample replication stopping rule for hierarchical cluster analysis is compared with the internal criterion previously found superior by Milligan and Cooper (1985) in their comparison of 30 different procedures. The number of latent population distributions and the extent of their overlap were systematically varied in the present evaluation of stopping-rule validity. Equal and unequal population base rates were also considered. Both stopping rules correctly identified the actual number of populations when there was essentially no overlap and clusters occupied visually distinct regions of the measurement space. The replication criterion, which is evaluated by clustering the cluster means obtained from preliminary analyses of random partitions of the original data set, was superior as the degree of overlap in population distributions increased. Neither method performed adequately when overlap obliterated visually discernible density nodes.
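The two kinds of stopping rule compared in the abstract can be illustrated with a minimal Python sketch. Several things here are assumptions rather than details taken from this page: the internal criterion is taken to be the Calinski and Harabasz (1974) variance ratio, which is the index usually reported as the best performer in Milligan and Cooper (1985); Ward's (1963) method is used for every clustering step; and the replication check (two random halves, with each higher-order cluster required to hold exactly one mean from each half) is only one plausible reading of "clustering of cluster means", not necessarily the authors' procedure. All function names are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def ward_labels(X, k):
    """Cut a Ward dendrogram of X into at most k clusters (labels start at 1)."""
    return fcluster(linkage(X, method="ward"), t=k, criterion="maxclust")


def variance_ratio(X, labels):
    """Calinski-Harabasz index: [B / (k - 1)] / [W / (n - k)]."""
    n, ids = len(X), np.unique(labels)
    k = len(ids)
    grand = X.mean(axis=0)
    between = within = 0.0
    for c in ids:
        members = X[labels == c]
        centre = members.mean(axis=0)
        between += len(members) * np.sum((centre - grand) ** 2)
        within += np.sum((members - centre) ** 2)
    return (between / (k - 1)) / (within / (n - k))


def internal_rule(X, k_values):
    """Assumed internal rule: pick the k with the largest variance ratio."""
    scores = {k: variance_ratio(X, ward_labels(X, k)) for k in k_values}
    return max(scores, key=scores.get)


def replicates(X, k, rng):
    """Assumed replication check for a k-cluster solution.

    Cluster two random halves separately, pool the 2k cluster means,
    cluster the means into k groups, and report whether every group
    contains exactly one mean from each half.
    """
    order = rng.permutation(len(X))
    mid = len(X) // 2
    halves = (X[order[:mid]], X[order[mid:]])
    means, origin = [], []
    for h, half in enumerate(halves):
        labels = ward_labels(half, k)
        for c in np.unique(labels):
            means.append(half[labels == c].mean(axis=0))
            origin.append(h)
    means, origin = np.array(means), np.array(origin)
    top = ward_labels(means, k)
    return all(sorted(origin[top == c]) == [0, 1] for c in np.unique(top))


if __name__ == "__main__":
    # Synthetic, well-separated populations (illustrative data only).
    rng = np.random.default_rng(0)
    centres = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
    X = np.vstack([c + rng.normal(size=(60, 2)) for c in centres])
    print("internal rule picks k =", internal_rule(X, range(2, 7)))
    print("k = 3 replicates:", replicates(X, 3, rng))
```

The sketch covers only the non-overlapping case in which, according to the abstract, both rules succeed; reproducing the reported differences would require the overlapping populations and unequal base rates of the original study design.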

Type
Original Paper
Copyright
Copyright © 1994 The Psychometric Society

Footnotes

This research was supported in part by NIMH grant 5R01 MH 32457 14.

References

Aldenderfer, M. S., & Blashfield, R. K. (1984). Cluster analysis. Beverly Hills, CA: Sage Publications.
Bayne, R., Beauchamp, J., Begovich, C., & Kane, V. (1980). Monte Carlo comparisons of selected clustering procedures. Pattern Recognition, 12, 51–62.
Blashfield, R. K. (1976). Mixture model tests of cluster analysis: Accuracy of four agglomerative hierarchical methods. Psychological Bulletin, 83, 377–388.
Blashfield, R., & Morey, L. (1980). A comparison of four clustering methods using MMPI Monte Carlo data. Applied Psychological Measurement, 4, 57–64.
Breckenridge, J. N. (1989). Replicating cluster analysis: Method, consistency and validity. Multivariate Behavioral Research, 24(2), 147–161.
Calinski, R. B., & Harabasz, J. (1974). A dendrite method for cluster analysis. Communications in Statistics, 3, 1–27.
Edelbrock, C. (1979). Comparing the accuracy of hierarchical clustering algorithms. Multivariate Behavioral Research, 14, 367–384.
Everitt, B. (1980). Cluster analysis. New York: Halsted.
Haggard, E. A. (1958). Intraclass correlation and the analysis of variance. New York, NY: The Dryden Press.
Jain, A. K., & Dubes, R. C. (1988). Algorithms for clustering data. Englewood Cliffs, NJ: Prentice Hall.
Kosko, B. (1992). Fuzziness and probability. Neural networks and fuzzy systems. Englewood Cliffs, NJ: Prentice Hall.
McIntyre, R. M., & Blashfield, R. K. (1980). A nearest-centroid technique for evaluating the minimum-variance clustering procedure. Multivariate Behavioral Research, 2, 225–238.
Milligan, G. W. (1989). A study of the beta-flexible clustering method. Multivariate Behavioral Research, 24, 163–176.
Milligan, G. W. (1980). An examination of the effects of six types of error perturbations on fifteen clustering algorithms. Psychometrika, 45, 325–342.
Milligan, G. W., & Cooper, M. C. (1985). An examination of procedures for determining the number of clusters in a data set. Psychometrika, 50, 159–179.
Mojena, R. (1977). Hierarchical grouping methods and stopping rules—an evaluation. Computer Journal, 20, 359–363.
Overall, J. E., Gibson, J. M., & Novy, D. M. (In press). Population recovery capabilities of 35 cluster analysis methods. Journal of Clinical Psychology.
Overall, J. E., & Klett, C. J. (1972). Applied multivariate analysis. New York: McGraw-Hill.
Overall, J. E., & Magee, K. N. (1992). Replication as a rule for determining the number of clusters in hierarchical cluster analysis. Applied Psychological Measurement, 16, 119–128.
Sneath, P. H. A., & Sokal, R. R. (1973). Numerical taxonomy. San Francisco: W. H. Freeman.
Ward, J. H. (1963). Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association, 58, 236–244.