Published online by Cambridge University Press: 04 January 2017
Social scientists spend considerable energy constructing typologies and discussing their roles in measurement. Less discussed is the role of typologies in evaluating and revising theoretical arguments. We argue that unsupervised machine learning tools can be profitably applied to the development and testing of theory-based typologies. We review recent advances in mixture models as applied to cluster analysis and argue that these tools are particularly important in the social sciences where it is common to claim that high-dimensional objects group together in meaningful clusters. Model-based clustering (MBC) grounds analysis in probability theory, permitting the evaluation of uncertainty and application of information-based model selection tools. We show that the MBC approach forces analysts to consider dimensionality problems that more traditional clustering tools obscure. We apply MBC to the “varieties of capitalism,” a typology receiving significant attention in political science and economic sociology. We find weak and conflicting evidence for the theory's expected grouping. We therefore caution against the current practice of including typology-derived dummy variables in regression and case-comparison research designs.
Edited by Jonathan N. Katz
Authors' note: Previous versions of this paper were presented at the 2008 meetings of the Society for Political Methodology, the 2009 10th Anniversary Conference for the University of Washington Center for Statistics and the Social Sciences, and colloquia at Duke, Purdue, and MPIfG. We thank Justin Grimmer, Martin Hoepner, Ryan Moore, Kevin Quinn, Adrian Raftery, and Mike Ward for comments and helpful conversation. Carlisle Rainey provided research assistance. Replication materials are available at the Political Analysis dataverse, Ahlquist's dataverse (http://dvn.iq.harvard.edu/dvn/dv/ahlquist), and Breunig's Web site (http://individual.utoronto.ca/cbreunig/).