Latent Class Modeling with Covariates: Two Improved Three-Step Approaches

Jeroen K. Vermunt

doi:10.1093/pan/mpq025

Published online by Cambridge University Press: 04 January 2017

Affiliation: Department of Methodology and Statistics, Tilburg University, PO Box 90153, 5000 LE Tilburg, The Netherlands.

Abstract

Researchers using latent class (LC) analysis often proceed using the following three steps: (1) an LC model is built for a set of response variables, (2) subjects are assigned to LCs based on their posterior class membership probabilities, and (3) the association between the assigned class membership and external variables is investigated using simple cross-tabulations or multinomial logistic regression analysis. Bolck, Croon, and Hagenaars (2004) demonstrated that such a three-step approach underestimates the associations between covariates and class membership. They proposed resolving this problem by means of a specific correction method that involves modifying the third step. In this article, I extend the correction method of Bolck, Croon, and Hagenaars by showing that it involves maximizing a weighted log-likelihood function for clustered data. This conceptualization makes it possible to apply the method not only with categorical but also with continuous explanatory variables, to obtain correct tests using complex sampling variance estimation methods, and to implement it in standard software for logistic regression analysis. In addition, a new maximum likelihood (ML)-based correction method is proposed, which is more direct in the sense that it does not require analyzing weighted data. This new three-step ML method can be easily implemented in software for LC analysis. The reported simulation study shows that both correction methods perform very well in the sense that their parameter estimates and their standard errors (SEs) can be trusted, except for situations with very poorly separated classes. The main advantage of the ML method compared with the Bolck, Croon, and Hagenaars approach is that it is much more efficient and almost as efficient as one-step ML estimation.
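
Step (2) of the procedure described above can be illustrated with a minimal sketch, which is not the article's own code. It assumes that step (1) has already produced posterior class membership probabilities, assigns each subject to the modal class, and estimates how often the assigned class differs from the true class. The function names and the toy posteriors are hypothetical, and the posterior-weighted estimator of the classification error matrix is an assumption, since the abstract does not spell out a formula.

```python
def modal_assignment(posteriors):
    """Step 2: assign each subject to the class with the highest posterior."""
    return [max(range(len(p)), key=lambda t: p[t]) for p in posteriors]


def classification_error_matrix(posteriors, assigned):
    """Estimate D[t][s] = P(assigned class = s | true class = t).

    Each subject contributes to row t in proportion to its posterior
    probability of belonging to class t (assumed estimator, see lead-in).
    """
    n_classes = len(posteriors[0])
    counts = [[0.0] * n_classes for _ in range(n_classes)]
    for p, s in zip(posteriors, assigned):
        for t in range(n_classes):
            counts[t][s] += p[t]
    # Normalize each row so that it sums to 1.
    return [[c / sum(row) for c in row] for row in counts]


# Made-up posteriors for four subjects and two classes (illustration only).
posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.4, 0.6]]
assigned = modal_assignment(posteriors)
error_matrix = classification_error_matrix(posteriors, assigned)

print(assigned)
for row in error_matrix:
    print([round(v, 3) for v in row])
```

Off-diagonal entries well above zero indicate poorly separated classes, the situation in which the abstract warns that even the corrected estimates become less trustworthy.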

Type: Research Article
Information: Political Analysis, Volume 18, Issue 4, Autumn 2010, pp. 450–469

DOI: https://doi.org/10.1093/pan/mpq025
Copyright: © The Author 2010. Published by Oxford University Press on behalf of the Society for Political Methodology

References

Bandeen-Roche, Karen, Miglioretti, Diana L., Zeger, Scott L., and Rathouz, Paul J. 1997. Latent variable regression for multiple discrete outcomes. Journal of the American Statistical Association 92: 1375–86.

Blaydes, Lisa, and Linzer, Drew A. 2008. The political economy of women's support for fundamentalist Islam. World Politics 60: 579–609.

Bolck, Annabel, Croon, Marcel A., and Hagenaars, Jacques A. 2004. Estimating latent structure models with categorical variables: One-step versus three-step estimators. Political Analysis 12: 3–27.

Breen, Richard. 2000. Why is support for extreme parties underestimated by surveys? A latent class analysis. British Journal of Political Science 30: 375–82.

Chung, Hwan, Flaherty, Brian P., and Schafer, Joseph L. 2006. Latent class logistic regression: Application to marijuana use and attitudes among high school seniors. Journal of the Royal Statistical Society Series A: Statistics in Society 169: 723–43.

Clogg, Clifford C. 1981. New developments in latent structure analysis. In Factor analysis and measurement in sociological research, ed. Jackson, D. J. and Borgatta, E. F., 215–46. Beverly Hills, CA: Sage.

Collins, Linda M., and Wugalter, Stuart E. 1992. Latent class models for stage-sequential dynamic latent variables. Multivariate Behavioral Research 27: 131–57.

Croon, Marcel A. 2002. Using predicted latent scores in general latent structure models. In Latent variable and latent structure models, ed. Marcoulides, George A. and Moustaki, Irini, 195–224. Mahwah, NJ: Lawrence Erlbaum.

Dalton, Russell J. 2006. The two faces of citizenship. Democracy & Society 3: 21–3.

Dalton, Russell J. 2008. Citizenship norms and the expansion of political participation. Political Studies 56: 76–98.

Dayton, C. Mitchell, and Macready, Geoffrey B. 1988. Concomitant-variable latent-class models. Journal of the American Statistical Association 83: 173–8.

Dias, José G., and Vermunt, Jeroen K. 2008. A bootstrap-based aggregate classifier for model-based clustering. Computational Statistics 23: 643–59.

Edlund, Jonas. 2006. Trust in the capability of the welfare state and general welfare state support: Sweden 1997–2002. Acta Sociologica 49: 395–417.

Feick, Lawrence F. 1989. Latent class analysis of survey questions that include don't know responses. Public Opinion Quarterly 53: 525–47.

Galindo-Garre, Francisca, and Vermunt, Jeroen K. 2006. Avoiding boundary estimates in latent class analysis by Bayesian posterior mode estimation. Behaviormetrika 33: 43–59.

Garrett, Elisabeth S., and Zeger, Scott L. 2000. Latent class model diagnosis. Biometrics 56: 1055–67.

Garrett, Elisabeth S., Eaton, William W., and Zeger, Scott L. 2002. Methods for evaluating the performance of diagnostic tests in the absence of a gold standard: A latent class model approach. Statistics in Medicine 21: 1289–307.

Goodman, Leo A. 1974a. The analysis of systems of qualitative variables when some of the variables are unobservable: Part I. A modified latent structure approach. American Journal of Sociology 79: 1179–259.

Goodman, Leo A. 1974b. Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika 61: 215–31.

Goodman, Leo A. 2007. On the assignment of individuals to classes. Sociological Methodology 37: 1–22.

Haberman, Shelby J. 1979. Analysis of qualitative data, Vol. 2: New developments. New York: Academic Press.

Hagenaars, Jacques A. 1990. Categorical longitudinal data: Loglinear analysis of panel, trend and cohort data. Newbury Park, CA: Sage.

Hagenaars, Jacques A. 1993. Loglinear models with latent variables. Newbury Park, CA: Sage.

Hill, Jennifer L., and Kriesi, Hanspeter. 2001a. Classification by opinion-changing behavior: A mixture model approach. Political Analysis 9: 301–24.

Hill, Jennifer L., and Kriesi, Hanspeter. 2001b. An extension and test of Converse's ‘black-and-white’ model of response stability. American Political Science Review 95: 397–413.

Howard, Marc M., Gibson, James L., and Stolle, Dietlind. 2005. The U.S. Citizenship, Involvement, Democracy survey. Center for Democracy and Civil Society, Georgetown University.

Kamakura, Wagner A., Wedel, Michel, and Agrawal, Jagdish. 1994. Concomitant variable latent class models for the external analysis of choice data. International Journal of Marketing Research 11: 451–64.

Katz, Jonathan N., and Katz, Gabriel. 2009. Reassessing the link between voter heterogeneity and political accountability: A latent class regression model of economic voting. Paper presented at the 26th Annual Society for Political Methodology Summer Conference, July 23–25, 2009, Yale University.

Lazarsfeld, Paul F., and Henry, Neil W. 1968. Latent structure analysis. Boston, MA: Houghton Mifflin.

Linzer, Drew A. 2006. A comparative analysis of ideological constraint using latent class models. Paper presented at the annual meeting of the Midwest Political Science Association, Palmer House Hilton, Chicago, IL, April 20, 2006.

Lu, Irene R. R., and Thomas, D. Roland. 2008. Avoiding and correcting bias in score-based latent variable regression with discrete manifest items. Structural Equation Modeling 15: 462–90.

Magidson, Jay. 1981. Qualitative variance, entropy, and correlation ratios for nominal dependent variables. Social Science Research 10: 177–94.

Magidson, Jay, and Vermunt, Jeroen K. 2001. Latent class factor and cluster models, bi-plots and related graphical displays. Sociological Methodology 31: 223–64.

McCutcheon, Allan L. 1985. A latent class analysis of tolerance for nonconformity in the American public. Public Opinion Quarterly 49: 474–88.

McCutcheon, Allan L. 1987. Latent class analysis. Newbury Park, CA: Sage.

McLachlan, Geoffrey J., and Peel, David. 2000. Finite mixture models. New York: Wiley.

Moors, Guy, and Vermunt, Jeroen K. 2007. Heterogeneity in postmaterialist value priorities. Evidence from a latent class discrete choice approach. European Sociological Review 23: 631–48.

Muthén, Linda K., and Muthén, Bengt O. 2004. Mplus 3.0: User's manual. Los Angeles, CA: Muthén and Muthén.

Patterson, Blossom H., Dayton, C. Mitchell, and Graubard, Barry I. 2002. Latent class analysis of complex sample survey data: Application to dietary data. Journal of the American Statistical Association 97: 721–8.

Rubin, Donald B. 1987. Multiple imputation for nonresponse in surveys. New York: Wiley.

Schafer, Joseph L. 1997. Analysis of incomplete multivariate data. London: Chapman & Hall.

Skinner, Chris J., Holt, Tim, and Smith, T. M. Fred. 1989. Analysis of complex surveys. New York: Wiley.

Simmons, Solon. 2008. Ascriptive justice: The prevalence, distribution, and consequences of political correctness in the academy. Forum 6: 8.

Skrondal, Anders, and Laake, Petter. 2001. Regression among factor scores. Psychometrika 88: 563–76.

Van den Hout, Ardo, and Van der Heijden, Peter G. M. 2004. The analysis of multivariate misspecified data, with special attention to randomized response data. Sociological Methods and Research 32: 310–36.

Van de Pol, Frank, and Langeheine, Rolf. 1990. Mixed Markov latent class models. Sociological Methodology 20: 213–47.

Van der Heijden, Peter G. M., Gilula, Zvi, and Van der Ark, L. Andries. 1999. An extended study into the relationship between correspondence analysis and latent class analysis. Sociological Methodology 29: 147–86.

Vermunt, Jeroen K. 1997. Log-linear models for event histories. Advanced quantitative techniques in the social sciences series. Thousand Oaks, CA: Sage.

Vermunt, Jeroen K. 2003. Multilevel latent class models. Sociological Methodology 33: 213–39.

Vermunt, Jeroen K. 2005. Mixed-effects logistic regression models for indirectly observed outcome variables. Multivariate Behavioral Research 40: 281–301.

Vermunt, Jeroen K. 2008. Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research 17: 33–51.

Vermunt, Jeroen K., Langeheine, Rolf, and Böckenholt, Ulf. 1999. Discrete-time discrete-state latent Markov models with time-constant and time-varying covariates. Journal of Educational and Behavioral Statistics 24: 178–205.

Vermunt, Jeroen K., and Magidson, Jay. 2004. Latent class analysis. In The Sage encyclopedia of social science research methods, ed. Lewis-Beck, Michael, Bryman, Alan, and Liao, Tim F., 549–53. Newbury Park, CA: Sage.

Vermunt, Jeroen K., and Magidson, Jay. 2005. Latent GOLD 4.0 user's guide. Belmont, MA: Statistical Innovations.

Vermunt, Jeroen K., and Magidson, Jay. 2008. LG-Syntax user's guide: Manual for Latent GOLD 4.5 syntax module. Belmont, MA: Statistical Innovations.

Yamaguchi, Kazuo. 2000. Multinomial logit latent-class regression models: An analysis of the predictors of gender-role attitudes among Japanese women. American Journal of Sociology 105: 1702–40.
