
Tests of Measurement Invariance Without Subgroups: A Generalization of Classical Methods

Published online by Cambridge University Press: 01 January 2025

Edgar C. Merkle*
Affiliation:
University of Missouri
Achim Zeileis
Affiliation:
Universität Innsbruck
*Requests for reprints should be sent to Edgar C. Merkle, Department of Psychological Sciences, University of Missouri, Columbia, MO 65211, USA. E-mail: [email protected]

Abstract

The issue of measurement invariance commonly arises in factor-analytic contexts, with methods for assessment including likelihood ratio tests, Lagrange multiplier tests, and Wald tests. These tests all require advance definition of the number of groups, group membership, and offending model parameters. In this paper, we study tests of measurement invariance based on stochastic processes of casewise derivatives of the likelihood function. These tests can be viewed as generalizations of the Lagrange multiplier test, and they are especially useful for: (i) identifying subgroups of individuals that violate measurement invariance along a continuous auxiliary variable without prespecified thresholds, and (ii) identifying specific parameters impacted by measurement invariance violations. The tests are presented and illustrated in detail, including an application to a study of stereotype threat and simulations examining the tests’ abilities in controlled conditions.
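As a rough illustration of the idea of a stochastic process built from casewise likelihood derivatives, the sketch below orders observations by a continuous auxiliary variable, accumulates the per-case scores of a fitted model, and summarizes the resulting process with a double-maximum statistic. Everything specific here is an assumption made for illustration: a univariate normal model stands in for a factor model, the function names `cumulative_score_process` and `double_max_test` are hypothetical, the information matrix is estimated by the outer product of scores, and the double-maximum statistic is one common choice from the structural-change literature rather than necessarily the statistic used in the paper.

```python
import numpy as np


def cumulative_score_process(x, v):
    """Scaled cumulative sums of casewise ML scores, ordered by v.

    Illustrative model (an assumption): x_i ~ N(mu, sigma2), with both
    parameters estimated by maximum likelihood on the full sample.
    """
    x = np.asarray(x, dtype=float)
    x = x[np.argsort(v, kind="stable")]  # order cases by the auxiliary variable
    n = x.size
    mu = x.mean()
    sigma2 = x.var()  # ML estimate (divides by n)

    # Casewise derivatives of the log-likelihood w.r.t. (mu, sigma2).
    scores = np.column_stack([
        (x - mu) / sigma2,
        -0.5 / sigma2 + (x - mu) ** 2 / (2.0 * sigma2 ** 2),
    ])

    # Outer-product estimate of the information matrix and its inverse root.
    info = scores.T @ scores / n
    eigval, eigvec = np.linalg.eigh(info)
    info_inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

    # Decorrelated, scaled cumulative score process (n rows, 2 columns).
    return np.cumsum(scores, axis=0) @ info_inv_sqrt / np.sqrt(n)


def double_max_test(process, terms=100):
    """Double-maximum statistic with an asymptotic p-value.

    The p-value uses the distribution of the supremum of independent
    Brownian bridges, one per parameter.
    """
    stat = np.abs(process).max()
    h = np.arange(1, terms + 1)
    p_single = 2.0 * np.sum((-1.0) ** (h + 1) * np.exp(-2.0 * h ** 2 * stat ** 2))
    p_single = min(max(p_single, 0.0), 1.0)
    k = process.shape[1]
    return stat, 1.0 - (1.0 - p_single) ** k


# Simulated example: the mean shifts for cases with v > 0.5.
rng = np.random.default_rng(0)
v = rng.uniform(size=400)
x = rng.normal(size=400) + 2.0 * (v > 0.5)

process = cumulative_score_process(x, v)
stat, p_value = double_max_test(process)
print(f"double-max statistic = {stat:.2f}, p-value = {p_value:.4f}")
```

Inspecting which column of `process` attains the maximum, and at which position along the ordering, gives an informal indication of which parameter is affected and where along the auxiliary variable the violation occurs, without a prespecified threshold.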

Type
Original Paper
Copyright
Copyright © The Psychometric Society

