On Latent Trait Estimation in Multidimensional Compensatory Item Response Models

Published online by Cambridge University Press: 01 January 2025

Chun Wang*
Affiliation:
University of Minnesota
* Requests for reprints should be sent to Chun Wang, University of Minnesota, 75 East River Road, Elliott Hall, N658, Minneapolis, MN, 55455, USA. E-mail: [email protected]

Abstract

Making inferences from IRT-based test scores requires accurate and reliable methods of person parameter estimation. Given an already calibrated set of item parameters, the latent trait can be estimated either via maximum likelihood estimation (MLE) or using Bayesian methods such as maximum a posteriori (MAP) estimation or expected a posteriori (EAP) estimation. In addition, Warm’s (Psychometrika 54:427–450, 1989) weighted likelihood estimation method was proposed to reduce the bias of the latent trait estimate in unidimensional models. In this paper, we extend the weighted MLE method to multidimensional models. The new method, denoted multivariate weighted MLE (MWLE), is proposed to reduce the bias of the MLE even for short tests. MWLE is compared with alternative estimators (i.e., MLE, MAP and EAP) and is shown, both analytically and through simulation studies, to be less biased than MLE while maintaining a similar variance. In contrast, the Bayesian estimators (i.e., MAP and EAP) yield biased estimates with smaller variability.
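
As a rough illustration of the estimators named in the abstract, the sketch below scores one response pattern under a two-dimensional compensatory 2PL model with already calibrated items. Everything specific in it is an assumption made for illustration: the item parameters in `ITEMS`, the responses in `RESPONSES`, the standard normal prior used for MAP and EAP, and the function names. The bias-reduced estimate uses Firth's (1993) penalized score for logistic models, which is a stand-in chosen here; the abstract does not give the MWLE formula, so this should not be read as the paper's method.

```python
import math

# Hypothetical calibrated items for a two-dimensional compensatory 2PL model:
# (a1, a2, d) = discriminations and intercept. Values are illustrative only.
ITEMS = [(1.2, 0.0, 0.5), (1.0, 0.0, -0.5), (0.0, 1.1, 0.3),
         (0.0, 0.9, -0.3), (0.8, 0.6, 0.0), (0.5, 1.0, -0.2)]
RESPONSES = [1, 0, 1, 0, 1, 1]  # hypothetical scored responses (1 = correct)


def prob(theta, item):
    """P(correct) = logistic(a1*theta1 + a2*theta2 + d)."""
    a1, a2, d = item
    return 1.0 / (1.0 + math.exp(-(a1 * theta[0] + a2 * theta[1] + d)))


def info(theta, ridge=0.0):
    """Fisher information as (i11, i12, i22), plus `ridge` on the diagonal."""
    i11 = i12 = i22 = 0.0
    for a1, a2, d in ITEMS:
        p = prob(theta, (a1, a2, d))
        w = p * (1.0 - p)
        i11 += w * a1 * a1
        i12 += w * a1 * a2
        i22 += w * a2 * a2
    return i11 + ridge, i12, i22 + ridge


def score(theta, method="mle"):
    """Estimating function whose root defines the chosen estimator."""
    i11, i12, i22 = info(theta)
    det = i11 * i22 - i12 * i12
    g = [0.0, 0.0]
    for (a1, a2, d), u in zip(ITEMS, RESPONSES):
        p = prob(theta, (a1, a2, d))
        r = u - p
        if method == "firth":
            # Hat value h = w * a' I^{-1} a; Firth's score adds h * (1/2 - p).
            quad = (a1 * a1 * i22 - 2 * a1 * a2 * i12 + a2 * a2 * i11) / det
            r += p * (1.0 - p) * quad * (0.5 - p)
        g[0] += r * a1
        g[1] += r * a2
    if method == "map":  # a standard normal prior contributes -theta
        g = [g[0] - theta[0], g[1] - theta[1]]
    return g


def estimate(method="mle", max_iter=500, tol=1e-12):
    """Scoring iteration theta <- theta + J^{-1} score(theta), from the origin."""
    theta = [0.0, 0.0]
    for _ in range(max_iter):
        g = score(theta, method)
        j11, j12, j22 = info(theta, ridge=1.0 if method == "map" else 0.0)
        det = j11 * j22 - j12 * j12
        step = [(j22 * g[0] - j12 * g[1]) / det,
                (j11 * g[1] - j12 * g[0]) / det]
        theta = [theta[0] + step[0], theta[1] + step[1]]
        if math.hypot(step[0], step[1]) < tol:
            break
    return theta


def eap(n_points=81, bound=4.0):
    """Posterior mean under a standard normal prior, on a rectangular grid."""
    grid = [-bound + 2.0 * bound * k / (n_points - 1) for k in range(n_points)]
    total = m1 = m2 = 0.0
    for t1 in grid:
        for t2 in grid:
            weight = math.exp(-0.5 * (t1 * t1 + t2 * t2))  # unnormalized prior
            for item, u in zip(ITEMS, RESPONSES):
                p = prob((t1, t2), item)
                weight *= p if u == 1 else 1.0 - p
            total += weight
            m1 += weight * t1
            m2 += weight * t2
    return [m1 / total, m2 / total]


if __name__ == "__main__":
    for name in ("mle", "map", "firth"):
        print(name, [round(x, 3) for x in estimate(name)])
    print("eap", [round(x, 3) for x in eap()])
```

The response pattern was chosen so that the MLE is finite; for all-correct or all-incorrect patterns the MLE does not exist, which is one practical reason the Bayesian and penalized alternatives are of interest. Under the zero-centered normal prior assumed here, the MAP estimate cannot lie farther from the origin than the MLE, which is the shrinkage the abstract describes as bias accompanied by smaller variability.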

Type
Original Paper
Copyright
Copyright © 2014 The Psychometric Society

References

Ackerman, T., Gierl, M.J., & Walker, C.M. (2003). Using multidimensional item response theory to evaluate educational and psychological tests. Educational Measurement: Issues and Practice, 22, 37–51.
Anderson, J.A., & Richardson, S.C. (1979). Logistic discrimination and bias correction in maximum likelihood estimation. Technometrics, 21, 71–78.
Bock, R.D., Gibbons, R., Schilling, S.G., Muraki, E., Wilson, D.T., & Wood, R. (2003). TESTFACT 4.0 [Computer software and manual]. Lincolnwood: Scientific Software International.
Cai, L. (2008). A Metropolis–Hastings Robbins–Monro algorithm for maximum likelihood nonlinear latent structure analysis with a comprehensive measurement model. Unpublished doctoral dissertation, University of North Carolina, Chapel Hill, NC.
Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis–Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57.
Cai, L. (2010). Metropolis–Hastings Robbins–Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307–335.
Cai, L., Thissen, D., & du Toit, S.H.C. (2011). IRTPRO: flexible, multidimensional, multiple categorical IRT modeling [Computer software]. Lincolnwood: Scientific Software International.
Chalmers, R.P. (2012). MIRT: A multidimensional item response theory package for the R environment. Journal of Statistical Software. www.jstatsoft.org.
Eignor, D.R., & Schaeffer, G.A. (1995). Comparability studies for the GRE General CAT and the NCLEX using CAT. Paper presented at the meeting of the National Council on Measurement in Education, San Francisco, April.
Finkelman, M., Nering, M.L., & Roussos, L.A. (2009). A conditional exposure method for multidimensional adaptive testing. Journal of Educational Measurement, 46, 84–103.
Firth, D. (1993). Bias reduction of maximum likelihood estimates. Biometrika, 80, 27–38.
Fraser, C. (1998). NOHARM: a Fortran program for fitting unidimensional and multidimensional normal ogive models in latent trait theory. The University of New England, Center for Behavioral Studies, Armidale, Australia.
Hattie, J. (1981). Decision criteria for determining unidimensionality. Unpublished doctoral dissertation, University of Toronto, Canada.
Kim, J.K., & Nicewander, W.A. (1993). Ability estimation for conventional tests. Psychometrika, 58, 587–599.
Lee, P. (1989). Bayesian statistics: an introduction. London: Edward Arnold.
Lehmann, E.L., & Casella, G. (1998). Theory of point estimation. New York: Springer.
Lord, F.M. (1983). Unbiased estimation of ability parameters, of their variance and of their parallel forms reliability. Psychometrika, 48, 223–245.
Lord, F.M. (1986). Maximum likelihood and Bayesian parameter estimation in item response theory. Journal of Educational Measurement, 2, 157–162.
Lord, F.M., & Novick, M.R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.
Mulder, J., & van der Linden, W.J. (2009). Multidimensional adaptive testing with optimal design criteria for item selection. Psychometrika, 74, 273–296.
Reckase, M.D. (2009). Multidimensional item response theory. New York: Springer.
Samejima, F. (1993). An approximation for the bias function of the maximum likelihood estimate of a latent variable for the general case where the item responses are discrete. Psychometrika, 58, 119–138.
Schaefer, R.L. (1983). Bias correction in maximum likelihood logistic regression. Statistics in Medicine, 2, 71–78.
Segall, D.O. (1996). Multidimensional adaptive testing. Psychometrika, 61, 331–354.
Serfling, R.J. (1980). Approximation theorems of mathematical statistics. New York: Wiley.
Stroud, A.H., & Sechrest, D. (1966). Gaussian quadrature formulas. Englewood Cliffs: Prentice-Hall.
Tao, J., Shi, N., & Chang, H. (2012). Item-weighted likelihood method for ability estimation in tests composed of both dichotomous and polytomous items. Journal of Educational and Behavioral Statistics, 37, 298–315.
Tseng, F.L., & Hsu, T.C. (2001). Multidimensional adaptive testing using the weighted likelihood estimation: a comparison of estimation methods. Paper presented at the annual meeting of Seattle, WA.
van der Linden, W.J. (1999). Multidimensional adaptive testing with a minimum error-variance criterion. Journal of Educational and Behavioral Statistics, 24, 398–412.
van der Linden, W.J. (1999). A procedure for empirical initialization of the trait estimator in adaptive testing. Applied Psychological Measurement, 23, 21–29.
van der Linden, W.J. (2007). A hierarchical framework for modeling speed and accuracy on test items. Psychometrika, 72, 287–308.
van der Linden, W.J. (2008). Using response times for item selection in adaptive testing. Journal of Educational and Behavioral Statistics, 33, 5–20.
Veldkamp, B.P., & van der Linden, W.J. (2002). Multidimensional adaptive testing with constraints on test content. Psychometrika, 67, 575–588.
Warm, T.A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427–450.
Wang, C., & Chang, H. (2011). Item selection in multidimensional computerized adaptive tests—gaining information from different angles. Psychometrika, 76, 363–384.
Wang, S., & Wang, T. (2001). Precision of Warm’s weighted likelihood estimates for a polytomous model in computerized adaptive testing. Applied Psychological Measurement, 25, 317–331.
Wang, T., Hanson, B.A., & Lau, C.-M.A. (1999). Reducing bias in CAT trait estimation: a comparison of approaches. Applied Psychological Measurement, 23, 263–278.
Wang, W.C., Chen, P.H., & Cheng, Y.Y. (2004). Improving measurement precision of test batteries using multidimensional item response models. Psychological Methods, 9, 116–136.
Wang, C., Chang, H., & Boughton, K. (2011). Kullback–Leibler information and its applications in multi-dimensional adaptive testing. Psychometrika, 76, 13–39.
Zhang, J., Xie, M., Song, X., & Lu, T. (2011). Investigating the impact of uncertainty about item parameters on ability estimation. Psychometrika, 76, 97–118.