Modification Indices for the 2-PL and the Nominal Response Model

Published online by Cambridge University Press: 01 January 2025

Cees A. W. Glas*
Affiliation:
Department of Educational Measurement and Data Analysis, University of Twente
*Requests for reprints should be sent to Cees A. W. Glas, Department of Educational Measurement and Data Analysis, P.O. Box 217, 7500 AE Enschede, the Netherlands.

Abstract

In this paper, it is shown that various violations of the 2-PL model and the nominal response model can be evaluated using the Lagrange multiplier test or the equivalent efficient score test. The tests presented here focus on violation of local stochastic independence and insufficient capture of the form of the item characteristic curves. Primarily, the tests are item-oriented diagnostic tools, but taken together, they also serve the purpose of evaluation of global model fit. A useful feature of Lagrange multiplier statistics is that they are evaluated using maximum likelihood estimates of the null-model only, that is, the parameters of alternative models need not be estimated. As numerical examples, an application to real data and some power studies are presented.
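
To illustrate why only the null model needs to be estimated, the general textbook form of the Lagrange multiplier (efficient score) statistic can be sketched as follows. The notation is an assumption made for this illustration and is not taken from the paper: the parameter vector is partitioned as $\eta = (\eta_1, \eta_2)$, the null hypothesis fixes $\eta_2 = c$, $h_2$ is the first-order derivative of the log-likelihood with respect to $\eta_2$, and $H_{jk}$ are the corresponding blocks of the information matrix.

```latex
\[
  \mathrm{LM} \;=\; h_2(\hat{\eta}_1, c)^{\top}\, W^{-1}\, h_2(\hat{\eta}_1, c),
  \qquad
  W \;=\; H_{22} - H_{21} H_{11}^{-1} H_{12},
\]
% All quantities are evaluated at (\hat{\eta}_1, c), where \hat{\eta}_1 is the
% maximum likelihood estimate of \eta_1 under the null model \eta_2 = c.
% Under the null hypothesis, LM is asymptotically chi-square distributed with
% degrees of freedom equal to the number of parameters in \eta_2.
```

Because the score with respect to $\eta_1$ is zero at the null-model estimate, the statistic depends only on $h_2$ and the information blocks at that point, so the alternative model itself is never fitted.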

Type
Original Paper
Copyright
Copyright © 1999 The Psychometric Society
