Bayesian IRT Guessing Models for Partial Guessing Behaviors

Published online by Cambridge University Press: 01 January 2025

Jing Cao*
Affiliation:
Southern Methodist University
S. Lynne Stokes
Affiliation:
Southern Methodist University
* Requests for reprints should be sent to Jing Cao, Department of Statistical Science, Southern Methodist University, University Park, USA. E-mail: [email protected]

Abstract

According to the recent Nation’s Report Card, 12th-graders failed to produce gains on the 2005 National Assessment of Educational Progress (NAEP) despite earning better grades on average. One possible explanation is that 12th-graders were not motivated when taking the NAEP, which is a low-stakes test. We develop three Bayesian IRT mixture models to describe the results from a group of examinees that includes both nonguessers and partial guessers. The first model assumes that the guesser answers questions based on his or her knowledge up to a certain test item and guesses thereafter. The second model assumes that the guesser answers relatively easy questions based on his or her knowledge and guesses randomly on the remaining items. The third model is constructed to describe more general low-motivation behavior: it assumes that the guesser gives less and less effort as he or she proceeds through the test. The models provide not only consistent estimates of IRT parameters but also estimates of each examinee’s nonguesser/guesser status and degree of guessing behavior. We present a simulation study comparing the performance of the three guessing models with that of the 2PL-IRT model. Finally, we present an analysis of real data from a low-stakes test administered to university students.
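The three guessing behaviors described above can be illustrated with a minimal Python sketch of the response probabilities for a partial guesser. This is not the authors' parameterization, which the abstract does not give. The function names, the fixed guessing probability `g = 0.25`, the change-point and difficulty-threshold rules, and the exponential effort decay are all assumptions made here for illustration only. The only standard ingredient is the 2PL response function with discrimination `a` and difficulty `b`.

```python
import math


def p_2pl(theta, a, b):
    """Standard 2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def p_changepoint(theta, a, b, position, change_point, g=0.25):
    """Behavior 1 (assumed form): knowledge-based answers up to and
    including item `change_point`, random guessing on later items."""
    return p_2pl(theta, a, b) if position <= change_point else g


def p_difficulty_threshold(theta, a, b, threshold, g=0.25):
    """Behavior 2 (assumed form): knowledge-based answers on items whose
    difficulty is at most `threshold`, random guessing on harder items."""
    return p_2pl(theta, a, b) if b <= threshold else g


def p_declining_effort(theta, a, b, position, decay, g=0.25):
    """Behavior 3 (assumed form): effort decays exponentially with item
    position, so the response is a weighted mix of 2PL and guessing."""
    effort = math.exp(-decay * (position - 1))  # equals 1.0 on the first item
    return effort * p_2pl(theta, a, b) + (1.0 - effort) * g


# Hypothetical five-item test given as (discrimination, difficulty) pairs.
items = [(1.0, -1.0), (1.2, -0.5), (0.8, 0.0), (1.5, 0.5), (1.0, 1.5)]
theta = 0.3  # hypothetical examinee ability

for position, (a, b) in enumerate(items, start=1):
    print(
        position,
        round(p_2pl(theta, a, b), 3),
        round(p_changepoint(theta, a, b, position, change_point=3), 3),
        round(p_difficulty_threshold(theta, a, b, threshold=0.25), 3),
        round(p_declining_effort(theta, a, b, position, decay=0.3), 3),
    )
```

In a mixture model of the kind the abstract describes, each examinee would additionally carry a latent nonguesser/guesser indicator, and quantities such as the change point, threshold, or decay rate would be estimated rather than fixed as they are in this sketch.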

Type
Theory and Methods
Copyright
Copyright © 2007 The Psychometric Society
