Dynamical Non-compensatory Multidimensional IRT Model Using Variational Approximation

Published online by Cambridge University Press: 01 January 2025

Hiroshi Tamano*
Affiliation:
The Graduate University for Advanced Studies, Sokendai
Daichi Mochihashi
Affiliation:
The Institute of Statistical Mathematics
* Correspondence should be made to Hiroshi Tamano, Department of Statistical Science, The Graduate University for Advanced Studies, Sokendai, 10-3 Midori-cho, Tachikawa, Tokyo 190-8562, Japan. Email: [email protected]

Abstract

Multidimensional item response theory (MIRT) is a statistical test theory that precisely estimates multiple latent skills of learners from their responses in a test. Both compensatory and non-compensatory models have been proposed for MIRT: the former assume that each skill can complement the other skills, whereas the latter assume that it cannot. The non-compensatory assumption is convincing for many tests that measure multiple skills; therefore, applying non-compensatory models to such data is crucial for unbiased and accurate estimation. Unlike in a test, latent skills change over time in daily learning. To monitor the growth of skills, dynamical extensions of MIRT models have been investigated. However, most of them assume compensatory models, and no model proposed thus far can reproduce continuous latent states of skills under the non-compensatory assumption. To enable accurate skill tracing under this assumption, we propose a dynamical extension of non-compensatory MIRT models that combines a linear dynamical system with a non-compensatory model. The combination yields a complicated posterior over skills, which we approximate with a Gaussian distribution by minimizing the Kullback–Leibler divergence between the approximate posterior and the true posterior. The learning algorithm for the model parameters is derived through Monte Carlo expectation maximization. Simulation studies verify that the proposed method reproduces latent skills accurately, whereas the dynamical compensatory model suffers from significant underestimation errors. Furthermore, experiments on an actual data set demonstrate that our dynamical non-compensatory model can perform practical skill tracing and clarify the differences in skill tracing between non-compensatory and compensatory models.
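
The distinction between the two model families can be made concrete with a small numerical sketch. The code below uses the commonly cited textbook forms, a logistic function of a weighted sum of skills for the compensatory case and a product of per-skill logistic terms for the non-compensatory case. These forms, the function names, and the parameter values are assumptions made for illustration; the abstract does not state the exact parameterization used in the paper.

```python
import math


def sigmoid(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


def p_compensatory(theta, a, b):
    """Compensatory form (assumed): skills are summed before the logistic
    link, so a high skill can offset a low one."""
    linear = sum(a_k * t_k for a_k, t_k in zip(a, theta)) - b
    return sigmoid(linear)


def p_noncompensatory(theta, a, b):
    """Non-compensatory form (assumed): one logistic term per skill,
    multiplied together, so a single low skill pulls the product down."""
    prob = 1.0
    for a_k, t_k, b_k in zip(a, theta, b):
        prob *= sigmoid(a_k * (t_k - b_k))
    return prob


# Hypothetical learner: strong on skill 1, weak on skill 2.
theta = [3.0, -3.0]
a = [1.0, 1.0]  # illustrative discrimination parameters

p_comp = p_compensatory(theta, a, b=0.0)
p_non = p_noncompensatory(theta, a, b=[0.0, 0.0])

print(f"compensatory:     {p_comp:.3f}")
print(f"non-compensatory: {p_non:.3f}")
```

With these illustrative values the two skills cancel in the compensatory form, giving a success probability of one half, while the weak skill dominates the product in the non-compensatory form and drives the probability well below that.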

Type
Theory and Methods
Copyright
Copyright © The Author(s) under exclusive licence to The Psychometric Society
