A Constrained Metropolis–Hastings Robbins–Monro Algorithm for Q Matrix Estimation in DINA Models

Chen-Wei Liu; Björn Andersson; Anders Skrondal

doi:10.1007/s11336-020-09707-4


Published online by Cambridge University Press: 01 January 2025

Affiliations:
Chen-Wei Liu*: National Taiwan Normal University
Björn Andersson: University of Oslo
Anders Skrondal: Norwegian Institute of Public Health; University of Oslo; University of California, Berkeley
*: Correspondence should be made to Chen-Wei Liu, Department of Educational Psychology and Counseling, National Taiwan Normal University, 162, Section 1, Heping E. Road, 10610, Taipei, Taiwan. Email: [email protected]


Abstract

In diagnostic classification models (DCMs), the Q matrix encodes which attributes are required for each item. The Q matrix is usually predetermined by the researcher but may in practice be misspecified, which yields incorrect statistical inference. Instead of using a predetermined Q matrix, it is possible to estimate it simultaneously with the item and structural parameters of the DCM. Unfortunately, current methods are computationally intensive when there are many attributes and items. In addition, the identification constraints necessary for DCMs are not always enforced in the estimation algorithms, which can lead to non-identified models being considered. We address these problems by simultaneously estimating the item, structural and Q matrix parameters of the Deterministic Input Noisy “And” gate model using a constrained Metropolis–Hastings Robbins–Monro algorithm. Simulations show that the new method is computationally efficient and can outperform previously proposed Bayesian Markov chain Monte-Carlo algorithms in terms of Q matrix recovery, and item and structural parameter estimation. We also illustrate our approach using Tatsuoka’s fraction–subtraction data and Certificate of Proficiency in English data.
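To make the role of the Q matrix concrete, the following minimal sketch evaluates the standard DINA item response function, in which a respondent answers an item correctly with probability 1 - slip when all attributes required by the item's Q matrix row are mastered, and with probability guess otherwise. The Q matrix, the slip and guess values, and the function names are invented for illustration and are not taken from the article; the constrained Metropolis–Hastings Robbins–Monro estimation itself is not shown.

```python
def ideal_response(alpha, q_row):
    """Return 1 if every attribute required by the item (q = 1) is mastered (alpha = 1)."""
    return int(all(a >= q for a, q in zip(alpha, q_row)))


def dina_prob(alpha, q_row, slip, guess):
    """DINA probability of a correct response for one respondent and one item."""
    eta = ideal_response(alpha, q_row)
    return (1.0 - slip) if eta == 1 else guess


# Hypothetical Q matrix: 3 items (rows) by 2 attributes (columns).
Q = [
    [1, 0],  # item 1 requires attribute 1 only
    [0, 1],  # item 2 requires attribute 2 only
    [1, 1],  # item 3 requires both attributes
]
slip = [0.10, 0.20, 0.15]   # assumed slipping parameters, one per item
guess = [0.20, 0.25, 0.10]  # assumed guessing parameters, one per item

# A respondent who has mastered attribute 1 but not attribute 2.
alpha = [1, 0]
probs = [dina_prob(alpha, Q[j], slip[j], guess[j]) for j in range(len(Q))]
print(probs)
```

Changing a single entry of `Q` changes which respondents are expected to answer that item correctly. This is why a misspecified Q matrix distorts inference, and why the entries of Q can be treated as quantities to estimate.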

Keywords

Diagnostic classification models; Q matrix; stochastic algorithm

Type: Theory and Methods
Information: Psychometrika, Volume 85, Issue 2, June 2020, pp. 322–357

DOI: https://doi.org/10.1007/s11336-020-09707-4
Copyright: Copyright © 2020 The Psychometric Society, corrected publication 2024


References

Agresti, A., & Hitchcock, D. B. (2005). Bayesian inference for categorical data analysis. Statistical Methods & Applications, 14(3), 297–330.

Bock, R. D., & Lieberman, M. (1970). Fitting a response model for n dichotomously scored items. Psychometrika, 35(2), 179–197.

Buck, G., & Tatsuoka, K. (1998). Application of the rule-space procedure to language testing: Examining attributes of a free response listening test. Language Testing, 15(2), 119–157.

Cai, L. (2010). High-dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins-Monro algorithm. Psychometrika, 75(1), 33–57.

Chen, Y., Culpepper, S. A., Chen, Y., & Douglas, J. (2017). Bayesian estimation of the DINA Q matrix. Psychometrika, 83(1), 89–108.

Chen, Y., Culpepper, S., & Liang, F. (2020). A sparse latent class model for cognitive diagnosis. Psychometrika, 85(1), 121–153.

Chen, Y., Liu, J., Xu, G., & Ying, Z. (2015). Statistical analysis of Q-matrix based diagnostic classification models. Journal of the American Statistical Association, 110(510), 850–866.

Culpepper, S. A. (2015). Bayesian estimation of the DINA model with Gibbs sampling. Journal of Educational and Behavioral Statistics, 40(5), 454–476.

Culpepper, S. A. (2019). Estimating the cognitive diagnosis Q matrix with expert knowledge: Application to the fraction-subtraction dataset. Psychometrika, 84(2), 333–357.

Culpepper, S. A. (2019). An exploratory diagnostic model for ordinal responses with binary attributes: Identifiability and estimation. Psychometrika, 84(4), 921–940.

Culpepper, S. A., & Chen, Y. (2019). Development and application of an exploratory reduced reparameterized unified model. Journal of Educational and Behavioral Statistics, 44(1), 3–24.

de la Torre, J. (2008). An empirically based method of Q-matrix validation for the DINA model: Development and applications. Journal of Educational Measurement, 45(4), 343–362.

de la Torre, J. (2011). The generalized DINA model framework. Psychometrika, 76(2), 179–199.

de la Torre, J., & Chiu, C.-Y. (2016). A general method of empirical Q-matrix validation. Psychometrika, 81(2), 253–273.

de la Torre, J., & Douglas, J. A. (2004). Higher-order latent trait models for cognitive diagnosis. Psychometrika, 69(3), 333–353.

DeCarlo, L. T. (2011). On the analysis of fraction subtraction data: The DINA model, classification, latent class sizes, and the Q-matrix. Applied Psychological Measurement, 35(1), 8–26.

DeCarlo, L. T. (2012). Recognizing uncertainty in the Q-matrix via a Bayesian extension of the DINA model. Applied Psychological Measurement, 36(6), 447–468.

Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1), 1–38.

Diebolt, J., & Ip, E. H. S. (1996). Stochastic EM: Method and application. In W. R. Gilks, S. Richardson, & D. J. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 259–273). London: Chapman and Hall.

Gelman, A., Hwang, J., & Vehtari, A. (2014). Understanding predictive information criteria for Bayesian models. Statistics and Computing, 24(6), 997–1016.

George, A. C., & Robitzsch, A. (2015). Cognitive diagnosis models in R: A didactic. The Quantitative Methods for Psychology, 11(3), 189–205.

George, A. C., Robitzsch, A., Kiefer, T., Groß, J., & Ünlü, A. (2016). The R package CDM for cognitive diagnosis models. Journal of Statistical Software, 74(2), 1–24.

George, E. I., & McCulloch, R. E. (1997). Approaches for Bayesian variable selection. Statistica Sinica, 7(2), 339–373.

Gu, M. G., & Kong, F. H. (1998). A stochastic approximation algorithm with Markov chain Monte-Carlo method for incomplete data estimation problems. Proceedings of the National Academy of Sciences, 95(13), 7270–7274.

Gu, Y., & Xu, G. (2018). Sufficient and necessary conditions for the identifiability of the Q-matrix. Statistica Sinica.

Gu, Y., & Xu, G. (2019). The sufficient and necessary condition for the identifiability and estimability of the DINA model. Psychometrika, 84(2), 468–483.

Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In J. M. Bernardo, J. Berger, A. P. Dawid, & A. F. M. Smith (Eds.), Bayesian Statistics 4 (pp. 169–193). Oxford University Press.

Haertel, E. H. (1989). Using restricted latent class models to map the skill structure of achievement items. Journal of Educational Measurement, 26(4), 301–321.

Hartz, S. M. (2002). A Bayesian framework for the unified model for assessing cognitive abilities: Blending theory with practicality. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.

Henson, R. A., Templin, J. L., & Willse, J. T. (2009). Defining a family of cognitive diagnosis models using log-linear models with latent variables. Psychometrika, 74(2), 191–210.

Junker, B. W., & Sijtsma, K. (2001). Cognitive assessment models with few assumptions, and connections with nonparametric item response theory. Applied Psychological Measurement, 25(3), 258–272.

Kunina-Habenicht, O., Rupp, A. A., & Wilhelm, O. (2009). A practical illustration of multidimensional diagnostic skills profiling: Comparing results from confirmatory factor analysis and diagnostic classification models. Studies in Educational Evaluation, 35, 64–70.

Liu, J. (2017). On the consistency of Q-matrix estimation: A commentary. Psychometrika, 82(2), 523–527.

Liu, C.-W., & Chalmers, R. P. (2020). A note on computing Louis’ observed information matrix identity for IRT and cognitive diagnostic models. British Journal of Mathematical and Statistical Psychology.

Liu, J., Xu, G., & Ying, Z. (2012). Data-driven learning of Q-matrix. Applied Psychological Measurement, 36(7), 548–564.

Liu, J., Xu, G., & Ying, Z. (2013). Theory of the self-learning Q-matrix. Bernoulli, 19(5A), 1790–1817.

Liu, J. S. (1996). Peskun’s theorem and a modified discrete-state Gibbs sampler. Biometrika, 83(3), 681–682.

Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 44(2), 226–233.

Macready, G. B., & Dayton, C. M. (1977). The use of probabilistic models in the assessment of mastery. Journal of Educational Statistics, 2(2), 99–120.

Madigan, D., York, J., & Allard, D. (1995). Bayesian graphical models for discrete data. International Statistical Review, 63(2), 215–232.

Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item response theory models. Measurement: Interdisciplinary Research and Perspectives, 11(3), 71–101.

Orlando, M., & Thissen, D. (2000). Likelihood-based item-fit indices for dichotomous item response theory models. Applied Psychological Measurement, 24(1), 50–64.

Perks, W. (1947). Some observations on inverse probability including a new indifference rule. Journal of the Institute of Actuaries, 73(2), 285–334.

Philipp, M., Strobl, C., de la Torre, J., & Zeileis, A. (2017). On the estimation of standard errors in cognitive diagnosis models. Journal of Educational and Behavioral Statistics, 43(1), 88–115.

Robbins, H., & Monro, S. (1951). A stochastic approximation method. The Annals of Mathematical Statistics, 22(3), 400–407.

Rupp, A. A., & Templin, J. L. (2008). The effects of Q-matrix misspecification on parameter estimates and classification accuracy in the DINA model. Educational and Psychological Measurement, 68(1), 78–96.

Rupp, A. A., Templin, J. L., & Henson, R. A. (2010). Diagnostic measurement: Theory, methods, and applications. New York: Guilford Press.

Schwarz, G. (1978). Estimating the dimension of a model. The Annals of Statistics, 6(2), 461–464.

Sun, Y., Ye, S., Su, G., & Sun, Y. (2016). Q-matrix learning and DINA model parameter estimation. Paper presented at the 2016 International Conference on Behavioral, Economic and Socio-cultural Computing (BESC), Durham, NC, USA.

Tatsuoka, C. (2002). Data analytic methods for latent partially ordered classification models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 51(3), 337–350.

Tatsuoka, K. K. (1984). Analysis of errors in fraction addition and subtraction problems (Final Report for NIE-G-81-0002). Urbana-Champaign: University of Illinois.

Templin, J. L., & Bradshaw, L. (2014). Hierarchical diagnostic classification models: A family of models for estimating and testing attribute hierarchies. Psychometrika, 79(2), 317–339.

Templin, J. L., & Hoffman, L. (2013). Obtaining diagnostic classification model estimates using Mplus. Educational Measurement: Issues and Practice, 32(2), 37–50.

Templin, J. L., & Henson, R. A. (2006). Measurement of psychological disorders using cognitive diagnosis models. Psychological Methods, 11(3), 287–

von Davier, M. (2008). A general diagnostic model applied to language testing data. British Journal of Mathematical and Statistical Psychology, 61(2), 287–307.

von Davier, M. (2014). The DINA model as a constrained general diagnostic model: Two variants of a model equivalency. British Journal of Mathematical and Statistical Psychology, 67(1), 49–71.

von Davier, M., & Sinharay, S. (2010). Stochastic approximation methods for latent regression item response models. Journal of Educational and Behavioral Statistics, 35(2), 174–193.

Wang, S. (2018). Two-stage maximum likelihood estimation in the misspecified restricted latent class model. British Journal of Mathematical and Statistical Psychology, 71(2), 300–333.

Wang, W., Song, L., Ding, S., Meng, Y., Cao, C., & Jie, Y. (2018). An EM-based method for Q-matrix validation. Applied Psychological Measurement, 42(6), 446–459.

Watanabe, S. (2010). Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. Journal of Machine Learning Research, 11(Dec), 3571–3594.

Wei, G. C., & Tanner, M. A. (1990). A Monte Carlo implementation of the EM algorithm and the poor man’s data augmentation algorithms. Journal of the American Statistical Association, 85(411), 699–704.

Xu, G. (2017). Identifiability of restricted latent class models with binary responses. The Annals of Statistics, 45(2), 675–707.

Xu, G., & Shang, Z. (2017). Identifying latent structures in restricted latent class models. Journal of the American Statistical Association, 113(423), 1284–1295.

Yang, J. S., & Cai, L. (2014). Estimation of contextual effects through nonlinear multilevel latent variable modeling with a Metropolis-Hastings Robbins-Monro algorithm. Journal of Educational and Behavioral Statistics, 39(6), 550–582.

Zhang, S., Chen, Y., & Liu, Y. (2020). An improved stochastic EM algorithm for large-scale full-information item factor analysis. British Journal of Mathematical and Statistical Psychology, 73(1), 44–71.

A correction has been issued for this article:

Erratum: A Constrained Metropolis–Hastings Robbins–Monro Algorithm for Q Matrix Estimation in DINA Models. Chen-Wei Liu, Björn Andersson and Anders Skrondal. Psychometrika, Volume 89, Issue 3.
