
Learning Large Q-Matrix by Restricted Boltzmann Machines

Published online by Cambridge University Press:  01 January 2025

Chengcheng Li
Affiliation:
University of Michigan
Chenchen Ma
Affiliation:
University of Michigan
Gongjun Xu*
Affiliation:
University of Michigan
* Correspondence should be made to Gongjun Xu, Department of Statistics, University of Michigan, Ann Arbor, USA. Email: [email protected]

Abstract

Estimating a large Q-matrix in cognitive diagnosis models (CDMs) with many items and latent attributes from observational data has been a major challenge because of its high computational cost. Borrowing ideas from the deep learning literature, we propose to learn the large Q-matrix with restricted Boltzmann machines (RBMs) to overcome these computational difficulties. We identify key relationships between RBMs and CDMs and show that the Q-matrix can be learned consistently and robustly in various CDMs under certain conditions. Our simulation studies under different CDM settings show that RBMs not only outperform existing methods in learning speed but also maintain good recovery accuracy of the Q-matrix. Finally, we illustrate the applicability and effectiveness of our method on a TIMSS mathematics data set.

Type
Theory & Methods
Copyright
Copyright © 2022 The Author(s) under exclusive licence to The Psychometric Society

Footnotes

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s11336-021-09828-4.
