Hostname: page-component-745bb68f8f-d8cs5 Total loading time: 0 Render date: 2025-01-08T12:22:15.872Z Has data issue: false hasContentIssue false

An Efficient MCMC Algorithm to Sample Binary Matrices with Fixed Marginals

Published online by Cambridge University Press:  01 January 2025

Norman D. Verhelst*
Affiliation:
CITO, National Institute for Educational Measurement
*
Requests for reprints should be sent to Norman D. Verhelst, CITO, National Institute for Educational Measurement, P.O. Box 1034, 6801 MG Arnhem, The Netherlands. E-mail: [email protected]; [email protected]

Abstract

Uniform sampling of binary matrices with fixed margins is known as a difficult problem. Two classes of algorithms to sample from a distribution not too different from the uniform are studied in the literature: importance sampling and Markov chain Monte Carlo (MCMC). Existing MCMC algorithms converge slowly, require a long burn-in period and yield highly dependent samples. Chen et al. developed an importance sampling algorithm that is highly efficient for relatively small tables. For larger but still moderate sized tables (300×30) Chen et al.’s algorithm is less efficient. This article develops a new MCMC algorithm that converges much faster than the existing ones and that is more efficient than Chen’s algorithm for large problems. Its stationary distribution is uniform. The algorithm is extended to the case of square matrices with fixed diagonal for applications in social network theory.

Type
Theory and Methods
Copyright
Copyright © 2008 The Psychometric Society

Access options

Get access to the full version of this content by using one of the access options below. (Log in options will check for institutional or personal access. Content may require purchase if you do not have access.)

Footnotes

I am indebted to my colleague Gunter Maris for his suggestion to add a Metropolis–Hastings step as the finishing touch of the algorithm.

References

Besag, J., & Clifford, P. (1989). Generalized Monte Carlo significance tests. Biometrika, 76, 633642.CrossRefGoogle Scholar
Chen, Y. (2006). Simple existence conditions for zero-one matrices with at most one structural zero in each row and column. Discrete Mathematics, 306, 28702877.CrossRefGoogle Scholar
Chen, Y., Diaconis, P., Holmes, S., & Liu, J. (2005). Sequential Monte Carlo methods for statistical analysis of tables. Journal of the American Statistical Association, 100, 109120.CrossRefGoogle Scholar
Chen, Y., & Small, D. (2005). Exact tests for the Rasch model via sequential importance sampling. Psychometrika, 70, 1130.CrossRefGoogle Scholar
Connor, E., & Simberloff, D. (1979). The assembly of species communities: chance or competition. Ecology, 60, 11321140.CrossRefGoogle Scholar
Gale, D. (1957). A theorem on flows in networks. Pacific Journal of Mathematics, 7, 10731082.CrossRefGoogle Scholar
Guttorp, P. (1995). Stochastic modeling of scientific data, London: Chapman and Hall.CrossRefGoogle Scholar
Hastings, W.K. (1970). Monte Carlo sampling methods using Markov chains and their applications. Biometrika, 57, 97109.CrossRefGoogle Scholar
Kong, A., Liu, J., & Wong, W. (1994). Sequential imputations and Bayesian missing data problems. Journal of the American Statistical Association, 89, 278288.CrossRefGoogle Scholar
Marshall, A., & Olkin, I. (1979). Inequalities: theory of majorization and its applications, San Diego: Academic Press.Google Scholar
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., & Teller, E. (1953). Equations of state calculations by fast computing machines. Journal of Chemical Physics, 21, 10871091.CrossRefGoogle Scholar
Musalem, A., Bradlow, E., & Raju, J. (2008, in press). Bayesian estimation of random-coefficients models using aggregate data. Journal of Applied Econometrics.CrossRefGoogle Scholar
Ponocny, I. (2001). Nonparametric goodness-of-fit tests for the Rasch model. Psychometrika, 66, 437460.CrossRefGoogle Scholar
Prabhu, N. (1965). Stochastic processes. Basic theory and its applications, New York: Macmillan.Google Scholar
Rao, A., Jana, R., & Bandyopadhyay, S. (1996). A Markov chain Monte Carlo method for generating random (0,1)-matrices with given marginals. Sankhya, Series A, 58, 225242.Google Scholar
Roberts, A., & Stone, L. (1990). Island sharing by archipelago species. Oecologia, 83, 560567.CrossRefGoogle ScholarPubMed
Ryser, H. (1957). Combinatorial properties of matrices with zeros and ones. The Canadian Journal of Mathematics, 9, 371377.CrossRefGoogle Scholar
Ryser, H. (1963). Combinatorial mathematics. Carus mathematical monographs, Washington: The Mathematical Association of America.Google Scholar
Snijders, T. (1991). Enumeration and simulation for 0-1 matrices with given marginals. Psychometrika, 56, 397417.CrossRefGoogle Scholar
Tanner, M.A. (1996). Tools for statistical inference, (3rd ed.). New York: Springer.CrossRefGoogle Scholar
Wasserman, S. (1977). Random directed graph distributions and the triad census in social networks. Journal of Mathematical Sociology, 5, 6186.CrossRefGoogle Scholar