
An Introduction to Bayesian Inference via Variational Approximations

Published online by Cambridge University Press:  04 January 2017

Justin Grimmer*
Affiliation:
Department of Political Science, Stanford University, 616 Serra St., Encina Hall West, Room 100, Stanford, CA 94305. e-mail: [email protected]

Abstract


Markov chain Monte Carlo (MCMC) methods have facilitated an explosion of interest in Bayesian methods. MCMC is an incredibly useful and important tool but can face difficulties when used to estimate complex posteriors or models applied to large data sets. In this paper, we show how a recently developed tool in computer science for fitting Bayesian models, variational approximations, can be used to facilitate the application of Bayesian models to political science data. Variational approximations are often much faster than MCMC for fully Bayesian inference and in some instances facilitate the estimation of models that would be otherwise impossible to estimate. As a deterministic posterior approximation method, variational approximations are guaranteed to converge and convergence is easily assessed. But variational approximations do have some limitations, which we detail below. Therefore, variational approximations are best suited to problems when fully Bayesian inference would otherwise be impossible. Through a series of examples, we demonstrate how variational approximations are useful for a variety of political science research. This includes models to describe legislative voting blocs and statistical models for political texts. The code that implements the models in this paper is available in the supplementary material.
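The claim that a variational approximation is deterministic and that its convergence is easy to check can be illustrated with a small sketch. The example below is not taken from the paper or its supplementary material. It assumes a standard textbook model (normal data with unknown mean and precision, under a conjugate normal-gamma prior) and a factorized approximation q(mu, tau) = q(mu) q(tau). The function name `cavi_normal`, the prior values, and the simulated data are all hypothetical choices made for illustration. For simplicity, convergence is judged by the change in E[tau] between iterations rather than by the variational lower bound itself.

```python
import random


def cavi_normal(x, mu0=0.0, lam0=1e-3, a0=1e-3, b0=1e-3, tol=1e-10, max_iter=1000):
    """Coordinate-ascent variational inference for x_n ~ N(mu, 1/tau).

    Assumed prior (illustrative only): mu | tau ~ N(mu0, 1/(lam0 * tau)),
    tau ~ Gamma(a0, b0). Approximation: q(mu, tau) = q(mu) q(tau), with
    q(mu) = N(mu_n, 1/lam_n) and q(tau) = Gamma(a_n, b_n).
    """
    n = len(x)
    xbar = sum(x) / n
    # These quantities are fixed across iterations for this model.
    mu_n = (lam0 * mu0 + n * xbar) / (lam0 + n)
    a_n = a0 + (n + 1) / 2.0
    sq_dev = sum((xi - mu_n) ** 2 for xi in x) + lam0 * (mu_n - mu0) ** 2

    e_tau = 1.0  # initial guess for E_q[tau]
    trace = [e_tau]
    b_n = b0
    for _ in range(max_iter):
        # Update q(mu): its precision depends on the current E_q[tau].
        lam_n = (lam0 + n) * e_tau
        # Update q(tau): its rate depends on the variance of q(mu), 1/lam_n.
        b_n = b0 + 0.5 * (sq_dev + (n + lam0) / lam_n)
        new_e_tau = a_n / b_n
        trace.append(new_e_tau)
        converged = abs(new_e_tau - e_tau) < tol
        e_tau = new_e_tau
        if converged:
            break
    return {"mu_n": mu_n, "lam_n": (lam0 + n) * e_tau,
            "a_n": a_n, "b_n": b_n, "trace": trace}


# Simulated data (hypothetical): 500 draws with mean 2.0 and sd 0.5.
rng = random.Random(0)
x = [rng.gauss(2.0, 0.5) for _ in range(500)]
fit = cavi_normal(x)
print("posterior mean of mu:", fit["mu_n"])
print("E[tau]:", fit["a_n"] / fit["b_n"])
print("iterations:", len(fit["trace"]) - 1)
```

Because every update is a closed-form function of the current parameters, rerunning the sketch on the same data reproduces the same sequence of values, unlike an MCMC run, which depends on random draws. A fuller implementation would typically monitor the lower bound on the marginal likelihood, which these updates never decrease.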

Type: Research Article

Copyright © The Author 2010. Published by Oxford University Press on behalf of the Society for Political Methodology

Supplementary material: PDF
