An Introduction to Bayesian Inference via Variational Approximations

Justin Grimmer

doi:10.1093/pan/mpq027


Published online by Cambridge University Press: 04 January 2017


Affiliation: Department of Political Science, Stanford University, 616 Serra St., Encina Hall West, Room 100, Stanford, CA 94305.


Abstract


Markov chain Monte Carlo (MCMC) methods have facilitated an explosion of interest in Bayesian methods. MCMC is an incredibly useful and important tool but can face difficulties when used to estimate complex posteriors or models applied to large data sets. In this paper, we show how a recently developed tool in computer science for fitting Bayesian models, variational approximations, can be used to facilitate the application of Bayesian models to political science data. Variational approximations are often much faster than MCMC for fully Bayesian inference and in some instances facilitate the estimation of models that would otherwise be impossible to estimate. As a deterministic posterior approximation method, variational approximations are guaranteed to converge, and convergence is easily assessed. But variational approximations do have some limitations, which we detail below. Therefore, variational approximations are best suited to problems where fully Bayesian inference would otherwise be impossible. Through a series of examples, we demonstrate how variational approximations are useful for a variety of political science research problems. These examples include models to describe legislative voting blocs and statistical models for political texts. The code that implements the models in this paper is available in the supplementary material.
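To make the claim about deterministic convergence concrete, the following is a minimal sketch of a mean-field variational approximation for a toy model. It is not one of the models from the paper. It assumes normally distributed data with unknown mean and precision, a conjugate Normal-Gamma prior, and a factorized approximation q(mu) q(tau). The function name `cavi_normal`, the prior values, and the tolerance are all hypothetical choices made for illustration. For simplicity, convergence is monitored through the change in E[tau] between iterations; tracking the lower bound on the marginal likelihood is a common alternative.

```python
import random


def cavi_normal(x, mu0=0.0, lambda0=1.0, a0=1.0, b0=1.0,
                tol=1e-10, max_iter=1000):
    """Mean-field approximation q(mu) q(tau) for x_i ~ Normal(mu, 1/tau).

    Assumed priors (illustrative only, not taken from the paper):
    mu | tau ~ Normal(mu0, 1 / (lambda0 * tau)), tau ~ Gamma(a0, rate=b0).
    """
    n = len(x)
    xbar = sum(x) / n
    # These quantities are fixed across iterations in this model.
    mu_n = (lambda0 * mu0 + n * xbar) / (lambda0 + n)
    a_n = a0 + (n + 1) / 2.0
    sq_dev = sum((xi - mu_n) ** 2 for xi in x)

    e_tau = a0 / b0  # starting value for E[tau]
    converged = False
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Update q(mu) = Normal(mu_n, 1 / lambda_n) given the current E[tau].
        lambda_n = (lambda0 + n) * e_tau
        # Update q(tau) = Gamma(a_n, rate=b_n) given the current q(mu).
        e_lik = sq_dev + n / lambda_n
        e_prior = lambda0 * ((mu_n - mu0) ** 2 + 1.0 / lambda_n)
        b_n = b0 + 0.5 * (e_lik + e_prior)
        new_e_tau = a_n / b_n
        change = abs(new_e_tau - e_tau)
        e_tau = new_e_tau
        # Convergence check: the update is deterministic, so a small
        # change in E[tau] means the fixed point has been reached.
        if change < tol:
            converged = True
            break
    return {
        "mu_n": mu_n,
        "lambda_n": (lambda0 + n) * e_tau,
        "a_n": a_n,
        "b_n": b_n,
        "iterations": iterations,
        "converged": converged,
    }


# Simulated data (hypothetical): 200 draws with mean 2 and unit variance.
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(200)]

fit = cavi_normal(data)
refit = cavi_normal(data)  # same data, same starting values

print("converged:", fit["converged"], "after", fit["iterations"], "iterations")
print("identical on rerun:", fit == refit)
print("E[mu] =", fit["mu_n"], " E[tau] =", fit["a_n"] / fit["b_n"])
```

Unlike an MCMC sampler, rerunning the sketch on the same data with the same starting values returns exactly the same approximation, and the stopping rule is a single numerical comparison rather than a diagnostic applied to simulated chains.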

Type: Research Article
Information: Political Analysis, Volume 19, Issue 1, Winter 2011, pp. 32–47

DOI: https://doi.org/10.1093/pan/mpq027
Copyright: Copyright © The Author 2010. Published by Oxford University Press on behalf of the Society for Political Methodology

