Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference

Published online by Cambridge University Press:  04 January 2017

Daniel E. Ho
Affiliation:
Stanford Law School, 559 Nathan Abbott Way, Stanford, CA 94305. e-mail: [email protected]
Kosuke Imai
Affiliation:
Department of Politics, Princeton University, Princeton, NJ 08544. e-mail: [email protected]
Gary King
Affiliation:
Department of Government, Harvard University, 1737 Cambridge Street, Cambridge, MA 02138. e-mail: [email protected] (corresponding author)
Elizabeth A. Stuart
Affiliation:
Departments of Mental Health and Biostatistics, Johns Hopkins Bloomberg School of Public Health, 624 North Broadway, Room 804, Baltimore, MD 21205. e-mail: [email protected]

Abstract

Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author's favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these misinterpretations and propose a unified approach that makes it possible for researchers to preprocess data with matching (such as with the easy-to-use software we offer) and then to apply the best parametric techniques they would have used anyway. This procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.
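The two-step workflow described in the abstract (preprocess the data with matching, then fit the parametric model one would have used anyway) can be illustrated with a minimal Python sketch. This is not the authors' software and makes several simplifying assumptions: a single scalar covariate, greedy 1:1 nearest-neighbor matching without replacement, and ordinary least squares as the parametric model. The function names `nearest_neighbor_match` and `ols_treatment_effect` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def nearest_neighbor_match(x, treated):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Assumes `x` is a 1-D covariate array and `treated` a 0/1 array of the
    same length. Returns the indices of the retained units, ordered as
    (treated, matched control) pairs.
    """
    treated_idx = np.flatnonzero(treated == 1)
    control_pool = list(np.flatnonzero(treated == 0))
    keep = []
    for i in treated_idx:
        if not control_pool:
            break  # no controls left; remaining treated units go unmatched
        dists = np.abs(x[control_pool] - x[i])
        j = control_pool.pop(int(np.argmin(dists)))  # closest remaining control
        keep.extend([int(i), int(j)])
    return np.array(keep, dtype=int)


def ols_treatment_effect(y, treated, x):
    """Coefficient on `treated` from an OLS fit of y on (1, treated, x)."""
    design = np.column_stack([np.ones(len(x)), treated, x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return float(coef[1])


# Synthetic data (an assumption for illustration): treatment is more likely
# at high x, and the outcome depends on x nonlinearly.
rng = np.random.default_rng(0)
n = 500
x = rng.normal(size=n)
treated = (rng.random(n) < 1.0 / (1.0 + np.exp(-(x - 1.5)))).astype(int)
y = 2.0 * treated + x**2 + rng.normal(scale=0.5, size=n)

# Step 1: preprocess by matching. Step 2: fit the same parametric model.
matched = nearest_neighbor_match(x, treated)
effect_raw = ols_treatment_effect(y, treated, x)
effect_matched = ols_treatment_effect(y[matched], treated[matched], x[matched])
print("raw:", effect_raw, "matched:", effect_matched)
```

The parametric model is identical in both fits; only the data passed to it change. In practice one would match on several covariates (for example through a propensity score or Mahalanobis distance) and check covariate balance before estimating the effect.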

Type
Research Article
Copyright
Copyright © The Author 2007. Published by Oxford University Press on behalf of the Society for Political Methodology 
