Generalized Full Matching

Fredrik Sävje; Michael J. Higgins; Jasjeet S. Sekhon

doi:10.1017/pan.2020.32

Published online by Cambridge University Press: 23 November 2020

Fredrik Sävje (corresponding author): Department of Political Science and Department of Statistics & Data Science, Yale University, New Haven, CT, USA
Michael J. Higgins: Department of Statistics, Kansas State University, Manhattan, KS, USA
Jasjeet S. Sekhon: Travers Department of Political Science and Department of Statistics, UC Berkeley, Berkeley, CA, USA

Abstract

Matching is a conceptually straightforward method to make groups of units comparable on observed characteristics. The method is, however, limited to settings where the study design is simple and the sample is moderately sized. We illustrate these limitations by asking what the causal effects would have been if a large-scale voter mobilization experiment that took place in Michigan for the 2006 election were scaled up to the full population of registered voters. Matching could help us answer this question, but no existing matching method can accommodate the six treatment arms and the 6,762,701 observations involved in the study. To offer a solution for this and similar empirical problems, we introduce a generalization of the full matching method that can be used with any number of treatment conditions and complex compositional constraints. The associated algorithm produces near-optimal matchings; the worst-case maximum within-group dissimilarity is guaranteed to be no more than four times greater than the optimal solution, and simulation results indicate that it comes considerably closer to the optimal solution on average. The algorithm’s ability to balance the treatment groups does not sacrifice speed, and it uses little memory, terminating in linearithmic time using linear space. This enables investigators to construct well-performing matchings within minutes even in complex studies with samples of several million units.
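
To make the objective in the abstract concrete, the following is a minimal sketch of how one might evaluate a given matching: it computes the maximum within-group dissimilarity and checks a simple compositional constraint. It is not the authors' algorithm. The function names, the use of Euclidean distance, and the toy data are assumptions made purely for illustration.

```python
from itertools import combinations
from math import dist


def max_within_group_dissimilarity(covariates, groups):
    """Largest distance between any two units assigned to the same group.

    Euclidean distance is an assumption; any dissimilarity measure could be used.
    """
    worst = 0.0
    for members in groups:
        for i, j in combinations(members, 2):
            worst = max(worst, dist(covariates[i], covariates[j]))
    return worst


def satisfies_constraints(treatments, groups, required):
    """Check that every group holds at least required[t] units of each condition t."""
    for members in groups:
        counts = {}
        for i in members:
            counts[treatments[i]] = counts.get(treatments[i], 0) + 1
        if any(counts.get(t, 0) < k for t, k in required.items()):
            return False
    return True


# Hypothetical toy data: five units, two covariates, three treatment conditions.
covariates = [(0.0, 0.0), (0.0, 1.0), (3.0, 4.0), (3.0, 5.0), (0.5, 0.5)]
treatments = ["control", "a", "control", "a", "b"]
groups = [[0, 1, 4], [2, 3]]  # a candidate matching, given as lists of unit indices

print(max_within_group_dissimilarity(covariates, groups))
print(satisfies_constraints(treatments, groups, {"control": 1, "a": 1}))
print(satisfies_constraints(treatments, groups, {"control": 1, "a": 1, "b": 1}))
```

The pairwise loop is quadratic within each group and serves only to evaluate a matching; constructing a near-optimal matching in linearithmic time is the contribution of the article itself.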

Keywords

causal inference; matching methods; treatment effects

Type: Article
Information: Political Analysis, Volume 29, Issue 4, October 2021, pp. 423–447

DOI: https://doi.org/10.1017/pan.2020.32
Copyright: © The Author(s) 2020. Published by Cambridge University Press on behalf of the Society for Political Methodology

Footnotes

Edited by Daniel Hopkins
