
Using Split Samples to Improve Inference on Causal Effects

Published online by Cambridge University Press: 18 September 2017

Marcel Fafchamps
Affiliation:
Stanford University, Freeman Spogli Institute for International Studies, Encina Hall E105, Stanford, CA 94305, USA. Email: [email protected]
Julien Labonne*
Affiliation:
Blavatnik School of Government, University of Oxford, Radcliffe Observatory Quarter, Woodstock Road, Oxford OX2 6GG, UK. Email: [email protected]

Abstract

We discuss a statistical procedure for carrying out empirical research that combines recent insights about preanalysis plans (PAPs) and replication. Researchers send their dataset to an independent third party, who randomly generates a training sample and a testing sample. Researchers perform their analysis on the training sample and can incorporate feedback from colleagues, editors, and referees. Once the paper is accepted for publication, the analysis is applied to the testing sample, and it is those results that are published. Simulations indicate that, in empirically relevant settings, the proposed method delivers more power than a PAP. The gain mostly operates through a lower likelihood that relevant hypotheses are left untested. The method is best suited to exploratory analyses in which there is significant uncertainty about the outcomes of interest. We do not recommend the method in situations where the treatment is very costly and the available sample size is therefore limited. One interpretation of the method is that it allows researchers to perform a direct replication of their own work. We also discuss a number of practical issues regarding the method's feasibility and implementation.
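The core of the procedure is a single random partition of the data. The code below is a minimal sketch in Python, not the authors' implementation: the function name split_sample, the 50/50 split share, the random seeds, and the simulated treatment data are illustrative assumptions. In the proposed workflow the split would be generated by the independent third party, who releases the testing sample only after acceptance.

import numpy as np
import pandas as pd


def split_sample(df: pd.DataFrame, train_share: float = 0.5, seed: int = 0):
    """Randomly partition a dataset into a training and a testing sample.

    In the procedure described in the abstract, this step is carried out by an
    independent third party, who withholds the testing sample until the paper
    has been accepted for publication.
    """
    rng = np.random.default_rng(seed)
    in_train = rng.random(len(df)) < train_share
    return df[in_train].copy(), df[~in_train].copy()


# Illustrative simulated data: a binary treatment and a noisy outcome.
rng = np.random.default_rng(1)
n = 1000
data = pd.DataFrame({"treated": rng.integers(0, 2, size=n)})
data["outcome"] = 0.2 * data["treated"] + rng.normal(size=n)

train, test = split_sample(data, seed=42)

# Exploratory work (choice of outcomes, specifications, subgroups) uses only
# `train`; the final, pre-committed specification is then re-estimated on
# `test`, and those are the results that would be published.
train_diff = train.groupby("treated")["outcome"].mean().diff().iloc[-1]
test_diff = test.groupby("treated")["outcome"].mean().diff().iloc[-1]
print(f"training-sample difference in means: {train_diff:.3f}")
print(f"testing-sample difference in means:  {test_diff:.3f}")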

Type
Articles
Copyright
Copyright © The Author(s) 2017. Published by Cambridge University Press on behalf of the Society for Political Methodology. 

Footnotes

Authors' note: We thank Michael Alvarez (Co-Editor), two anonymous referees, Rob Garlick, and Kate Vyborny for discussions and comments. All remaining errors are ours. Replication data are available on the Harvard Dataverse (Fafchamps and Labonne 2017). Supplementary materials for this article are available on the Political Analysis Web site.

Contributing Editor: R. Michael Alvarez

References

Anderson, Michael L. 2008. Multiple inference and gender differences in the effects of early intervention: A reevaluation of the Abecedarian, Perry Preschool, and Early Training projects. Journal of the American Statistical Association 103(484):1481–1495.
Athey, Susan, and Imbens, Guido. 2015. Machine learning methods for estimating heterogeneous causal effects. Stanford University. Mimeo.
Bell, Mark, and Miller, Nicholas. 2015. Questioning the effect of nuclear weapons on conflict. Journal of Conflict Resolution 59(1):74–92.
Belloni, Alexandre, Chernozhukov, Victor, and Hansen, Christian. 2014. High-dimensional methods and inference on structural and treatment effects. Journal of Economic Perspectives 28(2):29–50.
Benjamini, Yoav, Krieger, Abba M., and Yekutieli, Daniel. 2006. Adaptive linear step-up procedures that control the false discovery rate. Biometrika 93(3):491–507.
Benjamini, Yoav, and Yekutieli, Daniel. 2001. The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics 29(4):1165–1188.
Benjamini, Yoav, and Hochberg, Yosef. 1995. Controlling the false discovery rate: A practical and powerful approach to multiple testing. Journal of the Royal Statistical Society. Series B (Methodological) 57(1):289–300.
Blair, Graeme, Cooper, Jasper, Coppock, Alexander, and Humphreys, Macartan. 2016. Declaring and diagnosing research designs. Columbia University. Mimeo.
Brodeur, Abel, Le, Mathias, Sangnier, Marc, and Zylberberg, Yanos. 2016. Star wars: The empirics strike back. American Economic Journal: Applied Economics 8(1):1–32.
Coffman, Lucas C., and Niederle, Muriel. 2015. Pre-analysis plans are not the solution replications might be. Journal of Economic Perspectives 29(3):81–98.
Dunning, Thad. 2016. Transparency, replication, and cumulative learning: What experiments alone cannot achieve. Annual Review of Political Science 19(1):S1–S23.
Einav, Liran, and Levin, Jonathan. 2014. Economics in the age of big data. Science 346(6210):715.
Fafchamps, Marcel, and Labonne, Julien. 2017. Replication data for "Using split samples to improve inference on causal effects". doi:10.7910/DVN/Q0IXQY, Harvard Dataverse, V1.
Findley, Michael G., Jensen, Nathan M., Malesky, Edmund J., and Pepinsky, Thomas B. Forthcoming. Can results-free review reduce publication bias? The results and implications of a pilot study. Comparative Political Studies.
Franco, Annie, Malhotra, Neil, and Simonovits, Gabor. 2014. Publication bias in the social sciences: Unlocking the file drawer. Science 345(6203):1502–1505.
Gelman, Andrew. 2014. Preregistration: What's in it for you? http://andrewgelman.com/2014/03/10/preregistration-whats/.
Gelman, Andrew. 2015. The connection between varying treatment effects and the crisis of unreplicable research. Journal of Management 41(2):632–643.
Gelman, Andrew, Carlin, John, Stern, Hal, Dunson, David, Vehtari, Aki, and Rubin, Donald. 2013. Bayesian data analysis. 3rd edn. London: Chapman and Hall/CRC.
Gerber, Alan, and Malhotra, Neil. 2008. Do statistical reporting standards affect what is published? Publication bias in two leading political science journals. Quarterly Journal of Political Science 3(3):313–326.
Gerber, Alan S., Green, Donald P., and Nickerson, David. 2001. Testing for publication bias in political science. Political Analysis 9(4):385–392.
Green, Don, Humphreys, Macartan, and Smith, Jenny. 2013. Read it, understand it, believe it, use it: Principles and proposals for a more credible research publication. Columbia University. Mimeo.
Grimmer, Justin. 2015. We are all social scientists now: How big data, machine learning, and causal inference work together. PS: Political Science & Politics 48(1):80–83.
Hainmueller, Jens, and Hazlett, Chad. 2013. Kernel regularized least squares: Reducing misspecification bias with a flexible and interpretable machine learning approach. Political Analysis 22(2):143–168.
Hartman, Erin, and Hidalgo, F. Daniel. 2015. What's the alternative? An equivalence approach to balance and placebo tests. UCLA. Mimeo.
Humphreys, Macartan, Sanchez de la Sierra, Raul, and van der Windt, Peter. 2013. Fishing, commitment, and communication: A proposal for comprehensive nonbinding research registration. Political Analysis 21(1):1–20.
Ioannidis, John. 2005. Why most published research findings are false. PLOS Medicine 2(8):e124.
Laitin, David D. 2013. Fisheries management. Political Analysis 21:42–47.
Leamer, Edward. 1974. False models and post-data model construction. Journal of the American Statistical Association 69(345):122–131.
Leamer, Edward. 1978. Specification searches: Ad hoc inference with nonexperimental data. New York, NY: Wiley.
Leamer, Edward. 1983. Let's take the con out of econometrics. American Economic Review 73(1):31–43.
Lin, Winston, and Green, Donald P. 2016. Standard operating procedures: A safety net for pre-analysis plans. PS: Political Science & Politics 49(3):495–500.
Lovell, M. 1983. Data mining. Review of Economics and Statistics 65(1):1–12.
Miguel, E., Camerer, C., Casey, K., Cohen, J., Esterling, K. M., Gerber, A., Glennerster, R., Green, D. P., Humphreys, M., Imbens, G., Laitin, D., Madon, T., Nelson, L., Nosek, B. A., Petersen, M., Sedlmayr, R., Simmons, J. P., Simonsohn, U., and Van der Laan, M. 2014. Promoting transparency in social science research. Science 343(6166):30–31.
Monogan, James E. 2015. Research preregistration in political science: The case, counterarguments, and a response to critiques. PS: Political Science & Politics 48(3):425–429.
Nyhan, Brendan. 2015. Increasing the credibility of political science research: A proposal for journal reforms. PS: Political Science & Politics 48(S1):78–83.
Olken, Benjamin. 2015. Pre-analysis plans in economics. Journal of Economic Perspectives 29(3):61–80.
Pepinsky, Tom. 2013. The perilous peer review process. http://tompepinsky.com/2013/09/16/the-perilous-peer-review-process/.
Rauchhaus, Robert. 2009. Evaluating the nuclear peace hypothesis: A quantitative approach. Journal of Conflict Resolution 53(2):258–277.
Sankoh, A. J., Huque, M. F., and Dubey, S. D. 1997. Some comments on frequently used multiple endpoint adjustment methods in clinical trials. Statistics in Medicine 16(22):2529–2542.
Supplementary material

Fafchamps and Labonne supplementary material 1 (File, 167.8 KB)