
External Validity and Evidence Accumulation

Published online by Cambridge University Press: 29 November 2024

Tara Slough, New York University
Scott A. Tyson, University of Rochester

Summary

Accumulating empirical evidence collected across multiple contexts, places, and times requires a more comprehensive understanding of empirical research than is typically needed to interpret the findings of an individual study. We advance a novel conceptual framework in which causal mechanisms are central to characterizing social phenomena that transcend context, place, or time. We distinguish several concepts of external validity, all of which characterize the relationship between the effects produced by mechanisms in different settings. Approaches to evidence accumulation require careful consideration of cross-study features, including theoretical considerations that link constituent studies and measurement considerations about how phenomena are quantified. Our main theoretical contribution is the development of uniting principles: the qualitative and quantitative assumptions that form the basis for a quantitative relationship between constituent studies. We then apply our framework to three approaches to studying general social phenomena: meta-analysis, replication, and extrapolation.
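One of the three approaches named above, meta-analysis, typically rests on inverse-variance pooling of study-level estimates. As a minimal illustration of the kind of quantitative relationship between constituent studies that the Element's uniting principles are meant to justify, here is a sketch of fixed-effect pooling; the function name and the numbers are illustrative assumptions, not taken from the Element.

```python
import math

def pooled_effect(estimates, std_errors):
    """Fixed-effect (inverse-variance) pooling of study-level estimates.

    Each study is weighted by the inverse of its sampling variance,
    so more precise studies contribute more to the pooled estimate.
    """
    weights = [1.0 / se**2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se

# Three hypothetical studies of the same mechanism in different contexts.
est, se = pooled_effect([0.20, 0.35, 0.10], [0.10, 0.15, 0.08])
```

Note that this pooling is only meaningful under the kind of cross-study assumptions the Element develops: that the constituent studies target effects produced by the same mechanism and measure them on commensurable scales.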
Type: Element
Online ISBN: 9781009375856
Publisher: Cambridge University Press
Print publication: 02 January 2025


