
Rural electrification, the credibility revolution, and the limits of evidence-based policy

Published online by Cambridge University Press:  15 January 2025

Jörg Ankel-Peters*
Affiliation:
RWI – Leibniz Institute for Economic Research, Essen, Germany; School of Business, Economics and Information Systems, University of Passau, Passau, Germany
Christoph M. Schmidt
Affiliation:
RWI – Leibniz Institute for Economic Research, Essen, Germany; Faculty of Management and Economics, Ruhr University Bochum, Bochum, Germany
*Corresponding author: Jörg Ankel-Peters; Email: [email protected]

Abstract

The so-called credibility revolution dominates empirical economics, with its promise of causal identification to improve scientific knowledge and ultimately policy. By examining the case of rural electrification in the Global South, this opinion paper exposes the limits of this evidence-based policy paradigm. The electrification literature boasts many studies using the credibility revolution toolkit, but at the same time, several systematic reviews demonstrate that the evidence is divided between very positive and muted effects. This bifurcation presents a challenge to the science-policy interface, where policymakers, lacking the resources to sift through the evidence, may be drawn to the results that serve their (agency's) interests. The interpretation is furthermore complicated by unresolved methodological debates circling around external validity as well as selective reporting and publication decisions. These features, we argue, are not particular to the electrification literature but inherent to the credibility revolution toolkit.

Type
Perspectives
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
Copyright © The Author(s), 2025. Published by Cambridge University Press

1. Introduction

The extent to which the high costs of rural electrification are justified by its impacts on societies and economies has been a matter of debate for decades (see, for example, Rose, 1940; Devine, 1983; Barnes and Binswanger, 1986; Barnes, 2010). In recent years, academic contributions to this discussion have been influenced considerably by the so-called credibility revolution in economics (see Angrist and Pischke, 2010). The claim is that ‘design-based research’ (Card, 2022) like randomized controlled trials (RCTs) and instrumental variables (IVs) leads to more credible and verifiable identification of causal effects. This ‘experimentalist paradigm’ (Biddle and Hamermesh, 2017) is closely linked to the vision of evidence-based policy: well-identified causal effects, so the narrative goes, will eventually tell us which interventions work and hence should be scaled to shape future policies (Young et al., 2002; Duflo, 2004, 2020; Panhans and Singleton, 2017).

In this paper, we examine the case of rural electrification in the Global South, documenting that design-based research is much less effective in improving policy than is often claimed. This is not a new verdict, and we build on previous critical reflection on the credibility revolution paradigm (Rodrik, 2008; Ravallion, 2009, 2020; Heckman and Urzua, 2010; Basu, 2014; Deaton and Cartwright, 2018; Deaton, 2020; Drèze, 2020; Muller, 2023). [Footnote 1] We extend this line of discussion by a specific application to rural electrification, an important area of development policy that absorbs large amounts of public funding (World Bank, 2018; Blimpo and Cosgrove-Davies, 2019). While national governments often justify investments in rural electrification from a social justice and hence a rights-based perspective, donor agencies and international development banks are under pressure to prove that the investment is worthwhile, following an explicit or implicit cost-benefit analysis logic. There is also an interesting within-sector cost-effectiveness debate because expensive grid extension competes with infrastructure leapfrogging via lower-cost decentralized solutions like stand-alone solar or mini-grids (Levin and Thomas, 2016).

To inform this debate, many empirical studies have been published in recent years that examine the impacts of rural electrification, increasingly also using design-based methods from the credibility revolution toolkit. Several systematic reviews and meta-analyses have summarized this growing literature. In a nutshell, these reviews show that the literature is divided, with some studies finding very large effects, and others very modest or no effects. This divide is consequential for policy, especially considering that, for the extension of the power grid, large effects are required to justify the high costs. This holds true under a cost-benefit principle as it is applied by many donors, but also under a rights-based principle because then grid extension competes with off-grid technologies for cost-effectiveness.

Such meta-analyses and systematic reviews are important because, while design-based research is good at generating well-identified causal effects, the external validity gap still needs to be bridged. For this, an accumulation of evidence is needed – something that Duflo (2020) refers to as the ‘pointillist painting,’ with each causal study being one dot on the painting. [Footnote 2] We use the case of rural electrification to show that even in a rich literature the pointillist painting is hard to compile and the dots on the canvas leave a lot of room for interpretation. We further argue that in highly contested policy areas, even well-meaning policymakers will use this wiggle room to pursue their interests. Next, we argue that the practice of design-based research, despite its intellectual beauty in identifying causality, is not immune to other biases stemming from questionable research practices, underpowered designs, overgeneralization, and publication bias. This further complicates the use of evidence in the policy landscape.

To conclude, we argue that this observation is not particular to electrification. We therefore call for a debate on what this implies for the science-policy interface. More research is needed on how evidence is generated and synthesized as well as how it is used for policy.

2. The credibility revolution in the electrification literature

Prior to the credibility revolution, empirical research on rural electrification had been conducted for many decades and had recurrently featured insightful studies based on various methods. Nonetheless, it is a showcase example of what the credibility revolution rightly criticized in the 2000s: many studies made some sort of causal inference based on a naive comparison of people or regions with and without access to electricity, without accounting for endogenous selection processes (see Peters, 2009). [Footnote 3] That has changed over the past 15 years or so, with an increasing number of published studies revealing more sensitivity to the problems of selection bias. The methodological portfolio first covered quasi-experimental matching and difference-in-differences designs, but increasingly also IVs and sometimes RCTs.
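
The problem with the naive comparison can be made explicit with the standard potential-outcomes decomposition. The notation below is an illustrative assumption and does not appear in the original text: D_i = 1 if unit i has electricity access, and Y_{1i}, Y_{0i} are the outcomes of unit i with and without access.

```latex
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{naive comparison}}
  = \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{effect on the electrified}}
  + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
```

If better-off households or regions are more likely to be connected, the selection term is positive and the naive comparison overstates the effect of electrification. Design-based methods aim to drive this term to zero.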

In fact, IVs have been used in many papers on grid-based electrification. Dinkelman (2011) and Lipscomb et al. (2013) are the earliest examples and they have been influential and foundational for the literature. The decentralization of electricity access also facilitated randomization, so that the first RCTs appeared in the mid-2010s (Furukawa, 2014; Aklin et al., 2017; Grimm et al., 2017). RCTs for power infrastructure in most settings proved to be infeasible for political or budgetary reasons; Lee et al. (2020a) is a notable exception. Yet, for on-grid electrification, quasi-experimental methods and especially IVs continue to be the dominant identification strategies, while for off-grid solar several RCTs exist.

This wave of intense design-based research was followed by a battery of overview papers and systematic reviews (henceforth ‘reviews’) (Bernard, 2012; Peters and Sievert, 2016; Bonan et al., 2017; Jimenez, 2017; Bos et al., 2018; Morrissey, 2018; Blimpo and Cosgrove-Davies, 2019; Hamburger et al., 2019; Bayer et al., 2020; Perdana et al., 2020; Lee et al., 2020b; Jeuland et al., 2021). The research community has hence not only generated the dots on Duflo's pointillist painting but also invested in compiling what the painting shows. All these reviews diagnose a divide in the literature, that is, one set of studies comes to very positive conclusions about the development effects of electrification while another set of studies rather observes small or no effects.

To understand the policy implication of this, the size of the effect must be assessed in relation to the costs. Here, it is important to distinguish between on-grid and off-grid electrification. Given the high cost of grid-based rural electrification, large positive effects are required to make the intervention cost-effective and even modest positive effects would argue against the investment. Based on their finding of muted effects, Lee et al. (2020a) conclude that the investment in grid extension entails a ‘social surplus loss’. In contrast, for off-grid electrification such as small-scale solar, even modest effects can render a cost-benefit analysis positive and suggest that promoting this technology is cost-effective – because of the considerably lower investment cost (Grimm et al., 2020). [Footnote 4]

The reviews refer to several potential explanations of the divide, but in our reading, two narratives stand out: a regional divide and a methodological divide. [Footnote 5] Jeuland et al. (2021) is an insightful starting point. It does not delve into a narrative for the divide in the literature. Its main purpose, rather, is to comprehensively take stock of the literature. Jeuland et al. (2021) thereby illustrates how vast the evidence base is when a review is very inclusive. By covering a generous list of journals as well as the grey literature, it shows that the electrification literature comprises some 2,000 studies. As an extreme case, one can draw from this large pool to compile the pointillist painting, even if there is certainly a broad consensus that many of these 2,000 dots should be dismissed, for example because a study does not apply design-based methods. All other reviews employ much more exclusive selections of the literature and most include design-based studies only.

The regional narrative for the divide in the literature points to the different development potentials in different regions and target populations (see, for example, Peters and Sievert, 2016; Hamburger et al., 2019; Lee et al., 2020b). Hamburger et al. (2019) reveal that large parts of the design-based electrification literature are concentrated in just a few countries. Sub-Saharan Africa in particular is largely ignored. Related to this, Peters and Sievert (2016) argue that the large effects observed in some Latin American and Asian countries cannot be generalized to Sub-Saharan Africa because of different economic conditions at baseline. They also provide evidence for small effects from several Sub-Saharan African countries, which contrast with the much larger effects in the pre-existing literature. In a similar vein, Lee et al. (2020b) emphasize that, historically, electrification in most industrialized countries happened while the economies were on a growth trajectory. Evidence from such contexts is hence not transferable to places today where remote areas are connected that are barely integrated in economic development processes.

The methodological narrative is raised mainly in Bayer et al. (2020) and Lee et al. (2020b). Bayer et al. (2020) establish that studies using randomized designs typically deliver smaller effects than those using quasi-experimental designs. They explain this by the selection bias inherent to non-randomized methods that inflates impact estimates. The pattern in their data is indeed striking, but an important caveat is that with one exception all RCTs were done on off-grid electrification technologies, not the grid. Grid extension programs are mostly evaluated using IVs, and sometimes regression discontinuity and difference-in-differences designs. Lee et al. (2020b: 131), focusing on grid electrification, point to the large number of IVs in that literature and suggest that ‘it is hard to rule out the possibility that the correlation between the instrument and the dependent variable runs through additional channels beyond electrification’.

In fact, the heavy reliance on observational data and especially IVs is conspicuous in the electrification literature, and it might import risks of bias. Above all, the geographic IVs that are often used in electrification evaluations such as the land gradient or water flow are suspected of violating exclusion restrictions because they affect the causal network through many pathways, not just through electrification, the instrumented variable. [Footnote 6] Another reason to be concerned is that these geographical IVs are often weak IVs, which is not a problem per se if appropriate remedies are used. But these remedies are less effective if weakness concurs with violated exclusion restrictions (Bensch et al., 2020) and if scholars screen specifications based on first-stage strength (Ankel-Peters et al., 2023). Related to the screening aspect, IVs are suspected of being more prone to publication bias and p-hacking (Brodeur et al., 2020: 3,636), because ‘when using a non-experimental method like IV there are many points at which a researcher exercises discretion in ways that could affect statistical significance’. [Footnote 7] Relatedly, we are not aware of an IV-based study in the electrification literature with a null result (Bayer et al., 2020).
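
Why instrument weakness and a violated exclusion restriction compound each other can be seen in a textbook linear sketch. The model and symbols below are illustrative assumptions and are not taken from the cited studies: y is the outcome, x is electrification, z is a geographic instrument, gamma is the direct effect of z on y (the exclusion violation), and pi is the first-stage coefficient. The error terms u and v are assumed to be uncorrelated with z.

```latex
\begin{aligned}
y &= \beta x + \gamma z + u, \qquad x = \pi z + v, \\
\operatorname{plim}\, \hat{\beta}_{IV}
  &= \frac{\operatorname{Cov}(z, y)}{\operatorname{Cov}(z, x)}
   = \frac{\beta \pi \operatorname{Var}(z) + \gamma \operatorname{Var}(z)}{\pi \operatorname{Var}(z)}
   = \beta + \frac{\gamma}{\pi}.
\end{aligned}
```

Under these assumptions, the asymptotic bias is gamma divided by pi. A direct channel gamma of a given size therefore does more damage the weaker the first stage is, because a small pi inflates the bias term.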

It is furthermore conspicuous that those more recent studies that find smaller effects use self-collected primary data to evaluate specific electrification interventions, irrespective of whether they are RCTs or based on difference-in-differences designs. This covers studies like Lee et al. (2020a), an RCT, but also Bensch et al. (2019), Chaplin et al. (2017), Masselus et al. (2024a) and Lenz et al. (2017) as well as Peters et al. (2011). We therefore raise the question of whether this evaluative setting – primary data and specific interventions under evaluation – could possibly lead to fewer incentives to publish large effects. One reason could be that the specific interventions under evaluation are often large and well-known investments, making a null effect more interesting. Self-collected data also allows for tracking potential effects much more meticulously along a theoretical results chain (e.g., by eliciting appliance adoption, productive appliance adoption, jobs in electricity-using firms, etc.). This is not to say that such evaluations are without problems. Regional scope is limited and co-optation by funding development agencies is possible. Primary data also often covers shorter time periods (Nag and Stern, 2023; see Masselus et al. (2024a) for an exception).

In any case, the electrification literature should be evaluated in light of recent trends in the economics profession towards more transparency (Christensen and Miguel, 2018). This requires sensitivity to pre-specification and robustness replicability as well as quantitative meta-analyses that account for potential publication bias (Andrews and Kasy, 2019; Carter et al., 2019; Irsova et al., 2024) – something that has hitherto not been done.

3. Bayesian policymakers and reasoned intuition

According to the evidence-based policy paradigm, the target audience of applied empirical research is policymakers. [Footnote 8] Economists have started to examine the conditions under which policymakers indeed make use of available evidence (Banuri et al., 2019; Hjort et al., 2021; Vivalt and Coville, 2023). The underlying assumption often is that the evidence provides a scientifically clear picture. In practice, though, the evidence is often murky and contradictory, and subject to debates about methodological issues. The electrification literature is a showcase example of this. It is therefore important to ask how policymakers form their beliefs.

Ideally, policymakers and we, their academic advisors, are Bayesians: We have a prior which we update as new evidence comes in. The prior's responsiveness is a function of the evidence's methodological quality. That is, the prior is firmer and less responsive to new evidence the better the already existing evidence is. Likewise, it responds more to methodologically sound new evidence. This type of thinking, though, requires repeated appraisals of the incoming evidence. For this appraisal, there exist no standards. At best, these appraisals are based on experience and expertise. In other words, we must use what Basu (2014: 466) calls reasoned intuition: ‘intuition and gut feeling […] need to be held under the scanner of reason before we use them to translate experience and evidence into rules and behaviour and policy.’ Most policymakers have experience and expertise, so it is possible that reasoned intuition can work when policymakers come across new evidence.
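
The updating logic described above can be illustrated with a minimal sketch of a normal-normal Bayesian update. Everything in it is an assumption made for illustration: the names `Belief`, `update` and `weight_on_evidence` are hypothetical, the effect sizes are invented numbers, and methodological quality is reduced to a single standard error, which is far cruder than the appraisal the text has in mind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Belief:
    """A normal belief about an effect size (hypothetical units)."""
    mean: float
    variance: float


def update(prior: Belief, estimate: float, std_error: float) -> Belief:
    """Normal-normal update: the posterior mean is a precision-weighted average."""
    prior_precision = 1.0 / prior.variance
    evidence_precision = 1.0 / std_error ** 2
    posterior_variance = 1.0 / (prior_precision + evidence_precision)
    posterior_mean = posterior_variance * (
        prior_precision * prior.mean + evidence_precision * estimate
    )
    return Belief(posterior_mean, posterior_variance)


def weight_on_evidence(prior: Belief, std_error: float) -> float:
    """Share of the posterior mean that comes from the new study."""
    evidence_precision = 1.0 / std_error ** 2
    return evidence_precision / (1.0 / prior.variance + evidence_precision)


if __name__ == "__main__":
    # Illustrative numbers only: a vague prior versus a firm prior,
    # each confronted with a precise study and a noisy study.
    vague_prior = Belief(mean=0.0, variance=1.0)
    firm_prior = Belief(mean=0.0, variance=0.25)
    for label, prior in [("vague prior", vague_prior), ("firm prior", firm_prior)]:
        for se in (1.0, 2.0):
            post = update(prior, estimate=1.0, std_error=se)
            print(f"{label}, study s.e. {se}: posterior mean {post.mean:.2f}")
```

In this sketch, a firmer prior (smaller variance) moves less for the same study, and a more precise study moves any prior more. What the sketch cannot supply is the hard part stressed in the text: an agreed standard for turning a study's methodological quality into such a weight.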

Yet, so far, we have assumed benevolent policymakers while in practice they might have some sort of vested interests. This is, in many cases, not condemnable. For example, policymakers are typically civil servants and hence subscribe to a certain political agenda of the administration they represent. It is natural that policymakers extract from the evidence what serves their interest. A divided literature like the one on rural electrification provides the basis for confirmation bias as it is empirically diagnosed by Banuri et al. (2019). In a similar vein, Vivalt and Coville (2023) provide empirical evidence for what they call ‘asymmetric optimism’: policymakers update more on good news than on bad news.

Policymakers managing electrification portfolios can have agendas. For example, major development banks have a long history of investing in large infrastructure through grants and lending, and it is understandable that they – or some of their staff members – prefer on-grid electrification over off-grid electrification. Confirmation bias and asymmetric optimism might tempt them to seize that part of the literature that suggests substantial development effects of grid extension programs. Staff of solar advocacy organizations or private sector representatives seeking subsidies for their off-grid solar programs might, by contrast, prefer evidence suggesting only modest impacts of on-grid electrification. This would strengthen the cost-effectiveness of off-grid technologies. The hawker's tray of the electrification literature has much evidence to offer for both camps.

An informed debate between these two camps based on reasoned intuition is hence problematic. An additional important layer of complexity is that applying reasoned intuition is harder the more prevalent methodological concerns are that are not well understood within academia. [Footnote 9] For example, academic debates do not converge when it comes to publication bias and how to account for it when making inferences. Likewise, controversies about robustness in replications and reproductions are hard to settle among replicators and original authors (Ozier, 2021; Ankel-Peters et al., 2024). And while external validity is an accepted barrier in economics between rigorous evidence and its policy relevance, the literature on how to account for it in the generalization of scientific results is nascent and so far inconclusive (Muller, 2015, 2021; Pritchett and Sandefur, 2015; Peters et al., 2018; Vivalt, 2020; Dehejia et al., 2021; Gechter, 2024). Concerns about construct validity are less widely discussed and virtually absent in the economics literature, although they are of utmost importance for generalization across supposedly similar interventions (Pritchett et al., 2013; Esterling et al., 2023; Masselus et al., 2024b). Such debates including their ambiguous outcomes are not a failure but rather a natural part of the scientific enterprise. Nevertheless, they do pose major hurdles for the evidence-policy interface.

4. Conclusion and way forward

In this paper, we have argued that the evidence-based policy paradigm reaches its limits in the case of rural electrification. Policymakers with vested interests of different kinds will each find support for their respective agenda. But even benevolent policymakers might get into difficulties because of unresolved methodological debates in the literature. It is overly simplistic, though, to merely blame policymakers for extracting only a partial interpretation of the evidence. Academic researchers bear part of the responsibility in that they often communicate results with what Manski (2011, 2019) calls incredible certitude.

Manski stresses that the logic of any inference is: assumptions + data => conclusions. In terms of data, the rural electrification research community deserves to be applauded for the many systematic reviews it has produced, to which we owe the consolidated understanding that this literature is divided. In terms of assumptions, though, most individual papers wishfully extrapolate (again, Manski) their data to much too strong conclusions. These often heavy assumptions are only partly made transparent and range from external validity concerns to a much weaker robustness than what is communicated in the papers.

The patterns we have diagnosed in this paper are not a peculiarity of rural electrification. [Footnote 10] Many literatures that have been subject to a myriad of design-based impact evaluations exhibit fuzzy pointillist paintings and methodological issues related to external validity and reproducibility. What are the implications for the learning model in the electrification literature and beyond? One response would be to do more and more design-based studies, accompanied by robustness replications ensuring that the right inference is being made, and hope for a clearer picture emerging in the literature soon. However, ‘the pace of politics is faster than the pace of scientific consensus formation’ (Collins and Evans, 2002: 241). [Footnote 11]

Theory-based evaluation will help to accelerate this process (Duflo et al., 2007; White, 2009). Clearly outlined theory can identify mechanisms, which are then tested in (quasi-) experiments. The hope is that such mechanisms are less context-dependent and hence more generalizable than the effects of the whole program, which is often a bundle of interventions (Ludwig et al., 2011). It is true that much of the literature on rural electrification is lacking such a clear theory, and potential context-stable mechanisms such as productive use are rarely tested in a theory-grounded manner. This would require pre-specification of hypotheses, not just explorative heterogeneity analysis (which is indeed done in several studies). A clearly outlined theory would also render Manski's wishful extrapolation more difficult because the theoretical foundation would expose the assumptions underlying the extrapolation.

In the meantime, a pragmatic way forward for design-based research is to become humbler: impact evaluations could focus on informing the specific program under evaluation only and largely refrain from generalization to other contexts. [Footnote 12] Impact evaluations would then rather be a feature for internal program management than for global learning processes. Elements of this can be found also in proposals from within the credibility revolution movement (see Banerjee et al., 2017; Duflo, 2017). Yet the current reward system in academia and from funding agencies probably does not incentivize such a humbler approach.

More generally, more research is needed in the economics profession on how the science-policy interface can be improved. Absent formal evidence clearinghouses like the World Health Organization or the Intergovernmental Panel on Climate Change, policy often relies on in-house literature reviews or policy briefs, scientific advisory boards or bilateral consultations to be backed up by scientific expertise. That is fine, but policymakers need to be sensitized to the pitfalls of evidence-based policy advice outlined in this paper. Ultimately, we need a better methodology for how to organize and synthesize knowledge formation in economics – a slightly belated version of ‘studies of expertise and experience’ (Collins and Evans, 2002). This will raise many important downstream questions for the economics profession, ushering in a veritable research program.

Acknowledgements

We are grateful for valuable comments and suggestions from E. Somanathan, two anonymous referees, Gunther Bensch, Maximiliane Sievert, Colin Vance and from participants at the Sustainable Energy Transition Initiative (SETI) 2020 workshop, International Association of Energy Economics (IAEE) 2021 conference, the Power to Empower Emerging Africa 2020 workshop in Marrakesh and the 3rd Conference on Econometrics and the Environment 2020.

Competing interest

One of the authors has made several contributions to the literature under review in this paper. Beyond this, the authors declare no competing interests.

Footnotes

1 A less economics-centric introspection reveals that similar debates about positivist claims for epistemological hegemony have been well-known in the sociology of science for decades (see, for example, Collins and Evans, 2002). See also Whittington et al. (2012) for a perspective on the use of evidence in the water and sanitation sector.

2 For the sake of completeness, in this allegory Duflo (2020) refers to RCTs alone. Most proponents of the ‘experimentalist paradigm’ would extend this epistemology to other non-randomized design-based methods like IVs, regression discontinuity design and difference-in-differences; see for example Angrist and Pischke (2010) for a brief reference to this epistemology. Yet, clear statements and instructions on how the evidence is supposed to be compiled are very rare, in both textbooks and declaration-like papers.

3 For more general cases beyond the electrification example, see Frondel and Schmidt (2005), Ravallion (2007) and Schmidt (2001).

4 We are aware that cost-benefit analysis is not always, perhaps even rarely, applied in a narrow sense, and not all donor agencies expect a strictly positive cost-benefit analysis. For our argument to hold, some implicit application of cost-benefit logic among donor agencies is sufficient. For example, even if a donor agency accepts a negative cost-benefit analysis, pressure to justify investments increases with the ‘social surplus loss’ (to use the Lee et al. (2020a) term).

5 Various other sources of heterogeneity might explain why effects of electrification differ across studies, for example differences in grid reliability (Chakravorty et al., 2014; Allcott et al., 2016), exposure to exogenous economic development (Fetter and Usmani, 2024), size of the targeted communities (Burlig and Preonas, 2024), complementary services like access to finance, and the duration of exposure to electricity (Nag and Stern, 2023; Masselus et al., 2024a). Traces of these factors can be found throughout the reviews, but in our reading, none of the reviews puts emphasis on them.

6 See Haveresch et al. (2024) for the case of topography as an IV, as well as Gallen and Raymond (2023), Lal et al. (2024) and Mellon (2024) for related critiques.

7 See also Kranz and Pütz (2022) and Brodeur et al. (2022).

8 We use this term broadly and include different actors at the science-policy interface, for example, decision-makers in governmental agencies at the strategic and operational level as well as policy advisory committees.

9 Vivalt and Coville (2023) also emphasize that potential biases of policymakers in reading the evidence are more problematic in the presence of biases in the underlying evidence, such as publication bias or lacking external validity.

10 Problematic or controversial patterns in other literatures related to environmental economics and policy have been diagnosed, for example, in Ferraro and Shukla (2020, 2023), Vrolijk and Sato (2023), Bagilet and Zabrocki-Hallak (2022), Krasovskaia and Just (2024), and Whittington et al. (2012).

11 David Card, in his Nobel Prize lecture in 2021, expressed his optimism that the debate around minimum wages that started in the early 1990s might converge to a common understanding in ‘another decade or two’ (Card, 2022).

References

Aklin, M, Bayer, P, Harish, SP and Urpelainen, J (2017) Does basic energy access generate socioeconomic benefits? A field experiment with off-grid solar power in India. Science Advances 3, e1602153.
Allcott, H, Collard-Wexler, A and O'Connell, SD (2016) How do electricity shortages affect industry? Evidence from India. American Economic Review 106, 587–624.
Andrews, I and Kasy, M (2019) Identification of and correction for publication bias. American Economic Review 109, 2766–2794.
Angrist, JD and Pischke, JS (2010) The credibility revolution in empirical economics: how better research design is taking the con out of econometrics. Journal of Economic Perspectives 24, 3–30.
Ankel-Peters, J, Bensch, G and Vance, C (2023) Spotlight on researcher decisions: infrastructure evaluation, instrumental variables, and specification screening. Ruhr Economic Papers, No. 991, RWI – Leibniz-Institut für Wirtschaftsforschung, Essen.
Ankel-Peters, J, Fiala, N and Neubauer, F (2024) Is economics self-correcting? Replications in the American Economic Review. Economic Inquiry, forthcoming.
Bagilet, V and Zabrocki-Hallak, L (2022) Why some acute health effects of air pollution could be inflated. I4R Discussion Paper Series, No. 11, Institute for Replication (I4R).
Banerjee, A, Banerji, R, Berry, J, Duflo, E, Kannan, H, Mukerji, S, Shotland, M and Walton, M (2017) From proof of concept to scalable policies: challenges and solutions, with an application. Journal of Economic Perspectives 31, 73–102.
Banuri, S, Dercon, S and Gauri, V (2019) Biased policy professionals. World Bank Economic Review 33, 310–327.
Barnes, DF (2010) The Challenge of Rural Electrification – Strategies for Developing Countries. Routledge.
Barnes, DF and Binswanger, HP (1986) Impact of rural electrification and infrastructure on agricultural changes, 1966–1980. Economic and Political Weekly 21, 26–34.
Basu, K (2014) Randomisation, causality and the role of reasoned intuition. Oxford Development Studies 42, 455–472.
Bayer, P, Kennedy, R, Yang, J and Urpelainen, J (2020) The need for impact evaluation in electricity access research. Energy Policy 137, 111099.
Bensch, G, Cornelissen, W, Peters, J, Wagner, N, Reichert, J and Stepanikova, V (2019) Electrifying Rural Tanzania. A Grid Extension and Reliability Improvement Intervention. The Hague: Netherlands Enterprise Agency. Available at https://www.econstor.eu/handle/10419/222259
Bensch, G, Gotz, G and Ankel-Peters, J (2020) Effects of rural electrification on employment: a comment on Dinkelman (2011). Available at https://osf.io/preprints/metaarxiv/zhn9b
Bernard, T (2012) Impact analysis of rural electrification projects in Sub-Saharan Africa. The World Bank Research Observer 27, 33–51.
Biddle, JE and Hamermesh, DS (2017) Theory and measurement: emergence, consolidation, and erosion of a consensus. History of Political Economy 49, 34–57.
Blimpo, MP and Cosgrove-Davies, M (2019) Electricity Access in Sub-Saharan Africa: Uptake, Reliability, and Complementary Factors for Economic Impact. Africa Development Forum series. Washington, DC: World Bank.
Bonan, J, Pareglio, S and Tavoni, M (2017) Access to modern energy: a review of barriers, drivers and impacts. Environment and Development Economics 22, 491–516.
Bos, K, Chaplin, D and Mamun, A (2018) Benefits and challenges of expanding grid electricity in Africa: a review of rigorous evidence on household impacts in developing countries. Energy for Sustainable Development 44, 64–77.
Brodeur, A, Cook, N and Heyes, A (2020) Methods matter: p-hacking and publication bias in causal analysis in economics. American Economic Review 110, 3634–3660.
Brodeur, A, Cook, N and Heyes, A (2022) Methods matter: p-hacking and publication bias in causal analysis in economics: reply. American Economic Review 112, 3137–3139.
Burlig, F and Preonas, L (2024) Out of the darkness and into the light? Development effects of rural electrification. Journal of Political Economy 132, 2937–2971.
Card, D (2022) Design-based research in empirical microeconomics. American Economic Review 112, 1773–1781.
Carter, EC, Schönbrodt, FD, Gervais, WM and Hilgard, J (2019) Correcting for bias in psychology: a comparison of meta-analytic methods. Advances in Methods and Practices in Psychological Science 2, 115–144.
Chakravorty, U, Pelli, M and Marchand, BU (2014) Does the quality of electricity matter? Evidence from rural India. Journal of Economic Behavior & Organization 107, 228–247.
Chaplin, D, Mamun, A, Protik, A, Schurrer, J, Vohra, D, Bos, K, Burak, H, Meyer, L, Dumitrescu, A, Ksoll, C and Cook, T (2017) Grid Electricity Expansion in Tanzania by MCC: Findings from a Rigorous Impact Evaluation. Final report submitted to the Millennium Challenge Corporation. Washington, DC: Mathematica Policy Research.
Christensen, G and Miguel, E (2018) Transparency, reproducibility, and the credibility of economics research. Journal of Economic Literature 56, 920–980.
Collins, HM and Evans, R (2002) The third wave of science studies: studies of expertise and experience. Social Studies of Science 32, 235–296.
Deaton, A (2020) Randomization in the tropics revisited: a theme and eleven variations. National Bureau of Economic Research Working Paper Series, No. W27600, Cambridge, MA.
Deaton, A and Cartwright, N (2018) Understanding and misunderstanding randomized controlled trials. Social Science & Medicine 210, 2–21.
Dehejia, R, Pop-Eleches, C and Samii, C (2021) From local to global: external validity in a fertility natural experiment. Journal of Business & Economic Statistics 39, 217–243.
Devine, WD (1983) From shafts to wires: historical perspective on electrification. The Journal of Economic History 43, 347–372.
Dinkelman, T (2011) The effects of rural electrification on employment: new evidence from South Africa. American Economic Review 101, 3078–3108.
Drèze, J (2020) Policy beyond evidence. World Development 127, 104797.
Duflo, E (2004) Scaling up and evaluation. A paper prepared for the Annual World Bank Conference on Development Economics 2004: Accelerating Development, vol. 1, Report No. 30228.
Duflo, E (2017) The economist as plumber. American Economic Review 107, 1–26.
Duflo, E (2020) Field experiments and the practice of policy. American Economic Review 110, 1952–1973.
Duflo, E, Glennerster, R and Kremer, M (2007) Using randomization in development economics research: a toolkit. In Schultz, TP and Strauss, JA (eds), Handbook of Development Economics 4. Elsevier, pp. 3895–3962.
Esterling, KM, Brady, D and Schwitzgebel, E (2023) The necessity of construct and external validity for generalized causal claims. I4R Discussion Paper Series, No. 18, Institute for Replication (I4R).
Ferraro, PJ and Shukla, P (2020) Is a replicability crisis on the horizon for environmental and resource economics? Review of Environmental Economics and Policy 14, 339–351.
Ferraro, PJ and Shukla, P (2023) Credibility crisis in agricultural economics. Applied Economic Perspectives and Policy 45, 1275–1291.
Ferraro, PJ, Cherry, TL, Shogren, JF, Vossler, CA, Cason, TN, Flint, HB, Hochard, JP, Johansson-Stenman, O, Martinsson, P, Murphy, JJ and Newbold, SC (2023) Create a culture of experiments in environmental programs. Science (New York, N.Y.) 381, 735–737.
Fetter, TR and Usmani, F (2024) Fracking, farmers, and rural electrification in India. Journal of Development Economics 170, 103308.
Frondel, M and Schmidt, CM (2005) Evaluating environmental programs: the perspective of modern evaluation research. Ecological Economics 55, 515–526.
Furukawa, C (2014) Do solar lamps help children study? Contrary evidence from a pilot study in Uganda. Journal of Development Studies 50, 319–341.
Gallen, T and Raymond, B (2023) Broken instruments. Available at https://www.tgallen.com/Papers/Gallen_Raymond_BrokenInstruments.pdf
Gechter, M (2024) Generalizing the results from social experiments: theory and evidence from India. Journal of Business & Economic Statistics 42, 801–811.
Grimm, M, Munyehirwe, A, Peters, J and Sievert, M (2017) A first step up the energy ladder? Low-cost solar kits and household's welfare in rural Rwanda. The World Bank Economic Review 31, 631–649.
Grimm, M, Lenz, L, Peters, J and Sievert, M (2020) Demand for off-grid solar electricity: experimental evidence from Rwanda. Journal of the Association of Environmental and Resource Economists 7, 417–454.
Hamburger, D, Jaeger, J, Bayer, P, Kennedy, R, Yang, J and Urpelainen, J (2019) Shades of darkness or light? A systematic review of geographic bias in impact evaluations of electricity access. Energy Research & Social Science 58, 101236.
Haveresch, N, Ankel-Peters, J and Bensch, G (2024) A Slippery Slope: Topographic Variation as an Instrumental Variable. Mimeo.
Heckman, JJ and Urzua, S (2010) Comparing IV with structural models: what simple IV can and cannot identify. Journal of Econometrics 156, 27–37.
Hjort, J, Moreira, D, Rao, G and Santini, JF (2021) How research affects policy: experimental evidence from 2,150 Brazilian municipalities. American Economic Review 111, 1442–1480.
Irsova, Z, Doucouliagos, H, Havranek, T and Stanley, TD (2024) Meta-analysis of social science research: a practitioner's guide. Journal of Economic Surveys, forthcoming.
Jeuland, M, Fetter, TR, Li, Y, Pattanayak, SK, Usmani, F, Bluffstone, RA, Chávez, C, Girardeau, H, Hassen, S, Jagger, P, Jaime, MM, Karumba, M, Köhlin, G, Lenz, L, Naranjo, EA, Peters, J, Qin, P, Ruhinduka, RD and Toman, M (2021) Is energy the golden thread? A systematic review of the impacts of modern and traditional energy use in low- and middle-income countries. Renewable and Sustainable Energy Reviews 135, 110406.
Jimenez, R (2017) Development effects of rural electrification. Policy brief No IDB-PB-261, Inter-American Development Bank.
Kranz, S and Pütz, P (2022) Methods matter: p-hacking and publication bias in causal analysis in economics: comment. American Economic Review 112, 3124–3136.
Krasovskaia, E and Just, DR (2024) Food, nutrition, and related policy issues: evidence-based policy and the credibility crisis. Q Open, qoae013. Available at https://doi.org/10.1093/qopen/qoae013
Lal, A, Lockhart, M, Xu, Y and Zu, Z (2024) How much should we trust instrumental variable estimates in political science? Practical advice based on 67 replicated studies. Political Analysis, forthcoming.
Lee, K, Miguel, E and Wolfram, C (2020a) Experimental evidence on the economics of rural electrification. Journal of Political Economy 128, 1523–1565.
Lee, K, Miguel, E and Wolfram, C (2020b) Does household electrification supercharge economic development? Journal of Economic Perspectives 34, 122–144.
Lenz, L, Munyehirwe, A, Peters, J and Sievert, M (2017) Does large-scale infrastructure investment alleviate poverty? Impacts of Rwanda's electricity access roll-out program. World Development 89, 88–110.
Levin, T and Thomas, VM (2016) Can developing countries leapfrog the centralized electrification paradigm? Energy for Sustainable Development 31, 97–107.
Lipscomb, M, Mobarak, AM and Barham, T (2013) Development effects of electrification: evidence from the topographic placement of hydropower plants in Brazil. American Economic Journal: Applied Economics 5, 200–231.
Ludwig, J, Kling, JR and Mullainathan, S (2011) Mechanism experiments and policy evaluations. Journal of Economic Perspectives 25, 17–38.
Manski, CF (2011) Policy analysis with incredible certitude. Economic Journal 121, F261–F289.
Manski, CF (2019) Communicating uncertainty in policy analysis. Proceedings of the National Academy of Sciences 116, 7634–7641.
Masselus, L, Ankel-Peters, J, Sutil, GG, Modi, V, Mugyenyi, J, Munyehirwe, A, Williams, N and Sievert, M (2024a) 10 years after: long-term adoption of electricity in rural Rwanda. Ruhr Economic Papers, No. 1086, RWI – Leibniz-Institut für Wirtschaftsforschung, Essen.
Masselus, L, Ankel-Peters, J and Petrik, C (2024b) Lost in the design space? Construct validity in the microfinance literature. Ruhr Economic Papers, No. 184, RWI – Leibniz-Institut für Wirtschaftsforschung, Essen.
Mellon, J (2024) Rain, rain, go away: 195 potential exclusion-restriction violations for studies using weather as an instrumental variable. American Journal of Political Science, forthcoming.
Morrissey, J (2018) Linking Electrification and Productive Use. Oxfam Research Backgrounder Series.
Muller, SM (2015) Causal interaction and external validity: obstacles to the policy relevance of randomized evaluations. The World Bank Economic Review 29, 217–225.
Muller, SM (2021) Randomised trials in economics. In Kincaid, H and Ross, D (eds), A Modern Guide to Philosophy of Economics. Edward Elgar Publishing, pp. 90–126.
Muller, SM (2023) Is economics credible? A critical appraisal of three examples from microeconomics. Journal of Economic Methodology 30, 157–175.
Nag, S and Stern, DI (2023) Are the benefits of electrification realized only in the long run? Evidence from rural India. SSRN working paper. Available at https://ssrn.com/abstract=4591072
Ozier, O (2021) Replication redux: the reproducibility crisis and the case of deworming. The World Bank Research Observer 36, 101–130.
Panhans, MT and Singleton, JD (2017) The empirical economist's toolkit: from models to methods. History of Political Economy 49, 127–157.
Perdana, A, Glandon, D, Moore, N and Snilsveit, B (2020) How do electricity access interventions affect social outcomes? A forthcoming systematic review. International Initiative for Impact Evaluation (3ie).
Peters, J (2009) Evaluating rural electrification projects – methodological approaches. Ruhr Economic Papers, No. 136, RWI – Leibniz-Institut für Wirtschaftsforschung, Essen.
Peters, J and Sievert, M (2016) Impacts of rural electrification revisited – the African context. Journal of Development Effectiveness 8, 327–345.
Peters, J, Vance, C and Harsdorff, M (2011) Grid extension in rural Benin: micro-manufacturers and the electrification trap. World Development 39, 773–783.
Peters, J, Langbein, J and Roberts, G (2018) Generalization in the tropics – development policy, randomized controlled trials, and external validity. The World Bank Research Observer 33, 34–64.
Pritchett, L and Sandefur, J (2015) Learning from experiments when context matters. American Economic Review 105, 471–475.
Pritchett, L, Samji, S and Hammer, JS (2013) It's all about MeE: using structured experiential learning ('e') to crawl the design space. Center for Global Development, Working Paper, 322.
Ravallion, M (2007) Evaluating anti-poverty programs. In Schultz, TP and Strauss, JA (eds), Handbook of Development Economics 4. Elsevier, pp. 3895–3962.
Ravallion, M (2009) Should the randomistas rule? The Economists' Voice 6(2). Available at https://doi.org/10.2202/1553-3832.1368
Ravallion, M (2020) Should the randomistas (continue to) rule? In Bédécarrats, F, Guérin, I and Roubaud, F (eds), Randomized Control Trials in the Field of Development: A Critical Perspective. Oxford, UK: Oxford University Press, pp. 47–78.
Rodrik, D (2008) The new development economics: we shall experiment, but how shall we learn? Harvard Kennedy School Faculty Working Paper Series, No RWP08-055.
Rose, JK (1940) Rural electrification: a field for social research. Rural Sociology 5, 411–426.
Schmidt, CM (2001) Knowing what works: the case for rigorous program evaluation. Available at https://ssrn.com/abstract=273173
Vivalt, E (2020) How much can we generalize from impact evaluations? Journal of the European Economic Association 18, 3045–3089.
Vivalt, E and Coville, A (2023) How do policymakers update their beliefs? Journal of Development Economics 165, 103121.
Vrolijk, K and Sato, M (2023) Quasi-experimental evidence on carbon pricing. The World Bank Research Observer 38, 213–248.
White, H (2009) Theory-based impact evaluation: principles and practice. Journal of Development Effectiveness 1, 271–284.
Whittington, D, Jeuland, M, Barker, K and Yuen, Y (2012) Setting priorities, targeting subsidies among water, sanitation, and preventive health interventions in developing countries. World Development 40, 1546–1568.
World Bank (2018) Africa's Pulse, Spring 2018: Analysis of Issues Shaping Africa's Economic Future (April). Washington, DC: World Bank.
Young, K, Ashby, D, Boaz, A and Grayson, L (2002) Social science and the evidence-based policy movement. Social Policy and Society 1, 215–224.