8.1 Introduction
A transparency revolution is sweeping the social sciences.Footnote 1 The failure to replicate existing findings, a suspicious absence of disconfirming results, the proliferation of uninformative or inaccurate citations, and broader concerns about a media environment that privileges “fake news” and sensationalism over rigorously grounded facts have all raised concerns about the legitimacy and credibility of academic scholarship. Journals, professional associations, funders, politicians, regulators, and colleagues now press researchers to open their data, analysis, and methods to greater scrutiny.Footnote 2 Qualitative researchers who conduct case studies, collect archival or interview data, and do ethnography, participant observation, or other types of nonquantitative studies are no exception. They have been developing specific standards and techniques for enhancing transparency, including some that exploit digital technology. Reputable research now requires more than solid empirical evidence, state-of-the-art theory, and sophisticated methods: It must be transparent.Footnote 3
Yet the transparency of qualitative analysis by practitioners in governmental, intergovernmental, and civil society institutions lags behind. In recent years, practitioners have pushed policy-makers to improve governmental transparency, yet, ironically, the data, analysis, methods, and other elements of their own research lack a similar openness.Footnote 4 The data and analysis in policy case studies and histories, after-action reports, and interview or focus-group analyses are often opaque. This is troubling, since the justifications for enhancing transparency in academic research apply equally, or even more so, to research by practitioners in governments, think tanks, and international organizations. To them, moreover, we can add numerous and pressing justifications for greater transparency specific to the policy world. Safeguarding the clarity, accessibility, and integrity of policy-relevant research helps ensure that decision-makers avoid basing costly policy interventions on flawed analysis or incomplete information. Transparency helps guard against potential conflicts of interest that might arise in research or policy implementation. Most importantly, it opens up public assessment and evaluation to proper official and public deliberation – thus according them greater legitimacy.
This chapter offers a brief background on the basic logic and practice of transparency in qualitative social science and reviews the cost-effectiveness of the available practical options to enhance it – both in the academy and in the policy world. Section 8.2 defines three dimensions of research transparency and explores the distinctive features of qualitative research, which suggest why the applied transparency standards in qualitative research may differ from those employed in quantitative research. Section 8.3 examines four strategies to enhance transparency. It argues that in most cases it is infeasible and inappropriate – and, at the very least, insufficient – for qualitative policy analysts to rely on the three most commonly discussed ones: conventional footnotes, hyperlinks to web-based sources, or, as some suggest by analogy to statistical research, centralized “datasets” to store all of a project’s qualitative source material. Section 8.3.4 then introduces a newer strategy that is emerging as a “best practice.” This is “Active Citation” (AC) or “Annotation for Transparency Initiative” (ATI): a digitally enabled open-source discursive annotation system that is flexible, simple, and compatible with all existing online formats.Footnote 5 For practitioners, as for scholars, AC/ATI is likely to be the most practical and broadly applicable means to enhance the transparency of qualitative research and reporting.
8.2 Research Transparency in the Social Sciences
Transparency is a norm that mandates that “researchers have an ethical obligation to facilitate the evaluation of their evidence-based knowledge claims.”Footnote 6 This is a foundational principle of all scientific work. Scholars embrace it across the full range of epistemological commitments, theoretical views, and substantive interests. It enjoys this status because nearly all researchers view scholarship as a collective enterprise: a conversation among scholars and often extending to those outside academia.Footnote 7 Researchers who conduct transparent work enhance the ability of others to engage in the conversation through productive evaluation, application, critique, debate, and extension of existing work. Without transparent data, theory, and methods, the conversation would be impoverished. A research community in which scholars can read, understand, verify, and debate published work when they choose should foster legitimate confidence in results. A research community in which analysts accept findings because of the prominence of the author or the apparent authority of big data, copious citations, clever arguments, or sophisticated “gold standard” methods should not inspire trust.
Research transparency has three broad dimensions.Footnote 8 The first, data transparency, stipulates that researchers should publicize the data and evidence on which their research rests. This helps readers apprehend the richness and diversity of the real-world political activity scholars study and assess for themselves to what extent (and how reliably) the evidence of that activity confirms the particular descriptive, interpretive, or causal claims and theories linked to it. The second dimension, analytic transparency, stipulates that researchers should publicize how they interpret and analyze evidence in order to generate descriptive and causal inferences. In social research, evidence does not speak for itself but is analyzed to infer unobservable characteristics such as preferences, identities, beliefs, rationality, power, strategic intent, and causality. For readers to understand and engage with research, they must be able to assess how the author purports to conceptualize and measure behavior, draw descriptive and causal inferences from those measures, determine that the results are conclusive vis-à-vis alternatives, and specify broader implications. The third dimension, production transparency, stipulates that social scientists should publicize the broader set of design choices that underlie the research. Decisions on how to select data, measure variables, test propositions, and weight overall findings – before, during, and after data analysis – often drive research results by defining the particular combination of data, theories, and methods they use for empirical analysis. Researchers are obliged, to the extent possible, to afford readers all three types of research transparency.
These three elements of research transparency underlie all scientific research communities, including those in fields such as history, law, ethnography, policy assessment, and discourse analysis.Footnote 9 Yet the form that transparency takes varies by research method. The appropriate rules and standards of applied transparency in qualitative research, for example, differ from those governing quantitative research. An ideal-typical qualitative case study of public policy has three distinctive characteristics. It focuses intensively on only one or a few cases. It employs primarily textual evidence, such as documents, transcripts, descriptions, and notes (though visual and numerical evidence may sometimes also be used). And, finally, it is generally reported and written up as a temporal, causal, or descriptive narrative, with individual pieces of evidence (and interpretation) inserted at specific points in the story. Different types of data and inference should generate subtly different transparency norms.
Qualitative research methods – intensive, text-based narrative studies of individual cases – are indispensable. They play a critical role in a healthy and balanced environment of research and policy evaluation – not just in the academy, but in the policy world as well. In both contexts, qualitative research enjoys distinct comparative advantages. For policy-makers, one of the most important is that qualitative analysis permits analysts to draw inferences from and about single cases (see Cartwright, Chapter 2 this volume). Detailed knowledge and insights about the characteristics of a single case, rather than average outcomes, are often what policy-makers and analysts most need. This may be because some types of phenomena are intrinsically rare, even unique. If only a limited number of cases exist, a case study may be the best way to inform policy.Footnote 10 The demand for precise knowledge about a single case may arise also because policy-makers are focused on designing a particular intervention at a specific time and geographical location. Even if solid quantitative generalizations exist, policy-makers often want to know exactly what mix of factors is at work in that case – that is, whether the case before them is a typical case or an outlier. If, for example, after-action reports show that a promising program design recently failed when implemented in Northern India, does that mean it is less likely to succeed if launched in Bolivia? Answering this type of everyday policy problem in real time often requires detailed knowledge of important contextual nuances of the local culture, politics, and economics. This, in turn, implies that, in order to be useful, the original after-action report may want to consider detailed evidence of incentives, perceptions, and inclinations as revealed by actions, documents, and statements. 
For similar reasons, case studies often enjoy a comparative advantage in situations where analysts possess relatively little prior knowledge and seek to observe and theorize previously unknown causal mechanisms, social contexts, and outcomes in detail, thus contributing to the development of new and more accurate explanations and theories.Footnote 11
8.3 Practical Options for Enhancing Qualitative Transparency
By what means can we best render qualitative research more transparent? Social scientists generally possess some inkling of the research transparency norms governing statistical and experimental research. When we turn to qualitative research, however, many analysts remain unaware that explicit standards for transparency of data, analysis, or methods exist, let alone what they are. In recent years, qualitative social scientists have moved to establish stronger norms of transparency. Building on the American Political Science Association’s initiative on Data Access and Research Transparency (APSA/DA-RT) in the US field of political science, a team of scholars has developed specific applied transparency guidelines for qualitative research.Footnote 12 A series of conferences, workshops, journal articles, and foundation projects are further elaborating how best to implement qualitative transparency in practice.Footnote 13 The National Science Foundation (NSF) has funded a Qualitative Data Repository (QDR) based at Syracuse University, as well as various projects demonstrating new transparency standards and instruments that use new software and internet technologies.Footnote 14
Scholars have thereby generated shared knowledge and experience about this issue. They have learned that qualitative research poses distinctive practical problems due to factors such as human subject protection, intellectual property law, and logistical complexity, and distinctive epistemological problems, which arise from its unique narrative form. These must be kept in mind when assessing alternative proposals to enhance transparency.
Four major options exist: conventional footnotes, hyperlinks to online sources, archiving textual data, and digitally enabled discursive notes. A close examination of these options reveals, first, that the practical and epistemological distinctiveness of qualitative research implies a different strategy than is employed in quantitative research, and, second, that the optimal strategy is that of creating digital entries containing annotated source material, often called Active Citation or the Annotation for Transparency Initiative. We consider each of these four options in turn.
8.3.1 Conventional Footnotes
The simplest and most widespread instruments of transparency used today in social science are citations found in footnotes, endnotes, and the text itself. Yet the current state of citation practice demonstrates the flaws in this approach. Basic citations in published work are often incomplete or incorrect, particularly if they appear as brief in-text “scientific citations” designed for a world in which most (quantitative) analysts use footnotes to acknowledge other researchers rather than cite evidence. Such citations do not provide either data access or analytic transparency. Scientific citations are often incomplete, leaving out page numbers and failing to specify the concrete textual reference within an article or on a page that the author considers decisive. Even if a citation is precise, most readers will be deterred by the need to locate each source at some third location, perhaps a library or an archive – and, in many cases, as with interviews and records of focus groups, the source material may not be available at all.Footnote 15 Even more troubling, conventional citations offer no analytical transparency whatsoever: the reader knows what is cited, but generally much less about why.
In theory, an attractive solution would be to return to the traditional method of linking evidence and explanation in most scholarly fields: long discursive footnotes containing extended quotations with interpretive annotations. Discursive footnotes of this kind remain widespread in legal academia, history, some humanities, and a few other academic disciplines that still prize qualitative transparency. In legal academia, for example, where fidelity to the precise text and rigorous interpretation are of great academic and practical value, articles may have dozens, even hundreds, of such discursive footnotes – a body of supplementary material many times longer than the article itself. The format evolved because it can enhance all three dimensions of transparency. The researcher is often obliged to insert extensive quotations from sources (data access); annotate those quotations with extensive interpretation of how, why, and to what extent they support a claim made in the text and how they fit into the broader context (analytic transparency); and discuss issues of data selection and opposing evidence (production transparency). At a glance, readers can scan everything: the main argument, the citation, the source material, the author’s interpretation, and information about how representative the source is. In many ways, discursive footnotes remain the “best practice” instruments for providing efficient qualitative transparency.
Yet recent trends in formatting social science journals – in particular, the advent of so-called scientific citations and ever-tighter word limits – have all but banished discursive footnotes. This trend is not methodologically neutral: it privileges quantitative research that employs external datasets and cites secondary journals rather than data, while blocking qualitative research from citing and interpreting texts in detail. As a result, in many social sciences, we see relatively little serious debate about the empirics of qualitative research. Replication or reanalysis is extremely difficult, and extension or secondary analysis almost impossible.Footnote 16 Given the economics of social science journals, this trend is unlikely to reverse. Practitioners and policy analysts face similar constraints, because they often aim their publications, at least in part, at nonexperts. Memos and reports have been growing shorter. Long discursive footnotes pose a visual barrier, both expanding the size of a text, and rendering it less readable and accessible. In sum, conventional footnotes and word limits are part of the problem, not the solution.
8.3.2 Hyperlinks to Online Sources
Some suggest that a simple digital solution would be to link articles and reports to source documents already posted online. Many government reports, journalistic articles, works of contemporary scholarship, and blogs do just this. Yet this offers an inadequate level of research transparency, for three basic reasons. First, much material simply cannot be found online: Most primary field research evidence (e.g., interviews) is not there, and despite the efforts of archives to digitize, we are far from having all documents online even in the most advanced industrial democracies, let alone elsewhere. Even journalistic articles and secondary scholarly works are unevenly available, with much inaccessible online (or hidden behind paywalls), in foreign languages, or buried within longer documents. Second, links to outside sources are notoriously unstable, and subject to “link rot” or removal.Footnote 17 Attempts to stabilize links to permit cross-citation have proven extremely challenging even when they focus on a very narrow range of documents (e.g., academic medical journals), and it is nearly impossible to do so if one is dealing, as policy analysts do, with an essentially unlimited range of contemporary material of many types and in many languages. Third, even when sources are available online – or when we place them online for this purpose – hyperlinks provide only data transparency, not analytic or production transparency. We learn what source a scholar cited but not why, let alone how he or she interpreted, contextualized, and weighed the evidence. This undermines one of the distinctive epistemological advantages of qualitative research.
8.3.3 Archiving Evidence in a Centralized Database
For many from other research traditions, data archiving may seem at first glance the most natural way to enhance transparency. It is, after all, the conventional solution employed by statistical researchers, who create centralized, homogeneous “datasets” where all evidence is stored, connected to a single set of algorithms used to analyze it. Moreover, data repositories do already exist for textual material, notably the Qualitative Data Repository for social science materials recently established with NSF funding at Syracuse University.Footnote 18 Data archiving is admittedly essential, especially for the purpose of preserving complete collections of new field data drawn from interviews, ethnographic notes, primary document collections, and web-searches of manageable size that are unencumbered by human subject or copyright restrictions.Footnote 19 Archiving full datasets can also help create a stronger bulwark against selection bias (“cherry-picking” or constructing biased case studies by selecting only confirming evidence) by obliging qualitative scholars to archive “all” their data.
Yet, while data archiving can be a useful ancillary technique in selected cases, it is unworkable as a general “default” approach for assuring qualitative research transparency because it is both impractical and inappropriate. Archiving is often impractical because ethical, legal, and logistical constraints limit the analyst’s ability to reveal to readers all the interviews, documents, or notes underlying qualitative research. Doing so often threatens to infringe the confidentiality of human subjects and violates copyright law limiting the reproduction of published material.Footnote 20 Sanitizing all the interviews, documents, and notes (i.e., rendering them entirely anonymous and consistent with confidentiality agreements) is likely to impose a prohibitive logistical burden on many research projects. These limitations become much greater when the researcher seeks to archive comprehensive sets of complete documents, as opposed to just releasing quotations or summaries, as some other transparency strategies require. This is often particularly problematic for policy practitioners, perhaps more so than scholars, because policy case studies and histories, after-action reports, and interview or focus-group analyses so commonly contain sensitive information.
Archiving is also inappropriate because it dilutes the distinctive epistemological advantages of qualitative research. The notion that archiving documents in one large collection generates transparency overlooks a distinctive quality of case study analysis. A qualitative analyst does not treat the data as one undifferentiated mass, analyzing all of it at once using a centralized algorithm, as in a statistical study. Instead, he or she presents and interprets individual pieces of data one at a time, each linked to a single step in the main narrative.Footnote 21 Qualitative analysts enjoy considerable flexibility to assign a different location, role, relative weight, reliability, and exact meaning to each piece of evidence, depending on its logical position in a causal narrative, the specific type of document it is, and the textual content of the quotation within that document. This type of nuanced and open-ended, yet rigorous and informed, contextual interpretation of sources is highly prized in fields such as history, law, anthropology, and the humanities. Any serious effort to enhance qualitative transparency must thus make clear to the reader how the analyst interprets each piece of data and exactly where in the narrative it fits. Simply placing all the evidence in a single database, even where it is logistically and legally feasible, does not help the reader much.Footnote 22 Links from citations to archived material are, at best, cumbersome. Moreover, as with hyperlinks and conventional citations, archiving fails to specify particular passages and provides little analytic transparency, because it fails to explain why each source supports the underlying argument at that point in the narrative. To achieve qualitative transparency, a less costly approach is required – one that reveals the inferential connection between each datum and the underlying analytical point in the narrative.
8.3.4 Active Citation/ATI: A “Best Practice” Standard of Qualitative Transparency
Given the practical and epistemological constraints outlined above, social scientists have recently agreed that the best way to enhance transparency is to exploit recent innovations in internet formatting and software engineering. These technologies permit us to create new digital formats that can reestablish the high levels of qualitative transparency afforded by discursive footnotes in a more efficient and flexible way. Active Citation (AC) and Annotation for Transparency Initiative (ATI) are two related, digitally enhanced transparency standards designed to do just this. They are practical and epistemologically appropriate to qualitative research.
AC/ATI envisages a digitally enabled appendix to research publications and reports. Rather than being an entirely separate document, however, the appendix embeds each source and annotation in an entry linked to a specific statement or citation in the main narrative of a research article or report. These may take the form of numbered hyperlinks from the article to an appendix or, in the ATI version, a set of annotations that overlay the article using a separate but parallel software platform. Unlike modern in-text footnotes, hyperlinks, and archiving, AC/ATI reinforces the epistemological link between narrative, data, and interpretation central to qualitative research. This author-driven process of annotation and elaboration via a separate document assures the same (or greater) levels of data, analytical, and production transparency as discursive footnotes, but with greater flexibility and no constraint on overall length. Moreover, it reduces the logistical difficulties by leaving the existing format of basic digital or paper articles and reports completely unchanged. Indeed, AC/ATI has the advantage that some audiences can simply skim or read the article without any additional materials, while those with a desire for more information can activate the additional materials.
Two ways exist to implement the AC/ATI standards. One, initially proposed by advocates of AC, obliges authors to design standardized entries that promote realistic levels of data, analytic, and production transparency in a relatively structured way. Accordingly, AC prescribes that researchers link each annotation that concerns an “empirically contestable knowledge claim” to a corresponding appendix entry. Of course, this still leaves tremendous leeway to the author(s), who decide (as with any footnote or citation) what is sufficiently “empirical” or “contestable” to merit further elaboration. Once an author decides that further elaboration is required, each entry would contain three mandatory elements and room for one more optional one – though, again, the author would decide how detailed and lengthy this elaboration needs to be.
An examination of the four elements in an AC entry shows how, in essence, this system simply updates the centuries-old practice of discursive footnoting in a flexible, author-driven, and electronic form appropriate to a digital age.Footnote 23 The four elements that can be in each entry are:
1) A textual excerpt from the source. This excerpt is presumptively 50–100 words long, though the length is ultimately up to the author. It achieves basic qualitative data transparency by placing the essential textual source material that supports the claim “one click away” from the reader. Sources subject to human subject or copyright restrictions can be replaced with a sanitized version, a summary, or a brief description, as is feasible. This provides a modest level of prima facie data transparency, while minimizing the logistical demands on authors, the ethical threats to subjects, and the potential legal liability.
2) An annotation. This interpretive commentary explains how, why, to what extent, and with what certainty the source supports the underlying claim in the main text. This provides basic analytic transparency, explaining how the author has interpreted the source. Here the author may address not just the analysis of a given source but also its interpretive context, how representative it is of a broader sample, the existence of counterevidence, how it was translated, and so on. This annotation can be of any length the author believes is justified.
3) A copy of the full footnote citation, sufficient to locate the document. This is critical because authors may seek to use the appendices independently of the text – for example, in a bibliography or database. Also, it assures that, whatever the format being employed in the main report, a genuine full citation exists somewhere, which is far from true today.
4) An optional link to (or scan of) the full source. A visual copy of the source would provide more context and unambiguous evidence of the source, as well as creating additional flexibility to accommodate nontraditional sources such as maps, charts, photographs, drawings, video, recordings, and so on. This option can be invoked, however, only if the author has the right to link or copy material legally and the ability to do so cost effectively, which may not always be the case – and doing so at all remains at the discretion of the author.
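The four elements above can be read as a simple record structure. The following is a minimal sketch only, not a format prescribed by AC: the class name, the field names, and the `claim_id` key that ties an entry to a citation in the main text are all hypothetical and are introduced here purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ActiveCitationEntry:
    """One appendix entry linked to a single claim in the main text.

    Hypothetical schema for illustration; AC itself does not prescribe these names.
    """
    claim_id: str                      # hypothetical key linking the entry to a citation in the text
    excerpt: str                       # element 1: source excerpt, or a sanitized version or summary
    annotation: str                    # element 2: how and why the source supports the claim
    full_citation: str                 # element 3: complete citation, sufficient to locate the document
    source_link: Optional[str] = None  # element 4: optional link to (or scan of) the full source

    def missing_elements(self) -> List[str]:
        """Return the names of the three mandatory elements that were left blank."""
        required = {
            "excerpt": self.excerpt,
            "annotation": self.annotation,
            "full_citation": self.full_citation,
        }
        return [name for name, value in required.items() if not value.strip()]

    def excerpt_word_count(self) -> int:
        """Count words in the excerpt; the text suggests 50-100 as a presumptive length."""
        return len(self.excerpt.split())
```

In this sketch the optional fourth element defaults to empty, and nothing enforces the presumptive excerpt length, which mirrors the point that length and detail remain at the author's discretion.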
Of course, the de facto level of transparency that an author chooses to provide in any specific case will still reflect other important constraints. One constraint is ethical. Active citations cannot make transparent any material that would harm research subjects or that is subject to confidentiality agreements. Ethical imperatives obviously override transparency.Footnote 24 A second constraint is legal. The content of the entries must respect intellectual property rights. Fortunately, short excerpts from most published material (except artistic or visual products) can be quoted under “fair use” or its equivalent in almost all jurisdictions – but in cases of conflict, legal requirements override transparency. A third constraint is logistical. The amount of time and effort required to provide discursive notes of the type AC envisages is generally manageable, since discursive footnotes with roughly the same content were the norm in some academic disciplines and were widely used in the social sciences until a generation ago – and still appear in many published books. Today, the advent of electronic scanning and word processing makes the process far easier. Even so, one can readily imagine situations in which annotation would create excessive work relative to the likely benefit. This is yet another reason why the decision of how many annotations to provide, and how long and detailed they should be, remains primarily with individual authors, subject to guidance from relevant research communities, as is currently the case with conventional citations.
ATI offers the slightly different prospect of a more flexible, open-ended standard. ATI’s major innovation is to use annotation software provided by the nonprofit firm Hypothesis.Footnote 25 In lieu of storing the annotated source entries in a conventional appendix (akin to existing practice with formal and quantitative research) and hyperlinking individual entries to selected citations, as AC initially recommended, ATI allows the annotations to be written at will, stored in a separate program, and seamlessly layered on top of a PDF article by running the two programs simultaneously. ATI software makes the annotated sections appear as highlighted portions of the article, and when one clicks on a section of highlighting, the additional material appears in a box alongside the article. ATI provides a particularly efficient and manipulable means of delivering this source material and these annotations, and it provides almost infinite flexibility to authors. In trials, authors use the software to add annotations as they see fit. This type of software option also allows for separate commentary by readers. One might imagine the social sciences moving forward for a time with a set of such experiments that recommend no specific set of minimum standards for transparency but permit authors to define their own digital options. In a number of large test studies, dozens of younger scholars have tried ATI out with considerable enthusiasm, and this approach is in the process of adoption by major university presses that publish journals. This, it seems, is the future.
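The layering idea described here, in which annotations are stored apart from the article and attached to highlighted passages, can be sketched in a few lines. This is an illustrative assumption about how such an overlay might work in principle, not a description of the software ATI actually uses; the names `OverlayAnnotation` and `locate_highlights` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class OverlayAnnotation:
    """An annotation stored separately from the article (hypothetical structure)."""
    quote: str  # exact passage in the article that should appear highlighted
    note: str   # the author's source material and commentary for that passage


def locate_highlights(
    article_text: str, annotations: List[OverlayAnnotation]
) -> List[Tuple[int, int, str]]:
    """Return (start, end, note) for each annotation whose quote occurs in the article.

    Only the first occurrence of each quote is used. Annotations whose quote
    cannot be found are skipped, so the article text itself is never modified.
    """
    spans = []
    for ann in annotations:
        start = article_text.find(ann.quote)
        if start == -1:
            continue  # the quoted passage is absent; leave this annotation unattached
        spans.append((start, start + len(ann.quote), ann.note))
    return sorted(spans)
```

Because the article is only read and never rewritten, the published text stays unchanged, which is the property the paragraph above emphasizes.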
8.4 Conclusion: Qualitative Transparency in the Future
Qualitative social science journals, publishers, and scholars, having inadvertently undermined traditional qualitative transparency in recent decades, appear now to be moving back toward the higher levels practiced by researchers in history, law, and the humanities. An approach such as AC/ATI offers a more attractive trade-off between enhanced research transparency and the imperatives of ethics/confidentiality, intellectual property rights, and logistics than that offered by any existing alternative, even if data archives, conventional citations, and hyperlinks to existing web sources can occasionally be useful. These new digital standards are logistically efficient and flexible in the face of competing concerns, and they remain firmly decentralized in the hands of researchers themselves. Over the next decade, journals and research communities are likely to adopt levels and strategies of qualitative transparency that differ in detail but all move in this direction, not least because funders and fellow scholars are coming to expect it. Thus, while it remains to be seen precisely how standards for qualitative transparency will evolve in the future, it seems likely that digital means will be deployed more intensively to enhance research transparency. This is true not just because they render social science research richer and more rigorous, but because society as a whole is moving in that direction. As digital transparency that clicks through to more detailed source material has become the norm in journalism, government messaging, business, and entertainment, the notion that researchers should not follow suit seems increasingly anachronistic. The same is true, of course, for practitioners and policy analysts who work on the major international challenges of our time.