1. The Explanandum for a Theory of Justification
It is sometimes said that justification lies at the heart of epistemological study. For instance, Fumerton states that “the concept of justification may be the most fundamental in epistemology” (2005: 204). Even today, when epistemology is a more varied domain than just the project of theorising about knowledge and justification, it is still fair to say that justification is taken to be one of the central concepts. A key reason for thinking this is that justification is one of the notions that distinguish epistemology from the psychological project of capturing how we do in fact form beliefs. With justification comes an element of normativity, a domain within which the question is not only how we do in fact form our beliefs, but how we ought to.
Corpus analysis is an empirical methodology that has been attracting increasing levels of interest across a range of disciplines, particularly with the rise of the so-called digital humanities. In this paper, I will show how this methodology can contribute to the more traditional epistemological investigation of justification. In particular, a corpus analysis of the lemma “justify” provides us with a new way of investigating our ordinary epistemic practices. Before I can make the case for this, however, some groundwork is required in order to distinguish between different kinds of theories of justification. Rather than distinguishing along familiar lines between those theories that posit some foundational level of justification and those that don't, or between theories that have some internalist requirement and those that don't, I instead want to distinguish theories in terms of what they take the explanandum to be. One feature of much of the debate on justification is that there has been a lack of clarity on this matter, and so it is unsurprising that some have felt that the justification debate in epistemology is in some sense defective (Alston 1993; Cohen 2016). We can in fact distinguish between three broad approaches to justification that differ in terms of their explanandum, and the corpus analytical findings to be presented are relevant primarily to only one of them.
The first approach takes there to be a notion of justification that is already circulating within our community, i.e. the notion that serves as the standard of epistemic evaluation by which we judge our own beliefs and the beliefs of others. A theory of justification, then, seeks to capture the ordinary notion of epistemic evaluation. I will call this the folk justification approach. One common way of embarking upon an investigation of folk justification is to attempt a conceptual analysis of justification, by providing a set of necessary and sufficient conditions for when a given belief is justified. The formulation of these conditions would be guided by consideration of hypothetical cases and whether beliefs in these cases would count as justified. This familiar approach treats justification in the same way that knowledge has commonly been treated. Of course, conceptual analysis has faced a number of criticisms. For instance, certain experimental findings suggest that the kinds of intuitions that philosophers typically rely upon in conceptual analysis are in fact dependent upon extraneous factors such as ordering effects, socio-economic background, ethnic background etc. (Weinberg et al. 2001; Swain et al. 2008; Beebe and Buckwalter 2010).Footnote 1 This has led to a wider debate about the role of intuitions in philosophy and whether conceptual analysis is a suitable philosophical method (Williamson 2007; Cappelen 2012; Deutsch 2015).
But whether conceptual analysis or the elicitation of intuitions is a suitable method in philosophy is not essential to the folk justification approach as it is understood here – what is essential is that there is some notion of epistemic evaluation that is part of our ordinary epistemic practices and that this notion is the explanandum for a theory of justification.
We can certainly find support for the folk justification approach in the literature. For instance, Goldman (1979) was explicit that he was seeking an account of the notion of justification as ordinarily used and that a theory of justification “will be a set of principles that specify truth conditions for the schema ‘S's belief in p at time t is justified’ i.e. conditions for the satisfaction of the schema in all possible cases” (Goldman 1979: 3). Many authors have since explicitly endorsed the view that justification is reflected in our ordinary epistemic judgments. To select just one further example, Ichikawa (2014: 188) states “our ordinary epistemic judgments respect a distinction between knowledge and non-knowledge justified belief”. This would also be a natural way of understanding some of the key thought experiments regarding justification. For instance, Lehrer's (1990: 163) Mr Truetemp case, BonJour's (1985: 38) clairvoyant case, and Lehrer and Cohen's (1983) new evil demon case are all thought experiments whereby we consider whether a given subject has a justified belief.Footnote 2 In relying on our ordinary intuitions in order to reach a verdict regarding these cases, the thought presumably is that we are able to employ justification as the ordinary notion of epistemic evaluation.Footnote 3
Some have adopted a particularly linguistic form of the folk justification approach, by investigating the linguistic properties of “justify” and related word forms. For instance, Kvanvig and Menzel (1990) urge epistemologists to pay greater attention to the various different locutions in which “justify” appears, and so they provide a theory of a basic notion of justification that can then be used to explain the various locutions. More recently, Hawthorne and Logins (2020) have investigated the semantic gradability of “justified”. They argue that “justified” is an absolute gradable adjective that is only derivatively associated with a scale, and so speakers and hearers must construct an appropriate scale on-the-fly depending on the concerns most salient to them. Certainly it is true that not all advocates of the folk justification approach choose to place such importance on the linguistic properties of sentences containing “justify”, but it is worth keeping this possibility in mind as it will be pertinent to the discussion later in the paper.
In this paper, I aim to use insights from large corpora to pose a challenge to the folk justification approach. However, while the folk justification approach is common, it is not the only way in which justification is conceived of. Here I will consider two prominent alternatives. First, epistemic justification could be conceived of as a theoretical notion, i.e. as whatever it is that one must add to true belief in order to obtain knowledge. This can often feel like the way we introduce new students to the notion of justification, by pointing out that true beliefs can fall short of knowledge if, for instance, they are just lucky guesses. As this approach develops justification in terms of its relation to knowledge, I will label it the Theory of Knowledge (TOK) approach. One problem with the TOK approach is that it is hard to make sense of the Gettier (1963) problem if Gettier cases are supposed to be cases of justified true belief that fall short of knowledge. Of course, one post-Gettier tactic is to claim that knowledge must be justified true belief plus some other property, and nothing I will say rules out that possibility. But notice that now in even claiming this, we seem to be envisioning justification in some way other than as what we add to true belief to get knowledge, and to that extent, we have moved away from the TOK approach. For instance, many view it as a requirement on a theory of justification that it is possible to have false justified beliefs, and this might constrain the theoretical moves available in taking knowledge to be justified true belief plus something else.
But notice that in even taking on this requirement, we seem to have moved away from the TOK approach to some extent – we are placing restrictions on our theory of justification that don't arise from it being a constituent of knowledge.Footnote 4 A second problem with this approach is that some explicitly deny that justification is even necessary for knowledge.Footnote 5 Again, in order for this theoretical move to even make sense, it must be that we are envisioning justification as something other than as the difference between knowledge and true belief.
A second alternative to the folk justification approach is to develop a notion of justification from a theory of epistemic value. Instrumentalism about justification is roughly the view that those beliefs that are epistemically justified are those beliefs that are conducive to the epistemic goal or goals. If we take this as our starting point for a theory of justification, then it is conceivable that we could construct a theory of justification based on a theory of epistemic value. If we are able to identify what is epistemically valuable, and then identify the minimally acceptable ways of promoting that value in terms of belief-formation, then we would thereby have some account of epistemic permissibility that could serve well enough as a theory of epistemic justification. I won't say too much about this approach here, other than that while there are advocates of this approach in the literature (Alston 1985; BonJour 1985; Foley 1987), it is unclear to what extent a pure version of this approach has been adopted, i.e. one not combined with the previous two approaches mentioned. To take one example, Foley (1987, 2001, 2008) is an instrumentalist of this stripe who urges that we must theorise about epistemic rationality, with no initial regard for whether it has some conceptual connection to knowledge. Instead, we should try to capture epistemic rationality in its own right and its connections to other forms of rationality. However, in doing so, Foley takes it to be a success criterion for the theory that it captures “the everyday assessment of rationality of opinions, which tend to focus on whether individuals have been responsible in forming their opinions rather than on whether they have satisfied the prerequisites of knowledge” (Foley 2001: 214).
To the extent that the theory is not merely sensitive to considerations about epistemic value, but also to the manner in which we evaluate beliefs in the everyday, the approach to justification described here is combined with the folk justification approach.
In sum, we can distinguish between three positive approaches to justification: the folk justification approach, the TOK approach, and the instrumentalist approach. It is the folk justification approach that is of primary concern in this paper, and I don't take what I say here to be particularly damaging to the alternative approaches. But it is important to emphasise that pure versions of the alternative approaches are relatively rare in the literature. In the case of the TOK approach, Gettier famously argued that we shouldn't think that justification is the difference between knowledge and mere true belief, and for many this is one of the few points of agreement in epistemology. In the case of instrumentalism, there usually is appeal to some ordinary notion of epistemic evaluation, as we saw in the case of Foley. I haven't shown that pure versions of the alternative approaches are not viable; I merely want to emphasise that the folk justification approach is a very common one and that there is good reason to adopt it.
In this paper, I will show how corpus analysis can serve a theory of justification by shedding light on the extent to which there is an ordinary notion of justification. In particular, I will analyse the use of the lemma “justify” and its related word forms in large English corpora.Footnote 6 After presenting a range of insights from a few different corpora, I will present a challenge that any advocate of the folk justification approach must meet. As a preview: the challenge for the folk justification approach is to give some account of how the ordinary notion of justification is spoken about. In this paper, I will explore the most straightforward possibility, namely that speakers use “justify” to speak about epistemic justification. I will argue that the evidence available from various corpora suggests that “justify” is not being widely used to talk about the epistemic justification of beliefs.
2. Corpus Analysis: Principles and Tools
Corpus linguistics attempts to generate linguistic insight on the basis of evidence drawn from linguistic corpora. The history of corpus linguistics is certainly an interesting one,Footnote 7 but it is fair to say that the approach has become much more powerful in the last few decades as large corpora, as well as the computational methods used to analyse them, have become more readily available. In terms of the study of meaning, it is often said that a starting principle within corpus linguistics is the distributional hypothesis: that there is a correlation between the meaning of a term and its distribution across a corpus, or as Harris (1954: 156) puts it:
if we consider words or morphemes A and B to be more different in meaning than A and C, then we will often find that the distributions of A and B are more different than the distributions of A and C. In other words, difference of meaning correlates with difference of distribution.
This hypothesis could be interpreted in stronger or weaker senses. One strong way of interpreting the hypothesis is as claiming that, in the fullest sense, the meaning of a term can be captured purely by an account of its distribution across a suitably large corpus, and that a theory of meaning for a language should be distributional in nature.Footnote 8 It is something like this stronger sense that drives distributional semantics: a form of corpus linguistics where the meanings of terms are typically represented as vectors in a high-dimensional space. For the purposes of this paper, a much weaker version of the distributional hypothesis is required. The thought is that we can gain insight into how a term is used by paying careful attention to (i) the kinds of texts the term is used in and (ii) the other terms that typically collocate with the term. In doing so, we can thereby draw tentative conclusions about the meaning of the term. This kind of inference will be used repeatedly throughout this paper, but I will be explicit at each stage about the kind of reasoning involved.
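To make point (ii) concrete, the following is a minimal sketch of window-based collocate counting. It is an illustration only: the function name `collocates`, the window size, and the two example sentences are all invented for the purpose, and matching is done on exact word forms rather than on lemmas. Tools such as SketchEngine use grammatical relations and association scores rather than raw window counts, so this should be read as a simplified stand-in for that kind of analysis.

```python
import re
from collections import Counter


def collocates(text, node, window=4):
    """Count word forms occurring within `window` tokens of `node`.

    Simplifying assumptions: lower-cased regex tokenisation, exact
    word-form matching (no lemmatisation), raw counts (no association
    measure such as logDice or mutual information).
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the node occurrence itself
                counts[tokens[j]] += 1
    return counts


# Hypothetical two-sentence "corpus", invented for illustration.
sample = "The minister tried to justify the decision. Nothing can justify the cost."
print(collocates(sample, "justify", window=2).most_common(3))
```

On a real corpus one would normally lemmatise first and rank collocates by an association measure, since raw counts are dominated by function words such as “the”.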
It should be noted that although this paper can be viewed as a form of experimental philosophy insofar as it will introduce empirical considerations from large corpora, the methodology used here is largely exploratory. That is to say, the conclusions drawn in this paper are not reached via statistical tests of significance concerning the relationship between a dependent and independent variable. This is not because such tests are not possible in corpus analysis, even if corpus analysis is particularly suited to exploratory investigations.Footnote 9 The exploratory character of the methodology does affect the kinds of conclusions that one can draw from the investigation, but I nevertheless hope to illustrate that such conclusions can prove useful within epistemology.
Something I hope to show in this paper is that corpus analysis can be a powerful tool for philosophical investigation, particularly considering the previously mentioned criticisms that more traditional methodologies have faced. Unlike the traditional armchair methods, corpus analysis depends primarily on patterns to be found in empirical data rather than on intuitions of the theorist. And unlike survey-based methods, there is no worry about ecological validity as corpora are recordings of authentic language use rather than elicited language use in an experimental setting. This is not to say that corpus analysis is always superior to alternative methodologies, but it is to say that if there are ways in which it can help shed light on philosophical questions, we absolutely should take advantage of that. In recent times, others have attempted to do precisely this (Hansen et al. 2019; Sytsma et al. 2019; Liao and Hansen 2022), and this paper follows their lead to that extent. However, one distinctive aspect of this paper is in the claim that not only do corpus findings raise a challenge for the folk justification approach, but that corpus analysis can also help the folk justification approach to answer that challenge.
Turning to the kinds of corpora that will be used, we require corpora that are representative of ordinary language use so that we can consider whether a notion of epistemic justification occurs within our ordinary linguistic practices. My focus will be only on English language use, and there are a few English language corpora that are readily available for analysis. I will use three:
(i) Corpus of Contemporary American-English (COCA)
COCA is a large corpus of over 1 billion words of contemporary American-English, collected between 1990 and 2019. It is evenly balanced across eight different genres of text (Spoken, Fiction, Magazines, Newspapers, Academic Writing, Web, Blogs, TV and Movies). This not only ensures that the corpus is fairly representative, but also allows one to investigate how a term is used differently across different genres. Another attractive feature of COCA is that it is easy to explore using the host website (https://www.english-corpora.org/coca/).
(ii) British National Corpus (BNC)
The British National Corpus is a 100-million-word corpus of British English collected in the late 20th century. It is smaller and older than COCA, and while it is separated into a few different sub-corpora, it has not been curated with the same balance across genres. The key benefit of using the BNC is that there is a great deal more information and metadata about the documents and sub-corpora that make up the BNC, and so we have the opportunity to draw more inferences about the contexts in which a given term (such as “justify”) is used, and the way in which the term is distributed across the corpus. This is particularly true given the BNCWeb interface made available by the University of Lancaster (http://corpora.lancs.ac.uk/BNCweb/), which I will make use of later in the paper.
(iii) EnTenTen20
EnTenTen20 is a very large corpus of around 40 billion words of web-based English. It is compiled by scraping English text from the web up until 2020, and so there isn't the same deliberate balance across genres that one finds in COCA nor the detailed metadata of sub-corpora and documents that one finds in the BNC. Even if a large corpus is not necessarily a representative corpus, there is clearly some reason to think that such a large amount of text taken from the web, that is clearly not restricted to a particular genre or topic, would provide a good account of English language use. As we will see, another attractive feature of investigating EnTenTen20 is that this can be done via the program SketchEngine. SketchEngine is a text analysis program that comes with over 500 corpora across 90 languages pre-loaded onto it (including EnTenTen20). It is particularly useful for performing collocation analyses that take into account the grammatical relations of a term. For instance, if I want to find out the noun that most frequently serves as the object of the verb “catch”, this is easily done (it is “eye”, as in “caught my eye”).
3. Initial Indications: Comparing the Frequency of “Justify” and “Know”
The folk justification approach is commonly employed with regard to knowledge or “know”. That is, it is very common in analytic epistemology to give some account of the concept of knowledge that enjoys wider circulation within our epistemic community. And it seems plausible enough that there is a widely used notion that is appealed to across a range of contexts and that could serve as the explanandum of an epistemological theory. Of course, there have been controversies over whether a conceptual analysis of the notion is possible, and over whether a reliance on the intuitions of philosophers in constructing such an analysis is appropriate. Here I will ignore those controversies and just focus on the initial step of selecting the ordinary notion of knowledge as the explanandum of a theory. We may wonder then whether that step could also be taken with justification. A worry one might have is that whereas “know” is one of the most common verbs used in English,Footnote 10 “justify” is used far less frequently. In COCA, the lemma “know” is the 39th most frequent lemma with a frequency of 2,781.03 per million words, whereas the lemma “justify_v”Footnote 11 is the 2,643rd most frequent lemma (30.41 per million). The noun “knowledge” is the 841st most frequent lemma (116.17 per million) while the noun “justification” lies outside the top 5,000 most frequent lemmas.Footnote 12 There clearly is a marked difference between the frequency of knowledge talk and the frequency of justification talk.
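For readers unfamiliar with normalised frequencies, the arithmetic behind a “per million words” figure can be sketched as follows. The function name `per_million` and the raw count used in the example are hypothetical, chosen only so that the numbers are easy to follow; the two per-million figures in the final comparison are the COCA values reported above.

```python
def per_million(raw_count, corpus_size):
    """Normalise a raw frequency to occurrences per million words."""
    return raw_count / corpus_size * 1_000_000


# Hypothetical figures: 30,410 hits in a corpus of exactly one billion words.
example = per_million(30_410, 1_000_000_000)
print(round(example, 2))

# Ratio of the two normalised COCA frequencies reported in the text.
know_pm = 2781.03     # lemma "know"
justify_pm = 30.41    # lemma "justify_v"
ratio = know_pm / justify_pm
print(f"'know' is roughly {ratio:.0f} times as frequent as 'justify'")
```

Normalising in this way is what allows frequencies to be compared across corpora, or across genres within a corpus, that differ in size.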
This is an initial indication that perhaps justification is not spoken of in ordinary discourse, and certainly not to the same extent that knowledge is spoken of. Others have previously reached (or at least considered) the same conclusion via consideration of the philosophical literature. Plantinga (1990) notes that prior to Gettier, there were very few analyses of knowledge that appeal to justification and so suggests that “it is almost as if a distinguished critic created a tradition in the very act of destroying it” (Plantinga 1990: 45). In his recent data-driven history of philosophy, Brian Weatherson (2020) reaches a similar conclusion.Footnote 13 He uses topic-modelling on a dataset of philosophy journal articles taken from 12 prominent journals between 1876 and 2013. The result is 90 different topics that can be used to group the various articles. More specifically, topic-modelling will take the distribution of the words across the articles and, via the use of an unsupervised learning algorithm, create a number of topics such that each article can be given a probabilistic assignment according to how well it fits under each topic. The topics can be characterised according to a set of keywords that are more likely to turn up in articles under that topic than in the average article. Weatherson's study dives into each of the 90 topics, identifying the subject matter for that topic, and then charting its progress and popularity over time. He reasonably identifies one of the topics as about justification, with the following keywords: [believing, beliefs, epistemically, belief, justification, reliable, justified, reliability, epistemic, goldman, forming, believe, believed, accepting, warrant]. Charting the progress of the topic over time, Weatherson notes that the topic has very few papers from earlier years:
The first paper that gets a topic probability above one-third is Harry Frankfurt! This tells us something interesting about the background to Gettier's 1963 paper. Just a few years before that paper, there was virtually no discussion of beliefs being justified. It wasn't that Gettier showed a familiar concept couldn't play a role in the analysis of knowledge. He effectively introduced the concept of justification. (Weatherson 2020: 2.76 Justification)
The comments from Plantinga and Weatherson are arguably unfair to Gettier himself, who cites examples of the kind of view he is targeting, but the important issue is whether the notion of justification really only started to be discussed in the philosophical literature in the middle of the twentieth century. As Weatherson suggests, if that is right, this may be initial reason to think that justification is a technical notion introduced by epistemologists.
On the other hand, it would be difficult to draw such conclusions from the state of the philosophical literature alone. After all, that a notion only starts to be discussed within philosophical circles from a given point in time doesn't tell us whether or not the notion was already in wider circulation prior to that time, just never discussed in philosophical contexts (as would obviously be the case with, for instance, bullshit). We also shouldn't place too much focus on historical considerations in considering whether the notion of justification is in wider circulation today. It is consistent with the notion only being introduced within the philosophical literature in the mid-twentieth century that it is in wider circulation today or even that it plays some crucial role in our ordinary epistemic lives. This possibility may seem far-fetched, but still it should lead us to turn our focus away from the philosophical history and towards the way the notion is currently employed in ordinary life.
Turning back to the corpus-based evidence, the frequency statistics cited above are really just indications as to the difference between “know” and “justify”. It could be argued that “know” is really the outlier here in being a widely used term and that we shouldn't expect all terms amenable to conceptual analysis to be as frequent within a corpus as “know” is. Furthermore, once we look at the kind of terms with similar levels of frequency as the lemma “justify” (in COCA), we find perfectly ordinary terms such as “juice”, “joy”, “cousin”, and “aunt”. So while there is a marked difference in frequency between “justify” and “know”, the frequency of the former in English corpora does not by itself suggest that it is not a widely circulated notion.
4. Distribution
It can often be useful to look not only at the frequency of a term but also the distribution of a term across a corpus. Some terms will be evenly spread, used at a consistent rate across all documents within the corpus. Others will have an uneven distribution, with higher frequencies in some documents, and lower frequencies or a zero frequency in others. If the term turns up in fewer documents, this is an indication that it is in some sense specialist – it may be that it is part of some dialect, that it concerns a particular subject matter, that it is a technical term for a discipline, or that it belongs to a particular register.
In general, the way to measure distribution across a corpus is to divide the corpus up into parts and measure the frequency of the term in each part. The result can be displayed visually, or it can be used to calculate a dispersion score, of which there are many different kinds available. For example, Juilland's D is a dispersion score between 0 and 1 with a score closer to 1 indicating that the term is more widely dispersed. Using the available frequency data on COCA from www.wordfrequency.info, we can easily access the Juilland's D for the lemmas “know” and “justify”. “Know” has a Juilland's D of 0.96 while “justify” has a Juilland's D of 0.94. This is an initial indication that both terms are well-dispersed across the corpus, with little to pick between the two of them. However, there are reasons to take this with a pinch of salt. First, a high Juilland's D is very common among the most frequent 5,000 lemmas in COCA, with the average Juilland's D shown in Table 1.
Second, there is reason to think that Juilland's D does not always accurately reflect an uneven distribution when large corpora have been partitioned into many parts (Biber et al. 2016). The scores given above are the result of partitioning COCA into 100 parts, and so it may be that a term like “justify” or “know” is in fact quite unevenly distributed but that this is not reflected by the score. Biber et al. (2016) recommend instead using Gries’ DP (Gries 2008, 2019) and Gries himself has made this score (and a range of other dispersion scores) available for all word forms in the BNC.Footnote 14 The Gries’ DP scores (with a lower score indicating that the term is more widely distributed) for the various word forms of “justify” and “know” are shown in Table 2.
Here we can more clearly see a difference between the two lemmas, with a marked difference between the Gries’ DP scores achieved across the various word forms. This suggests that “justify” is not widely distributed in the way that “know” is.
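The two dispersion measures discussed in this section can be sketched in a few lines. What follows uses the textbook formulations as I understand them: Juilland's D as 1 − V/√(n − 1), where V is the coefficient of variation of the per-part frequencies (population standard deviation divided by the mean) over n equal-sized parts, and Gries’ DP as half the summed absolute differences between each part's share of the term's occurrences and its share of the corpus. Whether the published COCA and BNC scores were computed with exactly these variants is an assumption, and the function names and toy frequencies are invented for illustration.

```python
import math


def juilland_d(freqs):
    """Juilland's D for frequencies in n equal-sized corpus parts.

    Closer to 1 means more evenly dispersed. Assumes equal-sized
    parts and uses the population standard deviation.
    """
    n = len(freqs)
    mean = sum(freqs) / n
    if n < 2 or mean == 0:
        raise ValueError("need at least two parts and one occurrence")
    sd = math.sqrt(sum((f - mean) ** 2 for f in freqs) / n)
    return 1 - (sd / mean) / math.sqrt(n - 1)


def gries_dp(freqs, part_sizes):
    """Gries' DP: closer to 0 means more evenly dispersed.

    Compares each part's share of the term's occurrences with
    that part's share of the whole corpus.
    """
    total_freq = sum(freqs)
    total_size = sum(part_sizes)
    return 0.5 * sum(
        abs(f / total_freq - s / total_size)
        for f, s in zip(freqs, part_sizes)
    )


# Toy data (hypothetical): four equal-sized parts of 100 words each.
sizes = [100, 100, 100, 100]
even = [10, 10, 10, 10]    # term spread evenly
clumped = [40, 0, 0, 0]    # all occurrences in a single part

print(juilland_d(even), gries_dp(even, sizes))
print(juilland_d(clumped), gries_dp(clumped, sizes))
```

Note that the two scales run in opposite directions: a perfectly even distribution scores 1 on Juilland's D but 0 on Gries’ DP. Unlike Juilland's D in this form, DP also accommodates parts of unequal size.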
Moving away from quantitative measures, one natural way to investigate the dispersion of a term across a corpus is to rely upon a corpus that has been compiled with a balance across different kinds of text, as COCA has. For any given term in COCA, we can look at the frequency of the term across the eight genres (Spoken, Fiction, Magazines, Newspapers, Academic, Web, Blogs, TV and Movies), and even drill down into sub-genres (e.g. the Blog genre consists of the following sub-genres: Academic, Argument, Fiction, Informational, Instructional, Legal, News, Personal, Promotional, Review, Miscellaneous).
If the distribution of a given term is heavily weighted towards a particular genre or even sub-genre, this may be good reason beyond the bare frequency to think the term is not widely circulated. In particular, if a term's distribution is heavily weighted towards academic writings, this may be good reason to think the term is largely used as a technical term confined to a specific discipline or disciplines. Focusing initially on the distribution of “justify” vs “know”, there is a striking difference. The three most frequent genres for “know” are TV/Movies (6,021.49 per million words), Spoken (4,501.43), and Fiction (1,851.46). The three most frequent genres for “justify” are Blog (28.85), Academic (28.46), and Web (24.88). The differences between the two are perhaps made clearer if we compare the top three genres for five terms that one would think are clearly commonly used and five terms that one would think are clearly technical terms within philosophy.
Among the terms that are plausibly widely used, certain genres recur frequently, such as TV/Movies and Spoken, while among the technical terms, the categories of Academic, Blog, and Web dominate (Table 3). Turning back to “know” and “justify”, it seems plausible on the basis of this table to group “know” with the common terms while “justify” more plausibly groups with the technical terms. This kind of analysis is admittedly coarse-grained, but I take it that the distribution of “justify” across the COCA corpus does provide reason to think that the term is not widely distributed. Instead, the term appears to be restricted to particular contexts. If we list the sub-genres for which “justify” has the highest frequency, we see clearly that the term is most common in academic contexts and other contexts with a higher register (Table 4).
We can gain further evidence if we turn to the BNC. As stated earlier, it is useful to turn to this corpus because we are able to analyse the distribution of an expression across many different categories, including written vs spoken, genre, domain, age of author, sex of author, age of audience, perceived level of difficulty, and more. The distribution of “justify” reveals several facts that all suggest that “justify”, rather than being a term that is used commonly, in fact belongs to a formal register. First, as we have just seen with the COCA data, the ten genres in which “justify” has its highest frequency can all plausibly be thought of as formal contexts (Table 5).
Second, if we turn away from specific genres (of which there are 46 in BNCWeb) to the more general category of derived text type, for which there are only 8 categories, the most frequent derived text types are as shown in Table 6.
It is notable here that academic prose, non-academic prose and biography, and newspapers form the top three, while fiction and verse and spoken conversation are the bottom two text types. This is further evidence that “justify” belongs to a formal register. BNCWeb also categorises the majority of the corpus documents in terms of perceived level of difficulty, and here again we find evidence that “justify” belongs to a formal register, as it becomes more frequent as the difficulty increases (Table 7).
These three considerations taken together, and particularly combined with the considerations from COCA, suggest that “justify” is a term largely confined to a formal register and possibly a technical term, rather than a term that is used widely in ordinary discourse.Footnote 15
Focusing as we have on the frequency and distribution of “justify” across corpora, we may wonder how exactly this spells trouble for the folk justification approach outlined earlier. I will return to this issue in greater detail later in the paper. But my opponent, anticipating the nature of the case, may want to object in two ways. First, if we have shown that “justify” belongs to a formal register, this is no clear barrier to an investigation of the term. After all, there is no barrier to taking some term widely used only within particular contexts and providing some account of its meaning. Second, while the above evidence does suggest that “justify” is a formal term, there is also countervailing evidence that could be pointed to. For instance, if we take the technical terms used earlier, it is important to note that while “justify” does have a similar distribution to those terms, it is markedly more frequent (Table 8).
So while it might be that “justify” belongs to a formal register, it is certainly not as rare as other technical terms used in philosophical discourse. This also accords with intuition. It seems that we can use “justify” well enough in ordinary discourse with a non-philosophical audience without having to explain what it means as we would these other terms.
At this stage it is important to note, however, that in considering whether the kind of justification that is of interest in epistemology is in wider circulation, we ultimately need to look beyond just the frequency and distribution of “justify”. This is because justification in the sense that epistemologists are interested in attaches in the first place to beliefs and propositions (or other proposition-like objects), rather than, say, acts or emotions. Yet if we just look at the frequency and distribution of “justify”, the numbers involved will naturally be inflated by uses where “justify” takes an action or an emotion as its object (e.g. “His anger was justified”). So we should consider whether there is evidence in the corpora of “justify” being used to talk about proposition-like items. This will be the topic of the next section.
5. Justifying What?
In order to investigate whether there is evidence of discourse surrounding justification of beliefs, we will need to go beyond considering mere frequency and distribution, and instead consider the collocates of “justify”, i.e. the terms that frequently appear alongside it. In doing so, we can make some grammatical distinctions. For instance, for the verb “justify”, there are terms that will often appear as its subject, other terms that appear as its object, adverbs that modify it, and so on. As mentioned earlier, the corpus analysis software SketchEngine is particularly useful for this task. For that reason, I will largely perform analyses on the very large corpus EnTenTen20. Doing so is particularly useful when we are considering expressions and phrases that are used less frequently, as is the case with “justify”.
Our investigation thus far has largely ignored a fact that is arguably widely accepted within philosophy: that there are many different kinds of justification. One way to distinguish between kinds of justification is to focus on the object of justification. We can distinguish between the justification of acts, beliefs, emotions, etc. A second way to distinguish between kinds of justification is to focus on the domain of justification. We can distinguish between moral justification, epistemic justification, prudential justification, legal justification etc. I am going to focus primarily on the object of justification, on what is being justified, as this is something that is amenable to a corpus-based analysis.
It is fair to say that epistemologists, in thinking about epistemic justification, have taken the justification of belief to be the central kind of case. This makes sense of course, given that the epistemic realm is concerned with belief-formation and knowledge. It may be that actions and possibly other objects can be epistemically justified, but it still seems fair to say that beliefs at least serve as the paradigmatic object of epistemic justification. So one way of considering whether epistemic justification is spoken of in ordinary discourse is to consider the extent to which justification of belief is spoken of and how this compares to justification of action or other objects. Alston has suggested that “the term ‘justified’ has been imported into epistemology from talk about voluntary action” (1993: 532). In doing so, he raises the possibility that it is only within epistemology that justification of belief is spoken of, and this is something we can investigate via a collocation analysis.
Before we do, there are two important preliminary points. First, in performing a collocation analysis, it may initially seem sensible to focus on the terms that most frequently appear alongside our term of interest. So in investigating the object collocates of “justify_v”, it may seem sensible to focus on the terms that most frequently appear as its object. One drawback of this approach is that it does not take into account how common the collocate is across the corpus. A version of this issue arises with “justify_v”. Taking into account all pronouns and nouns that serve as the object of “justify_v”, the most common object is the pronoun “it” (e.g. “He couldn't justify it”). But this is likely not a result of the fact that the two terms stand in some particular relationship. Instead, this is just because “it” is an extremely common and flexible term.Footnote 16 For that reason, it is preferable to use an association score that better reflects the association between two terms. Collocates can then be identified as those terms with the highest association score. SketchEngine provides the LogDice association score for collocates, for which more information can be found in Rychlý (2008). This is a score that typically falls between 0 and 14. A score of 14 would indicate that all occurrences of each term occur with the other. A score of 0 would indicate that there is less than 1 occurrence of the two terms together per 16,000 occurrences of either term. LogDice can be used to rank collocates and also to compare them. A difference in score of 1 indicates that the higher-scored term collocates twice as often. For our purposes, we find that LogDice solves our initial issue, with “it” receiving a comparatively low LogDice score (3.3) compared with terms we might expect to collocate specifically with “justify_v”, such as “action” (7.4), “existence” (7.9) and “belief” (6.7).
The second preliminary point is that in considering the distinction between justification of a belief versus justification of an action, I will not consider whether certain terms are indicative of propositional or doxastic justification. It is an interesting further question whether the distinction can be teased out within a corpus, but it is not one I will explore here. I will just be interested in whether there is evidence of “justify” being applied to a belief, proposition, or other proposition-like object (even a claim or an assertion), and I will largely refer to this kind of justification as belief justification.
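To make the LogDice score from the first preliminary point concrete, the following is a minimal sketch of the formula given in Rychlý (2008): logDice = 14 + log2(2·f_xy / (f_x + f_y)), where f_xy is the number of co-occurrences of the two terms and f_x and f_y are their individual frequencies. The function name and the counts used below are hypothetical and chosen only for illustration; they are not figures from EnTenTen20, and SketchEngine's own counts are restricted to particular grammatical relations, which this sketch does not model.

```python
import math


def log_dice(f_xy: int, f_x: int, f_y: int) -> float:
    """LogDice association score: 14 + log2 of the Dice coefficient.

    f_xy: number of co-occurrences of the two terms
    f_x, f_y: individual frequencies of each term
    """
    if f_xy <= 0 or f_x <= 0 or f_y <= 0:
        raise ValueError("all counts must be positive")
    dice = 2 * f_xy / (f_x + f_y)
    return 14 + math.log2(dice)


# Hypothetical counts, for illustration only.
perfect = log_dice(100, 100, 100)        # every occurrence of each term is a co-occurrence
weaker = log_dice(100, 1_000, 1_000)     # same co-occurrences, more frequent terms
doubled = log_dice(200, 1_000, 1_000)    # twice the co-occurrences of `weaker`

# A very common word such as "it" has a huge individual frequency, which pulls
# the score down even when its raw co-occurrence count is the highest.
common_word = log_dice(5_000, 20_000, 50_000_000)
```

On these made-up counts, the perfect case reaches the maximum of 14, and doubling the co-occurrence count while holding the individual frequencies fixed raises the score by exactly 1. A score of 0 corresponds to a Dice coefficient of 2^-14, which is roughly 1 in 16,000.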
Focusing first on the top collocates that appear as the object of “justify_v” gives the results shown in Table 9.Footnote 17
There are a few points of interest from Table 9. First, keeping the distinction between justification of belief and justification of action in mind, it is worth noting that both “belief” and “action” make it into the top 20 collocates of “justify_v”, whether we calculate in terms of frequency or LogDice. Second, the list is dominated by action terms, with 15 of the 20 terms concerned with action (“mean”, “action”, “expense”, “invasion”, “war”, “intervention”, “killing”, “investment”, “violence”, “cost”, “means”, “decision”, “expenditure”, “murder”, “purchase”, “refusal”). This would lend support to the idea that “justify_v” is primarily used to talk about action. Third, as well as “belief”, there are other terms indicative of the justification of belief, namely “claim” and “conclusion”.
Regarding “claim”, it is important here to note that the term is ambiguous between one sense akin to a proposition or assertion and another sense (more common in legal discourse) akin to a right to ownership (e.g. “seeking to justify his claim to the Liberian presidency”). We could attempt to filter out such uses in our query by appealing to syntactic properties. For instance, the ownership sense commonly uses the prepositions “to” and “for” (e.g. “with the more detailed records being used to justify the claim for refund”). However, this kind of syntactic filtering will be inaccurate, particularly when claims are discussed without it being made explicit what the claim is for.Footnote 18 To get an idea of how many of the total instances of “justify_v” + “claim” are instances of “claim” in this ownership sense, a random sample of 300 instances was extracted from EnTenTen20 via SketchEngine, and each instance was hand-coded according to whether it was a clear instance of the ownership sense. Of these, 57 were clear instances of the ownership sense, amounting to 19% of the sample. This is a significant amount, but if we reduced the frequency of the collocate “claim” by that percentage, we would have an adjusted frequency of 4,542, which would mean it is still the 12th most frequent object collocate. All this is to say that while the ownership sense does make a significant contribution to the overall frequency of “justify_v” + “claim”, it is not the reason why “claim” is one of the top object collocates of “justify”.Footnote 19 So it is reasonable to think that “justify_v” + “claim” represents instances where belief justification is spoken of.
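The adjustment just described can be sketched as a two-step calculation: estimate from the hand-coded sample the share of instances to exclude, then scale the corpus frequency down by that share. The function names below are hypothetical, and the raw frequency in the usage example is a placeholder rather than the figure from Table 9; only the sample size (300) and the number of ownership-sense instances (57) are taken from the text.

```python
def excluded_share(hits: int, sample_size: int) -> float:
    """Share of a hand-coded sample that falls under the sense to be excluded."""
    if not 0 <= hits <= sample_size or sample_size == 0:
        raise ValueError("need 0 <= hits <= sample_size and a non-empty sample")
    return hits / sample_size


def adjusted_frequency(raw_frequency: int, hits: int, sample_size: int) -> int:
    """Scale a corpus frequency down by the excluded share, rounded to a whole count."""
    if not 0 <= hits <= sample_size or sample_size == 0:
        raise ValueError("need 0 <= hits <= sample_size and a non-empty sample")
    return round(raw_frequency * (sample_size - hits) / sample_size)


# Figures from the text: 57 of 300 sampled instances were in the ownership sense.
ownership_pct = round(100 * excluded_share(57, 300))

# Placeholder raw frequency, NOT the value from Table 9.
placeholder_adjusted = adjusted_frequency(10_000, 57, 300)
```

The calculation assumes that the random sample is representative of all instances of the collocation, so the adjusted figure is an estimate rather than an exact count.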
Turning to “justify_v” + “belief”, we may wonder whether the fact that these are relatively closely associated is due to philosophical uses. One way of exploring this is by hand-coding a sample in the way we just have with “justify_v” + “claim”. Again, a random sample of 300 instances was extracted via SketchEngine, and each instance was hand-coded according to whether it was clearly an example of philosophical discourse. This was done primarily by looking out for other philosophical technical terms and phrases e.g. “justified true belief”, “Gettier”, “epistemology”, “foundationalism” etc., but also by checking whether the instances originated from websites known to be dedicated to philosophy e.g. philpapers, philarchive, askphilosophers.org etc. Of the 300, 123 were clear instances of philosophical discussion, which amounts to 41% of the sample. It is worth noting that religious discourse was also particularly prevalent within the sample. Of the remaining 177 instances, 61 were coded as clear cases of religious discourse (where clearly religious terms, such as “theism”, “God”, “bible”, etc. were present) (Table 10). This is likely to be partly due to the fact that “belief” itself has some association with religious discourse – the most frequent modifier of “belief” being “religious” (LogDice 10.1).
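The hand-coding outcome for this sample can be tallied as follows. This is a minimal sketch that rebuilds the sample from the counts reported above (123 philosophical, 61 religious, out of 300); the label “other” for the remaining instances and the function name are assumptions introduced for illustration, not categories from the original coding scheme.

```python
from collections import Counter


def category_shares(labels: list) -> dict:
    """Map each coding label to (count, percentage of the sample rounded to a whole number)."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: (n, round(100 * n / total)) for label, n in counts.items()}


# Rebuild the coded sample from the reported counts; "other" is a hypothetical label.
sample_size = 300
coded = (
    ["philosophical"] * 123
    + ["religious"] * 61
    + ["other"] * (sample_size - 123 - 61)
)

shares = category_shares(coded)
```

Keeping the labels per instance, rather than only the totals, makes it straightforward to recompute the shares if the coding of borderline instances is later revised.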
Another way of getting an idea of how “justify_v” + “belief” is used is by looking at the collocates of the phrase. The most common modifiers of “justify_v” and the most common modifiers of “belief” among instances of “justify_v” + “belief” are given in Table 11.
A brief look at the terms that are being used to modify the phrase would strongly suggest that it is predominantly used in philosophical discourse.Footnote 20 It is important to keep in mind that we are dealing with low frequencies here, even in a corpus as large as EnTenTen20. But it is still reasonable to take this as evidence that “justify_v” + “belief” is largely used only in philosophical discourse.
It is also notable that “true” is far and away the most frequent modifier of “belief” on the list. Of course, this indicates discussion of whether or not a true belief is justified, and an inspection of these instances confirms that they are instances of philosophical discourse where the tripartite theory of knowledge is being discussed.
6. Justify_j
Turning to the adjectival use of “justified”, the nouns modified by the term with the highest LogDice scores are shown in Table 12.
There are a few important points to note. First, half the list in Table 12 is taken up with emotional terms rather than terms pertaining to action or belief (“anger”, “outrage”, “indignation”, “suspicion”, “paranoia”, “grievance”, “resentment”, “distrust”, “vengeance”, “fear”). Second, the list is heavily populated with terms with negative connotations, with 15 of the 20 terms having a clearly negative connotation (“Sinner”, “anger”, “sinner”, “outrage”, “homicide”, “indignation”, “suspicion”, “criticism”, “paranoia”, “slavery”, “grievance”, “resentment”, “distrust”, “vengeance”, “fear”). Third, “belief” has the 8th highest LogDice score but is also the most frequent. However, there is good reason to think that the fact that “belief” is the most frequent is largely due to philosophical usages. From a random sample of 300, 273 were from clearly philosophical discourses, which amounts to 91%. Discounting the frequency by that percentage to remove philosophical instances would give an adjusted frequency of 159, which would still keep “belief” as one of the top 20 collocates in terms of frequency. That “justified_j” + “belief” is largely used in philosophical discourse is also confirmed by looking at the terms that modify each of “justified_j” and “belief” across instances of “justified_j” + “belief” (Table 13).
Here again, we find that the modifiers are nearly all indicative that the phrase is being used in a philosophical context. We again find that “true” is the most frequent modifier of “belief”, and again this appears to be entirely due to discussion of the tripartite theory of knowledge and the Gettier problem.
7. Discussion
It is time to consolidate these various data points and consider their epistemological repercussions. I will focus particularly on three key claims:
I. “Justified_j” is nearly never used to talk about beliefs outside of philosophical circles.
The evidence in support of this claim can be seen in Tables 12 and 13. In Table 12, we saw that the only belief or proposition term among the top 20 nouns modified by “justified_j” was “belief”, and further inspection of “justified_j” + “belief” revealed that the vast majority of instances (91%) were – from concordance analysis alone – clear instances of philosophical discourse.Footnote 21 Table 13 also supports this, as all of the modifiers for both “justified_j” and “belief” were indicative of philosophical discourse.
This is an interesting insight in its own right into the way “justified_j” is used, but it should also prove relevant to some forms of the folk justification approach. We saw earlier that some adopt a particularly linguistic form of the folk justification approach whereby linguistic properties of “justify” and related word forms are used to inform a theory of justification. Hawthorne and Logins (2020) are perhaps the clearest example of this, as they have focused specifically on the gradability of “justified_j”. However, given claim I, even if “justified_j” is a gradable term, it is highly questionable that we should take this into account when considering the epistemic justification of beliefs, as ordinary speakers do not seem to use “justified_j” to talk about beliefs or other proposition-like objects.
II. Justification talk primarily concerns something other than belief justification.
We saw with both “justify_v” and “justified_j” that the object terms they most frequently combine with do not refer to beliefs or other proposition-like objects. In the case of “justify_v”, action terms dominate (Table 9), while in the case of “justified_j”, emotional terms dominate (Table 12).Footnote 22 This perhaps lends some support to Alston's suggestion that justification-talk was transferred over from talk of voluntary action. It is perhaps tempting to take Alston's suggestion one step further and claim that it is only within philosophical discourse that the transfer has been made. This would be continuous with the possibility mentioned earlier in the discussion of Plantinga and Weatherson that it is only after Gettier that philosophers explicitly discussed the notion of justification. But the findings here do not suggest that talk of belief-justification is restricted to philosophical circles, even if it is far more frequent there. Notably, as Table 9 indicates, “justify_v” does collocate with “conclusion”, “belief”, and “claim”, and even once we adjust for the ownership sense of “claim”, and for the fact that many instances of “justify_v” + “belief” are philosophical usages, a considerable number of instances still remain. So the evidence here does not suggest that justification of beliefs or other proposition-like objects is spoken of only within philosophical discourse. But it does suggest that justification-talk is primarily not about belief justification.
III. “Justify” is a high register term.
Finally, we saw a range of evidence from both COCA and BNC to suggest that “justify” belongs to a higher register, and so is largely used in more formal contexts. This includes, but is not limited to, academic contexts, as Tables 3–7 indicate.
III is perhaps the most important finding of this investigation, as it forms the basis of a challenge for the folk justification approach. Echoing a response considered earlier, it might be thought that the fact that “justify” is a high register term poses no particular challenge to the folk justification approach, provided that we focus on the high register folk i.e. the practices of the people that take part in the high register contexts in which the term is used. At this stage however, it is important to revisit the basic motivation behind the folk justification approach. The thought is that there is some standard for belief evaluation that is widely circulated within our epistemic community, and that this plays a central role in our epistemic lives – we make efforts to have beliefs that are justified, and when others have formed unjustified beliefs, we take them to have behaved improperly in some sense. As Booth states:
We feel that we are obliged to believe in accordance with the available evidence … and only when we do so are we epistemically justified in having a particular belief; such that we feel, for that reason, that the beliefs of Holocaust deniers, creationists, and members of the flat earth society (for example) are epistemically unsalutary. In short, we feel we ought to have justified beliefs and we want to know how. (Booth 2011: 40)
Given the importance that epistemic justification is thought to have in our epistemic lives, the idea that this property is only picked out within certain high register contexts seems implausible. We now have a tension between the following three claims:
(1) Epistemic justification is a central part of our epistemic lives, such that it is important across a wide range of contexts.
(2) Use of the word “justify” is largely restricted to high register contexts.
(3) “Justify” is the primary term used to pick out epistemic justification.
It has been a working hypothesis of this investigation that 3 is true, but at this stage, the most natural way for the folk justification theorist to respond is to reject 3. That is, they could concede that using “justify” to talk about justification is something that largely occurs in certain higher register contexts and particularly philosophical contexts, but they should claim that there are other terms or phrases that are used to talk about epistemic justification.
This move will have a couple of repercussions. First, any attempt to give an account of epistemic justification via consideration of the meaning properties or syntactic properties of the word “justify” requires a great deal further motivation before it can be considered plausible, contrary to the projects we saw at the start of the paper from the likes of Goldman, Kvanvig and Menzel, and Hawthorne and Logins. Second, there is now a particular challenge faced by the folk justification theorist: to identify how exactly the ordinary notion of epistemic justification is spoken of, if not via the use of “justify”. Note that I describe this as a challenge rather than as an objection, as I do think that there are routes to overcome it. Nevertheless, it is a challenge that needs to be answered. In what follows, I will outline the clearest available responses.
One option will be to claim that ordinary speakers in fact primarily use knowledge discourse to talk about the justification of their beliefs. After all, as mentioned at the start of this investigation, knowledge discourse is much more prevalent, with “know” consistently figuring as one of the most common verbs across English language corpora. The idea would be that rather than saying something as cumbersome as “her belief that p was justified”, an ordinary speaker may instead say “she knew that p”. Of course, if we are to allow false justified beliefs, then this approach would require some way of talking about them via knowledge discourse despite the fact that knowledge attributions are typically thought to be factive. But even assuming that there is some way round this issue, the more important point here is that we seem to have reverted to the TOK approach to justification. That is, if we claim that justification is really spoken of in terms of knowledge, it seems we have no grounding in our ordinary epistemic practices with which to understand justification that is independent of our ordinary knowledge discourse. Instead, the role of justification would just be to understand some aspect of our knowledge. Again, as stated earlier, I am not claiming here that the TOK approach is implausible; the point is that this does not look like a way of saving the folk justification approach.
An alternative approach is to claim that ordinary epistemic agents typically use some other term or phrase to talk about the justification of their beliefs. Drawing inspiration from the philosophical literature, we might consider candidates such as “warrant”, “rational”, “reasonable” etc. Frankly, I am sceptical that either “warrant” or “rational” could serve the role of being the ordinary term used to refer to epistemic justification, although I will not seek to show that here other than to note that both terms are less frequent than “justify”.Footnote 23 It may be that justification is spoken of in terms of evidence or reason, and I think this is certainly a promising avenue to explore. One issue that may arise in considering this possibility is whether in ordinary discourse we rely upon a notion that bears a threshold, such that we can distinguish between the set of justified beliefs and the set of unjustified beliefs. That is, while we no doubt will find ordinary talk of people's reasons or evidence for believing, having some reason/evidence seems consistent with having reason/evidence insufficient for justified belief in the philosopher's sense. There are ways in which we might indicate in conversation that some threshold has been crossed such that a belief is indicated as justified, perhaps with phrases like “reasonable”, “good reason/evidence”, or “sufficient reason/evidence”. But one possibility to consider here is that as far as our ordinary practices are concerned, and quite apart from the role that justification plays with regard to knowledge, it may be that justification is really a purely scalar notion. This would then have ramifications for the kinds of epistemic evaluation we can hold each other to: without a minimum threshold of justification, it becomes less straightforward to reach a verdict that someone has conducted themselves acceptably or unacceptably.
So there are options available, and it may be tempting to think that as a result, this challenge is not particularly pressing for the folk justification theorist provided that one or more of these terms is used in ordinary discourse to talk about justification. In particular, we might think that there is not a single term or phrase that is used to talk about justification, that instead speakers select from a long menu of possible phrases in order to engage in this form of epistemic evaluation. That may be right, and if so this is something that future corpus analysis can help reveal. However, the folk justification approach would face particular pressure from a pluralism about epistemic evaluation, akin to Alston (1993). That is, if our ordinary practices suggest that we appeal to a number of different notions in evaluating our beliefs, why not take that at face value and take there to be (in Alston's terms) a range of epistemic desiderata rather than a single notion of epistemic justification? And embracing this pluralism would itself prove somewhat revisionary in the way that folk justification is usually approached. For instance, it would mean that we shouldn't ask whether the new evil demon case, the Truetemp case, or the clairvoyant case are instances of justified belief, for that would conflate a number of different issues. Instead, we should ask whether the subject knows, whether they have good reason, good evidence, reliability etc.
These are further avenues that need to be explored in order for the folk justification approach to meet this challenge. Hopefully what has become clear in outlining them is that corpus analysis can play a role in this: by considering the frequencies, collocates, and distributions of the phrases discussed above across various corpora and text types, we can start to build a picture of our ordinary practices surrounding epistemic evaluation. And, as stated earlier, the particular benefit of corpus analysis is that it provides a way of inspecting our folk practices that is far less reliant on intuitive judgments and is also based on genuine language use rather than elicited language use in experimental settings.
8. Conclusion
In this paper, I have attempted to use corpus analytical methods on large English corpora to investigate the lemmas “justify_v” and “justified_j”, and in doing so I have raised a challenge to the folk justification approach: to identify the way in which epistemic justification is spoken of ordinarily. This is necessary to make good on the idea that this form of epistemic evaluation is something that plays a central role in our epistemic lives. I hope to have shown that corpus analysis can be used to raise interesting philosophical questions, thus continuing a theme one finds in recent work in experimental philosophy. In the final section, I also hope to have shown that corpus analysis will not just be a tool to raise such challenges, but should also be a tool to resolve them.Footnote 24