One of the main aims of Neo-Fregeanism is to provide a possible route to knowledge of arithmetic, by taking Hume’s principle to be in some sense constitutive of what is meant by identity and difference between numbers.Footnote 1 However even if successful in this, an important question remains: can this account somehow contribute to explaining how the vast majority of arithmetic knowledge—accrued by ordinary people in complete ignorance of the Neo-Fregean project—is actually obtained? The importance of this question is emphasised by Wright [Reference Wright, Ebert and Rossberg23, p. 162].Footnote 2 Heck [Reference Heck8, chap. 11] has discussed this issue and proposed a modification of Hume’s principle that better fits an ordinary understanding. However as I argue here, neither Hume’s principle nor Heck’s modified principle is satisfactory as an account of what is ordinarily meant by numerical identity, undermining their ability to account for the bulk of actual arithmetic knowledge.
This paper puts forward a new logic which allows for very natural and plausible characterizations of various key arithmetic concepts, including finiteness, equinumerosity (for finite pluralities), and addition and multiplication of cardinality, characterizations which—I argue—match a pretheoretic understanding reasonably closely. In particular, the characterization of equinumerosity is much more faithful to an everyday understanding of the concept than the famous definition in terms of bijections used in Hume’s principle. This logic is obtained by simply buttressing plural logic with plural versions of the ancestral operator. Various authors have defended plural logic and ancestral logic separately as “real logics,” with neither dependent on the other, and I argue that if this is correct then the combination of them put forward here should also be regarded as a real logic. If so then finiteness and equinumerosity are revealed to be purely logical concepts, and much of ordinary arithmetic knowledge is straightforwardly obtained as logical knowledge (or perhaps inductively inferred knowledge of logical truths). The rest of arithmetic is obtained by supplementing this logic with a predicative numerical abstraction principle, together with a very simple axiom of infinity—which could be obtained as an empirically known statement that there are some things which are not collectively finite, or can take a modal form. Alternatively one can use an impredicative abstraction principle and obtain a version of Neo-Fregeanism which relies on a more innocent logic than the standard version, and is much more plausible as an account of the ordinary meaning and epistemology of statements of arithmetic.
The work here has a precursor in that of Martin [Reference Martin12, Reference Martin13]. He set up a nominalist system by supplementing a mereological base theory with a generalized ancestral operator. The definitions of finiteness, equinumerosity and multiplication given here are essentially those given by him, except that we work with pluralities of arbitrary things instead of the fusions of atoms (mereological simples) that he uses. He did not put his definitions to the same philosophical uses however—he appears to have been aiming just to give a fairly strong nominalistic system for doing maths in, giving little philosophical discussion, and not claiming his work had implications for the semantics or epistemology of arithmetic. Here there is also a technical advance on his work, in that in our logic we are able to prove a fact which he takes as a nonlogical postulate—as seen in Section 5. Nonetheless, all due credit to Martin. If nothing else, the discussion here can be seen as an argument that versions of his ideas are very relevant today.
Section 1 discusses Neo-Fregeanism and various problems with it as an account of an ordinary person’s grasp of arithmetic. Our base plural logic is essentially that of Oliver & Smiley [Reference Oliver and Smiley14], and Section 2 summarizes the relevant parts of their account. In Section 3 we supplement their logic with plural versions of the ancestral operator, and defend the result as a “real logic.” Section 4 then discusses how the result allows a natural definition of finiteness. Section 5 discusses how it can be used to define the notion “just as many as” and its relatives, and derives some of their basic properties, and the validity of some practical applications. Section 6 then uses the logic to define the notions of addition and multiplication for pluralities. Finally Section 7 discusses the introduction of numbers via a predicative abstraction principle, with an impredicative alternative also mentioned. There is also an appendix which briefly lays out the natural deduction rules and semantics for plural logic as understood here.
1 Neo-Fregeanism
The aim here is to develop a potential semantics and epistemology for the understanding that an ordinary person (with a basic education) has of arithmetic and finite cardinalities—this being the source of the bulk of actual arithmetic knowledge. The kinds of arithmetic statements we seek to understand here include paradigmatic examples such as “5+7=12,” as well as statements of equinumerosity between pluralities, and of cardinalities of pluralities, such as “there are twice as many cages as animals,” “there are as many tennis players here as rugby and football players put together,”Footnote 3 and “there are three toasters in this kitchen.” The account follows the cardinal conception of arithmetic, where numbers are taken to be answers to “how many” questions. The most prominent existing version of the cardinal conception is Neo-Fregeanism, whose suitability as an account of an ordinary person’s understanding is criticised by Linnebo [Reference Linnebo11, chap. 10], in the course of motivating his own account based on the ordinal conception of arithmetic—where numbers are taken to be the answers to questions about the position of objects within suitable discrete orderings. We start by discussing problems with Neo-Fregeanism as an analysis of an everyday understanding of number, arguing that these problems are significant but also specific to Neo-Fregeanism rather than applying to the cardinality conception more generally. The way is thus open to a new account of arithmetic based on the cardinality conception, which the remainder of the paper develops.
The main idea behind Neo-Fregeanism [Reference Hale and Wright6, Reference Heck8, Reference Wright22] as an account of arithmetic is to take what is known as Hume’s principle as constitutive of what is meant by identity between numbers. Hume’s principle states that if F and G are concepts, then the number of things falling under F is the same as the number of things falling under G iff there is a one-to-one correspondence between the things falling under F and the things falling under G. This can be formalized by supplementing full second-order logic with the abstraction principle
$$\forall F\,\forall G\,\big(N(F)=N(G)\leftrightarrow \exists R\,(F\sim_R G)\big),$$
where we write $F\sim _R G$ for the condition that R is a bijection between the x such that $F(x)$ and the x such that $G(x)$ . N here is an operator delivering a (first-order) term “ $N(F)$ ” given a concept F. Using full second-order logic supplemented with this abstraction principle, one can derive the existence of an infinite progression of natural numbers satisfying the axioms of second-order arithmetic.
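As a quick illustration of how the principle operates (a standard observation, added here for orientation): let Z be the concept defined by $Z(x)\leftrightarrow x\neq x$. A relation is (vacuously) a bijection between the Fs and the Zs just in case there are no Fs, so Hume’s principle yields
$$N(F)=N(Z)\ \leftrightarrow\ \neg\exists x\, F(x),$$
and $N(Z)$ behaves as the number zero—a point that will matter when we come to Linnebo’s observation about zero below.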
For the Neo-Fregean approach to provide an account of the semantics and epistemology of an ordinary person’s understanding of arithmetic, Hume’s principle needs to be constitutive of what an ordinary person means by numerical identity. Though Wright initially believed this condition to be satisfied [Reference Wright22, pp. 106–117], later evolutions of Neo-Fregeanism aimed primarily to argue only that Hume’s principle provides a possible route to a priori knowledge of arithmetic [Reference Hale and Wright6, chap. 5], that it can perhaps be stipulated as something like an implicit definition of the concept of number. Heck [Reference Heck8, chap. 11] defends a modified version of Hume’s principle, in which Hume’s principle is relativized to finite concepts, as providing a plausible account of an ordinary person’s understanding of arithmetic. As we will see though, neither the original version of Neo-Fregeanism nor Heck’s proposed modification can be defended as properly capturing a layman’s grasp of arithmetic.
First, there is Linnebo’s point that the number zero took a long time to be accepted as a genuine number—whereas if a grasp of number was given by Hume’s principle, it should have been obvious that zero was a number, since it is obvious that there are concepts with no instances [Reference Linnebo11, p. 179].Footnote 4 This I think is a reasonable objection to both versions of Neo-Fregeanism (the original, and Heck’s modified version) as properly fitting an everyday grasp of number, but it is sidestepped if we regard attributions of number as applying not to a concept, but to some things—to a plurality of things.Footnote 5 In colloquial English, some things must consist of at least two things, in which case one cannot properly attribute a number to no things, so that it is no surprise that the number zero was a late arrival. An obvious reply is that in this case, the number one should also have aroused suspicion, as a single thing is not some things; and the answer is that in fact it did, with the Pythagoreans for instance holding that “The One is prior to the numbers proper” [Reference Kahn9, p. 28]. Indeed if we are going to take the history of number concepts as evidence for what a grasp of them consists in, then the suspicion of the number one tells against the ordinal conception that Linnebo defends, since there the existence of the number one—denoted by the first element of any numeral progression—is immediate.
Obviously if we want to give an account of arithmetic as currently understood it will have to include the numbers zero and one, so numbers cannot only be attributed to pluralities of at least two objects. However we can reasonably hold that the “number of” operator is rightly applied to plural terms, that plural terms have historically been taken to necessarily denote at least two objects (when they denote at all)—hence the status of numbers zero and one—but that in fact, we can have a perfectly coherent conception of plural terms as terms that can denote no objects, or one object, or some objects. This expanded conception of plural terms can properly underlie an ordinary person’s grasp of number even if they do not consciously recognise it themselves.Footnote 6
The next objection to Neo-Fregeanism is the simple point that for many examples, the definition of equinumerosity in terms of the existence of a bijection just does not intuitively get the meaning right. If one says for instance that there will come a day as many days in the future as there are birds in the sky, intuitively—to me at least—this statement does not assert some way of pairing up birds with days. I conjecture that if one surveyed members of the public about the meaning of such a statement, few would volunteer a suggestion in terms of pairing; perhaps an answer in terms of counting the birds, and then counting that many days into the future, would be more likely. Similarly, if one says that there are at least as many planets in the universe as grains of sand on earth, this does not appear intuitively to be an assertion about the existence of an injective function from grains of sand to planets (and I think if surveyed, members of the public would be even less likely to volunteer this as an answer).
Of course this is not conclusive—we do not have direct introspective access to the conceptual workings of our minds. But it is telling that neither Heck nor Wright claims that the bijection definition of equinumerosity is obviously correct. Indeed Heck in fact argues that a basic understanding of equinumerosity does not require grasping the bijection characterization [Reference Heck8, pp. 168–172], arguing instead for three basic equinumerosity principles. In the presence of full second-order logic one can argue that for finite concepts, these principles are equivalent to the bijection definition (we will return to these principles in Section 5, and see that they are closely related to our definition of equinumerosity). Heck later gives a similar argument that by considering the act of counting one can see that if F has finitely many instances, then the number of Fs is the number of Gs iff there is a bijection between the Fs and the Gs [Reference Heck8, p. 248]. Wright, too, does not claim the immediate correctness of the bijection definition of equinumerosity, instead arguing again that in the presence of second-order logic, one can come to be convinced that the bijection definition is equivalent to one’s existing concept [Reference Wright22, pp. 106–107]. But all any of these arguments establish is that the bijection definition is extensionally equivalent to what is ordinarily meant by equinumerosity, not that it is what a grasp of that concept typically consists of.
Further evidence against the latter comes from considering infinite cardinalities. Indeed if an understanding of equinumerosity is given by the definition in terms of bijections, the extension of Hume’s principle to the infinite case would and should be immediate—whereas it took a singular genius in the form of Cantor to define and explore the notion of cardinality of infinite sets, and when he did so “he encountered widespread incomprehension and opposition” [Reference Linnebo11, p. 180]. This incomprehension persists to some extent today. The idea that there are different sizes of infinity—though very basic, mathematically speaking—is often discussed as an example of a counterintuitive feature of modern mathematics. Students are also reported to find this idea challenging, with Heck [Reference Heck8, p. 244] describing one who finds it “very worrying,” a source of genuine unease, this kind of case emphasising (in Heck’s view) the size of the conceptual leap required to move from finite to infinite cardinalities. If Hume’s principle is a proper formalization of an ordinary grasp of number, the magnitude of this conceptual leap is inexplicable—cardinalities of infinite concepts work in exactly the same way as the familiar finite cardinalities, so what is the problem? Heck tries to address this issue by claiming that Hume’s principle should be replaced by “Finite Hume’s principle,” Hume’s principle relativized to finite concepts:
Heck argues that by reflecting on the ordinary process of counting, one can come to see that this equivalence (as a material conditional) is true—that if the right-hand side holds, then the left-hand side holds, and vice versa [Reference Heck8, p. 248]. Since this argument involves the notion of counting, it does not extend to infinite concepts, and so—Heck thinks—one has a good reason for the version of Hume’s principle relativized specifically to finite concepts, and an explanation of why infinite cardinalities require a conceptual leap. The problem with this is simple. If Finite Hume’s principle was actually a proper formalization of an ordinary person’s grasp of number, then there would still (as with Hume’s principle) be no conceptual leap to the infinite case—our layman already thinks about equinumerosity in terms of bijections, and they just have to be told that the same conceptual machinery that they use for finite concepts also applies to infinite concepts. But this is patently not the case—the reaction to Cantor’s work, and the troubled reaction of present-day students to different sizes of infinity, indicate that something genuinely novel is taking place in the move from finite to infinite cardinalities, more than just the dropping of a restriction to a special case (finite concepts). And this point is entirely consistent with Heck’s argument: Heck shows how one who thinks of cardinality in terms of counting can be convinced that this is equivalent to a definition in terms of bijections, but that leaves open the possibility that thinking in terms of bijections may be basically new—as it is, for many people. If a formal account of an ordinary person’s grasp of arithmetic employs concepts an understanding of which would not naturally lead to a grasp of infinite cardinality, that is an advantage.
A final problem with Heck’s version of Hume’s principle is that it can only formalize the layman’s understanding of arithmetic if the layman has an understanding of successor functions as formalized in full second-order logic—rather than merely predicative second-order logic. Indeed for Heck, finiteness of a concept is given by the existence of an appropriate successor function, repeated application of which takes one from a distinguished initial element, through every object falling under the concept (with each such object hit exactly once), to a distinguished final element. But in predicative second-order logic the existence of such a function depends on it being definable, and there is no obvious way to define such a function for the finitely many atoms in the earth, say—we cannot just enumerate them by name, as our language does not contain such names. At any rate, our confidence in the ability to define such a successor function for these objects, or many other examples of finitely many objects, is much lower than our confidence in their finiteness.
The distinction between predicative and full second-order logic matters here, since a grasp of the former can plausibly be attributed to a person with an ordinary grasp of English, but a grasp of the latter much less so. If an ordinary educated person can be said to have an innate understanding of full second-order logic today, presumably the same has been true for a number of centuries at least. But the modern notion of function provided by second-order logic, the notion of function as an arbitrary correspondence—rather than a geometric curve, or given by an analytic formula—is a recent invention, whose evolution in mathematics was gradual, hesitant and contested [Reference Kleiner10]. As late as the early twentieth century, eminent mathematicians were questioning whether the notion of function as arbitrary correspondence is legitimate, or whether justifying the existence of a function requires specifying it in a lawlike way—this was part of the controversy over the axiom of choice [Reference Kleiner10, pp. 296–297]. Today, too, students have difficulty with the notion of function as arbitrary correspondence—for many, “functions given by more than one rule are not functions… and functions must consist of algebraic symbols” [Reference Hatisaru and Erbas7, p. 706]. If people typically already had a pre-existing grasp of full second-order logic, one would expect both historical mathematicians and modern-day students to find the notion of function as arbitrary correspondence more natural, and not to latch on so much more easily to the rival predicative notion of function (seen in the preference for functions given by algebraic expressions). If an account of everyday arithmetic competence can avoid attributing a grasp of full second-order logic to the layman, that is a point in its favour.
Though there are thus many objections to Neo-Fregeanism as an account of a layman’s understanding of arithmetic, these should not be taken as general objections to the cardinality conception of arithmetic. There clearly is something that an ordinary person means when they talk of certain things being finite, or being as many as certain other things; and it is worth investigating if we can give a plausible analysis of this, and whether such an analysis has to be dependent on a prior account of numerals, or of numbers as objects—as it would be on the view developed by Linnebo [Reference Linnebo11, chap. 10]. There is also some initial plausibility to developing an account of arithmetic based on such notions, stemming from the common sense idea that “2+2=4” means that if you have two things, and add two more things, then you have four things. I find that something like this is often volunteered by laymen if the question of the meaning of arithmetic arises, and I would speculate that empirical studies of the philosophical views of ordinary people would find the same. This kind of view is also the basis for old puzzles about arithmetic, such as the claim that 1+1 is not always 2, since one raindrop meeting a second may combine with it, resulting in 1 rather than 2. Of course the fact that ordinary people think they mean something by certain statements is not conclusive evidence that they actually do, but it is reasonable grounds for taking the possibility seriously.
This is the motivation for the account put forward in the following, a version of the cardinal conception of arithmetic. It has something of an ordinal flavour to it though, on the view of ordinal numbers as measures of iteration—a view which comes to the fore in Tait’s account of finitism [Reference Tait18, p. 27], and which gives the sense in which the transfinite ordinals are ordinal numbers. Indeed the key concepts such as finiteness and equinumerosity are defined in terms of ancestral operators, and the ancestral can be viewed as a natural way of capturing iteration of a relation. My account avoids Linnebo’s objection regarding the number zero by being based on plural logic, as discussed above; it avoids the intuitive implausibility of an account in terms of pairings; it makes clear formally why the extension to infinite cardinalities is a conceptual leap; and it avoids attributing to the layman a grasp of full second-order logic. Linnebo makes two further objections to the cardinality conception of arithmetic which I do not think are convincing [Reference Linnebo11, pp. 180–182], but which I do not have space to address here before continuing with the main argument of the paper.
2 Plural logic
As has been advertised, the view of arithmetic to be put forward here is one where a number is stated to be the cardinality of some things—a plurality of things. The logic of reference to, and reasoning about, multiple things at once—rather than just one individual at a time—is known as plural logic. The modern study of plural logic stems from Boolos [Reference Boolos1, Reference Boolos2], who argues that certain statements of ordinary English, such as the Geach–Kaplan sentence “some critics admire only one another,” cannot be given an analysis in first-order logic: they require instead an analysis which takes seriously the plural quantifier “there are some …”. The analysis of plural idioms such as this has since been carried forward by various authors. We employ here the account of Oliver & Smiley [Reference Oliver and Smiley14], of whose most relevant parts we give a brief summary (for much more detailed discussion and argument, refer to the original).
The central thesis of Oliver & Smiley [Reference Oliver and Smiley14] is that there is such a thing as plural denotation—a semantic relation holding between terms and things, plural in the sense that a particular term may denote more than one thing at once [Reference Oliver and Smiley14, p. 2]. A term thus capable of denoting more than one thing is called plural, contrasted with the familiar singular terms capable of denoting at most one thing. There are plural proper names (“The Rocky Mountains”), plural definite descriptions (“the cars in the garage”) and plural terms obtained by applying a function sign to its arguments (“Queen Victoria’s children”) [Reference Oliver and Smiley14, pp. 2 and 78–80]. Plural terms feature significantly in arguments, such as “John’s kids are Sally and Laurence; John’s kids are spoilt; so Sally and Laurence are spoilt.”
Predicates that can take singular terms as arguments can typically take plural terms as arguments as well: “the chocolates are in the cupboard/are liable to melt/are delicious.” Examples such as these are termed distributive predicates—a predicate being distributive if it is analytic that it holds of some things iff it holds of each individually. A predicate is collective if it is not distributive—for instance “the men carried the coffin” (they didn’t each carry the coffin individually). The same expression can sometimes be read as either distributive or collective, such as “they cost $5” (each individually or all taken together). For n-place predicates,Footnote 7 the distributive/collective distinction applies to each place separately. For instance “[certain people] wrote [certain books]” is collective at its first place, distributive at its second [Reference Oliver and Smiley14, pp. 2–3].
As well as arguing for admitting terms which can denote more than one object, Oliver & Smiley [Reference Oliver and Smiley14, sec. 5.6] argue for the importance of allowing for empty terms, where a term—either singular or plural—is empty if it does not, in fact, denote anything.Footnote 8 Consonant with this, they argue too that definite descriptions should be regarded as genuine singular terms, despite Russellian qualms [Reference Oliver and Smiley14, chaps. 5 and 8]. I find their arguments very plausible, and will follow their lead here. They identify four kinds of definite descriptions, of which we will use two: the familiar singular definite descriptions—terms of the form “the x such that $\phi $ ,” formally written “ $\iota x\,\,\phi $ ,” which denotes the unique object satisfying “ $\phi $ ” if there is a unique such object; and exhaustive descriptions, of the form “those x which individually $\phi $ ,” written “ $x{:}\,\phi $ ,” which denotes those objects y such that y (by itself) satisfies “ $\phi $ .” Thus for instance “ $x{:}\,x=x$ ” denotes every object. The term “zilch” is introduced as “the x which is not self-identical,” as a paradigmatic empty term, which we write as $\mathfrak {o}$ .
It has typically been held that if we apply a function sign f to terms $t_1$ , …, $t_n$ , then the resulting term $f(t_1,\ldots , t_n)$ is empty if any $t_i$ is empty. Oliver & Smiley [Reference Oliver and Smiley14, p. 87] define a function sign as strong if it is analytic that it behaves in this way, weak otherwise. Though various authors have objected to the idea of weak function signs, Oliver and Smiley argue these objections are feeble [Reference Oliver and Smiley14, pp. 87–88]. They give natural examples of weak function signs such as “ $\{x\mid x=y\}$ ,” which applied to an empty term such as “Pegasus” gives “ $\{x\mid x=\text {Pegasus}\}$ ,” which denotes the empty set—rather than denoting nothing. Similarly, if we take the mereological fusion of Lloyd George with Pegasus, we obtain Lloyd George (rather than nothing). A weak function sign will typically express a function which is co-partial—a function which maps nothing to something, the dual of the notion of partial function, which maps something to nothing. Co-partial functions are important for our purposes here, since we want for instance “the number of black holes in our solar system” to be the object zero, rather than nothing—so that “the number of [some things]” expresses a co-partial function.Footnote 9
As well as weak function signs, Oliver & Smiley [Reference Oliver and Smiley14, p. 90] defend the notion of weak predicates, where a one-place predicate F is strong if it is analytic that if t is empty then $Ft$ is false, and is otherwise weak. For n-place predicates the distinction between strong and weak applies to each place. The logical predicate “is identical to,” or $=$ , is taken to be a strong predicate, with $a=b$ holding only if a and b both exist (we sometimes call this strong identity to emphasise this). An example of a weak predicate is “do/does not exist.” Our plural ancestral operators, discussed in Section 3, form weak predicates, and the predicates for finiteness and equinumerosity that we define via these operators in Sections 4 and 5 are weak.
A logical relation that makes an appearance when we move from singular to plural logic is that of inclusion, which we will symbolise $\preccurlyeq $ [Reference Oliver and Smiley14, sec. 7.2]. In $a\preccurlyeq b$ with b plural, $\preccurlyeq $ can be read as “is/are among,” and with b singular, $\preccurlyeq $ can be read “is/are.” Thus $\preccurlyeq $ can be read “is/are, or is/are among, as the case may be.” This is a strong predicate, with the truth of $a\preccurlyeq b$ implying that “a” and “b” are not empty terms.
Oliver & Smiley [Reference Oliver and Smiley14, chap. 13] present a formal account of the logic just described, defining its syntax, axiom system and semantics. However their formal account is not a perfect fit for our purposes here. First, we want what Oliver and Smiley call the “inclusive” interpretation of plural quantifiers, where the range of a plural quantifier $\forall xx$ or $\exists xx$ includes the case where $xx$ is assigned no values, and is an empty term. On our intended inclusive interpretation, $\exists xx\, \phi $ can be read as “zilch or some thing $\phi $ s or some things $\phi $ ,” and $\forall xx\,\phi $ can be read as “whenever you have no things, or some thing, or some things, that thing $\phi $ s or those things $\phi $ .”Footnote 10 Second, since Oliver and Smiley present the deductive system as a Hilbert style axiom system rather than a natural deduction system, it takes more effort than it could to discern from their account how reasoning using the deductive system will actually proceed in practice. Third, they present the semantics of the system using an unspecified form of higher-order logic as the metalanguage—a higher-order logic whose intended properties are unclear. Typical higher-order logics only allow quantification over a single type at a time, where all individuals form a type, and relations of different arities, or over different types, are themselves of different types (similarly for functions). But the key notion in Oliver and Smiley’s semantics is that of a valuation, a function assigning each linguistic expression of the object language—its variables, constant symbols, predicates and function symbols—the object, relation or function that the expression takes as its value. Thus a valuation is a function from objects of the base type (linguistic expressions) to objects of various different types—such as functions and relations of different arities. Presumably Oliver and Smiley have in mind some conception of higher-order logic that allows a function to take values in separate types, but it isn’t clear what it is.
Thus in the Appendix to this paper, we give a new formal presentation of plural logic as understood here. Though closely following Oliver and Smiley’s in spirit, it addresses the three issues just mentioned: it uses the inclusive interpretation of the plural quantifiers; it presents the deductive system in a form of natural deduction, making it more transparent how deductions according to the logic proceed; and it uses set theory, phrased with plural logic as the background logic, as the metalanguage. This base plural logic is buttressed with plural ancestral and plural generalized ancestral operators in Section 3 to give the “logic of arithmetic” that is the subject of this paper. In the background plural set theory, we use $\{aa\}$ to denote the set whose members are exactly the objects $aa$ , if such a set exists.
We use $x,y,\ldots {}$ for singular variables and $xx,yy,\ldots {}$ for plural variables. Quantifiers can bind either, and no further syntactic distinction is made between singular and plural terms—for the reasons discussed by Oliver & Smiley [Reference Oliver and Smiley14, p. 218]. Any function symbol can be applied to any terms as arguments, as can any predicate; if for instance a function symbol expresses a function that can only take a singular argument, then applying it to a term denoting more than one object will produce an empty term as a result.
To allow us to express notions like that of a term taking some value (rather than being empty), or taking at most one value, we introduce various abbreviations, along similar lines to Oliver & Smiley [Reference Oliver and Smiley14, chap. 13]. If t is a term we write $E!(t)$ , meaning “t exists,” for $\exists x\,x\preccurlyeq t$ ; we write $S(t)$ , meaning “t is singular” for $\forall x\,(x\preccurlyeq t\rightarrow x=t)$ ; and we write $S!(t)$ , meaning “t exists and is singular” for $\exists x\,(x=t)$ —where x is the first singular variable not free in t according to some ordering of the singular variables. Then for instance if we want f to be a function symbol expressing a function that can only take singular arguments, rather than plural arguments, we can state that $\forall xx\,(\neg\, S(xx)\rightarrow \neg\, E!(f(xx)))$ . Additionally, if s and t are terms we write $s\equiv t$ for $s=t\lor (\neg\, E!(s)\land \neg\, E!(t))$ , the predicate termed “weak equality” by Oliver and Smiley—equality which holds with empty terms as arguments.
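For orientation, here is how these abbreviations behave on the paradigmatic empty term (worked instances added purely for illustration):
$$ \begin{align*} &\neg\, E!(\mathfrak{o}),\ \text{since}\ \neg\exists x\, x\preccurlyeq\mathfrak{o}\ (\preccurlyeq\ \text{being strong});\\ &S(\mathfrak{o}),\ \text{vacuously, since no}\ x\ \text{satisfies}\ x\preccurlyeq\mathfrak{o};\\ &\neg\, S!(\mathfrak{o}),\ \text{since}\ \neg\exists x\, x=\mathfrak{o}\ (=\ \text{being strong});\\ &\mathfrak{o}\equiv\mathfrak{o},\ \text{by the second disjunct of weak equality.} \end{align*} $$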
One feature of the base plural logic that is worth mentioning is that, like that of Oliver and Smiley, it allows the domain of quantification to be empty. This is argued by Oliver & Smiley [Reference Oliver and Smiley14, sec. 11.1] to be an important aspect of topic neutrality, and allows us to properly handle the case where there are no individuals to be counted (and so the number of individuals is zero). It is thus a universally free logic. We follow Oliver & Smiley [Reference Oliver and Smiley14, sec. 11.1] in avoiding the usual problems with universally free logic by allowing variable assignments to be partial, and to assign no value to a variable (so that variables may be empty terms). Importantly, though our logic is universally free, this is no real hindrance: under the assumption that there is an object, one can in effect reason as though the singular variables are all nonempty, as discussed near the end of the Appendix (this is a point that Oliver and Smiley do not make, and which may be much less obvious from the point of view of their Hilbert style axiomatization).
3 Plural ancestrals
To obtain the “logic of arithmetic” that is the subject of this paper, we supplement the base plural logic from the Appendix with plural versions of the ancestral operator.
The ancestral operator is an operator which takes a formula $\phi $ and distinct (singular) variables $x_1,x_2$ , and produces a binary predicate symbol $\phi ^\ast _{x_1,x_2}$ in which occurrences of $x_1$ and $x_2$ in $\phi $ are bound, where the relation expressed by $\phi ^\ast _{x_1,x_2}$ is the reflexive transitive closure of the relation expressed by $\phi (x_1,x_2)$ [Reference Shapiro16, p. 227]. In other words, $\phi ^\ast _{x_1,x_2}(r,s)$ states that the object denoted by s can be reached from the object denoted by r by finite iteration of the relation expressed by $\phi (x_1,x_2)$ . The operator takes its name from the fact that “s is r or is an ancestor of r” is the instance $(\text {Parent}(x_1,x_2))^\ast _{x_1,x_2}(s,r)$ of the ancestral.Footnote 11 We will term this operator the singular ancestral to differentiate it from the versions that follow.
Formally, we can give a semantics for the singular ancestral by stating that $\mathcal {D},v\vDash \phi ^\ast _{x_1,x_2}(r,s)$ iff there are individuals $a_0\ldots a_n\in D$ with $n\geqslant 0$ such that $v(r)=a_0$ , $v(s)=a_n$ and for each $i\in \{0\ldots (n-1)\}$ , we have that $\mathcal {D},v[a_i|x_1,a_{i+1}|x_2]\vDash \phi $ . Importantly if our metalanguage contains the singular ancestral, we can instead give a semantics using the operator in the metalanguage, and eschewing sequences: we state that $\mathcal {D},v\vDash \phi ^\ast _{x_1,x_2}(r,s)$ iff
where “ $b_1$ ” and “ $b_2$ ” are distinct singular metavariables. This can be proved equivalent to the first semantics, if for instance our metalanguage is set theory with the singular ancestral in its logic (and the standard deductive rules for the singular ancestral), with the separation scheme expanded to include formulae containing the singular ancestral. There are standard deductive rules associated with the singular ancestral, which we omit for reasons of brevity. In the context of our base plural logic, they can be obtained by restricting the rules for the plural ancestral below to singular nonempty terms (terms t satisfying $S!(t)$ ), with the principle that if $\phi ^\ast _{x_1,x_2}(r,s)$ then $S!(r)$ and $S!(s)$ .
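Though the rules themselves are omitted here, it may help to have one common formulation on the page (given only for orientation; the exact form intended may differ, and in the present free setting the terms involved are required to be singular and nonempty):
$$ \begin{align*} &\ast\text{-I}_1\colon\ S!(t)\ \vdash\ \phi^\ast_{x_1,x_2}(t,t);\\ &\ast\text{-I}_2\colon\ \phi^\ast_{x_1,x_2}(r,s),\ S!(t),\ \phi[s|x_1,t|x_2]\ \vdash\ \phi^\ast_{x_1,x_2}(r,t);\\ &\ast\text{-E}\colon\ \chi[r|x_1],\ \forall x_1\,\forall x_2\,(\chi\land\phi\rightarrow\chi[x_2|x_1]),\ \phi^\ast_{x_1,x_2}(r,s)\ \vdash\ \chi[s|x_1], \end{align*} $$
with the usual provisos that the substituted terms are free for the relevant variables and that $x_2$ is not free in $\chi$.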
The plural ancestral is just the singular ancestral but with the restriction to singular arguments lifted. It takes a formula $\phi $ and distinct plural variables $xx_1$ , $xx_2$ and produces a binary predicate symbol $\phi ^\ast _{xx_1,xx_2}$ in which occurrences of $xx_1$ and $xx_2$ in $\phi $ are bound. If our background metalogic allows us to form the reflexive transitive closure of a binary relation which can take plural arguments (for instance if it has a comprehension scheme which allows us to form arbitrary intersections of such relations) then we can characterize $\phi ^\ast _{xx_1,xx_2}$ again as expressing the reflexive transitive closure of the relation expressed by $\phi (xx_1,xx_2)$ .
We can also again view $\phi ^\ast _{xx_1,xx_2}(r,s)$ as stating that the things denoted by s can be reached from the things denoted by r by finite iteration of the relation expressed by $\phi (xx_1,xx_2)$ . We can give an informal characterization of the operator along these lines, saying that $(\phi )^\ast _{xx_1,xx_2}(qq,zz)$ holds iff
• $qq\equiv zz$ ;
• Or $\phi [qq|xx_1,zz|xx_2]$ ;
• Or there are $uu$ such that $\phi [qq|xx_1,uu|xx_2]$ and $\phi [uu|xx_1,zz|xx_2]$ ;
• Or there are $uu$ and there are $uu'$ such that $\phi [qq|xx_1,uu|xx_2]$ and $\phi [uu|xx_1,uu'|xx_2]$ and $\phi [uu'|xx_1,zz|xx_2]$ ;
• Or there are $uu$ and there are $uu'$ and there are $uu"$ such that we have $\phi [qq|xx_1,uu|xx_2]$ and $\phi [uu|xx_1,uu'|xx_2]$ and $\phi [uu'|xx_1,uu"|xx_2]$ and $\phi [uu"|xx_1,zz|xx_2]$ ;
and so on. If one replaced the plural variables here with singular variables, one would have an informal characterization of the singular ancestral.
We can give formal semantics along similar lines to those for the singular ancestral, stating that $\mathcal {D},v\vDash \phi ^\ast _{xx_1,xx_2}(r,s)$ iff there is a sequence $(p_0,\ldots , p_n)$ of subsets of D with $n\geqslant 0$ such that $\{v(r)\}=p_0$ , $\{v(s)\}=p_n$ and for each $i\in \{0\ldots (n-1)\}$ , if $aa_i$ are the elements of $p_i$ and $aa_{i+1}$ are the elements of $p_{i+1}$ we have $\mathcal {D},v[aa_i|xx_1,aa_{i+1}|xx_2]\vDash \phi $ . We need to use a sequence $(p_0\ldots p_n)$ of subsets since we have no direct way to talk about the values taken by variably many plural terms, but we are not committed to interpreting plural terms via sets, since again we can give an alternative semantics using the operator in the metalanguage: stating that $\mathcal {D},v\vDash \phi ^\ast _{xx_1,xx_2}(r,s)$ iff
where “ $bb_1$ ” and “ $bb_2$ ” are distinct plural metavariables.
As an example of how this operator forms predicates of a general kind that is familiar and unsuspicious (though giving the exact details of the example is somewhat fiddly), we can consider the case of a secretive cabal of ne’er-do-wells—formed from some initial group of members, with periodic ceremonies where the new membership of the cabal is decided on. Then if the $uu$ are the initial membership, and we let $C(xx,yy)$ be the relation that holds between the $xx$ and the $yy$ if the $xx$ hold an appropriate ceremony sanctioning the $yy$ as the new membership, then
$$(C(xx,yy))^\ast_{xx,yy}(uu,zz)$$
is a rough attempt to characterize the $zz$ being the membership of the cabal at some stage.Footnote 12
The plural ancestral is governed by rules which parallel those normally given for the singular ancestral, except applying to any terms, rather than just singular nonempty terms. We write $\text {P}^\ast $ to identify these rules as those for the plural ancestral operator.
In rule $\text {P}^\ast $ - $\text {I}_2$ we require that $s_1$ and $s_2$ are free for $xx_1$ and $xx_2$ respectively in $\phi $ , and in rule $\text {P}^\ast $ -E we require that $xx_2$ is not free in $\chi $ .
Importantly unlike the singular ancestral, the plural ancestral forms weak predicates—predicates which can hold with empty terms as arguments. For instance suppose we write $\text {Children}(xx_1,xx_2)$ to abbreviate
i.e., the statement that the $xx_2$ are exactly the children of all the $xx_1$ . Then applying the ancestral to this predicate gives a weak predicate, where $(\text {Children}(xx_1,xx_2))^\ast _{\vec {xx}}(s,\mathfrak {o})$ holds iff some generation of descendants of the things s is empty.
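As a toy instance (an illustrative family invented for the purpose): suppose a and b’s children are exactly c and d, and that c and d have no children at all. Then
$$ \text{Children}(a\ \text{and}\ b,\ c\ \text{and}\ d)\quad \text{and}\quad \text{Children}(c\ \text{and}\ d,\ \mathfrak{o}), $$
and the introduction rules for the plural ancestral then give $(\text{Children}(xx_1,xx_2))^\ast_{\vec{xx}}(a\ \text{and}\ b,\ \mathfrak{o})$: the second generation of descendants of a and b is empty.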
The plural ancestral includes the singular ancestral as a special case, with $\phi ^\ast _{x_1,x_2}(r,s)$ being definable as
where $xx_1$ and $xx_2$ are distinct plural variables not free in $\phi $ . Indeed if both operators are included in the logic then these two expressions are provably equivalent, and if only the plural ancestral is included then the latter expression can be proved to have the intended semantics of the former (on either semantics), and to satisfy its deductive rules.
The third and final form of the ancestral we will consider is that of the plural generalized ancestral. This takes a formula $\phi $ , together with four distinct plural variables $xx_1,xx_2,yy_1$ , and $yy_2$ , and produces a binary predicate symbol $\phi ^\ast _{xx_1,xx_2,yy_1,yy_2}$ in which occurrences of $xx_1,xx_2,yy_1$ , and $yy_2$ in $\phi $ are bound. We can give an informal characterization of this operator along similar lines to that of the plural ancestral, but more verbose, saying that $\phi ^\ast _{xx_1,xx_2,yy_1,yy_2}(pp,qq,ww,zz)$ holds iff
• $pp\equiv ww$ and $qq\equiv zz$ ;
• Or $\phi [pp|xx_1,qq|yy_1,ww|xx_2,zz|yy_2]$ ;
• Or there are $uu$ and there are $vv$ such that:
$$ \begin{align*} &\phi[pp|xx_1,qq|yy_1,uu|xx_2,vv|yy_2];\\ &\phi[uu|xx_1,vv|yy_1,ww|xx_2,zz|yy_2]; \end{align*} $$
• Or there are $uu$ and there are $uu'$ , and there are $vv$ and there are $vv'$ , such that:
$$ \begin{align*} &\phi[pp|xx_1,qq|yy_1,uu|xx_2,vv|yy_2];\\ &\phi[uu|xx_1,vv|yy_1,uu'|xx_2,vv'|yy_2];\\ &\phi[uu'|xx_1,vv'|yy_1,ww|xx_2,zz|yy_2]; \end{align*} $$
• Or there are $uu$ and there are $uu'$ and there are $uu"$ , and there are $vv$ and there are $vv'$ and there are $vv"$ , such that:
$$ \begin{align*} &\phi[pp|xx_1,qq|yy_1,uu|xx_2,vv|yy_2];\\ &\phi[uu|xx_1,vv|yy_1,uu'|xx_2,vv'|yy_2];\\ &\phi[uu'|xx_1,vv'|yy_1,uu"|xx_2,vv"|yy_2];\\ &\phi[uu"|xx_1,vv"|yy_1,ww|xx_2,zz|yy_2]; \end{align*} $$
and so on.
We can give a formal semantics along the same lines as before, stating that $\mathcal {D},v\vDash \phi ^\ast _{xx_1,xx_2,yy_1,yy_2}(q,r,s,t)$ iff there is a sequence $((l_0,p_0),\ldots , (l_n,p_n))$ of pairs of subsets of D with $n\geqslant 0$ such that $\{v(q)\}=l_0$ , $\{v(r)\}=p_0$ , $\{v(s)\}=l_n$ , $\{v(t)\}=p_n$ and for each $i\in \{0\ldots (n-1)\}$ , if $aa_i$ are the elements of $l_i$ , $bb_i$ are the elements of $p_i$ , $aa_{i+1}$ are the elements of $l_{i+1}$ and $bb_{i+1}$ are the elements of $p_{i+1}$ then $\mathcal {D},v[aa_i|xx_1,aa_{i+1}|xx_2,bb_i|yy_1,bb_{i+1}|yy_2]\vDash \phi $ . Again we are not committed to interpreting plural terms via sets, since we can give an alternative semantics using the operator in the metalanguage: stating that $\mathcal {D},v\vDash \phi ^\ast _{xx_1,xx_2,yy_1,yy_2}(q,r,s,t)$ iff
where “ $aa_1$ ,” “ $aa_2$ ,” “ $bb_1$ ” and “ $bb_2$ ” are distinct plural metavariables.
We can illustrate the meaning of this operator, and its ability to form unsuspicious predicates, by another version of the cabal example from before. This time we have the A-cabal and its sister cabal, the B-cabal. The new membership of each cabal is determined in a joint ceremony, in which the existing members of both cabals are required to be present; the existing members of the A-cabal play a certain ceremonial role; the existing members of the B-cabal play a different ceremonial role; and the proposed new membership of each cabal plays a further distinct role. Again we can give an intuitive but rough characterization of this concept using the plural generalized ancestral, letting $C(xx_1,xx_2,yy_1,yy_2)$ be the relation that holds if a membership ceremony is correctly held by the $xx_1$ and $xx_2$ which sanctions the $yy_1$ as the new membership of the A-cabal, the $yy_2$ as the new membership of the B-cabal, and with the $xx_1$ playing the role of the A-cabal members, and the $xx_2$ playing the role of the B-cabal members. Then if we let $uu$ be the initial members of the A-cabal, and the $vv$ the initial members of the B-cabal, we have that
is a rough characterization of the $ww$ and the $zz$ being the membership of the A-cabal and B-cabal respectively at some contemporaneous stage.Footnote 13 This predicate does not appear to be definable in terms of the plural ancestral alone.Footnote 14
The plural generalized ancestral is governed by rules paralleling those of the plural ancestral, but with more arguments. We write $\text {PG}^\ast $ to identify these rules as those for the plural generalized ancestral operator.
In rule $\text {PG}^\ast $ - $\text {I}_2$ we require that $s_1$ , $s_2$ , $t_1$ and $t_2$ are free for $xx_1$ , $xx_2$ , $yy_1$ , and $yy_2$ respectively in $\phi $ . In rule $\text {PG}^\ast $ -E we require that $xx_2$ and $yy_2$ are not free in $\chi $ .
The plural ancestral can be defined in terms of the plural generalized ancestral. Indeed $\phi _{xx_1,xx_2}^\ast (r,s)$ can be defined as $\phi ^\ast _{xx_1,xx_2,yy_1,yy_2}(r,c,s,c)$ where the $yy_i$ are not free in $\phi $ and c is any closed term. These expressions are provably equivalent if both the plural ancestral and plural generalized ancestral are present in the logic, and if only the plural generalized ancestral is present then the latter expression can be proved to have the intended semantics of the former, and to satisfy its deductive rules.
We will take the logic of this paper to be the base plural logic of the Appendix supplemented by the plural ancestral and plural generalized ancestral operators (though the former is redundant, as just noted). This logic has a plausible claim to being a “real logic” (though this notion is not without obscurity). Of course, this is dependent on us regarding the base plural logic itself as a real logic. Oliver & Smiley [Reference Oliver and Smiley14] draw a convincing parallel between plural terms and plural quantification, on the one hand, and singular terms and singular quantification, on the other, so that in as much as we have reason to regard standard first-order logic as a real logic, the same should go for their plural logic (and our base plural logic). Plural logic, on their approach, is ontologically innocent: we are not requiring the existence of any new entities, such as mysterious “pluralities,” but are just appealing to a plural notion of denotation—the ability of a term to denote multiple objects.
Supplementing the base plural logic with our ancestral operators arguably should not impact its status as a real logic, either. The logic remains ontologically innocent, as we are still not committed to interpreting plural terms via sets, and can instead state the semantics by just using the same operators in the metalanguage. Smith [Reference Smith17] argues that the singular ancestral should be regarded as a genuine logical operator, a conceptual primitive intermediate between first- and second-order logic, and Parsons [Reference Parsons15, chap. 8] argues that one can introduce the predicate “natural number” by laying down the relevant introduction rules as canonical ways of attributing the predicate to objects—an argument which generalizes immediately to other instances of the singular ancestral. If these defences of the singular ancestral—which in no way appeal to an interpretation of it in plural logic—are correct, then exactly the same should go for the plural ancestral operators we have introduced here, about which very similar points can be made. The ancestral operators are governed by simple introduction and elimination rules, in the same manner as the propositional connectives and quantifiers are. The rules characterize these operators uniquely, in the sense that in each case, if there are two predicates $\phi ^\ast $ and $\phi ^+$ satisfying the deductive rules for the predicate symbol formed from $\phi $ (with a given choice of bound variables) then one can prove these predicates coextensional. The rules are conservative in the sense that following a sequence of instances of the introduction rules by an instance of the elimination rule always gives a result that was already provable without using the ancestral; arguing this requires the ability to argue by induction or recursion in the metalanguage—which is no more circular than obtaining the parallel conservativeness results for propositional connectives using the corresponding logical rules in the metalanguage. They also satisfy Tarski’s permutation-invariance requirement, when the ancestral is regarded as a function from relations to relations [Reference Tarski19]. Of course how we should determine which concepts should count as those of “real logic” is a difficult question, and merits much further discussion, but there is no obvious objection to the plural ancestral and plural generalized ancestral that would not apply equally to the singular ancestral, and moreover to the propositional connectives and quantifiers. One residual worry may concern the status of induction—whether somehow this is less “obvious” as a primitive form of reasoning than other varieties of deduction. I note this worry, but in my view it is not compelling.
As with other logical concepts, we can view a grasp of these ancestral operators as given—at least in paradigmatic cases, being aware of the points made by Williamson [Reference Williamson21]—by proficiency with their introduction and elimination rules. Then the cabal examples earlier suggest that ordinary people typically do grasp at least some instances of the plural ancestral and plural generalized ancestral, with the predicates “member of [the/the A/the B] cabal” being of an ordinary kind, and most naturally seen as defined via the plural ancestral and plural generalized ancestral in the manner above. Almost everyone would be proficient with uses of the introduction and elimination rules for these examples; this is obvious for the introduction rules, and an instance of the elimination rule for “member of the cabal in year I” is the inference that if all initial members of the cabal are murderers, and each year all new members admitted are murderers, then all members of the cabal in every year are always murderers. One can give a similar example for the A-cabal and B-cabal, say with murderers and arsonists. These kinds of inductive inferences would, I think, be immediately accepted by almost anyone.
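To display the shape of this cabal inference in the present notation (an illustrative formalization, on the natural reading of the elimination rule): write $M(xx)$ for the distributive predicate “each of the $xx$ is a murderer.” The premises are $M(uu)$ (the initial members are all murderers) and $\forall xx\,\forall yy\,(C(xx,yy)\rightarrow M(yy))$ (any newly sanctioned membership consists of murderers); the latter entails the inductive premise of the elimination rule with $\chi$ taken to be $M$, and so the rule yields
$$\forall zz\,\big((C(xx,yy))^\ast_{xx,yy}(uu,zz)\rightarrow M(zz)\big),$$
i.e., that the members of the cabal at any stage are all murderers.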
As mentioned in the introduction, the work here has a precursor in that of Martin [Reference Martin12, Reference Martin13]. He was (as far as I’m aware) the first to define the generalized ancestral, giving a version of the operator for singular variables. Martin’s purpose was to give a nominalistic system in which a reasonable amount of mathematics could be carried out, rather than to argue for a philosophical analysis of arithmetic in terms of ancestrals. But ancestrals do have plausibility as the basis of numerous arithmetic concepts: as well as the concepts discussed here that can be defined in terms of our plural ancestrals, I have argued previously [Reference Tatton-Brown20] that the double ancestral (slightly less general than the singular generalized ancestral) can be seen as the basis of a usual grasp of primitive recursion.
4 Finiteness
The first arithmetic concept we will characterize in our logic is that of finiteness. First we define what it is for a plurality $yy$ to be the same as the $xx$ but with one additional object, which we write as $\text {Succ}(xx,yy)$ , “Succ” for “successor.” This holds if every x amongst the $xx$ is also among the $yy$ , and if there is a unique y which is amongst the $yy$ but not the $xx$ . Using the plural ancestral we can form the relation $(\text {Succ}(xx_1,xx_2))^\ast _{xx_1,xx_2}$ , where $(\text {Succ}(xx_1,xx_2))^\ast _{xx_1,xx_2}(ww,zz)$ holds if the $zz$ are obtained as a finite extension of the $ww$ . Then we can define what it is for a plurality to be finite: they are just the finite extensions of zilch. Symbolically, we have
$$\text{Finite}(xx)\ \leftrightarrow\ (\text{Succ}(xx_1,xx_2))^\ast_{xx_1,xx_2}(\mathfrak{o},xx).$$
It is easy to see that $\mathcal {D},v\vDash \text {Finite}(xx)$ iff $v(xx)$ are finitely many elements of D, in our metalanguage of plural set theory.
If one likes one can read us as meaning “are finitely many” when we say “are finite,” as one perhaps ordinarily only talks about some things being finite if they are plural, whereas “finitely many” does naturally apply to no things, or one thing: if the room is free of chairs, or has one chair in it, then it has finitely many in it (it certainly doesn’t have infinitely many).
The introduction and elimination rules for the plural ancestral, specialized to this predicate, state:
This definition of finiteness has great plausibility as a characterization of what is ordinarily meant by the term—and is more plausible than the Neo-Fregean alternative. We do not attribute to ordinary people an innate grasp of second-order logic, and instead use the plural ancestral operator, which as seen in Section 3 is a concept forming operator that ordinary people grasp at least some instances of.
Recalling the informal characterization of the plural ancestral seen in Section 3, that characterization when specialized to this predicate states that the $zz$ are finite iff they are zilch; or they are obtained from zilch by the addition of a single element; or they are obtained from the $uu$ by addition of a single element with the $uu$ obtained from zilch by the addition of a single element; or are obtained from the $uu'$ by the addition of a single element, where the $uu'$ are obtained from the $uu$ by the addition of a single element, and the $uu$ are obtained from zilch by the addition of a single element; and so on. The finite things are the things which can be reached by iterating the “add a single object” operation (where it is implicit that this is finite iteration, an ordinary person not having a concept of transfinite iteration)—a very natural way to state what we ordinarily mean by finiteness.
The introduction rules for the plural ancestral, specialized to this predicate, are also very plausible as a partial characterization of what is ordinarily meant by finiteness: they state that zilch is finite, and that if the $xx$ are finite, and the $yy$ are obtained by adding a single additional object to the $xx$ , then the $yy$ are finite.
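As a worked instance of these introduction rules (a toy example, with hypothetical objects): let c be a chair and t a table in the room. Then
$$ \begin{align*} &\text{Finite}(\mathfrak{o});\\ &\text{Succ}(\mathfrak{o},c),\ \text{so}\ \text{Finite}(c);\\ &\text{Succ}(c,\ c\ \text{and}\ t),\ \text{so}\ \text{Finite}(c\ \text{and}\ t); \end{align*} $$
and so on, one object at a time.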
The elimination rule’s status in this regard is more questionable. It gives that for each definable property if that property holds of zilch, and that property is preserved whenever one adds one new object to some things, then it holds of any finite things. As an example, if we wanted to argue that every finite nonempty group of people has a tallest person, we could argue as follows: any group consisting of exactly one person has a tallest person, and if the $xx$ have a tallest person u and people $yy$ are just the $xx$ with an extra person v then either u or v will be a tallest person among the $yy$ . The status of this kind of argument is less clear than that of the introduction rules for finiteness. An ordinary person might have no interest in this example argument, as they would assume the conclusion is obvious. More significantly, even if persuaded to seriously consider the question at hand, this is not the kind of argument we would expect any ordinary person to spontaneously make. It may be more plausible that they might give an informal variation of the argument, saying that if there’s one person then they’re the tallest; if you add a second person, then if they’re taller than the previous person then they’re the tallest, otherwise the previous person was the tallest; if you add a third person, then if they’re taller than the previous two then they’re the tallest, otherwise the tallest of the previous two is the tallest; and so on. Though phrased differently this has essentially the same content as the first version.
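Spelling the example argument out in the present notation (a formalization added for illustration, restricting attention to pluralities of people as in the informal version), one can take $\chi(xx)$ to be “ $xx\equiv\mathfrak{o}$ , or there is a tallest person among the $xx$ ” and check the two clauses required by the elimination rule:
$$ \begin{align*} &\chi(\mathfrak{o})\ \text{holds by the first disjunct};\\ &\text{if}\ \chi(xx)\ \text{and}\ \text{Succ}(xx,yy),\ \text{then either the single new person, or a tallest person among the}\ xx,\\ &\quad\text{is a tallest person among the}\ yy,\ \text{so}\ \chi(yy); \end{align*} $$
whence $\chi(xx)$ holds of any finite $xx$, and in particular any finite nonempty group of people has a tallest person.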
Though it is thus not obvious that an ordinary person will typically have an innate grasp of this elimination rule, this is not actually a strong objection to our characterization of finiteness. We can draw a parallel with other logical operations. The introduction rule for the universal quantifier $\forall x$ involves reasoning about an arbitrary, unspecified x, using no premises involving x, and then deducing that whatever conclusion reached actually holds of all x. It is not clear whether there are any cases when an ordinary person would spontaneously or explicitly use this kind of reasoning, or even whether they would necessarily see why it was valid. It is a method of mathematical proof that requires time and effort for school age students to learn, if they are able to learn it at all. Nonetheless we do regard this rule as partly constituting what is normally meant by “for all”—with ordinary people perhaps not proficient in the full use of the concept, but meaning the same as logicians when they say “for all” in virtue of speaking the same language. Exactly the same can be said of our characterization of finiteness.
5 Equinumerosity
We now treat the predicateFootnote 15 “equinumerous with,” and its relatives “at least as many as” and “more than” (for count nouns), discussing their definition and basic properties. Here “the $xx$ are equinumerous with the $yy$ ” is taken to mean “there are just as many $xx$ as $yy$ ,” not necessarily with any connotations of the $xx$ and the $yy$ being assigned numbers as cardinalities. Of course we do naturally think of there being a close relationship between equinumerosity (thus understood) and equality of number, and we will introduce numbers as objects that can play this standard role in Section 7. It is worthwhile first investigating how much of a normal understanding of equinumerosity can be recovered without treating numbers as objects.
The definition of equinumerosity for finite pluralities is very similar to that of finiteness, but using the plural generalized ancestral in place of the plural ancestral. With the plural generalized ancestral we can form the predicate symbol $(\text {Succ}(xx_1,xx_2)\land \text {Succ}(yy_1,yy_2))^\ast _{\vec {xx},\vec {yy}}$ . We have that
holding means that the $ww$ are obtained from the $pp$ by adding in as many additional elements as it takes to obtain the $zz$ from the $qq$ . Abbreviating $(\text {Succ}(xx_1,xx_2)\land \text {Succ}(yy_1,yy_2))^\ast _{\vec {xx},\vec {yy}}$ by $\text {Eq-Add}$ , we define the binary predicate of equinumerosity, symbolized by “ $\approx $ ,” to be that satisfying:
It is easy to see that $\mathcal {D},v\vDash xx\approx yy$ iff $v(xx)$ are finite and there are as many $v(xx)$ as $v(yy)$ (in our metalanguage of plural set theory).
The introduction rules for the plural generalized ancestral, specialized to this predicate, state:
In other words, the introduction rules state that zilch is equinumerous with zilch, and that if the $xx$ are equinumerous with the $yy$ , and the $xx'$ and the $yy'$ are obtained from the $xx$ and the $yy$ respectively by the addition of a single object, then the $xx'$ are equinumerous with the $yy'$ . The elimination rule gives that for each open formula $\chi (xx_1,yy_1)$ , if:
• $\chi(\mathfrak{o},\mathfrak{o})$,
• Whenever $\chi(xx_1,yy_1)$, and the $xx_2$ and the $yy_2$ are obtained from the $xx_1$ and the $yy_1$ respectively by the addition of a single object, then $\chi(xx_2,yy_2)$,
then $xx\approx yy$ implies $\chi (xx,yy)$ . We will refer to this elimination rule as $\approx $ -induction.
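Before putting these rules to work, it may help to have a concrete picture of the intended relation. The following is an informal sketch in the metalanguage of finite sets (a Python toy, not part of the object language): two finite collections count as equinumerous exactly when one can strip a single element from each side, over and over, until both are exhausted together, mirroring the two introduction rules.

```python
def equinumerous(xx, yy):
    """There are just as many xx as yy, in the step-by-step sense above."""
    if not xx and not yy:
        return True                    # zilch is equinumerous with zilch
    if not xx or not yy:
        return False                   # one side runs out before the other
    x, y = next(iter(xx)), next(iter(yy))
    return equinumerous(xx - {x}, yy - {y})   # remove one object from each side

assert equinumerous(frozenset({1, 2, 3}), frozenset({"a", "b", "c"}))
assert not equinumerous(frozenset({1, 2}), frozenset({"a", "b", "c"}))
```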
We now use these rules to prove some standard properties of equinumerosity.
Proposition 1.
(i) If $xx\approx yy$ then the $xx$ are finite and the $yy$ are finite.
(ii) If the $xx$ are finite then $xx\approx xx$.
(iii) If $xx\approx yy$ then $yy\approx xx$.
Proof. (i) is immediate by $\approx $ -induction, and (ii) is immediate by induction on $xx$ . (iii) is immediate by $\approx $ -induction, with $\chi (xx_1,yy_1)$ being the formula $yy_1\approx xx_1$ .
For the following, recall the definition of weak equality $\equiv $ from near the end of Section 2 (equality where empty terms are counted as equal).
Proposition 2. If $xx\not \equiv \mathfrak {o}$ and $xx\approx yy$ then there are $xx'$ and $yy'$ such that $xx'\approx yy'$ , $\text {Succ}(xx',xx)$ and $\text {Succ}(yy',yy)$ .
Proof. Immediate by $\approx $ -induction.
Corollary 3. $xx\approx \mathfrak {o}$ iff $xx\equiv \mathfrak {o}$.
Proof. The “if” direction is just the first $\approx $ introduction rule. The “only if” direction is immediate from the preceding proposition.
Arguing that $\approx $ is transitive appears to require more work. First we prove a preliminary result. Say that $yy$ and $yy'$ differ in one element if there is a unique y amongst the $yy$ but not the $yy'$ , and a unique $y'$ amongst the $yy'$ but not amongst the $yy$ .
Proposition 4. If the $yy$ are finite, and $yy$ and $yy'$ differ in one element then $yy\approx yy'$ .
Proof. The proof is by induction on $yy$ . The conclusion is trivial for $yy\equiv \mathfrak {o}$ . Suppose it holds for $yy$ , and that we have $\text {Succ}(yy,zz)$ , with a unique x amongst the $zz$ but not the $yy$ ; and that we have that $zz'$ differs from $zz$ in one element, with a unique z amongst the $zz$ but not the $zz'$ and a unique $z'$ amongst the $zz'$ but not the $zz$ . If $x\neq z$ then we let $yy'$ consist of the same elements as $yy$ but with $z'$ added and z removed; thus $yy'$ differs from $yy$ in one element, so by the induction hypothesis we have $yy\approx yy'$ , but we have $\text {Succ}(yy,zz)$ and $\text {Succ}(yy',zz')$ by construction, and so $zz\approx zz'$ as required. If on the other hand $x=z$ then $\text {Succ}(yy,zz')$ and so $zz\approx zz'$ . Thus either way we are done.
Corollary 5. If the $yy$ are finite and $\text {Succ}(yy,zz)$ and $\text {Succ}(yy',zz)$ then $yy\approx yy'$ .
Proof. If the hypotheses hold then either $yy\equiv yy'$, or the $yy$ and the $yy'$ differ in one element, and so either way $yy\approx yy'$.
Proposition 6. If $xx\approx yy$ and $yy\approx zz$ then $xx\approx zz$ .
Proof. The proof is by induction on finite $xx$ that for all $yy$ and all $zz$ , if $xx\approx yy$ and $yy\approx zz$ then $xx\approx zz$ . This is true for $xx\equiv \mathfrak {o}$ . Suppose true for $xx$ , and that $\text {Succ}(xx,xx^\ast )$ , and let $xx^\ast \approx yy$ and $yy\approx zz$ . By Proposition 2, there are $xx'$ , $yy'$ , $yy"$ , and $zz'$ such that $xx'\approx yy'$ , $\text {Succ}(xx',xx^\ast )$ , $\text {Succ}(yy',yy)$ , $yy"\approx zz'$ , $\text {Succ}(yy",yy)$ and $\text {Succ}(zz',zz)$ . Then we also have $xx\approx xx'$ and $yy'\approx yy"$ by Corollary 5. Thus $xx\approx xx'\approx yy'\approx yy"\approx zz'$ , and so by the induction hypothesis $xx\approx zz'$ . Thus $xx^\ast \approx zz$ as required.
Thus $\approx $ is an equivalence relation, when each argument consists of finitely many things.
Proposition 7. If $\text {Succ}(xx,xx^\ast )$ and $\text {Succ}(yy,yy^\ast )$ then $xx\approx yy$ iff $xx^\ast \approx yy^\ast $ .
Proof. Suppose that $\text {Succ}(xx,xx^\ast )$ and $\text {Succ}(yy,yy^\ast )$ . The “only if” direction of the conclusion is just the second introduction rule for $\approx $ . For the converse, we have by Proposition 2 that if $xx^\ast \approx yy^\ast $ then there are $xx'$ and $yy'$ with $xx'\approx yy'$ , $\text {Succ}(xx',xx^\ast )$ and $\text {Succ}(yy',yy^\ast )$ . Then by Corollary 5 we have $xx\approx xx'$ and $yy\approx yy'$ , so since $\approx $ is an equivalence relation we have $xx\approx yy$ as required.
In these arguments for transitivity and related results we make a technical advance on the work of Martin [Reference Martin12], since he posits a non-logical axiom [Reference Martin12, IV, R6(2), p. 11] which, translated into our logic, states:
He used this axiom to derive transitivity [Reference Martin12, V, *T2.293 and *T2.294, p. 16]. However for us this fact follows from Proposition 7 and the fact that $\approx $ is an equivalence relation, and positing it as an axiom is unnecessary.
For the following discussion we introduce a predicate $\preceq $ for weak inclusion, where $s\preceq t$ is defined as an abbreviation for $(\neg\, E!(s))\lor s\preccurlyeq t$ —the version of inclusion where zilch is counted as included amongst everything.
We now prove a significant proposition.
Proposition 8. If $xx'\preceq xx$ and $xx'\approx xx$ then $xx'\equiv xx$ .
Proof. This is by induction on $xx$. The base case is trivial. For the induction step suppose the conclusion holds for $xx$ and we have $\text{Succ}(xx,yy)$, and that we have $yy'\preceq yy$ and $yy'\approx yy$. Let y be the unique object amongst $yy$ which is not amongst $xx$. If $y\preccurlyeq yy'$ then let $z=y$, otherwise let z be an arbitrary element of $yy'$. Then let $xx'$ be the objects of $yy'$ not equal to z, so that $\text{Succ}(xx',yy')$ and $xx'\preceq xx$. Then since $yy'\approx yy$, $\text{Succ}(xx,yy)$ and $\text{Succ}(xx',yy')$, by Proposition 7 we obtain $xx'\approx xx$. Thus the induction hypothesis gives $xx'\equiv xx$, so we must have been in the case where $z=y$ (if $z\neq y$ then every object amongst the $yy'$ would be amongst the $xx$, so z would be amongst the $xx$ but not amongst the $xx'$, contradicting $xx'\equiv xx$), and thus $yy'$ consists of $xx$ together with y, i.e., $yy'\equiv yy$ as required.
This is equivalent to saying that if $xx\preceq yy$ and $xx\not \equiv yy$ then $xx\not \approx yy$. In other words, the whole is greater than the part. This proposition is not an idle observation, and plays an essential role in some of the following proofs. We have here an important difference from the Cantorian conception of cardinality: for us the principle follows from the most natural characterization of equinumerosity in our logic, whereas one can only obtain this principle on the Cantorian conception if one arbitrarily restricts oneself to finite concepts (which is a complication rather than a simplification of the theory). Commitment to this proposition was one of the main roadblocks to acceptance of the Cantorian conception, and on our approach we can appreciate why the principle commanded that commitment, its value being illustrated by its role in the following proofs.
Proposition 9. If $xx'\preceq xx$ and $xx\approx zz$ then there are $zz'\preceq zz$ such that $xx'\approx zz'$ .
Proof. We prove this by induction on $xx$. The base case is trivial. Suppose the conclusion holds for $xx$ and we have $\text{Succ}(xx,yy)$, $yy'\preceq yy$ and $yy\approx zz$. Then $zz\not\equiv \mathfrak {o}$ and we can let $z\preccurlyeq zz$, and let $ww$ be the objects of $zz$ not equal to z. Then $\text{Succ}(ww, zz)$, so $xx\approx ww$ by Proposition 7. Let y be the unique object amongst $yy$ but not $xx$. If $y\not \preccurlyeq yy'$ then we are done by the induction hypothesis. Otherwise let $xx'$ be $yy'$ but with y removed; then the induction hypothesis gives $ww'\preceq ww$ with $xx'\approx ww'$, and thus $yy'\approx zz'$ where $zz'$ are $ww'$ but with z added, so that $zz'\preceq zz$ as required.
We now introduce a predicate $\lesssim $ , where $xx\lesssim yy$ is to be read “there are at least as many $yy$ as $xx$ ,” and is defined to hold if $\exists zz (xx\approx zz\land zz\preceq yy)$ .
Proposition 10.
(i) If $xx\lesssim yy$ then the $xx$ are finite.
(ii) If the $xx$ are finite then $xx\lesssim xx$.
(iii) If $xx\lesssim yy$ and $yy\lesssim zz$ then $xx\lesssim zz$.
(iv) If $xx\lesssim yy$ and $yy\lesssim xx$ then $xx\approx yy$.
Proof. (i) and (ii) are immediate by Proposition 1.
For (iii), if $xx\lesssim yy$ and $yy\lesssim zz$ then there are $yy'$ and $zz'$ such that $xx\approx yy'\preceq yy$, and $yy\approx zz'\preceq zz$; then by Proposition 9 there are $zz"\preceq zz'$ such that $yy'\approx zz"$, and thus $xx\approx zz"\preceq zz'\preceq zz$ as required.
For (iv), if $xx\lesssim yy$ and $yy\lesssim xx$ then there are $yy'\preceq yy$ such that $xx\approx yy'$, and $xx'\preceq xx$ such that $yy\approx xx'$. Then by Proposition 9 there are $xx"\preceq xx'$ such that $yy'\approx xx"$, and thus $xx\approx yy'\approx xx"$ and so $xx\equiv xx"$ by Proposition 8. Thus $xx\preceq xx'\preceq xx$, and so $xx\equiv xx'$, and so $yy\approx xx$ as required.
Property (i) may seem odd, but it is not unnatural in a context where we can only make sense of equinumerosity for finite arguments. One could read $xx\lesssim yy$ as “the $xx$ are finite and there are at least as many $yy$ as $xx$” if one preferred.
Now we introduce a predicate $\lnsim $ , where $xx\lnsim yy$ is to be read “there are strictly more $yy$ than $xx$ ,” and is defined to hold if $\exists zz(xx\approx zz\land zz\preceq yy \land zz\not \equiv yy)$ . This is equivalent to $\exists ww \,\exists zz(\text {Succ}(xx,ww)\land (ww\approx zz)\land (zz\preceq yy))$ .
Proposition 11.
(i) If $xx\lnsim yy$ then the $xx$ are finite.
(ii) $\neg\, xx\lnsim xx$.
(iii) If $xx\lnsim yy$ and $yy\lesssim zz$ then $xx\lnsim zz$.
(iv) If $xx\lesssim yy$ and $yy\lnsim zz$ then $xx\lnsim zz$.
(v) If $xx\lnsim yy$ then $xx\not \approx yy$.
(vi) If $xx\lnsim yy$ then $yy\not \lesssim xx$.
Proof. (i) is again immediate, and (ii) is immediate by Proposition 8. (iii) and (iv) are similar to (iii) of Proposition 10. For (v), suppose that $xx\lnsim yy$ . Then there are $yy'$ such that $xx\approx yy'$ , $yy'\preceq yy$ and $yy'\not \equiv yy$ . Thus by Proposition 8, $yy'\not \approx yy$ , so since $xx\approx yy'$ we can’t have $xx\approx yy$ . (vi) follows from (v) since if $xx\lnsim yy$ and $yy\lesssim xx$ then $xx\approx yy$ by Proposition 10 (iv).
We can also prove a theorem scheme showing, in effect, that finite pluralities with a definable bijection between them are equinumerous.
Proposition 12. Let $x,y$ be distinct singular variables, let $xx$ be a plural variable, and let $\alpha ,\beta ,\phi $ be formulae such that y and $xx$ are not free in $\alpha $ , x is not free in $\beta $ , and $xx$ is not free in $\phi $ . Let $\Gamma $ be a set of formulae in which $xx$ is not free. Suppose that $\Gamma $ proves $\forall x\,(\alpha (x)\rightarrow \exists ! y(\beta (y)\land \phi (x,y)))$ , and proves $\forall y\,(\beta (y)\rightarrow \exists ! x(\alpha (x)\land \phi (x,y)))$ . Then $\Gamma $ proves:
Proof. The proof is immediate by induction on $xx$ .
Thus if we expand our logic to include binary relation variables and quantifiers, then we can prove that if the $xx$ are finite and R is a bijection between the $xx$ and the $yy$ then $xx\approx yy$ . The converse, that if $xx\approx yy$ then there is a bijection R between the $xx$ and the $yy$ , is also straightforward. Thus it is a theorem for us that the bijection characterization of equinumerosity, specialized to finite pluralities, is materially correct (in the presence of polyadic second-order logic).
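As an informal illustration of the converse direction (again a Python toy in the metalanguage of finite sets, with invented names): given two collections with just as many members, a pairing between them can be built by repeatedly pairing off one member from each side.

```python
def pair_up(xx, yy):
    """Pair off the members of two equally sized finite sets, one from each side at a time."""
    xx, yy = set(xx), set(yy)
    pairing = {}
    while xx and yy:
        pairing[xx.pop()] = yy.pop()
    assert not xx and not yy, "the two sides were not equinumerous"
    return pairing

assert len(pair_up({"Ann", "Bo", "Cem"}, {101, 102, 103})) == 3
```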
Given these facts about equinumerosity and its relatives, we can argue that certain standard practical properties of equinumerosity do hold. First, we can argue for the validity of counting as a way of determining equinumerosity. Like Linnebo [Reference Linnebo11, chap. 10] we will just take a system of numerals to be a discrete linear order (this is all the structure we need). If j is a numeral in a given system, we will write $[\leqslant j]$ for the numerals at most j. Then a fundamental result about numerals is the following.
Proposition 13. If j and $j'$ are numerals in a given system of numerals, and $[\leqslant j]\approx [\leqslant j']$ , then $j=j'$ .
Proof. Suppose for contradiction that $j\neq j'$; without loss of generality $j'<j$. Then $[\leqslant j']\preccurlyeq [\leqslant j]$ and $[\leqslant j']\not \equiv [\leqslant j]$ (these pluralities being finite, by Proposition 1), so $[\leqslant j']\not \approx [\leqslant j]$ by Proposition 8, contradicting the hypothesis.
To discuss counting we will quantify over acts of counting, by which we mean an attempted assignment of numerals to objects.Footnote 16 If a is an act of counting and i a numeral, we will write $\text {Assign}(a,i,x)$ to signify that i is assigned to object x by a. We will say that an act of counting a enumerates the $xx$ via I if I is an initial segment of numerals such that if $\text {Assign}(a,i,x)$ then $i\in I$, and such that the relation $\text {Assign}(a,{\text {-}},{\text {-}})$, restricted to the numerals in I, gives a bijection between I and the $xx$. We will write $\text {Count}(a,j,xx)$ if a enumerates the $xx$ via $[\leqslant j]$. Then the basic principle behind counting as a way of determining equinumerosity is the following.
Proposition 14. Suppose the $xx$ and the $yy$ are finite, and that we have $\text {Count}(a,j,xx)$ and $\text {Count}(b,k,yy)$ . Then $j=k$ iff $xx\approx yy$ .
Proof. The function $\text {Assign}(a,{\text {-}},{\text {-}})$ gives a bijection between $[\leqslant j]$ and $xx$ , and the function $\text {Assign}(b,{\text {-}},{\text {-}})$ gives a bijection between $[\leqslant k]$ and $yy$ . Thus by Proposition 12 we have $[\leqslant j]\approx xx$ and $[\leqslant k]\approx yy$ , and in particular $[\leqslant j]$ and $[\leqslant k]$ are finite. Thus $xx\approx yy$ iff $[\leqslant j]\approx [\leqslant k]$ iff $j=k$ , by Proposition 13.
As a special case of this we have by taking $yy=xx$ that if the $xx$ are finite, and a and b are two acts of counting that enumerate the $xx$ , then the largest numerals reached during a and b are the same: counting the same things twice gives the same answer.
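For a concrete toy illustration of Proposition 14 (in the metalanguage; numerals are modelled as the positive integers in their usual order, and the names are invented):

```python
def count(objects):
    """An act of counting: assign the numerals 1, 2, 3, ... to the objects in some order."""
    return {i: obj for i, obj in enumerate(objects, start=1)}

def last_numeral(act):
    return max(act, default=0)

cups = ["red cup", "blue cup", "green cup"]
saucers = ["saucer A", "saucer B", "saucer C"]
# The same last numeral is reached iff there are just as many cups as saucers.
assert last_numeral(count(cups)) == last_numeral(count(saucers))
```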
As a final practical example we consider Heck’s case of a child assigning cookies to children [Reference Heck8, p. 171]: in their view, to have a basic grasp of the notion of “at least as many,” one needs to understand that if there are at least as many cookies as children, then if you give each child one cookie you’ll have enough cookies for everyone to receive one (and similarly for “as many as” and “more than”). We can see that on our definition of “at least as many” this fact does indeed follow. Let us consider here an act b of assigning cookies to children, where we require that during b at most one cookie goes to each child, and cookies only stop being given to children if either every child has a cookie, or the cookies have run out. Let us suppose we have finitely many children $xx$ and at least as many cookies $cc$ as children. We will write $g(b,c,x)$ if under b, cookie c is assigned to child x. By assumption we have that if $g(b,c,x)$ and $g(b,c',x)$ then $c=c'$, and have that either $\forall c \,\exists x \,g(b,c,x)$ or $\forall x\, \exists c \,g(b,c,x)$. Suppose for contradiction that we end up with a child with no cookie. Then $\exists x\,\forall c\,\neg\, g(b,c,x)$, and so by our second assumption $\forall c \,\exists x \,g(b,c,x)$. Thus g defines a bijection from cookies $cc$ to some $xx'$ of the children, and by assumption $xx'\not \equiv xx$. Then Proposition 12 gives $cc\approx xx'$, and thus $cc\lnsim xx$, and thus $xx\not \lesssim cc$ by Proposition 11 (vi), giving us our required contradiction.
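The following sketch runs this cookie case concretely (a Python toy with invented names; the stopping rule matches the one just described):

```python
def hand_out(cookies, children):
    """Give at most one cookie to each child, stopping only when every child has one or the cookies run out."""
    assignment = {}
    remaining = list(cookies)
    for child in children:
        if not remaining:
            break                        # the cookies have run out
        assignment[child] = remaining.pop()
    return assignment

children = ["Ida", "Jun", "Kai"]
cookies = ["c1", "c2", "c3", "c4"]       # at least as many cookies as children
assignment = hand_out(cookies, children)
assert all(child in assignment for child in children)   # everyone received one
```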
As with finiteness, this account of equinumerosity has great plausibility as a characterization of what is ordinarily meant by the term—and is more plausible than the Neo-Fregean alternative. We do not attribute to ordinary people an innate grasp of second-order logic, and instead use the plural generalized ancestral, which as seen in Section 3 is a concept forming operator that ordinary people grasp at least some instances of.
Recalling the informal characterization of the plural generalized ancestral from Section 3, that characterization when specialized to this predicate states that $yy\approx zz$ iff both are zilch; or both are obtained from zilch by the addition of a single element; or there are $uu$ and $vv$ such that $uu$ and $vv$ are obtained from zilch by the addition of a single element, and $yy$ and $zz$ are obtained from $uu$ and $vv$ respectively by the addition of a single element; or there are $uu$ and $vv$ and $uu'$ and $vv'$ such that $uu$ and $vv$ are obtained from zilch by the addition of a single element, and the $uu'$ and $vv'$ are obtained from the $uu$ and $vv$ respectively by the addition of a single element, and the $yy$ and the $zz$ are obtained from the $uu'$ and $vv'$ respectively by the addition of a single element; and so on. The predicate “equinumerous” applies to arguments that can be reached from zilch by iterating the “add a single object to each side” operation (again, it being implicit that this is finite iteration)—a very natural way to state what we ordinarily mean by equinumerosity (before being introduced to Cantor’s ideas, at least). Being thus informally characterizable in terms of iteration, this account of equinumerosity has an ordinal flavour, on the understanding of ordinals as measures of iteration.
The introduction rules for the plural generalized ancestral, specialized to this predicate, are also very plausible as a partial characterization of what is ordinarily meant by equinumerosity: they state that zilch is equinumerous with zilch (or that no things are equinumerous with no things), and that if the $xx$ are equinumerous with the $yy$ , and the $xx'$ and the $yy'$ are obtained from the $xx$ and the $yy$ respectively by addition of a single object, then the $xx'$ are equinumerous with the $yy'$ .
These are closely related to the first two of the three principles that Heck thinks are constitutive of a basic grasp of equinumerosity [Reference Heck8, p. 170]. Heck states these principles using quantification over concepts (which can be thought of as formalized using unary second-order variables) instead of our use of plural logic, and uses the relation “ $\text {JAM}_x(Fx,Gx)$ ” to formalize that there are just as many $Fs$ as $Gs$ , which is the equivalent of our $x:Fx\approx x:Gx$ . Heck’s three principles are:
Heck’s first principle $ {\textbf {ZCE}}^\ast $ can be split into two claims:
the former of which is the equivalent in their logic of our first introduction rule $\approx $-$\text {I}_1$, and the latter the equivalent of a consequence of our definition, as seen in Corollary 3. Heck’s second principle $ {\textbf {APC}}^\ast $ is the equivalent in their logic of our second introduction rule $\approx $-$\text {I}_2$. Their third principle $ {\textbf {RPC}}^\ast $ is the equivalent in their logic of our result Proposition 7.
As was the case with finiteness, the status of the elimination rule is more debatable. It gives that for any definable relation R, if that relation holds between zilch and zilch, and the relation holding between $xx$ and $yy$ implies that it holds between $xx'$ and $yy'$ whenever $xx'$ and $yy'$ are obtained from $xx$ and $yy$ respectively by the addition of a single object, then R holds between $xx$ and $yy$ whenever $xx\approx yy$. As an example of this, suppose we sought to argue that whenever there are as many people $xx$ as people $yy$ then the $xx$ and the $yy$ can be lined up facing each other, with each person in each line directly opposite one person in the other line. This is true (in a trivial sense) whenever there are no $xx$ and no $yy$. If it’s true for $xx$ and $yy$, and the $xx'$ are the $xx$ with extra person u, and the $yy'$ are the $yy$ with extra person v, then the $xx'$ and the $yy'$ can be lined up facing each other, with u and v facing each other at one end. As with the case of finiteness, this is probably a kind of argument few ordinary people would spontaneously make (even if persuaded to consider the question seriously). They might be more likely to give a more informal version with essentially the same content, arguing that the conclusion held when the $xx$ were one person and the $yy$ were one person, and the conclusion still holds if you add one person to each, and add another, and so on. As with the case of finiteness, it seems about as reasonable to take a grasp of the rules governing $\approx $ to be implicit in an ordinary person’s understanding of the concept as it does to take the natural deduction rules for “ $\forall $ ” to be implicit in an ordinary person’s understanding of the concept.
Indeed as we have seen, these definitions of equinumerosity and the related predicates “at least as many as” and “strictly more than” allow the derivation of all their standard properties, including properties like reflexivity and transitivity (Propositions 1, 6, and 10), the fact that the part is not equinumerous with the whole (Proposition 8), the ability to determine equinumerosity by counting (Proposition 14), and the facts required for Heck’s example of the cookies and the children.
The definitions also do not overreach, in an important sense. They all come with a built in restriction to finite concepts. The only obvious way to extend the notion of equinumerosity as we have defined it here to the non-finite case is to declare all infinite pluralities to be equinumerous—which is more or less what the layman’s notion of cardinality appears to amount to (if they allow cardinality comparisons of infinite pluralities at all). We can derive a theorem scheme showing that pluralities related by a definable bijection are equinumerous (Proposition 12), but this is just one derived result about equinumerosity amongst many, and no more fundamental than many others. Indeed it is not obviously more fundamental than the result that the whole is greater than the part (Proposition 8). Thus on our approach we can very well see why Cantor’s definition of equinumerosity via bijections for infinite concepts was such a conceptual leap: it required taking one derived property of equinumerosity as basic, and as having priority over other properties that were (apparently) just as basic. This represents a major advantage of our approach over Neo-Fregeanism.
6 Arithmetic operations
We now give definitions of the $xx$ being one greater in size than the $yy$ , or the same size as the sizes of the $yy$ and the $zz$ added or multiplied, and prove some basic properties of these concepts.
First we define $S(xx,yy)$ to hold if there are $yy'$ such that $xx\approx yy'$ and $\text {Succ}(yy',yy)$ . This predicate states that the $yy$ are one greater in size than the $xx$ .
Proposition 15.
(i) If $S(xx,yy)$ then the $xx$ and the $yy$ are finite.
(ii) If $xx\approx xx_2$ and $yy\approx yy_2$ then $S(xx,yy)$ iff $S(xx_2,yy_2)$.
(iii) If $S(xx,yy)$ and $S(xx,yy_2)$ then $yy\approx yy_2$.
(iv) If $S(xx,yy)$ and $S(xx_2,yy)$ then $xx\approx xx_2$.
Proof. (i) is immediate.
For (ii), it is clear that if $xx\approx xx_2$ then $S(xx,yy)$ iff $S(xx_2,yy)$, so we need only show that if $yy\approx yy_2$ then $S(xx,yy)$ iff $S(xx,yy_2)$. Thus it suffices to assume that $yy\approx yy_2$ and that $S(xx,yy)$ and argue that $S(xx,yy_2)$. But on these hypotheses there are $yy'$ with $xx\approx yy'$ and $\text {Succ}(yy',yy)$, and thus we can find $yy_2'$ with $\text {Succ}(yy_2',yy_2)$, and then $yy_2'\approx yy'$ by Proposition 7, and so $yy_2'\approx xx$ and we are done.
For (iii), we have by assumption that there are $yy'$ and $yy_2'$ such that $xx\approx yy'$, $xx\approx yy_2'$, $\text {Succ}(yy',yy)$ and $\text {Succ}(yy_2',yy_2)$. Thus $yy'\approx yy_2'$ so we are done by Proposition 7.
For (iv), we have by assumption that there are $yy'$ and $yy"$ with $xx\approx yy'$ , $xx_2\approx yy"$ , $\text {Succ}(yy',yy)$ and $\text {Succ}(yy",yy)$ . Thus $yy'\approx yy"$ by Proposition 7 so $xx\approx xx_2$ .
(ii) here gives that S is a well-defined relation up to $\approx $ -equivalence. With (iii), we have that S is a well-defined partial function up to $\approx $ -equivalence, and with (iv), that it is a well-defined injective partial function up to $\approx $ -equivalence.
For addition, we use the predicate $\text {Eq-Add}$ , which was defined at the start of Section 5 as an abbreviation for $(\text {Succ}(xx_1,xx_2)\land \text {Succ}(yy_1,yy_2))^\ast _{\vec {xx},\vec {yy}}$ . We define $+(xx,yy,zz)$ to hold if there are $xx'$ with $xx'\approx xx$ and such that $\text {Eq-Add}(\mathfrak {o},xx',yy,zz)$ . In other words, $zz$ can be obtained from $xx'$ by adding as many elements as it takes to obtain $yy$ from zilch, i.e., as many elements as there are in $yy$ .
An alternative definition of addition states that $+'(xx,yy,zz)$ means that there are $xx'$ and $yy'$ with $xx\approx xx'$, $yy\approx yy'$, and the $zz$ the disjoint union of the $xx'$ and the $yy'$. It is not difficult to see that this is equivalent to our definition (though we will not prove that).
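As a rough metalanguage illustration of this alternative reading (a Python toy; the tags are just one arbitrary way of keeping the two copies disjoint):

```python
def sum_plurality(xx, yy):
    """Something of the size of xx plus the size of yy: a copy of xx together with a disjoint copy of yy."""
    return {("left", x) for x in xx} | {("right", y) for y in yy}

zz = sum_plurality({1, 2}, {"a", "b", "c"})
assert len(zz) == len({1, 2}) + len({"a", "b", "c"})
```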
Proposition 16.
(i) If $+(xx,yy,zz)$ then the $xx$, the $yy$ and the $zz$ are finite.
(ii) $+(xx,\mathfrak {o},zz)$ iff $xx\approx zz$.
(iii) If $yy\not \equiv \mathfrak {o}$ and $+(xx,yy,zz)$ then there are $yy'$ and $zz'$ with $\text {Succ}(yy',yy)$ and $\text {Succ}(zz',zz)$ and $+(xx,yy',zz')$.
(iv) If $xx\approx xx_2$ then $+(xx,yy,zz)$ iff $+(xx_2,yy,zz)$.
(v) If $yy\approx yy_2$ then $+(xx,yy,zz)$ iff $+(xx,yy_2,zz)$.
(vi) If $zz\approx zz_2$ then $+(xx,yy,zz)$ iff $+(xx,yy,zz_2)$.
(vii) If $+(xx,yy,zz)$ and $+(xx,yy,zz_2)$ then $zz\approx zz_2$.
(viii) If $S(yy,yy^\ast )$ and $+(xx,yy,zz)$ then $S(zz,zz^\ast )$ iff $+(xx,yy^\ast ,zz^\ast )$.
Proof. (i) is trivial for $xx$ , and an immediate $\text {Eq-Add}$ -induction for $yy$ and $zz$ .
For (ii), we obtain by Eq-Add-induction that $\text{Eq-Add}(xx_1,yy_1,xx_2,yy_2)$ implies that if $xx_1\approx xx_2$ then $yy_1\approx yy_2$. Thus $\text{Eq-Add}(\mathfrak{o},xx',\mathfrak{o},zz)$ implies $xx'\approx zz$, so $+(xx,\mathfrak{o},zz)$ implies $xx\approx zz$. The converse is easy.
For (iii) we obtain by Eq-Add-Induction that if $\text{Eq-Add}(uu,xx,yy,zz)$ with $uu\equiv\mathfrak{o}$ and $yy\not\equiv\mathfrak{o}$ then:
(iv) is immediate.
For (v) we prove by induction on $yy$ that if $yy\approx yy_2$ and $+(xx,yy_2,zz)$ then $+(xx,yy,zz)$ . The base case where $yy\equiv \mathfrak {o}$ is trivial. For the induction step, we suppose the conclusion holds for $yy$ , and that we have $\text {Succ}(yy,yy^\ast )$ and $yy^\ast \approx yy_2^\ast $ with $+(xx,yy_2^\ast ,zz^\ast )$ . Then by (iii) there are $yy_2$ and $zz$ with $\text {Succ}(yy_2,yy_2^\ast )$ and $\text {Succ}(zz,zz^\ast )$ and $+(xx,yy_2,zz)$ , so that $yy\approx yy_2$ and thus by the induction hypothesis $+(xx,yy,zz)$ and so $+(xx,yy^\ast ,zz^\ast )$ as required.
For (vi) we prove by induction on $yy$ that if $zz\approx zz_2$ and $+(xx,yy,zz)$ then $+(xx,yy,zz_2)$ . The base case where $yy\equiv \mathfrak {o}$ follows from (ii). For the induction step, we suppose we have $yy$ for which the induction hypothesis holds, and that $\text {Succ}(yy,yy^\ast )$ , with $zz$ , $zz_2$ such that $zz\approx zz_2$ and $+(xx,yy^\ast ,zz)$ . Then by (iii) there are $yy'$ and $zz'$ such that $\text {Succ}(yy',yy^\ast )$ and $\text {Succ}(zz',zz)$ and $+(xx,yy',zz')$ . Thus $yy'\approx yy$ , so by (v) we have $+(xx,yy,zz')$ . Then if we let $zz_2'$ be $zz_2$ but with one element removed then $\text {Succ}(zz_2',zz_2)$ , and so $zz_2'\approx zz'$ , and thus by the induction hypothesis $+(xx,yy,zz_2')$ , and thus $+(xx,yy^\ast ,zz_2)$ as required.
(vii) is straightforward by induction on $yy$ , using (iii). The “if” direction of (viii) follows from (iii), (v) and (vii), and the “only if” direction is easy.
Thus we obtain that $+$ is a well-defined partial binary operation in its first two arguments, up to $\approx $ -equivalence. Properties (ii) and (viii) parallel the axioms governing addition in Peano arithmetic.
Finally, we treat multiplication. We define $\times (xx,yy,zz)$ to hold if
In other words, this means that the $zz$ are obtained from zilch by repeatedly adding as many elements as there are in the $xx$, once for each element of the $yy$.
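In the same spirit as the earlier sketches, a rough metalanguage picture of multiplication (a Python toy): starting from zilch, add a fresh block equinumerous with the $xx$, once for each element of the $yy$.

```python
def product_plurality(xx, yy):
    """Starting from nothing, add a fresh copy of xx for each element of yy."""
    zz = set()
    for y in yy:                       # one round per element of yy
        zz |= {(y, x) for x in xx}     # a fresh block the size of xx
    return zz

zz = product_plurality({1, 2, 3}, {"a", "b"})
assert len(zz) == len({1, 2, 3}) * len({"a", "b"})
```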
Proposition 17.
(i) If $\times (xx,yy,zz)$ then the $xx$, the $yy$ and the $zz$ are finite.
(ii) If the $xx$ are finite then $\times (xx,\mathfrak {o},zz)$ iff $zz\equiv \mathfrak {o}$.
(iii) If $yy\not \equiv \mathfrak {o}$ and $\times (xx,yy,zz)$ then there are $yy'$ and $zz'$ with $\text {Succ}(yy',yy)$ and $+(zz',xx,zz)$ and $\times (xx,yy',zz')$.
(iv) If $xx\approx xx_2$ then $\times (xx,yy,zz)$ iff $\times (xx_2,yy,zz)$.
(v) If $yy\approx yy_2$ then $\times (xx,yy,zz)$ iff $\times (xx,yy_2,zz)$.
(vi) If $zz\approx zz_2$ then $\times (xx,yy,zz)$ iff $\times (xx,yy,zz_2)$.
(vii) If $\times (xx,yy,zz)$ and $\times (xx,yy,zz_2)$ then $zz\approx zz_2$.
(viii) If $S(yy,yy^\ast )$ and $\times (xx,yy,zz)$ then $+(zz,xx,zz^\ast )$ iff $\times (xx,yy^\ast ,zz^\ast )$.
Proof. The proofs of (i)–(iii) are similar to those of (i)–(iii) from Proposition 16. (iv) is clear from Proposition 16 (iv). The proof of (v) is similar to that of Proposition 16 (v). The proof of (vi) is similar to that of Proposition 16 (vi) (induction on $yy$ , with a simpler induction step than before). (vii) follows by an easy induction on $yy$ . The “if” direction of (viii) follows from (iii), (v) and (vii), and the “only if” direction is easy.
Thus again $\times $ is a well-defined partial binary operation in its first two arguments up to $\approx $ -equivalence. Properties (ii) and (viii) parallel the two Peano axioms governing multiplication.
One can define exponentiation in a similar way, and continue to further hyperoperations if one desires.
7 Numbers
We have seen how in our setting we can obtain many arithmetic concepts, and many of their basic properties, without relying on numbers as objects. It would be interesting to see how much of arithmetic can be obtained on this basis (for instance on the adjectival interpretation), but our priority is instead to introduce numbers as objects via an abstraction principle.
Our preferred option is to use a predicative abstraction principle. Linnebo [Reference Linnebo11] ably defends predicative abstraction principles as a way of obtaining reference to a new domain of entities, by laying down a criterion of identity for those entities, and thus establishing how such an entity can be reidentified, encountered again under different circumstances—an ability fundamental to a notion of reference to such entities [Reference Linnebo11, chap. 2]. Using a predicative rather than impredicative abstraction principle means that the criterion of identity is stated in terms of language that is antecedently understood, and there is no question of it being circular—with the holding of an identity between the new entities depending on other facts about them (including, perhaps, other instances of identities between them). It also means that truth conditions for statements about the new domain of objects can be given in our existing language, which is not the case for impredicative abstraction principles in general [Reference Linnebo11, chap. 6]. Additionally it avoids the bad company problem [Reference Linnebo11, chap. 3], the problem that many natural impredicative abstraction principles are inconsistent or have unwelcome consequences.
To state our abstraction principle we move to a two-typed logic, with one type $\mathbb {U}$ consisting of the non-arithmetical objects, and the other $\mathbb {N}$ the type of numbers introduced by the principle (this is what makes the principle predicative). As noted at the end of the Appendix it is straightforward to accommodate multiple types in our base plural logic—and the addition of ancestral operators does not change this. We will write $x,y,xx,yy,\ldots $ for the variables of type $\mathbb {U}$ , and write $n,m,p,\ldots $ for singular variables of type $\mathbb {N}$ . For our abstraction principle we introduce a unary function symbol N, with arity $\mathbb {U}\to \mathbb {N}$ . Then our abstraction principle states:
$$(*)\qquad \forall xx\,\forall yy\,\big(N(xx)=N(yy)\leftrightarrow xx\approx yy\big).$$
In other words, the number of $xx$ is the same as the number of $yy$ iff the $xx$ and the $yy$ are equinumerous—clearly basic to what we mean by number and equinumerosity (so basic that we had to distinguish sameness of number from our notion of equinumerosity at the start of Section 5). Since numbers are the things obtained by this principle—this principle being constitutive of what identity between numbers means—we also state that $\forall n\,\exists xx\,(N(xx)=n)$ .
There are actually difficult questions about whether an abstraction principle like this, with singular denotation of the new objects obtained via plural denotation in the existing domain, should license plural denotation to the new entities,Footnote 17 but we will not need plural quantifiers over numbers, just using our background logic for its provision for co-partial functions and definite descriptions.
It follows from (*) that N is a co-partial function, since $\mathfrak {o}\approx \mathfrak {o}$ and so $N(\mathfrak {o})=N(\mathfrak {o})$ , and in particular $E!(N(\mathfrak {o}))$ . We introduce the symbol 0 for $N(\mathfrak {o})$ . In general N is also a partial function, since if $\neg\, \text {Finite}(xx)$ then $\neg\, xx\approx xx$ , in which case (*) gives $\neg\, N(xx)=N(xx)$ , from which $\neg\, E!(N(xx))$ follows in our logic—in other words, if the $xx$ are infinite, then $N(xx)$ is an empty term. This tallies with the ordinary conception of cardinality.
If $k,l$ are terms of type $\mathbb {N}$ , we write $S(k)$ for the term
and similarly $+(k,l)$ for the term
and $\times (k,l)$ similarly.
Proposition 18.
(i) $E!(S(N(xx)))$ iff there are $yy$ with $S(xx,yy)$.
(ii) $E!(+(N(xx),N(yy)))$ iff there are $zz$ with $+(xx,yy,zz)$.
(iii) $+(k,0)\equiv k$.
(iv) $+(k,S(l))\equiv S(+(k,l))$.
(v) $E!(\times (N(xx),N(yy)))$ iff there are $zz$ with $\times (xx,yy,zz)$.
(vi) If k exists then $\times (k,0)\equiv 0$.
(vii) $\times (k,S(l))\equiv +(\times (k,l),k)$.
Proof. (i) follows from Proposition 15, (ii)–(iv) follow from Proposition 16, and (v)–(vii) follow from Proposition 17.
Thus $+$ and $\times $ behave like the normal arithmetic operations, with the proviso that they may not be total.
We can also prove the validity of an induction scheme for numbers.
Proposition 19. Suppose that $\Gamma \vdash \phi [0|n]$ and $\Gamma \vdash \forall n\,(\phi \rightarrow \phi [S(n)|n])$ . Then $\Gamma \vdash \forall n\,\phi $ .
Proof. On these hypotheses we obtain by the induction rule for finiteness, with induction hypothesis $\phi [N(xx)|n]$ , that
giving the result.
This is broader than the PA induction scheme, holding when $\Gamma $ and $\phi $ include any vocabulary available in our language (with its two types).
Thus we obtain an interpretation of all of PA, except without the assurance that S, $+$ , and $\times $ are total functions. What is needed is that S is total, since then totality of addition follows from (iii) and (iv) of Proposition 18 by induction, and totality of multiplication follows in turn by (vi) and (vii) of Proposition 18 by induction.
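To spell out the first of these inductions (a sketch, assuming S is total and fixing a number k with $E!(k)$), the base case and induction step are:
$$+(k,0)\equiv k\ \text{ (so } E!(+(k,0))\text{)},\qquad E!(+(k,n))\ \rightarrow\ E!(S(+(k,n)))\ \rightarrow\ E!(+(k,S(n))),$$
with the last step given by (iv) of Proposition 18; Proposition 19 then yields $\forall n\, E!(+(k,n))$, and the induction for multiplication is exactly parallel, using (vi) and (vii).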
We outline two possible routes to knowledge that S is total. First, there is the simple possibility that there are infinitely many things, and that this is known—perhaps empirically—to be the case. In our context we can state a very simple axiom of infinity: that there are $xx$ such that $\neg\, \text {Finite}(xx)$. It is immediate by induction on $xx$ that if the $xx$ are finite and $yy\preceq xx$ then the $yy$ are finite, so that this axiom of infinity is equivalent to the statement that the things that there are are not collectively finite, or in symbols, $\neg\, \text {Finite}(x{:}\,x=x)$. Thus given the axiom of infinity, if the $xx$ are finite then they are not all the objects that there are, so there is x with $x\not \preccurlyeq xx$, and thus if we obtain $xx^\ast $ by adding x to the $xx$ then $\text {Succ}(xx,xx^\ast )$. It follows that the function S is total.
For example, one could obtain this result from empirical facts about space-time (if the world co-operated). If there were a world line w such that for any finitely many points on the world line, there was a strictly later point in terms of w’s proper time, then it would follow that
giving us our axiom of infinity.
Our second route is modal. It is very plausible to think that whenever the $xx$ are finite, there could have been an x not amongst them.Footnote 18 Symbolically,
We will give an informal sketch of an argument from this principle that S is total.Footnote 19 We can argue intuitively that, given (M), it should be the case that from $m\!=\!N(xx)$ it follows that $\Diamond (m\!=\!N(xx)\land \exists yy\,\text {Succ}(xx,yy))$ , and thus that from $E!(m)$ it follows that $\Diamond (\exists n\,n\!=\!S(m))$ . Then since arithmetic facts are necessary if they hold, this in turn implies that $\Diamond \Box (\exists n\,n\!=\!S(m))$ , and thus by the principles of S5 that $\exists n\,n\!=\!S(m)$ . Thus, $\forall m\,\exists n\,n\!=\!S(m)$ .
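In display form, the chain of implications sketched here is
$$E!(m)\ \rightarrow\ \Diamond\,\exists n\,(n=S(m))\ \rightarrow\ \Diamond\Box\,\exists n\,(n=S(m))\ \rightarrow\ \exists n\,(n=S(m)),$$
with the first step given by (M) (via $m=N(xx)$ for some finite $xx$), the second by the necessity of arithmetic facts, and the third by the S5 principle that $\Diamond \Box \psi \rightarrow \psi $.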
On either approach, we have a plausible account of arithmetic knowledge. Much of arithmetic consists of what we could call “conceptual” truths: either purely logical, of the form seen in Sections 5 and 6, or following from logic together with the abstraction principle (*)—which can be seen as an innocent semantic fact, of a similar status to the fact that all bachelors are unmarried, and justified metasemantically by Linnebo’s account of reference via criteria of identity. Many of the facts from Sections 5 and 6 were simple to derive, so that even though ordinary people would not consciously know how to derive them one can plausibly see knowledge of them as given by an implicit grasp of how the concepts work (similar to a truth such as $\exists x\forall y\phi \rightarrow \forall y\exists x \phi $ ). Other facts, such as Proposition 6, required more work, and these we can view as logical truths the layman establishes by inductive reasoning, seeing that they hold in every case they consider. Finally, some of arithmetic relies on the fact that S is a total function: this can be derived from the fact that there are infinitely many things, which might be known empirically, or can otherwise be derived from highly plausible modal principles (modal principles which a layman might intuitively appeal to, even if they cannot carry out a formal derivation themselves).
Finally we briefly note that one can otherwise introduce numbers via an impredicative abstraction principle, if one prefers to stick more closely to the Neo-Fregean route—though predicative abstraction principles have many advantages, as discussed above. For the impredicative principle we stick to our original logic with only a single type, again stating as an axiom:
We define the numbers to be the things of the form $N(xx)$ for some things $xx$ . As with Hume’s principle, it follows with no additional assumptions that there are infinitely many numbers. The key fact in this argument is that
which isn’t hard to show by induction for $yy$ finite. Then it follows that
(using Proposition 8), and thus that every number has a successor.
Appendix A plural logic
Here we give a formal presentation of the system of plural logic used in this paper, sketching its syntax, deductive system and semantics. This is a bivalent, positive universally free logic (though its primitive notion of identity is that of strong identity, which holds only between nonempty terms).
As logical vocabulary, we have a countably infinite stock $x,y,z,\ldots $ of singular variables, and a countably infinite stock $xx,yy,zz,\ldots $ of plural variables (disjoint from the singular variables); distinguished binary predicate symbols $=$ (for strong identity) and $\preccurlyeq $ (for the inclusion relation); logical connectives $\land $ , $\lor $ , $\rightarrow $ , $\neg\, $ ; parentheses; quantifiers $\forall $ and $\exists $ ; and symbols $\iota $ and $:$ for forming definite descriptions. In addition to these, a language $\mathcal {L}$ for plural logic consists of a set of constant symbols; a set of function symbols, each with a specified arity $n\in \mathbb {N}^{\geqslant 1}$ ; and a set of predicate symbols, each with a specified arity $n\in \mathbb {N}^{\geqslant 0}$ .
We follow Oliver and Smiley in not making a syntactic distinction between singular and plural terms, apart from for variables—this is due to the inability of a syntactic distinction to properly reflect the semantic difference between singular and plural terms, where a plural term is one capable of denoting more than one object [Reference Oliver and Smiley14, p. 218]. Due to the presence of definite description operators, we introduce terms and formulae together using simultaneous inductive clauses. All variables, singular and plural, are terms; all constant symbols are terms; a function symbol applied to a list of terms of the appropriate arity is a term; if $\phi $ is a formula and x a singular variable then $(\iota x\, \phi )$ is a term (the singular definite description formed from $\phi $ ); and if $\phi $ is a formula and x a singular variable then $(x{:}\,\phi )$ is a term (the exhaustive description formed from $\phi $ ). Then we obtain atomic formulae by applying a predicate symbol to a list of terms of the appropriate arity; if $\phi $ and $\psi $ are formulae then so are $\neg\, \phi $ , $\phi \land \psi $ etc., with the usual conventions for parentheses; and if $\phi $ is a formula, x a singular variable, and $xx$ a plural variable, then $\forall x\,\phi $ , $\exists x\,\phi $ , $\forall xx\,\phi $ and $\exists xx\,\phi $ are formulae. Free and bound variables are as usual, with $\iota x$ in $\iota x\,\phi $ and $x{:}$ in $x{:}\,\phi $ binding any occurrences of x in $\phi $ .
As discussed in Section 2, we introduce various abbreviations along similar lines to Oliver & Smiley [Reference Oliver and Smiley14, chap. 13]. If t is a term we write $E!(t)$ , meaning “t exists,” for $\exists x\,x\preccurlyeq t$ ; we write $S(t)$ , meaning “t is singular” for $\forall x\,(x\preccurlyeq t\rightarrow x=t)$ ; and we write $S!(t)$ , meaning “t exists and is singular” for $\exists x\,(x=t)$ —where x is the first singular variable not free in t according to some ordering of the singular variables. When we state the deductive rules for our logic, it will be clear that any of these abbreviations is equivalent to the formula obtained by taking x to be any other variable not free in t. If s and t are terms we write $s\equiv t$ for $s=t\lor (\neg\, E!(s)\land \neg\, E!(t))$ , the predicate termed “weak equality” by Oliver and Smiley—the form of equality which holds with empty terms as arguments.
We give the deductive system for our logic as a form of natural deduction, specifically as a kind of sequent calculus, or what Gentzen called an L system. For the logic discussed here, this presentation is essentially a notational variant of the paradigmatic natural deduction trees labelled with discharging annotations, but is simpler and more compact. Compared to a Hilbert style axiomatization like that of Oliver & Smiley [Reference Oliver and Smiley14, chap. 13], this kind of presentation arguably makes clearer how to actually reason using the logic in practice.
A sequent is a pair $(\Gamma ,\phi )$ , where $\Gamma $ is a (possibly empty) set of formulae and $\phi $ a formula. We typically use the notation $\Gamma \vdash \phi $ for sequents. The idea behind sequent-based approaches to deduction is that at each stage of a deductive argument, we establish the validity of a sequent $\Gamma \vdash \phi $ , where $\Gamma \vdash \phi $ being valid means that $\phi $ is a consequence of $\Gamma $ . We will present various deductive rules, each stating either that certain sequents are valid, or stating a way to obtain new valid sequents from previously established valid sequents. Formally, we define a sequent to be valid if it is a member of all sets of sequents in which all of the following deductive rules hold.
When stating these rules we intend $\Gamma ,\Delta ,\Lambda $ to range over arbitrary sets of formulae, $\phi ,\psi $ to range over arbitrary formulae, x and $xx$ to range over arbitrary singular and plural variables respectively, and s and t to range over arbitrary terms (typically with a free variable restriction, as stated). The part of the system corresponding to propositional logic is standard. We have $\phi \vdash \phi $ for every $\phi $ , we have $\Gamma \vdash \phi $ and $\Gamma \subseteq \Delta $ implying $\Delta \vdash \phi $ , and we have that if $\Gamma \vdash \delta $ for all $\delta \in \Delta $ and $\Gamma \cup \Delta \vdash \phi $ then $\Gamma \vdash \phi $ . We then have rules for introducing and eliminating conditionals:
disjunctions:
conjunctions:
and a single rule for negation:
as well as the law of excluded middle, $\varnothing \vdash \phi \lor \neg\, \phi $ , and the law of explosion, $\phi \land \neg\, \phi \vdash \psi $ .
It is in the rules for quantifiers, identity and definite descriptions that the distinctive character of the logic becomes apparent. Since terms can denote either nothing, one thing, or more than one thing (and even variables can be empty), the introduction and elimination rules for singular variable quantifiers have to be adjusted using the predicate $S!$ , defined above as an abbreviated formula which holds of terms that denote a single object:
Here we require in $\forall x$ -I that x is not free in any formula in $\Gamma $ , and in $\exists x$ -E that x is not free in $\psi $ or any member of $\Gamma $ . We require in the rules where $\phi [t|x]$ appears that t is free for x, i.e., that there is no free occurrence of x in $\phi $ in the scope of a quantifier that binds a variable of t. In each case if one removes the conjuncts and antecedents of the form $S!(x)$ and $S!(t)$ , one obtains the usual rules for quantifier introduction and elimination.
On our intended interpretation, the rules for plural variable quantifiers take a very simple form, essentially the normal introduction and elimination rules:
Again, we require in $\forall xx$-I that $xx$ is not free in any formula in $\Gamma $, and in $\exists xx$-E that $xx$ is not free in $\psi $ or any member of $\Gamma $, and require in rules where $\phi [t|xx]$ appears that t is free for $xx$ in $\phi $. These rules take this simple form because on our intended interpretation, quantification over plural variables includes the case where those variables denote only one thing, or no things: for instance, with $\mathfrak {o}$ our paradigmatic empty term zilch, defined above, we intend that $\forall xx\,\phi $ implies $\phi [\mathfrak {o}|xx]$ , and that $\phi [\mathfrak {o}|xx]$ implies $\exists xx\,\phi $ .
We state what are essentially the usual rules governing identity, but using weak identity (defined above as an abbreviation, the version of identity in which empty terms are taken as equal):
where we require in $\equiv $ -E that s and t are free for $xx$ in $\phi $ .
We state various further rules involving the primitive notions $=$ and $\preccurlyeq $ , and our abbreviations $E!$ , S and $S!$ :
FromFootnote 20 the first we obtain that $S!(t)\rightarrow E!(t)$ , and from the second that $S!(t)\rightarrow S(t)$ . The converse to these, that $E!(t)\land S(t)\rightarrow S!(t)$ , is immediate. From the third and the first we obtain that $s=t\rightarrow (E!(s)\land E!(t))$ . It then follows that $(s=s)\rightarrow E!(s)$ , and thus (using $\equiv $ -I) it follows that $(s=s)\leftrightarrow E!(s)$ . From the third and the first we obtain that
It then follows that $s\equiv t \leftrightarrow \forall x(x\preccurlyeq s\leftrightarrow x\preccurlyeq t)$ . Also if t is any term then the above reflexivity axiom for weak identity gives $t\equiv t$ , so that $(t=t)\lor \neg\, E!(t)$ ; thus we obtain $E!(t)\rightarrow (t=t)$ . Thus, since in general $s=t\rightarrow (E!(s)\land E!(t))$ , we have $E!(t)\leftrightarrow (t=t)$ .
This allows us to derive a reflexivity axiom for singular variables with respect to strong identity. Indeed for any term t we have $S!(t)\rightarrow (t=t)$ , and in particular for x a singular variable we have $S!(x)\rightarrow (x=x)$ , thus obtaining $\forall x\,x=x$ by $\forall x$ -I.
The final elements of the deductive system are rules governing our definite description operators $\iota x\,\phi $ and $x{:}\,\phi $ . We have that
that
for y distinct from x and not free in $\phi $ , and that
It follows that for instance $E!(\iota x\,\phi )\rightarrow \phi [\iota x\,\phi \,|\,x]$, and that $E!(x{:}\,\phi )\leftrightarrow \exists x\, \phi $. This characterization of exhaustive description also removes the need for a comprehension scheme for pluralities, as for any $\phi $ we can immediately derive $\exists xx \,\forall x\, (x\preccurlyeq xx\leftrightarrow \phi )$.Footnote 21
With the syntax and deductive system for the logic in hand, we will now sketch our semantics. Our intended metalanguage is a version of set theory using our plural logic as its logic, so using deductions that follow the above deductive rules (though we have not given an explicit definition of what it is for a finite list of sequents to be a proof according to the above deductive rules, it is clear how such a definition would proceed). We can take the axioms of set theory to be those of ZFC, except that separation is phrased using a plural quantifier
The usual separation scheme then follows using instances of exhaustive description. The use of plural logic in the metalanguage is not essential, and one could replace plural talk in the semantics with talk of sets in the obvious way—we use plural logic just to make clear that the semantics does not require assigning plural variables sets as extensions.
Our semantics differs from that of Boolos [Reference Boolos2], which is specific to the case of second-order set theory, and differs from that of Yi [Reference Yi24] in allowing plural constants, empty terms (both singular and plural), functions (including multivalued, partial and co-partial functions), definite descriptions, and both strong and weak predicates. It differs from that of Oliver and Smiley in using plural set theory, rather than an unspecified form of higher-order logic, as its metalanguage (see the discussion at the end of Section 2).
Oliver & Smiley [Reference Oliver and Smiley14, sec. 11.1] argue that to be properly topic neutral, a semantics cannot be phrased using set theory, as a semantics should also apply to domains of quantification in which there are too many individuals to form a set. We do not have to regard our semantics as intending to literally state the truth conditions for all object languages, however—a goal that is likely impossible, on pain of paradox: instead we can think of ourselves as modelling the relationship between our words and reality, a worthwhile project that sets are excellent tools for, even if they do not literally have any semantic properties.
With these preliminaries addressed, on to the semantics. We use $a,b,\ldots {}$ as singular metavariables, and $aa,bb,\ldots {}$ as plural metavariables. If $aa$ are some objects we write $\{aa\}$ for the set whose elements are exactly the objects $aa$ , if such a set exists.Footnote 22 If $\mathcal {L}$ is a language for our logic, a structure $\mathcal {D}$ for $\mathcal {L}$ consists of the following data: a base set D (perhaps empty); for each constant symbol c (of $\mathcal {L}$ ) some elements $\overline {c}^{\mathcal {D}}$ of D (zero or more);Footnote 23 for each function symbol f of arity n, a function $\overline {f}^{\mathcal {D}}:\mathbb {P}(D)^n\to \mathbb {P}(D)$ ; and for each relation symbol P of arity n, a subset $\overline {P}^{\mathcal {D}}$ of $\mathbb {P}(D)^n$ . It is required that $(p,q)\in \overline {=}^{\mathcal {D}}$ iff $\varnothing \neq p=q$ —i.e., iff the elements of p are the same (in the sense of strong identity) as the elements of q—and that $(p,q)\in \overline {\preccurlyeq }^{\mathcal {D}}$ iff $\varnothing \neq p\subseteq q$ —i.e., iff the elements of p are among the elements of q (in the sense of inclusion as a strong predicate).
Here if f is a function symbol of arity n, the relation $\overline {f}^{\mathcal {D}}(p_1,\ldots , p_n)=q$ represents the function expressed by f taking the elements of q as values, when applied to the elements of $p_1$ , and the elements of $p_2$ , … and the elements of $p_n$ (which can include the case where q or any of the $p_i$ are empty). If P is a predicate of arity n, then $(p_1,\ldots , p_n)\in \overline {P}^{\mathcal {D}}$ represents the relation expressed by P holding of the elements of $p_1$ , and the elements of $p_2$ , … and the elements of $p_n$ .
If $\mathcal {D}$ is a structure, a variable assignment v over $\mathcal {D}$ assigns zero or more elements $v(xx)$ of D to each plural variable, and zero or one elements $v(x)$ of D to each singular variable.Footnote 24 If v is a variable assignment, $xx$ a plural variable and $aa$ are zero or more of D, we write $v[aa|xx]$ for the variable assignment which agrees with v everywhere except assigning $aa$ to $xx$ . Similarly if x is a singular variable and $aa$ are zero or one elements of D, we write $v[aa|x]$ for the variable assignment which agrees with v everywhere except assigning $aa$ to x.
Given a structure $\mathcal {D}$ and a variable assignment v over $\mathcal {D}$ , we will now define the interpretation $v(t)$ of each term with respect to v and $\mathcal {D}$ , which consists of zero or more elements of D, and also define whether each formula $\phi $ of the language is satisfied with respect to $\mathcal {D}$ and v, written $\mathcal {D},v\vDash \phi $ . We define these by a simultaneous induction, since the concepts are related through our definite description operators. The interpretation of a variable consists just of the values v assigns to that variable. If c is a constant symbol then $v(c)$ is defined to be the values $\overline {c}^{\mathcal {D}}$ . If we have terms $t_1,\ldots , t_n$ and f is a function symbol of arity n, then $v(f(t_1,\ldots , t_n))$ is defined to be the elements of the set $\overline {f}^{\mathcal {D}}(\{v(t_1)\},\ldots , \{v(t_n)\})$ . Given a formula $\phi $ , if there is a unique element a of D such that $\mathcal {D}, v[a|x]\vDash \phi $ then $v(\iota x\,\phi )$ is defined to be that a, and otherwise $v(\iota x\,\phi )$ is defined to consist of zero elements of D. Also given $\phi $ , $v(x{:}\,\phi )$ is defined to consist of all those elements a of D such that $\mathcal {D},v[a|x]\vDash \phi $ .
Turning to satisfaction: if $t_1,\ldots, t_n$ are terms and P is a predicate symbol of arity n, we define $\mathcal{D},v\vDash P(t_1,\ldots, t_n)$ to hold iff $(\{v(t_1)\},\ldots, \{v(t_n)\})\in \overline{P}^{\mathcal{D}}$. We define satisfaction for the propositional connectives in the usual way. Finally, we have the quantifiers. Given $\phi$, if x is a singular variable, then we define $\mathcal{D},v\vDash \forall x\,\phi$ to hold iff for every element a of D, $\mathcal{D},v[a|x]\vDash \phi$ holds (dually for $\exists x$). Given $\phi$ and a plural variable $xx$, we define $\mathcal{D},v\vDash \forall xx\,\phi$ to hold iff for any zero or more elements $aa$ of D, $\mathcal{D},v[aa|xx]\vDash \phi$ (dually for $\exists xx$).
Then given a sequent $\Gamma \vdash \phi$, we define $\mathcal{D}$ and v to satisfy $\Gamma \vdash \phi$ if either $\mathcal{D}$ and v satisfy $\phi$, or they fail to satisfy some element of $\Gamma$. We define a sequent $\Gamma \vdash \phi$ to be universally satisfied if it is satisfied by every $\mathcal{D}$ and v. It is a routine check that the set of universally satisfied sequents is closed under all the rules of our deductive system, and thus that all valid sequents are universally satisfied.
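For a simple example, the sequent $xx\preccurlyeq yy,\ yy\preccurlyeq zz \vdash xx\preccurlyeq zz$ is universally satisfied: for any $\mathcal{D}$ and v, if $\varnothing\neq\{v(xx)\}\subseteq\{v(yy)\}$ and $\varnothing\neq\{v(yy)\}\subseteq\{v(zz)\}$, then $\varnothing\neq\{v(xx)\}\subseteq\{v(zz)\}$. On the other hand, given the clauses above, the sequent $\vdash \forall xx\,(xx = xx)$ is not universally satisfied, since assigning zero objects to $xx$ falsifies the strong identity $xx=xx$; this is one place where the strong reading of the logical predicates makes itself felt.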
Our deductive system is necessarily incomplete, as in this logic one can straightforwardly characterize the notions of $\omega$-sequence and complete ordered field, and the sets of truths about these structures in the relevant languages are not recursively enumerable. For instance, in a language with the usual function symbols S, $+$ and $\times$ we can state versions of the Peano axioms for arithmetic, using as our induction principle a plural statement to the effect that any objects which include zero and are closed under successor include everything.
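In the present notation, and assuming a constant symbol 0 for zero, one natural way to write such a principle is
$$\forall xx\,\big(\,(0\preccurlyeq xx \,\wedge\, \forall y\,(y\preccurlyeq xx \rightarrow S(y)\preccurlyeq xx))\,\rightarrow\, \forall y\; y\preccurlyeq xx\,\big);$$
nothing below depends on the details of this formulation.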
Then if $T_{\text{PA}}$ is the resulting theory, for any statement $\phi$ of the ordinary language of first-order Peano arithmetic, we have that $T_{\text{PA}}\vdash \phi$ is universally satisfied (as defined above) iff $\phi$ is true under its normal interpretation. Since the set of arithmetic truths is not recursively enumerable, the semantic consequence relation of our logic is not recursively enumerable either, and in particular not recursive; so no recursively specified deductive system can be complete for it.
We conclude by noting two facts about this logic. First, we observe that the deductive system is not hobbled by being universally free (with an empty domain allowed in the semantics). Indeed we can argue straightforwardly that on the assumption that there is an object—which can be symbolised as $\exists x\, E!(x)$, for any variable x Footnote 25—one can reason essentially as normal, without worrying about whether variables are empty. We introduce a new deductive system, which we call the existential assumption deductive system, denoted EA, which is the same as above except that we have $S!(x)$ as an additional postulate for every singular variable x (instead of just $S(x)$). In the presence of this assumption, the rules of $\forall x$-I and $\exists x$-E are equivalent to ones of a simpler form, though we still have a singular existence requirement on $\forall x$-E and $\exists x$-I.
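In one natural formulation, for instance, the simplified rules allow one to pass from $\Gamma\vdash\phi$ directly to $\Gamma\vdash\forall x\,\phi$ provided x is not free in any element of $\Gamma$, and to pass from $\Gamma\vdash\exists x\,\phi$ and $\Delta,\phi\vdash\psi$ to $\Gamma,\Delta\vdash\psi$ provided x is not free in $\psi$ or any element of $\Delta$; by contrast, $\forall x$-E and $\exists x$-I retain a premise of roughly the form $\Gamma\vdash S!(t)$ for the term t being instantiated.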
The deductive system EA would be sound if we restricted the semantics to require the base set D of each structure to be nonempty, and required the value $v(x)$ assigned to each singular variable to be an element of D, rather than zero or one elements.
Then in a precise sense, we can argue that reasoning in EA can be carried out in the universally free deductive system, on the assumption that an object exists. Indeed if $\Gamma \vdash \phi$ is a sequent, we let $\Theta(\Gamma \vdash \phi)$ be the sequent
$$\exists x\, E!(x),\ E!(x_1),\ldots, E!(x_n),\ \Gamma\ \vdash\ \phi,$$
where $x_1,\ldots, x_n$ are the singular variables free in $\phi$ or any element of $\Gamma$. Then we can argue that for each deductive rule of EA, if one applies $\Theta$ to each sequent, one obtains a rule that can be derived in the universally free deductive system. For instance, if we take the rule $\land$-E, which passes from the sequent $\Gamma \vdash \phi \land \psi$ to the sequent $\Gamma \vdash \phi$, then applying $\Theta$ to each sequent gives the rule passing from $\Theta(\Gamma \vdash \phi \land \psi)$ to $\Theta(\Gamma \vdash \phi)$,
which is easily seen to be a valid derived rule in the universally free deductive system—the assumption $\exists x\, E!(x)$ allows one to discharge the hypotheses $E!(z_i)$ for $z_i$ not free in $\phi$ or any element of $\Gamma$. It is a routine check that the same holds for all rules of EA (whether one takes the singular variable quantifier rules to be the original ones, or the simpler ones valid in the presence of the axiom $S!(x)$, mentioned above). Thus if $\Gamma \vdash \phi$ is a sequent derivable in EA, then $\Theta(\Gamma \vdash \phi)$ is derivable in the universally free deductive system—and if we were to define a notion of derivation involving these deductive rules, a derivation in EA of $\Gamma \vdash \phi$ could be transformed into a similar (and not too much longer) derivation of $\Theta(\Gamma \vdash \phi)$. Thus in a precise sense, if we assume that there is an object, then we are able to reason in a natural way as though all our variables are nonempty. The universally free nature of our deductive system is no hindrance.Footnote 26 This point is not made by Oliver & Smiley [Reference Oliver and Smiley14], and indeed it is much easier to make in the context of a natural deduction style system like ours here than when using a Hilbert style axiomatization like theirs.
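To make this concrete with a toy instance (P and Q one-place predicates, x and z distinct singular variables, and the existence assumption written with bound variable y to avoid a clash): the $\land$-E step from $\vdash P(x)\land Q(z)$ to $\vdash P(x)$ is sent by $\Theta$ to the step from
$$\exists y\, E!(y),\ E!(x),\ E!(z)\ \vdash\ P(x)\land Q(z)$$
to
$$\exists y\, E!(y),\ E!(x)\ \vdash\ P(x).$$
Given the upper sequent, $\land$-E in the universally free system yields $\exists y\, E!(y), E!(x), E!(z) \vdash P(x)$; and since z is free neither in $P(x)$ nor in the remaining premises, an application of $\exists$-E to the premise $\exists y\, E!(y)$ discharges the hypothesis $E!(z)$, giving the lower sequent.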
Finally, we note that modifying the logic to allow multiple different types of individuals is straightforward. This is relevant to the presentation of a predicative version of numerical abstraction in Section 7. Syntactically, one would include a set of types in the specification of a language, have disjoint stocks of singular and plural variables for each type, specify the type of each constant symbol, specify the arity of a function symbol as a list of the types of its arguments together with a value type, and specify the arity of a predicate symbol as a list of the types of its arguments. One would introduce versions of the equality and inclusion symbols for each type. Terms would be formed only by applying function symbols to terms of the appropriate types, and atomic formulae only by applying predicate symbols to terms of the appropriate types. Definite descriptions would have the type of their bound variable. Semantically, one would specify a base set for each type (there is no need for these base sets to be disjoint), and adjust the definitions of the interpretations of constant symbols, function symbols and predicate symbols so that everything is appropriately typed. The semantics of variable assignments and quantifiers would be relativized to the base set of the type of the relevant variable. Apart from that, everything would proceed as above.
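For illustration only: with two types, o for ordinary objects and n for numbers, one might have disjoint stocks of variables $x^{o}, xx^{o}, x^{n}, xx^{n},\ldots$, equality and inclusion symbols $=_{o}, \preccurlyeq_{o}, =_{n}, \preccurlyeq_{n}$, and a function symbol N whose arity is the argument-type list $\langle o\rangle$ together with value type n, thought of as taking some things of type o to a value of type n. A structure would then supply base sets $D_{o}$ and $D_{n}$ and interpret N by a function $\overline{N}^{\mathcal{D}}:\mathbb{P}(D_{o})\to\mathbb{P}(D_{n})$, with the clauses for terms, satisfaction and quantification relativized to the appropriate base set exactly as before.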