1 Introduction
Large cardinals have played a pivotal role in set theory, and many of them can be defined in terms of elementary embeddings. Associating large cardinals with elementary embeddings appeared first in Scott’s pioneering paper [Reference Scott59], and was further systematically developed throughout the 1960s and 1970s. Many of these results were collected by Reinhardt and Solovay around the early 1970s, which was published later with Kanamori in the expository paper [Reference Solovay, Reinhardt and Kanamori60].
The attempt to find ever stronger notions of large cardinal axioms culminated in the principle now known as a Reinhardt cardinal, which was first mentioned in Reinhardt’s doctoral thesis [Reference Reinhardt57]. A Reinhardt cardinal is a critical point of a non-trivial elementary embedding
$j\colon V\to V$
. Unfortunately, the fate of a Reinhardt cardinal in
$\mathsf {ZFC}$
is that of inconsistency, as if Icarus falls into the sea as he flew too close to the sun. A famous result by Kunen [Reference Kunen34] proves that Reinhardt cardinals are incompatible with the Axiom of Choice and it is still unknown if
$\mathsf {ZF}$
with a Reinhardt cardinal is consistent. However, there has been little study of the consistency of Reinhardt cardinals in the choiceless context and few results about the implications of such axioms have appeared in the literature before 2010. The only exception the authors know of is a result of Apter and Sargsyan [Reference Apter and Sargsyan7], and other relevant results about Reinhardt embeddings focused on its inconsistency. Notable examples include Suzuki’s non-definability of embeddings
$j\colon V\to V$
over
$\mathsf {ZF}$
[Reference Suzuki61] or Zapletal’s PCF-theoretic proof of Kunen’s inconsistency theorem [Reference Zapletal66].
Choiceless large cardinals are large cardinal notions that extend a Reinhardt cardinal and are therefore incompatible with the Axiom of Choice. A super Reinhardt cardinal was employed by Hugh Woodin in 1983 to prove the consistency of
$\mathsf {ZFC+I_0}$
, which had a focal role in establishing the consistency of
$\mathsf {ZF+AD}^{L(\mathbb {R})}$
. A Berkeley cardinal appeared around 1992 by Woodin at his set theory seminar as an attempt to provide a large cardinal notion that was refutable from
$\mathsf {ZF}$
alone. While no such inconsistency has been found so far, it has since become an interesting principle in itself. Current research on choiceless large cardinals emerged in the mid-2010s as part of a project to explore Woodin’s
$\mathsf {HOD}$
dichotomy (see [Reference Woodin65, Section 7.1] or [Reference Bagaria, Koellner and Woodin10] for details), under the thesis that such cardinals would indicate the V was ‘far’ from
$\mathsf {HOD}$
in some sense. Bagaria, Koellner, and Woodin collected and analyzed notions of choiceless large cardinals in [Reference Bagaria, Koellner and Woodin10], and the theory of choiceless large cardinals was further developed in various papers by authors including Cutolo, Goldberg, and Schlutzenberg.
One of the most striking results along this line is a result by Goldberg [Reference Goldberg26], which establishes the consistency of
$\mathsf {ZF} + j\colon V_{\lambda +2}\to V_{\lambda +2}$
modulo large cardinals over
$\mathsf {ZFC}$
:
Theorem 1.1 (Goldberg [Reference Goldberg26, Theorem 6.20]).
These two theories are equiconsistent over
$\mathsf {ZF+DC}$
:
-
1. For some ordinal
$\lambda $ , there is an elementary embedding
$j\colon V_{\lambda +2}\to V_{\lambda +2}$ .Footnote 1
-
2.
$\mathsf {AC+I_0}$ .
Furthermore, Goldberg proved that the existence of an elementary embedding
$j\colon V_{\lambda +3}\to V_{\lambda +3}$
exceeds almost all of the traditional large cardinal hierarchy over
$\mathsf {ZFC}$
.
Theorem 1.2 (Goldberg [Reference Goldberg26, Theorem 6.16]).
Working over
$\mathsf {ZF+DC}$
, the existence of a
$\Sigma _1$
-elementary embedding
$j\colon V_{\lambda +3}\to V_{\lambda +3}$
implies the consistency of
$\mathsf {ZFC}+\mathsf {I_0}$
.
In the other direction, we can try to ‘salvage’ an elementary embedding
$j\colon V\to V$
by weakening the background set theory. One direction of research in this manner was conducted by Corazza. Corazza [Reference Corazza12] introduced the Wholeness axiom,
$\mathsf {WA}$
, by dropping Replacement for j-formulas.
$\mathsf {WA}$
is known to be weaker than
$\mathsf {I}_3$
. Corazza further weakened
$\mathsf {WA}$
to the Basic Theory of Elementary Embeddings,
$\mathsf {BTEE}$
, in [Reference Corazza13] which is the weakest setting one needs to express the existence of an elementary embedding by dropping all axioms for j-formulas in the extended language. The resulting axiom is known to be weaker than the existence of
$0^\sharp $
.
Another direction to weaken the assumptions is by dropping Powerset. However, it should be noted that , the theory obtained by ejecting Powerset from
$\mathsf {ZFC}$
and only assuming Replacement, is ill-behaved with elementary embeddings. For example, in [Reference Gitman, Hamkins and Johnstone23] it is shown that Łoś’s theorem can fail over
, and a cofinal
$\Sigma _1$
-elementary embedding need not be fully elementary. On the other hand, they also show that these issues can be avoided by strengthening
to
$\mathsf {ZFC}^-$
, which is obtained by additionally assuming Collection.
Further research along this line is characterizing large cardinal notions in terms of models of
$\mathsf {ZFC}^-$
with an ultrafilter predicate (for example, [Reference Holy and Lücke30] or [Reference Gitman and Schlicht25]). Large cardinals defined in this way refine the large cardinal hierarchy between a Ramsey cardinal and a measurable cardinal, and provide bounds for the consistency strength of
$\mathsf {ZFC}^-$
with an elementary embedding. As one such example of this characterization, we have the following theorem.
Theorem 1.3 [Reference Matthews38, Theorem 10.5.7].
$\mathsf {ZFC}$
with a locally measurable cardinal proves the consistency of
$\mathsf {ZFC}^- + \mathsf {DC}_{\mathrm {<Ord}}$
plus the existence of a non-trivial elementary embedding
$j\colon V\to M$
.
Finally, it can be shown that an elementary embedding
$j \colon V \rightarrow V$
is possible in
$\mathsf {ZFC}^-$
under the large cardinal assumption of
$\mathsf {ZFC + I_1}$
:
Theorem 1.4 [Reference Matthews38, Theorem 9.3.2] or [Reference Matthews39, Theorem 2.2].
$\mathsf {ZFC}$
proves the following
$:$
there is an elementary embedding
$k\colon V_{\lambda +1}\to V_{\lambda +1}$
if and only if there is an elementary embedding
$j\colon H_{\lambda ^+}\to H_{\lambda ^+}$
.
As a corollary,
$\mathsf {ZFC}^-_j$
Footnote
2
is compatible (modulo large cardinals over
$\mathsf {ZFC}$
) with a non-trivial elementary embedding
$j\colon V\to V$
which additionally satisfies that
$V_{\operatorname {crit} j}$
exists.
However, [Reference Matthews39] also showed there is a limitation on the properties of such an elementary embedding
$j\colon V\to V$
over
$\mathsf {ZFC}^-$
. One of these restrictions is that
$j\colon V\to V$
cannot be cofinal:Footnote
3
Theorem 1.5 [Reference Matthews38, Theorem 10.2.3] or [Reference Matthews39, Theorem 5.4].
Working over
$\mathsf {ZFC}^-_j$
, if
$j\colon V\to V$
is a non-trivial
$\Sigma _0$
-elementary embedding such that
$V_{\operatorname {crit} j}$
exists, then it cannot be cofinal.
We may also weaken the background set theory by dropping the law of excluded middle, that is, moving into a constructive setting. Since some statements are no longer equivalent over a constructive background, we need to carefully formulate constructive counterparts of classical notions, including the axiom systems of constructive set theories.
The first form of a constructive set theory was defined by H. Friedman [Reference Friedman15], and is now known as Intuitionistic
$\mathsf {ZF}$
,
$\mathsf {IZF}$
. In [Reference Friedman15], Friedman showed that
$\mathsf {IZF}$
and
$\mathsf {ZF}$
are mutually interpretable by using the combination of double-negation translation and non-extensional set theory. Another flavor of constructive set theory appeared as an attempt to establish a formalization of Bishop’s Constructive analysis. Myhill [Reference Myhill42] gave his own formulation of a constructive set theory
$\mathsf {CST}$
. However, its language is different from the standard one—
$\mathsf {CST}$
includes natural numbers as primitive objects, while
$\mathsf {ZF}$
does not. The other formulation of a constructive set theory, which is closer to standard
$\mathsf {ZF}$
, was given by Aczel [Reference Aczel1] via a type-theoretic interpretation. Aczel’s constructive set theory is called Constructive
$\mathsf {ZF}$
,
$\mathsf {CZF}$
. Aczel further developed a theory on
$\mathsf {CZF}$
and its relationship with Martin-Löf type theory in the consequent works [Reference Aczel2, Reference Aczel3]. In particular, Aczel’s last paper in the sequel [Reference Aczel3] defined regular sets, which begins the program to define large cardinals over
$\mathsf {CZF}$
.
The first research on constructive analogues of large cardinal axioms was done by Friedman and Ščedrov [Reference Friedman and Ščedrov16]. Unlike classical set theories, ordinals over constructive set theories are not well-behaved. This motivated defining large cardinal notions over constructive set theories in a structural manner, resulting in large set axioms. They defined and analyzed inaccessible sets, Mahlo sets, and various elementary embeddings over
$\mathsf {IZF}$
, and proved that their consistency strength is no different from their classical counterparts.
Large set axioms over constructive set theories appeared first in various papers of Rathjen (for example, [Reference Rathjen45–Reference Rathjen47, Reference Rathjen, Griffor and Palmgren55]). The first appearance of large set axioms over
$\mathsf {CZF}$
was in a proof-theoretic context, and their relationship with better known theories, like extensions of Martin-Löf type theories or
$\mathsf {KP}$
, were emphasized. An initial analysis of large set axioms over
$\mathsf {CZF}$
can be found in [Reference Aczel and Rathjen5] or [Reference Aczel and Rathjen6], and this has been further extended by Gibbons [Reference Gibbons21] and Ziegler [Reference Ziegler67] in each of their doctoral theses. Gibbons [Reference Gibbons21] extended Rathjen’s analysis on the proof-theoretic strength of
$\mathsf {CZF}$
in [Reference Rathjen45] to
$\mathsf {CZF}$
with Mahlo sets. Furthermore, Gibbons’ thesis is the first publication that shows the definition of a critical set.Footnote
4
Unlike other results around large set axioms over
$\mathsf {CZF}$
, Ziegler’s thesis focused on what we can derive about large set axioms from
$\mathsf {CZF}$
alone. For example, he observed that the number of inaccessible sets may not affect the proof-theoretic strength:
Theorem 1.6 [Reference Ziegler67, Chapter 6].
The following theories are equiconsistent
$:$
-
1.
$\mathsf {CZF}$ with an inaccessible set,
-
2.
$\mathsf {CZF}$ with two inaccessible sets, and
-
3.
$\mathsf {CZF}$ with
$\omega $ inaccessible sets.
Ziegler also examined elementary embeddings over
$\mathsf {CZF}$
in detail, and one of his striking results is that every Reinhardt embedding
$j\colon V\to V$
must be cofinal (see Proposition 4.8 for the formal statement of the theorem).
Ziegler’s thesis ends with the following result that elementary embeddings are incompatible with the principle of subcountability, which asserts that for every set there is a partial surjection from
$\omega $
onto it. This can be seen as a constructive analogue to Scott’s early result in [Reference Scott59] that measurable cardinals are incompatible with the Axiom of Constructibility.
Theorem 1.7 (Ziegler [Reference Ziegler67, Theorem 9.93]).
Over
$\mathsf {{C}ZF}$
, the combination of the Axiom of Subcountability and the existence of a critical set results in a contradiction.
Rathjen analyzed the proof-theoretic strength of small large sets, large set notions whose classical counterpart is weaker than the existence of
$0^\sharp $
, over
$\mathsf {CZF}$
(see Section 3, especially Theorem 3.10), and the proof-theoretic strength of all currently known small large set axioms over
$\mathsf {CZF}$
is weaker than that of Second-Order Arithmetic. However, the proof-theoretic strength of large large sets, large set notions defined in terms of elementary embeddings, is yet to receive a formal rigorous treatment.
We end this introduction by noting some history of the development of this work. A first version of this paper can be found in [Reference Jeon31] where the first author studied the consistency strength of a Reinhardt set over
$\mathsf {CZF}$
with Full Separation. The second author’s PhD thesis [Reference Matthews38] included an initial investigation into elementary embeddings of
$\mathsf {KP}$
and
$\mathsf {IKP}$
. This version can be seen as a combination of the previous work by the individual authors and extensively extends the results found in either source.
1.1 Main results
It turns out that the proof-theoretic strength of large large set vastly exceeds that of
$\mathsf {ZFC}$
. In fact, we just need a small fragment of the properties of an elementary embedding, that we shall denote by
$\Delta _0\text {-}\mathsf {BTEE}_M$
, which is the minimal theory needed to claim that
$j\colon V\to M$
is a
$\Delta _0$
-elementary embedding, to exceed the proof-theoretic strength of
$\mathsf {ZFC}$
.
Theorem.
-
1. (Theorem 5.1) Working over
$\mathsf {IKP}$ , let K be a transitive set such that
$K\models \Delta _0\text {-}\mathsf {Sep}$ ,
$\omega \in K$ and let
$j \colon V\to M$ be a
$\Delta _0$ -elementary embedding whose critical point is K. Then
$K\models \mathsf {IZF}$ .
-
2. (Corollary 5.14
$)$ Furthermore, if we additionally allow Set Induction and Collection for
$\Sigma ^{j,M}$ -formulas and add Separation for
$\Delta _0^{j,M}$ -formulas, then we can define
$j^\omega (K):=\bigcup _{n\in \omega } j^n(K)$ and prove that
$j^\omega (K)$ satisfies
$\mathsf {IZF+BTEE}$ plus Set Induction for j-formulas.
-
3. (Theorem 5.21 and Theorem 6.37) As a consequence, the following two theories prove the consistency of
$\mathsf {ZFC+BTEE}$ plus Set Induction for j-formulas:
$\mathsf {CZF}_{j, M}$ with a critical set, and
$\mathsf {IKP}_{j,M}$ plus a
$\Sigma $ -
$\mathrm {Ord}$ -inary elementary embedding with a critical point
$\kappa \in \mathrm {Ord}$ .
The next natural question would be how strong a Reinhardt set is. It turns out that Reinhardt sets are very strong over
$\mathsf {CZF}$
:
Theorem. (Theorem 6.41)
$\mathsf {CZF}$
with a Reinhardt set proves the consistency of
$\mathsf {ZF+WA}$
.
These two results motivate an idea that the proof-theoretic strength of stronger large set notions may go beyond that of
$\mathsf {ZF}$
with choiceless large cardinals. This idea turns out to hold, and a constructive formulation of super Reinhardt cardinals witnesses this. We can push this idea further, so that we can expect that we may reach an ‘equilibrium’ by strengthening large set axioms once more, in the sense that adding some large set notions to
$\mathsf {CZF}$
has the same proof-theoretic strength as that of
$\mathsf {IZF}$
plus the same axiom. We can see that the assertion that V resembles
$V_\kappa $
for a total Reinhardt cardinal
$\kappa $
, which we will call
$\mathsf {TR}$
, witnesses this claim.
However, both our analogues of super Reinhardts and total Reinhardts require a second-order formulation. We will resolve this issue by defining
$\mathsf {CGB}$
, which is a constructive version of
$\mathsf {GB}$
. Moreover, formulating an elementary embedding in a constructive context requires infinite connectives because there is no obvious way to cast the elementarity into a single formula. (It is possible in a classical context because every
$\Sigma _1$
-elementary embedding
$j\colon V\to M$
is fully elementary.) This motivates
$\mathsf {CGB}_\infty $
,
$\mathsf {CGB}$
with infinite connectives. Extending
$\mathsf {CGB}$
with infinite connectives will turn out to be ‘harmless’ in the sense that
$\mathsf {CGB}_\infty $
is conservative over
$\mathsf {CGB}$
. Under this setting, we have the following results:
1.2 The structure of the paper
This paper is largely divided into two parts: the ‘internal’ analysis of large set axioms over constructive set theories, and deriving a lower bound for large large set axioms in terms of extensions of classical set theories. Defining some of these axioms will require largely unexplored notions including second-order constructive set theories, so we define the necessary preliminary notions in Section 2.
Next, we review the basic properties of large set axioms over constructive set theories. This is also divided into two sections: in Section 3, we review small large sets over constructive set theories and their connection with classical set theories. Then in Section 4, we define large large sets over constructive set theories, including analogues of choiceless large cardinals. Having laid down the necessary framework, in Section 5 we provide an internal analysis of large set axioms. The main consequence of Section 5 is that we provide lower bounds for the consistency strength of large large set axioms in terms of extensions of
$\mathsf {IZF}$
, from which it will be easier to interpret classical theories by applying a double-negation translation.
In any case, we want to derive the consistency strengths in terms of extensions of classical set theories. Hence we need to develop one of the methods to transform intuitionistic theories into classical ones. This is what Section 6 mainly focuses on. In Section 6, we review Gambino’s Heyting-valued interpretation defined in [Reference Gambino20] and investigate this interpretation under the double-negation topology. We will see that the interpretation translates
$\mathsf {IZF}$
into the classical theory
$\mathsf {ZF}$
. Furthermore, we will also reduce extensions of
$\mathsf {IZF}$
to those of
$\mathsf {ZF}$
by using the double-negation topology. Section 7 is devoted to the double-negation translation of second-order set theories and the concept of the universe being totally Reinhardt. We summarize the lower bound of the consistency strength of large large set axioms over constructive set theories in terms of extensions of
$\mathsf {ZFC}$
in Section 8 before ending by posing some questions for future investigation in Section 9.
2 Preliminaries
In this section, we will briefly review
$\mathsf {ZFC}$
without Power Set,
$\mathsf {ZFC}^-$
, and constructive set theory. There are various formulations of constructive set theories, but we will focus on
$\mathsf {CZF}$
. In addition, we will define the second-order variant
$\mathsf {CGB}$
and
$\mathsf {IGB}$
of
$\mathsf {CZF}$
and
$\mathsf {IZF}$
respectively.
2.1
$\mathsf {ZFC}$
without Power Set
We will frequently mention
$\mathsf {ZFC}$
without Power Set, denoted
$\mathsf {ZFC}^-$
. However,
$\mathsf {ZFC}^-$
is not obtained by just dropping Power Set from
$\mathsf {ZFC}$
:
Definition 2.1.
$\mathsf {ZF}^-$
is the theory obtained from
$\mathsf {ZF}$
by dropping Power Set and using Collection instead of Replacement.
$\mathsf {ZFC}^-$
is obtained by adding the Well-Ordering Principle to
$\mathsf {ZF}^-$
.
Note that using Collection instead of Replacement is necessary to avoid pathologies. See [Reference Gitman, Hamkins and Johnstone23] for the details. It is also known by [Reference Friedman, Gitman and Kanovei17] that
$\mathsf {ZFC}^-$
does not prove the reflection principle.
2.2 Intuitionistic set theory
$\mathsf {IZF}$
and constructive set theory
$\mathsf {CZF}$
There are two possible constructive formulations of
$\mathsf {ZF}$
, namely
$\mathsf {IZF}$
and
$\mathsf {CZF}$
, although we will focus on the latter.
$\mathsf {IZF}$
appeared first in H. Friedman’s paper [Reference Friedman15] on the double-negation of set theory. Friedman introduced
$\mathsf {IZF}$
as an intuitionistic counterpart of
$\mathsf {ZF}$
and showed that there is a double-negation translation from
$\mathsf {ZF}$
to
$\mathsf {IZF}$
, analogous to that from
$\mathsf {PA}$
to
$\mathsf {HA}$
.
Definition 2.2.
$\mathsf {IZF}$
is the theory that comprises the following axioms: Extensionality, Pairing, Union, Infinity, Set Induction, Separation, Collection, and Power Set.
Remark 2.3. We take the axiom of Infinity to be the statement
$\exists a (\exists x (x \in a) \land \forall x \in a \, \exists y \in a \, (x \in y))$
. See Remark 2.13 for alternative, equivalent, ways to define Infinity.
Constructive Zermelo–Fraenkel set theory,
$\mathsf {CZF}$
, is introduced by Aczel [Reference Aczel1] with his type-theoretic interpretation of
$\mathsf {CZF}$
. We will introduce subtheories called Basic Constructive Set Theory,
$\mathsf {BCST}$
, and
$\mathsf {CZF}^-$
before defining the full
$\mathsf {CZF}$
.
Definition 2.4.
$\mathsf {BCST}$
is the theory that consists of Extensionality, Pairing, Union, Emptyset, Replacement, and
$\Delta _0$
-Separation.
$\mathsf {CZF}^-$
is obtained by adding the following axioms to
$\mathsf {BCST}$
: Infinity, Set Induction, and Strong Collection that states the following: if
$\phi (x,y)$
is a formula such that for given a, if
$\forall x\in a\exists y \phi (x,y)$
, then we can find b such that

We also provide notation for frequently mentioned axioms:
Definition 2.5. We will use
$\mathsf {Sep}$
,
$\Delta _0\text {-}\mathsf {Sep}$
,
$\Delta _0\text {-}\mathsf {LEM}$
for denoting Full Separation (i.e., Separation for all formulas),
$\Delta _0$
-Separation, and the law of excluded middle for
$\Delta _0$
-formulas respectively.
The combination of Full Separation and Collection proves Strong Collection, but the implication does not hold if we weaken Full Separation to
$\Delta _0$
-Separation. It is also known that
$\Delta _0$
-Separation is equivalent to the existence of the intersection of two sets. See Section 9.5 of [Reference Aczel and Rathjen6] for its proof.
Proposition 2.6. Working over
$\mathsf {BCST}$
without
$\Delta _0$
-Separation,
$\Delta _0$
-Separation is equivalent to the Axiom of Binary Intersection, which asserts that
$a\cap b$
exists if a and b are sets.
It is convenient to introduce the notion of multi-valued function to describe the Strong Collection and Subset Collection axioms that we will discuss shortly. Let A and B be classes. A relation
$R\subseteq A\times B$
is a multi-valued function from A to B if
$\operatorname {dom} R=A$
. In this case, we write
$R\colon A\rightrightarrows B$
. We use the notation
if both
$R \colon A\rightrightarrows B$
and
$R \colon B\rightrightarrows A$
hold. The reader is kept in mind that the previous definition must be rephrased in an appropriate first-order form if one of A, B, or R is a (definable) proper class in the same way as how we translate classes over
$\mathsf {ZF}$
.
Then we can rephrase Strong Collection as follows: for every set a and a class multi-valued function
$R \colon \text {a}\rightrightarrows V$
, there is an ‘image’ b of a under R, that is, a set b such that
.
Now we can state the Axiom of Subset Collection:
Definition 2.7. The Axiom of Subset Collection states the following: Assume that
$\phi (x,y,u)$
is a formula that defines a collection of multi-valued functions from a to b parametrized by
$u\in V$
: that is,
$\phi (x,y,u)$
satisfies
$\forall u \forall x\in a\exists y\in b \phi (x,y,u)$
. Then we can find a set c such that

$\mathsf {CZF}$
is the theory obtained by adding Subset Collection to
$\mathsf {CZF}^-$
.
We may state Subset Collection informally as follows: for every first-order definable collection of multi-valued class functions
$\langle R_u \colon a\rightrightarrows b \mid u\in V\rangle $
from a to b, we can find a set c of all ‘images’ of a under some
$R_u$
. That is, for every
$u\in V$
there is
$d\in c$
such that
.
There is a simpler axiom equivalent to Subset Collection, known as Fullness, which is a bit easier to understand.
Definition 2.8. The Axiom of Fullness states the following: Let
$\operatorname {mv}(a,b)$
be the class of all multi-valued functions from a to b. Then there is a subset
$c\subseteq \operatorname {mv}(a,b)$
such that if
$r\in \operatorname {mv}(a,b)$
, then there is
$s\in c$
such that
$s\subseteq r$
. Such a c is said to be full in
$\operatorname {mv}(a,b)$
.
Then the following hold:
Proposition 2.9 [Reference Aczel and Rathjen5, Reference Aczel and Rathjen6].
-
1. (
$\mathsf {CZF}^-$ ) Subset Collection is equivalent to Fullness.
-
2. (
$\mathsf {CZF}^-$ ) Power Set implies Subset Collection.
-
3. (
$\mathsf {CZF}^-$ ) Subset Collection proves the function set
${^a}b$ exists for all a and b.
-
4. (
$\mathsf {CZF}^-$ ) If
$\Delta _0\text {-} \mathsf {LEM}$ holds, then Subset Collection implies Power Set.
We will not provide a proof for the above proposition, but the reader may consult with [Reference Aczel and Rathjen5] or [Reference Aczel and Rathjen6] for its proof. We also note here that [Reference Rathjen45] showed that Subset Collection does not increase the proof-theoretic strength of
$\mathsf {CZF}^-$
, while [Reference Rathjen51] showed that the Axiom of Power Set does.
The following lemma is useful to establish (1) of Proposition 2.9, and is also useful to treat multi-valued functions in general:
Lemma 2.10. Let
$R \colon A\rightrightarrows B$
be a multi-valued function. Define
$\mathcal {A}(R) \colon A\rightrightarrows A\times B$
by

For
$S\subseteq A\times B$
, let
$\mathcal {A}^S(R) = \{ \langle a, \langle a, b \rangle \rangle | \langle a, b \rangle \in R \cap S\}$
.Footnote
5
Then
$:$
-
1.
${\mathcal {A}^S(R)} \colon A\rightrightarrows S \iff R\cap S \colon A\rightrightarrows B$ ,
-
2.
${\mathcal {A}^S(R)} \colon A\leftleftarrows S\iff S\subseteq R$ .
Proof For the first statement, observe that
$\mathcal {A}^S(R) \colon A\rightrightarrows S$
is equivalent to

By the definition of
$\mathcal {A}^S$
, this is equivalent to

We can see that the above statement is equivalent to
$\forall a\in A\exists b\in B [\langle a,b\rangle \in R\cap S]$
, which is the definition of
$R\cap S \colon A\rightrightarrows B$
. For the second claim, observe that
$\mathcal {A}^S(R) \colon A\leftleftarrows S$
is equivalent to

By rewriting
$\mathcal {A}^S$
to its definition, we have

We can see that it is equivalent to
$S\subseteq R$
.
The following lemma is useful when we work with multi-valued functions because it allows us to replace class multi-valued functions over A with set multi-valued functions in A:
Lemma 2.11. Assume that A satisfies second-order Strong Collection, that is, for every
$a\in A$
and
$R \colon a\rightrightarrows A$
, we have
$b\in A$
such that
.Footnote
6
If
$a\in A$
and
$R \colon a\rightrightarrows A$
, then there is a set
$c\in A$
such that
$c\subseteq R$
and
$c \colon a\rightrightarrows A$
.
Proof Consider
$\mathcal {A}(R) \colon a\rightrightarrows a\times A$
. By second-order Strong Collection over A, there is
$c\in A$
such that
. Hence by Lemma 2.10, we have
$c\subseteq R$
and
$c \colon a\rightrightarrows A$
.
It is known that every theorem of
$\mathsf {CZF}$
is also provable in
$\mathsf {IZF}$
. Moreover,
$\mathsf {IZF}$
is quite strong in the sense that its proof-theoretic strength is the same as that of
$\mathsf {ZF}$
. On the other hand, it is known that the proof-theoretic strength of
$\mathsf {CZF}$
is equal to that of Kripke–Platek set theory
$\mathsf {KP}$
.
$\mathsf {IZF}$
is deemed to be impredicative due to the presence of Full Separation and Power Set.Footnote
7
On the other hand,
$\mathsf {CZF}$
is viewed as predicative since it allows for a type-theoretic interpretation such as the one given by Aczel [Reference Aczel1]. However, adding the full law of excluded middle to
$\mathsf {IZF}$
or
$\mathsf {CZF}$
results in the same theory, namely
$\mathsf {ZF}$
.
2.3 Kripke–Platek set theory
Kripke–Platek set theory is the natural intermediate theory between arithmetic and stronger set theories like
$\mathsf {ZF}$
.
$\mathsf {KP}$
has a natural intuitionistic counterpart called Intuitionistic Kripke–Platek, which we denote by
$\mathsf {IKP}$
.
Definition 2.12.
$\mathsf {IKP}$
is the theory consisting of Extensionality, Pairing, Union, Infinity, Set Induction,
$\Delta _0$
-Collection, and
$\Delta _0$
-Separation.
Remark 2.13. The reader is reminded that there are different formulations of
$\mathsf {KP}$
and
$\mathsf {IKP}$
.
-
1. Some authors such as in [Reference Avigad9] restricts Set Induction in
$\mathsf {KP}$ and
$\mathsf {IKP}$ to
$\Pi _1$ -formulas. We include Full Set Induction in
$\mathsf {KP}$ and
$\mathsf {IKP}$ .
-
2. The formulation of Infinity over
$\mathsf {IKP}$ is more subtle. Some authors such as [Reference Avigad9] exclude Infinity from
$\mathsf {KP}$ and
$\mathsf {IKP}$ , and denote
$\mathsf {KP}$ with Infinity
$\mathsf {KP\omega }$ . We also have an apparently stronger formulation, namely Strong Infinity, which is defined as follows: let
$\mathsf {Ind}(a)$ be the formula
$\varnothing \in a\land \forall x\in a (x\cup \{x\}\in a)$ . Then Strong Infinity is the statement
$$ \begin{align*} \exists a [\mathsf{Ind} (a) \land \forall b[\mathsf{Ind}(b)\to a\subseteq b]]. \end{align*} $$
$$ \begin{align*} \exists a [\mathsf{Ind}(a)\land \forall x\in a[x=0\lor \exists y\in a (x=y\cup\{y\})]]. \end{align*} $$
$\mathsf {IKP}$ . The equivalence of Strong Infinity and Lubarsky’s Infinity is easy to prove. The harder part is proving Strong Infinity from Infinity. This is done over
$\mathsf {CZF}$ in Proposition 4.7 of [Reference Aczel and Rathjen5], and one can verify that the proof also works over
$\mathsf {IKP}$ .
Finally, let us observe that
$\mathsf {IKP}$
proves Collection for a broader class of formulas named
$\Sigma $
-formulas:
Definition 2.14. The collection of
$\Sigma $
-formulas is the least collection which contains the
$\Delta _0$
-formulas and is closed under conjunction, disjunction, bounded quantifications, and unbounded
$\exists $
.
Theorem 2.15. For every
$\Sigma $
-formula
$\varphi (x, y, u)$
the following is a theorem of
$\mathsf {IKP}$
: For every set u, if
$\forall x \in a \exists y \varphi (x, y, u)$
then there is a set b such that

We refer the reader to Section 19 of [Reference Aczel and Rathjen6] or Section 11 of [Reference Aczel and Rathjen5] for some of the basic axiomatic consequences of
$\mathsf {IKP}$
and their proofs.
2.4 Inductive definition
Various recursive constructions on
$\mathsf {CZF}$
are given by inductive definitions. The reader might refer to [Reference Aczel and Rathjen5] or [Reference Aczel and Rathjen6] to see general information about the inductive definition, but we will review some of the details for the reader who are not familiar with it.
Definition 2.16. An inductive definition
$\Phi $
is a class of pairs
$\langle X,a\rangle $
. To any inductive definition
$\Phi $
, associate the operator
$\Gamma _\Phi (C)=\{a\mid \exists X\subseteq C \langle X,a\rangle \in \Phi \}$
. A class C is
$\Phi $
-closed if
$\Gamma _\Phi (C)\subseteq C$
.
We may think of
$\Phi $
as a generalization of a deductive system, and
$\Gamma _\Phi (C)$
as a class of theorems derivable from the class of axioms C. Some authors use the notation
$X\vdash _\Phi a$
or
$X/a\in \Phi $
instead of
$\langle X,a\rangle \in \Phi $
. The following theorem says each inductive definition induces a least fixed point:
Theorem 2.17 (Class Inductive Definition Theorem,
$\mathsf {CZF}^-$
).
Let
$\Phi $
be an inductive definition. Then there is the smallest
$\Phi $
-closed class
$I(\Phi )$
.
The following lemma is the essential tool for the proof of the Class Inductive Definition Theorem. See Lemma 12.1.2 of [Reference Aczel and Rathjen6] for its proof:
Lemma 2.18 (
$\mathsf {CZF}^-$
).
Every inductive definition
$\Phi $
has a corresponding iteration class J, which satisfies
$J^a=\Gamma \left (\bigcup _{x\in a}J^x\right )$
for all a, where
$J^a=\{x\mid \langle a,x\rangle \in J\}$
.
2.5 Constructive L
In this subsection, we will define the constructible universe L and discuss its properties over
$\mathsf {IKP}$
.
Constructing L under a constructive manner was first studied by Lubarsky [Reference Lubarsky35]. Lubarsky developed the properties of L over
$\mathsf {IZF}$
, and showed that
$\mathsf {IZF}$
proves L satisfies
$\mathsf {IZF}$
plus
$V=L$
. Crosilla [Reference Crosilla14] showed that the construction of L carries over to
$\mathsf {IKP}$
Footnote
8
.
There are at least two ways of defining L: the first is using definability over a set model, which Lubarsky [Reference Lubarsky35] and Crosilla [Reference Crosilla14] had taken. Another one is using fundamental operations, also called Gödel operations in the classical context, which was taken by the second author in [Reference Matthews38]. We will follow the second method.
Definition 2.19 (Fundamental operations).
Define
-
•
$1^{st}(x)=a$ iff
$\exists u\in x\exists b\in u(x=\langle a,b\rangle )$ ,
-
•
$2^{nd}(x)=b$ iff
$\exists u\in x\exists a\in u(x=\langle a,b\rangle )$ ,
-
•
$y^"\{z\} := \{u\mid \langle z,u\rangle \in y\}$ .
Then we define the fundamental operations as follows:
-
•
$\mathcal {F}_p(x,y):=\{x,y\}$ ,
-
•
$\mathcal {F}_\cap (x,y):=x\cap \bigcap y$ ,
-
•
$\mathcal {F}_\cup (x):=\bigcup x$ ,
-
•
$\mathcal {F}_\setminus (x,y) : =x\setminus y$ ,
-
•
$\mathcal {F}_\times (x,y) :=x\times y$ ,
-
•
$\mathcal {F}_\to (x, y){\kern-1.5pt} ={\kern-1.5pt} x {\kern-1.5pt}\cap{\kern-1.5pt} \{ z {\kern-1.5pt}\mid{\kern-1.5pt} y\ \text {is an ordered pair and } (z {\kern-1.5pt}\in{\kern-1.5pt} 1^{st}(y) \rightarrow z {\kern-1.5pt}\in{\kern-1.5pt} 2^{nd}(y) ) \}$ ,
-
•
$\mathcal {F}_\forall (x,y):=\{x^"\{z\}\mid z\in y\}$ ,
-
•
$\mathcal {F}_d(x,y):=\operatorname {dom} x$ ,
-
•
$\mathcal {F}_r(x,y):=\operatorname {ran} x$ ,
-
•
$\mathcal {F}_{123}(x,y):=\{\langle u,v,w\rangle \mid \langle u,v\rangle \in x\land w\in y\}$ ,
-
•
$\mathcal {F}_{132}(x,y):=\{\langle u,w,v\rangle \mid \langle u,v\rangle \in x\land w\in y\}$ ,
-
•
$\mathcal {F}_=(x,y):=\{\langle v,u\rangle \in y\times x\mid u=v\}$ ,
-
•
$\mathcal {F}_\in (x,y):=\{\langle v,u\rangle \in y\times x\mid u\in v\}$ .
For simplicity, we shall let
$\mathcal {I}$
be the set of all indices i of
$\mathcal {F}_i$
presented in the above definition.
The following lemma says that we can represent every
$\Delta _0$
-formula in terms of fundamental operations:
Lemma 2.20 [Reference Matthews38, Lemma 5.2.4],
$\mathsf {IKP}$
.
Let
$\phi (x_1,\ldots , x_n)$
be a bounded formula whose free variables are all expressed. Then there is a term
$\mathcal {F}_\phi $
built up from fundamental operations such that

Definition 2.21. For a set a, define
-
•
$\mathcal {E}(a) := a\cup \{\mathcal {F}_i(\vec {x})\mid \vec {x}\in a\land i\in \mathcal {I}\}$ ,
-
•
$\mathcal {D}(a):=\mathcal {E}(a\cup \{a\})$ , and
-
•
$\operatorname {Def}(a) := \bigcup _{n\in \omega } \mathcal {D}^n(a)$ .
Definition 2.22. For an ordinal
$\alpha $
, define
$L_\alpha :=\bigcup _{\beta \in \alpha }\operatorname {Def}(L_\beta )$
and
$L:=\bigcup _{\alpha \in \mathrm {Ord}} L_\alpha $
.
Then we have the following properties of the constructible hierarchy:
Proposition 2.23 [Reference Matthews38, Proposition 5.3.12],
$\mathsf {IKP}$
.
For all ordinals
$\alpha ,\beta :$
-
1. If
$\beta \in \alpha $ , then
$L_\beta \subseteq L_\alpha $ ,
-
2.
$L_\alpha \in L_{\alpha +1}$ ,
-
3.
$L_\alpha $ is transitive, and
-
4.
$L_\alpha $ is a model of Bounded Separation.
Moreover we can see that
$\mathsf {IKP}$
proves L is a model of
$\mathsf {IKP}$
:
Theorem 2.24 [Reference Matthews38, Theorems 5.3.6 and 5.3.7],
$\mathsf {IKP}$
.
$\mathsf {IKP}^L$
holds. That is, if
$\sigma $
is a theorem of
$\mathsf {IKP}$
, then
$\mathsf {IKP}$
proves
$\sigma ^L$
. Furthermore, L thinks
$V=L$
holds.
We will not examine its proof in detail, but it is still worthwhile to mention relevant notions that are necessary for the proof. One of these is hereditary addition, which was first formulated by Lubarsky [Reference Lubarsky35]. This is necessary because
$\alpha \in \beta $
does not entail
$\alpha +1\in \beta +1$
constructively.
Definition 2.25. For ordinals
$\alpha $
and
$\gamma $
, define
$\alpha +_H\gamma $
by recursion on
$\alpha $
:

Another relevant notion is augmented ordinals. Lubarsky [Reference Lubarsky35] introduced this notion to develop properties of the constructible hierarchy over
$\mathsf {IZF}$
. Augmented ordinals are not needed to verify that L is a model of
$\mathsf {IKP}$
, but are used to prove the Axiom of Constructibility, and we will use them when working with elementary embeddings.
Definition 2.26. Let
$\alpha $
be an ordinal. Then
$\alpha $
augmented,
$\alpha ^\#$
, is defined recursively on
$\alpha $
as

Remark 2.27. In [Reference Matthews38], the second author distinguished Strong Infinity and Infinity, and take a more cautious way to define L: first, one defines a subsidiary hierarchy
$\mathbb {L}_\alpha $
by
$\mathbb {L}_\alpha :=\bigcup _{\beta \in \alpha } \mathcal {D}(\mathbb {L}_\beta )$
. Then define
$\mathbb {L}:=\bigcup _{\alpha \in \mathrm {Ord}}\mathbb {L}_\alpha $
. Finally, it is shown that if Strong Infinity holds, then
$\mathbb {L}$
satisfies
$\mathsf {IKP}$
and
$\mathbb {L}=L$
.
The reason for taking this method is that the equivalence of Infinity and Strong Infinity over
$\mathsf {IKP}$
without Infinity is quite non-trivial. This way of defining L also has the benefit that it works even in the absence of any form of Infinity.
Remark 2.28. Unlike either
$\mathsf {IKP}$
or
$\mathsf {IZF}$
,
$\mathsf {CZF}$
does not prove that L satisfies
$\mathsf {CZF}$
, in particular
$\mathsf {CZF}$
cannot prove that Exponentiation holds in L. Crosilla [Reference Crosilla14] showed that
$\mathsf {CZF}$
proves L validates
$\mathsf {IKP}$
with full Collection. Full details of the above construction using fundamental operations, alongside a full investigation of which axioms of
$\mathsf {CZF}$
hold in L were undertaken by the second author and Michael Rathjen in [Reference Matthews and Rathjen40].
2.6 Second-order set theories
We need a second-order formulation of constructive set theories in order to define large set axioms corresponding to large cardinals beyond choice. We will formulate constructive analogues of Gödel–Bernays set theory
$\mathsf {GB}$
whose first-order counterparts are
$\mathsf {CZF}$
and
$\mathsf {IZF}$
, respectively. We will use
$\forall ^0$
and
$\exists ^0$
for quantifications over sets, and
$\forall ^1$
and
$\exists ^1$
for quantifications over classes. We omit the superscript if the context is clear. Following standard conventions, we will use uppercase letters for classes and lowercase letters for sets, unless specified otherwise. The reader could consult with [Reference Williams64] if they are interested in classical second-order set theory.
Definition 2.29.
Constructive Gödel–Bernays set theory (
$\mathsf {CGB}$
) is defined as follows: the language of
$\mathsf {CGB}$
is two-sorted, that is,
$\mathsf {CGB}$
has sets and classes as its objects.
$\mathsf {CGB}$
comprises the following axioms:
-
• Axioms of
$\mathsf {CZF}$ for sets.
-
• Every set is a class, and every element of a class is a set.
-
• Class Extensionality: two classes are equal if they have the same set of members. Formally,
$$ \begin{align*} \forall^1 X,Y [X=Y\leftrightarrow \forall^0x (x\in X \leftrightarrow x\in Y)]. \end{align*} $$
-
• Elementary Comprehension: if
$\phi (x,p,C)$ is a first-order formula with a class parameter C, then there is a class A such that
$A=\{x\mid \phi (x,p,C)\}$ . Formally,
$$ \begin{align*} \forall^0p\forall^1C\exists^1 A \forall^0x [x\in A\leftrightarrow \phi(x,p,C)]. \end{align*} $$
-
• Class Set Induction: if A is a class, and if we know a set x is a member of A if every element of x belongs to A, then A is the class of all sets. Formally,
$$ \begin{align*} \forall^1A \big[ [\forall^0 x(\forall^0y\in x (y\in A)\to x\in A)]\to \forall^0x (x\in A) \big]. \end{align*} $$
-
• Class Strong Collection: if R is a class multi-valued function from a set a to the class of all sets, then there is a set b which is an ‘image’ of a under R. Formally,
Remark 2.30. We do not need to add the Class Subset Collection axiom

asFootnote 9 an axiom of
$\mathsf {CGB}$
because it is derivable from the current formulation of
$\mathsf {CGB}$
: we can prove it from the first-order Fullness and the Class Strong Collection by mimicking the proof of Set Collection from Fullness and Strong Collection.
The reader should also note that, unlike classical
$\mathsf {GB}$
, there is no additional Separation axiom for sets: that is, we do not assume
$a{\kern-1pt}\cap{\kern-1pt} A{\kern-1pt}={\kern-1pt}\{x{\kern-1pt}\in{\kern-1pt} a {\kern-1pt}\mid{\kern-1pt} x{\kern-1pt}\in{\kern-1pt} A\}$
is a set for a given class A. In fact, the assumption that
$a\cap A$
is always a set implies Full Separation due to Elementary Comprehension.
Thus we introduce a new terminology for classes such that
$A\cap a$
is always a set:
Definition 2.31. A class A is amenable if
$A\cap a$
is a set for any set a.
The following lemma shows every class function is amenable:
Lemma 2.32 (
$\mathsf {CGB}$
).
Let F be a class function, then F is amenable.
Proof It suffices to show that if F is a class function then
$F\operatorname {\mathrm {\upharpoonright }} a$
is a set for each
$a\in V$
. Consider the class function
$\mathcal {A}(F\operatorname {\mathrm {\upharpoonright }} a):a\to a\times V$
, where
$\mathcal {A}$
is the operation we defined in Lemma 2.10. By Class Strong Collection, we can find a set b such that
.
We claim that
$b=F\operatorname {\mathrm {\upharpoonright }} a$
. For
$F\operatorname {\mathrm {\upharpoonright }} a\subseteq b$
, if
$x\in a$
then we can find
$y\in b$
such that
$\langle x,y\rangle \in \mathcal {A}(F\operatorname {\mathrm {\upharpoonright }} a)$
. Hence
$y=\langle x,z\rangle \in F\operatorname {\mathrm {\upharpoonright }} a$
for some z. By functionality of F, we have
$y=\langle x,F(x)\rangle $
. For the remaining inclusion, if
$y\in b$
, then there is
$x\in a$
such that
$\langle x,y\rangle \in \mathcal {A}(F\operatorname {\mathrm {\upharpoonright }} a)$
. Now we can prove that
$y=\langle x,F(x)\rangle $
since F is a class function.
Next, we define the second-order variant of
$\mathsf {IZF}$
which we denote by
$\mathsf {IGB}$
.
$\mathsf {IGB}$
allows Separation for arbitrary classes as classical
$\mathsf {GB}$
does.
Definition 2.33.
Intuitionistic Gödel–Bernays set theory (
$\mathsf {IGB}$
) is obtained by adding the following axioms to
$\mathsf {CGB}$
:
-
1. Axioms of
$\mathsf {IZF}$ for sets.
-
2. Class Separation: every class is amenable.
We know that classical
$\mathsf {GB}$
is a conservative extension of
$\mathsf {ZF}$
. We expect the same results to hold for constructive set theories. The following theorem shows it actually holds; however, its proof will require a small amount of proof theory. We borrow ideas of the proof from [63].
Proposition 2.34.
$\mathsf {CGB}$
is conservative over
$\mathsf {CZF}$
.
$\mathsf {IGB}$
is conservative over
$\mathsf {IZF}$
.
Proof We only provide the proof for the conservativity of
$\mathsf {CGB}$
over
$\mathsf {CZF}$
since the same argument applies to the conservativity of
$\mathsf {IGB}$
over
$\mathsf {IZF}$
. We will rely on the cut-elimination theorem of intuitionistic predicate logic. The reader who might be unfamiliar with this can consult with [Reference Negri and Von Plato44] or [Reference Arai8].
Assume that
$\sigma $
is a first-order sentence that is deducible from
$\mathsf {CGB}$
. Then we have a cut-free derivation of
$\sigma $
from a finite set
$\Gamma $
of axioms of
$\mathsf {CGB}$
. It is known that every cut-free derivation satisfies the subformula property, that is, every formula appearing in the deduction is a subformula of
$\sigma $
or
$\Gamma $
.
Thus we have a finite set
$\{X_0,\ldots , X_n\}$
of class variables that appear in the deduction. Now we divide into the following cases:
-
1. If
$X_i$ appears in an instance of Class Comprehension in the deduction, then the deduction contains
$x\in X_i\leftrightarrow \phi (x,p,C)$ for some class-quantifier free
$\phi $ . Now replace every
$x\in X_i$ with
$\phi (x,p,C)$ , and
$X_i=Y$ with
$\forall x (\phi (x,p,C)\leftrightarrow x\in Y)$ in the deduction.
-
2. Otherwise, replace
$x\in X_i$ with
$x=x$ , and
$X_i=Y$ with
$\forall x (x=x\leftrightarrow x\in Y)$ .
Then we can see that the resulting deduction is a first-order proof of
$\sigma $
. Hence
$\mathsf {CGB}\vdash \sigma $
.
2.7 Second-order set theories with infinite connectives
We will use infinite conjunctions to circumvent technical issues about defining elementary embeddings. Hence we define an appropriate infinitary logic and the corresponding second-order set theories. The following definition appears in [Reference Negri43]:
Definition 2.35 (
$\mathsf {G3i_\omega }$
[Reference Negri43]).
The intuitionistic cut-free first-order sequent calculus
$\mathsf {G3i_\omega }$
is defined by the following rules: initial sequents are of the form
$P,\Gamma \implies \Delta ,P$
and its deduction rules are:

The difference between finitary logic and
$\mathsf {G3i}_\omega $
is that the latter allows for infinite conjunctions and disjunctions over countable sets of formulas. This will only be needed in order to formalise the concepts of Super Reinhardt sets and the total Reinhardtness of V in Section 4.2. Otherwise, throughout this paper, we will stick to finitary logic.
It is well-known that the usual classical and intuitionistic predicate calculus enjoys the cut-elimination rule. This is the same for
$\mathsf {G3i_\omega }$
. We state the following proposition without proof. The reader might refer to [Reference Negri43] for its proof.
Proposition 2.36 (Cut elimination for
$\mathsf {G3i_\omega }$
).
$\mathsf {G3i_\omega }$
with the cut rule proves the same sequents as
$\mathsf {G3i_\omega }$
itself
$($
that is, without the cut rule
$)$
.
Finally, we are ready to define set theories over
$\mathsf {G3i_\omega }$
:
Definition 2.37.
$\mathsf {CGB}_\infty $
and
$\mathsf {IGB}_\infty $
are obtained by replacing the underlying logic of
$\mathsf {CGB}$
and
$\mathsf {IGB}$
with
$\mathsf {G3i_\omega }$
, respectively.
Note that the infinitary logic
$\mathsf {G3i_\omega }$
has no role in defining new sets from the given sets: for example, we do not have Separation schemes for infinite conjunctions and disjunctions. The only reason we are using
$\mathsf {G3i}$
is to enhance the expressive power of set theories as we introduced classes into first-order set theories. The following theorem shows that infinite connectives do not provide new theorems into
$\mathsf {CGB}$
or
$\mathsf {IGB}$
:
Proposition 2.38.
$\mathsf {CGB}_\infty $
is conservative over
$\mathsf {CGB}$
for formulas with no infinite connectives. The analogous result also holds between
$\mathsf {IGB}_\infty $
and
$\mathsf {IGB}$
.
Proof We only prove the first claim since the proof for the second claim is identical. We can see from Proposition 2.36 that the subformula property for
$\mathsf {G3i_\omega }$
holds: that is, every formula in a derivation of
$\Gamma \Rightarrow \Delta $
is a subformula of
$\Gamma $
or
$\Delta $
.
Now assume that we have a derivation
$\Gamma \implies \sigma $
from a finite set
$\Gamma $
of axioms of
$\mathsf {CGB}$
and an infinite-connective free formula
$\sigma $
. By the subformula property, the derivation has no infinite connectives. This means that the entire derivation can be done without rules for infinite connectives. Hence
$\mathsf {CGB}$
proves
$\sigma $
.
3 Small large set axioms
In this and the next section, we will discuss large set axioms, which is an analogue of large cardinal axioms over constructive set theories. Since ordinals over
$\mathsf {CZF}$
could be badly behaved (for example, they need not be well-ordered), we focus on the structural properties of given sets to obtain higher infinities over
$\mathsf {CZF}$
. This approach is not unusual even under classical context since many large cardinals over
$\mathsf {ZF}$
are characterized and defined by structural properties of
$V_\kappa $
or
$H_\kappa $
. (One example would be the definition of indescribable cardinals.) We also compare the relation between large cardinal axioms over well-known theories like
$\mathsf {ZF}$
and its large set axiom counterparts.
The first large set notions over
$\mathsf {CZF}$
would be regular sets. Regular sets appear first in Aczel’s paper [Reference Aczel3] about inductive definitions over
$\mathsf {CZF}$
. As we will see later, regular sets can ‘internalize’ most inductive constructions, which turns out to be useful in many practical cases. We will follow definitions given by [Reference Ziegler67], and will briefly discuss differences in the terminology between different references.
Definition 3.1. A transitive set A is:
-
1. Regular if it satisfies second-order Strong Collection:
-
2. Weakly regular if it satisfies second-order Collection,
-
3. Functionally regular if it satisfies second-order Replacement,
-
4.
$\bigcup $ -regular if it is regular and
$\bigcup a\in A$ for all
$a\in A$ ,
-
5. Strongly regular if it is
$\bigcup $ -regular and
${^{a}}{}b\in A$ for all
$a, b \in A$ , and
-
6. Inaccessible if
$(A,\in )$ is a model of second-order
$\mathsf {CZF}$ .
The Regular Extension Axiom
$\mathsf {REA}$
asserts that every set is contained in some regular set. The Inaccessible Extension Axiom
$\mathsf {IEA}$
asserts that every set is contained in an inaccessible set.
Note that there is an alternative characterization of inaccessible sets:
Lemma 3.2 [Reference Aczel and Rathjen5, Corollary 10.27],
$\mathsf {CZF}$
.
A regular set A is inaccessible if and only if A satisfies the following conditions
$:$
-
1.
$\omega \in A$ ,
-
2.
$\bigcup a\in A$ if
$a\in A$ ,
-
3.
$\bigcap a\in A$ if
$a\in A$ and a is inhabited, and
-
4. for
$a,b\in A$ we can find
$c\in A$ such that c is full in
$\operatorname {mv}(a,b)$ .
Corollary 3.3. Inaccessible sets are strongly regular. Especially, if
$a,b\in A$
and A is inaccessible, then
${}^{a}b\in A$
.
Proof Let A be an inaccessible set and
$a,b\in A$
. If
$r \colon a\to b$
is a function and
$c\in A$
is full in
$\operatorname {mv}(a,b)$
, then we can find a multi-valued function
$s\in c$
of the domain a such that
$s\subseteq r$
. Hence
$r=s\in c\in A$
, which proves
$r\in A$
, and thus we have
${{}^a}b\subseteq A$
. To see
${{}^a}b\in A$
, since A satisfies fullness, we can find
$c\in A$
which is full in
$\operatorname {mv}(a,b)\cap A$
. Then
${{}^a}b\subseteq \operatorname {mv}(a,b)\cap A$
, and so
${{}^a}b=\{f\in c\mid f\colon a\to b\}\in A$
by
$\Delta _0$
-Separation over A.
There is no need for the notion ‘pair-closed regular sets’ since every regular set is closed under pairings if it contains 2:
Lemma 3.4 [Reference Aczel and Rathjen6, Lemma 11.1.5],
$\mathsf {CZF}^-$
.
If A is regular and
$2\in A$
, then
$\langle a,b\rangle \in A$
for all
$a,b\in A$
.
$\mathsf {REA}$
has various consequences. For example,
$\mathsf {\mathsf {CZF}^-+REA}$
proves Subset Collection. Moreover, it also proves that every bounded inductive definition
$\Phi $
has a set-sized fixed point
$I(\Phi )$
. (See [Reference Aczel and Rathjen5] or [Reference Aczel and Rathjen6] for details.)
The notion of regular set is quite restrictive, as it does not have Separation axioms, not even for
$\Delta _0$
-formulas. Thus we have no way to do any internal construction over a regular set. The following notion is a strengthening of regular set, which resolves the issue of internal construction:
Definition 3.5. A regular set A is BCST-regular if
$A\models \mathsf {BCST}$
. Equivalently, A is a regular set satisfying Union, Pairing, Empty set, and Binary Intersection.
We do not know if
$\mathsf {CZF}$
proves every regular set is BCST-regular, although Lubarsky and Rathjen [Reference Rathjen and Lubarsky56] proved that the set of all hereditarily countable sets in the Feferman–Levy model is functionally regular but not
$\bigcup $
-regular. It is not even known that the existence of a regular set implies that of a BCST-regular set. However, every inaccessible set is BCST-regular, and every BCST-regular set we work with in this paper will also be inaccessible.
What are regular sets and inaccessible sets in the classical context? The following result illustrates what these sets look like under some well-known classical set theories:
Proposition 3.6.
-
1. (
$\mathsf {ZF}^-$ ) Every
$\bigcup $ -regular set containing
$2$ is a transitive model of second-order
$\mathsf {ZF}^-$ ,
$\mathsf {ZF}^-_2$ .
-
2. (
$\mathsf {ZFC}^-$ ) Every
$\bigcup $ -regular set containing
$2$ is of the form
$H_\kappa $ for some regular cardinal
$\kappa $ .
-
3. (
$\mathsf {ZF}^-$ ) Every inaccessible set is of the form
$V_\kappa $ for some inaccessible cardinal
$\kappa $ .Footnote 10
Proof
-
1. Let A be a
$\bigcup $ -regular set containing
$2$ . We know that A satisfies Extensionality, Set Induction, Union, and the second-order Collection. Hence it remains to show that second-order Separation holds.
Let
$X\subseteq A$ ,
$a\in A$ , and suppose that
$X\cap a$ is inhabited. Fix c in this intersection. Now consider the function
$f \colon a\to A$ defined by
$$ \begin{align*} f(x)=\begin{cases} x,&\text{if }x\in X,\\ c,&\text{otherwise}. \end{cases} \end{align*} $$
$b\in A$ such that
. It is easy to see that
$b=X\cap a$ holds.
-
2. Let A be a regular set. Let
$\kappa $ be the least ordinal that is not a member of A. Then
$\kappa $ must be a regular cardinal: if not, there is
$\alpha <\kappa $ and a cofinal map
$f \colon \alpha \to \kappa $ . By transitivity of A and the definition of
$\kappa $ , we have
$\alpha \in A$ , so
$\kappa \in A$ by the second-order Replacement and Union, a contradiction.
We can see that
$\mathsf {ZFC}^-$ proves
$H_\kappa =\{x \mid |\operatorname {TC}(x)|<\kappa \}$ is a class model of
$\mathsf {ZFC}^-$ , and A satisfies the Well-Ordering Principle. We can also show that
$A\subseteq H_\kappa $ holds: if not, there is a set
$x\in A\setminus H_\kappa $ . Since A is closed under transitive closures, A contains a set whose cardinality is at least
$\kappa $ . Now derive a contradiction from second-order Replacement.
We know that
$A\cap \mathrm {Ord}= H_\kappa \cap \mathrm {Ord}=\kappa $ . By second-order Separation over A,
$\mathcal {P}(\mathrm {Ord})\cap A=\mathcal {P}(\mathrm {Ord})\cap H_\kappa $ . Hence we have
$A=H_\kappa $ : for each
$x\in H_\kappa $ , we can find
$\theta <\kappa $ ,
$R\subseteq \theta \times \theta $ , and
$X\subseteq \theta $ such that
$(\operatorname {TC}(x),\in ,x)\cong (\theta ,R,X)$ . (Here we treat x as a unary relation.) Then
$(\theta ,R,X)\in A$ , so
$x\in A$ by Mostowski Collapsing Lemma.
-
3. If A is inaccessible, then A is closed under the true power set of its elements since second-order Subset Collection implies A is closed under exponentiation
$a,b\mapsto {}^ab$ . Hence A must be of the form
$V_\kappa $ for some
$\kappa $ . Moreover,
$\kappa $ is inaccessible because
$V_\kappa =A\models \mathsf {ZF_2}$ .
It is easy to see from results in [Reference Rathjen and Lubarsky56] that
$\mathsf {ZF}$
proves every rank of a regular set is a regular cardinal. However, the complete characterization of regular sets in a classical context is open:
Question 3.7. Is there a characterization of regular sets over
$\mathsf {ZFC}$
? How about
$\bigcup $
-regular sets over
$\mathsf {ZF}^-$
?
While we can see that every inaccessible set over
$\mathsf {ZF}$
is closed under power sets, there is no reason to believe that inaccessible sets over constructive set theories are closed under power sets, even if the Axiom of Power Set holds. Hence we introduce the following notion:
Definition 3.8. An inaccessible set K is power inaccessible if, for every
$a\in K$
, every subset of a belongs to K.
$\mathsf {pIEA}$
is the assertion that every set is an element of some power inaccessible set.
Notice that the existence of a power inaccessible set implies the Axiom of Power Set over
$\mathsf {CZF}$
. This is because if K is power inaccessible, then
$\mathcal {P}(1)\subseteq K$
, and from this one can see that every set has a power set since there is a bijection between
$\mathcal {P}(a)$
and
${^a}\mathcal {P}(1)$
through characteristic functions, see Proposition 10.1.1 of [Reference Aczel and Rathjen6] for further details. We will mention power inaccessible sets only in the context of
$\mathsf {IZF}$
. Also, the reader should note that [Reference Friedman and Ščedrov16, Reference Matthews38] use the word ‘inaccessible sets’ to denote power inaccessible sets.
The reader is reminded that the meaning of an inaccessible set varies over the references. On the one hand, [Reference Aczel and Rathjen6, Reference Ziegler67] follow our definition of inaccessibility. On the other hand, other references like [Reference Aczel and Rathjen5, Reference Rathjen46, Reference Rathjen54, Reference Rathjen, Griffor and Palmgren55] use the following definition, which we will call REA-inaccessibility:
Definition 3.9. A set I is REA-inaccessible if I is inaccessible and
$I\models \mathsf {REA}$
.
The disagreement in the terminology may come from the difference between their set-theoretic and proof-theoretic properties. On the one hand, Proposition 3.6 shows that inaccessible sets over
$\mathsf {ZF}^-$
are exactly of the form
$V_\kappa $
for some inaccessible cardinal
$\kappa $
. However, an REA-inaccessible set over
$\mathsf {ZF}^-$
is not only of the form
$V_\kappa $
for some inaccessible
$\kappa $
, but also satisfies
$\mathsf {REA}$
. Gitik [Reference Gitik22] proved that
$\mathsf {ZF}$
with no regular cardinal other than
$\omega $
is consistent if there is a proper class of strongly compact cardinals, and Gitik’s construction can carry over
$V_\kappa $
for an inaccessible cardinal
$\kappa $
which is a limit of strongly compact cardinals while preserving the inaccessibility of
$\kappa $
. Thus
$\mathsf {ZF}$
with the existence of an inaccessible set does not imply there is an REA-inaccessible set.
On the other hand, [Reference Rathjen48] showed that the theory
$\mathsf {CZF+IEA}$
is equiconsistent with
$\mathsf {CZF+REA}$
.Footnote
11
However,
$\mathsf {CZF} + \forall x\exists y(x\in y\land y\ \text {is REA-inaccessible})$
has a stronger consistency strength than that of
$\mathsf {CZF+REA}$
.
Before finishing this section, let us remark that the consistency strength of small large sets over
$\mathsf {CZF}$
is quite weak compared to their counterparts over
$\mathsf {ZF}$
. The reader should refer to [Reference Rathjen45], [Reference Rathjen46], or [Reference Rathjen47] for additional accounts for the following results:
Theorem 3.10 (Rathjen).
Each pair of theories have an equal consistency strength and proves the same
$\Pi ^0_2$
-sentences.
-
1.
$\mathsf {CZF+REA}$ and
$\mathsf {KPi}$ , the theory
$\mathsf {KP}$ with a proper class of admissible ordinals.
-
2.
$\mathsf {CZF}+\forall x\exists y (x\in y\land y\ \text {is REA-inaccessible})$ and
$\mathsf {KPI}$ , the theory
$\mathsf {KP}$ with a proper class of recursively inaccessible ordinals.
4 Large large set axioms
There is no reason to refrain from defining larger large sets. Hence we define stronger large set axioms. We have two ways of defining large cardinals above measurable cardinals: elementary embeddings and ultrafilters. On a fundamental level, ultrafilters make use of the law of excluded middle because either a set or its complement must be in the ultrafilter. This means that it is possible to prove that a set is in the ultrafilter by showing that a different set is not in the ultrafilter, an inherently nonconstructive principle. On a more structural note, the ultrafilter characterization does not immediately entail the existence of smaller large set notions. For example, it is well-known that it is possible to have a model of
$\mathsf {ZF}$
in which there is a non-principal, countably complete ultrafilter over
$\omega _1$
. To obtain the consistency of inaccessible cardinals from this, one is then obliged to work in an inner model, which adds additional complexity to the arguments.
Unlike ultrafilters, the embedding characterization will give direct, positive results on large sets even when working in a weak constructive setting. For example, we will see in Corollary 5.4 that, over
$\mathsf {IKP}$
, if K is a transitive set (which satisfies some minor additional assumptions) which is a ‘critical point’ of an elementary embedding
$j \colon V\to M$
, then K is a model of
$\mathsf {IZF+pIEA}$
and more. Therefore, we will use the elementary embedding characterizations to access the notions of measurable cardinals and stronger principles.
4.1 Critical sets and Reinhardt sets
In this section, we will give the basic tools that we shall need to work with elementary embeddings. When working with some embedding
$j \colon V \rightarrow M$
we will not in general be assumed that either j or M is definable in a first-order manner. This means that we cannot state Collection or Separation for formulas involving j or M in a purely first-order way. Therefore, to deal with them, we will need to expand our base theory to accommodate class parameters. This leads to the following convention, whose main purpose is to simplify later notation.
Convention 4.1. Let T be a ‘reasonable’ first-order set theory over the language
$\{\in \}$
such as
$\mathsf {CZF}$
,
$\mathsf {IKP}$
, or
$\mathsf {IZF}$
and let A be a class predicate. The theory
$T_{A}$
is the theory with the same axiom schemes as T in the language expanded to include the predicate A. For example,
$\mathsf {CZF}_A$
has the axiom schemes of A plus
$\Delta _0$
-Separation, Set Induction, Strong Collection, and Subset Collection in the expanded language
$\{ \in , A \}$
.
Definition 4.2. We work over the language
$\in $
extended by a unary functional symbol j and a unary predicate symbol M. Let T be a ‘reasonable’ set theory such as
$\mathsf {CZF}$
,
$\mathsf {IKP}$
, or
$\mathsf {IZF}$
and suppose that
$V \models T$
. We say that
$j \colon V \rightarrow M$
is a
$($
fully
$)$
elementary embedding if it satisfies the following:
-
1. j is a map from V into M;
$\forall x M(j(x))$ ,
-
2. M is transitive;
$\forall x (M(x) \rightarrow \forall y \in x M(y))$ ,
-
3.
$M \models T$ ,
-
4. (Elementarity)
$\forall \vec {x} [\phi (\vec {x})\leftrightarrow \phi ^M(j(\vec {x}))]$ for every
$\in $ -formula
$\phi $ , where
$\phi ^M$ is the relativization of
$\phi $ over M.
For a class of formulas Q (usually
$Q=\Delta _0$
or
$\Sigma $
), we call
$j\ Q$
-elementary if j satisfies the above definition with (4) restricted to
$\phi $
a Q-formula. Finally, we say that K is a critical point of j if it is a transitive set that satisfies
$K \in j(K)$
and
$j(x)=x$
for all
$x\in K$
.
We begin with the weak theory of the Basic Theory of Elementary Embeddings, originally introduced by Corazza in [Reference Corazza13]. This is the minimal theory required to state that
$j \colon V \rightarrow V$
is an elementary embedding with a critical point but without assuming anything about the relevant Separation, Collection, or Set Induction schemes over the extended language. For our purposes, it will in fact be beneficial to weaken this further to
$\mathsf {BTEE}_M$
which is the minimal theory stating that
$j \colon V \rightarrow M$
is an elementary embedding with a critical point for some
$M \subseteq V$
.
We shall see in Section 5 that
$\mathsf {IKP}_j$
with a critical point implies the consistency of
$\mathsf {IZF + BTEE}$
and in Section 6 that the latter is equiconsistent with
$\mathsf {ZF} + \mathsf {BTEE}$
, which already has a quite high consistency strength.
Definition 4.3. We work over the language
$\in $
extended by a unary functional symbol j and a unary predicate symbol M. Let
$\mathsf {BTEE}_M$
be the assumption that there exists some transitive class
$M \subseteq V$
and elementary embedding
$j\colon V\rightarrow M$
which has a critical point. We drop M and simply use
$\mathsf {BTEE}$
when
$V=M$
holds, that is,
$\forall x M(x)$
. In addition, continuing our previous notation, given a class of formulas Q we let Q-
$\mathsf {BTEE}_M$
denote
$\mathsf {BTEE}_M$
with elementarity replaced by Q-elementarity.
Remark 4.4. If
$V \models T + \mathsf {BTEE}_M$
(where T is a reasonable set theory) we will always assume that
$V \models T_M$
, that is we will assume that M is allowed to appear in the axiom schemes of T.
Classically,
$\mathsf {BTEE}$
is a relatively weak principle which is implied by the large cardinal axiom
$0^\#$
, which is the statement that there is a non-trivial elementary embedding
$j \colon L \rightarrow L$
. The reason this is weak is because we do not have Separation in the extended language which means that we cannot produce sets of the form
$j \operatorname {\mathrm {\upharpoonright }} a$
for arbitrary sets a.
Convention 4.5. For T a reasonable theory, we will slightly abuse the notation
$T_{j, M}$
to extend it to be the theory T plus a fully elementary embedding
$j \colon V \rightarrow M$
which has a critical point and which satisfies the axiom schemes of T in the language expanded to include predicates for j and M. In particular, we will allow j and M to appear in the Separation Schemes of T (for example,
$\Delta _0$
-Separation if
$T = \mathsf {CZF}$
), Set Induction and appropriate Collection axioms of T (for example,
$\Sigma $
-Collection if
$T=\mathsf {IKP}$
or Strong Collection and Subset Collection if
$T=\mathsf {CZF}$
).
Over
$\mathsf {IKP}_j$
, being a critical point of an elementary embedding is sufficient to prove some of the basic properties one would want from the theory of elementary embeddings. For example, one can show that if there is a critical point then there is (at least one) critical point which is an ordinal (see Chapter 9 of [Reference Ziegler67] or Chapter 7 of [Reference Matthews38] for details) or that there is a model of (first-order)
$\mathsf {IZF}$
.
However, such an assumption is not necessarily the most natural counterpart to the classical notion of non-trivial elementary embeddings. For example, consider
$\mathsf {ZFC}$
plus an elementary embedding
$j \colon V \rightarrow M$
with critical point
$\kappa $
. Then
$L_\kappa $
can easily be seen to be a critical point as defined above; however, it is the ‘wrong’ set to work within this context because it is not a model of second-order
$\mathsf {ZFC}$
. In particular, one can show that
$\mathcal {P}(\omega )^L$
is merely a countable set. Instead, what one wants to work with is a critical point which is an inaccessible set, and this is what we shall define next. Note that in weak theories there is no reason that having a critical point will imply the existence of such a set. For example, it is proven in Chapter 10 of [Reference Matthews38] that it is possible to produce a model of
$\mathsf {ZFC}^-$
with a non-trivial elementary embedding, while also having that
$\mathcal {P}(\omega )$
is a proper class, and in such a model there will be no set sized models of second-order
$\mathsf {ZFC}^-$
.
Now let us define critical sets and Reinhardt sets in terms of elementary embeddings:
Definition 4.6 (
$T_{j, M}$
).
Let K be a critical point of an elementary embedding
$j \colon V\to M$
. We call K a critical set if K is also inaccessible. If
$M=V$
and K is transitive and inaccessible, then we call K a Reinhardt set.
We introduce one final definition which is intermediate between
$\mathsf {BTEE}$
and Reinhardt sets; the Wholeness Axiom. This was first formulated by Corazza [Reference Corazza12] and later stratified by Hamkins [Reference Hamkins28] in order to investigate what theory was necessary to produce the Kunen Inconsistency under
$\mathsf {ZFC}$
.
Definition 4.7. The Wholeness axiom
$\mathsf {WA}$
is the combination of
$\mathsf {BTEE}$
and Separation
j
.
$\mathsf {WA}_0$
is the combination of
$\mathsf {BTEE}$
and
$\Delta _0$
-Separation
j
.
We will use the term critical point and critical set simultaneously, so the reader should distinguish the difference between these two terms. (For example, a critical point need not be a critical set unless it is inaccessible.) Note that the definition of critical sets is different from that suggested by Hayut and Karagila [Reference Hayut and Karagila29]: they defined a critical cardinal as a critical point of an elementary embedding
$j \colon V_{\kappa +1}\to M$
for some transitive set M. This is done to ensure that the embedding j is a set and therefore first-order definable. We will instead take the more natural, but not obviously first-order definable, definition where the domain is taken to be the universe, which is what Schlutzenberg calls V-critical in [Reference Schlutzenberg58]. Finding and analyzing the
$\mathsf {CZF}$
-definition of a critical set that is classically equivalent to a critical set in the style of Hayut and Karagila would be good future work. Also, [Reference Ziegler67] uses the term ‘measurable sets’ to denote critical sets, but we will avoid this term for the following reasons: it does not reflect that the definition is given by an elementary embedding, and it could be confusing with measurable sets in measure theory.
We do not know if every elementary embedding
$j \colon V\to M$
over
$\mathsf {CZF}$
enjoys being cofinal. Surprisingly, the following lemma shows that j becomes a cofinal map if
$M=V$
. Note that the following lemma heavily uses Subset Collection. See Theorem 9.37 of [Reference Ziegler67] for its proof.
Proposition 4.8 (Ziegler [Reference Ziegler67],
$\mathsf {CZF}_j$
).
Let
$j \colon V\to V$
be a non-trivial elementary embedding. Then j is cofinal, that is, we can find y such that
$x\in j(y)$
for each x.
Note that [Reference Ziegler67] uses the term set cofinality to denote our notion of cofinality. However, we will use the term cofinality to harmonize the terminology with that of [Reference Matthews39]. Ziegler’s theorem on the cofinality of Reinhardt embedding was an early indication of the high consistency strength of
$\mathsf {CZF}$
with a Reinhardt set. This is because [Reference Matthews39] shows that a related weak set theory, namely
$\mathsf {ZFC}^-$
with the
$\mathsf {DC}_\mu $
-scheme for all cardinals
$\mu $
, already proves there is no cofinal non-trivial elementary embedding
$j \colon V\to V$
.
The following lemma, due to Ziegler [Reference Ziegler67], has an essential role in the proof of Proposition 4.8 and developing properties of large large sets. We note here that the proof will not require V to satisfy any of our axioms in the language expanded with j or for the embedding to be fully elementary and therefore the Lemma is also provable over weaker theories such as
$\mathsf {IKP + BTEE}$
. Note that, when working without Power Set,
$\mathcal {P}(\mathcal {P}(a))$
should be seen as an abbreviation for the class consisting of those sets whose elements are all subsets of a.
Lemma 4.9 (
$\mathsf {IKP} + \Delta _0\text {-}\mathsf {BTEE}_M$
).
Assume that
$j \colon V\to M$
is a
$\Delta _0$
-elementary embedding and K is a transitive set such that
$j(x)=x$
for all
$x\in K$
. Then we have the following results for each
$a\in K$
:
-
1. If
$b\subseteq a$ , then
$j(b)=b$ .
-
2. If
$b\subseteq \mathcal {P}(a)$ , then
$j(b)=b$ .
-
3. If
$b\subseteq \mathcal {P}(\mathcal {P}(a))$ , then
$j(b)=b$ .
Furthermore, if we can apply the induction to j-formulas, then we can show that
$j(b)=b$
for all
$b\subseteq \mathcal {P}^n(a)$
for each
$n\in \omega $
, where
$\mathcal {P}^n(a)$
is the
$n^{th}$
application of the power set operator to a.
Proof
-
1. If
$x\in b$ , then
$x\in K$ by transitivity of K, so
$x=j(x)\in j(b)$ . Hence
$b\subseteq j(b)$ .
Conversely,
$x\in j(b)\subseteq j(a)=a$ implies
$x\in K$ , so
$j(x)=x$ . Hence
$j(x)\in j(b)$ , and we have
$x\in b$ . This shows
$j(b)\subseteq b$ .
-
2. If
$x\in b$ , then
$x\in \mathcal {P}(a)$ . Hence
$j(x)=x$ by the previous claim, and we have
$x=j(x)\in j(b)$ . In sum,
$b\subseteq j(b)$ .
Conversely, suppose
$x\in j(b)$ . Observe that
$\forall t \in b ( t \subseteq a)$ and this is
$\Delta _0$ . Thus, by elementarity,
$x \subseteq j(a) = a$ .
Hence by the previous claim,
$x=j(x)$ . This shows
$j(x)\in j(b)$ , so
$x\in b$ . Hence
$j(b)\subseteq b$ .
The proof of the remaining case is identical, so we omit it.
4.2 Large set axioms beyond choice
As before, there is no reason why we should stop at Reinhardt sets. While Reinhardt cardinals are incompatible with
$\mathsf {ZFC}$
, it is known that all proofs of this use the Axiom of Choice in an essential way. This has led to the study of stronger large cardinals in the
$\mathsf {ZF}$
context, the most notable example being [Reference Bagaria, Koellner and Woodin10]. We will focus on the concept of super Reinhardt sets and totally Reinhardt sets. However, before we do so, we note a technical difficultly regarding truth predicates and first-order definability, which is why we will need to work in an infinitary second-order theory.
Using [Reference Gaifman18, Theorem 1] or [Reference Kanamori33, Proposition 5.1], in
$\mathsf {ZF}$
, an elementary map
$j \colon V\to M$
is fully elementary if it is
$\Sigma _1$
-elementary. Unfortunately, in our context, we do not know whether
$\Sigma _n$
-elementary embeddings are fully elementary even if n is sufficiently large.
During this subsection, we will fix an enumeration
$\langle \phi _{n}(X,\vec {x})\mid n\in \mathbb {N}\rangle $
of all first-order formulas over the language of set theory with one class variable X.
Definition 4.10 (
$\mathsf {CGB}_\infty $
).
Let A be a class. A class j is an
$A-(fully)$
elementary embedding from V to M if j is a class function and
$\bigwedge _{n\in \mathbb {N}}\forall \vec {x}[\phi _n(A,\vec {x})\leftrightarrow \phi ^M_{n}(A\cap M,j(\vec {x}))]$
. We simply call j elementary if
$A=V$
.
Definition 4.11 (
$\mathsf {CGB}_\infty $
).
An inaccessible set K is super Reinhardt if for every set a there is an elementary embedding
$j \colon V\to V$
such that K is a critical point of j and
$a\in j(K)$
.
More generally, an inaccessible set K is A-super Reinhardt for an amenable A if for each set a we can find an A-elementary embedding
$j \colon V\to V$
whose critical point is K such that
$a\in j(K)$
.
The reader should notice that we have formulated critical sets and Reinhardt sets over
$T_{j,M}$
where T was some general, unspecified (first-order) theory, whereas the definition of super Reinhardt sets in formulated in the specified second-order theory of
$\mathsf {CGB}_\infty $
. Some readers could think that formulating criticalness and Reinhardtness over
$\mathsf {CGB}_\infty $
would be more natural, as Bagaria–Koellner–Woodin formulated Reinhardt cardinals over
$\mathsf {GB}$
instead of
$\mathsf {ZF}_j$
. It will turn out that sticking to the more first-order formulation of criticalness and Reinhardtness is better for the following technical reason. We will take a double-negation interpretation based on Gambino’s Heyting-valued interpretation [Reference Gambino20] over first-order constructive set theories in Section 6. It is harder to extend Gambino’s interpretation to a second-order set theory than extending it to
$\mathsf {CZF}_{j,M}$
because the former requires extending Gambino’s forcing to all classes and defining the interpretation of second-order quantifiers, while the latter does not. However, we cannot avoid the full second-order formulation of super Reinhardtness, unlike we did for Reinhardtness, because its definition asks for class many elementary embeddings.
Our definition of A-super Reinhardtness is different from that of Bagaria–Koellner–Woodin [Reference Bagaria, Koellner and Woodin10] because they required j to satisfy
$j^+[A] :=\bigcup _{x\in V} j(A\cap x)$
is equal to A instead of A-elementarity. It turns out by Lemma 4.14 that our definition is stronger than that of Bagaria–Koellner–Woodin in our constructive context, but they are equivalent in the classical context.
Lemma 4.12 (
$\mathsf {CGB}$
).
Let
$\phi (X, x_0,\ldots , x_n)$
be a
$\Delta ^X_0$
-formula whose free variables are all expressed. If A is an amenable class and if we take
$a = \operatorname {TC}(\{x_0,\ldots ,x_n\})$
, then

Proof The proof proceeds by induction on
$\phi $
.
-
• Assume that
$\phi $ is an atomic formula. The only non-trivial case is
$x\in A$ , and it is easy to see that
$x\in A\leftrightarrow x\in \operatorname {TC}(\{x\})\cap A$ .
-
• The cases for logical connectives are easy to check.
-
• Assume that
$\phi (A,x_0,\ldots , x_n)$ is ∀y ∈ x 0 ψ(A, y, x 0, …, x n ), and assume that
$$ \begin{align*} \psi(A,y,x_0,\ldots, x_n)\leftrightarrow \psi(A\cap f(y),x_0,\ldots, x_n) \end{align*} $$
$f(y)=\operatorname {TC}(\{y,x_0,\ldots , x_n\})$ . If
$y\in x_0$ , then
$f(y)=a=\operatorname {TC}(\{x_0,\ldots , x_n\})$ . Hence
$$ \begin{align*} \forall y\in x_0 \psi(A,y,x_0,\ldots, x_n) \leftrightarrow \forall y\in x_0 \psi(A\cap a,y,x_0,\ldots, x_n). \end{align*} $$
$\exists y\in x_0\psi (y,x_0,\ldots , x_n)$ is also similar.
Remark 4.13. It is worthwhile to note that if K is a super Reinhardt set then it is also a Reinhardt set, in the sense that it is a critical point for some elementary embedding
$j \colon V \rightarrow V$
for which
$\langle V, j \rangle $
satisfies every axiom of
$\mathsf {CZF}_j$
. This is not obvious because, while
$\mathsf {CGB}_\infty $
satisfies Class Set Induction and Class Strong Collection, it does not include a Class Separation axiom. Therefore, for an arbitrary class A, there is no reason why
$\Delta _0^A$
-Separation should hold. To circumvent this issue let
$j \colon V \rightarrow V$
be an elementary embedding such that
$K \in j(K)$
(and everything in K is fixed by j). We observe that, by Lemma 2.32, j is an amenable class and therefore we can apply the above lemma from which one can conclude that
$\Delta _0^j$
-Separation holds.
Lemma 4.14 (
$\mathsf {CGB}_{\infty }$
).
Let A be an amenable class and
$j \colon V \rightarrow V$
be a class function.
-
1. If j is fully elementary, then j is
$\Delta ^A_0$ -elementary if and only if
$j^+[A]=A$ .
-
2. If
$\mathsf {LEM}$ holds, then every
$\Sigma _1\cup \Delta ^A_0$ -elementary embedding j is elementary for all A-formulas.
Proof
-
1. Assume that j is elementary for
$\Delta ^A_0$ -formulas. We first show that
$j(A\cap x)=A\cap j(x)$ : we know that
$\forall y [y\in A\cap x \leftrightarrow y\in A \land y\in x]$ . We can view this sentence as a conjunction of two
$\Delta ^A_0$ -formulas, so we have
$$ \begin{align*} \forall y [y\in j(A\cap x) \leftrightarrow y\in A \land y\in j(x)]. \end{align*} $$
$j^+[A]=\bigcup _{x\in V} j(A\cap x) = \bigcup _{x\in V} A \cap j(x) = A$ . (The last equality holds by the cofinality of j, Proposition 4.8.)
Conversely, assume that
$j^+[A]=A$ holds and let
$\phi (X,x_0,\ldots ,x_n)$ be a
$\Delta ^X_0$ -formula with a unique class variable X, all of whose free variables are displayed. By Lemma 4.12,
$a=\operatorname {TC}(\{x_0,\ldots , x_n\})$ satisfies
$$ \begin{align*} \phi(A,x_0,\ldots, x_n)\leftrightarrow \phi(A\cap a,x_0,\ldots, x_n). \end{align*} $$
$A \cap a$ is a set by amenability, we have
$$ \begin{align*} \phi(A\cap a,x_0,\ldots,x_n)\leftrightarrow \phi(j(A\cap a),j(x_0),\ldots, j(x_n)). \end{align*} $$
$\forall y\in j(a)[y\in j^+[A] \leftrightarrow y\in j(A\cap a)]$ holds. Furthermore, every bounded variable y appearing in
$\phi $ is bounded by a. Hence
$$ \begin{align*} \phi(j(A\cap a),j(x_0),\ldots, j(x_n)) \leftrightarrow \phi(j^+[A],j(x_0),\ldots, j(x_n)). \end{align*} $$
$j^+[A]=A$ , we finally have
$\phi (A,x_0,\ldots , x_n)\leftrightarrow \phi (A, j(x_0),\ldots , j(x_n))$ .
-
2.
$\mathsf {LEM}$ implies our background theory is
$\mathsf {GB}$ . We follow Kanamori’s proof [Reference Kanamori33, Proposition 5.1(c)] that every
$\Sigma _1$ -elementary cofinal embeddings over
$\mathsf {ZF}$ is fully elementary.
Assume that we have the
$\Sigma _n^A$ -elementarity of j. Consider the A-formula
$\exists x\phi (A,x,a)$ , where
$\phi $ is a
$\Pi ^A_n$ -formula. Then
$\phi (A,x,a)$ implies
$\phi (A,j(x),j(a))$ , and we have
$\exists y \phi (A,y,j(a))$ .
Conversely, assume that we have
$\exists x \phi (A,x,j(a))$ . Since j is cofinal by Proposition 4.8, we can find
$\alpha $ such that
$x\in V_{j(\alpha )}$ . Since j is elementary,
$V_{j(\alpha )} = j(V_\alpha )$ and hence
$\exists x\in j(V_\alpha )\phi (A,x,j(a))$ holds. Note that, by using second-order Collection for formulas using the class A, if
$\theta $ is
$\Sigma _n^A$ then so is
$\forall x \in a \theta $ and, since we have a classical background theory, by taking negations, if
$\theta $ is
$\Pi _n^A$ then so is
$\exists x \in a \theta $ and thus the formula
$\exists x\in j(V_\alpha )\phi (A,x,j(a))$ is
$\Pi ^A_n$ . So, by
$\Sigma ^A_n$ -elementarity, we have
$\exists x\in V_\alpha \phi (A,x,a)$ .
The upshot of the second claim in the above lemma is that
$\mathsf {GB}$
proves that given a fully elementary embedding, it is A-elementary if it is
$\Delta ^A_0$
-elementary. Furthermore, the condition
$j^+[A]=A$
is equivalent to
$\Delta ^A_0$
-elementarity. Hence Lemma 4.14 justifies that our definition of super Reinhardtness is a reasonable constructive formulation of that of Bagaria–Koellner–Woodin.
We also note here that we can weaken the assumption in the second clause of Lemma 4.14: namely,
$\mathsf {GB}$
proves every cofinal
$\Delta ^A_0$
-elementary embedding
$j\colon V\to V$
is fully elementary for A-formulas.
We end this section with one final large set axiom which we will see suffices to bring the constructive theory into proof-theoretic equilibrium with its classical counterpart.
Definition 4.15.
V is totally Reinhardt (
$\mathsf {TR}$
) is the following statement: for every class A, there is an A-super Reinhardt set.
5 How strong are large large sets over
$\mathsf {CZF}$
?
In this section, we perform the preparatory work needed to derive a lower bound for the consistency strength of various large large set axioms over
$\mathsf {CZF}$
. We will show that over
$\mathsf {IKP}$
, a critical point with a moderate property is a model of
$\mathsf {IZF+pIEA}$
. We will see that we can find a model of a stronger extension of
$\mathsf {IZF}$
under a large large set axioms over
$\mathsf {CZF}$
.
One might ask why we focus on deriving the models of
$\mathsf {IZF}$
with large set properties. The reason is that deriving consistency strength from
$\mathsf {CZF}$
is difficult: double-negation translation does not behave well over
$\mathsf {CZF}$
with large set axioms. On the contrary,
$\mathsf {IZF}$
or its extensions go well with double-negation translations.
5.1 The strength of elementary embeddings over
$\mathsf {BTEE}_M$
The aim of this subsection is to derive a lower bound for the consistency strength of an elementary embedding
$j\colon V\to M$
with a critical point. One amazing consequence of Lemma 4.9 is that the existence of a critical set is already quite strong, compared to small large set axioms over
$\mathsf {CZF}$
. As in the lemma, we remark here that we do not require V to satisfy any axiom in the expanded language and therefore the proof also works over
$\mathsf {IKP} + \Delta _0\text {-} \mathsf {BTEE}_M$
.
Theorem 5.1 (
$\mathsf {IKP} + \Delta _0\text {-} \mathsf {BTEE}_M$
).
Let K be a transitive set such that
${K\models \Delta _0\text {-}\mathsf { Sep}}$
,
$\omega \in K$
and let
$j \colon V\to M$
be a
$\Delta _0$
-elementary embedding whose critical point is K. Then
$K\models \mathsf {IZF}$
.
Proof
K satisfies Extensionality and Set Induction since K is transitive. We only prove that Power Set, Full Separation, and Collection are valid over K since the validity of other axioms is not hard to prove. The focal fact of the proof is that
$j(K)$
is also a transitive model of
$\Delta _0$
-Separation and
$K\in j(K)$
.
-
1. Power Set: let
$a\in K$ , and define b as
$b=\{c\in K \mid c\subseteq a\}$ . We can see that
$b\in j(K)$ and
$b=j(b)$ by Lemma 4.9. Hence
$j(b)\in j(K)$ , and thus we have
$b\in K$ .
It remains to show that K thinks b is a power set of a. We claim that
$K\models \forall x (x\in b\leftrightarrow x\subseteq a)$ , which is equivalent to
$\forall x\in K (x\in b\leftrightarrow x\subseteq a)$ , but this is obvious from the definition of b.
-
2. Full Separation: let
$a\in K$ and
$\phi (x,p)$ be a first-order formula with a parameter
$p\in K$ . Observe that the relativization
$\phi ^K(x,p)$ is
$\Delta _0$ , so
$b=\{x\in a\mid \phi ^K(x,p)\}\in j(K)$ . Since
$b\subseteq a$ , we have
$j(b)=b$ . Hence
$b\in K$ .
It suffices to show that K thinks b witnesses this instance of Separation for
$\phi $ and a. Formally, it means
$\forall x\in K [x\in b\leftrightarrow x\in a\land \phi ^K(x,p)]$ holds, which trivially holds by the definition of b.
-
3. Collection: Since K models Full Separation, by
$\Delta _0$ -elementarity of j, we also have
$j(K)$ satisfies Full Separation. Now, let
$\phi (x,y,p)$ be a formula and
$a,p\in K$ , and suppose that
$\forall x\in a \exists y\in K \phi ^K(x,y,p)$ holds. Now, for each
$x\in a$ and
$y\in K$ , since everything is fixed by j, we have
$$ \begin{align*} V\models \phi^K(x,y,p) ~ \implies ~ M\models \phi^{j(K)} (x,y,p). \end{align*} $$
$M\models \forall x\in a\exists y\in K \phi ^{j(K)}(x,y,p)$ . Define
$b=\{y\in K \mid \exists x\in a\ \phi ^{j(K)}(x,y,p)\}\in j(K)$ , then b witnesses
$M\models \exists b\in j(K) \forall x\in a \exists y\in b\ \phi ^{j(K)}(x,y,p)$ . Thus, by elementarity, we have
$\exists b\in K\forall x\in a\exists y\in b \ \phi ^K(x,y,p)$ . Namely, K satisfies this instance of Collection.
Hence the existence of an elementary embedding with a critical point is already tremendously strong compared to
$\mathsf {CZF}$
. We will see in Section 6 that
$\mathsf {IZF}$
interprets classical
$\mathsf {ZF}$
, so the existence of a critical set over
$\mathsf {CZF}$
is stronger than
$\mathsf {ZF}$
. However, it turns out that the hypotheses given in Theorem 5.1 proves the critical point K satisfies not only
$\mathsf {IZF}$
, but also large set axioms. For example, we next show that K is a model of
$\mathsf {IZF+pIEA}$
. The proof begins with the following lemmas:
Lemma 5.2 (
$\mathsf {IKP} + \Delta _0\text {-} \mathsf {BTEE}_M$
).
Assume that K satisfies the conditions given in the hypotheses of Theorem 5.1. If
$a\in K$
, then
$\mathcal {P}(a)\cap K = \mathcal {P}(a)\cap j(K)$
.
Proof Let
$a\in K$
,
$b\in j(K)$
and
$b\subseteq a$
. By Lemma 4.9, we have
$b\in K$
.
Lemma 5.3 (
$\mathsf {IKP} + \Delta _0\text {-} \mathsf {BTEE}_M$
).
Under the hypotheses for K given in Theorem 5.1,
$j(K)$
thinks K is power inaccessible.
Proof We proved that K is a model of
$\mathsf {IZF}$
and
$\mathcal {P}(a) \cap K = \mathcal {P}(a) \cap j(K)$
for any
$a \in K$
. Thus it suffices to show that
$j(K)$
believes K is regular. That is,

The proof employs basically the same argument as the proof that K satisfies Collection given in Theorem 5.1: let
$a\in K$
,
$R\in j(K)$
, and suppose that
$R\colon a\rightrightarrows K$
. Then

Now let
$b=\{y\in K \mid \exists x\in a \ \langle x,y\rangle \in j(R)\}\in M$
. Since b is a subset of
$K\in j(K)$
, we have
$b\in j(K)$
. Furthermore, b witnesses
. Hence, by elementarity, there exists
$b \in K$
such that
.
Corollary 5.4 (
$\mathsf {IKP} + \Delta _0\text {-}\mathsf {BTEE}_M$
).
Under the hypotheses for K given in Theorem 5.1, K satisfies
$\mathsf {pIEA}$
.
Proof By Lemma 5.2,
$j(K)$
believes K is power inaccessible. Fix
$a\in K$
, then
$b=K$
witnesses the following statement:

Hence by elementarity, there is
$b\in K$
such that
$a\in b$
and K believes b is power inaccessible. Since a is arbitrary, we have
$K \models \mathsf {IZF + pIEA}$
.
In fact, we can derive more: we can show that not only does K satisfy
$\mathsf {pIEA}$
, but K also satisfies
$\forall x\exists y[x\in y\land \Phi (y)]$
, where
$\Phi $
is any large set properties such that
$j(K) \models \Phi (K)$
. Especially,
$\Phi $
can be a combination of power inaccessibility with Mahloness or 2-strongness. (See [Reference Rathjen46] or [Reference Ziegler67] for the details of Mahlo sets or 2-strong sets.)
5.2 Iterating j to a critical point
In the classical context, one usually works with
$\lambda =j^\omega (\kappa )=\sup _{n<\omega }j^n(\kappa )$
for
$\kappa =\operatorname {crit} j$
when they study large cardinals at the level of rank-into-rank embeddings and beyond. We can see in the classical context that if
$j\colon V\to M$
is elementary embedding with
$\kappa =\operatorname {crit} j$
, then
$V_\lambda $
is still a model of
$\mathsf {ZFC}$
with some large cardinal properties.
We may expect the same in a constructive manner, but defining the analogue of
$\lambda =j^\omega (\kappa )$
cannot be done over
$\mathsf {IKP}+\Sigma \text {-} \mathsf {BTEE}_M$
. The problem is that the definition of
$\lambda $
requires defining
$\langle j^n(\kappa ) \mid n\in \omega \rangle $
, which requires recursion for j-formulas. It turns out that we can define this sequence if we have Set Induction for
$\Sigma ^{j, M}$
-formulas. Moreover, even if we have this sequence, we require an instance of
$\Sigma ^{j, M}$
-Replacement to turn the sequence into a set, which is when we will have to start working in
$\mathsf {IKP}_{j,M}$
.
Definition 5.5 (
$\mathsf {IKP}+\Sigma \text {-} \mathsf {BTEE}_M$
).
Let
$j\colon V\to M$
be an elementary embedding with a critical point K. Following Corazza [Reference Corazza13, (2.7)–(2.9)], let
$\Theta $
and
$\Phi $
be the following formulas:

Informally speaking,
$\Theta (f,n,x,y)$
states that f is a function with domain
$n + 1$
computing
$j^n(x)=y$
, and
$\Phi (n,x,y)$
asserts that
$j^n(x)=y$
. We can see that
$\Theta $
is
$\Delta ^{j,M}_0$
and
$\Phi $
is
$\Sigma ^{j,M}$
. A careful analysis will show that
$\Phi $
allows a
$\Pi ^{j,M}$
-formulation, so
$\Phi $
is actually
$\Delta ^{j,M}$
. However, this fact is irrelevant in our context.
Lemma 5.6 (Corazza,
$\mathsf {IKP}+\Sigma \text {-} \mathsf {BTEE}_M+\Sigma ^{j,M}\text {-}\text {Set Induction}$
).
-
1. For all
$n\in \omega $ and x, y, there is at most one f such that
$\Theta (f,n,x,y)$ holds. That is,
$$ \begin{align*} \forall n\in\omega \forall x,y,f,g\ [\Theta(f,n,x,y)\land \Theta(g,n,x,y)\to f=g]. \end{align*} $$
-
2.
$\Phi $ defines a function. That is,
$\forall n\in \omega \forall x \exists ! y \Phi (n,x,y)$ .
Proof
-
1. Fix
$x,y,f,g$ and assume that both
$\Theta (f,n,x,y)$ and
$\Theta (g,n,x,y)$ hold. To be precise, we apply Set Induction to the following formula:
$$ \begin{align*} \theta(i) \equiv [i\le n\to f(i)=g(i)]. \end{align*} $$
$\forall j\in i \ \theta (j)$ holds. Then we can see that
$f(i) = j(f(i-1))=j(g(i-1))=g(i)$ holds. In other words, we have
$\theta (i)$ . By Set Induction for
$\Delta _0$ -formulas, we have
$\forall i \ \theta (i)$ , so
$f=g$ .
-
2. For uniqueness, assume that we have
$\Phi (n,x,y_0)$ and
$\Phi (n,x,y_1)$ . By the previous claim, we can see that for the same f,
$\Theta (f,n,x,y_0)$ and
$\Theta (f,n,x,y_1)$ hold. Hence
$y_0=f(n)=y_1$ .
For existence, we claim that
$\forall n\in \omega \forall x \exists y,f\ \Theta (f,n,x,y)$ . This uses Set Induction for
$\Sigma ^{j,M}$ -formulas. Fix x and let
$$ \begin{align*} \phi(n, x) \equiv n\in\omega \to \exists y\exists f\ \Theta(f,n,x,y). \end{align*} $$
$\forall i\in n \ \phi (i, x)$ holds. Suppose that y and f witnesses
$\Theta (f,n-1,x,y)$ , then we can extend f to
$f'$ by setting
$f':=f\cup \{\langle n, j(y) \rangle \}$ . It is immediate that
$\Theta (f',n,x,y)$ . Hence, by Set Induction for
$\Sigma ^{j,M}$ -formulas, we have
$\forall n \phi (n,x)$ .
Remark 5.7. When working with elementary embeddings derived from ultrafilters in
$\mathsf {ZFC}$
one can show that the ultrapower construction can be iterated. That is, one starts with
$j \colon V \rightarrow M_0$
and shows that for every n there is a transitive class
$M_{n+1}$
and an embedding
$i_n \colon M_n \rightarrow M_{n + 1}$
. Then one can show that
$j^2 = i_0 \circ j \colon V \rightarrow M_1$
is an elementary embedding with a critical set. However, this argument does not go through in our weaker context. In particular, it is unclear what the codomain of
$j^2$
should be and therefore why
$j^2$
should be an elementary embedding. Therefore, for our purposes, in general,
$j^n(x)$
should be a formal object which is the result of applying j to
$x n$
times. On the other hand, one should note that if j is restricted to some set A then it is possible to iterate the (set) embedding
$j \operatorname {\mathrm {\upharpoonright }} A \colon A \rightarrow j(A)$
.
Hence we can define the constructive analogue of
$\lambda =j^\omega (\kappa )$
, by using an instance of
$\Sigma ^{j,M}$
-Replacement:
Definition 5.8 (
$\mathsf {IKP}_{j,M}$
).
Let
$j\colon V\to M$
be a
$\Delta _0$
-elementary embedding with a critical point K. By Lemma 5.6 and Replacement for
$\Sigma ^{j,M}$
-formulas,
$\langle j^n(K)\mid K\in \omega \rangle $
is a set. Define
$\Lambda := \bigcup _{n\in \omega } j^n(K)$
.
Remark 5.9. In order to define
$\Lambda $
in the above definition it is important that V satisfies at least
$\Sigma ^{j,M}$
-Replacement and
$\Sigma ^{j,M}$
-Induction for
$\omega $
. One can see that the class function sending n to
$j^n(K)$
is
$\Sigma ^{j,M}$
-definable and that
$\Sigma ^{j,M}$
-Induction then implies that this function is total. Finally, an instance of
$\Sigma ^{j,M}$
-Replacement gives us that
$\{ j^n(K) \mid n \in \omega \}$
is a set, and therefore so is its union. We refer to the end of Section 2 of [Reference Corazza13] or Section 6.3 of [Reference Matthews38] for details of this and what issues can potentially arise without assuming any induction in the extended language.
Remark 5.10. In the above definition, we keep talking about
$\Delta _0$
-elementary embedding while we work over
$\mathsf {IKP}_{j,M}$
, which technically includes the full elementarity of j as an axiom. Hence we actually work with a weaker subtheory of
$\mathsf {IKP}_{j,M}$
, which is obtained by weakening the elementarity scheme to
$\Delta _0$
-formulas. This fact becomes important in Section 5.3 where we work in
$\mathsf {IKP}$
with a
$\Sigma $
-elementary embedding
$j\colon V\to M$
.
However, we will continue to refer to the background theory
$\mathsf {IKP}_{j,M}$
when this does not cause confusion. To emphasize, we work with the Set Induction scheme over the extended language as well as the
$\Delta _0^{j,M}$
-Separation and the
$\Sigma ^{j,M}$
-Collection schemes.
Now we can see that K is an elementary submodel of
$\Lambda $
:
Lemma 5.11 (
$\mathsf {IKP}_{j,M}$
).
Let
$j \colon V\to M$
be an
$\Delta _0$
-elementary embedding with a critical point K, and assume that
$K\models \Delta _0\text {-} \mathsf {Sep}$
. Then K is an elementary submodel of
$\Lambda $
. That is, the following holds
$:$
for every formula
$\phi (\vec {x})$
(without j or M) all of whose free variable are expressed,

Proof We proceed with the proof by induction on formulas. Note that we can formulate our proof over
$\mathsf {IKP}$
since the truth predicate for transitive models is
$\Sigma $
-definable. Atomic cases and the cases for
$\land $
,
$\lor $
, and
$\to $
are trivial, and so we concentrate on cases for quantifiers.
-
• Case
$\forall $ : assume that
$\vec {a}\in K$ and
$\phi (x,\vec {a})$ is absolute between K and
$\Lambda $ . That is, assume that (1) holds for
$\phi $ . Obviously
$$ \begin{align*} V\models (\forall x\phi(x,\vec{a}))^\Lambda\to (\forall x\phi(x,\vec{a}))^K. \end{align*} $$
Conversely, assume that
$$ \begin{align*} V\models \forall x\in K \phi^K(x,\vec{a}). \end{align*} $$
$\Delta _0$ -expressible in V, by elementarity we have
$$ \begin{align*} M\models \forall x\in j(K) \phi^{j(K)}(x,\vec{a}). \end{align*} $$
$\forall x\in j(K) \phi ^{j(K)}(x,\vec {a})$ is bounded and M is a transitive subclass of V,
$\forall x\in j(K) \phi ^{j(K)}(x,\vec {a})$ is absolute between V and M, so
$$ \begin{align*} V\models \forall x\in j(K) \phi^{j(K)}(x,\vec{a}). \end{align*} $$
We can iterate j in a similar fashion, so we have
$$ \begin{align*} V\models \forall x\in j^n(K) \phi^{j^n(K)}(x,\vec{a}) \end{align*} $$
$n\in \omega $ . Here
$\vec {a}$ is unchanged because it is in K. Similarly, by elementarity and absoluteness of bounded formulas, we have from (1) that
$$ \begin{align*} V\models \forall n\in\omega \forall x,\vec{a}\in j^n(K) [\phi^{j^n(K)}(x,\vec{a})\leftrightarrow \phi^\Lambda(x,\vec{a})]. \end{align*} $$
Hence V thinks
$\phi ^\Lambda (x,\vec {a})$ for all
$n\in \omega $ and
$\vec {a}\in K$ . Since
$\Lambda =\bigcup _{n\in \omega } j^n(K)$ , we have
$\forall x\in \Lambda \phi ^\Lambda (x,\vec {a})$ .
-
• Case
$\exists $ : The proof is similar to the case for
$\forall $ . Assume the same conditions to
$\vec {a}$ and
$\phi (x,\vec {a})$ as we did before. Showing
$V\models (\exists x\phi (x,\vec {a}))^K\to (\exists x\phi (x,\vec {a}))^\Lambda $ is trivial.
For the converse, assume that
$V\models \exists x\in \Lambda \phi ^\Lambda (x,\vec {a})$ . Then we can find
$n\in \omega $ such that
$V\models \exists x\in j^n(K)\phi ^\Lambda (x,\vec {a})$ . We can prove
$\phi ^\Lambda (x,\vec {a})$ is equivalent to
$\phi ^{j^n(K)}(x,\vec {a})$ for
$x,\vec {a}\in j^n(K)$ , so
$V\models \exists x\in j^n(K) \phi ^{j^n(K)}(x,\vec {a})$ .
Then we have the desired result if
$n=0$ . If
$n>0$ , then by absoluteness of bounded formulas,
$M\models \exists x\in j^n(K) \phi ^{j^n(K)}(x,\vec {a})$ . By applying elementarity, we have
$V\models \exists x\in j^{n-1}(K) \phi ^{j^{n-1}(K)}(x,\vec {a})$ . Now we can see by repeating this argument that
$V\models \exists x\in K \phi (x,\vec {a})$ .
An upshot of Lemma 5.11 is that
$j^n(K)$
is an elementary submodel of
$\Lambda $
.Footnote
12
Especially, by applying this fact to Lemma 5.3, we have:
Lemma 5.12 (
$\mathsf {IKP}_{j,M}$
).
Assume that K satisfies the conditions given in the hypotheses of Theorem 5.1. Then
$\Lambda $
thinks K is a power inaccessible set.
Especially, we have an analogue of Lemma 5.2 between K and
$\Lambda $
:
Lemma 5.13 (
$\mathsf {IKP}_{j,M}$
).
Assume that K satisfies the conditions given in the hypotheses of Theorem 5.1. For
$n\in \omega $
and
$a\in j^n(K)$
, we have
$\mathcal {P}(a)\cap \Lambda =\mathcal {P}(a)\cap j^n(K)$
.
As a corollary of the previous results, we have:
Corollary 5.14 (
$\mathsf {IKP}_{j,M}$
).
Assume that K satisfies the conditions given in the hypotheses of Theorem 5.1. Then
$\Lambda $
satisfies
$\langle \Lambda ,j\operatorname {\mathrm {\upharpoonright }}\Lambda \rangle \models \mathsf {IZF+BTEE}+\text {Set Induction}_j$
.
Proof
$\Lambda $
is a model of
$\mathsf {IZF}$
by Lemma 5.11 and Theorem 5.1. Moreover,
$j\operatorname {\mathrm {\upharpoonright }} \Lambda :\Lambda \to \Lambda $
has a critical point
$K\in \Lambda $
, so
$ \langle \Lambda , j\operatorname {\mathrm {\upharpoonright }} \Lambda \rangle $
is a model of
$\mathsf {BTEE}$
. Since
$\Lambda $
is transitive and Set Induction for
$\Delta _0^{j,M}$
formulas holds, we have
$\langle \Lambda , j\operatorname {\mathrm {\upharpoonright }}\Lambda \rangle $
believes Set Induction for j-formulas holds.
Remark 5.15. The reader might notice that we always try to reveal where the given formula holds over, like
$V\models \phi $
or
$M\models \phi $
, and explicitly state how to transfer between statements holding over V and over M, for example, by relying on absoluteness of bounded formulas. The main reason for it is that while
$V\models \phi (K)$
implies
$M\models \phi (j(K))$
, we cannot say anything about
$V\models \phi (j(K))$
and
$M\models \phi (K)$
.
For example, work over
$\mathsf {ZFC}$
with a measurable cardinal
$\kappa $
, and consider an ultrapower map
$j\colon V\to \operatorname {Ult}(V,U)\cong M$
. Then
$V\models (V_\kappa \models \mathsf {ZFC}_2)$
and
$M\models (V_{j(\kappa )}\models \mathsf {ZFC}_2)$
. However,
$j(\kappa )$
is not even a cardinal over V, so
$V\not \models (V_{j(\kappa )}\models \mathsf {ZFC}_2)$
. Similarly, while V thinks
$\kappa $
is measurable, M does not in general think
$\kappa $
is measurable.
5.3 Playing with a
$\Sigma $
-
$\mathrm {Ord}$
-inary elementary embedding over
$\mathsf {IKP}$
Let us examine what we can derive from a
$\Sigma $
-elementary embedding with an ordinal critical point. An illuminating result along this line is that of Ziegler [Reference Ziegler67], which is proved by considering the rank of any fixed critical point K:
Proposition 5.16 (Ziegler [Reference Ziegler67, Section 9.1] or [Reference Matthews38, Lemma 7.2.4],
$\mathsf {IKP}+ \mathrm {\Sigma }\text {-} \mathsf {BTEE}_M$
).
Let
$j\colon V\to M$
be a
$\Sigma $
-elementary map. Then the following statements are all equivalent
$:$
-
1. There is K such that
$K\in j(K)$ and
$\forall x\in \operatorname {TC}(K) \ (j(x)=x)$ .
-
2. There is a transitive K such that
$K\in j(K)$ and
$\forall x\in K\ (j(x)=x)$ .
-
3. There is an ordinal
$\kappa $ such that
$\kappa \in j(\kappa )$ and
$\forall \alpha \in \kappa \ (j(\alpha )=\alpha )$ .
The main idea of the above proposition is extracting the rank of a given K, then observing that j respects the rank of a set since the rank function is
$\Sigma $
-definable.
However, it is not likely that we can extend this result further. Ziegler provided a way to give a model of
$\mathsf {CZF^-_{Rep}+Exp}$
,
$\mathsf {CZF}^-$
with Exponentiation and Replacement in place of Strong Collection, from a
$\Delta _0$
-elementary embedding
$j\colon V\to M$
with an ordinal critical point. However, Ziegler’s result requires an additional assumption on j called
$j\mathsf {IEA}$
.
Despite that,
$\Sigma $
-elementary embedding over
$\mathsf {IKP}$
still shows quite a strong consistency strength provided if it has an ordinal critical point. The following result is from [Reference Matthews38]:
Theorem 5.17 [Reference Matthews38, Theorem 7.3.2],
$\mathsf {IKP}+ \mathrm {\Sigma }\text {-} \mathsf {BTEE}_M$
.
Let
$j\colon V\to M$
be a
$\Sigma $
-
$\mathrm {Ord}$
-inary elementary embedding with witnessing ordinal
$\kappa $
, that is, a
$\Sigma $
-elementary embedding with a critical point
$\kappa $
that is an ordinal. Furthermore, let
$\kappa ^\#$
be defined as in Definition 2.26. Then
$L_{\kappa ^\#}\models \mathsf {IZF}$
.
Using the results we previously developed we can extend this to a model of
$\mathsf {IZF +pIEA}.$
Proposition 5.18 (
$\mathsf {IKP}+\Sigma \text {-} \mathsf {BTEE}_M$
).
Let
$j \colon V \rightarrow M$
be a
$\Sigma $
-
$\mathrm {Ord}$
-inary elementary embedding with witnessing ordinal
$\kappa $
. Then
$j\operatorname {\mathrm {\upharpoonright }} L^V : L^V\to L^M$
is a
$\Sigma $
-elementary embedding over L and
$L_{\kappa ^\#}\models \Delta _0\text {-} \mathsf {Sep}$
.
Proof
$\Delta _0$
-elementarity of
$j\operatorname {\mathrm {\upharpoonright }} L^V : L^V\to L^M$
follows from the fact that the formula
$x\in L$
is
$\Sigma $
-definable and j is
$\Sigma $
-elementary. Secondly, by (4) of Proposition 2.23,
$L_{\kappa ^\#}\models \Delta _0\text {-} \mathsf {Sep}$
.
By applying Theorem 5.1 and Corollary 5.4, we have:
Corollary 5.19 (
$\mathsf {IKP}+\Sigma \text {-} \mathsf {BTEE}_M$
).
Let j be a
$\Sigma $
-
$\mathrm {Ord}$
-inary elementary embedding with witnessing ordinal
$\kappa $
. Then
$L_\kappa ^\#$
is a model of
$\mathsf {IZF+pIEA}$
.
Remark 5.20. Classically,
$L^V=L^M$
if M is a proper class transitive model of
$\mathsf {KP}$
. Furthermore, if M is a transitive (set or class) model of
$\mathsf {KP}$
, then
$L^M=L\cap M$
only depends on
$\mathrm {Ord}^M=\mathrm {Ord}\cap M$
.
This is not constructively valid even if M is a proper class since there is no reason to believe
$\mathrm {Ord}\cap V=\mathrm {Ord}\cap M$
. In fact, we can construct a Kripke model of
$\mathsf {IZF}$
that satisfies
$\mathrm {Ord}\cap V\neq \mathrm {Ord}\cap L$
. See [Reference Matthews38, Section 5.5] for the details.
Of course, we can derive more: [Reference Matthews38] proved that
$L_{\kappa ^\#}$
also thinks every set is included in a totally indescribable set. Also, under the presence of Set Induction and Collection for
$\Sigma ^{j, M}$
-formulas, we have:
Theorem 5.21 (
$\mathsf {IKP}_{j,M}$
).
Let
$\lambda =\bigcup _{n\in \omega } j^n(\kappa ^\#)$
. Then
$L_\lambda = \bigcup _{n\in \omega } L_{j^n(\kappa ^\#)}$
and
$ \langle L_\lambda , j\operatorname {\mathrm {\upharpoonright }} L_\lambda \rangle $
is a model of
$\mathsf {IZF+BTEE} + \text {Set Induction}_{j}$
.
Proof Set Induction and Collection for
$\Sigma ^{j,M}$
-formulas are necessary to ensure the existence of
$\lambda $
.
By definition of
$L_\alpha $
,

Now, the desired result follows from Corollary 5.14.
5.4 Reinhardt sets
Let
$j \colon V\to M$
be an elementary embedding and K a critical point of j. It is not generally true that
$j(K)$
is also a regular set, although M believes it is. However, for a Reinhardt embedding
$j \colon V\to V$
,
$j(K)$
is regular. This yields a better lower bound for the consistency strength of Reinhardt sets. Observe that the proof only requires j to be
$\Delta _0$
-elementary because all of the formulas can be bound by
$\Lambda $
.
Theorem 5.22 (
$\mathsf {IKP}_j$
).
If K is Reinhardt, then
$\Lambda $
satisfies
$\mathsf {IZF+WA}$
.
Since we already know that
$\Lambda \models \mathsf {IZF}$
, the above proposition follows immediately from the following lemma:
Lemma 5.23. For any j-formula
$\phi $
,
$t\in \omega $
and
$a,p\in j^t(K)$
,
$\{x\in a\mid \langle \Lambda , j\operatorname {\mathrm {\upharpoonright }}\Lambda \rangle \models \phi (x, p)\}\in j^t(K)$
.
Proof Let
$F_\theta (a,p):=\{x\in a\mid \langle \Lambda , j\operatorname {\mathrm {\upharpoonright }}\Lambda \rangle \models \theta (x,p)\}$
for a formula
$\theta $
. We will prove that, for every formula
$\theta $
,
$F_\theta (a,p)\in j^t(K)$
for all
$a,p\in j^t(K)$
by induction on the complexity of
$\theta $
.
Atomic cases in which j does not appear follow immediately from the inaccessibility of
$j^t(K)$
. We consider the case for equality where j appears, the cases for element-hood being similar. To do this, we need to show that for
$a \in j^t(K)$
,

First, since
$j^{t+1}(K)$
is inaccessible in V,
$j^{t+1}(K)$
is Exp-closed by Corollary 3.3, and therefore
$j \operatorname {\mathrm {\upharpoonright }} a \in \operatorname {mv}({^{a}}{}j(a))$
is in
$j^{t+1}(K)$
. Next, since
$j^{t+1}(K)$
is closed under intersections,
$j \operatorname {\mathrm {\upharpoonright }} a \cap a \in j^{t+1}(K)$
. However, this is a subset of
$a \in j^t(K)$
so, by Lemma 5.13,
$j \operatorname {\mathrm {\upharpoonright }} a \cap a = \{ \langle x, y \rangle \in a \mid x = j(y) \} \in j^t(K)$
.
Conjunctions, disjunctions, and implications follow from the fact that
$j^t(K)$
satisfies Union and
$\Delta _0$
-Separation: let us examine the proof for implications. Suppose that
$F_\phi (a,p),F_\psi (a,p)\in j^t(K)$
for any
$a,p\in j^t(K)$
. Then, by
$\Delta _0$
-Separation,

Next, suppose that
$\phi (x,p)$
is
$\forall y \psi (x,y,p)$
. For each
$n\in \omega $
, let

We claim that
$S_n\in j^t(K)$
for every
$n\in \omega $
. By the inductive hypothesis, for every
$y\in j^{t+n}(K)$
,

Then we can define a function
$R \colon j^{t+n}(K)\to j^{t+n+1}(K)$
given by
$R(y)=F_\psi (a,\langle y,p\rangle )$
. Furthermore,
$R(y)\in \mathcal {P}(a)\cap j^{t+n+1}(K)=\mathcal {P}(a)\cap j^{t+n}(K)$
. Hence the codomain of R is
$\mathcal {P}(a)\cap j^{t+n}(K)$
, which is an element of
$j^{t+n+1}(K)$
. Since K is regular, so inaccessible by Theorem 5.1, so is
$j^{t+n+1}(K)$
. Hence by Corollary 3.3,
$R\in j^{t+n+1}(K)$
. Hence
$S_n =\bigcap \operatorname {ran} R \in j^{t+n+1}(K)$
. By Lemma 5.13 and
$S_n\subseteq a \in j^t(K)$
, we have
$S_n \in j^t(K)$
.
Finally for this case, since
$S_n\in j^t(K)$
for each
$n\in \omega $
, we can define a function
$S \colon \omega \to j^t(K)$
by
$S(n):=S_n$
. By repeating the previous argument with the inaccessibility of
$j^t(K)$
and
$\omega \in j^t(K)$
, we have
$S\in j^t(K)$
. Hence

is also a member of
$j^t(K)$
.
The last case is when
$\phi (x,p)$
is
$\exists y \psi (x,y,p)$
. The proof is quite similar to the previous one. Similar to the previous case, for every
$n\in \omega $
, let

We again prove that for each
$n\in \omega $
,
$S^{\prime }_n\in j^t(K)$
by first obtaining that
$R\in j^{t+n+1}(K)$
, where
$R \colon y\mapsto F_\psi (a,\langle y,p\rangle )$
is the same function we defined in the proof for the previous case. Therefore,
$S^{\prime }_n = \bigcup _{y\in j^{n+1}(K)}R(y) \in j^{n+1}(K)$
. Furthermore,
$S^{\prime }_n\subseteq a\in j^t(K)$
shows
$S^{\prime }_n\in j^t(K)$
. If we define
$S'\colon \omega \to j^t(K)$
by
$S'(n):=S^{\prime }_n$
, then
$S'\in j^t(K)$
. Hence
$F_\phi (a,p)=\bigcup _{n\in \omega } S'(n)\in j^t(K)$
.
5.5 Super Reinhardt sets
The following theorem shows that super Reinhardt sets reflect first-order properties of V.
Theorem 5.24 (
$\mathsf {CGB}_\infty $
).
Let K be a super Reinhardt set. Then K is an elementary submodel of V. That is, for every formula (without j)
$\phi (\vec {x})$
all of whose variables are displayed,

Proof Atomic cases and the cases for logical connectives are trivial. Hence we focus on quantifications.
-
• Case
$\forall $ : assume that
$a\in K$ and
$\phi (x,a)$ is absolute between K and V. Then clearly we have
$\forall x \phi (x,a)\to \forall x\in K \phi ^K(x,a)$ . Conversely, assume that
$\forall x\in K \phi ^K(x,a)$ . Fix
$b\in V$ and j such that
$b\in j(K)$ . Then
$\forall x\in j(K) \phi ^{j(K)}(x,a)$ implies
$\phi ^{j(K)}(b,a)$ . Furthermore, we can see that
$\phi (x,a)$ is also absolute between
$j(K)$ by applying j to our inductive hypothesis;
$\forall x, a\in K \phi ^K(x,a)\leftrightarrow \phi (x,a)$ . Thus
$\phi (b,a)$ for all
$b\in V$ .
-
• Case
$\exists $ : assume the same conditions on a and
$\phi (x,a)$ as we did before. Then obviously we have
$\exists x\in K \phi ^K(x,a)\to \exists x \phi (x,a)$ . For the converse, assume that there is b such that
$\phi (b,a)$ . Find j such that
$b\in j(K)$ . Since
$\phi $ is also absolute between
$j(K)$ and V, we have
$\exists x\in j(K) \phi ^{j(K)}(x,a)$ . Thus
$\exists x\in K \phi ^K(x,a)$ .
Note that the above theorem requires the full elementarity of elementary embeddings. Next, we shall see that not only do strong Reinhardt sets reflect all first-order properties of V, but also they contain every true subset of a member of themselves:
Proposition 5.25 (
$\mathsf {CGB}_\infty $
).
Let K be a super Reinhardt set and
$a\in K$
. If
$b\subseteq a$
, then
$b\in K$
. Especially,
$\mathcal {P}^K(a)=\mathcal {P}(a)$
for
$a\in K$
, so K is power inaccessible.
Proof Find
$j \colon V\to V$
such that
$b\in j(K)$
. By Lemma 4.9,
$j(b)=b\in j(K)$
, so
$b\in K$
.
Corollary 5.26 (
$\mathsf {CGB}_\infty $
).
Suppose that there is a super Reinhardt set. Then V is a model of
$\mathsf {IZF + pIEA}$
.
However, it is not, in general, true that we will also satisfy the full second-order theory of
$\mathsf {IGB}$
. The issue here is that there is no reason why a (not first-order definable) class should be amenable. On the other hand, one should observe that by restricting our attention to amenable classes we will obtain a model of
$\mathsf {IGB}$
. Also, note that being a super Reinhardt is a second-order property, so its existence does not reflect down to K.
Bagaria–Koellner–Woodin [Reference Bagaria, Koellner and Woodin10] showed that super Reinhardt cardinals rank-reflect Reinhardt cardinals, that is, there is an inaccessible cardinal
$\gamma $
such that
$(V_\gamma ,V_{\gamma +1})$
models
$\mathsf {ZF}_2$
with a Reinhardt cardinal. We will show in Theorem 6.39 that
$\mathsf {CGB}_\infty $
with a super Reinhardt set interprets
$\mathsf {ZF}$
with a proper class of
$\gamma $
such that
$(V_\gamma ,V_{\gamma +1})\models \mathsf {ZF_2}+\text {`there is a Reinhardt cardinal'.}$
However, its proof ‘mixes up’ large set arguments over
$\mathsf {CZF}$
with a double-negation translation, so the following question is still open:
Question 5.27. Working over
$\mathsf {CGB}_\infty $
with a super Reinhardt set, can we prove there is an inaccessible set M such that
$(M,\mathcal {P}(M))$
satisfies
$\mathsf {CGB}_\infty $
with the existence of a Reinhardt set?
However, we can still derive various large set principles from super Reinhardtness. For example, we can see that super Reinhardtness implies the analogue of
$j \colon V_{\lambda +n}\prec V_{\lambda +n}$
over
$\mathsf {ZF}$
:
Proposition 5.28 (
$\mathsf {CGB}_\infty $
).
Assume that there is a super Reinhardt set. Define
$V_\alpha (x)$
recursively as
$V_\alpha ({ x})=\operatorname {TC}({ x})\cup \bigcup _{\beta \in \alpha }\mathcal {P}(V_\beta (x))$
. If
$j(\xi )=\xi $
, then for each set a we can find
$\Lambda \ni a$
, which is a countable union of power inaccessible sets, with an elementary embedding
$j \colon V_\xi (\Lambda )\to V_\xi (\Lambda )$
.
Especially, for each
$n\in \omega $
and
$a\in V$
, we can find
$\Lambda \ni a$
, which is a countable union of power inaccessible sets, such that there is an elementary embedding
$j \colon \mathcal {P}^n(\Lambda )\to \mathcal {P}^n(\Lambda )$
.
Proof Let K be a super Reinhardt set and j be an elementary embedding with a critical point K such that
$a\in j(K)$
. Now let
$\Lambda = j^\omega (K)$
. We can see that
$j\operatorname {\mathrm {\upharpoonright }} V_\xi (\Lambda ) \colon V_\xi (\Lambda )\to j(V_\xi (\Lambda )) = V_\xi (\Lambda )$
is the desired elementary embedding. The latter claim follows by letting
$\xi =n$
.
5.6 Totally Reinhardt sets
Then how about the case for totally Reinhardt sets? We examined that if K is super Reinhardt, then
$K\prec V$
. However, K does not reflect j-formulas. We can see that A-super Reinhardt sets reflect not only usual set-theoretic formulas, but also A-formulas. The proof of the following theorem is identical to that of Theorem 5.24, so we omit it. We note here that in the theorem we will not need to assume that
$A \cap K$
is a set, and if A is not amenable then in fact this will not be the case.
Theorem 5.29 (
$\mathsf {CGB}_\infty $
).
Let K be an A-super Reinhardt set. Then K reflects every A-formula. That is, for every formula
$\phi (X, \vec {x})$
with one class parameter X and all of whose variables are displayed,

Note that A-elementarity of j is necessary for the above theorem: The reader can see that the proof of Theorem 5.24 applies j to the inductive hypothesis, namely,
$\phi (\vec {x})$
is absolute between K and V, and this is where we need the A-elementarity. We can also see that the proof breaks down if we do not assume A-elementarity: if the proof were to work without A-elementarity, then Theorem 5.24 would hold even for j-formulas. This would imply K thinks there is a critical point of j, which is invalid.
One consequence of the reflection of A-formulas is that V satisfies Full Separation for A-formulas when A is amenable. This is because, for amenable classes, it is relatively straightforward to prove that
$K \models \mathsf {Sep}_A$
. The proof of the following lemma is similar to that of Theorem 5.1, so we omit it.
Lemma 5.30 (
$\mathsf {CGB}_\infty $
).
Let A be an amenable class and K be an A-super Reinhardt set. Then K satisfies Full Separation for A-formulas.
Corollary 5.31 (
$\mathsf {CGB}_\infty $
).
Assume that there is an A-super Reinhardt set for an amenable A. Then Full Separation for A-formulas hold. Especially, if V is totally Reinhardt, then Full Separation for A-formulas hold for all amenable classes A.
Note that every A-super Reinhardt set is super Reinhardt, so is power inaccessible by Proposition 5.25. In sum,
$\mathsf {CGB_\infty +TR}$
proves
$\mathsf {IGB_\infty +TR}$
without Class Separation. It is unknown if Class Separation follows from the remaining axioms of
$\mathsf {IGB_\infty +TR}$
. However, we can still see that
$\mathsf {IGB_\infty +TR}$
is interpretable within itself without Class Separation:
Theorem 5.32 (
$\mathsf {CGB}_\infty $
).
$\mathsf {CGB_\infty +TR}$
interprets
$\mathsf {IGB_\infty +TR}$
.
Proof We proved that
$\mathsf {CGB_\infty +TR}$
proves every axiom of
$\mathsf {IGB_\infty +TR}$
except for Class Separation. We claim that
$\mathsf {IGB_\infty +TR}$
is interpreted in itself without Class Separation.
Let
$\phi $
be a formula of
$\mathsf {IGB_\infty +TR}$
, and
$\phi ^{\mathfrak {a}}$
be a formula obtained by bounding every second-order quantifier to the collection of all amenable classes. That is, we get
$\phi ^{\mathfrak {a}}$
by replacing every
$\forall ^1 X$
and
$\exists ^1 X$
occurring in
$\phi $
to
$\forall ^1 X (X \text { is amenable})\to \cdots $
and
$\exists ^1 X (X \text { is amenable})\land \cdots $
.
By Corollary 5.31, Class Separation holds for amenable classes: that is, if A is amenable and
$\phi (x,p,C)$
is a second-order quantifier-free formula whose free variables are all expressed, then
$\{x\mid \phi (x,p,A)\}$
is also amenable. Thus the
$\mathfrak {a}$
-interpretation of Class Separation holds. Moreover, it is easy to see that the
$\mathfrak {a}$
-interpretation of the other axioms of
$\mathsf {IGB_\infty +TR}$
are valid.
6 Heyting-valued interpretation and the double-negation interpretation
In this section, we will develop tools to analyze the consistency strength of large set axioms over constructive set theories. The main tool we will use is the double-negation translation. Especially, we will heavily rely on Gambino’s Heyting-valued interpretation ([Reference Gambino19, Chapter 5] or [Reference Gambino20]) with the double-negation topology which has a similar presentation to the classical Boolean-valued interpretations. Grayson [Reference Grayson27] and Bell [Reference Bell11] have also introduced a way to interpret classical
$\mathsf {ZF}$
from
$\mathsf {IZF}$
using an easier method whose connection to Boolean-valued models is even clearer. However, we will use Gambino’s method because it works over the much weaker background theory of
$\mathsf {CZF}^-$
.
6.1 Heyting-valued interpretation of
$\mathsf {CZF}^-$
Forcing is a powerful tool to construct a model of set theory. Gambino’s definition of Heyting-valued model (or alternatively, forcing) opens up a way to produce models of
$\mathsf {CZF}^-$
. His Heyting-valued model starts from a formal topology, which formalizes a poset of open sets with a covering relation:
Definition 6.1. A formal topology is a structure
$\mathcal {S}=(S,\le ,\vartriangleleft )$
such that
$(S,\le )$
is a poset and
$\vartriangleleft \subseteq S\times \mathcal {P}(S)$
satisfies the following conditions:
-
1. if
$a\in p$ , then
$a\vartriangleleft p$ ,
-
2. if
$a\le b$ and
$b\vartriangleleft p$ , then
$a\vartriangleleft p$ ,
-
3. if
$a\vartriangleleft p$ and
$\forall x\in p (x\vartriangleleft q)$ , then
$a\vartriangleleft q$ , and
-
4. if
$a\vartriangleleft p, q$ , then
$a\vartriangleleft (\operatorname {\mathrm {\downarrow }} p)\cap (\operatorname {\mathrm {\downarrow }} q)$ , where
$\operatorname {\mathrm {\downarrow }} p = \{b\in S \mid \exists c \in p (b\le c )\}$ .
Intuitively, S describes a basis of a topology, and
$\vartriangleleft $
is a covering relation. Then, for each collection of ‘open sets’ p, we have the notion of a nucleus,
$\jmath p$
, which is the set of all open sets that are covered by p. We can view
$\jmath p$
as a ‘union’ of all open sets in p, defined by

Then the class
$\operatorname {Low}(\mathcal {S})_\jmath $
of all lower subsetsFootnote
13
that are stable under
$\jmath $
(i.e.,
$\jmath p = p$
) form a set-generated frame:
Definition 6.2. A structure
$\mathcal {A}=(A,\le ,\bigvee ,\land ,\top ,g)$
is a set-generated frame if
$(A,\le ,\bigvee ,\land ,\top )$
is a complete distributive lattice with the generating set
$g\subseteq A$
, such that the class
$g_a=\{x\in g\mid x\le a\}$
is a set, and
$a=\bigvee g_a$
for any
$a\in A$
.
The reader is reminded that we can endow a Heyting algebra structure over a set-generated frame. For example, we can define
$a\to b$
by
$a\to b = \bigvee \{x\in g\mid x\land a\le b\}$
,
$\bot $
by
$\varnothing $
, and
$\bigwedge p$
by
$\bigvee \{x\in g\mid \forall y\in p (x\le y)\}$
.
Proposition 6.3. For every formal topology
$\mathcal {S}$
, the class
$\operatorname {Low}(\mathcal {S})_\jmath $
has a set-generated frame structure under the following definition of relations and operations
$:$
-
•
$p\land q=p\cap q$ ,
-
•
$\bigvee p=\jmath (\bigcup p)$ ,
-
•
$\top = S$ ,
-
•
$\le $ as the inclusion relation, and
-
•
$g=\{\{x\}\mid x\in S\}$ .
Furthermore, we can make
$\operatorname {Low}(\mathcal {S})_\jmath $
a Heyting algebra with the following additional operations
$:$
-
•
$p\lor q=\jmath (p\cup q)$ ,
-
•
$p\to q=\{x\in S\mid x\in p\to x\in q\}$ ,
-
•
$\bigwedge p=\bigcap p$ .
We extend the nucleus
$\jmath $
to a lower subclass
$P\subseteq S$
, which is a subclass of
$\mathcal {S}$
satisfying
$P = \operatorname {\mathrm {\downarrow }} P := \{a\in \mathcal {S} \mid \exists b\in P (a\le b)\}$
, by taking

Then we define Heyting operations for classes as follows:
-
•
$P\land Q=P\cap Q$ ,
-
•
$P\lor Q=J(P\cup Q)$ ,
-
•
$P\to Q =\{x\in S\mid x\in P\to x\in Q\}$ .
For a set-indexed collection of classes
$\{P_x\mid x\in I\}$
, take
$\bigwedge _{x\in I} P_x=\bigcap _{x\in I} P_x$
and
$\bigvee _{x\in I} P_x=J\left (\bigcup _{x\in I} P_x\right )$
.
The following results can be proven by direct calculation and so we omit their proofs:
Remark 6.4.
-
• If p is a set then
$Jp = \jmath p$ .
-
• For any lower subclass
$P \subseteq S$ ,
$P \subseteq JP$ .
-
• If P, Q and R are subclasses of S which are stable under J then
$R \subseteq (P \rightarrow Q)$ if and only if
$R \land P \subseteq Q$ .
-
• If
$\{ P_x \mid x \in I \}$ is a family of subclasses of S such that for each
$x \in I$ ,
$JP_x = P_x$ and R is another subclass of S such that
$JR = R$ , then
$R \subseteq \bigwedge _{x \in I} P_x$ if and only if
$R \subseteq P_x$ for each
$x \in I$ .
Given
$x \in S$
one can consider the class of all
$p \subseteq S$
such that
$x \in \jmath p$
, namely the collection of all covers of x. In general, this need not be a set; however, in many cases it can be sufficiently approximated by a set. This is particularly vital to verify Subset Collection in our eventual model in Theorem 6.11.
Definition 6.5. A formal topology
$(\mathcal {S},\le ,\vartriangleleft )$
is said to be set-presentable if there is a set-presentation
$R:\mathcal {S}\to \mathcal {P(P(S))}$
, which is a set function satisfying

for all
$a\in \mathcal {S}$
and
$p\in \mathcal {P(S)}$
.
Since some readers may not be familiar with the definition of formal topology, we give here a brief informal description of it. We then refer the reader to Chapter 4 of [Reference Gambino19] or Chapter 15 of [Reference Aczel and Rathjen6] for a more detailed account. The notion of formal topology stems from an attempt to formulate point-free topology over a predicative system such as Martin-Löf type theory. Thus we may view
$(\mathcal {S}, \le )$
as a collection of open sets.
We usually describe open sets by using a subbasis, and sometimes the full topology is more complex than the subbasis. This issue can particularly arise in the constructive set-theoretic context and, even worse, it could be that while a subbasis is a set, the whole topology generated by the subbasis is a proper class. In that case, we want to have a simpler surrogate for the full topology. This explains why we do not define the formal topology as a
$\bigvee $
-semilattice. Since the covering relation plays a pivotal role in topology and sheaf theory, we also formulate the covering relation
$\vartriangleleft $
into the definition of formal topology. Producing
$\operatorname {Low}(\mathcal {S})_\jmath $
from the formal topology corresponds to recovering the full topology from a subbasis.
Although we hope that
$\mathcal {S}$
will be as simple as possible, the use of a ‘complex’ formal topology is sometimes unavoidable. For example, even the natural double-negation formal topology defined over
$\mathcal {P}(1)$
, which we will shortly define, is a class unless we have Power Set. Thus we want to define a ‘small formal topology’ separately, and the notion of set-presented formal topology is exactly for such a purpose. Roughly, the set-representation R decomposes
$a\in \mathcal {S}$
into some collection of ‘open sets’, and we can track the covering relation by using R.
The Heyting universe
$V^{\mathcal {S}}$
over
$\mathcal {S}$
is defined inductively as follows:
$a\in V^{\mathcal {S}}$
if and only if a is a function from a set-sized subset of
$V^{\mathcal {S}}$
to
$\operatorname {Low}(\mathcal {S})_\jmath $
. For each set x, we have the canonical representation
$\check {x}\in V^{\mathcal {S}}$
of x recursively defined by
$\operatorname {dom} \check {x}=\{\check {y}\mid y\in x\}$
and
$\check {x}(\check {y})=\top $
. We can now define the Heyting interpretation,
$[\mkern -2mu[\phi ]\mkern -2mu]\in \operatorname {Low}_\jmath $
, with parameters in
$V^{\mathcal {S}}$
as follows:
Definition 6.6. Let
$\phi $
be a formula of first-order set theory and
$\vec {a}\in V^{\mathcal {S}}$
. Then we define the Heyting-valued interpretation,
$[\mkern -2mu[\phi (\vec {a})]\mkern -2mu]$
, as follows:
-
•
$\kern-5pt \begin {array}[t]{lr} [\mkern -2mu[ a = b]\mkern -2mu] = & \left (\bigwedge _{x\in \operatorname {dom} a}a(x)\to \bigvee _{y\in \operatorname {dom} b} b(y)\land [\mkern -2mu[ x=y]\mkern -2mu]\right ) \land \\ & \left (\bigwedge _{y\in \operatorname {dom} b}b(y)\to \bigvee _{x\in \operatorname {dom} a}a(x)\land [\mkern -2mu[ x=y]\mkern -2mu]\right ),\end {array}$
-
•
$[\mkern -2mu[ a\in b]\mkern -2mu]=\bigvee _{y\in \operatorname {dom} b} b(y)\land [\mkern -2mu[ a=y]\mkern -2mu]$ ,
-
•
$[\mkern -2mu[\bot ]\mkern -2mu]=\bot $ ,
$[\mkern -2mu[ \phi \land \psi ]\mkern -2mu]=[\mkern -2mu[\phi ]\mkern -2mu]\land [\mkern -2mu[\psi ]\mkern -2mu]$ ,
$[\mkern -2mu[ \phi \lor \psi ]\mkern -2mu]=[\mkern -2mu[\phi ]\mkern -2mu]\lor [\mkern -2mu[\psi ]\mkern -2mu]$ ,
$[\mkern -2mu[ \phi \to \psi ]\mkern -2mu]=[\mkern -2mu[\phi ]\mkern -2mu]\to [\mkern -2mu[\psi ]\mkern -2mu]$ , and
$[\mkern -2mu[ \lnot \phi ]\mkern -2mu]=[\mkern -2mu[\phi \to \bot ]\mkern -2mu]$ ,
-
•
$[\mkern -2mu[\forall x\in a\phi (x)]\mkern -2mu]=\bigwedge _{x\in \operatorname {dom} a}a(x)\to [\mkern -2mu[\phi (x)]\mkern -2mu]$ ,
-
•
$[\mkern -2mu[\exists x\in a\phi (x)]\mkern -2mu]=\bigvee _{x\in \operatorname {dom} a}a(x)\land [\mkern -2mu[\phi (x)]\mkern -2mu]$ ,
-
•
$[\mkern -2mu[\forall x\phi (x)]\mkern -2mu]=\bigwedge _{x\in V^{\mathcal {S}}}[\mkern -2mu[\phi (x)]\mkern -2mu]$ and
$[\mkern -2mu[\exists x\phi (x)]\mkern -2mu]=\bigvee _{x\in V^{\mathcal {S}}}[\mkern -2mu[\phi (x)]\mkern -2mu]$ .
We write
$V^{\mathcal {S}} \models \phi $
when
$[\mkern -2mu[ \phi ]\mkern -2mu] = \top $
holds, and in such a case we say that
$\phi $
is valid in
$V^{\mathcal {S}}$
.
The next result is that the interpretation validates every axiom of
$\mathsf {CZF}^-$
. Since the proof of this was already done in [Reference Gambino20], we will omit most of the proof. However we will replicate the proof of the validity of Strong Collection because it is more involved and the method will be necessary to show the persistence of BCST-regularity in Theorem 6.24.
Let
$a\in V^{\mathcal {S}}$
and R be a class. We want to show the following statement holds:

Here
$\to $
is translated to a Heyting implication operation, and the focal property of the implication operation is the following:
$q\to r=\top $
if and only if for every
$p\le q$
, we have
$p\le r$
.
Hence we try the following strategy: take any
$p\in \operatorname {Low}(\mathcal {S})_\jmath $
such that
$p\subseteq [\mkern -2mu[ R \colon a \rightrightarrows V ]\mkern -2mu]$
(that is,
$p \leq [\mkern -2mu[ R \colon a \rightrightarrows V ]\mkern -2mu]$
in the inclusion ordering). Then we claim that we can find
$b\in V^{\mathcal {S}}$
such that
.
We will, and must use a form of Strong Collection to prove the validity of Strong Collection over
$V^{\mathcal {S}}$
. The role of Strong Collection is to confine the codomain of a class-sized multi-valued function to a set-sized range. Next note that, using the third and fourth items in Remark 6.4, the assumption
$p\subseteq [\mkern -2mu[ R\colon a\rightrightarrows V]\mkern -2mu]$
is equivalent to
$p\land a(x)\subseteq [\mkern -2mu[ \exists y\ R(x,y)]\mkern -2mu]$
for all
$x\in \operatorname {dom} a$
. So, using this assumption, we can codify the relation R by using

Intuitively, P encodes a family of classes

Since
$[\mkern -2mu[ R(x,y)]\mkern -2mu]$
could be a proper class in which we cannot form a collection of all
$p\land a(x)\land [\mkern -2mu[ R(x,y)]\mkern -2mu] $
, we introduce P to code this family into a single class.
We will construct b by searching for an appropriate subset of P and making use of it. We will cast appropriate lemmas when we need them.
Lemma 6.7. Fix
$p\in \operatorname {Low}(\mathcal {S})_\jmath $
and
$a\in V^{\mathcal {S}}$
such that
$p\subseteq [\mkern -2mu[ R\colon a\rightrightarrows V]\mkern -2mu]$
. Let P be the class we defined before. Then we can find a set
$r\subseteq P$
such that
$p\land a(x)\subseteq \jmath \{ z\mid \exists y \ \langle x,y,z\rangle \in r\}$
for all
$x \in \operatorname {dom} a$
.
Proof Before starting the proof, let us remark that the b we will eventually construct as our witness for Strong Collection will satisfy
$\jmath \{ z\mid \exists y \ \langle x,y,z\rangle \in r\}\subseteq [\mkern -2mu[ \exists y\in b\ R(x,y)]\mkern -2mu]$
.
Observe that
$p\subseteq [\mkern -2mu[ R\colon a\rightrightarrows V]\mkern -2mu]$
is equivalent to
$\forall x\in \operatorname {dom} a (p\land a(x)\subseteq \bigvee _{y\in V^{\mathcal {S}}} [\mkern -2mu[ R(x,y)]\mkern -2mu])$
. Hence we have

for every
$x\in \operatorname {dom} a$
. We want to have a family of classes
$Q_x = \bigcup _{y\in V^{\mathcal {S}}} p\land a(x)\land [\mkern -2mu[ R(x,y)]\mkern -2mu]$
. For this, we define the coding Q of the family
$Q_x$
by

and let
$Q_x = \{z\mid \langle x,z\rangle \in Q\}$
, which one can easily see satisfies our requirement. Then
$p\land a(x)\subseteq JQ_x$
. By the definition of J, we have the following sublemma, which is Lemma 2.8 of [Reference Gambino20]:
Lemma 6.8. Let P be a lower subclass of
$\mathcal {S}$
and
$u\subseteq JP$
. Then we can find
$v\subseteq P$
such that
$u\subseteq \jmath v$
.
In sum, we have that for each
$x\in \operatorname {dom} a$
there is a
$v\subseteq Q_x$
such that
$p\land a(x)\subseteq \jmath v$
. The following lemma shows we can find v in a uniform way:
Lemma 6.9. Let a be a set,
$S \colon a \rightrightarrows V$
a multi-valued class function, and
$Q\subseteq a\times V$
a class. For each
$x \in a$
, let
$Q_x := \{ z \mid \langle x, z \rangle \in Q\}$
. Moreover, assume that
$:$
-
1. for each
$x\in a$ there is
$u\subseteq Q_x$ such that
$S(x,u)$ holds and
-
2.
$($ Monotone Closure
$)$ if
$S(x,u)$ holds and
$u\subseteq v\subseteq Q_x$ then
$S(x,v)$ ,
then there is
$f\colon a\to V$
such that
$f(x)\subseteq Q_x$
and
$S(x,f(x))$
for all
$x\in a$
.
Proof Consider the multi-valued function
$S'$
of domain a defined by

By mimicking the proof of Lemma 2.11, we can find a set
$g\colon a\rightrightarrows V$
such that
$g\subseteq S'$
. Now take
$f(x)=\bigcup g_x = \bigcup \{u \mid \langle x,u\rangle \in g\}$
, which is a set by Union and Replacement. We can see that
$f(x) \subseteq Q_x$
for each
$x\in a$
. By the first clause of our assumptions and monotone closure of S, we have
$S(x,f(x))$
for all
$x\in a$
.
Let us return to the proof of Lemma 6.7. Consider the relation S defined by

It is easy to see that S satisfies the hypotheses of Lemma 6.9. Therefore, by Lemma 6.9 applied to S, we can find
$f\colon \operatorname {dom} a\to \operatorname {Low}(\mathcal {S})_\jmath $
such that
$f(x)\subseteq Q_x$
and
$p\land a(x)\subseteq \jmath f(x)$
.
Recall that
$f(x)\subseteq Q_x = \bigcup _{y\in V^{\mathcal {S}}} p\land a(x)\land [\mkern -2mu[ R(x,y)]\mkern -2mu]$
. Hence for each
$x\in \operatorname {dom} a$
and
$z\in f(x)$
we can find
$y\in V^{\mathcal {S}}$
such that
$z\in p\land a(x)\land [\mkern -2mu[ R(x,y)]\mkern -2mu]$
. However, the class of such y may not be a set, so we will apply Strong Collection to find a subset of the class uniformly.
Now let

Then for each
$\langle x,z\rangle \in q$
there is y such that
$\langle x,y,z\rangle \in P$
. Now consider the multi-valued function
$P'\colon q\rightrightarrows V^{\mathcal {S}}$
defined by

By Strong Collection applied to
$\mathcal {A}(P')\colon q \rightrightarrows q \times V^{\mathcal {S}}$
, we have d such that
. By Lemma 2.10,
$d\subseteq P'$
and
$d\colon q \rightrightarrows V^{\mathcal {S}}$
. Define
$g(x,z) = \{y \mid \langle \langle x,z\rangle , y\rangle \in d\}$
. Then we can see that
$y\in g(x,z)$
implies
$\langle x,y,z\rangle \in P$
.
Finally, let

It is clear that r is a set. We know that
$p\land a(x)\subseteq \jmath f(x)$
, and
$z\in f(x)$
implies
$\exists y[y\in g(x,z)]$
, so
$\exists y [\langle x,y,z\rangle \in r]$
. By combining these facts, we have
$p\land a(x)\subseteq \jmath \{z\mid \exists y\ \langle x,y,z\rangle \in r\}$
. Thus concluding the proof of Lemma 6.7.
Proposition 6.10. Working over
$\mathsf {CZF}^-$
,
$V^{\mathcal {S}}$
validates Strong Collection.
Proof Let
$a\in V^{\mathcal {S}}$
and let R be a class relation. Let
$p\in \operatorname {Low}(\mathcal {S})_\jmath $
and assume that
$p\subseteq [\mkern -2mu[ R\colon a\rightrightarrows V]\mkern -2mu]$
. By Lemma 6.7 (and using the previously defined notation), we can fix some
$r\subseteq P$
such that

holds for all
$x\in \operatorname {dom} a$
. Now define
$b\in V^{\mathcal {S}}$
as follows:

and

We claim that .
-
•
$p\subseteq [\mkern -2mu[ R:a\rightrightarrows b]\mkern -2mu]$ : if
$\langle x,y,z\rangle \in r$ , then
$y\in \operatorname {dom} b$ , and hence
$$ \begin{align*} p\land a(x)& \subseteq \jmath \{ z\mid \exists y \ \langle x,y,z\rangle\in r\} \subseteq \bigvee_{y\in\operatorname{dom} b} \jmath \{z \mid \langle x,y,z\rangle\in r\} \\ &\subseteq \bigvee_{y\in\operatorname{dom} b} \jmath\{z\mid \langle x,y,z\rangle \in r\}\land p\land a(x)\land [\mkern-2mu[ R(x,y)]\mkern-2mu] \\ &\subseteq \bigvee_{y\in\operatorname{dom} b} b(y) \land [\mkern-2mu[ R(x,y)]\mkern-2mu] = [\mkern-2mu[ \exists y\in b \ R(x,y)]\mkern-2mu]. \end{align*} $$
-
•
$p\subseteq [\mkern -2mu[ R:b\rightrightarrows a]\mkern -2mu]$ : note that
$\langle x,y,z\rangle \in r$ implies
$x\in \operatorname {dom} a$ . Hence
$$ \begin{align*} b(y)& = \jmath \{z \mid \exists x \in \operatorname{dom} a \langle x,y,z\rangle \in r\} \\ & \subseteq \bigvee_{x\in\operatorname{dom} a} a(x)\land [\mkern-2mu[ R(x,y) ]\mkern-2mu] = [\mkern-2mu[\exists x\in a\ R(x,y)]\mkern-2mu].\\[-42pt] \end{align*} $$
Hence we have:
Theorem 6.11. Working over
$\mathsf {CZF}^-$
, the Heyting-valued model
$V^{\mathcal {S}}$
also satisfies
$\mathsf {CZF}^-$
. If
$\mathcal {S}$
is set-presented and Subset Collection holds, then
$V^{\mathcal {S}}\models \mathsf {CZF}$
. Furthermore, if our background theory satisfies Full Separation or Power Set, then so does
$V^{\mathcal {S}}$
respectively.
Proof The first part of the theorem is shown by Gambino [Reference Gambino20], and we have already replicated the proof for the validity of Strong Collection. Hence we omit this part of the proof, and we concentrate on the preservation of Full Separation and Power Set.
For Full Separation, it suffices to see that the proof for Bounded Separation over
$V^{\mathcal {S}}$
also works for Full Separation. It actually works since Full Separation ensures
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set for every formula
$\phi $
because
$[\mkern -2mu[\forall x \phi (x)]\mkern -2mu] = \{s\in S \mid \forall x (x\in V^{\mathcal {S}} \to s\in [\mkern -2mu[\phi (x)]\mkern -2mu])\}$
and
$[\mkern -2mu[\exists x\phi (x)]\mkern -2mu] = \jmath (\{s\in S\mid \exists x (x\in V^{\mathcal {S}} \land s\in [\mkern -2mu[\phi (x)]\mkern -2mu])\})$
.
For Power Set, let
$a\in V^{\mathcal {S}}$
. We can show that
$\operatorname {Low}_\jmath (S)$
is a set due to Power Set. Thus we have the name
$b \in V^{\mathcal {S}}$
defined by
$\operatorname {dom} b = {{}^{\operatorname {dom} a}}(\operatorname {Low}_\jmath (\mathcal {S}))$
and
$b(c)=\top $
. We claim that b witnesses Power Set.
Let
$c\in V^{\mathcal {S}}$
. We will find
$d\in \operatorname {dom} b$
such that

Let d be the name such that
$\operatorname {dom} d=\operatorname {dom} a$
and
$d(y)=[\mkern -2mu[ y\in c]\mkern -2mu]$
. Then
$d\in \operatorname {dom} b$
. Furthermore, we have

and

Hence
$[\mkern -2mu[ c\subseteq a]\mkern -2mu] \le [\mkern -2mu[ c=d]\mkern -2mu]$
.
Let us finish this subsection with some constructors, which we will need later.
Definition 6.12. For
$\mathcal {S}$
-names a and b,
$\mathsf {up}(a,b)$
is defined by
$\operatorname {dom}(\mathsf {up}(a,b))=\{a,b\}$
and
$(\mathsf {up}(a,b))(x)=\top $
.
$\mathsf {op}(a,b)$
is the name defined by
$\mathsf {op}(a,b)=\mathsf {up}(\mathsf {up}(a,a),\mathsf {up}(a,b))$
.
$\mathsf {up}(a,b)$
is a canonical name for the unordered pair
$\{a,b\}$
over
$V^{\mathcal {S}}$
. That is, we can prove that

Hence the name
$\mathsf {op}(a,b)$
is the canonical name for the ordered pair given by a and b over
$V^{\mathcal {S}}$
.
6.2 Double-negation formal topology
Our main tool to determine the consistency strength of the theories in this paper is the Heyting-valued interpretation with the double-negation formal topology.
Definition 6.13. The double-negation formal topology,
$\Omega $
, is the formal topology
$(1,=,\vartriangleleft )$
, where
$x\vartriangleleft p$
if and only if
$\lnot \lnot (x\in p)$
.
Unlike set-sized realizability or set-represented formal topology, the double-negation topology and the resulting Heyting-valued interpretation need not be absolute between BCST-regular sets or transitive models of
$\mathsf {CZF}^-$
. For example, even for a transitive model M of
$\mathsf {CZF}^-$
, it need not be the case that
$\Omega =\Omega ^M$
holds. This is because
$\vartriangleleft \subseteq 1 \times \mathcal {P}(1)$
and
$\mathcal {P}(1) = \mathcal {P}^M(1)$
may not in general hold between transitive sets. For instance, if
$\varphi ^M$
is not logically equivalent to
$\varphi $
then it is unclear why
$\{ 0 \in 1 \mid \varphi \}$
should be in
$M \cap \mathcal {P}(1)$
. One example of the failure of
$\Omega =\Omega ^M$
happens when M is the set
$\mathsf {HF}$
of all hereditarily finite sets. Reference [Reference Jeon32] proved that
$\mathsf {HF}$
is a model of
$\mathsf {CZF}$
without Infinity, and it additionally satisfies
$\mathsf {LEM}$
for atomic formulas. This shows
$\Omega ^{\mathsf {HF}}=2$
. However, there is no reason to believe that
$\Omega =2$
in general unless we have
$\Delta _0\text {-}\mathsf {LEM}$
. Hence we need a careful analysis of the double-negation formal topology, which is the aim of this subsection.
We can see that the class of lower sets,
$\operatorname {Low}(\Omega )=\{p\subseteq 1 \mid p= \operatorname {\mathrm {\downarrow }} p\}$
, is just the powerclass of 1,
$\mathcal {P}(1)$
, and the nucleus of
$\Omega $
is given by the double complement
$p^{\lnot \lnot } = (p^\lnot )^\lnot $
, where

so
$p^{\lnot \lnot } = \{0\mid \lnot \lnot (0\in p)\}$
. Hence
$\operatorname {Low}(\Omega )_\jmath $
is the collection of all stable subsets of 1, that is, those sets
$p\subseteq 1$
such that
$p=p^{\lnot \lnot }$
.
The main feature of
$\Omega $
is that Heyting-valued interpretation over
$\Omega $
forces a law of excluded middle for some class of formulas:
Proposition 6.14. Let
$p\in \operatorname {Low}(\Omega ) = \mathcal {P}(1)$
. Then
$(p\cup p^\lnot )^{\lnot \lnot } =1$
. As a corollary, if
$[\mkern -2mu[ \phi ]\mkern -2mu]$
is a set, then
$[\mkern -2mu[\phi \lor \lnot \phi ]\mkern -2mu]=1$
. Especially,
-
1.
$[\mkern -2mu[\phi \lor \lnot \phi ]\mkern -2mu]=1$ holds for every bounded formula
$\phi $ , and
-
2. if Full Separation holds, then
$[\mkern -2mu[\phi \lor \lnot \phi ]\mkern -2mu]=1$ holds for every
$\phi $ .
Proof
$\lnot \lnot (p\cup \lnot p)=1$
follows from the fact that
$\lnot \lnot (\phi \lor \lnot \phi )$
is derivable in intuitionistic logic. Moreover,
$[\mkern -2mu[\phi \lor \lnot \phi ]\mkern -2mu] = ([\mkern -2mu[\phi ]\mkern -2mu] \cup \lnot [\mkern -2mu[\phi ]\mkern -2mu])^{\lnot \lnot }$
if
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set. Lastly, under
$\mathsf {CZF}^-$
, if
$\phi $
is bounded then
$[\mkern -2mu[ \phi ]\mkern -2mu]$
is a set and if Full Separation holds then
$[\mkern -2mu[ \phi ]\mkern -2mu]$
is a set for every
$\phi $
.
The following corollary is immediate from the previous proposition and Theorem 6.11:
Corollary 6.15. If V satisfies
$\mathsf {CZF}^-$
, then
$V^\Omega \models \mathsf {CZF}^-+\Delta _0\text {-}\mathsf { LEM}$
. Furthermore, if V satisfies Full Separation, then
$V^\Omega \models \mathsf {ZF}^-$
.
We will frequently mention the relativized Heyting-valued interpretation. For a transitive model A of
$\mathsf {CZF}^-$
, we can consider the construction of
$V^\Omega $
internal to A. We define the following notions to distinguish relativized interpretation from the usual one.
Definition 6.16. Let A be a transitive model of
$\mathsf {CZF}^-$
. Then
$A^\Omega :=(V^\Omega )^A$
is the
$\Omega $
-valued universe relativized to A. If A is a set, then
$\tilde {A}$
denotes the
$\Omega $
-name defined by
$\operatorname {dom} \tilde {A}:=A^\Omega $
and
$\tilde {A}(x)=\top $
for all
$x\in \operatorname {dom} \tilde {A}$
.
We shall see in Lemma 6.17 that whenever A is a set, so is
$A^\Omega $
which means that the definition of
$\tilde {A}$
is well defined. Also, we will often confuse
$\tilde {A}$
and
$A^\Omega $
if the context is clear. Finally, it is worth mentioning that if j is an elementary embedding, then
$j(\tilde {K})=\widetilde {j(K)}$
, so we may write
$j^n(\tilde {K})$
instead of
$\widetilde {j^n(K)}$
.
As before with
$\Omega $
, we do not know whether
$\operatorname {Low}(\Omega )_\jmath $
is equal to its relativization
$(\operatorname {Low}(\Omega )_\jmath )^M$
in M. As a result, we do not know whether its Heyting-valued universe,
$V^\Omega $
, and Heyting-valued interpretation,
$[\mkern -2mu[\cdot ]\mkern -2mu]$
, are absolute. Fortunately, the formula
$p\in \operatorname {Low}(\Omega )_\jmath $
, which is
$p\subseteq 1\land p=p^{\lnot \lnot }$
, is
$\Delta _0$
. Hence
$p\in \operatorname {Low}(\Omega )_\jmath $
is absolute between transitive models of
$\mathsf {CZF}^-$
. As a result, we have the following absoluteness result on the Heyting-valued universe:
Lemma 6.17. Let A be a transitive model of
$\mathsf {CZF}^-$
without Infinity. Then we have
$A^\Omega =V^\Omega \cap A$
. Moreover, if A is a set, then
$A^\Omega $
is also a set.
Proof We will follow the proof of [Reference Rathjen50, Lemma 6.1]. Let
$\Phi $
be the inductive definition given by

We can see that the
$\Phi $
defines the class
$V^\Omega $
. Furthermore,
$\Phi $
is
$\Delta _0$
, so it is absolute between transitive models of
$\mathsf {CZF}^-$
. By Lemma 2.18, we have a class J such that
$V^\Omega = \bigcup _{a\in V} J^a$
, and for each
$s\in V$
,
$J^s=\Gamma _\Phi (\bigcup _{t\in s} J^t)$
. Now consider the operation
$\Upsilon $
given by

By Lemma 2.18 again, there is a class Y such that
$Y^s=\Upsilon (\bigcup _{t\in s}Y^t)$
for all
$s\in V$
. Furthermore, we can see that
$Y^s\subseteq V^\Omega $
by induction on a.
Let
$Y=\bigcup _{s\in A}Y^s$
. We claim by induction on s that
$J^s\cap A\subseteq Y$
. Assume that
$J^t\cap A\subseteq Y$
holds for all
$t\in s$
. If
$a\in J^s\cap A$
, then the domain of a is a subset of
$A\cap \left (\bigcup _{t\in s} J^t\right )$
, which is a subclass of Y by the inductive assumption and the transitivity of A. Moreover, for each
$x\in \operatorname {dom} a$
there is
$u \in A$
such that
$x\in Y^u$
. By Strong Collection over A, there is
$v\in A$
such that for each
$x\in \operatorname {dom} a$
there is
$u\in v$
such that
$x\in Y^u$
. Hence
$\operatorname {dom} a\subseteq \bigcup _{u\in v} Y^u$
, which implies
$a\in Y^v\subseteq Y$
.
Hence
$V^\Omega \cap A\subseteq Y$
, which gives us that
$Y=V^\Omega \cap A$
. We can see that the construction of Y is the relativized construction of
$V^\Omega $
to A, so
$Y=A^\Omega $
. Hence
$A^\Omega =V^\Omega \cap A$
. If A is a set, then
$\Upsilon (X)$
is a set for each set X, so we can see by induction on a that
$Y^a$
is also a set for each
$a\in A$
. Hence
$A^\Omega =Y=\bigcup _{a\in A} Y^a$
is also a set.
We extended the nucleus
$\jmath $
to J for subclasses of
$\operatorname {Low}\mathcal {S}$
, and used it to define the validity of formulas of the forcing language. Now, we are working with the specific formal topology
$\mathcal {S}=\Omega $
, and in this case, for a class
$P\subseteq 1$
,
$JP = \bigcup \{q^{\lnot \lnot } \mid q\subseteq P\}$
. It is easy to see that
$P\subseteq JP\subseteq P^{\lnot \lnot }$
.
We also define the following relativized notion for any transitive class A such that
$1\in A$
:

If
$P\in A$
, then
$J^AP=P^{\lnot \lnot }$
, and in general, we have
$P\subseteq J^AP\subseteq JP\subseteq P^{\lnot \lnot }$
. Moreover, we can prove the following facts by straight forward computations.
Lemma 6.18. Let A and B be transitive classes such that
$1\in A,B$
and let
$P\subseteq 1$
be a class.
-
1.
$A\subseteq B$ implies
$J^AP\subseteq J^BP$ .
-
2. If
$\mathcal {P}(1)\cap A=\mathcal {P}(1)\cap B$ , then
$J^AP=J^BP$ .
However, the following proposition shows that
$\mathsf {CZF}^-$
does not prove
$J^AP=J^BP$
or
$JP = P^{\lnot \lnot }$
in general:
Proposition 6.19.
-
1. If
$A\cap \mathcal {P}(1)=2 $ (for example when
$A=2$ or
$A=V$ and
$\Delta _0\text {-}\mathsf {LEM}$ holds
$)$ , then
$J^AP=P$ .
-
2. If
$P^{\lnot \lnot }\subseteq JP$ for every class P, then if
$\Delta _0\text {-} \mathsf {LEM}$ holds so does the law of excluded middle for arbitrary formulas.
$J^A$
has a crucial role in defining Heyting-valued interpretation, but
$J^A$
and
$J^B$
might have different effects unless
$A=B$
. This causes absoluteness problems, which appears to be impossible to in general avoid.
The following lemma states provable facts about relativized Heyting interpretations:
Lemma 6.20. Let
$A\subseteq B$
be transitive models of
$\mathsf {CZF}^-$
. Assume that
$\phi $
is a formula with parameters in
$A^\Omega $
.
-
1. If
$\phi $ is bounded, then
$[\mkern -2mu[\phi ]\mkern -2mu]^A=[\mkern -2mu[\phi ]\mkern -2mu]^B$ .
-
2. If
$\phi $ only contains bounded quantifications, logical connectives between bounded formulas, unbounded
$\forall $ , and
$\land $ , then
$[\mkern -2mu[\phi ]\mkern -2mu]^A=[\mkern -2mu[\phi ^{\tilde {A}}]\mkern -2mu]^B$ .
-
3. If every conditional appearing as a subformula of
$\phi $ is of the form
$\psi \to \chi $ for a bounded formula
$\psi $ and a formula
$\chi $ , then
$[\mkern -2mu[\phi ]\mkern -2mu]^A\subseteq [\mkern -2mu[\phi ^{\tilde {A}}]\mkern -2mu]^B$ .
-
4. If
$\mathcal {P}(1)\cap A=\mathcal {P}(1)\cap B$ , then
$[\mkern -2mu[\phi ]\mkern -2mu]^A=[\mkern -2mu[\phi ^{\tilde {A}}]\mkern -2mu]^B$ .
Proof
-
1. If
$\phi $ is bounded, then
$[\mkern -2mu[\phi ]\mkern -2mu]$ is defined in terms of double complement, Heyting connectives between subsets of 1, and set-sized union and intersection. These notions are absolute between transitive sets, so we can prove
$[\mkern -2mu[\phi ]\mkern -2mu]$ is also absolute by induction on
$\phi $ . In the case of atomic formulas as an initial stage, we apply the induction on
$A^\Omega $ -names.
-
2. We apply induction on formulas. For the unbounded
$\forall $ , we have
$$ \begin{align*} [\mkern-2mu[\forall x\phi(x)]\mkern-2mu]^A = \bigwedge_{x\in A^\Omega}[\mkern-2mu[ \phi(x)]\mkern-2mu]^A = \bigwedge_{x\in A^\Omega} [\mkern-2mu[\phi^{\tilde{A}}(x)]\mkern-2mu]^B = [\mkern-2mu[\forall x\in\tilde{A} \phi^{\tilde{A}}(x)]\mkern-2mu]^B. \end{align*} $$
-
3. The proof again uses induction on formulas. By the previous argument, if
$[\mkern -2mu[\phi (x)]\mkern -2mu]^A\subseteq [\mkern -2mu[\phi ^{\tilde {A}}(x)]\mkern -2mu]^B$ for all
$x\in A^\Omega $ , then
$[\mkern -2mu[ \forall x \phi (x)]\mkern -2mu]^A\subseteq [\mkern -2mu[\forall x\in \tilde {A} \phi ^{\tilde {A}}(x)]\mkern -2mu]^B$ . For an unbounded
$\exists $ , we have
$$ \begin{align*} [\mkern-2mu[\exists x\phi(x)]\mkern-2mu]^A & = J^A \left(\bigcup \{[\mkern-2mu[ \phi(x)]\mkern-2mu]^A\mid x\in A^\Omega\}\right) \\ & \subseteq J^B \left(\bigcup \{[\mkern-2mu[ \phi^{\tilde{A}}(x)]\mkern-2mu]^B\mid x\in A^\Omega\}\right) = [\mkern-2mu[\exists x\in \tilde{A} \phi^{\tilde{A}}(x) ]\mkern-2mu]^B. \end{align*} $$
$\to $ , which requires some inspection to see how the inclusion works. By the assumption, our conditional is of the form
$\psi \to \chi $ for some bounded formula
$\psi $ and a (possibly unbounded) formula
$\chi $ . Since
$\psi $ is bounded, we have
$[\mkern -2mu[\psi ]\mkern -2mu]^A=[\mkern -2mu[\psi ^{\tilde {A}}]\mkern -2mu]^B$ . Furthermore, we can see that if
$a, b, c\in \operatorname {Low}(\Omega )_\jmath $ satisfies
$a\subseteq b$ , then
$c\to a\subseteq c\to b$ . Hence
$$ \begin{align*} [\mkern-2mu[ \psi\to\chi]\mkern-2mu]^A = \left( [\mkern-2mu[ \psi]\mkern-2mu]^A \to [\mkern-2mu[\chi]\mkern-2mu]^A \right)\subseteq \left( [\mkern-2mu[ \psi^{\tilde{A}}]\mkern-2mu]^B \to [\mkern-2mu[\chi^{\tilde{A}}]\mkern-2mu]^B \right) = [\mkern-2mu[(\psi\to\chi)^{\tilde{A}}]\mkern-2mu]^B. \end{align*} $$
-
4. We can see that
$[\mkern -2mu[\phi (x)]\mkern -2mu]^A = [\mkern -2mu[\phi ^{\tilde {A}}(x)]\mkern -2mu]^B$ holds by induction on
$\phi $ . The calculations we did before are helpful to see the inductive argument works for unbounded quantifications. In particular, we observe that if
$\mathcal {P}(1) \cap A = \mathcal {P}(1) \cap B$ then
$J^A = J^B$ .
Remark 6.21. In the third clause of Lemma 6.20, if
$A\in B$
, then
$\bigcup _{x\in A^\Omega } [\mkern -2mu[ \phi ^{\tilde {A}}(x)]\mkern -2mu]^B\in \mathcal {P}(1)\cap B$
. Thus, in this case, we have

Remark 6.22. We can see that Lemma 6.20 also holds for arbitrary formal topologies
$\mathcal {S}\in A$
. The proof is identical, and its verification is left to the reader.
6.3 Preservation of small large sets
Heyting-valued models do not necessarily preserve large sets unless we impose some additional restrictions. For example, on the one hand, Ziegler proved in [Reference Ziegler67] that large set properties are preserved under ‘small’ pcas and formal topologies.
Proposition 6.23 (Ziegler [Reference Ziegler67, Chapter 4]).
Let K be either a regular set, inaccessible set, critical set, or Reinhardt set.
-
1. Let
$\mathcal {A}$ be a pca and
$\mathcal {A}\in K$ . Then
$K[\mathcal {A}]:=V[\mathcal {A}]\cap K$ is a set and the realizability model
$V[\mathcal {A}]$ thinks
$K[\mathcal {A}]$ possesses the same large set property K does.
-
2. Let
$\mathcal {S}$ be a formal topology that is set-represented by R. Assume that
$\mathcal {S},R\in K$ . Then
$K^{\mathcal {S}} := V^{\mathcal {S}}\cap K$ is a set and
$V^{\mathcal {S}}$ thinks the canonical name
$\tilde {K}$ given by
$\operatorname {dom}\tilde {K}=K^{\mathcal {S}}$ ,
$\tilde {K}(a)=\top $ possesses the same large set property K does.
On the other hand, however, our double-negation formal topology will usually lose large set properties. For example,
$\mathsf {CZF}^-$
cannot prove that Heyting-valued interpretations under
$\Omega $
preserve regular sets regardless of how many regular sets exist in V: If it were possible, then
$\mathsf {CZF+REA}$
would interpret
$\mathsf {ZF}$
, but the latter theory proves the consistency of the former.
Hence our preservation results under the double-negation topology are also quite limited, which is a great obstacle for deriving the consistency strength of large sets over constructive set theories. Fortunately, almost all lower bounds we derived in Section 5 are models of
$\mathsf {IZF}$
. Furthermore, we mostly work with power inaccessible sets instead of mere inaccessible sets. This will make deriving the lower bounds easier, and we will examine the detailed account of the aforementioned statements in this subsection.
We mostly follow the proof of Lemma 6.7 and Proposition 6.10. However, for the sake of verification, we will provide most of the details of relevant lemmas and their proofs. Throughout this section, A and B are classes such that:
-
•
$A\in B$ (thus A is a set),
-
•
$\mathcal {P}(1)\cap A=\mathcal {P}(1)\cap B$ ,
-
• A is BCST-regular and B is a transitive (set or class) model of
$\mathsf {CZF}^-$ .
Finally,
$R\in B$
will denote a multi-valued function over the Heyting-valued universe
$B^\Omega $
unless specified.
The main goal of this subsection is proving the following preservation theorem:
Theorem 6.24. Let A be a BCST-regular set and
$B \supseteq A$
a transitive model of
$\mathsf {CZF}^-$
such that
$\mathcal {P}(1)\cap A = \mathcal {P}(1)\cap B$
. Then
$B^\Omega $
thinks
$\tilde {A}$
is BCST-regular.
Its proof requires a sequence of lemmas in a similar way to how we proved Proposition 6.10, the validity of Strong Collection over
$V^{\mathcal {S}}$
. The following lemma is an analogue of Lemma 6.9:
Lemma 6.25. Let
$a\in A$
,
$S \colon a\rightrightarrows A$
a multi-valued function, and
$Q\subseteq a\times A$
a class. Moreover, assume that
$:$
-
1. for each
$x\in a$ there is
$u\subseteq Q_x=\{z \mid \langle x,z\rangle \in Q\}$ such that
$\langle x,u\rangle \in S$ , and
-
2.
$($ Monotone Closure
$)$ if
$\langle x,u\rangle \in S$ and
$u \subseteq v \subseteq Q_x$ then
$\langle x,v\rangle \in S$ .
Then there is an
$f\in A\cap {^{a}}{}A$
such that
$f(x)\subseteq Q_x$
and
$\langle x,f(x)\rangle \in S$
for all
$x\in a$
.
Proof As before, consider the multi-valued function
$S'$
with domain a defined by

By Lemma 2.11, there is
$g\in A$
such that
$g \colon a\rightrightarrows A$
and
$g\subseteq S'$
. Let
$g_x=\{y\mid \langle x,y\rangle \in g\}$
, then
$\bigcup g_x\subseteq Q_x$
. Now take
$f(x) = \bigcup g_x$
, then
$f\in A$
since A satisfies Union and second-order Replacement. Moreover, since
$S'$
is monotone closed, we have
$\langle x,f(x)\rangle \in S'$
for all
$x\in a$
.
The following lemma is an analogue of Lemma 6.7. The reader is reminded that the proof of the following lemma necessarily uses the assumption
$\mathcal {P}(1)\cap A=\mathcal {P}(1)\cap B$
.
Lemma 6.26. Let
$a\in A^\Omega $
and
$R\in B$
. Fix
$p\in A$
such that
$p=p^{\lnot \lnot }$
and
$p\subseteq [\mkern -2mu[ R \colon a\rightrightarrows \tilde {A}]\mkern -2mu]^B$
. If we define P as

then there exists
$r\in A$
such that
$r\subseteq P$
and

Proof Again, observe that
$p\subseteq [\mkern -2mu[ R \colon a\rightrightarrows \tilde {A}]\mkern -2mu]^B$
is equivalent to

for all
$x\in \operatorname {dom} a$
. Now let us take

then take
$Q_x=\{z\mid \langle x,z\rangle \in Q\}\subseteq 1$
. We can easily see that
$Q_x=\bigcup _{y\in A^\Omega } [\mkern -2mu[\mathsf {op}(x,y)\in R]\mkern -2mu]^B$
holds. Thus we have
$p\land a(x)\subseteq J^BQ_x$
. Furthermore, it is also true that
$Q_x\in \mathcal {P}(1)\cap B=\mathcal {P}(1)\cap A$
. So we have
$Q_x\in \mathcal {P}(1)\cap B$
, which implies that
$J^B Q_x=Q_x^{\lnot \lnot }$
.
Now consider the relation
$S\subseteq \operatorname {dom} a\times (\mathcal {P}(1)\cap A)$
defined by

We want to apply Lemma 6.25 to S, so we will check the hypotheses of Lemma 6.25 hold.
The first condition holds because
$\langle x,Q_x\rangle \in S$
for each
$x\in \operatorname {dom} a$
. Furthermore, this shows that S is a multi-valued function with domain
$\operatorname {dom} a$
. For the second condition, the relation is monotone closed in A because if
$v \subseteq w$
are in
$\mathcal {P}(1) \cap A$
then
$v^{\lnot \lnot } \subseteq w^{\lnot \lnot }$
.
Therefore, by Lemma 6.25 applied to S, we have a function
$f\in {^{\operatorname {dom} a}}A\cap A$
such that
$p\land a(x)\subseteq f(x)^{\lnot \lnot }$
and
$f(x)\subseteq Q_x$
for all
$x\in \operatorname {dom} a$
. Now let

Then for each
$\langle x,z\rangle \in q$
there is
$y\in A^\Omega $
such that
$\langle x,y,z\rangle \in P$
holds. By Lemma 2.11 applied to
$P'\colon q\rightrightarrows A^\Omega $
, defined by

there is
$r\in A$
such that
$r\subseteq P$
and
$r \colon q\rightrightarrows A^\Omega $
. It is easy to see that r satisfies our desired property.
Remark 6.27. There is a technical note for the proof of Lemma 6.26, which is that there is no need for P, Q, and
$Q_x$
to be definable over A in general. The reason is that we do not know if either R or
$[\mkern -2mu[\cdot ]\mkern -2mu]^B$
are accessible from A. However, we do not need to worry about this since we are relying on the second-order Strong Collection over A.
Furthermore, the proof of Lemma 6.26 also uses an assumption that B is a transitive model of
$\mathsf {CZF}^-$
implicitly. The reason is that we made use of Heyting operations relative to B, which we formulate over
$\mathsf {CZF}^-$
. Thus relativized Heyting operations are viable when B satisfies
$\mathsf {CZF}^-$
.
We are now ready to prove the preservation theorem, Theorem 6.24. Its proof is parallel to that of Proposition 6.10.
Proof of Theorem 6.24
First, observe that since B is a transitive model of
$\mathsf {CZF}^-$
both
$B^\Omega $
and the Heyting operations over B are well defined. Furthermore, it is easy to see that
$B^\Omega $
thinks
$\tilde {A}$
is transitive, and closed under Pairing, Union and Binary Intersection. Hence it remains to show that
$B^\Omega $
thinks
$\tilde {A}$
satisfies second-order Strong Collection, that is,

Take
$a\in \operatorname {dom}\tilde {A}$
,
$R\in B^\Omega $
, and
$p\in B$
such that
$p\subseteq 1$
and
$p=p^{\lnot \lnot }$
.
We claim that if
$p\subseteq [\mkern -2mu[ R \colon a\rightrightarrows \tilde {A}]\mkern -2mu]^B$
, then there is
$b\in \operatorname {dom}\tilde {A}$
such that
. Taking P as in the statement of Lemma 6.26, by Lemma 6.26, we can find some
$r\in A$
such that
$r\subseteq P$
and

Define b such that
$\operatorname {dom} b=\{y\mid \exists x,z(\langle x,y, {z} \rangle \in r)\}$
and

for
$y\in \operatorname {dom} b$
. Note that
$b(y)\in A$
since r is. Then we can show that
by following the computation given in the proof of Proposition 6.10.
As a corollary, we have:
Corollary 6.28. Let K be a power inaccessible set. Then
$V^\Omega $
thinks
$\tilde {K}$
is power inaccessible.
Proof Since
$\mathcal {P}(1)\in K$
, we have
$\mathcal {P}(1)\cap K=\mathcal {P}(1)$
. Thus Theorem 6.24 shows
$V^\Omega $
thinks
$\tilde {K}$
is BCST-regular. It remains to show that
$V^\Omega $
thinks
$\tilde {K}$
is closed under the true power set.
We claim that the argument for the preservation of Power Set given in the proof of Theorem 6.11 relativizes to K. Work in V, and take
$a\in K^\Omega $
. Since K is power inaccessible, the name defined by
$\operatorname {dom} b =$
${^{\operatorname {dom} a}}{} \{p^{\lnot \lnot } \mid p\subseteq 1\}$
,
$b(c)=1$
is a member of K. By the same calculation as in the proof of Theorem 6.11, we have
$[\mkern -2mu[\forall c (c\subseteq a\to c\in b)]\mkern -2mu] = 1$
. Hence the desired result holds.
6.4 Interpreting an elementary embedding
In this subsection, we work over
$\mathsf {CZF + BTEE}_M$
unless otherwise specified.
We will show that elementary embeddings are persistent under Heyting-valued interpretation over
$\Omega $
. We will mostly follow Section 4.1.6 of Ziegler [Reference Ziegler67], but we need to check his proof works in our setting since his applicative topology does not include Heyting algebras generated by formal topologies that are not set-presentable. Furthermore, we are working with a weaker theory than Ziegler assumed. Especially, we do not assume Set Induction,
$\Delta _0$
-Separation, and Strong Collection for
$(j,M)$
-formulas, which calls for additional care.
In most of our results in the rest of the paper, we will consider the case
$M=V$
, so considering the target universe, M, of j will not be needed for our main results. Nevertheless, to begin with we will work in the more general setting and not assume that
$M = V$
.
We need to define Heyting-valued interpretations for M and j in the forcing language. Since j preserves names, we can interpret j as j itself. We will interpret M by
$M^\Omega $
, as defined in Definition 6.16, which was given by following the construction of
$V^\Omega $
inside M. Thus defining
$M^\Omega $
does not require Strong Collection or Set Induction for the language extended by j. The existence of
$M^\Omega $
follows from the assumption that M satisfied
$\mathsf {CZF}^-$
.
One possible obstacle to defining the interpretation for the extended language is a non-absoluteness of
$\Omega $
and the resulting double-negation formal topology between transitive models of
$\mathsf {CZF}^-$
. We discussed in Section 6.2 that the Heyting interpretation
$[\mkern -2mu[\phi ]\mkern -2mu]$
need not be absolute between transitive sets.
It would be convenient if we have
$[\mkern -2mu[\phi ]\mkern -2mu]=[\mkern -2mu[\phi ]\mkern -2mu]^M$
, which follows from
$\mathcal {P}(1)=\mathcal {P}(1)\cap M$
, which thankfully we have.
Lemma 6.29. For any formula
$\phi $
with parameters in
$M^\Omega $
, we have
$[\mkern -2mu[ \phi ]\mkern -2mu]^M = [\mkern -2mu[ \phi ^{M^\Omega }]\mkern -2mu]$
.
Proof By Lemma 4.9,
$\mathcal {P}(1)=\mathcal {P}(1)\cap M$
. Hence the conclusion follows from the last clause of Lemma 6.20.
Thus we do not need to worry about the absoluteness issue on the Heyting interpretation. Now we are ready to extend our forcing language to
$\{\in , j, M\}$
.
Definition 6.30. Define
$[\mkern -2mu[\phi ]\mkern -2mu]$
for the extended language as follows:
-
•
$[\mkern -2mu[ j^m(a)\in j^n(b)]\mkern -2mu]$ and
$[\mkern -2mu[ j^m(a)=j^n(b)]\mkern -2mu]$ are defined in the same way as
$[\mkern -2mu[ x\in y]\mkern -2mu]$ and
$[\mkern -2mu[ x=y]\mkern -2mu]$ were,
-
•
$[\mkern -2mu[ a\in M ]\mkern -2mu] := \bigvee _{x\in M^\Omega }[\mkern -2mu[ a=x]\mkern -2mu]$ ,
-
•
$[\mkern -2mu[\forall x\in M \phi (x)]\mkern -2mu] := \bigwedge _{x\in M^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu]$ , and
-
•
$[\mkern -2mu[\exists x\in M \phi (x)]\mkern -2mu] := \bigvee _{x\in M^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu]$ .
Remark 6.31. The reader should be careful that
$[\mkern -2mu[\phi (\vec {x})]\mkern -2mu]$
is not in general a set. Provably over
$\mathsf {CZF + BTEE}_M$
,
$[\mkern -2mu[ j^m(a)\in j^n(b)]\mkern -2mu]$
and
$[\mkern -2mu[ j^m(a)=j^n(b)]\mkern -2mu]$
are always sets. In general, we can see that if
$\phi (\vec {x})$
is a Heyting combination of atomic formulas, then
$[\mkern -2mu[ \phi (\vec {x})]\mkern -2mu]$
is a set regardless of what the background theory is.
However, taking a bounded quantification could make
$[\mkern -2mu[\phi ]\mkern -2mu]$
a class that is not provably a set. For example, consider taking bounded universal quantification over
$[\mkern -2mu[ \phi (x.y)]\mkern -2mu]$
for a Heyting combination of atomic
$(j,M)$
-formulas
$\phi $
. Then
$[\mkern -2mu[\forall x\in a \phi (x,y) ]\mkern -2mu] = \bigwedge _{x\in \operatorname {dom} a} a(x)\land [\mkern -2mu[ \phi (x,y)]\mkern -2mu]$
. Each of
$[\mkern -2mu[ \phi (x,y)]\mkern -2mu]$
can be a set, but the family
$\{[\mkern -2mu[ \phi (x,y)]\mkern -2mu] \mid x\in \operatorname {dom} a\}$
need not be a set unless we have Strong Collection and
$\Delta _0$
-Separation for
$(j,M)$
-formulas.
The situation is even worse for bounded existential quantifiers and disjunctions: we took a nucleus
$\jmath $
when defining the interpretation of these, so
$[\mkern -2mu[\exists x\in a\phi (x)]\mkern -2mu] = \bigvee _{x\in \operatorname {dom} a}[\mkern -2mu[ \phi (x)]\mkern -2mu] =\jmath \left ( \bigcup _{x\in \operatorname {dom} a}[\mkern -2mu[ \phi (x)]\mkern -2mu]\right )$
. This is ill-defined when the union
$\bigcup _{x\in \operatorname {dom} a}[\mkern -2mu[ \phi (x)]\mkern -2mu]$
is a class rather than a set. Hence we have to use the join operator for classes instead of sets. The trade-off for the new definition of
$[\mkern -2mu[\phi ]\mkern -2mu]$
, in this case, is that we do not know if
$[\mkern -2mu[\phi \lor \lnot \phi ]\mkern -2mu]=1$
for a j-formula
$\phi $
, unless we can ensure
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set.
From this definition, we have an analogue of Lemma 4.26 of [Reference Ziegler67], which is useful to check that j is still elementary over
$V^\Omega $
:
Lemma 6.32. For any bounded formula
$\phi (\vec {x})$
with all free variables displayed in the language
$\in $
(that is, without j and M), we have

for every
$\vec {a}\in V^\Omega $
.
Proof For the first equality, note that
$[\mkern -2mu[\phi (\vec {a})]\mkern -2mu]\subseteq 1$
. Hence we have

by Lemma 4.9.
The second equality will follow from the claim that for any bounded formula
$\phi $
and
$\vec {b} \in M^\Omega $
,
$[\mkern -2mu[ \phi (\vec {b})]\mkern -2mu] = [\mkern -2mu[\phi ^{M^\Omega }(\vec {b})]\mkern -2mu]$
. The proof proceeds by induction on
$\phi $
: the atomic case and cases for
$\land $
,
$\lor $
, and
$\to $
are trivial. For bounded
$\forall $
, observe that if
$c,\vec {b}\in M^\Omega $
then
$[\mkern -2mu[ c = c \cap M^\Omega ]\mkern -2mu] = 1$
, so
$[\mkern -2mu[ \forall x\in c \ \phi (x,\vec {b}) \leftrightarrow (\forall x\in c \ \phi (x,\vec {b}) )^{M^\Omega }]\mkern -2mu]=1$
. This proves
$[\mkern -2mu[ \forall x\in c \ \phi (x,\vec {b})]\mkern -2mu] = [\mkern -2mu[ \forall x\in c \ \phi ^{M^\Omega }(x,\vec {b})]\mkern -2mu]$
. The case for bounded existential quantifiers is similar.
Moreover, we can check the following equalities easily:
Proposition 6.33.
-
1.
$[\mkern -2mu[ \forall x,y (x=y\to j(x)=j(y))]\mkern -2mu]=1$ .
-
2.
$[\mkern -2mu[ \forall x (j(x)\in M)]\mkern -2mu]=1$ .
-
3.
$[\mkern -2mu[ \forall x (x\in M\to \forall y\in x(y\in M)) ]\mkern -2mu]=1$ .
Proof The first equality follows from
$[\mkern -2mu[ x=y]\mkern -2mu] = [\mkern -2mu[ j(x)=j(y)]\mkern -2mu]$
, which holds by the previous lemma, and the remaining two follow from direct calculations.
Lemma 6.34. For every
$\vec {a}\in V^\Omega $
and formula
$\phi $
that does not contain j or M, we have
$[\mkern -2mu[ \phi (\vec {a})\leftrightarrow \phi ^{M^\Omega }(j(\vec {a})) ]\mkern -2mu]=1$
.
Proof Lemma 6.32 proved that this lemma holds for bounded formulas
$\phi $
. We will use full induction on
$\phi $
to prove
$[\mkern -2mu[\phi (\vec {a})]\mkern -2mu]=[\mkern -2mu[\phi ^M(j(\vec {a}))]\mkern -2mu]$
for all
$\vec {a}\in V^\Omega $
. If
$\phi $
is
$\forall x\psi (x,\vec {a})$
, we have
$[\mkern -2mu[ \forall x\psi (x,\vec {a})]\mkern -2mu]=\bigwedge _{x\in V^\Omega }[\mkern -2mu[\psi (x,\vec {a})]\mkern -2mu]$
. Now

where the last equivalence follows from applying j to the above formula. Since the last formula is equivalent to
$0\in \bigwedge _{x\in M^\Omega } [\mkern -2mu[\psi (x, j(\vec {a}))]\mkern -2mu]$
, we have

If
$\phi $
is
$\exists x\psi (x,\vec {a})$
, we have
$[\mkern -2mu[ \exists x\psi (x,\vec {a})]\mkern -2mu]=\bigvee _{x\in V^\Omega }[\mkern -2mu[\psi (x,\vec {a})]\mkern -2mu]$
. Moreover,

Hence
$0\in \bigvee _{x\in V^\Omega }[\mkern -2mu[ \psi (x,\vec {a})]\mkern -2mu]$
if and only if
$0\in \bigvee _{x\in M^\Omega }[\mkern -2mu[ \psi (x,j(\vec {a}))]\mkern -2mu]$
We now work in the extended language
$\mathsf {CZF}_{j,M}$
. We need to check that
$\Delta _0$
-Separation and Strong Collection under the extended language are also persistent under the double-negation interpretation. We can see that the proof given by [Reference Gambino20] and Theorem 6.11 carries over, so we have the following claim:
Proposition 6.35. If V satisfies any of Set Induction, Strong Collection,
$\Delta _0$
-Separation, or Full Separation for the extended language, then the corresponding axiom for the extended language is valid in
$V^\Omega $
.
The essential property of a critical set is that it is inaccessible. However, inaccessibility is not preserved under Heyting-valued interpretations in general. Fortunately, being a critical point is preserved provided it is regular:
Lemma 6.36. Let K be a regular set such that
$K\in j(K)$
and
$j(x)=x$
for all
$x\in K$
. Then
$[\mkern -2mu[\tilde {K}\in j(\tilde {K})\land \forall x\in \tilde {K}(j(x)=x)]\mkern -2mu]=1$
.
Proof Since
$j(K) (\text {is inaccessible})^M$
, we have
$j(K)\models \mathsf {CZF}^-$
. Thus, by Lemma 6.17,
$j(K)^\Omega = j(K)\cap V^\Omega $
. By applying the same argument internal to M, we have

Since
$j(K)\cap V^\Omega \subseteq j(K)\subseteq M$
, we have
$j(K)\cap V^\Omega \subseteq M$
. This implies
$j(K)\cap V^\Omega = j(K)\cap V^\Omega \cap M = j(K)\cap M^\Omega $
. In sum, we have

and these are sets by Lemma 6.17.
Also,
$K\in j(K)$
implies
$\tilde {K}\in j(K)$
. Since the domain of
$j(\tilde {K})=\widetilde {j(K)}$
is
$j(K)\cap V^\Omega $
, we have
$\tilde {K}\in \operatorname {dom} j(\tilde {K})$
, which implies
$[\mkern -2mu[\tilde {K}\in j(\tilde {K})]\mkern -2mu]=1$
. For the assertion
$[\mkern -2mu[\forall x\in \tilde {K} (j(x)=x)]\mkern -2mu]=1$
, observe that if
$x\in \operatorname {dom}\tilde {K}$
then
$j(x)=x$
, so we have the desired conclusion.
6.5 Consistency strength: intermediate results
By using the results from the previous sections and subsections, we have the following:
Theorem 6.37.
-
1.
$\mathsf {IZF}+\mathsf {BTEE}$ interprets
$\mathsf {ZF}+\mathsf {BTEE}$ .
-
2.
$\mathsf {IZF}+\mathsf {BTEE}+\text {Set Induction}_j$ interprets
$\mathsf {ZF}+\mathsf {BTEE}+\text {Set Induction}_{j}$ .
-
3.
$\mathsf {IZF}+\mathsf {WA}$ interprets
$\mathsf {ZF}+\mathsf {WA}$ .
Hence:
Corollary 6.38.
-
1.
$(\mathsf {IKP}_{j,M})\ \mathsf {IKP}$ with a critical point implies
$\mathsf {Con}(\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j})$ .
-
2.
$\mathsf {CZF}$ with a Reinhardt set implies
$\mathsf {Con(ZF+WA)}$ .
Proof Using Theorem 5.21, if K is a critical point of an embedding
$j \colon V \rightarrow M$
then we can find some ordinal
$\lambda $
for which
$\langle L_\lambda , j \operatorname {\mathrm {\upharpoonright }} L_\lambda \rangle $
is a model of
$\mathsf {IZF + BTEE}+\text {Set Induction}_j$
. Therefore the first claim follows from the first claim of Theorem 6.37. The second claim follows directly from Theorem 5.22.
By mimicking Bagaria–Koellner–Woodin’s argument in [Reference Bagaria, Koellner and Woodin10] over
$V^\Omega $
, we have the following consistency result:
Theorem 6.39 (
$\mathsf {CGB}_\infty $
).
Let K be a super Reinhardt set. Then
$V^\Omega $
satisfies
$\mathsf {ZF}$
plus there is a proper class of inaccessible cardinals
$\gamma $
such that
$(V_\gamma ,V_{\gamma +1})\models \mathsf {ZF_2} + \text {there is a Reinhardt cardinal}$
.
Proof The proof will proceed as follows: First, we will show that the background theory interprets some moderate semi-intuitionistic theory. Then we will derive that this semi-intuitionistic theory proves that there is a proper class of inaccessible cardinals
$\gamma $
such that
$V_\gamma $
is a model of
$\mathsf {ZF}$
with a Reinhardt cardinal.
So, let K be a super Reinhardt set. For any
$a\in V^\Omega $
, we can find an amenable elementary embedding
$j\colon V\to V$
with critical set K such that
$a\in j(K)$
. As per usual, let
$\Lambda = j^{\omega }(K)$
. We shall restrict our background theory to its first-order part to facilitate the proof.
By Proposition 5.25 and Corollary 5.26, K is power inaccessible and V satisfies
$\mathsf {IZF+pIEA}$
. Thus by Corollary 6.15 and Corollary 6.28,
$V^\Omega $
interprets the following statements:
-
• Axioms of
$\mathsf {ZF}$ , and
-
• There is a proper class of inaccessible cardinals.
Especially,
$V^\Omega $
interprets the law of excluded middle for formulas without j. However, we do not know if
$V^\Omega $
satisfies
$\mathsf {ZF}_j$
because there is no reason why V should satisfy the Separation scheme for j-formulas. As a result, the double-negation translation does not force the law of excluded middle for j-formulas. Despite this,
$V^\Omega $
still believes the following statements are valid due to Lemma 6.34 and Proposition 6.35:
-
• j is amenable and elementary, and
-
• Collection and Set Induction for j-formulas.
Amenability of j needs some justification. Working over V, j is amenable by Lemma 2.32. As we observed in Remark 4.13, if j is amenable then Separation for
$\Delta _0^j$
-formulas holds. By Proposition 6.35,
$\Delta _0$
-Separation for j-formulas is valid in
$V^\Omega $
from which it follows that
$V^\Omega $
thinks j is amenable.
Now work in
$V^\Omega $
. Since
$V^\Omega $
validates
$\mathsf {ZF}$
with a proper class of inaccessible cardinals, we can find the least inaccessible cardinal
$\gamma $
such that
$\gamma> j^\omega (\kappa )$
, where
$\kappa =\operatorname {rank} \tilde {K}$
. Here
$j^\omega (\kappa )$
is well-defined since the sequence
$\langle j^n(\kappa )\mid n\in \omega \rangle $
required Set Induction for j-formulas for its definition and Collection for its supremum to exist. Furthermore, one can see that
$V^\Omega $
thinks
$\tilde {K}$
is a critical point of j and
$\tilde {K}=V_\kappa $
. (The latter equality follows from the fact that
$\tilde {K}$
is power inaccessible.) From which it follows that
$\kappa =\operatorname {crit} j$
.
Since
$\gamma $
is definable from the parameter
$j^{\omega }(\kappa )$
, which is fixed by j, by elementarity we have
$\gamma =j(\gamma )$
. Moreover,
$V^\Omega $
also believes
$j\operatorname {\mathrm {\upharpoonright }} V_\gamma \in V_{\gamma +1}$
and
$\operatorname {crit} (j\operatorname {\mathrm {\upharpoonright }} V_\gamma )=\kappa $
. (Amenability of j ensures
$j\operatorname {\mathrm {\upharpoonright }} V_\gamma $
exists.) Hence
$V^\Omega $
thinks
$(V_\gamma ,V_{\gamma +1})$
is a model of
$\mathsf {ZF_2}$
with a Reinhardt cardinal.
Finally, recall that
$a\in j(K)$
and a were arbitrary. Hence we have proved that for every
$a\in V^\Omega $
,
$V^\Omega $
thinks there is an inaccessible cardinal
$\gamma $
such that
$(V_\gamma ,V_{\gamma +1})$
believes that there is a Reinhardt cardinal
$\kappa $
for which
$a\in V_{j(\kappa )}$
. Hence
$V^\Omega $
thinks there is a proper class of such
$\gamma $
.
Remark 6.40. Most results in this section can be obtained by using Friedman’s double-negation translation as defined in [Reference Friedman15, Reference Friedman and Ščedrov16]. In some cases, applying Friedman’s argument would be simpler: for example, verifying the axioms of
$\mathsf {IZF}$
under Friedman’s double-negation translation does not involve any lengthy proof, unlike the verification of Strong Collection over Gambino’s Heyting universe
$V^{\mathcal {S}}$
. However, Friedman’s proof of the validity of Collection under this translation heavily relies on Full Separation, an axiom scheme that
$\mathsf {CZF}$
does not enjoy.
Thus we may ask what the advantages are to using Gambino’s Heyting-valued interpretation over Friedman’s double-negation interpretation. The main reason is that Gambino’s presentation is closer to forcing, which is a technique more familiar to set theorists. Secondly, as an intermediate step, Friedman first interprets a non-extensional set theory. This can be very notationally heavy and difficult to follow when one first tries to understand the arguments. Finally, most of this work has been done over the weak system of
$\mathsf {CZF}$
and it is unknown how to achieve Friedman’s interpretation in this system.
On the other hand, there may be no advantage to our main results concerning critical sets and Reinhardt sets. This is because in Theorems 5.21 and 5.22 we obtained a lower bound for their consistency strength in terms of
$\mathsf {IZF}$
plus some large set axioms. One could then take Freidman’s double-negation translation of this theory to obtain the consistency of
$\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j}$
and
$\mathsf {ZF + WA}$
for critical sets and Reinhardt sets respectively. However, using Gambino’s method we can strengthen this result to preserve our background theory while containing a set model of this classical theory. Namely, we will see that
$V^\Omega $
preserves the theory
$\mathsf {CZF}^-_j$
and contains, as a set,
$\tilde {\Lambda }$
which is a model of the above theory. Since Friedman’s interpretation does not work over weak theories, it does not seem to be possible to obtain a similar result with that translation.
Theorem 6.41. Working with the theory
$\mathsf {CZF}_{j,M}$
with a critical set,
$V^\Omega $
validates
$\mathsf {CZF}^-_j + \Delta _0\text {-} \mathsf {LEM}$
with a critical point
$\tilde {K}$
of j. Furthermore,
$\tilde {\Lambda } = j^\Omega (\tilde {K})$
satisfies
$\mathsf {ZF} + \mathsf {BTEE}$
plus Set Induction for j-formulas. If we strengthen a critical set to a Reinhardt set, then
$V^\Omega $
validates
$\tilde {\Lambda }\models \mathsf {ZF}+\mathsf {WA}$
.
If we add Full Separation into the background theory, then
$V^\Omega $
satisfies not only
$\mathsf {CZF}^-_j + \Delta _0\text {-} \mathsf {LEM}$
, but also
$\mathsf {ZF}^-$
.
Proof The results in the previous subsections show
$V^\Omega $
validates
$\mathsf {CZF}^-_j + \Delta _0\text {-} \mathsf {LEM}$
, and that j is an elementary embedding from
$V^\Omega $
to
$M^\Omega $
with a critical point
$\tilde {K}$
. Since K is BCST-regular,
$K\in j(K)$
, and
$\mathcal {P}(1)\cap K = \mathcal {P}(1)\cap j(K)$
, we can apply Theorem 6.24 to K and
$j(K)$
. Hence we have that

which implies that
$[\mkern -2mu[ \tilde {K}\models \mathsf {CZF}^-]\mkern -2mu]^{j(K)^\Omega }=1$
. Observe that the claim
$\tilde {K}\models \mathsf {CZF}^-$
is
$\Delta _0$
, so applying the second clause of Lemma 6.20 proves
$[\mkern -2mu[\tilde {K}\models \mathsf {CZF}^-]\mkern -2mu]=1$
.
Now working in
$V^\Omega $
, the excluded middle for bounded formulas gives us that every transitive set satisfies the full excluded middle. Especially, both
$\tilde {K}$
and
$\tilde {\Lambda }$
satisfy the full excluded middle. In addition,
$\tilde {\Lambda }$
satisfies
$\mathsf {IZF}+\mathsf {BTEE}$
plus Set Induction for j-formulas by Corollary 5.14. Thus we have the desired result.
The case for a Reinhardt set is analogous to the previous one, except that we apply Theorem 5.22 to show that
$\tilde {\Lambda }$
validates
$\mathsf {IZF}+\mathsf {WA}$
.
Finally, if we have Full Separation, then
$V^\Omega $
also satisfies Full Separation which completes the proof since the combination of Full Separation and
$\Delta _0\text {-} \mathsf {LEM}$
implies the full excluded middle.
We will end this section by observing that the translation preserves the cofinality of an elementary embedding over a moderate extension of
$\mathsf {CZF}$
. None of the previous analysis has required either Full Separation or Subset Collection in the background universe. On the other hand, the following proof, which we include for completeness, requires either Full Separation or
$\mathsf {REA}$
.
Lemma 6.42 (
$\mathsf {CZF}_{j}$
).
Assume either Full Separation or
$\mathsf {REA}$
. If
$j \colon V\prec V$
is a cofinal elementary embedding, then
$V^\Omega $
thinks j is cofinal.
Proof Let
$a\in V^\Omega $
. Then there is a set X such that
$a\in j(X)$
. If we assume Full Separation, then
$X\cap V^\Omega $
is a set, and
$j(X\cap V^\Omega )=j(X)\cap V^\Omega $
. Let b be a name such that
$\operatorname {dom} b= X\cap V^\Omega $
and
$b(y)=1$
for all
$y\in \operatorname {dom} b$
. Then
$[\mkern -2mu[ a\in j(b) ]\mkern -2mu]=\top $
.
An additional step is required if we assume
$\mathsf {REA}$
instead: Take a set X such that
$a\in j(X)$
. By
$\mathsf {REA}$
, we can find a regular set Y such that
$X\in Y$
. By Lemma 6.17,
$Y^\Omega =Y\cap V^\Omega $
is a set. The remaining argument is then identical to the previous one.
While Lemma 6.42 does not directly suggest anything about consistency strength, when combined with Theorem 6.41 it does tell us that
$\mathsf {CZF+Sep}$
with a Reinhardt set interprets
$\mathsf {ZF}^-$
with the existence of a non-trivial cofinal embedding
$j\colon V\to V$
.
As we pointed out after Proposition 4.8, over
$\mathsf {ZFC}^-$
with the
$\mathsf {DC}_\mu $
-schemes for all cardinals
$\mu $
there cannot be a non-trivial cofinal embedding
${j\colon V\to V}$
. The
$\mathsf {DC}_\mu $
-schemes are a variant of the Axiom of Choice and adding these schemes to
$\mathsf {ZF}^-$
does not bolster its consistency strength. In fact,
$\mathsf {ZF}^-$
proves L satisfies the
$\mathsf {DC}_\mu $
-scheme for every cardinal
$\mu $
in L. Thus the absence of a cofinal embedding over
$\mathsf {ZFC}^-$
with the
$\mathsf {DC}_\mu $
-schemes for any cardinal
$\mu $
can be seen as a variant of the Kunen inconsistency phenomenon.
However, it is unclear how we can use the above results to obtain a stronger bound for the consistency strength of
$\mathsf {CZF}$
with a Reinhardt set. It is known that if one extends
$\mathsf {CZF}$
by the Relation Reflection Scheme (
$\mathsf {RRS}$
), as defined by Aczel [Reference Aczel4], then this is also persistent under Gambino’s Heyting-values interpretation and
$\mathsf {ZFC}^-$
proves that
$\mathsf {RRS}$
is equivalent to the
$\mathsf {DC}$
-Scheme. It is further possible that one might be able to generalize such a scheme to
$\mathsf {DC}_\mu $
for larger cardinals; however, this does not appear to be sufficient to derive an inconsistency due to the heavy use of Well-Ordering in the proof of inconsistency in [Reference Matthews39].
Alternatively, it is proven in [Reference Matthews39] that we can remove the assumption of the Dependent Choice Schemes if we instead require that
$V_{\operatorname {crit} j} \in V$
(in fact, the main reason one assumes the
$\mathsf {DC}_\mu $
-scheme for every cardinal
$\mu $
is to prove this). The issue is that we do not know whether
$V^\Omega $
believes that
$V_{\operatorname {crit} j} \in V$
, even if we assume V satisfies Full Separation or
$\mathsf {REA}$
. If we were to assume that there was a Reinhardt set which was Power Inaccessible then this would be the case’ however, we can only obtain that
$\Lambda $
believes that K is Power Inaccessible which is insufficient to derive the required result.
7 Double-negation translation of second-order set theories
In this section, we will provide a double-negation translation of
$\mathsf {IGB}$
and
$\mathsf {TR}$
. One technical issue is that the statement of
$\mathsf {TR}$
requires an infinite conjunction. However, we know that the full elementarity of
$j \colon V\to V$
is definable in a classical context, namely
$\Sigma _1$
-elementarity, and that this can be codified into a single formula by using the partial truth predicate for
$\Sigma _1$
-formulas. Thus we will not try to interpret infinite connectives.
7.1 Interpreting
$\mathsf {GB}$
from
$\mathsf {IGB}$
Definition 7.1. A class A is an
$\Omega $
-class name, or simply a class name if:
-
• the elements of A are of the form
$\langle a, p\rangle $ , where
$a\in V^\Omega $ ,
$p\subseteq 1$ and
$p^{\lnot \lnot } = p$ ,
-
• it is functional in the sense that
$\langle a,p\rangle , \langle a,q\rangle \in A$ implies
$p=q$ .
If A is a class name, define
$\operatorname {dom} A := \{a \mid \exists p \ \langle a,p\rangle \in A\}$
and
$A(a)=p$
for the unique p such that
$\langle a,p\rangle \in A$
.
Then we can extend
$[\mkern -2mu[\phi ]\mkern -2mu]$
to atomic second-order formulas as follows:
Definition 7.2. Let
$a\in V^\Omega $
and A, B be class names. Define
$[\mkern -2mu[ a\in A]\mkern -2mu] := \bigvee _{x\in \operatorname {dom} A}A(x)\land [\mkern -2mu[ x=a]\mkern -2mu]$
and

Based on the above definition, we can extend our interpretation to any formula
$\phi $
with no second-order quantifiers.
Now we want to extend the double-negation interpretation to formulas which have second-order quantifiers, but this calls into question how this should be formulated in our second-order context. To illustrate the problem, observe that we take a class meet and join to interpret unbounded first-order quantifiers, and the resulting
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a class rather than a set. Thus we may suspect that interpreting second-order quantifiers results in
$[\mkern -2mu[\phi ]\mkern -2mu]$
being a hyperclass, a collection of classes, which is clearly not an object of
$\mathsf {CGB}$
or
$\mathsf {IGB}$
.
This situation is analogous to the one encountered in Remark 6.31, where
$[\mkern -2mu[\phi ]\mkern -2mu]$
for a j-formula
$\phi $
may not be a set. This was because the definitions of
$\lor $
and
$\exists $
for j-formulas used J instead of
$\jmath $
and we only know that
$J([\mkern -2mu[\phi ]\mkern -2mu]\cup \lnot [\mkern -2mu[\phi ]\mkern -2mu])=1$
is true when
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set.
We may suspect the same issue happens for second-order formulas: we may need to extend
$\jmath $
to some function
$\mathcal {J}$
which is defined for hyperclasses, as J serves as an extension of
$\jmath $
, to define
$\lor $
and
$\exists $
for second-order formulas. However, even if we have a proper definition of
$\mathcal {J}$
, there is no reason to believe that
$\mathcal {J}([\mkern -2mu[\phi ]\mkern -2mu]\cup \lnot [\mkern -2mu[\phi ]\mkern -2mu])=1$
should hold for a second-order formula
$\phi $
.
The situation would be better if we can ensure
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set. We know that if we additionally assume Full Separation then, for
$\phi $
a first-order formula,
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set. Analogously, we may suspect that we can prove
$[\mkern -2mu[\phi ]\mkern -2mu]$
is a set when
$\phi $
is a second-order formula, if we have Full Separation and the Full Class Comprehension scheme, which is the assertion that
$\{x\mid \phi (x)\}$
is a class for any class formula
$\phi $
. However, adding this axiom to
$\mathsf {GBC}$
results in Kelley–Morse theory,
$\mathsf {KM}$
, which would considerably increase the strength of the underlying theory we wish to work with.
Because we do not have a good framework for working with hyperclasses we will circumvent this issue in another way, by combining Gambino’s interpretation with Friedman’s double-negation translation. We will take Gambino’s interpretations for formulas with no second-order quantifiers, and extend it using the same translation as in Friedman’s double-negation interpretation.
Definition 7.3. For a second-order formula
$\phi $
with set parameters from
$V^\Omega $
and
$\Omega $
-class name parameters, define
$\phi ^-$
inductively as follows:
-
• If
$\phi $ is an atomic formula, then
$\phi ^- \equiv ([\mkern -2mu[\phi ]\mkern -2mu] = 1)$ .
-
•
$(\phi \land \psi )^- \equiv \phi ^-\land \psi ^-$ .
-
•
$(\phi \lor \psi )^- \equiv \lnot \lnot (\phi ^-\lor \psi ^-)$ .
-
•
$(\phi \to \psi )^- \equiv \phi ^-\to \psi ^-$ .
-
•
$(\forall ^0 x \phi (x))^- \equiv \forall ^0 x\in V^\Omega \ \phi ^-(x)$ .
-
•
$(\exists ^0 x \phi (x))^- \equiv \lnot \lnot \exists ^0 x\in V^\Omega \ \phi ^-(x)$ .
-
•
$(\forall ^1 X \phi (X))^- \equiv \forall ^1 X ((X \text {is a class name})\to \phi ^-(X))$ .
-
•
$(\exists ^1 X \phi (X))^- \equiv \lnot \lnot \exists ^1 X ((X \text {is a class name})\land \phi ^-(X))$ .
Lemma 7.4 (
$\mathsf {IGB}$
).
If
$\phi $
is a formula with no second-order quantifiers, then
$\phi ^-\equiv ([\mkern -2mu[\phi ]\mkern -2mu]=1)$
.
Proof The proof proceeds by induction on
$\phi $
. Atomic cases follows by definition. For conjunction,

and we can see that
$[\mkern -2mu[\phi ]\mkern -2mu]\cap [\mkern -2mu[\psi ]\mkern -2mu] = [\mkern -2mu[\phi \cap \psi ]\mkern -2mu]$
by the definition of our translation. The cases for bounded and unbounded
$\forall $
and implications are analogous.
We need some care for the case for disjunction and
$\exists $
and in particular we will need to make use of both Full Separation and Powerset. We examine the case for unbounded
$\exists $
, to see why we need these two axioms. We know that
$[\mkern -2mu[\exists x \phi (x)]\mkern -2mu] = \bigvee _{x\in V^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu]$
, and this is
$J\left (\bigcup _{x\in V^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu]\right )$
. By Full Separation,
$\bigcup _{x\in V^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu] = \{0\mid \exists x\in V^\Omega [\mkern -2mu[ \phi (x)]\mkern -2mu]=1\}$
is a set. Furthermore, Powerset proves
$Jp=p^{\lnot \lnot }$
for all
$p\subseteq 1$
. Hence
$[\mkern -2mu[\exists x\phi (x)]\mkern -2mu] = \left (\bigcup _{x\in V^\Omega } [\mkern -2mu[ \phi (x)]\mkern -2mu]\right )^{\lnot \lnot }$
, so

The cases for bounded existential quantifiers and disjunctions are analogous, so we omit them.
Lemma 7.5. Let A and B be class names and
$\phi (X)$
be a formula of second-order set theory. Then

Proof The proof proceeds by induction on
$\phi $
, and one can see that the only non-trivial part of this is the atomic case where
$\phi (X)$
is
$x\in X$
.
To do this, we show that
$[\mkern -2mu[ A=B]\mkern -2mu] \land [\mkern -2mu[ a\in A]\mkern -2mu] \le [\mkern -2mu[ a\in B]\mkern -2mu]$
:

Therefore, if both
$[\mkern -2mu[ A=B]\mkern -2mu]=1$
and
$[\mkern -2mu[ a\in A]\mkern -2mu]=1$
then
$[\mkern -2mu[ a \in B ]\mkern -2mu] = 1$
.
Theorem 7.6 (
$\mathsf {IGB}$
).
Working over
$\mathsf {IGB}$
, every axiom of
$\mathsf {GB}$
is valid in
$V^\Omega $
.
Proof By Theorem 6.11 and Lemma 7.4, we can see that the first-order part of
$\mathsf {IGB}$
is valid. Moreover,
$(\phi \lor \lnot \phi )^-$
is valid since it is equivalent to
$\lnot \lnot (\phi ^-\lor \lnot \phi ^-)$
, which is constructively valid.
It remains to show that the second-order part of
$\mathsf {IGB}$
is valid under the interpretation.
-
• Class Extensionality: follows from Lemma 7.5.
-
• Elementary Comprehension: Let
$a\in V^\Omega $ , A be a class name, and
$\phi $ be a formula without class quantifiers. Then
$[\mkern -2mu[\phi (x,a,A)]\mkern -2mu]$ is well-defined for
$x\in V^\Omega $ . Now consider B to be the class name
$$ \begin{align*} B = \{\langle x,[\mkern-2mu[\phi(x,a,A)]\mkern-2mu]\rangle \mid x\in V^\Omega\}. \end{align*} $$
Furthermore, we can easily see that
$[\mkern -2mu[ x\in B\leftrightarrow \phi (x,a,A)]\mkern -2mu]=1$ . Hence B witnesses Elementary Comprehension for
$\phi (x,a,A)$ .
-
• Class Set Induction: the usual argument for first-order Set Induction carries over.
-
• Class Strong Collection: we can see that the proof of Proposition 6.10 works if we replace
$R(x,y)$ with
$\langle x,y\rangle \in R$ .
We do not need to check that the double-negation translation interpretation also validates Class Separation since
$\mathsf {GB}$
without Class Separation proves Class Separation: it follows from Class Replacement, and the proof is similar to the derivation of Separation from Replacement over classical set theory.
7.2 Interpreting
$\mathsf {TR}$
This subsection is devoted to the following result:
Theorem 7.7. If
$\mathsf {GB+TR}\vdash \phi $
, then
$\mathsf {IGB_\infty +TR}\vdash \phi ^-$
.
As mentioned before, we will dismiss infinite connectives from the interpretation. The main reason is that in
$\mathsf {GB}$
every
$\Sigma ^A_1$
-elementary embedding is fully A-elementary embedding by essentially Lemma 4.14(2).
Proof of Theorem 7.7
It suffices to show that
$\mathsf {IGB_{\infty }+TR}$
proves
$(\mathsf {TR})^-$
. Let A be a class name and
$a\in V^\Omega $
. We claim that there is a class name
$\tilde {\jmath }$
such that

By
$\mathsf {TR}$
, we can find some elementary embedding
$j \colon V\to V$
and an inaccessible set K such that j is A-elementary,
$a\in j(K)$
and K is a critical point of j. Define
$\tilde {\jmath } = \{\langle \mathsf {op}(a,j(a)),1\rangle \mid a\in V^\Omega \}$
. It is easy to see that
$[\mkern -2mu[\tilde {\jmath }\text { is a function}]\mkern -2mu]=1$
, and, by Lemma 6.36,
$[\mkern -2mu[ \tilde {K}\text { is a critical point of }\tilde {\jmath }]\mkern -2mu]=1$
. Furthermore,
$[\mkern -2mu[ a\in \tilde {\jmath }(\tilde {K})]\mkern -2mu]=1$
is equivalent to
$[\mkern -2mu[ a\in j(\tilde {K})]\mkern -2mu]=1$
, the latter of which follows from
$a\in j(K)$
.
It remains to show that
$V^\Omega $
thinks
$\tilde {\jmath }$
is an A-elementary embedding. For this, it suffices to show that
$\tilde {\jmath }$
is
$\Sigma ^A$
-elementary by Lemma 4.14. Let
$\psi (x)$
be a
$\Sigma ^A$
-formula. Then

is equivalent to

Finally, since
$[\mkern -2mu[ \psi (\vec {x}) ]\mkern -2mu] = 1$
is expressible in V using a
$\Sigma ^A$
-formula (without any other class parameters), the above formula is immediate from the A-elementarity of j.
Combining the above analysis with Theorem 5.32, the following corollary is immediate.
Corollary 7.8.
$\mathsf {CGB_\infty +TR}$
interprets
$\mathsf {GB+TR}$
.
Remark 7.9. Some readers may wonder about whether we need full elementarity in the formulation of
$\mathsf {TR}$
, because the proof of Theorem 7.7 would work if j preserves formulas of some bounded complexity, probably
$\Sigma ^A$
-formulas. It is correct that what the proof of Theorem 7.7 actually shows is
$\mathsf {IGB_{\infty }}$
with a statement weaker than
$\mathsf {TR}$
interprets
$\mathsf {GB+TR}$
. However, the full power of
$\mathsf {TR}$
was necessary in Theorem 5.32 to derive
$\mathsf {IGB_{\infty }+TR}$
from
$\mathsf {CGB_{\infty }+TR}$
.
8 Consistency strength: final results
We have proven in Corollary 6.38 that
$\mathsf {IKP}$
with a critical point implies the consistency of
$\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j}$
while
$\mathsf {CZF}$
with a Reinhardt set implies the consistency of
$\mathsf {ZF + WA}$
. However, one may ask how strong these notions are with respect to the traditional large cardinal hierarchy over
$\mathsf {ZFC}$
, which is a question we address here.
Let us examine the theory
$\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j}$
first. We can see that
$\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j}$
proves L satisfies the same theory. Hence we have the consistency of
$\mathsf {ZFC} + \mathsf {BTEE} + \text {Set Induction}_{j}$
from that of
$\mathsf {ZF} + \mathsf {BTEE} + \text {Set Induction}_{j}$
.
Let us compare the consistency strength of
$\mathsf {ZFC} + \mathsf {BTEE} + \text {Set Induction}_{j}$
with that of other large cardinal axioms to illustrate how strong it is. We can see that virtually rank-into-rank cardinals, defined by Gitman and Schindler [Reference Gitman and Schindler24], provide an upper bound:
Definition 8.1 [Reference Gitman and Schindler24].
A cardinal
$\kappa $
is virtually rank-into-rank if in some set-forcing extension it is the critical point of an elementary embedding
$j\colon V_\lambda \to V_\lambda $
for some
$\lambda> \kappa $
.
Lemma 8.2. Let
$\kappa $
be a virtually rank-into-rank cardinal and
$\lambda $
witness this. If
$j\colon V_\lambda \to V_\lambda $
is an elementary embedding over a set-generic extension with
$\operatorname {crit} j=\kappa $
, then, in this extension,
$(V_{j^\omega (\kappa )},\in j)$
satisfies
$\mathsf {ZFC+BTEE} + \text {Set Induction}_{j}$
.
Proof First,
$\kappa $
is inaccessible in V. This follows from Theorem 4.20 of [Reference Gitman and Schindler24] and known facts about
$\omega $
-iterable cardinals, but we shall also give a direct proof for it. Suppose, for a contradiction, that V thinks
$\kappa $
is singular. Then there is a cofinal sequence
$\langle \alpha _\xi \mid \xi <\operatorname {cf}\kappa \rangle \in V_\lambda $
that converges to
$\kappa $
. Since
$j(\operatorname {cf}\kappa )=\operatorname {cf}\kappa $
and
$j(\alpha _\xi )=\alpha _\xi $
for all
$\xi <\operatorname {cf}\kappa $
,
$j(\kappa )=\kappa $
, which gives our contradiction. Similarly, if
$\kappa $
is not a strong limit cardinal in V, then there is
$\xi <\kappa $
and a surjection
$f:\mathcal {P}^{V}(\xi )\to \kappa $
in V. Then
$f\in V_\lambda $
since
$\operatorname {rank} f\le \kappa +3<j^3(\kappa )<\lambda $
. (This follows from the fact that the
$j^n(\kappa )$
for
$n<\omega $
form a strictly increasing sequence.) Now we can derive a contradiction in the usual way by considering
$j(f)$
.
Hence
$V_\kappa $
is a model of
$\mathsf {ZFC}$
. Also, we can see in the extension that
$V_\kappa \prec V_{j^n(\kappa )}$
for all
$n<\omega $
which shows that
$V_{j^\omega (\kappa )}=\bigcup _{n<\omega } V_{j^n(\kappa )}$
is a model of
$\mathsf {ZFC}$
. Finally,
$j\operatorname {\mathrm {\upharpoonright }} V_{j^\omega (\kappa )}:V_{j^\omega (\kappa )}\to V_{j^\omega (\kappa )}$
and, by the transitivity of
$V_{j^\omega (\kappa )}$
,
$V_{j^\omega (\kappa )}$
satisfies Set Induction
$_{j}$
.
As a lower bound, Corazza [Reference Corazza13] proved that
$\mathsf {ZFC+BTEE}$
proves there is an n-ineffable cardinal for each (meta-)natural number n.
The authors do not know the exact consistency strength of
$\mathsf {ZF+WA}$
(the assumption of
$\Sigma ^j$
-Induction is useful to ensure that the critical sequence is total) in the
$\mathsf {ZFC}$
-context but we can still find a lower bound for it. Suppose that
$\kappa $
was the critical point of such an embedding. Then we have that the critical sequence
$\langle j^n(\kappa ) | n \in \omega \rangle $
is definable, although it may not be a set (see Proposition 3.2 of [Reference Corazza12] for details). From this, we have:
Lemma 8.3 (
$\mathsf {ZF+WA_0}$
).
If the critical sequence is cofinal over the class of all ordinals, then
$\kappa $
is extendible.
Proof Let
$\eta $
be an ordinal. Take n such that
$\eta <j^n(\kappa )$
, then
$j^n \colon V_{\kappa +\eta }\prec V_{j^n(\kappa +\eta )}$
and
$\operatorname {crit} j^n=\kappa $
. Hence
$\kappa $
satisfies
$\eta $
-extendibility.
However, it should be noted that it is also possible that the critical sequence
$\langle j^n(\kappa )\mid n<\omega \rangle $
is bounded. In this case
$V_\lambda $
, for
$\lambda = \sup _{n<\omega } j^n(\kappa )$
, is a model of
$\mathsf {ZF+WA}$
in which the critical sequence is cofinal. Thus we can proceed with the argument by cutting off the universe at
$\lambda $
.
By an easy reflection argument, we can also see that
$\mathsf {ZF+WA_0}$
, with the critical sequence cofinal, proves not only that there is an extendible cardinal, but also the consistency of
$\mathsf {ZF}$
with a proper class of extendible cardinals, an extendible limit of extendible cardinals, and much more. Since extendible cardinals are preserved by Woodin’s forcing [Reference Woodin65, Theorem 226], we have a lower bound for the consistency strength of
$\mathsf {ZF+WA_0}$
, e.g.,
$\mathsf {ZFC}$
plus there is a proper class of extendible cardinals. Clearly,
$\mathsf {ZF + WA_0}$
should be much stronger than this but to find a better lower bound that involves even more sophisticated machinery than is currently available.
To summarize our consistency bound derived in this section, we have the following:
Corollary 8.4.
-
•
$\mathsf {IKP}_{j,M}$ with a
$\Sigma $ -
$\mathrm {Ord}$ -inary elementary embedding or
$\mathsf {CZF}$ with a critical set implies the consistency of
$\mathsf {ZFC+BTEE} + \text {Set Induction}_{j}$ . Furthermore,
$\mathsf {ZFC+BTEE} + \text {Set Induction}_{j}$ proves that the critical point,
$\kappa $ , of j is n-ineffable for every
$($ meta-
$)$ natural n.
-
•
$\mathsf {CZF}$ with a Reinhardt set implies the consistency of
$\mathsf {ZF+WA}$ . Furthermore, the consistency
$\mathsf {ZF}+\mathsf {WA}$ implies the consistency of
$\mathsf {ZFC}$ with a proper class of extendible cardinals.
9 Future works and questions
9.1 Upper bounds for large large set axioms
We may wonder how to find an upper bound for the consistency strength of
$\mathsf {CZF}$
with very large set axioms in terms of classical set theories. The authors believe that the currently known methods do not suffice to provide non-trivial upper bounds for the proof-theoretic strength of
$\mathsf {CZF}$
with very large set axioms. The known methods for analyzing the strength of
$\mathsf {CZF}$
and its extensions are the following:
-
1. Reducing
$\mathsf {CZF}$ or its extension to Martin-Löf type theory or its extension, and constructing a model of the type theory into a classical theory such as
$\mathsf {KP}$ or its extensions. This is how Rathjen (cf., [Reference Rathjen45, Reference Rathjen49, Reference Rathjen53, Reference Rathjen54]) provides a relative proof-theoretic strength for extensions of
$\mathsf {CZF}$ .
-
2. More generally, sets-as-trees interpretation or its variants: for example, Lubarsky [Reference Lubarsky37] proved that we can reduce
$\mathsf {CZF+Sep}$ into Second-Order Arithmetic by combining realizability with a sets-as-trees interpretation. We may associate Lubarsky’s construction with Rathjen’s interpretations because the sets-as-types interpretation is a special case of the sets-as-trees interpretation. (Types have tree-like structures.) Another construction on that line is functional realizability, which we define below for the sake of completeness:
Definition 9.1. Let
$\mathcal {A}$ be a pca and
$V(\mathcal {A})$ be the realizability universe. (See Definition 3.1 of [Reference Rathjen50] or Definition 2.3.1 of [Reference McCarty41].) A name
$a\in V(\mathcal {A})$ is functional if for every
$\langle e,b\rangle , \langle e, c\rangle \in a$ ,
$b=c$ .
$V^f(\mathcal {A})$ is the class of all functional names
$a\in V(\mathcal {A})$ .
The realizability relation
$\Vdash ^f$ over
$V^f(\mathcal {A})$ is identical with the usual realizability relation
$\Vdash $ over
$V(\mathcal {A})$ . (See Definition 4.1 of [Reference Rathjen50]), except that we restrict quantifiers to
$V^f(\mathcal {A})$ instead of
$V(\mathcal {A})$ . We say that
$V^f(\mathcal {A})\models \phi (\vec {a})$ if there is an
$e\in \mathcal {A}$ such that
$e\Vdash ^f \phi (\vec {a})$ .
-
3. Set realizability, which appears in [Reference Rathjen51, Reference Rathjen52]. This exploits the computational nature of sets to construct an interpretation. Unfortunately, set realizability does not result in models of
$\mathsf {CZF}$ : it produces an interpretation of
$\mathsf {CZF}^-$ ,
$\mathsf {CZF}^-+\mathsf {Exp}$ , or
$\mathsf {CZF+Pow}$ to
$\mathsf {IKP}$ ,
$\mathsf {IKP}(\mathcal {E})$ , or
$\mathsf {IKP}(\mathcal {P})$ respectively. Furthermore, set realizability can be used to prove the existence property of
$\mathsf {CZF}^-$ ,
$\mathsf {CZF}^-+\mathsf {Exp}$ , or
$\mathsf {CZF+Pow}$ . However, Swan proved in [Reference Swan62] that
$\mathsf {CZF}$ does not have the existence property, which suggests set realizability cannot model
$\mathsf {CZF}$ .
Thus the only currently known way to analyze the consistency strength of
$\mathsf {CZF}$
and its extensions is by combining realizability with sets-as-trees interpretations. However, this has significant issues when we try and generalize the method to large large cardinals. The first point is that almost all of the currently known methods (possibly except for [Reference Rathjen49, Reference Swan62]) rely on Kleene’s first pca to construct an interpretation of variants of Martin-Löf type theory into classical theories. The upshot is that the resulting interpretation of
$\mathsf {CZF}$
also validates the Axiom of Subcountability, which claims that every set is subcountable. However, Ziegler [Reference Ziegler67] observed that the Axiom of Subcountability is incompatible with critical sets. This feature may simply be due to the fact that the pca we are using in the construction of the interpretation is countable, so we may avoid this issue by using a larger pca.
However, it seems that there is no obvious way of constructing a realizer for
$\forall \vec {x}\phi ^M(j(\vec {x}))\to \phi (\vec {x})$
regardless of how large the pca
$\mathcal {A}$
is. Since we can view sets-as-types interpretations as a special case of sets-as-trees interpretations, we can expect there would be a similar difficulty when we construct a model of
$\mathsf {CZF}$
with large large set axioms by using the combination of realizability models and type-theoretic interpretations of
$\mathsf {CZF}$
.
Question 9.2. Can we provide any non-trivial upper bound for the consistency strength of
$\mathsf {CZF}$
with a critical set or a Reinhardt set?
9.2 Improving lower bounds
Our current lower bound could also be improved. For example, the proof of Corollary 5.14, Theorem 5.21, and Theorem 5.22 produces a set model of some theory. Hence the resulting lower bound for the consistency strength is strict. This brings into question whether we can provide a better lower bound for the given theories, possibly by constructing a class model of
$\mathsf {IZF}$
with a large set axiom.
Since obtaining a lower bound heavily relies on double-negation translations, it would be important to develop the relationship between double-negation translations and large set axioms. For example, Avigad [Reference Avigad9] provided a way to interpret
$\mathsf {KP}$
from
$\mathsf {IKP}$
by combining Friedman’s double-negation interpretation [Reference Friedman15] and a proof-theoretic forcing. It may be possible to extend Avigad’s interpretation for
$\mathsf {IKP}$
with a
$\Sigma $
-
$\mathrm {Ord}$
-inary elementary embedding with a critical point, which could result in a better lower bound (or possibly an equiconsistency result).
The ‘classical’ side of the lower bounds should also be improved. For example, we claimed that
$\mathsf {CZF}$
with a Reinhardt set implies the consistency of
$\mathsf {ZF+WA}$
, and the latter implies the consistency of
$\mathsf {ZFC}$
with a proper class of extendible cardinals. The authors do not know if it is possible to prove the consistency of
$\mathsf {ZFC+WA}$
from that of
$\mathsf {ZF+WA}$
.
We conclude this subsection with the following obvious question:
Question 9.3. Can we obtain a better lower bound for the consistency of any of the theories analyzed in this paper?
9.3 Developing technical tools
Some concepts in this paper are of independent interest. For example, we defined second-order constructive set theories
$\mathsf {CGB}$
and
$\mathsf {IGB}$
to handle super Reinhardt sets and
$\mathsf {TR}$
. However, constructive second-order set theories bring their own questions, associated with their classical counterparts. The following questions are untouched in this paper:
Question 9.4.
-
1. Williams [Reference Williams64] defined and analyzed second-order set-theoretic principles that bolster second-order set theory, including Class Collection schema, Elementary Transfinite Recursion schema,
$\mathsf {ETR}$ , and its restrictions
$\mathsf {ETR}_\Gamma $ . Can we define constructive analogues of these principles? If so, what are their consistency strength? For example, does
$\mathsf {CGB+ETR_\omega }$ prove the existence of the truth predicate of first-order set theory (and hence proves
$\mathsf {Con(CZF})$ )?
-
2. Do constructive second-order set theories admit the unrolling construction that was introduced by Williams [Reference Williams64]?
-
3. Can we develop a realizability model or Heyting-valued model for constructive second-order theories? It is known that not every class forcing preserves Collection and Powerset over the classical
$\mathsf {GB}$ . We may expect that we need some restriction on a class realizability or a class formal topology to ensure they preserve
$\mathsf {CGB}$ .
The relativization of Heyting-valued models is also an interesting topic. We only focused on the double-negation topology, and it seems that the formal topologies appearing in the current literature are either set-presentable or the double-negation topology. However, it is plausible that another formal topology might appear in the future, and its interaction with different inner models could be non-trivial. In that case, absoluteness and relativization issues become important.
9.4 Other large set axioms
In this paper, we analyzed critical sets, Reinhardt sets, and some constructive analogue of choiceless large cardinals. We did not provide the definition and analysis for analogues of other large cardinal axioms, such as supercompactness or hugeness. These concepts were first defined over
$\mathsf {IZF}$
by Friedman and Ščedrov in [Reference Friedman and Ščedrov16], in such a way as to be equivalent to their classical counterparts. However, for the sake of completeness, we include variations below that work better in our weaker context of
$\mathsf {CZF}$
.
Convention 9.5. Assume that K is a critical point of an elementary embedding
$j\colon V\to M$
. Over
$\mathsf {ZFC}$
, many large cardinals above measurable cardinals are defined as critical points of some elementary embedding with additional properties, for example, closure properties of M. We shall give constructive analogues of the main methods used to define closure where, here,
$\mathcal {MP}$
is Ziegler’s modified powerclass operator and
$\hat {V}_a$
is Ziegler’s modified hierarchy. (See Section 5.3 of [Reference Ziegler67] for the details.)
-
• Replace the closure under
$<\gamma $ -sequences,
${^{< \gamma }}{} M\subseteq M$ , to closure under multi-valued functions whose domain is an element of
$\hat {V}_a$ : if
$b\in \hat {V}_a$ and
$R:b\rightrightarrows M$ is a multi-valued function,Footnote 14 then there is
$c\in M$ such that
.
-
• Replace the closure under
$\gamma $ -sequences,
${^{\gamma }}{} M\subseteq M$ , to closure under multi-valued functions whose domain is an element of
$\hat {V}_{a\cup \{a\}}=\hat {V}_a\cup \mathcal {MP}(\hat {V}_a)$ .
-
• Replace
$V_\alpha \subseteq M$ with
$\hat {V}_\alpha \subseteq M$ .
The reader might wonder why we use multi-valued functions of the domain in
$\hat {V}_a$
and
$\hat {V}_{a\cup \{a\}}$
. For example, we may formulate a constructive analogue of the closure under
$\gamma $
-sequences
${^{\gamma }}{} M\subseteq M$
as the closure under multi-valued functions of domain a. There is a reason why we should allow multi-valued functions of the domain in
$a\cup \{a\}$
: for a transitive set a, the closure under multi-valued functions of domain in
$a\cup \{a\}$
proves
$a\in M$
. However, it is unclear if this can be achieved from just those multi-valued functions whose domain is in a.
It still remains a question why we use
$\hat {V}_{a\cup \{a\}}$
instead of
$a\cup \{a\}$
. The main reason is that
$\hat {V}_{a\cup \{a\}}$
includes
$a\cup \{a\}$
but has a much richer set-theoretic structure, which is the same reason we have worked with large sets rather than large cardinals. This will mean that our definitions will be equivalent to those presented in [Reference Friedman and Ščedrov16] over
$\mathsf {IZF}$
.
Observe that, over
$\mathsf {ZFC}$
, these modified definitions are equivalent to the standard ones. Thus, for example, we can define supercompact sets or strong setsFootnote
15
as follows:
Definition 9.6 (
$\mathsf {CGB}_\infty $
).
-
1. Let a be a transitive set. A set K is a-supercompact if there is an elementary embedding
$j\colon V\to M$ such that K is a critical point of j and M is closed under multi-valued functions whose domain is in
$\hat {V}_{a\cup \{a\}}$ A set K is supercompact if K is a-supercompact for all transitive sets a.
-
2. A set K is n-huge if there is an elementary embedding
$j\colon V\to M$ such that K is a critical point of j and M is closed under multi-valued functions whose domain is in
$\hat {V}_{j^n(K)\cup \{j^n(K)\}}$ .
-
3. A set K is
$\alpha $ -strong if there is an elementary embedding
$j\colon V\to M$ such that K is a critical point of j and
$\hat {V}_\alpha \subseteq M$ . A set K is strong if K is
$\alpha $ -strong for all
$\alpha $ .
Here it suffices to restrict our attention to ordinals rather than defining a-strongness for arbitrary sets since
$\hat {V}_a = \hat {V}_{\operatorname {rank}(a)}$ .
The reader is reminded that we may formulate a-supercompactness or
$\alpha $
-strongness over
$\mathsf {CZF}_{j,M}$
. However, formulating the full supercompactness and strongness would require us to quantify over j and M, so these should be stated over
$\mathsf {CGB}_\infty $
.
As a remark, let us mention that Friedman and Ščedrov [Reference Friedman and Ščedrov16] also defined hugeness and supercompactness over
$\mathsf {IZF}$
. A set K is huge in the sense of Friedman-Ščedrov if K is a critical point of an elementary embedding
$j\colon V\to M$
which is power inaccessible and satisfies the following statement: for any subset u of
$j(K)$
, if
$t\colon u\rightrightarrows M$
then we can find some
$v\in M$
such that
.Footnote
16
The following proposition shows our definition of hugeness and that of Friedman–Ščedrov coincides in some sense:
Proposition 9.7 (
$\mathsf {IGB}_\infty $
).
A transitive set K is huge in the sense of Friedman–Ščedrov if and only if K is power inaccessible and huge in the sense of Definition 9.6.
Proof Assume that K is huge in the sense of Friedman–Ščedrov. It is known that
$\hat {V}_K=K$
if K is inaccessible (see [Reference Ziegler67, Theorem 5.18]). Furthermore, power inaccessibility implies
$\mathcal {P}(1)\in K$
, so
$\mathcal {MP}(K)=\mathcal {P}(K)$
. Since transitivity implies
$K\subseteq \mathcal {P}(K)$
, we have
$\hat {V}_K\cup \mathcal {MP}(K)=\mathcal {P}(K)$
. Thus we have

Hence K is huge. The remaining direction is trivial.
The case for supercompactness is tricky because Friedman and Ščedrov employed a family of elementary embeddings instead of working over a second-order set theory to formulate it. We leave examining the difference between these two definitions of supercompact sets to possible future work.
These new notions of large set axioms bring the following question:
Question 9.8. Can we provide any consistency result for the large set axioms we can define by following the above schemes? Can we define an
$\mathsf {IKP}$
-analogue of such very large cardinals?
Appendix A Tables for notions appearing in this paper
Table A1 Notions appearing in this paper.

Table A2 Theories appearing in this paper.

Funding
The second author was supported by the UK Engineering and Physical Sciences Research Council during this research and is grateful for their support.