1. Introduction
Cancer is often caused by genetic mutations that disrupt normal cell division and apoptosis, causing cancerous cells to divide much more rapidly than healthy cells. This can happen, for example, once several distinct mutations accumulate and dramatically disrupt cell function. Thus, it is sometimes reasonable to model cancer as arising after k distinct mutations appear in sequence within a large body.
Mathematical models in which cancer occurs once some cell acquires k mutations date back to the famous 1954 paper [1], which proposed a multi-stage model of carcinogenesis in which, once a cell has acquired $k-1$ mutations, it acquires a kth mutation at rate $\mu_k$. In this model, the probability of acquiring the kth mutation during a small time interval $(t,t+\textrm{d}t)$ is approximately
\begin{equation*} \frac{\mu_1\mu_2\cdots\mu_k t^{k-1}}{(k-1)!}\,\textrm{d}t. \end{equation*}
That is, the incidence rate of the kth mutation (at which point the individual becomes cancerous) is proportional to $\mu_1\mu_2\cdots\mu_k t^{k-1}$. This means that cancer risk is proportional to both the mutation rates and the $(k-1)$th power of age. More sophisticated models, taking into account the possibilities of cell division and cell death, were later analyzed in [6, 7, 9, 12, 13, 15, 17, 18, 24].
To model some types of cancer, it is important to also include spatial structure in the model. In 1972, [25] introduced a spatial model of skin cancer now known as the biased voter model. At each site on a lattice, there is an associated binary state indicating whether the site is cancerous or healthy. Each cell divides at a certain rate, and when cell division occurs, the daughter cell replaces one of the neighboring cells chosen at random. The model is biased in that a cancerous cell spreads $\kappa>1$ times as fast as a healthy cell. Computer simulations for this model were presented in [25], and the model was later analyzed mathematically [3, 4].
More recently, [5], building on earlier work in [8, 14], studied a spatial Moran model which is a generalization of the biased voter model. Cells are modeled as points of the discrete torus $(\mathbb{Z}\text{ mod }L)^d$, and each cell is of type $i\in\mathbb{N}\cup\{0\}$. A cell of type $i-1$ mutates to type i at rate $\mu_i$. Type i cells have fitness level $(1+s)^i$, where $s>0$ measures the selective advantage of one cell over its predecessors. Each cell divides at a rate proportional to its fitness, and then, as in the biased voter model, the daughter cell replaces a randomly chosen neighboring cell. The authors considered the question of how long it takes for some type 2 cell to appear. To simplify the analysis, they introduced a continuous model where cells live inside the torus $[0,L]^d$. This continuous stochastic model approximates the biased voter model because of the Bramson–Griffeath shape theorem [3, 4], which implies that, conditioned on the survival of the mutations, the cluster of cells in $\mathbb{Z}^d$ with a particular mutation has an asymptotic shape that is a convex subset of $\mathbb{R}^d$. In [5, Section 4], the authors used the continuous model to compute the distribution of the time that the first type 2 cell appears, under certain assumptions on the mutation rates.
We describe here in more detail this continuous approximation to the biased voter model. The spread of cancer is modeled on the d-dimensional torus $\mathcal{T}\;:\!=\;[0,L]^d$, where the points 0 and L are identified. Note that this is the continuous analog of the space $(\mathbb{Z}\text{ mod }L)^d$ considered in [5]. We write $N\;:\!=\;L^d$ to denote the volume of $\mathcal{T}$. Each point in $\mathcal{T}$ is assigned a type, indicating the number of mutations the cell has acquired. At the initial time $t=0$, all points in $\mathcal{T}$ are type 0, meaning they have no mutations. A so-called type 1 mutation then occurs at rate $\mu_1$ per unit volume. Once each type 1 mutation appears, it spreads out in a ball at rate $\alpha$ per unit time. This means that t time units after a mutation appears, all points within a distance $\alpha t$ of the site where the mutation occurred will have acquired the mutation. Type 1 points then acquire a type 2 mutation at rate $\mu_2$ per unit volume, and this process continues indefinitely. In general, type k mutations overtake type $k-1$ mutations at rate $\mu_k$ per unit volume, and each type k mutation then spreads outward in a ball at rate $\alpha$ per unit time. A full mathematical construction of this process, starting from Poisson point processes which govern the mutations, is given at the beginning of Section 3.
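To make these dynamics concrete, the following is a minimal discretized sketch (illustrative only, not from the paper) of the model in $d=1$: the torus is treated as a ring of small cells of width $\textrm{d}x$, every mutant region expands one cell per time step $\textrm{d}t=\textrm{d}x/\alpha$, and a cell of type j acquires mutation $j+1$ with probability $\mu_{j+1}\,\textrm{d}x\,\textrm{d}t$ per step. All function names and parameter values are assumptions for illustration.

```python
import random

def simulate_sigma(k, L=10.0, alpha=1.0, mus=(0.5, 0.5), dx=0.1, seed=1):
    """Crude discretization of the d = 1 model on the torus [0, L):
    a ring of cells of width dx; one time step is dt = dx / alpha."""
    rng = random.Random(seed)
    n = int(L / dx)
    types = [0] * n          # number of mutations at each cell; all type 0 at t = 0
    dt = dx / alpha
    t = 0.0
    sigma = {}               # sigma[j] = first time some cell reaches type j
    while len(sigma) < k:
        t += dt
        # spread: a higher type overtakes neighbouring cells at rate alpha
        types = [max(types[i - 1], types[i], types[(i + 1) % n]) for i in range(n)]
        # mutation: a cell of type j gains mutation j+1 at rate mu_{j+1} per unit volume
        for i in range(n):
            j = types[i]
            if j < len(mus) and rng.random() < mus[j] * dx * dt:
                types[i] = j + 1
        for j in range(1, max(types) + 1):
            sigma.setdefault(j, t)
    return [sigma[j] for j in range(1, k + 1)]
```

The returned values approximate $\sigma_1,\ldots,\sigma_k$; the approximation sharpens as $\textrm{d}x\to 0$.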
Let $\sigma_k$ denote the first time that some cell becomes type k; [11] obtained the asymptotic distribution of $\sigma_2$ under a wide range of values for the parameters $\alpha$, $\mu_1$, and $\mu_2$, extending the results in [5], and also found the asymptotic distribution of $\sigma_k$ for $k \geq 3$ assuming equal mutation rates $\mu_i=\mu$ for all i. In this paper, we will further generalize the results in [11] for $k \geq 3$ by considering the case where the mutation rates are increasing. We will see that several qualitatively different types of behavior are possible, depending on how fast the mutation rates increase.
We mention two biological justifications for assuming increasing mutation rates. First, [16] suggested that a general phenomenon in carcinogenesis is favorable selection for mutations in genes responsible for repairing DNA damage; in the context of the present paper, the resulting genetic instability, which further disrupts DNA repair, would correspond to increasing mutation rates. Second, our model is of interest in the situation described in [22], which hypothesized that cancer cells express a mutator phenotype causing them to mutate at a much higher rate, and proposed targeting the mutator phenotype as part of cancer therapy, possibly with the goal of further increasing the mutation rate to the point where the mutations incapacitate or kill malignant cells.
As in [11], we assume that the rate of mutation spread $\alpha$ is constant across mutation types, so that successive mutations have equal selective advantage. One possible generalization of our model would be to allow each type i mutation to have a different rate of spread $\alpha_i$. However, this more general model is non-trivial even to formulate unless $(\alpha_i)_{i=1}^{\infty}$ is decreasing, because if $\alpha_{i+1}>\alpha_{i}$, then regions of type $i+1$ could completely swallow the surrounding type i region. Consequently, it would be necessary to model what happens not only when mutations of types $i+1$ and i compete, but also how mutations of types $i+1$ and $j\in \{1,\ldots,i-1\}$ compete. We do not pursue this generalization here.
After computing the limiting distribution of $\sigma_k$, we also find the limiting distribution of the distances between the first mutation of type i and the first mutation of type j, where $i < j$. The distribution of distances between mutations is relevant in studying a phenomenon known as the “cancer field effect”, which refers to the increased risk for certain regions to acquire primary tumors. These regions are called premalignant fields, and they have a high risk of becoming malignant despite appearing to be normal [10]. The size of the premalignant field is clinically relevant when a patient is diagnosed with cancer, because it will determine the area of tissue to be surgically removed in order to avoid cancer recurrence. Surgical removal of premalignant fields, put in the context of this paper, is akin to removing the region with at least i mutations once the first type $j>i$ mutation appears. The case in which $i = 1$ and $j = 2$ was considered in [10], which characterized the sizes of premalignant fields conditioned on $\{\sigma_2=t\}$ in $d\in \{1,2,3\}$ spatial dimensions. These ideas were applied to head and neck cancer in [23].
We note that the model that we are studying in this paper independently appeared in the statistical physics literature, where it is known as the polynuclear growth model. It has been studied most extensively in $d = 1$ when all of the $\mu_k$ are the same [2, 19, 20], but the model was also formulated in higher dimensions in [21]. Most of this work in the statistical physics literature focuses on the long-run growth properties of the surface, and detailed information about the fluctuations has been established when $d = 1$. This is quite different from our goal of understanding the time to acquire a fixed number of mutations.
In Section 2 we introduce some basic notation and state our main results, as well as some heuristics explaining why these results are true. In Section 3 we prove the limit theorems regarding the time to wait for k mutations, and in Section 4 we prove the limit theorems for the distances between mutations.
2. Main results and heuristics
We first introduce some notation that we will need before stating the results. Given two sequences of non-negative real numbers $(a_N)_{N=1}^{\infty}$ and $(b_N)_{N=1}^{\infty}$, we write:
-
$a_N\ll b_N$ (equivalently, $b_N\gg a_N$) if $a_N/b_N\to 0$;
-
$a_N\lesssim b_N$ if the sequence $(a_N/b_N)_{N=1}^{\infty}$ is bounded, and $a_N\asymp b_N$ if both $a_N\lesssim b_N$ and $b_N\lesssim a_N$.
We also define the following notation:
-
If $X_N$ converges to X in distribution, we write $X_N\Rightarrow X$ .
-
If $X_N$ converges to X in probability, we write $X_N\to_\textrm{p} X$ .
-
$\gamma_d$ denotes the volume of the unit ball in $\mathbb{R}^d$ .
-
For each $k\geq 1$ and $j \geq 1$ , we define
(1) \begin{align} \beta_k\;:\!=\;\Bigg(N\alpha^{(k-1)d}\prod_{i=1}^{k}\mu_i\Bigg)^{-1/((k-1)d+k)}, \qquad \kappa_j\;:\!=\;(\mu_j\alpha^d)^{-1/(d+1)}. \end{align}
We explain how $\beta_k$ and $\kappa_j$ arise in Sections 2.3 and 2.5, respectively.
-
$\sigma_k$ denotes the first time a mutation of type k appears, and $\sigma_k^{(2)}$ denotes the second time a mutation of type k appears. More rigorous definitions of $\sigma_k$ and $\sigma_k^{(2)}$ are given in Sections 3 and 4, respectively.
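As a quick numeric companion to (1) (an illustrative helper, not part of the paper), both time scales can be computed directly; note that for $k=1$ the definition gives $\beta_1=1/(N\mu_1)$, the mean waiting time for the first mutation.

```python
def beta(k, N, alpha, mus, d):
    """beta_k from (1): (N alpha^{(k-1)d} mu_1 ... mu_k)^{-1/((k-1)d + k)}.
    The list mus is 0-indexed, so mus[0] plays the role of mu_1."""
    prod = 1.0
    for mu in mus[:k]:
        prod *= mu
    return (N * alpha ** ((k - 1) * d) * prod) ** (-1.0 / ((k - 1) * d + k))

def kappa(j, alpha, mus, d):
    """kappa_j from (1): (mu_j alpha^d)^{-1/(d+1)}."""
    return (mus[j - 1] * alpha ** d) ** (-1.0 / (d + 1))
```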
All limits in this paper will be taken as $N \rightarrow \infty$ . The mutation rates $(\mu_i)_{i=1}^{\infty}$ and the rate of mutation spread $\alpha$ will depend on N, even though this dependence is not recorded in the notation. Throughout the paper we will assume that the mutation rates $(\mu_i)_{i=1}^{\infty}$ are asymptotically increasing, i.e.
2.1. Theorem 1: Low mutation rates
Assume
The first time a mutation of type 1 appears is exponentially distributed with rate $N\mu_1$. The maximal distance between any two points on the torus $\mathcal{T}=[0,L]^d$ is $\sqrt{d}L/2$. Also note that $L=N^{1/d}$, where N is the volume of $\mathcal{T}$. Consequently, once the first type 1 mutation appears, it will spread to the entire torus within time $\sqrt{d}L/(2\alpha)=\sqrt{d}N^{1/d}/(2\alpha)$. Hence, as noted in [11], the time required for a type 1 mutation to fixate once it has first appeared is much shorter than $\sigma_1$ precisely when $N^{1/d}/\alpha\ll 1/(N\mu_1)$, which is equivalent to $\mu_1\ll \alpha/N^{(d+1)/d}$.
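The comparison of time scales in this heuristic can be sketched numerically. The helper below (an illustrative assumption, not from the paper) returns the ratio of the worst-case fixation time $\sqrt{d}N^{1/d}/(2\alpha)$ to the expected waiting time $1/(N\mu_1)$; this ratio is small exactly in the regime $\mu_1\ll\alpha/N^{(d+1)/d}$.

```python
import math

def fixation_vs_wait(N, alpha, mu1, d):
    """Ratio of the worst-case fixation time sqrt(d) N^{1/d} / (2 alpha)
    to the expected waiting time 1 / (N mu1) for the first type 1 mutation."""
    fixation = math.sqrt(d) * N ** (1.0 / d) / (2.0 * alpha)
    wait = 1.0 / (N * mu1)
    return fixation / wait
```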
Now, because of the second assumption $\mu_i/\mu_1\to c_i \in (0,\infty]$, mutations of types $i\in \{2,\ldots,k\}$ appear at least as fast as the first mutation. If $c_i<\infty$, then the waiting times $\sigma_1$ and $\sigma_i-\sigma_{i-1}$ are on the same order of magnitude. Because we have $\sigma_1\sim \text{Exponential}(N\mu_1c_1)$, it follows that $\sigma_i-\sigma_{i-1}$ is also exponentially distributed and that $\sigma_i-\sigma_{i-1}\sim\text{Exponential}(N\mu_1c_i)$. Otherwise, if $c_i=\infty$, then the first type i mutation appears so quickly that its waiting time $\sigma_i-\sigma_{i-1}$ is negligible as $N\to\infty$. Putting everything together gives us the following theorem. This result is a very slight generalization of [11, Theorem 1], and is proved by the same method.
Theorem 1. Suppose (2) holds, and $\mu_1 \ll \alpha/N^{(d+1)/d}$ . Suppose that, for all $i\in \{1,\ldots,k\}$ , we have ${\mu_i}/{\mu_1}\to c_i\in (0,\infty]$ . Let $W_1,\ldots,W_k$ be independent random variables with $W_i\sim \textrm{Exponential}(c_i)$ if $c_i<\infty$ and $W_i=0$ if $c_i=\infty$ . Then $N\mu_1\sigma_k\Rightarrow W_1+\cdots+W_k$ .
Figure 1 illustrates that once a type i mutation appears, it quickly fills up the whole torus, and then a type $i+1$ mutation occurs.
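The limit $W_1+\cdots+W_k$ in Theorem 1 is easy to sample by Monte Carlo; the sketch below is illustrative (with `float("inf")` standing in for $c_i=\infty$) and returns the empirical mean, which should be close to $\sum_{i\colon c_i<\infty}1/c_i$.

```python
import random

def sample_limit_mean(cs, n=100_000, seed=7):
    """Empirical mean of W_1 + ... + W_k, where W_i ~ Exponential(c_i)
    if c_i < infinity and W_i = 0 if c_i = infinity."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += sum(0.0 if c == float("inf") else rng.expovariate(c) for c in cs)
    return total / n
```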
2.2. Theorem 2: Type $\boldsymbol{j}\geq 2$ mutations occur rapidly after $\sigma_1$
Assume
In contrast to Theorem 1, the assumption $\mu_1\gg \alpha/N^{(d+1)/d}$ means that the time it takes for type 1 mutations to spread to the entire torus is much longer than $\sigma_1$ . As a result, there will be many growing balls of type 1 mutations before any of these balls can fill the entire torus. However, if mutations of types $2, 3, \dots, k$ appear quickly after the first type 1 mutation appears, then the time to wait for the first type k mutation will be close to the time to wait for the first type 1 mutation. We consider here the conditions under which this will be the case.
First, consider the ball of type 1 cells resulting from the initial type 1 mutation at time $\sigma_1$. Assuming t is small enough that, by time $\sigma_1 + t$, the ball has not started overlapping itself by wrapping around the torus, the ball will have volume $\gamma_d (\alpha t)^d$ at time $\sigma_1 + t$. Then the probability that the first type 2 mutation appears in that ball before time $\sigma_1 + t$ is
(4) \begin{align} 1-\exp\bigg({-}\mu_2\int_0^t\gamma_d(\alpha r)^d\,\textrm{d}r\bigg)=1-\exp\bigg({-}\frac{\gamma_d\mu_2\alpha^dt^{d+1}}{d+1}\bigg). \end{align}
It follows that the first time a type 2 mutation occurs in this ball is on the order of $(\mu_2\alpha^d)^{-1/(d+1)}$. Hence, whenever $(\mu_2\alpha^d)^{-1/(d+1)}\ll 1/(N\mu_1)$, which is equivalent to the second assumption in (3), the waiting time $\sigma_2-\sigma_1$ is much shorter than $\sigma_1$. From this heuristic, we see that $N\mu_1(\sigma_2-\sigma_1)\to_\textrm{p} 0$. Repeating this reasoning with types $j-1$ and j in place of types 1 and 2, we see that $\sigma_j-\sigma_{j-1}$ is much shorter than $\sigma_1$ when $(\mu_j\alpha^d)^{-1/(d+1)}\ll 1/(N\mu_1)$, or, equivalently, $\mu_j\gg (N\mu_1)^{d+1}/\alpha^d$. However, this follows from the second assumption in (3) because of (2). Hence, we also have $N\mu_1(\sigma_j-\sigma_{j-1})\to_\textrm{p} 0$. Putting everything together, when N is large,
This gives us the following theorem. We note that the $k=2$ case was proved in [5, Theorem 3] using essentially the same reasoning as above.
Theorem 2. Suppose (2) holds. Suppose $\mu_1\gg \alpha/N^{(d+1)/d}$ and $\mu_2\gg (N\mu_1)^{d+1}/\alpha^d$. Then, for all $k \geq 2$, $N\mu_1\sigma_k\Rightarrow W$, where $W\sim \textrm{Exponential}(1)$.
A pictorial representation is given in Fig. 2, where the nested circles correspond to mutations of types $1,\ldots,k$ for $k=4$ .
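The scaling of $\sigma_2-\sigma_1$ derived above can be checked by inverse-CDF sampling from the tail $\mathbb{P}(T>t)=\exp({-}\gamma_d\mu_2\alpha^dt^{d+1}/(d+1))$, which comes from the ball volume $\gamma_d(\alpha t)^d$. The sketch below is illustrative (not from the paper); with a fixed seed, the empirical median scales exactly like $(\mu_2\alpha^d)^{-1/(d+1)}$.

```python
import math, random

GAMMA = {1: 2.0, 2: math.pi, 3: 4.0 * math.pi / 3.0}  # unit-ball volumes gamma_d

def median_t2(mu2, alpha, d, n=50_000, seed=3):
    """Empirical median of the first type 2 arrival time inside one growing
    type 1 ball, sampled by inverting the exponential tail: with E ~ Exp(1),
    T = ((d+1) E / (gamma_d mu2 alpha^d))^{1/(d+1)}."""
    rng = random.Random(seed)
    c = GAMMA[d] * mu2 * alpha ** d / (d + 1)
    ts = sorted((rng.expovariate(1.0) / c) ** (1.0 / (d + 1)) for _ in range(n))
    return ts[n // 2]
```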
2.3. Theorem 3: Type $\boldsymbol{j}\in \{1,\ldots,\boldsymbol{k}-1\}$ mutations appear many times
Assume
As in Theorem 2, the first assumption ensures that $\sigma_1$ is shorter than the time it takes for type 1 mutations to fixate once they appear. The second assumption ensures that all mutations of types up to k do not appear too quickly, so that we are not in the setting of Theorem 2. In particular, note that when $k=2$, we have $\beta_{k-1}=(N\mu_1)^{-1}$, and the second assumption reduces to $\mu_2\ll (N\mu_1)^{d+1}/\alpha^d$. When (5) holds, for $j \in \{2, \dots, k\}$ there will be many small balls of type $j-1$ before any type j mutation appears. In this case, we will be able to use a ‘law of large numbers’ established in [11] to approximate the total volume of type $j-1$ regions with its expectation.
To explain what happens in this case, we review a derivation from [11]. We want to define an approximation $v_j(t)$ to the total volume of regions with at least j mutations at time t. We set $v_0(t)\equiv N$. Next, let $t>0$. For times $r\in [0,t]$, type j mutations occur at rate $\mu_jv_{j-1}(r)$, and these type j mutations each grow into a ball of size $\gamma_d(\alpha(t-r))^d$ by time t. Therefore, we define
\begin{equation*} v_j(t)\;:\!=\;\int_0^t\mu_jv_{j-1}(r)\gamma_d(\alpha(t-r))^d\,\textrm{d}r. \end{equation*}
Note that this gives a good approximation to the volume of the type j region because we have many mostly non-overlapping balls of type j. In [11] it is shown using induction that
\begin{equation*} v_j(t)=\frac{(d!)^j\gamma_d^j}{(j(d+1))!}\,N\mu_1\cdots\mu_j\alpha^{jd}t^{j(d+1)}, \end{equation*}
which gives us the approximation
\begin{equation*} \mathbb{P}(\sigma_k>t)\approx\exp\bigg({-}\mu_k\int_0^tv_{k-1}(r)\,\textrm{d}r\bigg)=\exp\bigg({-}\frac{(d!)^{k-1}\gamma_d^{k-1}}{((k-1)d+k)!}\,N\mu_1\cdots\mu_k\alpha^{(k-1)d}t^{(k-1)d+k}\bigg). \end{equation*}
It will follow that if we define $\beta_k$ as in (1), then we have the following result.
Theorem 3. Suppose (2) holds. Let $k\geq 2$, and suppose $\mu_1\gg \alpha/N^{(d+1)/d}$ and $\mu_k\ll 1/(\alpha^d\beta_{k-1}^{d+1})$. Then, for $t>0$,
\begin{equation*} \lim_{N\to\infty}\mathbb{P}(\sigma_k>\beta_kt)=\exp\bigg({-}\frac{(d!)^{k-1}\gamma_d^{k-1}}{((k-1)d+k)!}\,t^{(k-1)d+k}\bigg). \end{equation*}
When we have equal mutation rates (i.e. $\mu_i=\mu$ for all i), the result above is covered by [11, Theorem 10, part 3]. The form of the result and the strategy of the proof are exactly the same in the more general case when the mutation rates can differ. Theorem 3 is illustrated in Fig. 3 for $k=3$.
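The volume recursion, with $v_0\equiv N$ and each type j mutation at time r contributing a ball of volume $\gamma_d(\alpha(t-r))^d$ at time t, can also be iterated numerically. The sketch below (an illustrative trapezoid-rule implementation, not from the paper) recovers, for instance, $v_1(t)=N\mu_1\gamma_d\alpha^dt^{d+1}/(d+1)$.

```python
import math

def v_profiles(k, N, alpha, mus, d, T, steps=1000):
    """Iterate v_j(t) = integral_0^t mu_j v_{j-1}(r) gamma_d (alpha (t-r))^d dr,
    with v_0 = N, on a uniform grid (trapezoid rule).
    Returns the list [v_1(T), ..., v_k(T)]."""
    gamma_d = {1: 2.0, 2: math.pi, 3: 4.0 * math.pi / 3.0}[d]
    h = T / steps
    grid = [i * h for i in range(steps + 1)]
    v_prev = [float(N)] * (steps + 1)   # v_0 is identically N
    out = []
    for j in range(1, k + 1):
        v = [0.0]
        for m in range(1, steps + 1):
            t = grid[m]
            f = [mus[j - 1] * v_prev[i] * gamma_d * (alpha * (t - grid[i])) ** d
                 for i in range(m + 1)]
            v.append(h * (sum(f) - 0.5 * (f[0] + f[-1])))
        out.append(v[-1])
        v_prev = v
    return out
```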
2.4. Theorem 4: An intermediate case between Theorems 2 and 3
Assume $\mu_1\gg{\alpha}/{N^{(d+1)/d}}$ . We first define
It follows from (2) that if $\mu_j\ll 1/\big(\alpha^d\beta_{j-1}^{d+1}\big)$ , then $\mu_{j-1} \ll 1/\big(\alpha^d\beta_{j-1}^{d+1}\big)$ , which by Lemma 2 below implies that $\mu_{j-1} \ll 1/\big(\alpha^d\beta_{j-2}^{d+1}\big)$ . It follows that
Intuitively, l is the largest index for which mutations of types $1,2,\ldots,l$ behave exactly as in Theorem 3. The definition of l in (6) omits the possibility $l=1$ , since $\beta_0$ is undefined. However, if we define $l = 1$ when the set over which we take the maximum in (6) is empty, then Theorem 4 below when $l = 1$ is the same as Theorem 2. On the other hand, if $l \in \{k,k+1,\ldots\}\cup \{\infty\}$ , then by (7) we have $\mu_k\ll 1/\big(\alpha^d\beta_{k-1}^{d+1}\big)$ , in which case Theorem 3 applies. Hence, we assume $l\in \{2,\ldots,k-1\}$ and
The situation in Theorem 4 is a hybrid of Theorems 2 and 3. A mutation of type $j\in \{1,\ldots,l-1\}$ takes a longer time to fixate in the torus than the interarrival time $\sigma_{j}-\sigma_{j-1}$ . As a result, if $j\in \{2,\ldots,l\}$ , there will be many mostly non-overlapping balls of type $j-1$ before time $\sigma_j$ . Using this fact, we proceed as in Theorem 3 and find $\lim_{N\to\infty}\mathbb{P}(\sigma_l>\beta_lt)$ . Next, our assumption in (8) places us in the regime of Theorem 2; all mutations of types $l+1,\ldots,k$ happen so quickly that for all $\varepsilon>0$ we have $\mathbb{P}(\sigma_k-\sigma_l>\beta_l\varepsilon)\to 0$ . Then, combining these two results yields the following theorem.
Theorem 4. Suppose (2) holds, and suppose $\mu_1\gg \alpha/N^{(d+1)/d}$. Suppose also that $l\in \{2,\ldots,k-1\}$ and that $\mu_{l+1}\gg 1/\big(\alpha^d\beta_l^{d+1}\big)$. Then, for $t>0$,
\begin{equation*} \lim_{N\to\infty}\mathbb{P}(\sigma_k>\beta_lt)=\exp\bigg({-}\frac{(d!)^{l-1}\gamma_d^{l-1}}{((l-1)d+l)!}\,t^{(l-1)d+l}\bigg). \end{equation*}
In pictures, Theorem 4 looks like Fig. 3 for mutations up to type l. Then, once the first type l mutation appears and spreads in a circle, all the subsequent mutations become nested within that circle, similar to Fig. 2.
Remark 1. Theorems 1–4 cover most of the possible cases in which (2) holds. However, we assume that either $\mu_1 \ll \alpha/N^{(d+1)/d}$ or $\mu_1 \gg \alpha/N^{(d+1)/d}$. In the case $\mu_1 \asymp \alpha/N^{(d+1)/d}$, we expect that at the time a type 2 mutation appears, there could be several overlapping type 1 balls whose size is comparable to the size of the torus, and we do not expect the limiting distribution of $\sigma_k$ to have a simple expression. Consequently, we do not pursue this case here. We note that if $\mu_1 \asymp \alpha/N^{(d+1)/d}$ and all mutation rates are equal (i.e. $\mu_i=\mu$ for all i), then it is proven, as a special case of [11, Theorem 12], that $N\mu\sigma_k$ converges in distribution to a non-degenerate random variable for every $k\geq 1$. Likewise, we do not consider the case in which, instead of (8), we have $\mu_{l+1} \asymp 1/\big(\alpha^d \beta_l^{d+1}\big)$. In this case we believe there could be several overlapping type l balls at the time the first type $l+1$ mutation occurs, again preventing there from being a simple expression for the limit distribution.
2.5. Distances between mutations
For $1\leq i<j$ , define $D_{i,j}$ to be the distance in the torus between the location of the first mutation of type j and the location of the first mutation of type i. Also define $D_{i+1}\;:\!=\;D_{i,i+1}$ .
Consider the setting of Theorem 2. We will assume a stronger version of (2):
Recall that the mutations appear in nested balls as in Fig. 2. Because the first type $j+1$ mutation will therefore appear before the second type j mutation with high probability, we can calculate, as in (4), that
\begin{equation*} \mathbb{P}(\sigma_{j+1}-\sigma_j>t)\approx\exp\bigg({-}\frac{\gamma_d\mu_{j+1}\alpha^dt^{d+1}}{d+1}\bigg). \end{equation*}
It follows that if we define $\kappa_{j+1}$ as in (1), then
\begin{equation*} \lim_{N\to\infty}\mathbb{P}\big((\sigma_{j+1}-\sigma_j)/\kappa_{j+1}>t\big)=\exp\bigg({-}\frac{\gamma_dt^{d+1}}{d+1}\bigg). \end{equation*}
With this, we can calculate the approximate density f(t) of $(\sigma_{j+1}-\sigma_j)/\kappa_{j+1}$ . This allows us to calculate
The location of the first type $j+1$ mutation conditioned on $\sigma_{j+1}-\sigma_j=\kappa_{j+1}t$ is a uniformly random point in a d-dimensional ball of radius $\alpha\kappa_{j+1}t$. This allows us to calculate $\lim_{N\to\infty}\mathbb{P}(D_{j+1}\leq \alpha\kappa_{j+1}s)$. Next, because of (9), mutations of types $j+2,j+3,j+4,\ldots$ appear rapidly once the first type $j+1$ appears. This means that $D_{j+2}+\cdots+D_{k}$ is small relative to $D_{j+1}$, and therefore that $D_{j,k}$ has the same limiting distribution as $D_{j+1}$. These heuristics lead to the following theorem.
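The two-step heuristic above, sampling the rescaled interarrival time from its limiting tail $\exp({-}\gamma_dt^{d+1}/(d+1))$ and then a uniform location in a ball of the corresponding radius, can be simulated directly. The sketch below is illustrative (not from the paper); the radial part of a uniform point in a ball of radius t is $tU^{1/d}$ with U uniform on (0, 1).

```python
import math, random

def dist_cdf(s, d, n=40_000, seed=11):
    """Monte Carlo estimate of lim P(D_{j+1} <= alpha kappa_{j+1} s):
    sample T' with P(T' > t) = exp(-gamma_d t^{d+1}/(d+1)) by inversion,
    then a uniform point in a ball of radius T'."""
    gamma_d = {1: 2.0, 2: math.pi, 3: 4.0 * math.pi / 3.0}[d]
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        t = ((d + 1) * rng.expovariate(1.0) / gamma_d) ** (1.0 / (d + 1))
        if t * rng.random() ** (1.0 / d) <= s:
            hits += 1
    return hits / n
```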
Theorem 5. Suppose (9) holds. Suppose $\mu_1\gg\alpha/N^{(d+1)/d}$ and $\mu_2\gg \big(N\mu_1\big)^{d+1}/\alpha^d$ . Suppose $1 \leq j < k$ . Then, for all $s>0$ ,
Recall the definition of l in (6). Theorem 4 is similar to Theorem 2 in that once the first type l mutation appears, all the subsequent type $l+1,l+2,\ldots$ mutations happen quickly. Therefore, it is reasonable to expect that the type $l,l+1,l+2,\ldots$ mutations behave similarly to the type $1,2,3,\ldots$ mutations in Theorem 2. Hence, analogous to (9), assume that
We then obtain the following result.
Theorem 6. Suppose (10) holds, and suppose $\mu_1 \gg \alpha/N^{(d+1)/d}$ . Define l as in (6), and suppose also that $l \geq 2$ and that $\mu_{l+1}\gg 1/\big(\alpha^d\beta_l^{d+1}\big)$ . Suppose $l \leq j<k$ . Then, for all $s>0$ ,
Remark 2. Theorems 5 and 6 hold in the settings of Theorems 2 and 4 respectively. In the setting of Theorem 1, each type $i\geq 1$ mutation fills up the entire torus before a type $i+1$ mutation occurs, and so the first type $i+1$ mutation appears at a uniformly distributed point on the torus, independently of where all previous mutations originated. Therefore, the problem of finding the distribution of the distances between mutations becomes trivial in this case. On the other hand, in the setting of Theorem 3, type i mutations appear in small and mostly non-overlapping circles before the first type $i+1$ mutation appears. Thus, calculating the distribution of $D_{i+1}$ requires understanding not only the total volume of the type i region, but also the sizes and locations of many small type i regions. We do not pursue this case here, but we conjecture that because the first type $i+1$ mutation is likely not to appear within the type i region generated by the first type i mutation, the locations of the first type i and the first type $i+1$ mutations should be nearly independent of each other, as in the setting of Theorem 1.
3. Proofs of limit theorems for $\sigma_k$
In this section we prove Theorems 1–4. We begin by introducing the structure of the torus $\mathcal{T}=[0,L]^d$, the d-dimensional torus of side length L, following the notation of [11]. We define a pseudometric on the closed interval [0, L] by $d_L(x,y)\;:\!=\;\min\{|x-y|,L-|x-y|\}$. For $x=(x^1,\ldots,x^d)\in \mathcal{T}$ and $y=(y^1,\ldots,y^d)\in \mathcal{T}$, we define a pseudometric on $\mathcal{T}$ by
\begin{equation*} |x-y|\;:\!=\;\bigg(\sum_{i=1}^{d}d_L(x^i,y^i)^2\bigg)^{1/2}. \end{equation*}
The torus should be viewed as $\mathcal{T}$ modulo the equivalence relation $x\sim y$ if and only if $|x-y|=0$ , or more simply $\mathcal{T}=(\mathbb{R}\ (\text{ mod }L))^d$ . However, we will continue to write $\mathcal{T}=[0,L]^d$ , keeping in mind that certain points are considered to be the same via the equivalence relation defined above. It will be useful to observe the following:
-
We have $d_L(x,y)\leq L/2$ for all $x,y\in [0,L]$ . As a result, the distance between any two points $x,y\in \mathcal{T}$ is at most
\begin{equation*} \sup\{|x-y|\colon x,y\in \mathcal{T}\} = \sqrt{\sum_{i=1}^{d}\bigg(\frac{L}{2}\bigg)^2}=\frac{\sqrt{d}L}{2}. \end{equation*} -
Therefore, once a mutation of type j appears, the entire torus will become type j in time
(11) \begin{equation} \frac{\text{maximal distance between any }x,y\in\mathcal{T}}{\text{rate of mutation spread per unit time}}=\frac{\sqrt{d}L}{2\alpha}. \end{equation}
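For concreteness, the torus pseudometric can be coded directly (a hypothetical helper, not from the paper); antipodal points such as $(0,0)$ and $(L/2,L/2)$ in $d=2$ realize the diameter $\sqrt{d}L/2$.

```python
import math

def torus_dist(x, y, L):
    """|x - y| on the torus [0, L]^d: coordinatewise
    d_L(a, b) = min(|a - b|, L - |a - b|), combined as a Euclidean norm."""
    return math.sqrt(sum(min(abs(a - b), L - abs(a - b)) ** 2 for a, b in zip(x, y)))
```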
We use $|A|$ to denote the Lebesgue measure of some subset A of $\mathcal{T}$ or $\mathcal{T}\times [0,\infty)$ , so that $N=L^d=|\mathcal{T}|$ is the torus volume. Each $x\in \mathcal{T}$ at time t has a type $k\in \{0,1,2,\ldots\}$ , which we denote by T(x, t), corresponding to the number of mutations the site has acquired. The set of type i sites is defined by $\chi_i(t)\;:\!=\;\{x\in \mathcal{T}\colon T(x,t)=i\}$ . The set of points whose type is at least i is defined by
At time t, we denote the total volume of type i sites by $X_i(t)\;:\!=\;|\chi_i(t)|$ , and the total volume of sites with type at least i by $Y_i(t)\;:\!=\;|\psi_i(t)|$ . The first time a type k mutation appears in the torus can be expressed as $\sigma_k=\inf\{t>0\colon Y_k(t)>0\}$ .
Still following [11], we now explicitly describe the construction of the process which gives rise to mutations in the torus. We model mutations as random space-time points $(x,t)\in \mathcal{T}\times [0,\infty)$. Let $(\Pi_k)_{k=1}^{\infty}$ be a sequence of independent Poisson point processes on $\mathcal{T}\times [0,\infty)$, where $\Pi_k$ has intensity $\mu_k$. That is, for any space-time region $A\subseteq \mathcal{T}\times [0,\infty)$, the probability that A contains j points of type k is $\textrm{e}^{-\mu_k|A|}{(\mu_k|A|)^j}/{j!}$. Each $(x,t)\in \Pi_k$ is a space-time point at which $x\in \mathcal{T}$ can acquire a kth mutation at time t. We say that x mutates to type k at time t precisely when $x\in \chi_{k-1}(t)$ and $(x,t)\in \Pi_k$. Once a point acquires a type k mutation, the mutation spreads outward from that point in a ball at rate $\alpha$ per unit time.
3.1. Proof of Theorem 1
In the setting of Theorem 1, once the first mutation appears, with high probability it spreads to the entire torus before another mutation appears. The proof of Theorem 1 uses [11, Theorem 1], which we restate below as Theorem 7. Theorem 1 is very similar to Theorem 7 when $j=1$. However, Theorem 7 requires $\mu_j \ll \alpha/N^{(d+1)/d}$ for all $j \in \{1, \dots, k\}$, whereas Theorem 1 requires this condition only for $j = 1$. This is why Theorem 1 cannot be deduced directly from Theorem 7, even though the proofs of the results are essentially the same.
Theorem 7. ([11, Theorem 1].) Suppose $\mu_i\ll \alpha/N^{(d+1)/d}$ for $i\in \{1,\ldots,k-1\}$. Suppose there exists $j\in \{1,\ldots,k\}$ such that $\mu_j\ll \alpha/N^{(d+1)/d}$ and ${\mu_i}/{\mu_j}\to c_i\in (0,\infty]$ for all $i\in \{1,\ldots,k\}$. Let $W_1,\ldots,W_k$ be independent random variables such that $W_i$ has an exponential distribution with rate parameter $c_i$ if $c_i<\infty$ and $W_i=0$ if $c_i=\infty$. Then $N\mu_j\sigma_k\Rightarrow W_1+\cdots+W_k$.
Proof of Theorem 1. Let $r\;:\!=\;\max\{j\in \{1,\ldots,k\}\colon\mu_j\lesssim \mu_1\}$. For all $j \in \{1, \dots, r\}$, we have $\mu_j\lesssim\mu_1\ll \alpha/N^{(d+1)/d}$. By Theorem 7, $N\mu_1\sigma_r\Rightarrow W_1+\cdots+ W_r$. If $r=k$, then the conclusion follows. Otherwise, $r\leq k-1$, and by the maximality of r and (2), we have $\mu_l/\mu_1 \rightarrow \infty$ for all $l\in \{r+1,\ldots,k\}$. Then the result follows if we show that $N\mu_1(\sigma_k-\sigma_r)\to_\textrm{p} 0$. We have
We will find an upper bound for the right-hand side of (12). For $i\geq 1$ , let $t_i=\inf\{t>0\colon Y_i(t)=N\}$ be the first time which every point in $\mathcal{T}$ is of at least type i. Define $\hat{t}_i\;:\!=\;t_i-\sigma_i$ , which is the time elapsed between $\sigma_i$ and when mutations of type i fixate in the torus. Also define $\hat{\sigma}_i=\inf\{t>0\colon \Pi_i\cap (\mathcal{T}\times [t_{i-1},t])\neq \varnothing\}$ , which is the first time there is a potential type i mutation after $t_{i-1}$ . Observe that, because we always have $\sigma_i \leq \hat{\sigma}_i$ ,
Also observe that, by (11), we have $\hat{t}_j\leq \sqrt{d}N^{1/d}/(2\alpha)$ . Consequently, the right-hand side of (12) has the upper bound
The result follows if the right-hand side of the above expression converges to 0 in probability. The first term tends to zero because $\mu_1\ll \alpha/N^{(d+1)/d}$ . The second term tends to zero because $\hat{\sigma}_{j+1}-t_j\sim \text{Exponential}(N\mu_{j+1})$ , so $N\mu_1(\hat{\sigma}_{j+1}-t_j)\sim \text{Exponential}(\mu_{j+1}/\mu_1)\to_\textrm{p} 0$ .
3.2. Proof of Theorem 2
Lemma 1. Let $t_N$ be a random time that is $\sigma(\Pi_1,\ldots,\Pi_{j-1})$ -measurable and satisfies $t_N\geq \sigma_{j-1}$. Then
\begin{equation*} \mathbb{P}(\sigma_j>t_N)=\mathbb{E}\bigg[\exp\bigg({-}\mu_j\int_{\sigma_{j-1}}^{t_N}Y_{j-1}(r)\,\textrm{d}r\bigg)\bigg]. \end{equation*}
Proof. Write $\mathcal{G}\;:\!=\;\sigma(\Pi_1,\ldots,\Pi_{j-1})$. Define the set $A\;:\!=\;\{(x,r)\colon x\in\psi_{j-1}(r),\, r\in [\sigma_{j-1},t_N]\}$, and note that its Lebesgue measure $|A|=\int_{\sigma_{j-1}}^{t_N}Y_{j-1}(r)\,\textrm{d}r$ is a $\mathcal{G}$ -measurable random variable. The event $\{\sigma_{j}>t_N\}$ occurs precisely when $\Pi_j\cap A =\varnothing$. Let X be the number of points of $\Pi_j$ in the set A. Because $\Pi_j$ is independent of $\Pi_1, \dots, \Pi_{j-1}$, the conditional distribution of X given $\mathcal{G}$ is Poisson $(\mu_j|A|)$. Therefore,
\begin{equation*} \mathbb{P}(\sigma_j>t_N\mid\mathcal{G})=\mathbb{P}(X=0\mid\mathcal{G})=\exp({-}\mu_j|A|). \end{equation*}
Taking expectations of both sides completes the proof.
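The void probability underlying this proof, namely that a region of measure $|A|$ contains no points of an intensity-$\mu$ Poisson process with probability $\textrm{e}^{-\mu|A|}$, can be checked by simulation. The sketch below is illustrative (Poisson counts sampled by multiplying uniforms until the product drops below $\textrm{e}^{-\lambda}$, Knuth's method).

```python
import math, random

def empty_prob(mu, area, n=200_000, seed=5):
    """Fraction of trials in which a Poisson(mu * area) count equals zero;
    this should approach exp(-mu * area) as n grows."""
    rng = random.Random(seed)
    threshold = math.exp(-mu * area)
    empty = 0
    for _ in range(n):
        count, p = 0, rng.random()
        while p > threshold:
            p *= rng.random()
            count += 1
        if count == 0:
            empty += 1
    return empty / n
```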
Proof of Theorem 2. Write $N\mu_1\sigma_k$ as a telescoping sum,
\begin{equation*} N\mu_1\sigma_k=N\mu_1\sigma_1+\sum_{j=2}^{k}N\mu_1(\sigma_j-\sigma_{j-1}). \end{equation*}
We have $N\mu_1\sigma_1\sim \text{Exponential}(1)$. Hence, it suffices to show that, for each $j\geq 2$, the random variable $N\mu_1(\sigma_j-\sigma_{j-1})$ converges in probability to zero. Let $t>0$. Then, by Lemma 1,
\begin{equation*} \mathbb{P}(N\mu_1(\sigma_j-\sigma_{j-1})>t)=\mathbb{P}\big(\sigma_j>\sigma_{j-1}+t/(N\mu_1)\big)=\mathbb{E}\bigg[\exp\bigg({-}\mu_j\int_{\sigma_{j-1}}^{\sigma_{j-1}+t/(N\mu_1)}Y_{j-1}(r)\,\textrm{d}r\bigg)\bigg]. \end{equation*}
We want to show that the term on the right-hand side tends to zero. By the dominated convergence theorem, it suffices to show that, as $N\to \infty$,
\begin{equation*} \exp\bigg({-}\mu_j\int_{\sigma_{j-1}}^{\sigma_{j-1}+t/(N\mu_1)}Y_{j-1}(r)\,\textrm{d}r\bigg)\to_\textrm{p}0. \end{equation*}
Notice that because $\mu_1\gg \alpha/N^{(d+1)/d}$, for all sufficiently large N we have $t/(N\mu_1)\leq N^{1/d}/(2\alpha)$. Therefore, at each time $s\in[\sigma_{j-1},\sigma_{j-1}+t/(N\mu_1)]$, there is a ball of type $j-1$ mutations of radius $\alpha(s-\sigma_{j-1})$ which has not yet begun to wrap around the torus and overlap itself. Hence, we have $Y_{j-1}(s)\geq \gamma_d\alpha^d(s-\sigma_{j-1})^d$ for $s\in [\sigma_{j-1},\sigma_{j-1}+t/(N\mu_1)]$, and therefore
\begin{equation*} \mu_j\int_{\sigma_{j-1}}^{\sigma_{j-1}+t/(N\mu_1)}Y_{j-1}(r)\,\textrm{d}r\geq\frac{\gamma_d\mu_j\alpha^dt^{d+1}}{(d+1)(N\mu_1)^{d+1}}. \end{equation*}
It remains to show that
\begin{equation*} \frac{\gamma_d\mu_j\alpha^dt^{d+1}}{(d+1)(N\mu_1)^{d+1}}\to\infty. \end{equation*}
For the above to hold, it suffices to have $ \mu_j\gg (N\mu_1)^{d+1}/\alpha^d$ , which holds due to the second assumption in the theorem and (2). This completes the proof.
3.3. Proof of Theorem 3
We recall the definition of $\beta_k$ as in (1) of Section 2. In the setting of Theorem 3, $\beta_k$ is the order of magnitude of the time it takes for the kth mutation to appear.
Much of the proof of Theorem 3 will rely on [Reference Foo, Leder and Schweinsberg11, Lemma 9], which approximates a monotone stochastic process by a deterministic function under a certain time scaling. In order to apply this lemma, it is important to ensure that $Y_k(t)$ is well approximated by its expectation, which is the content of [Reference Foo, Leder and Schweinsberg11, Lemma 8].
Before proving Theorem 3, we state several lemmas, some of which are from [Reference Foo, Leder and Schweinsberg11]. First, we need to ensure that the last assumption, $\mu_k\alpha^d\beta_{k-1}^{d+1}\to 0$ , in Theorem 3 implies $\mu_k\alpha^d\beta_{k}^{d+1}\to 0$ , so that we are able to use part (ii) of Lemma 5 to approximate $Y_{k-1}(\beta_k t)$ by its expectation.
Lemma 2. For $k\geq 2$ , we have $\mu_k\ll 1/\big(\alpha^d\beta_k^{d+1}\big)$ if and only if $\mu_k\ll 1/\big(\alpha^d\beta_{k-1}^{d+1}\big)$ .
Proof. Using the definition of $\beta_k$ from (1), we get
as claimed.
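One way to organize this computation, assuming $\beta_k$ takes the form $\beta_k=\big(\big(\prod_{i=1}^k\mu_i\big)N\alpha^{(k-1)d}\big)^{-1/m_k}$ with $m_k\;:\!=\;(k-1)(d+1)+1$ (so that, in particular, $\beta_1=1/(N\mu_1)$): dividing the defining relations for $\beta_k$ and $\beta_{k-1}$ gives $\mu_k\alpha^d=\beta_{k-1}^{m_{k-1}}/\beta_k^{m_k}$, and hence

```latex
\mu_k\alpha^d\beta_k^{d+1} = \left(\frac{\beta_{k-1}}{\beta_k}\right)^{\!m_{k-1}},
\qquad
\mu_k\alpha^d\beta_{k-1}^{d+1} = \left(\frac{\beta_{k-1}}{\beta_k}\right)^{\!m_k}.
```

Since $m_{k-1}$ and $m_k$ are fixed positive exponents, each side tends to zero if and only if $\beta_{k-1}/\beta_k\to 0$, which gives the claimed equivalence.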
We also need [Reference Foo, Leder and Schweinsberg11, Lemma 9], which is restated as Lemma 3. This lemma gives sufficient conditions under which a monotone stochastic process is well approximated by a deterministic function.
Lemma 3. Suppose, for all positive integers N, $(Y_N(t),t\geq 0)$ is a non-decreasing stochastic process such that $\mathbb{E}[Y_N(t)] < \infty$ for each $t > 0$ . Assume there exist sequences of positive numbers $(\nu_N)_{N=1}^{\infty}$ and $(s_N)_{N=1}^{\infty}$ and a continuous non-decreasing function $g>0$ such that, for each fixed $t>0$ and $\varepsilon>0$ , we have
and
Then, for all $\varepsilon>0$ and $\delta>0$ , we have
Next, we state a criterion which guarantees that, for fixed $t>0$ , the probability $\mathbb{P}(\sigma_k>\beta_kt)$ converges to a deterministic function as $N\to\infty$ .
Lemma 4. For a continuous non-negative function g, a sequence $(\nu_N)_{N=1}^{\infty}$ of positive real numbers, and $\delta,\varepsilon>0$ , define the event
If $(\nu_N)_{N=1}^{\infty}$ and g are chosen such that $\lim_{N\to\infty}\mathbb{P}\big(B_N^{k-1}(\delta,\varepsilon,g,\nu_N)\big)=1$ and $\lim_{N\to\infty}\nu_N\beta_k\mu_k$ exists, then
Proof. Suppose $\delta \leq t \leq \delta^{-1}$ . We reason as in the proof of [Reference Foo, Leder and Schweinsberg11, Theorem 10]. The upper and lower bounds from [Reference Foo, Leder and Schweinsberg11, (26) and (27)] are
Taking $N\to\infty$ and then $\varepsilon,\delta\to 0$ , we get the desired result.
We also need to approximate the expected volume of the regions of type k or higher, $\mathbb{E}[Y_k(t)]$, by a deterministic function, and to ensure that $Y_k(t)$ is well approximated by its expectation. Lemma 5 is a restatement of [Reference Foo, Leder and Schweinsberg11, Lemmas 5 and 8]. It is important to note that, for this result, the time t may depend on N.
Lemma 5. Fix a positive integer k. Suppose $\mu_j\alpha^d t^{d+1}\to 0$ for all $j\in \{1,\ldots,k\}$ . Also suppose $t \leq N^{1/d}/(2\alpha)$ . Then
(i) Setting $\displaystyle v_k(t)\;:\!=\;\frac{\gamma_d^k(d!)^k}{(k(d+1))!} \Bigg(\prod_{i=1}^{k}\mu_i\Bigg)N\alpha^{kd}t^{k(d+1)}$ , we have $\mathbb{E}[Y_k(t)]\sim v_k(t)$ .
(ii) If, in addition, we assume $\displaystyle \Bigg(\prod_{i=1}^{k}\mu_i\Bigg) N\alpha^{(k-1)d}t^{(k-1)d+k}\to\infty$ , then, for all $\varepsilon>0$ ,
\begin{align*}\lim_{N\to\infty}\mathbb{P}((1-\varepsilon)\mathbb{E}[Y_k(t)] \leq Y_k(t)\leq (1+\varepsilon)\mathbb{E}[Y_k(t)])=1.\end{align*}
Remark 3. Lemma 5 in [Reference Foo, Leder and Schweinsberg11] omits the necessary hypothesis $t \leq N^{1/d}/(2\alpha)$ . This hypothesis ensures that a growing ball of mutations cannot begin to wrap around the torus and overlap itself before time t, which is needed for the formula for $\mathbb{E}[\Lambda_{k-1}(t)]$ in [Reference Foo, Leder and Schweinsberg11, (15)] to be exact. This equation is used in the proof of [Reference Foo, Leder and Schweinsberg11, Lemma 5]. Note that the hypothesis $t \leq N^{1/d}/(2\alpha)$ is also needed for [Reference Foo, Leder and Schweinsberg11, Lemma 8], because its proof uses [Reference Foo, Leder and Schweinsberg11, Lemma 5]. However, because it is easily verified that this hypothesis is satisfied in [Reference Foo, Leder and Schweinsberg11] whenever these lemmas are used, all of the main results in [Reference Foo, Leder and Schweinsberg11] are correct without additional hypotheses.
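As a simple consistency check on part (i) of Lemma 5 (this check is ours, not from [Reference Foo, Leder and Schweinsberg11]): for $k=1$ the constant reduces to $\gamma_d d!/(d+1)! = \gamma_d/(d+1)$, so $v_1(t)=\gamma_d\mu_1N\alpha^dt^{d+1}/(d+1)$. This matches the direct first-order computation in which type 1 mutations arrive at rate $\mu_1 N$ and each grows as a ball of radius $\alpha$ times its age, ignoring overlaps between distinct balls:

```latex
\mathbb{E}[Y_1(t)]
  \approx \int_0^t \mu_1 N\,\gamma_d\alpha^d(t-s)^d\,\textrm{d}s
  = \frac{\gamma_d\,\mu_1 N\alpha^d\,t^{d+1}}{d+1}
  = v_1(t).
```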
The next lemma states that if $\mu_1\gg \alpha/N^{(d+1)/d}$ , then $\beta_l$ is much smaller than the time it takes for a mutation to spread to the entire torus.
Lemma 6. Suppose $\mu_1\gg \alpha/N^{(d+1)/d}$ and (2) holds. Then $\beta_l \ll N^{1/d}/\alpha$ for any $l\in \mathbb{N}$ .
Proof. By (2), we have $\mu_1,\ldots,\mu_l\gg \alpha/N^{(d+1)/d}$ . Thus
On the other hand, simplifying,
This proves the lemma.
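Here is a sketch of the computation, again assuming $\beta_l=\big(\big(\prod_{i=1}^l\mu_i\big)N\alpha^{(l-1)d}\big)^{-1/m_l}$ with $m_l\;:\!=\;(l-1)(d+1)+1$. Since $\mu_i\gg\alpha/N^{(d+1)/d}$ for $1\leq i\leq l$,

```latex
\beta_l^{-m_l}
  = \Bigg(\prod_{i=1}^{l}\mu_i\Bigg)N\alpha^{(l-1)d}
  \gg \left(\frac{\alpha}{N^{(d+1)/d}}\right)^{\!l} N\alpha^{(l-1)d}
  = \alpha^{m_l} N^{-m_l/d}
  = \left(\frac{\alpha}{N^{1/d}}\right)^{\!m_l},
```

using $(l-1)d+l=m_l$ and $1-l(d+1)/d=-m_l/d$; taking $m_l$th roots gives $\beta_l\ll N^{1/d}/\alpha$.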
Proof of Theorem 3. In view of Lemma 4, we will choose $(\nu_N)_{N=1}^{\infty}$ and a continuous non-negative function $g_k$ such that $\lim_{N\to \infty}\nu_N\beta_k\mu_k$ exists and $\mathbb{P}\big(B_N^{k-1}(\delta,\varepsilon,g_k,\nu_N)\big)\to 1$ as $N\to \infty$ . We set $\nu_N=1/(\beta_k \mu_k)$ , and, as in the proof of [Reference Foo, Leder and Schweinsberg11, Theorem 10], set
A lengthy calculation shows that $\beta_k\mu_kv_{k-1}(\beta_kt)=g_k(t)$. On the other hand, by the last assumption in the theorem, we have $\mu_k\alpha^d\beta_{k-1}^{d+1}\to 0$, which by Lemma 2 is equivalent to $\mu_k\alpha^d\beta_{k}^{d+1}\to 0$. Because of (2), this implies that $\mu_j\alpha^d(\beta_k t)^{d+1}\to 0$ for all $j\in \{1,\ldots,k\}$. Also, by Lemma 6 we have $\beta_k\ll N^{1/d}/\alpha$, so $\beta_kt\leq N^{1/d}/(2\alpha)$ for all sufficiently large N. Hence the hypotheses of Lemma 5 are satisfied, and by the first result in Lemma 5 applied with $k-1$ in place of k, it follows that $v_{k-1}(\beta_k t)\sim\mathbb{E}[Y_{k-1}(\beta_kt)]$, which implies
Hence, (14) of Lemma 3 is satisfied. A direct calculation gives
which by the second result of Lemma 5 is sufficient to give (13). Therefore, Lemma 3 guarantees that $\mathbb{P}\big(B_N^{k-1}(\delta,\varepsilon,g_k,\nu_N)\big)\to 1$ as $N\to\infty$ . Then, Lemma 4 gives us
completing the proof.
3.4. Proof of Theorem 4
Now we turn to proving Theorem 4, which is a hybrid of Theorems 2 and 3. In particular, we assume that there is some $l\in\mathbb{N}$ such that the mutation rates $\mu_1,\mu_2,\ldots,\mu_l$ fall under the regime of Theorem 3, and all subsequent mutation rates $\mu_{l+1},\ldots,\mu_k$ are large enough that all mutations after the first type l mutation occur quickly, as in Theorem 2.
Proof of Theorem 4. For ease of notation, set, for $j\in\mathbb{N}$ and $t\geq 0$ ,
For $\varepsilon>0$ , we have the inequalities
Taking $N\to \infty$ and using Theorem 3 (noting that $l\geq 2$ ), we have
Since $f_l$ is continuous, the result follows (by taking $\varepsilon\to 0$ ) once we show that, for each fixed $\varepsilon>0$ ,
Notice that because
it suffices to show that, for all $j\in \{l,\ldots,k-1\}$ , $\mathbb{P}(\sigma_{j+1}-\sigma_j>\beta_l\varepsilon)\to 0$ . By Lemma 1, we have
Hence, by the dominated convergence theorem, to show that $\mathbb{P}(\sigma_{j+1}-\sigma_j>\beta_l\varepsilon)\to 0$ , it suffices to show that $\int_{\sigma_j}^{\beta_l\varepsilon+\sigma_j}\mu_{j+1}Y_j(s)\,\textrm{d} s\to \infty$ almost surely. By Lemma 6 we have $\beta_l\ll N^{1/d}/\alpha$ , so $ \beta_l\varepsilon \leq N^{1/d}/(2\alpha)$ for large enough N. That is, $\beta_l\varepsilon$ does not exceed the time it takes for a mutation to wrap around the torus. Hence, we have the lower bound $Y_j(s)\geq \gamma_d\alpha^d (s-\sigma_j)^d$ for $s\in [\sigma_j,\sigma_j+\beta_l\varepsilon]$ , and
By the second assumption in the theorem, we have $\mu_{l+1}\gg 1/\big(\alpha^d\beta_l^{d+1}\big)$, and because of (2) we then have $\mu_{j+1}\gg 1/\big(\alpha^d\beta_l^{d+1}\big)$ for every $j\in\{l,\ldots,k-1\}$. It follows that the right-hand side of (16) tends to infinity as $N\to \infty$, which completes the proof.
4. Proofs of limit theorems for distances between mutations
For each $j\geq 1$ , we define $\sigma_j^{(2)}$ to be the time at which the second type j mutation occurs, i.e.
Note that $\sigma_j^{(2)}$ is defined to be the first time, after time $\sigma_j$ , that a point of $\Pi_j$ lands in a region of type $j-1$ or higher. If this point lands in a region of type $j-1$ , then a new type j ball will begin to grow. If this point lands in a region of type j or higher, then the evolution of the process is unaffected. Also recall from (1) that $\kappa_j\;:\!=\;(\mu_j\alpha^d)^{-1/(d+1)}$ .
4.1. Proof of Theorem 5
We begin with an upper bound for the time between first mutations of consecutive types.
Lemma 7. Assume $\mu_1\gg \alpha/N^{(d+1)/d}$ . Suppose i and j are positive integers. Then, for each fixed $t>0$ , we have, for all sufficiently large N,
Proof. Using Lemma 1, we have
Because $\mu_1\gg \alpha/N^{(d+1)/d}$ and (2) holds, for all sufficiently large N we have $\kappa_{i}t<L/(2\alpha)$. Thus, $Y_j(s)\geq \gamma_d \alpha^d(s-\sigma_j)^d$ for $s\in [\sigma_j,\sigma_j+\kappa_{i}t]$. Then,
This completes the proof.
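The key computation behind this bound can be sketched as follows. By the definition $\kappa_i\;:\!=\;(\mu_i\alpha^d)^{-1/(d+1)}$, i.e. $\kappa_i^{d+1}=1/(\mu_i\alpha^d)$,

```latex
\int_{\sigma_j}^{\sigma_j+\kappa_i t} \mu_{j+1}\,\gamma_d\alpha^d(s-\sigma_j)^d\,\textrm{d}s
  = \frac{\gamma_d}{d+1}\,\mu_{j+1}\alpha^d\,\kappa_i^{d+1}t^{d+1}
  = \frac{\gamma_d}{d+1}\cdot\frac{\mu_{j+1}}{\mu_i}\,t^{d+1},
```

which yields an upper bound of the form $\exp\!\big({-}\tfrac{\gamma_d}{d+1}(\mu_{j+1}/\mu_i)t^{d+1}\big)$, consistent with how the lemma is applied below.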
By Lemma 7, when $\mu_1\gg \alpha/N^{(d+1)/d}$ the interarrival time $\sigma_{j}-\sigma_{j-1}$ is at most the same order of magnitude as $\kappa_{j}$ . Lemma 8 further shows that if, in addition, $\mu_j\ll \mu_{j+1}\ll \mu_{j+2}\ll \cdots$ , then mutations of type $m > j$ appear on an even faster time scale.
Lemma 8. Assume $\mu_1\gg \alpha/N^{(d+1)/d}$ . Suppose j is a positive integer, and $\mu_j\ll \mu_{j+1} \ll \mu_{j+2} \ll \cdots$ . Then $(\sigma_m-\sigma_j)/\kappa_j\to_p 0$ for every $m>j$ .
Proof. Using (2), we get
so it suffices to show that $(\sigma_{i+1} - \sigma_i)/\kappa_i \rightarrow_p 0$ for all $i \in \{j, j+1, \dots, m-1\}$ . Let $\varepsilon>0$ . Using Lemma 7, we have, for all sufficiently large N,
Then, $\mathbb{P}(\sigma_{i+1}-\sigma_i>\kappa_i\varepsilon)\to 0$ because $\mu_{i+1}/\mu_i\to \infty$ , as desired.
Next, we want to show that the balls from different mutation types become nested, as in Fig. 2. That is, for any $i\geq 1$ and $j>i$ , we have $\mathbb{P}(\sigma_j<\sigma_i^{(2)})\to 1$ , meaning that a type j mutation appears before a second type i mutation can appear. We first prove the case when $i=1$ in Lemma 9, assuming the same hypotheses as in Theorem 2.
Lemma 9. Suppose (2) holds, and suppose $\mu_1 \gg \alpha/N^{(d+1)/d}$ and $\mu_2\gg (N\mu_1)^{d+1}/\alpha^d$ . Then:
(i) For all $t>0$ , $\mathbb{P}\big(\sigma_2+\kappa_2t<\sigma_1^{(2)}\big)\to 1$ .
(ii) For every $j\geq 2$ , $\mathbb{P}\big(\sigma_j<\sigma_1^{(2)}\big)\to 1$ .
Proof. To prove (i), let $\varepsilon>0$ . It was shown in the proof of Theorem 2 that $N\mu_1(\sigma_2-\sigma_1)\to_p 0$ . Also, the assumption $\mu_2\gg (N\mu_1)^{d+1}/\alpha^d$ implies $N\mu_1\kappa_2 t\to 0$ . Thus, $N\mu_1(\sigma_2-\sigma_1)+N\mu_1\kappa_2t\to_p 0$ . Therefore, for all sufficiently large N,
On the other hand, because $N\mu_1(\sigma_1^{(2)}-\sigma_1)$ has an $\text{Exponential}(1)$ distribution,
Combining the above, we find
This proves (i). For (ii), by the proof of Theorem 2 again, we have $N\mu_1(\sigma_i-\sigma_{i-1})\to_p 0$ for every $i\geq 2$ . Thus,
and the rest of the proof is essentially the same as that of (i); we just replace $N\mu_1(\sigma_2-\sigma_1)+N\mu_1\kappa_2 t$ with $N\mu_1(\sigma_j-\sigma_1)$ .
In Lemma 10, we establish that $\sigma_{j}^{(2)}-\sigma_{j}>\kappa_{j}\delta$ with high probability. Then, for $k>j$ , since $(\sigma_{k}-\sigma_j)/\kappa_{j}\to_p 0$ by Lemma 8, it will follow that $\sigma_{k}-\sigma_j<\kappa_{j}\delta$ with high probability. It will then follow that $\mathbb{P}\big(\sigma_{k}<\sigma_j^{(2)}\big)\to 1$ , which we show in Lemma 11.
Lemma 10. Suppose (2) holds, and suppose $\mu_1\gg \alpha/N^{(d+1)/d}$ . Let $i \geq 2$ . Define the events $A\;:\!=\;\{\sigma_i-\sigma_{i-1}<\kappa_i t\}$ and $B\;:\!=\;\big\{\sigma_{i}+\kappa_{i}\delta<\sigma_{i-1}^{(2)}\big\}$ . Then, for any fixed $t>0$ and $\delta>0$ , for sufficiently large N we have
Proof. Reasoning as in the proof of Lemma 1, we have
Because $\mu_1\gg \alpha/N^{(d+1)/d}$ and (2) holds, for sufficiently large N we have $\kappa_i(t+\delta)<L/(2\alpha)$. Thus, on the event A, we have $\kappa_i\delta+(\sigma_i-\sigma_{i-1})<L/(2\alpha)$. Therefore, on the event $A \cap B$, for all $s\in [0,\kappa_i\delta]$ we have, for sufficiently large N,
Thus, for sufficiently large N,
It follows from (17) and (18) that, for sufficiently large N,
as claimed.
In Lemma 11 we give sufficient conditions for $\mathbb{P}\big(\sigma_j<\sigma_i^{(2)}\big)\to 1$ for $j>i\geq 1$ , which implies that we obtain nested balls, as in Fig. 2.
Lemma 11. Assume $\mu_1\gg \alpha/N^{(d+1)/d}$ , $\mu_2\gg (N\mu_1)^{d+1}/\alpha^d$ , and $\mu_j\ll \mu_{j+1}$ for $j\geq 2$ . Then, for every $i\geq 1$ and $j>i$ ,
Proof. The result in (19) when $i = 1$ was proved in part (ii) of Lemma 9. To establish the result when $i \geq 2$ , we will show that, for all $i \geq 2$ and all $\varepsilon > 0$ , there exists $\delta > 0$ such that, for sufficiently large N,
Assume for now that $i \geq 2$ , and that (20) and (21) hold. Let $\varepsilon > 0$ , and choose $\delta > 0$ to satisfy (21). Lemma 8 implies that if $j > i$ then $\mathbb{P}(\sigma_j-\sigma_i < \kappa_i \delta) > 1 - \varepsilon$ for sufficiently large N. It follows that
for sufficiently large N, which implies (19).
It remains to prove (20) and (21); we proceed by induction. The result (20) when $i = 2$ is part (i) of Lemma 9. Therefore, it suffices to show that (20) implies (21), and that if (21) holds for some $i \geq 2$ , then (20) holds with $i+1$ in place of i.
To deduce (21) from (20), we first let $\varepsilon > 0$ and use Lemma 7 to choose $t > 0$ large enough that
Then choose $\delta > 0$ small enough that (20) holds with $\varepsilon/3$ in place of $\varepsilon$ for sufficiently large N, and
It now follows from Lemma 10 that, for sufficiently large N,
so (21) holds.
Next, suppose (21) holds for some $i \geq 2$ . Let $\varepsilon > 0$ . By (21), there exists $\delta > 0$ such that, for sufficiently large N,
By Lemma 8 and the fact that $\mu_i\ll \mu_{i+1}$ , we have $(\sigma_{i+1}-\sigma_i)/\kappa_i+\delta(\kappa_{i+1}/\kappa_i)\to_\textrm{p} 0$ . Thus, for sufficiently large N,
and therefore
which is (20) with $i+1$ in place of i.
We now find the limiting distribution of distances between mutations of consecutive types.
Lemma 12. Suppose $\mu_1\gg\alpha/N^{(d+1)/d}$ , $\mu_2\gg (N\mu_1)^{d+1}/\alpha^d$ , and $\mu_j\ll \mu_{j+1}$ for $j\geq 2$ . Then, for all $s>0$ ,
Proof. Define the event $A\;:\!=\;\bigcap_{i=1}^{j}\{\sigma_{j+1}<\sigma_i^{(2)}\}$ . On the event A, the first type $j+1$ mutation appears before the second mutation of any type $i \in \{1, \dots, j\}$ . By Lemma 11, we have $\mathbb{P}(A)\to 1$ . As a result, it will be sufficient for us to consider a modified version of our process in which, for $i \in \{1, \dots, j\}$ , only the first type i mutation is permitted to occur. Note that this modified process can be constructed from the same sequence of independent Poisson processes $(\Pi_i)_{i=1}^{\infty}$ as the original process. However, in the modified process, all points of $\Pi_i$ after time $\sigma_i$ are disregarded. On the event A, the first $j+1$ mutations will occur at exactly the same times and locations in the original process as in the modified process. Therefore, because $\mathbb{P}(A) \rightarrow 1$ , it suffices to prove (22) for this modified process. For the rest of the proof we will work with this modified process, which makes exact calculations possible.
Let $K\in (s,\infty)$ be a constant which does not depend on N. Our assumptions imply that $\mu_{j+1}\gg\mu_1\gg\alpha/N^{(d+1)/d}$ . Thus, there is an $N_K$ such that for $N \geq N_K$ we have $\kappa_{j+1}t< L/(2\alpha)$ for all $t\in[0,K]$ . It follows that $Y_{j}(s)=\gamma_d\alpha^d(s-\sigma_j)^d$ for $s\in [\sigma_j,\sigma_j+\kappa_{j+1}K]$ . Therefore, reasoning as in the proof of Lemma 7, we get
It follows that, for $N \geq N_K$ , the probability density of $(\sigma_{j+1}-\sigma_j)/\kappa_{j+1}$ restricted to [0, K] is
For $N \geq N_K$ and $t \in [0,K]$ , conditional on the event $\{\sigma_{j+1}-\sigma_j=\kappa_{j+1}t\}$ , the location of the first type $j+1$ mutation is a uniformly random point on a d-dimensional ball of radius $\alpha\kappa_{j+1}t$ , which means
It follows that, for $N \geq N_K$ ,
Because (23) implies that
the result (22) follows by letting $N \rightarrow \infty$ and then $K \rightarrow \infty$ .
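In the modified process, $Y_j(s)=\gamma_d\alpha^d(s-\sigma_j)^d$ exactly, so the rescaled waiting time $(\sigma_{j+1}-\sigma_j)/\kappa_{j+1}$ has survival function $\exp\!\big({-}\gamma_dt^{d+1}/(d+1)\big)$ on [0, K], and hence density $\gamma_dt^de^{-\gamma_dt^{d+1}/(d+1)}$ there. As a quick numerical sanity check (ours, not part of the paper), the following sketch verifies that this formula defines a probability density on $(0,\infty)$ for $d\in\{1,2,3\}$, with $\gamma_d$ the volume of the unit ball in $\mathbb{R}^d$:

```python
import math

def unit_ball_volume(d):
    # gamma_d: volume of the unit ball in R^d
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)

def density(t, d):
    # Candidate limiting density of (sigma_{j+1} - sigma_j) / kappa_{j+1}
    g = unit_ball_volume(d)
    return g * t ** d * math.exp(-g * t ** (d + 1) / (d + 1))

def integral(d, upper=12.0, n=200_000):
    # Trapezoidal rule; the tail beyond `upper` is negligible here
    h = upper / n
    total = 0.5 * (density(0.0, d) + density(upper, d))
    for i in range(1, n):
        total += density(i * h, d)
    return total * h

for d in (1, 2, 3):
    print(f"d = {d}: integral = {integral(d):.6f}")  # close to 1 in each case
```

The substitution $u=\gamma_dt^{d+1}/(d+1)$ shows analytically that each integral equals 1, which the numerical values confirm up to discretization error.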
Proof of Theorem 5. Lemma 12 proves the case when $k=j+1$ , so assume that $k\geq j+2$ . The triangle inequality implies that
Suppose $j+2 \leq i \leq k$ . We know from Lemma 12 that $D_i/(\alpha \kappa_i)$ converges in distribution to a non-degenerate random variable as $N \rightarrow \infty$ . Because $\kappa_{j+1}/\kappa_i \rightarrow \infty$ by the assumption in (9), it follows that $D_i/(\alpha \kappa_{j+1}) \to_\textrm{p} 0$ . Therefore,
Thus, Theorem 5 follows from Lemma 12 and Slutsky’s theorem.
4.2. Proof of Theorem 6
Having found a limiting distribution for distances between mutations in the setting of Theorem 2, we now prove a similar result in the setting of Theorem 4, where once the first type l mutation appears, all subsequent mutations appear in nested balls.
We begin with a result that bounds $\sigma_l^{(2)}-\sigma_l$ away from zero with high probability, on the time scale $\beta_l$ .
Lemma 13. Assume the same hypotheses as Theorem 4. Then, for all $\varepsilon>0$ , there is $r>0$ such that $\liminf_{N\to\infty}\mathbb{P}\big(\sigma_l^{(2)}-\sigma_l>\beta_l r\big)>1-\varepsilon$ .
Proof. Let $\varepsilon>0$ . Using Theorem 3, choose a large $t>0$ so that
Now set, as in the proof of Theorem 3,
It is clear that we can choose a small $r>0$ so that
Having chosen $t>0$ and $r>0$ , choose $\delta>0$ so that $[t,t+r]\subseteq [\delta,\delta^{-1}]$ . Then, for any $\lambda>0$ , define, as in Lemma 4, the event
Now we calculate
Because $Y_{l-1}(s)$ is monotone increasing in s, on the event $\{\sigma_l\leq \beta_l t\}\cap B$ we have
Using the above and (26), we have
Now take $N\to\infty$ . Using that $\mathbb{P}(B)\to 1$ as shown in the proof of Theorem 3, and using (24), we have
Since $\lambda>0$ is arbitrary, (25) implies $\liminf_{N\to\infty}\mathbb{P}\big(\sigma_l^{(2)}-\sigma_l>\beta_l r\big) > (1-\varepsilon/2)^2 > 1-\varepsilon$ , completing the proof.
Using Lemma 13, we prove an analog of Lemma 9 in the setting of Theorem 4.
Lemma 14. Assume the same hypotheses as Theorem 4. Then:
(i) For all $t>0$ , $\mathbb{P}\big(\sigma_{l+1}+\kappa_{l+1}t<\sigma_{l}^{(2)}\big)\to 1$ .
(ii) For every $k\geq l+1$ , $\mathbb{P}\big(\sigma_{k}<\sigma_l^{(2)}\big)\to 1$ .
Proof. Let $\varepsilon>0$ . Lemma 13 implies that there is $r>0$ such that, for sufficiently large N,
Now note that $(\sigma_{l+1}-\sigma_l)/\beta_l\to_p 0$ by (15) in the proof of Theorem 4. Also, our assumption that $\mu_{l+1}\gg 1/\big(\alpha^d\beta_l^{d+1}\big)$ is equivalent to $\kappa_{l+1}\ll \beta_l$. It follows that, for sufficiently large N, $\mathbb{P}((\sigma_{l+1}-\sigma_l+\kappa_{l+1}t)/\beta_l<r)>1-\varepsilon$. This estimate, combined with (27), implies that $\mathbb{P}\big(\sigma_{l+1}+\kappa_{l+1}t<\sigma_l^{(2)}\big)>1-2\varepsilon$ for sufficiently large N. This proves the first statement. The second statement is proved similarly, using instead that $(\sigma_k-\sigma_l)/\beta_l\to_\textrm{p} 0$ by (15).
At this point, we have proved that, for $k>l$ , the first type k mutation occurs before $\sigma_l^{(2)}$ with probability tending to 1 as $N\to \infty$ . This implies that in the setting of Theorem 4, we can disregard the type $1,\ldots,l-1$ mutations and regard the first type l mutation as the first type 1 mutation, and then prove Theorem 6 by following the same argument used to prove Theorem 5.
Proof of Theorem 6. Relabel the type $l,l+1,l+2,\ldots$ mutations as type $1,2,3,\ldots$ mutations, and repeat the arguments in Lemmas 7–12 and in the proof of Theorem 5. The only difference is that we have to apply Lemma 14 instead of Lemma 9. Note that type l mutations do not appear at the same rate as type 1 mutations, so we needed a different technique to establish $\mathbb{P}\big(\sigma_j<\sigma_l^{(2)}\big)\to 1$ for $j>l$ .
Acknowledgements
The authors thank Jasmine Foo for suggesting the problem of looking at the case of increasing mutation rates, and bringing to their attention the references [Reference Loeb and Loeb16, Reference Prindle, Fox and Loeb22].
Funding information
JS was supported in part by NSF Grant DMS-1707953.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.