1. Introduction
A key component of any epidemic model is the assumption made concerning transmission of infection between individuals. In almost all epidemic models it is assumed that infection spreads via interactions of pairs of individuals, one of whom is infective and the other susceptible. In some epidemic models, such as network models (e.g. Newman [Reference Newman15]), this assumption is explicit, whereas in others, such as the so-called general stochastic epidemic (e.g. Bailey [Reference Bailey1, Chapter 6]) and many deterministic models, it is implicit. In the general stochastic epidemic, the process of the numbers of susceptible and infective individuals, $\{(S(t), I(t))\,:\,t \ge 0\}$ , is modelled as a continuous-time Markov chain with infinitesimal transition probabilities
\begin{eqnarray}{\mathbb{P}}\big((S(t+\Delta t), I(t+\Delta t)) = (s-1,i+1) \,\vert\, (S(t),I(t))=(s,i)\big) &=& \beta s i \Delta t + o(\Delta t), \nonumber\\ {\mathbb{P}}\big((S(t+\Delta t), I(t+\Delta t)) = (s,i-1) \,\vert\, (S(t),I(t))=(s,i)\big) &=& \gamma i \Delta t + o(\Delta t), \nonumber\\ {\mathbb{P}}\big((S(t+\Delta t), I(t+\Delta t)) = (s,i) \,\vert\, (S(t),I(t))=(s,i)\big) &=& 1 - (\beta s + \gamma) i \Delta t + o(\Delta t), \nonumber\end{eqnarray}
and with all other transitions having probability $o(\Delta t)$ . Here, $\beta$ is the individual-to-individual infection rate and $\gamma$ is the recovery rate. However, it is probabilistically equivalent to a model in which the infectious periods of infectives follow independent exponential random variables having mean $\gamma^{-1}$ and contacts between distinct pairs of individuals occur at the points of independent Poisson processes, each having rate $\beta$ .
In real-life epidemics, people often meet in groups of size larger than two; in many countries, one of the most significant control measures in the COVID-19 pandemic was to impose limits on the size of gatherings outside of the home. In Cortez [Reference Cortez8] and Ball and Neal [Reference Ball and Neal5], the authors independently introduced a new class of SIR (susceptible $\to$ infective $\to$ recovered) epidemic model, in which mixing events occur at the points of a Poisson process, with the sizes of successive mixing events being independently distributed according to a random variable having support contained in $\{2,3,\dots, n\}$ , where n is the population size. Mixing events are instantaneous, and at a mixing event of size c, each infective present contacts each susceptible present independently with probability $\pi_c$ ; a susceptible becomes infected if they are contacted by at least one infective. Such an infected susceptible immediately becomes infective, although they cannot infect other susceptibles at the same mixing event, and remains so for a time that follows an exponential distribution with mean $\gamma^{-1}$ . In Cortez [Reference Cortez8] and Ball and Neal [Reference Ball and Neal5], the temporal behaviour of epidemics with many initial infectives is studied, with [Reference Cortez8] considering the mean-field limit of the stochastic epidemic process. In Ball and Neal [Reference Ball and Neal5], the focus is on a functional central limit theorem for the temporal behaviour of epidemics with many initial infectives and on central limit theorems for the final size of (i) an epidemic with many initial infectives and (ii) an epidemic with few initial infectives that becomes established and leads to a major outbreak. A branching process which approximates the early stages of an epidemic with few initial infectives is described in [Reference Ball and Neal5], though no rigorous justification is provided. 
A key result required in the proof of the central limit theorem for the final size in Case (ii) above is that there exists $\delta\gt 0$ (which depends on model parameters) such that the probability that an epidemic infects at least a fraction $\delta$ of the population, given that it infects at least $\log n$ individuals, converges to one as the population size $n \to \infty$. This result is stated without proof in [Reference Ball and Neal5]. The aim of the present paper is to fill both gaps, namely the rigorous justification of the branching process approximation and the proof of this conditional result, for a model that allows more general transmission of infection at mixing events than that considered in [Reference Ball and Neal5].
Approximation of the process of infectives in an epidemic model by a branching process has a long history that goes back to the pioneering work of Bartlett [Reference Bartlett7, pp. 147–148] and Kendall [Reference Kendall12], who considered approximation of the number of infectives in the general stochastic epidemic by a linear birth-and-death process, with birth rate $\beta N$ and death rate $\gamma$ , where N is the initial number of susceptibles. This leads to the celebrated threshold theorem (Whittle [Reference Whittle17] and Williams [Reference Williams18]), arguably the most important result in mathematical epidemic theory. The approximation was made fully rigorous by Ball [Reference Ball2] (cf. Metz [Reference Metz13]), who defined realisations of the general stochastic epidemic, indexed by N, with the Nth epidemic having infection rate $\beta N^{-1}$ and recovery rate $\gamma$ , and the limiting birth-and-death process on a common probability space and used a coupling argument to prove almost sure convergence, as $N \to \infty$ , of the epidemic process to the limiting branching process over any finite time interval [0, t]. The method was extended by Ball and Donnelly [Reference Ball and Donnelly3] to show almost sure convergence over suitable intervals $[0, t_N]$ , where $t_N \to \infty$ as $N \to \infty$ .
The key idea of Ball [Reference Ball2] is to construct a realisation of the epidemic process for each N from the same realisation of the limiting branching process. Moreover, this coupling is done on an individual basis, in that the behaviour of an infective in the Nth epidemic model is derived from the behaviour of a corresponding individual in the branching process. The method is very powerful and applicable to a broad range of epidemic models. However, it cannot be easily applied to epidemics with mixing groups, because the mixing groups induce dependencies between different infectives. Thus instead, we generalise the method of Ball and O’Neill [Reference Ball and O’Neill6], which involves constructing sample paths of the epidemic process, indexed by the population size n, and the limiting branching process (more precisely, the numbers of infectives in the epidemic processes and the number of individuals in the branching process) via a sequence of independent and identically distributed (i.i.d.) random vectors. The generalisation is far from straightforward, since Ball and O’Neill [Reference Ball and O’Neill6] consider only epidemics in which the number of infectives changes in steps of size 1, as in the general stochastic epidemic, whereas in the model with mixing events, although the number of infectives can only decrease in steps of size 1, it can increase in steps of any size not greater than the population size n. We improve on the coupling given in [Reference Ball and O’Neill6] by coupling the time of events in the limiting branching process and epidemic processes, so that the event times agree with high probability, tending to 1 as the population size $n \to \infty$ , rather than having the event times in the epidemic processes converge in the limit, as the population size $n \to \infty$ , to the event times of the branching process. 
Finally, we use the coupling to prove the above-mentioned result concerning epidemics of size at least $\log n$ , which was not addressed in [Reference Ball and O’Neill6].
The remainder of the paper is structured as follows. The model with mixing groups $\mathcal{E}^{(n)}$ is defined in Section 2. The approximating branching process $\mathcal{B}$ and the main results of the paper are given in Section 3. The branching process $\mathcal{B}$ is described in Section 3.1, where some of its basic properties are presented. The offspring mean of $\mathcal{B}$ yields the basic reproduction number $R_0$ of the epidemic $\mathcal{E}^{(n)}$ . The extinction probability and Malthusian parameter of $\mathcal{B}$ are derived. The main results of the paper are collected together in Section 3.2. Theorem 3.1 shows that the number of infectives in the epidemic process $\mathcal{E}^{(n)}$ converges almost surely to the number of individuals alive in the branching process $\mathcal{B}$ on $[0,t_n)$ as $n \to \infty$ , where $t_n = \infty$ in the case the branching process goes extinct and $t_n = \rho \log n$ for some $\rho\gt 0$ otherwise. A major outbreak is defined as one that infects at least $\log n$ individuals. Theorem 3.2(a) shows that the probability of a major outbreak converges to the survival probability of $\mathcal{B}$ as $n \to \infty$ . Theorem 3.2(b) shows that if $R_0\gt 1$ , so a major outbreak occurs with non-zero probability in the limit $n \to \infty$ , then there exists $\delta\gt 0$ such that the probability that a major outbreak infects at least a fraction $\delta$ of the population tends to one as $n \to \infty$ . Moreover, we show that there exists $\delta^\prime \gt 0$ such that the fraction of the population infectious at the peak of the epidemic exceeds $\delta^\prime$ with probability tending to one as $n \to \infty$ . The proofs of Theorems 3.1 and 3.2 are given in Sections 4 and 5, respectively. Brief concluding comments are given in Section 6.
2. Model
We consider the spread of an SIR epidemic in a closed population of n individuals, with infection spread via mixing events which occur at the points of a homogeneous Poisson process having rate $n\lambda$. The sizes of mixing events are i.i.d. according to a random variable $C^{(n)}$ having support contained in $\{2,3,\dots,n\}$. If a mixing event has size c then it is formed by choosing c individuals uniformly at random from the population without replacement. Suppose that a mixing event of size c involves i susceptible and j infective individuals, and hence $c-i-j$ recovered individuals. Then the probability that w new infectives are created at the event is $\pi_c(w;\,i,j)$. The only restrictions we impose on $\pi_c (w;\,i,j)$ are the natural ones that, for $w \gt 0$, $\pi_c (w;\,i,0) =0$: infections can only occur at a mixing event if there is at least one infective present; and, for $w \gt i$, $\pi_c (w;\,i,j) =0$: the maximum number of new infectives created at a mixing event is the number of susceptibles involved in the event. Mixing events are assumed to be instantaneous. The infectious periods of infectives follow independent ${\rm Exp}(\gamma)$ random variables, i.e. exponential random variables having rate $\gamma$ and hence mean $\gamma^{-1}$. There is no latent period, so newly infected individuals are immediately able to infect other individuals. (The possibility of their being able to infect other susceptibles during the mixing event at which they were infected can be incorporated into the $\pi_c(w;\,i,j)$.) All processes and random variables in the above model are mutually independent. The epidemic starts at time $t=0$ with $m_n$ infective and $n-m_n$ susceptible individuals, and terminates when there is no infective left in the population. Denote this epidemic model by $\mathcal{E}^{(n)}$.
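The dynamics just described can be simulated event by event. The following is a minimal sketch rather than the authors' implementation: it assumes that every mixing event has the same fixed size c and that transmission within an event is of the Reed–Frost type with a single contact probability `pi_c`; the function name and its arguments are illustrative only.

```python
import random


def simulate_epidemic(n, m, lam, gamma, c, pi_c, seed=0):
    """Simulate one path of the mixing-event SIR epidemic.

    Assumptions (not from the text): every mixing event has the same size c,
    and infection within an event is Reed-Frost with contact probability pi_c.
    Returns (final_size, extinction_time); final_size counts the m initial
    infectives.
    """
    rng = random.Random(seed)
    s, i = n - m, m            # susceptibles and infectives; recovered = n - s - i
    t = 0.0
    while i > 0:
        rate_mix = n * lam     # mixing events occur at rate n * lambda
        rate_rec = gamma * i   # each infective recovers at rate gamma
        total = rate_mix + rate_rec
        t += rng.expovariate(total)
        if rng.random() < rate_rec / total:
            i -= 1             # recovery event
            continue
        # Mixing event: sample c individuals without replacement. Only the
        # counts matter, so label 0..s-1 susceptible and s..s+i-1 infective.
        group = rng.sample(range(n), c)
        sus = sum(1 for g in group if g < s)
        inf = sum(1 for g in group if s <= g < s + i)
        if sus == 0 or inf == 0:
            continue
        # A susceptible present escapes all `inf` infectives w.p. (1 - pi_c)^inf.
        p_infected = 1.0 - (1.0 - pi_c) ** inf
        w = sum(1 for _ in range(sus) if rng.random() < p_infected)
        s -= w                 # w new infectives are created at this event
        i += w
    return n - s, t
```

Because individuals of the same type are exchangeable, tracking the counts (s, i) is enough; a general $\pi_c(w;\,i,j)$ could be accommodated by replacing the binomial draw of `w` with a draw from that distribution.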
2.1. Special cases
2.1.1. General stochastic epidemic
If all mixing groups have size 2, i.e. ${\mathbb{P}}(C^{(n)}=2)=1$, and $\pi_2 (0;\,1,1)=\pi_2 (1;\,1,1)=\frac{1}{2}$, then the model reduces to the general stochastic epidemic, with individual-to-individual infection rate $\beta=\frac{\lambda}{n-1}$ and recovery rate $\gamma$.
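As a check on the stated rate, the following short calculation (a sketch using only the definitions of Section 2) recovers $\beta$. Mixing events occur at rate $n\lambda$, each event selects one of the $\binom{n}{2}$ pairs uniformly at random, and a pair consisting of one susceptible and one infective results in an infection with probability $\frac{1}{2}$, so the rate at which a given infective infects a given susceptible is
\begin{equation}\beta = n\lambda \times \binom{n}{2}^{-1} \times \frac{1}{2} = n\lambda \times \frac{2}{n(n-1)} \times \frac{1}{2} = \frac{\lambda}{n-1}. \nonumber\end{equation}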
2.1.2. Binomial sampling
The models studied in Cortez [Reference Cortez8] and Ball and Neal [Reference Ball and Neal5] make the Reed–Frost-type assumption that at a mixing event of size c, each infective present has probability $\pi_c$ of making an infectious contact with any given susceptible present, with all such contacts being independent. This corresponds to
\begin{equation}\pi_c(w;\,i,j) = \binom{i}{w} \big\{1-(1-\pi_c)^j\big\}^{w} (1-\pi_c)^{j(i-w)} \qquad (w=0,1,\dots,i). \nonumber\end{equation}
3. Approximating branching process and main results
3.1. Approximating branching process
We approximate the process of infectives in the early stages of the epidemic $\mathcal{E}^{(n)}$ by a branching process $\mathcal{B}$ , which assumes that every mixing event which includes at least one infective consists of a single infective in an otherwise susceptible group. In the epidemic $\mathcal{E}^{(n)}$ , the probability that a given mixing event of size c involves a specified individual, $i_*$ say, is $\frac{c}{n}$ , so mixing events that include $i_*$ occur at rate
\begin{equation}n\lambda \sum_{c=2}^{n} p^{(n)}_C(c)\, \frac{c}{n} = \lambda \mu^{(n)}_C, \nonumber\end{equation}
where $p^{(n)}_C(c) = {\mathbb{P}} \big(C^{(n)} =c\big)$ $(c=2,3,\ldots, n)$ and $\mu^{(n)}_C={\mathbb{E}}[C^{(n)}]$ . Furthermore, the probability that a given mixing event is of size c given that it includes $i_*$ is
\begin{equation}\frac{p^{(n)}_C(c)\, c/n}{\sum_{c'=2}^{n} p^{(n)}_C(c')\, c'/n} = \frac{c\, p^{(n)}_C(c)}{\mu^{(n)}_C} \qquad (c=2,3,\dots,n). \nonumber\end{equation}
Suppose that $C^{(n)} \stackrel{{\rm D}}{\longrightarrow} C$ as $n \to \infty$ (where $\stackrel{{\rm D}}{\longrightarrow}$ denotes convergence in distribution), $p_C(c)={\mathbb{P}}(C=c)$ $(c=2,3,\dots)$ , and $\mu^{(n)}_C \to \mu_C=\sum_{c=2}^{\infty} cp_C(c)$ , which we assume to be finite. Thus in the limit as $n \to \infty$ , mixing events involving $i_*$ occur at rate $\lambda \mu_C$ , and the size of such a mixing event is distributed according to $\tilde{C}$ , the size-biased version of C, having probability mass function
(3.2) \begin{equation}p_{\tilde{C}}(c) = {\mathbb{P}}\big(\tilde{C}=c\big) = \frac{c\, p_C(c)}{\mu_C} \qquad (c=2,3,\dots).\end{equation}
We assume that the initial number of infectives $m_n=m$ for all sufficiently large n, so the branching process $\mathcal{B}$ has m ancestors.
In $\mathcal{B}$ , a typical individual, $i_*$ say, has lifetime $L \sim {\rm Exp}(\gamma)$ , during which they have birth events at rate $\lambda \mu_C$ . Let $\tilde{Z}_1, \tilde{Z}_2, \dots$ denote the number of offspring $i_*$ has at successive birth events. A birth event corresponds to a mixing event involving a single infective in an otherwise susceptible group in the epidemic. Thus, $\tilde{Z}_1, \tilde{Z}_2, \dots$ are i.i.d. copies of a random variable $\tilde{Z}$ , with ${\mathbb{P}}(\tilde{Z}=w)=\varphi_w$ $(w=0,1,\dots)$ , where
(3.3) \begin{equation}\varphi_w = \sum_{c=2}^{\infty} {\mathbb{P}}\big(\tilde{C}=c\big)\, \pi_c(w;\,c-1,1) = \frac{1}{\mu_C} \sum_{c=2}^{\infty} c\, p_C(c)\, \pi_c(w;\,c-1,1),\end{equation}
using (3.2). Note that an individual may produce no offspring at a birth event. The number of birth events a typical individual has during their lifetime, G say, has the geometric distribution
(3.4) \begin{equation}{\mathbb{P}}(G=k) = \frac{\gamma}{\gamma+\lambda \mu_C} \left(\frac{\lambda \mu_C}{\gamma+\lambda \mu_C}\right)^{k} \qquad (k=0,1,\dots).\end{equation}
Let R be the total number of offspring a typical individual has during their lifetime. Then
(3.5) \begin{equation}R = \sum_{i=1}^{G} \tilde{Z}_i,\end{equation}
where $G, \tilde{Z}_1, \tilde{Z}_2, \dots$ are independent and the sum is zero if $G=0$ .
The basic reproduction number $R_0={\mathbb{E}}[R]$ . Hence, using (3.5) and (3.4),
where
is the mean number of new infectives generated in a mixing event of size c with one infective and $c-1$ susceptibles. Again using (3.5) and (3.4), the offspring probability generating function for the branching process $\mathcal{B}$ is
where $f_{\tilde{Z}}(s)=\sum_{w=0}^{\infty} \varphi_w s^w$ . By standard branching process theory, the extinction probability z of $\mathcal{B}$ , given that initially there is one individual, is given by the smallest solution in [0, 1] of $f_R(s)=s$ . Furthermore, $z\lt 1$ if and only if $R_0\gt 1$ .
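For reference, the expressions for $R_0$ and the offspring probability generating function can be recovered from the definitions above as follows. This is a reconstruction from the surrounding text rather than a quotation; in particular, the symbol $\nu_c$ for the mean number of new infectives in a mixing event of size c with one infective and $c-1$ susceptibles, and the symbol $f_R$ for the probability generating function of R, are notation assumed here. Since G is independent of $\tilde{Z}_1, \tilde{Z}_2, \dots$ and is geometric with ${\mathbb{E}}[G] = \lambda \mu_C / \gamma$, conditioning on G gives
\begin{eqnarray}R_0 &=& {\mathbb{E}}[G]\, {\mathbb{E}}[\tilde{Z}] = \frac{\lambda \mu_C}{\gamma} \sum_{w=0}^{\infty} w \varphi_w = \frac{\lambda}{\gamma} \sum_{c=2}^{\infty} c\, p_C(c)\, \nu_c, \qquad \nu_c = \sum_{w=1}^{c-1} w\, \pi_c(w;\,c-1,1), \nonumber\\ f_R(s) &=& {\mathbb{E}}\big[ f_{\tilde{Z}}(s)^{G} \big] = \frac{\gamma}{\gamma + \lambda \mu_C \big(1 - f_{\tilde{Z}}(s)\big)} \qquad (0 \le s \le 1). \nonumber\end{eqnarray}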
Let r denote the Malthusian parameter of $\mathcal{B}$; see Jagers [Reference Jagers11, p. 10] for details. The mean rate at which an individual produces offspring t time units after their birth is ${\mathbb{P}}(L\gt t) \lambda \mu_C {\mathbb{E}}[\tilde{Z}_1]=\gamma {\rm e}^{-\gamma t} R_0$ $(t \gt 0)$, so r is the unique solution in $(-\gamma, \infty)$ of
\begin{equation}\int_0^{\infty} {\rm e}^{-rt}\, \gamma {\rm e}^{-\gamma t} R_0 \,{\rm d}t = 1, \nonumber\end{equation}
whence
\begin{equation}r = \gamma (R_0 - 1). \nonumber\end{equation}
Note that r depends on the parameters of the epidemic model only through $(R_0, \gamma)$ . Thus, if $R_0$ and $\gamma$ are held fixed, then the Malthusian parameter is the same for all corresponding choices of the distribution of C and $\{\pi_c(w;\,i,j)\}$ . In particular, under these conditions, the early exponential growth of an epidemic that takes off is the same as that of the general stochastic epidemic.
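The three summary quantities of $\mathcal{B}$ can be computed numerically. The sketch below is illustrative only and rests on stated assumptions: C has finite support supplied as a dictionary `p_C`, transmission is binomial (Section 2.1.2) with a single contact probability `pi` for all event sizes, the number of birth events per lifetime is geometric so that $f_R(s) = \gamma / \{\gamma + \lambda \mu_C (1 - f_{\tilde{Z}}(s))\}$, and $r = \gamma(R_0 - 1)$. The function name and signature are hypothetical.

```python
def branching_summary(p_C, pi, lam, gamma, tol=1e-12, max_iter=100000):
    """Return (R0, z, r) for the approximating branching process.

    p_C : dict mapping event size c -> P(C = c) (finite support, assumed)
    pi  : per-pair contact probability, assumed the same for every size c
    """
    mu_C = sum(c * p for c, p in p_C.items())
    beta = lam * mu_C                                    # birth-event rate
    p_tilde = {c: c * p / mu_C for c, p in p_C.items()}  # size-biased pmf

    # With one infective among c - 1 susceptibles, Z~ is Binomial(c - 1, pi).
    mean_Z = sum(pt * (c - 1) * pi for c, pt in p_tilde.items())
    R0 = beta * mean_Z / gamma

    def f_Z(s):
        return sum(pt * (1 - pi + pi * s) ** (c - 1) for c, pt in p_tilde.items())

    def f_R(s):
        return gamma / (gamma + beta * (1 - f_Z(s)))

    # Iterating f_R from 0 increases monotonically to the smallest fixed point.
    z = 0.0
    for _ in range(max_iter):
        z_new = f_R(z)
        if abs(z_new - z) < tol:
            z = z_new
            break
        z = z_new

    r = gamma * (R0 - 1)                                 # Malthusian parameter
    return R0, z, r
```

For pairwise mixing with certain transmission (`p_C={2: 1.0}`, `pi=1.0`, `lam=1.0`, `gamma=1.0`) the branching process is a linear birth-and-death process with birth rate 2 and death rate 1, and the sketch returns $R_0=2$, $z \approx 1/2$ and $r=1$, as expected.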
3.2. Strong convergence of epidemic processes
In this section we consider a sequence of epidemics $(\mathcal{E}^{(n)})$ , in which $m_n=m$ for all sufficiently large n, and state results concerned with convergence of the process of infectives in the epidemic process $\mathcal{E}^{(n)}$ to the branching process $\mathcal{B}$ as $n \to \infty$ that are proved in Section 4. The usual approach to proving such results is based upon that of Ball [Reference Ball2] and Ball and Donnelly [Reference Ball and Donnelly3], in which the sample paths of the epidemic process for each n are constructed from those of the limiting branching process, $\mathcal{B}$ . As noted in the introduction, that approach is not easily implemented in the present setting, because the mixing groups induce dependencies between different infectives. We therefore generalise the method in Ball and O’Neill [Reference Ball and O’Neill6] and construct sample paths of the epidemic processes and the limiting branching process, $\mathcal{B}$ , from a sequence of i.i.d. random vectors defined on an underlying probability space $(\Omega, \mathcal{F}, {\mathbb{P}})$ . The construction is described in Section 4.
For $t \ge 0$ , let $S^{(n)}(t)$ and $I^{(n)}(t)$ be the numbers of susceptibles and infectives, respectively, at time t in $\mathcal{E}^{(n)}$ . Let $T^{(n)}=n-S^{(n)}(\infty)$ be the total size of the epidemic $\mathcal{E}^{(n)}$ , i.e. the total number of individuals infected during its course, including the initial infectives. For $t \geq 0$ , let I(t) be the number of individuals alive at time t in $\mathcal{B}$ , and let T be the total size of the branching process $\mathcal{B}$ , including the m ancestors. Note that whereas $T^{(n)}(\omega)\lt\infty$ for all $\omega \in \Omega$ , $T(\omega)=\infty$ if the branching process $\mathcal{B}(\omega)$ does not go extinct.
Throughout the remainder of the paper we assume that $m_n =m$ and $\mu^{(n)}_C \leq \mu_C$ for all sufficiently large n. The assumption $\mu^{(n)}_C \leq \mu_C$ simplifies the presentation of certain results, in particular, Lemma 4.2, and holds in the most common cases: (i) C has finite support $\{2,3,\ldots, n_0\}$ , and for $n \geq n_0$ , $C^{(n)} = C$ ; (ii) $C^{(n)} = \min \{ C, n \}$ ; and (iii) $C^{(n)} \stackrel{D}{=} C | C\leq n$ . We also assume throughout that $C^{(n)} \stackrel{{\rm D}}{\longrightarrow} C$ and ${\mathbb{E}}[(C^{(n)})^2] \to {\mathbb{E}}[C^2] \lt\infty$ as $n \to \infty$ . For Theorem 3.1(b) and Theorem 3.2, we require additional conditions on $C^{(n)}$ and C, namely that
and that there exists $\theta_0 \gt 0$ such that
Note that ${\mathbb{E}}[\big(C^{(n)}\big)^3] \to {\mathbb{E}}[C^3] \lt\infty$ as $n \to \infty$ is a sufficient condition for (3.9) to hold. Also, in the three common cases listed above for constructing $C^{(n)}$ from C, (3.10) holds for any $0 \lt \theta_0 \lt \alpha$ , for which ${\mathbb{E}} [C^{1 + \alpha}] \lt \infty$ . (For Case (i), this is immediate. For Cases (ii) and (iii), the proof is similar to that of (A1) in the Supplementary Information of Ball and Neal [Reference Ball and Neal5].)
Theorem 3.1. Under the stated conditions on $C^{(n)}$ , there exists a probability space $(\Omega, \mathcal{F}, {\mathbb{P}})$ on which are defined a sequence of epidemic models, $\mathcal{E}^{(n)}$ , indexed by n, and the approximating branching process, $\mathcal{B}$ , with the following properties.
Denote by $A_{ext}$ the set on which the branching process $\mathcal{B}$ becomes extinct:
\begin{eqnarray}A_{ext} = \{\omega \in \Omega \,:\, T(\omega) \lt \infty\}. \nonumber\end{eqnarray}
(a) Then, as $n \rightarrow \infty$,
\begin{eqnarray}\sup_{0 \leq t \lt \infty} \big| I^{(n)} (t) - I (t) \big| \rightarrow 0 \nonumber\end{eqnarray}
for ${\mathbb{P}}$-almost all $\omega \in A_{ext}$.
(b) Suppose that (3.9) holds and (3.10) holds for some $\theta_0 \gt 0$. Then, if there exists $\alpha \geq 1$ such that ${\mathbb{E}} [C^{\alpha +1}] \lt \infty$, we have for
(3.11) \begin{eqnarray} 0 \lt \rho \lt \frac{1}{r} \min \left\{ \frac{\alpha \theta_0}{2 (1+\alpha)}, \frac{\alpha}{2 + 4 \alpha} \right\},\end{eqnarray}
as $n \rightarrow \infty$,
(3.12) \begin{eqnarray} \sup_{0 \leq t \leq \rho \log n} \big| I^{(n)} (t) - I (t) \big| \rightarrow 0\end{eqnarray}
for ${\mathbb{P}}$-almost all $\omega \in A_{ext}^c$.
The proof of Theorem 3.1 is presented in Section 4.
Note that $\rho$ given in (3.11) satisfies $\rho \lt (4r)^{-1}$ , and thus Theorem 3.1(b) is weaker than [Reference Ball and Donnelly3, Theorem 2.1, (2.2)], where (3.12) is shown to hold for $\rho \lt (2r)^{-1}$ in the standard pairwise mixing epidemic model. The following corollary of Theorem 3.1 concerns the final size of the epidemic.
Corollary 3.1. For $(\Omega, \mathcal{F}, {\mathbb{P}})$ defined in Theorem 3.1 , we have, for ${\mathbb{P}}$ -almost all $\omega \in \Omega$ ,
\begin{equation}\lim_{n \to \infty} T^{(n)}(\omega) = T(\omega). \nonumber\end{equation}
Corollary 3.1 shows that for large n, the final size of the epidemic $\mathcal{E}^{(n)}$ can be approximated by the total size of $\mathcal{B}$ . This leads to a threshold theorem for the epidemic process $\mathcal{E}^{(n)}$ by associating survival (i.e. non-extinction) of the branching process $\mathcal{B}$ with a major outbreak in the epidemic process $\mathcal{E}^{(n)}$ (cf. Ball [Reference Ball2, Theorem 6], and Ball and Donnelly [Reference Ball and Donnelly3, Corollary 3.4]). It then follows that a major outbreak occurs with non-zero probability if and only if $R_0\gt 1$ , and the probability of a major outbreak is $1-z^m$ . However, for practical applications it is useful to have a definition of a major outbreak that depends on n. We say that a major outbreak occurs if and only if $T^{(n)} \ge \log n$ .
Theorem 3.2. Suppose that (3.9) holds and (3.10) holds for some $\theta_0 \gt 0$ .
(a) Then
(3.13) \begin{equation}{\mathbb{P}}\big(T^{(n)} \ge \log n\big) \to 1-z^m \quad\textit{as } n \to \infty.\end{equation}
(b) If also $R_0\gt 1$ and there exists $\alpha \gt 1$ such that ${\mathbb{E}} [C^{1 + \alpha}]\lt\infty$, then there exists $\delta\gt 0$ such that
(3.14) \begin{equation}{\mathbb{P}}\big(T^{(n)} \ge \delta n \vert T^{(n)} \ge \log n\big) \to 1 \quad\textit{as } n \to \infty.\end{equation}
The proof of Theorem 3.2 is presented in Section 5.
Theorem 3.2(b) implies that a major outbreak infects at least a fraction $\delta$ of the population with probability tending to one as $n \to \infty$. However, $\delta$ depends on the parameters of the epidemic $\mathcal{E}^{(n)}$ and can be arbitrarily close to 0. An immediate consequence of the proof of Theorem 3.2(b) is Corollary 3.2, which states that there exists $\delta^\prime \gt 0$ such that, in the event of a major outbreak, the proportion of the population infectious at the peak of the epidemic exceeds $\delta^\prime$ with probability tending to one as $n \to \infty$.
Corollary 3.2. Under the conditions of Theorem 3.2(b), there exists $\delta^\prime \gt 0$ such that
\begin{equation}{\mathbb{P}}\Big(\sup_{t \ge 0} I^{(n)}(t) \ge \delta^\prime n \,\Big\vert\, T^{(n)} \ge \log n\Big) \to 1 \quad\textit{as } n \to \infty. \nonumber\end{equation}
A central limit theorem for the total size $T^{(n)}$ in the event of a major outbreak is given in Ball and Neal [Reference Ball and Neal5], for the special case of binomial sampling (Section 2.1.2), by using the theory of (asymptotically) density-dependent population processes (Ethier and Kurtz [Reference Ethier and Kurtz10, Chapter 11] and Pollett [Reference Pollett16]) to obtain a functional central limit theorem for a random time-scale transformation of $\{(S^{(n)}(t), I^{(n)}(t)):t \ge 0\}$ and hence a central limit theorem for the number of susceptibles when the number of infectives reaches zero, via a boundary-crossing problem. As noted in the introduction, Theorem 3.2(b) is a key step in the proof of the above central limit theorem, though the result was only stated in [Reference Ball and Neal5]. A similar central limit theorem for $T^{(n)}$ is likely to hold for our more general model, although details will be messy unless $\pi_c(w;\,i,j)$ takes a convenient form.
4. Proof of Theorem 3.1
4.1. Overview
We present an overview of the steps to prove Theorem 3.1. We construct on a common probability space the Markovian branching process $\mathcal{B}$ and the sequence of epidemic processes $( \mathcal{E}^{(n)} )$ , in which we equate infection and removal events in the epidemic process, $\mathcal{E}^{(n)}$ , with birth and death events, respectively, in the branching process, $\mathcal{B}$ . Given that at time $t \geq 0$ there are the same number of infectious individuals in the epidemic process $\mathcal{E}^{(n)}$ as there are individuals alive in the branching process $\mathcal{B}$ , the removal rate in $\mathcal{E}^{(n)}$ is equal to the death rate in $\mathcal{B}$ . For $k=0,1,\ldots$ , the rate at which an infection event occurs which generates k new infections in $\mathcal{E}^{(n)}$ will depend upon the state of the population (number of susceptibles and infectives), and during the early stages of the epidemic this rate will be close to, but typically not equal to, the rate at which a birth event resulting in k new individuals occurs in $\mathcal{B}$ . Therefore, we look to bound the difference between the infection rate in $\mathcal{E}^{(n)}$ and the birth rate in $\mathcal{B}$ in order to establish a coupling between the two processes.
A useful observation is that in the epidemic processes $\mathcal{E}^{(n)}$ (the branching process $\mathcal{B}$ ) the number of infectives and susceptibles (the number of individuals alive) is piecewise constant between events, where an event is either a mixing event or a recovery. Therefore, in Section 4.2, we define embedded discrete-time jump processes for $\mathcal{E}^{(n)}$ and $\mathcal{B}$ , for the number of infectives (and susceptibles) and the number of individuals alive after each event. In the case of $\mathcal{B}$ the embedded discrete-time jump process is a random walk. Then, in Section 4.3, we provide a bound on the rate of convergence to 0 of the difference between the infection rate in $\mathcal{E}^{(n)}$ and the birth rate in $\mathcal{B}$ in Lemma 4.1, which is applicable during the early stages of the epidemic when only a few individuals have been infected. Lemma 4.1 enables us to construct the embedded discrete-time jump processes defined in Section 4.2 on a common probability space (Section 4.4) and provide an almost sure coupling between the discrete-time processes during the initial stages of the epidemic (Section 4.5). That is, we couple the outcomes of the kth $(k=1, 2, \ldots)$ events in $\mathcal{E}^{(n)}$ and $\mathcal{B}$ so that the types of event—birth (infection) and death (removal)—match, and in the case of birth/infection, the same numbers of births and infections occur. Once we have established an almost sure agreement between the types of events that have occurred in the epidemic and branching processes, it is straightforward to provide an almost sure coupling of the timing of the events. The key couplings are drawn together in Lemma 4.2, from which Theorem 3.1 follows almost immediately. Finally, we consider the total sizes of the epidemic processes $\mathcal{E}^{(n)}$ and the branching process $\mathcal{B}$ and provide a proof of Corollary 3.1.
4.2. Embedded random walk
Let the random walk $\mathcal{R}$ be defined as follows. Let $Y_k$ denote the position of the random walk after k steps, with $Y_0 = m \gt 0$ . For $k=1,2,\ldots$ , let $Y_k = Y_{k-1} + Z_k$ , where $Z_1, Z_2, \ldots$ are i.i.d. with probability mass function
\begin{equation}{\mathbb{P}}(Z_1 = w) = \begin{cases} \dfrac{\gamma}{\beta+\gamma} & \text{for } w=-1, \\[6pt] \dfrac{\beta \varphi_w}{\beta+\gamma} & \text{for } w=0,1,\dots, \end{cases} \nonumber\end{equation}
where $\beta=\lambda \mu_C$ and $\varphi_w$ is defined as in (3.3). Thus, upward (downward) steps in $\mathcal{R}$ correspond to birth (death) events in $\mathcal{B}$ . Note that $Z_k =0$ is possible, corresponding to a step with no change in the random walk (a birth event with no births in $\mathcal{B}$ ). For $k=1,2,\ldots$ , let $\eta_k$ denote the time of the kth event in $\mathcal{B}$ with $\eta_0 =0$ ; then we can construct $\mathcal{R}$ from $\mathcal{B}$ by setting $Y_k = I (\eta_k)$ , where I(t) $(t \geq 0)$ is the size of the population of $\mathcal{B}$ at time t. Note that if $I (\eta_k) =0$ , then the branching process has gone extinct and $Y_k=0$ , i.e. the random walk has hit 0. We can continue the construction of the random walk after the branching process has gone extinct using $Y_k = Y_{k-1} + Z_k$ , but our primary interest is in the case where the two processes are positive. Conversely, we can construct $\mathcal{B}$ from $\mathcal{R}$ by using, in addition to $\{Y_k\}=\{Y_k:k=0,1,\dots\}$ , a sequence of i.i.d. random variables $V_1, V_2, \ldots$ , where $V_k \sim {\rm Exp} (1)$ . (Throughout the paper, discrete-time processes are assumed to have index set $\mathbb{Z}_+$ unless indicated otherwise.) For $k=1,2, \ldots$ ,
\begin{equation}\eta_k = \eta_{k-1} + \frac{V_k}{(\beta+\gamma)\, Y_{k-1}}, \nonumber\end{equation}
and for any $\eta_k \leq t \lt \eta_{k+1}$ , we set $I (t) = Y_k$ . Note that $\eta_k = \infty$ if $Y_{k-1} =0$ , corresponding to the branching process going extinct with $I(t) =0$ for all $t \geq \eta_{k-1}$ . Finally, note that ${\mathbb{E}} [Z_1] \lt 0$ , $=0$ , or $\gt 0$ if and only if $R_0 \lt 1$ , $=1$ , or $\gt 1$ , respectively.
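The two-way construction between $\mathcal{R}$ and $\mathcal{B}$ can be sketched in a few lines. The code below is illustrative and assumes the step law ${\mathbb{P}}(Z_1=-1)=\gamma/(\beta+\gamma)$, ${\mathbb{P}}(Z_1=w)=\beta\varphi_w/(\beta+\gamma)$ $(w=0,1,\dots)$ and the event-time recursion $\eta_k = \eta_{k-1} + V_k / \{(\beta+\gamma) Y_{k-1}\}$, which are consistent with the description above; the function names and the finite list `phi` of offspring probabilities are hypothetical.

```python
import bisect
import random


def embedded_walk(m, beta, gamma, phi, n_steps, seed=0):
    """Build (Y_k) and the event times (eta_k) of B from i.i.d. inputs.

    phi[w] is the assumed probability of w offspring at a birth event.
    The walk is stopped when it hits 0 (extinction) or after n_steps events.
    """
    rng = random.Random(seed)
    Y, eta = [m], [0.0]
    for _ in range(n_steps):
        y = Y[-1]
        if y == 0:
            break                                   # extinct: next event time is infinite
        v = rng.expovariate(1.0)                    # V_k ~ Exp(1)
        eta.append(eta[-1] + v / ((beta + gamma) * y))
        if rng.random() < gamma / (beta + gamma):
            step = -1                               # death event
        else:
            step = rng.choices(range(len(phi)), weights=phi)[0]  # birth event
        Y.append(y + step)
    return Y, eta


def population_at(t, Y, eta):
    """I(t) = Y_k for eta_k <= t < eta_{k+1}; valid up to the last simulated event
    (or for all t if the walk has hit 0)."""
    k = bisect.bisect_right(eta, t) - 1
    return Y[k]
```

Using the same `Y` with a second, independent sequence of exponential variables would give a process with the same jump chain but different event times, which is the degree of freedom exploited by the coupling of event times.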
We turn to the sequence of epidemic processes, $(\mathcal{E}^{(n)})$ , and for each $\mathcal{E}^{(n)}$ , an associated discrete-time epidemic jump process $\mathcal{S}^{(n)}$ . Let $Q^{(n)}_c (i,j\vert x,y)$ denote the probability that a mixing event of size c in a population of size n with x susceptibles and y infectives (and hence $n-x-y$ recovered individuals) involves i susceptibles and j infectives (and hence $c-i-j$ recovered individuals). Note that
\begin{equation}Q^{(n)}_c (i,j\vert x,y) = \frac{\binom{x}{i} \binom{y}{j} \binom{n-x-y}{c-i-j}}{\binom{n}{c}}, \nonumber\end{equation}
with the convention that $\binom{a}{b}=0$ if $b\lt 0$ or $b\gt a$.
For $w=0,1,\ldots$ , let $q^{(n)} (x,y,w)$ be such that
(4.3) \begin{equation}y\, q^{(n)} (x,y,w) = n \lambda \sum_{c=2}^{n} p^{(n)}_C(c) \sum_{j=1}^{c} \sum_{l=0}^{c-j} Q^{(n)}_c (c-j-l,j\vert x,y)\, \pi_c (w;\,c-j-l,j),\end{equation}
where the indices j and l refer to the numbers of infectives and recovered individuals, respectively, involved in the mixing event. Thus, for $w=1,2,\ldots,x$ , $q^{(n)}(x,y,w) y$ denotes the rate of occurrence of mixing events that create w new infectives within a population of size n having x susceptibles and y infectives. Hence, $q^{(n)} (x,y,w)$ can be viewed as the rate at which an infectious individual in a population of size n containing x susceptibles and y infectives generates w new infectives. Note that $q^{(n)}(x,y,0) y$ is the rate of occurrence of mixing events which involve at least one infective and create no new infectives, in a population with x susceptibles and $y\gt 0$ infectives.
Recall that, for $t \geq 0$ , $S^{(n)} (t)$ and $I^{(n)}(t)$ denote respectively the numbers of susceptibles and infectives at time t in $\mathcal{E}^{(n)}$ . Since the population is closed, for all $t \geq 0$ , $n - S^{(n)} (t)- I^{(n)}(t)$ denotes the number of recovered individuals, and we can describe the epidemic $\mathcal{E}^{(n)}$ in terms of $\big\{(S^{(n)}(t), I^{(n)}(t))\,:\, t \geq 0\big\}$ , which is a continuous-time Markov chain on the state space $E^{(n)}=\big\{ (x,y) \in \mathbb{Z}^2\,:\, x+y \leq n, 0 \leq x \leq n-m_n, y \geq 0 \big\}$ with transition probabilities
(4.4) \begin{equation}{\mathbb{P}}\Big(\big(S^{(n)}(t+\Delta t), I^{(n)}(t+\Delta t)\big) = (x-w,y+w) \,\Big\vert\, \big(S^{(n)}(t), I^{(n)}(t)\big) = (x,y)\Big) = y\, q^{(n)} (x,y,w) \Delta t + o(\Delta t) \qquad (w=1,2,\ldots,x),\end{equation}
(4.5) \begin{equation}{\mathbb{P}}\Big(\big(S^{(n)}(t+\Delta t), I^{(n)}(t+\Delta t)\big) = (x,y-1) \,\Big\vert\, \big(S^{(n)}(t), I^{(n)}(t)\big) = (x,y)\Big) = \gamma y \Delta t + o(\Delta t),\end{equation}
and with all other transitions having probability $o (\Delta t)$. The events (4.4) and (4.5) correspond to infection of w individuals and recovery of an individual, respectively. The function $q^{(n)}$ is real-valued with domain a subset of $\mathbb{Z}_+ \times \mathbb{Z}_+ \times \mathbb{N}$. We note that the epidemic process is invariant to the choice of $q^{(n)} (x,y,0) \geq 0$, so we can define $q^{(n)} (x,y,0)$ to satisfy (4.3) with $w=0$. Similarly, the epidemic process is invariant to the choice of $q^{(n)} (x,0,w)$, as no infections can occur if $y=0$, but for coupling purposes it is useful to define $q^{(n)} (x,y,w) = \beta \varphi_w$ for $y=0,-1,-2,\ldots$. Finally, as noted in Section 4.1, we observe that the recovery rate (4.5) coincides with the death rate of the branching process $\mathcal{B}$, so to couple the number of infectives in the epidemic process $\mathcal{E}^{(n)}$ to the number of individuals in the branching process $\mathcal{B}$, we require that $q^{(n)} (x,y,w) \approx \beta \varphi_w$ and $q^{(n)} (x,y) = \sum_{w=0}^{n-1} q^{(n)} (x,y,w) = \sum_{w=0}^{\infty} q^{(n)} (x,y,w) \approx \beta$ as n becomes large. (Note that $q^{(n)} (x,y,w) =0$ for $w \gt n-1$.) We proceed by making this precise after first describing the discrete-time epidemic jump process $\mathcal{S}^{(n)}$.
For $n=1,2,\ldots$ and $k=0,1,\ldots$ , let $\Big(X_k^{(n)}, Y_k^{(n)}\Big)$ denote the state of the jump process $\mathcal{S}^{(n)}$ after the kth event with $\Big(X_0^{(n)}, Y_0^{(n)}\Big) = (n-m_n, m_n)$ . For $k=1,2,\ldots$ , $(x,y) \in E^{(n)}$ , and $w=0,1,\ldots,x$ , let
\begin{eqnarray}{\mathbb{P}}\Big(\big(X_k^{(n)}, Y_k^{(n)}\big) = (x-w,y+w) \,\Big\vert\, \big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\big) = (x,y)\Big) &=& \frac{q^{(n)} (x,y,w)}{\gamma + q^{(n)} (x,y)}, \nonumber\\ {\mathbb{P}}\Big(\big(X_k^{(n)}, Y_k^{(n)}\big) = (x,y-1) \,\Big\vert\, \big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\big) = (x,y)\Big) &=& \frac{\gamma}{\gamma + q^{(n)} (x,y)}, \nonumber\end{eqnarray}
with all other transitions having probability 0 of occurring. Letting $\eta_k^{(n)}$ denote the time of the kth event in $\mathcal{E}^{(n)}$ , with $\eta_0^{(n)} =0$ , we can construct $\mathcal{S}^{(n)}$ from $\mathcal{E}^{(n)}$ by setting $\Big(X_k^{(n)}, Y_k^{(n)}\Big) = \Big(S^{(n)}\big(\eta_k^{(n)}\big), I^{(n)}\big(\eta_k^{(n)}\big)\Big)$ . As with the construction of $\mathcal{R}$ , we can continue the construction of $\mathcal{S}^{(n)}$ after the kth event with $Y_k^{(n)}=0$ , using $q^{(n)} (x,y,w) = \beta \varphi_w$ for $y=0,-1,-2,\ldots$ . Conversely, we can construct $\mathcal{E}^{(n)}$ from $\mathcal{S}^{(n)}$ by in addition using the sequence of i.i.d. random variables $V_1^{(n)}, V_2^{(n)}, \ldots$ , where $V_i^{(n)} \sim {\rm Exp} (1)$ . For $k=1,2, \ldots$ , set
\begin{equation}\eta_k^{(n)} = \eta_{k-1}^{(n)} + \frac{V_k^{(n)}}{Y_{k-1}^{(n)} \Big[\gamma + q^{(n)} \big(X_{k-1}^{(n)},Y_{k-1}^{(n)}\big) \Big]}; \nonumber\end{equation}
then for any $\eta_k^{(n)} \leq t \lt \eta_{k+1}^{(n)}$ , set $\big(S^{(n)} (t),I^{(n)} (t)\big) = \Big(X_k^{(n)}, Y_k^{(n)}\Big)$ . Note that if $Y_{k-1}^{(n)} =0$ , then $\eta_k^{(n)} = \infty$ and, for all $t \geq \eta_{k-1}^{(n)}$ , the epidemic has died out with $\big(S^{(n)} (t),I^{(n)} (t)\big) = \Big(X_{k-1}^{(n)}, 0\Big)$ .
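As an illustration, the passage from the jump chain to continuous time can be sketched in Python. The sketch assumes that the kth holding time is an Exp(1) variable divided by the total event rate $Y_{k-1}^{(n)} \big[\gamma + q^{(n)}\big(X_{k-1}^{(n)},Y_{k-1}^{(n)}\big)\big]$ ; the names `event_times` and `rate_q` are hypothetical and are not taken from the paper.

```python
import math

def event_times(states, v, gamma, rate_q):
    """Attach event times to a jump chain.

    states : list of (x, y) pairs; states[k] is the state after the k-th event
    v      : list of Exp(1) draws; v[k-1] drives the k-th holding time
    rate_q : hypothetical callable (x, y) -> infection-event rate per infective
    Assumption: the k-th holding time is v[k-1] / (y * (gamma + rate_q(x, y))),
    evaluated at the state after event k-1.
    """
    eta = [0.0]
    for k in range(1, len(states)):
        x, y = states[k - 1]
        if y <= 0 or math.isinf(eta[-1]):
            # No infectives left: the epidemic has died out, no further events.
            eta.append(math.inf)
            continue
        total_rate = y * (gamma + rate_q(x, y))
        eta.append(eta[-1] + v[k - 1] / total_rate)
    return eta
```

Feeding the same Exp(1) draws to two jump chains that agree up to some event yields identical event times up to that event only if the two total rates also agree, which is why the choice of $V_k^{(n)}$ matters.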
We briefly discuss the choice of $V^{(n)}_k$ . A simple coupling with the branching process $\mathcal{B}$ would be to set $V^{(n)}_k =V_k$ , which results in $\eta_k^{(n)} \approx \eta_k$ if $\eta_{k-1}^{(n)} \approx \eta_{k-1}$ and $ Y_{k-1}^{(n)} \Big[\gamma + q^{(n)} \big(X_{k-1}^{(n)},Y_{k-1}^{(n)}\big) \Big] \approx Y_{k-1} [\gamma + \beta]$ . This is the approach taken in [Reference Ball and O’Neill6]; it leads to a slight mismatch between the event times in $\mathcal{E}^{(n)}$ and $\mathcal{B}$ , with the mismatch converging to 0 as $n \rightarrow \infty$ . To avoid this mismatch, we instead take an alternative approach under which, with high probability, $\eta_k^{(n)} =\eta_k$ whenever $\eta_{k-1}^{(n)} = \eta_{k-1}$ and $Y_{k-1}^{(n)} = Y_{k-1}$ ; the details are provided in Section 4.4.
4.3. Matching infection rate to birth rate
In this section, we provide bounds on the differences between the rate, $q^{(n)}\big(x^{(n)}, y^{(n)}, w\big)$ , at which events creating w ( $w=0,1, \ldots$ ) new infectives occur in $\mathcal{E}^{(n)}$ with $x^{(n)}$ susceptibles and $y^{(n)}$ infectives present in the population, and the rate, $\beta \varphi_w$ , at which birth events creating w new individuals occur in $\mathcal{B}$ . The bounds on the difference in rates are appropriate during the early stages of the epidemic process where $n - r_n \leq x \leq n - m_n$ (i.e. whilst fewer than $r_n$ individuals have ever been in the infectious state), for a sequence $(r_n)$ satisfying $r_n \rightarrow \infty$ and $r_n/ \sqrt{n} \rightarrow 0$ as $n \rightarrow \infty$ .
In the early stages of the epidemic, when $x \ge n - r_n$ , it is unlikely that a mixing event will involve more than one non-susceptible individual. Thus we split the double sum over j and l in (4.3) into the case $j=1$ and $l=0$ , a single infective in an otherwise susceptible group of size c, and the case $j+l \geq 2$ , where there is more than one non-susceptible individual in a mixing event. This gives, for $y\gt 0$ ,
We consider the two terms on the right-hand side of (4.7). Note that for $y \leq0$ , we set $q^{(n)}_1 (x,y,w) = \beta \varphi_w$ and $q^{(n)}_2 (x,y,w) = 0$ , which is consistent with $q^{(n)} (x,y,w) =\beta \varphi_w$ $(y=0,-1,\ldots)$ . Also, for $w=n, n+1, \ldots$ , $q^{(n)} (x,y,w) =0$ , which implies $q^{(n)}_h (x,y,w) = 0$ $(h=1,2)$ . For $h=1,2$ , let $q^{(n)}_h (x,y) = \sum_{w=0}^{n-1} q^{(n)}_h (x,y,w) = \sum_{w=0}^{\infty} q^{(n)}_h (x,y,w)$ , the sums over w of the two components of $q^{(n)} (x,y,w)$ in (4.7). Hence $q^{(n)} (x,y) = q^{(n)}_1 (x,y) + q^{(n)}_2 (x,y)$ .
Lemma 4.1 provides bounds on the rate of convergence to 0, as $n \to \infty$ , of the difference between the infection rate in the epidemic process and the birth rate in the branching process, in terms of the number of non-susceptibles in the population ( $r_n$ ) and the rate of convergence of $C^{(n)}$ to C. Remember that throughout we assume that $C^{(n)} \stackrel{{\rm D}}{\longrightarrow} C$ and ${\mathbb{E}} [(C^{(n)})^2] \rightarrow {\mathbb{E}} [C^2]$ as $n \rightarrow \infty$ , with ${\mathbb{E}} [C^2] \lt \infty$ ; see the conditions stated before Theorem 3.1 in Section 3.2.
Lemma 4.1. Let $(r_n)$ be a sequence of positive real numbers such that $r_n \rightarrow \infty$ and $r_n/\sqrt{n} \rightarrow 0$ as $n \rightarrow \infty$ .
Let $(s_n)$ be a sequence of positive real numbers such that $s_n r_n^2/n \rightarrow 0$ and
Suppose that $(x^{(n)})$ and $(y^{(n)})$ are two sequences such that $n - r_n \leq x^{(n)} \leq n - m_n$ and $0 \lt y^{(n)} \leq r_n$ for all sufficiently large n. Then
and
Consequently, if $s_n r_n^2/n \rightarrow 0$ as $n \rightarrow \infty$ , then
Proof. First note that, for $Q^{(n)}_c \big(c-1,1\vert x^{(n)} ,y^{(n)} \big)$ defined in (4.2) and any $c=2,3, \ldots$ ,
where
For $x^{(n)} \geq n - r_n$ and $c \leq n/2$ , we have that
Therefore, for $x^{(n)} \geq n - r_n$ and $c \leq n/2$ ,
Note that $p_C^{(n)} (c)=0$ for $c\gt n$ . Also, using (3.3) and recalling that $\beta = \lambda \mu_C$ , we have
Hence, for $w=0,1,\ldots$ ,
It follows that
The first term on the right-hand side of (4.14) converges to 0 by (4.8). Using (4.13) and Markov’s inequality, the second term on the right-hand side of (4.14) satisfies
Hence (4.9) is proved.
The probability that a pair of individuals, chosen uniformly at random, are both non-susceptible is $(n-x^{(n)})(n-x^{(n)}-1)/[n(n-1)]$ . In a group of c individuals there are $c(c-1)/2$ pairs, so
For $x^{(n)} \geq n - r_n$ , the right-hand side of (4.15) is bounded above by $[ c(c-1)/2] \times [r_n/n ]^2$ .
Therefore, since $ q^{(n)}_2 \big(x^{(n)}, y^{(n)},w\big) =0$ for $w =n, n+1, \ldots$ , we have that
and (4.10) is proved.
Finally, (4.11) follows from (4.9) and (4.10) by the triangle inequality.
Note that if C has finite support $\{2, 3, \ldots, n_0 \}$ , then for all $n \geq n_0$ , $C^{(n)} \equiv C$ , and (4.8) holds for any sequence $\{s_n \}$ .
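The pair-counting bound used in the proof of Lemma 4.1 can be checked numerically. The following Python sketch compares the exact (hypergeometric) probability that a uniformly chosen group of size c contains at least two non-susceptibles with the union bound over the $c(c-1)/2$ pairs; the function names are illustrative only.

```python
from math import comb

def prob_at_least_two_nonsusceptible(n, x, c):
    """Exact probability that a group of size c, drawn uniformly without
    replacement from n individuals of whom x are susceptible, contains at
    least two non-susceptibles (hypergeometric calculation)."""
    ns = n - x
    p0 = comb(x, c) / comb(n, c)            # no non-susceptible in the group
    p1 = ns * comb(x, c - 1) / comb(n, c)   # exactly one non-susceptible
    return 1.0 - p0 - p1

def pair_bound(n, x, c):
    """Union bound: number of pairs in the group times the probability that
    a given pair consists of two non-susceptibles."""
    ns = n - x
    return (c * (c - 1) / 2) * ns * (ns - 1) / (n * (n - 1))
```

For $c=2$ the two quantities coincide, and for larger c the bound is conservative because it counts groups with several non-susceptible pairs more than once.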
4.4. Construction of the event processes
Lemma 4.1 implies that the difference between the transition probabilities of $\mathcal{R}$ and $\mathcal{S}^{(n)}$ tends to 0 as $n \to \infty$ , provided the number of non-susceptible individuals remains sufficiently small. We proceed by constructing $\mathcal{R}$ and $\mathcal{S}^{(n)}$ on a common probability space $(\Omega, \mathcal{F}, {\mathbb{P}})$ , with $Y_0 = m$ and, for all sufficiently large n, $\Big(X_0^{(n)}, Y_0^{(n)}\Big) = (n-m_n ,m_n) = (n-m,m)$ . For $k=1,2,\ldots$ , let $\textbf{U}_k = (U_{k,1}, U_{k,2}, U_{k,3})$ be i.i.d. random vectors defined on $(\Omega, \mathcal{F}, {\mathbb{P}})$ , with $U_{k,i} \sim {\rm U}(0,1)$ $(i=1,2,3)$ being independent, where ${\rm U}(0,1)$ denotes a random variable that is uniformly distributed on [0, 1].
We construct $\mathcal{R}$ as follows. Suppose that for some $k=1,2,\ldots$ , $Y_{k-1} =y$ . The kth step in $\mathcal{R}$ is a downward step (of size 1) with $Y_k = y-1$ if $U_{k,1} \leq \gamma/(\gamma + \beta)$ . Otherwise the random walk has an ‘upward’ step of size $a_k$ with $Y_k = y+a_k$ , where $a_k$ satisfies
Note that all sums are equal to 0, if vacuous; $a_k=0$ is possible and the probability that $a_k =i$ is $\varphi_i$ .
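A minimal Python sketch of one step of $\mathcal{R}$ is given below. It assumes that $a_k$ is obtained by inversion, that is, $a_k$ is the smallest i with $U_{k,2} \leq \varphi_0 + \cdots + \varphi_i$ , and that $(\varphi_i)$ has been truncated to a finite list; both are assumptions of the sketch, and the name `walk_step` is hypothetical.

```python
def walk_step(y, u1, u2, gamma, beta, phi):
    """One step of the embedded random walk from position y.

    u1, u2 : numbers in (0, 1], playing the roles of U_{k,1} and U_{k,2}
    phi    : list with phi[i] = probability that a birth event has i births
             (assumed here to have finite support and to sum to 1)
    """
    if u1 <= gamma / (gamma + beta):
        return y - 1                      # downward step of size 1
    cumulative = 0.0
    for i, p in enumerate(phi):
        cumulative += p
        if u2 <= cumulative:              # first i with u2 <= phi_0 + ... + phi_i
            return y + i
    return y + len(phi) - 1               # guard against rounding in the sum
```

Because the direction of the step uses only `u1` and the size of an upward step uses only `u2`, the two decisions are independent, as in the construction above.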
Similarly, we construct $\mathcal{S}^{(n)}$ as follows. Suppose that for some $k=1,2,\ldots$ , $\Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big) =\Big(x_k^{(n)},y_k^{(n)}\Big)$ . The kth event in $\mathcal{S}^{(n)}$ is a recovery with $\Big(X_k^{(n)}, Y_k^{(n)}\Big) =\Big(x_k^{(n)},y_k^{(n)}-1\Big)$ if $U_{k,1} \leq \gamma/\Big[\gamma + q^{(n)} \Big(x_k^{(n)},y_k^{(n)}\Big)\Big]$ . Otherwise the kth event in $\mathcal{S}^{(n)}$ is an infection event of size $a_k^{(n)}$ with $\Big(X_k^{(n)}, Y_k^{(n)}\Big) =\Big(x_k^{(n)}-a_k^{(n)},y_k^{(n)}+a_k^{(n)}\Big)$ , where $a_k^{(n)}$ satisfies
To enable an effective coupling between $\mathcal{R}$ and $\mathcal{S}^{(n)}$ , we obtain $a_k^{(n)}$ as follows. For $i=0,1,\ldots$ , let $\varpi_i^{(n)} \Big(x_k^{(n)}, y_k^{(n)}\Big) = \min \left\{ \varphi_i, \varphi_i^{(n)} \Big(x_k^{(n)}, y_k^{(n)}\Big) \right\}$ and let
where (a, b] is the empty set if $a=b$ . If $U_{k,2} \not\in D^{(n)}_2 \big(x^{(n)}_k, y^{(n)}_k\big) $ , then there exists $i \in \mathbb{Z}_+$ such that
and we set $a_k^{(n)} = i$ . Therefore, if $U_{k,2} \not\in D^{(n)}_2 \big(x^{(n)}_k, y^{(n)}_k\big) $ , we have that $a_k^{(n)} = a_k$ . Let
the total variation distance between $(\varphi_0, \varphi_1, \ldots)$ and $\Big(\varphi_0^{(n)} \Big(x_k^{(n)}, y_k^{(n)}\Big), \varphi_1^{(n)}\Big(x_k^{(n)}, y_k^{(n)}\Big), \ldots\Big)$ . If $U_{k,2} \in D^{(n)}_2 \big(x^{(n)}_k, y^{(n)}_k\big) $ , we set $a_k^{(n)} =i$ with probability
which ensures that overall the probability that $a_k^{(n)} = i$ is $ \varphi_i^{(n)} \Big(x_k^{(n)}, y_k^{(n)}\Big)$ . We do not need to be more explicit about the choice $a_k^{(n)}$ when $a_k^{(n)} \neq a_k$ .
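The idea behind this construction is that of a maximal coupling of two probability mass functions. The Python sketch below builds the joint distribution of a generic maximal coupling, placing mass $\min\{p_i, q_i\}$ on the diagonal and pairing the residual distributions independently; it is a standard textbook construction offered for illustration, not the interval-based construction used here, and the function name is hypothetical.

```python
def maximal_coupling(p, q):
    """Joint pmf of a maximal coupling of two pmfs on {0, ..., size-1}.

    Returns (joint, tv) where joint[i][j] = P(A = i, B = j) with A ~ p and
    B ~ q, and tv is the total variation distance, so that P(A != B) = tv.
    """
    size = max(len(p), len(q))
    p = list(p) + [0.0] * (size - len(p))
    q = list(q) + [0.0] * (size - len(q))
    overlap = [min(a, b) for a, b in zip(p, q)]
    tv = 1.0 - sum(overlap)
    joint = [[0.0] * size for _ in range(size)]
    for i in range(size):
        joint[i][i] = overlap[i]          # mass on which the two draws agree
    if tv > 1e-15:
        # Off the diagonal, pair the two residual distributions independently.
        for i in range(size):
            for j in range(size):
                joint[i][j] += (p[i] - overlap[i]) * (q[j] - overlap[j]) / tv
    return joint, tv
```

The two draws disagree with probability equal to the total variation distance, which is the smallest disagreement probability any coupling can achieve.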
Given $V_1, V_2, \ldots$ , i.i.d. according to ${\rm Exp} (1)$ , we can construct $\mathcal{B}$ from $\mathcal{R}$ as outlined in Section 4.2. We conclude this section with a description of the construction of $\mathcal{E}^{(n)}$ from $\mathcal{S}^{(n)}$ , in order to couple the time of events in $\mathcal{E}^{(n)}$ to the event times in $\mathcal{B}$ . Given that there are $y^{(n)}$ infectives in the population, the probability that an individual chosen uniformly at random is infectious is $y^{(n)}/n$ , so the probability that a mixing event of size c involves at least one infective is bounded above by $c y^{(n)}/n$ . Therefore
Hence, under the assumption $\mu_C^{(n)} \leq \mu_C$ , we have that $q^{(n)} \big(x^{(n)},y^{(n)}\big) \leq \beta (= \lambda \mu_C)$ . Therefore, letting
we have, if $\Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big) = \big(x^{(n)},y^{(n)}\big)$ , that
For $z \geq 0$ , let
denote the probability density function of $\tilde{V}_k^{(n)} \big(x^{(n)}, y^{(n)}\big)$ . Similarly, let $f_V (z) = \exp(\!-\!z)$ $(z\geq 0)$ denote the probability density function of $V_1$ . It follows from (4.20), for all $z \geq 0$ , that
Therefore, we can construct a realisation of $\tilde{V}_k^{(n)} \big(x^{(n)}, y^{(n)}\big)$ by setting $\tilde{V}_k^{(n)} \big(x^{(n)}, y^{(n)}\big) = V_k$ if $U_{k,3} \leq 1 - \tilde{d}^{(n)} \big(x^{(n)}, y^{(n)}\big)$ , and if $U_{k,3} \gt 1 - \tilde{d}^{(n)}\big(x^{(n)}, y^{(n)}\big)$ , we draw $\tilde{V}_k^{(n)} \big(x^{(n)}, y^{(n)}\big)$ from a random variable with, for $z \geq 0$ , probability density function
Finally, we set
which ensures that $V_k^{(n)} \sim {\rm Exp} (1)$ . Also, if $\eta_{k-1}^{(n)} = \eta_{k-1}$ , $Y_{k-1}^{(n)} = Y_{k-1}$ , and $U_{k,3} \leq 1 - \tilde{d}^{(n)} \Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big)$ , then $\tilde{V}_k^{(n)} \Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big) = V_k$ , and substituting $V_k^{(n)}$ into (4.6) and using (4.22) gives
4.5. Coupling of the epidemic and branching processes
A mismatch occurs at event k whenever the kth events in the epidemic process, $\mathcal{E}^{(n)}$ (discrete epidemic jump process $\mathcal{S}^{(n)}$ ), and branching process, $\mathcal{B}$ (random walk $\mathcal{R}$ ), are either a removal ( $\mathcal{E}^{(n)}$ ) and a birth ( $\mathcal{B}$ ), or an infection ( $\mathcal{E}^{(n)}$ ) and a birth $(\mathcal{B})$ where the number of new infections ( $\mathcal{E}^{(n)}$ ) and the number of births $(\mathcal{B})$ differ. The first type of mismatch occurs in Ball and O’Neill [Reference Ball and O’Neill6], where mismatches consisting of an infection ( $\mathcal{E}^{(n)}$ ) and a death ( $\mathcal{B}$ ) are also permissible. Owing to (4.18) and the assumption that $\mu_C^{(n)} \leq \mu_C$ for all sufficiently large n, an infection ( $\mathcal{E}^{(n)}$ ) paired with a death ( $\mathcal{B}$ ) is not possible for such n in the current setup, but the arguments can easily be modified to allow for this situation. The second type of mismatch arises from allowing multiple infections/births.
Since $q^{(n)} \big(x^{(n)},y^{(n)}\big) \leq \beta$ , a type-1 mismatch occurs at event k, where after event $k-1$ there are $x^{(n)}$ susceptibles and $y^{(n)}$ infectives, if and only if
with
Let $\tilde{Z}_1, \tilde{Z}_2, \ldots$ be i.i.d. according to $\tilde{Z}$ with probability mass function ${\mathbb{P}} (\tilde{Z} = i) = \varphi_i$ $(i=0,1,\ldots)$ . We construct $\tilde{Z}_1, \tilde{Z}_2, \ldots$ from $U_{1,2}, U_{2,2}, \ldots$ by setting $\tilde{Z}_k$ to satisfy
Thus $\tilde{Z}_k$ is the number of births (size of the ‘upward step’) occurring in $\mathcal{B}$ ( $\mathcal{R}$ ) if the kth event is a birth event.
A third type of mismatch occurs in coupling the event times in $\mathcal{E}^{(n)}$ and $\mathcal{B}$ . Conditionally upon there being no mismatches of the first two types in the first k events and $\eta_{k-1}^{(n)} = \eta_{k-1}$ , we have by (4.23) that a mismatch occurs and $\eta_k^{(n)} \neq \eta_k$ only if $U_{k,3} \gt 1 - \tilde{d}^{(n)} \Big(X^{(n)}_{k-1}, Y^{(n)}_{k-1}\Big)$ .
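The splitting device behind this third type of mismatch can be illustrated on a simplified case. The Python sketch below assumes, purely for illustration, a target variable that is exponential with rate $a \lt 1$ , whose density dominates a times the Exp(1) density, so that it can be set equal to an Exp(1) draw with probability a. The parameter `a`, the helper names and the residual density are assumptions of this sketch and are not claimed to reproduce (4.19) to (4.22).

```python
import math

def residual_density(z, a):
    """Density of the 'mismatch' component when an Exp(rate a) target, a < 1,
    is written as the mixture  a * Exp(1) + (1 - a) * residual."""
    return (a * math.exp(-a * z) - a * math.exp(-z)) / (1.0 - a)

def coupled_draw(v, u3, a, draw_residual):
    """Return the target variable: equal to the Exp(1) draw v when u3 <= a
    (probability a), otherwise a fresh draw from the residual distribution.
    draw_residual is a hypothetical zero-argument sampler supplied by the caller."""
    return v if u3 <= a else draw_residual()
```

In this simplified setting the two variables coincide with probability a, so a mismatch in the event time requires the uniform variable to fall in an interval of length $1-a$ .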
The following lemma gives conditions under which the processes $\mathcal{B}$ $(\mathcal{R})$ and $\mathcal{E}^{(n)}$ $(\mathcal{S}^{(n)})$ can be constructed on a common probability space $(\Omega, \mathcal{F}, {\mathbb{P}})$ , so that for ${\mathbb{P}}$ -almost all $\omega \in \Omega$ they coincide over the first $u_n$ events for all sufficiently large n, where $u_n \rightarrow \infty$ as $n \rightarrow \infty$ .
Lemma 4.2. Suppose that (3.9) holds and (3.10) holds for some $\theta_0 \gt 0$ . Suppose that there exists $\alpha \geq 1$ such that ${\mathbb{E}} [C^{\alpha +1}] \lt \infty$ , which in turn implies that ${\mathbb{E}} [\tilde{Z}^\alpha] \lt \infty$ .
Let $(u_n)$ be any non-decreasing sequence of integers such that there exists
so that for all sufficiently large n, $u_n \leq \lfloor K n^\zeta \rfloor$ for some $K \in \mathbb{R}^+$ .
Then there exists a probability space $(\Omega, \mathcal{F}, {\mathbb{P}})$ , on which are defined the branching process, $\mathcal{B}$ , the random walk, $\mathcal{R}$ , and the sequence of epidemic processes and discrete epidemic processes, $\left(\mathcal{E}^{(n)}, \mathcal{S}^{(n)}\right)$ , such that for ${\mathbb{P}}$ -almost all $\omega \in \Omega$ ,
and
for all sufficiently large n.
Proof. Without loss of generality, we prove the lemma by taking $u_n = \lfloor K n^\zeta \rfloor$ for some $K \in (0, \infty)$ and $\zeta$ satisfying (4.27). It follows from (4.27) that $\theta$ and $\delta$ can be chosen so that $\theta, \delta \gt 0$ , $\frac{2 (1 + \alpha)}{\alpha} \zeta \lt \theta \leq \theta_0$ , and $\theta + 2 \zeta + 2 \delta \lt 1$ . (Note that (4.27) implies $2 \zeta (1+\alpha)/\alpha \lt \theta_0$ . Furthermore,
by (4.27).) Set $s_n = n^{\theta}$ , $r_n = K n^{ \zeta + \delta}$ , $a_n = \lfloor n^{\theta/(\alpha+1)} \rfloor$ , and, for convenience, $\epsilon_n =1/s_n$ . Note that $s_nr_n^2/n \rightarrow 0$ as $n \to \infty$ , satisfying the conditions of Lemma 4.1.
For $h,n=1,2,\ldots$ , let $\textbf{x}_h^{(n)}= \Big(x_0^{(n)}, x_1^{(n)}, \ldots, x_h^{(n)}\Big)$ and define $\textbf{y}_h^{(n)}$ similarly. Let
and $A_{n,0}=\Big\{\omega \in \Omega: \Big(\textbf{X}^{(n)}_{u_n} (\omega),\textbf{Y}^{(n)}_{u_n} (\omega)\Big) \in \tilde{A}_{n,0}\Big\}$ . Note that if $\omega \in A_{n,0}$ for all sufficiently large n, then $\Big\{\Big(X^{(n)}_k(\omega), Y^{(n)}_k(\omega)\Big)\Big\}$ satisfies the conditions of Lemma 4.1.
Let $H_n$ denote the index of the event at which the first mismatch occurs between $\mathcal{S}^{(n)}$ and $\mathcal{R}$ . Then, for $\omega \in \Omega$ , (4.28) holds if and only if $H_n (\omega) \gt u_n$ . Note that the first mismatch occurs at event k with $\Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big) = \Big(x^{(n)}_{k-1}, y_{k-1}^{(n)}\Big)$ , if
where $D^{(n)}_1 \big(x^{(n)}, y^{(n)}\big)$ and $D^{(n)}_2 \big(x^{(n)}, y^{(n)}\big)$ are defined in (4.24) and (4.16), respectively.
Similarly, let $\tilde{H}_n$ denote the index of the event at which the first mismatch occurs between the times of corresponding events in $\mathcal{E}^{(n)}$ and $\mathcal{B}$ . Then (4.29) holds if and only if $\tilde{H}_n (\omega) \gt u_n$ . Note that if $H_n (\omega ) \gt u_n$ then the first mismatch in the time of events occurs at event k with $\Big(X_{k-1}^{(n)}, Y_{k-1}^{(n)}\Big) = \Big(x^{(n)}_{k-1}, y_{k-1}^{(n)}\Big)$ , if
where $\tilde{d}^{(n)} \big(x^{(n)}, y^{(n)}\big)$ is defined in (4.19).
By Lemma 4.1, we have for any $\ell \gt 0$ , for all sufficiently large n and $\Big(\textbf{x}^{(n)}_{u_n}, \textbf{y}^{(n)}_{u_n}\Big) \in \tilde{A}_{n,0}$ , that $\sum_{w=0}^\infty \big\vert q^{(n)} \big(x^{(n)}, y^{(n)},w\big) - \beta \varphi_w \big\vert \lt \ell \epsilon_n$ and $\big\vert q^{(n)} \big(x^{(n)},y^{(n)}\big) - \beta\big\vert \lt \ell \epsilon_n$ . The first inequality implies that for all $w \in \mathbb{Z}_+$ ,
Therefore, since $q^{(n)} \big(x^{(n)}, y^{(n)}\big) \gt \beta/2$ for all sufficiently large n, we have by the triangle inequality that
Setting $\ell=\frac{\beta}{5}$ , we have that for all sufficiently large n, $\left\vert \varpi_w^{(n)} \big(x^{(n)}, y^{(n)}\big) - \varphi_w \right\vert \le \epsilon_n$ $(w=0,1,\ldots)$ .
Thus we can define sets $\tilde{D}^{(n)}_i$ $(i=1,2,3)$ such that for all sufficiently large n, if $\big(x^{(n)}, y^{(n)}\big) \in \tilde{A}_n$ , then $D^{(n)}_i \big(x^{(n)}, y^{(n)}\big) \subseteq \tilde{D}_i^{(n)}$ $(i=1,2,3)$ , where
and
Since $\epsilon_n$ is decreasing in n, we have that for all n, $\tilde{D}_i^{(n+1)} \subseteq \tilde{D}_i^{(n)}$ $(i=1,2,3)$ .
For $i=1,2,3$ , let
We observe that if $u_{n+1} = u_n$ , then $A_{n,i} \subseteq A_{n+1,i}$ $(i=0,1,2,3)$ . Therefore, following Ball and O’Neill [Reference Ball and O’Neill6, Lemma 2.11], we define $\mathcal{Q} = \{ n \in \mathbb{N}: \lfloor K n^\zeta \rfloor \neq \lfloor K (n-1)^\zeta \rfloor\}$ and note that, for $i=0,1,2,3$ , to show that
it is sufficient to show that
Given that (4.30) holds for $i=0,1,2,3$ , we have that there exists $\tilde{\Omega} \subseteq \Omega$ such that ${\mathbb{P}} (\tilde{\Omega}) =1$ and for every $\omega \in \tilde{\Omega}$ , there exists $n(\omega) \in \mathbb{N}$ such that for all $n \geq n (\omega)$ , $H_n (\omega) \gt u_n$ and $\tilde{H}_n (\omega) \gt u_n$ . Thus (4.28) and (4.29) hold, and the lemma follows.
We complete the proof of the lemma by proving (4.30) for $i=0,1,2,3$ . Suppose that, for $i=0,1,2,3$ , there exist $L_i \lt \infty$ and $\chi_i \gt 1$ such that, for all sufficiently large n,
Following the proof of Ball and O’Neill [Reference Ball and O’Neill6, Lemma 2.10], we have that
so by the first Borel–Cantelli lemma, (4.30) holds.
Let us prove (4.31). Recall that $\mu_C = {\mathbb{E}}[C]$ , ${\mathbb{E}} [\tilde{Z}] = {\mathbb{E}} [C \nu (C)]/\mu_C$ , and ${\mathbb{E}} [C (C-1) \nu (C)] \lt \infty$ , where $\nu (c)$ , defined at (3.7), is the mean number of new infectives created in a mixing event of size c with 1 infective and $c-1$ susceptibles. Since $u_n \leq \lfloor K n^\zeta \rfloor$ and $r_n =K n^{\zeta +\delta}$ , by Chebyshev’s inequality, we have that, for all sufficiently large n,
Hence (4.31) holds for $i=0$ .
Since $\theta - 2 (1 + \alpha) \zeta/\alpha \gt 0$ , we have that for all sufficiently large n,
Hence (4.31) holds for $i=1$ .
Similarly, since ${\mathbb{P}} (A_{n,3}^c) \leq u_n \epsilon_n$ , we have that (4.31) holds for $i=3$ .
Finally, let $\delta_1 = \frac{\alpha}{\zeta(1+ \alpha)} \theta - 2 \gt 0$ . For all sufficiently large n, we have that $a_n^{\alpha} , s_n/a_n \geq \frac{1}{2} n^{\theta \alpha/(1+\alpha)}$ . Thus, recalling that $\epsilon_n=1/s_n$ , we have that for all sufficiently large n,
Hence (4.31) holds for $i=2$ . Thus (4.30) holds for $i=0,1,2,3$ and the lemma is proved.
Lemma 4.2 ensures that the processes $\mathcal{E}^{(n)}$ ( $\mathcal{S}^{(n)}$ ) and $\mathcal{B}$ ( $\mathcal{R}$ ) coincide for an increasing number of events as n increases. For Theorem 3.1(a) we do not require as strong a result as Lemma 4.2, and the following corollary, which can be proved in a similar fashion to Lemma 4.2, suffices.
Corollary 4.1. For any $K \in \mathbb{N}$ , we have, for $(\Omega, \mathcal{F}, {\mathbb{P}})$ defined in Lemma 4.2, that for ${\mathbb{P}}$ -almost all $\omega \in \Omega$ ,
and
for all sufficiently large n.
The coupling in Lemma 4.2 includes birth events where no births occur, that is, $Z_1 =0$ . Given that $\lambda \lt \infty$ $(\beta = \lambda \mu_C\lt\infty)$ and $\gamma \gt 0$ , it follows that ${\mathbb{P}} (Z_1 \neq 0) \gt 0$ . Since $Z_1, Z_2, \ldots$ are i.i.d., the strong law of large numbers yields
where $\stackrel{{\rm a.s.}}{\longrightarrow}$ denotes convergence almost surely. For $k=1,2,\ldots$ , let
Thus $M_k$ is the kth event in $\mathcal{B}$ for which $Z_i \neq 0$ . (If $\mathcal{B}$ goes extinct then $M_k$ has this interpretation for only finitely many k.) Theorem 3.1 now follows straightforwardly from Lemma 4.2.
Proof of Theorem 3.1. (a) Recall that T is the total size of the branching process $\mathcal{B}$ and $A_{\rm ext} = \{ \omega \in \Omega\,:\, T (\omega) \lt \infty \}$ .
Fix $\omega \in A_{\rm ext}$ and suppose that $T(\omega) =k \lt\infty$ . Then there exists $h = h (\omega) \leq 2 k -m$ such that $Y_{M_h} (\omega) =0$ . That is, there are at most $k-m$ birth events (with a strictly positive number of births) and k death events in the branching process. By Corollary 4.1 there exists $n_2 (\omega) \in \mathbb{N}$ such that for all $n \geq n_2 (\omega)$ and $l=1,2,\ldots, M_h (\omega)$ , $Y_l^{(n)} (\omega) = Y_l(\omega)$ and $\eta_l^{(n)} (\omega) = \eta_l (\omega) $ , and hence, for all $t \geq 0$ , $I^{(n)} (t,\omega) = I (t, \omega)$ .
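The counting argument behind the bound $h \leq 2k-m$ can be checked by simulation. The Python sketch below runs an embedded random walk to extinction and counts the events that change the state; the total size is taken to include the m initial ancestors, and the function name and the parameter values in the usage note are illustrative assumptions.

```python
import random

def run_to_extinction(m, p_death, phi, rng, max_events=10_000):
    """Simulate the embedded random walk from m until it hits 0.

    Each event is a death with probability p_death, otherwise a birth event
    whose number of births i has probability phi[i] (i = 0 is allowed).
    Returns (total_size, nonzero_events) on extinction, or None if the walk
    is still positive after max_events events.
    """
    y, births, nonzero = m, 0, 0
    for _ in range(max_events):
        if y == 0:
            break
        if rng.random() < p_death:
            y -= 1
            nonzero += 1                  # every death changes the state
        else:
            i = rng.choices(range(len(phi)), weights=phi)[0]
            y += i
            births += i
            nonzero += (i > 0)            # count only events with births
    return (m + births, nonzero) if y == 0 else None
```

For example, with `rng = random.Random(1)`, `m = 2`, `p_death = 0.6` and `phi = [0.3, 0.4, 0.3]`, each extinct run returns a pair `(k, h)` with `h <= 2 * k - m`, since there are exactly k deaths and at most `k - m` birth events with a positive number of births.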
(b) Let $\rho$ satisfy (3.11) and $t_n = \rho \log n$ . Remembering from (3.8) that $r = \gamma (R_0 -1)$ is the Malthusian parameter (growth rate) of the branching process, we take $\zeta$ such that
so that $\zeta$ satisfies (4.27) in the statement of Lemma 4.2.
For $t \geq 0$ , let N(t) denote the total number of (birth and death) events in the branching process $\mathcal{B}$ up to and including time t. Then, if $N(t_n, \omega) \leq u_n = \lfloor n^\zeta \rfloor$ and $\big(Y^{(n)}_h (\omega) , \eta^{(n)}_h (\omega)\big)= (Y_h (\omega) , \eta_h (\omega))$ $(h=1,2,\ldots, u_n)$ , we have from Lemma 4.2 that
Give the initial ancestors the labels $-(m -1), -(m-2), \ldots, 0$ , and label the individuals born in the branching process sequentially $1,2,\ldots$ . For $i=1,2,\ldots$ , let $\tau_i$ denote the time of the birth of the ith individual, with the conventions that $\tau_i =\infty$ if fewer than i births occur, and $\tau_i=0$ for $i=-(m-1), -(m-2), \ldots, 0$ . For $i=-(m-1),-(m-2), \ldots$ , let $\tilde{G}_i (s)$ denote the number of birth and death events involving individual i in the first s time units after their birth, if $s \ge 0$ , and let $\tilde{G}_i (s)=0$ if $s\lt 0$ . Note that $\tilde{G}_i (s)$ is non-decreasing in s and $\tilde{G}_i (\infty) \stackrel{D}{=} G +1$ , where G is the number of birth events involving an individual and is a geometric random variable given by (3.4). Therefore, for all $t \geq 0$ ,
Note that N(t) satisfies the form of Nerman [Reference Nerman14, (1.11)]. It is straightforward to show that the conditions of Theorem 5.4 in Nerman [Reference Nerman14] hold, since ${\mathbb{E}} [\tilde{G} (\infty)] =(\gamma + \lambda \mu_C)/\gamma$ . Therefore, by that theorem, there exists a positive, almost surely finite random variable W such that
for ${\mathbb{P}}$ -almost all $\omega \in A_{\rm ext}^c$ . It is then straightforward to show, following the proof of Ball and O’Neill [Reference Ball and O’Neill6, Lemma 2.9], that there exists some ${\mathbb{P}}$ -measurable set $B_1 \subseteq A_{\rm ext}^c$ such that ${\mathbb{P}} (B_1) = {\mathbb{P}} (A_{\rm ext}^c)$ and for all $\omega \in B_1$ ,
Hence, for all sufficiently large n, $N(t_n, \omega ) \leq 2 W (\omega) n^{\rho r} \leq u_n$ . Finally, by Lemma 4.2, for ${\mathbb{P}}$ -almost all $\omega \in B_1$ , (4.28) and (4.29) hold, so (3.12) follows.
Finally, we consider the total size of the epidemic processes and branching processes, with Corollary 3.1 following straightforwardly from Corollary 4.1.
Proof of Corollary 3.1. Let $\hat{\Omega} \subseteq \Omega$ be the set on which the convergence underlying (4.35) holds, so ${\mathbb{P}}(\hat{\Omega})=1$ , and fix $\omega \in\tilde{\Omega} \cap \hat{\Omega}$ . Suppose that $T(\omega)=k\lt\infty$ . Then there exists $h=h(\omega) \leq 2k-m$ such that $Y_{M_h}(\omega)=0$ . By Corollary 4.1, there exists $\tilde{n} (\omega) \in \mathbb{N}$ such that for all $n \geq \tilde{n} (\omega)$ , $Y_i^{(n)} (\omega)=Y_i (\omega)$ $(i=1,2,\dots, M_h (\omega))$ . Thus, $T^{(n)} (\omega) = T (\omega)$ for all $n \geq \tilde{n} (\omega)$ .
Suppose instead that $T(\omega)=\infty$ . Choose any $k_1 \in \mathbb{N}$ and let $n_E(k_1, \omega)$ be the number of events in $\mathcal{B}$ when the total size of $\mathcal{B}$ first reaches at least $k_1$ . Then arguing as above, with k replaced by $k_1$ and h replaced by $n_E(k_1, \omega)$ , shows that $T^{(n)} (\omega) \ge k_1$ for all sufficiently large n. This holds for all $k_1 \in \mathbb{N}$ , so $T^{(n)}(\omega) \to \infty$ as $n \rightarrow \infty$ .
5. Proof of Theorem 3.2
5.1. Overview
We present an overview of the steps to prove Theorem 3.2. In Section 5.2, we prove Theorem 3.2(a), which states that as $n \to \infty$ , the probability of a major epidemic outbreak in $\mathcal{E}^{(n)}$ (the epidemic infects at least $\log n$ individuals) tends to the probability that the branching process, $\mathcal{B}$ , does not go extinct. In Section 5.3, we introduce a sequence of lower-bound random walks, $(\mathcal{L}^{(n)})$ , which is a key component in showing that a major epidemic in the discrete epidemic jump process, $\mathcal{S}^{(n)}$ , and hence in the epidemic process $\mathcal{E}^{(n)}$ infects at least $\delta^\ast n$ individuals with probability tending to 1 as $n \to \infty$ . We provide an outline of the coupling of $\mathcal{S}^{(n)}$ and $\mathcal{L}^{(n)}$ , via an intermediary process $\mathcal{G}^{(n)}$ , and in (5.5) we identify the relationship between the three processes with $\mathcal{L}^{(n)}$ as a lower bound in terms of the number of infectives in the epidemic, to establish Theorem 3.2(b). The details of $\mathcal{L}^{(n)}$ are provided in Section 5.4, along with Lemmas 5.1 and 5.2, which provide the main steps in establishing (5.5). Finally, Section 5.4 concludes with the proof of Theorem 3.2(b), from which Corollary 3.2 follows immediately.
5.2. Probability of a major epidemic
Under the conditions of Theorem 3.2, the conditions of Lemma 4.2 are satisfied with $\alpha =1$ since ${\mathbb{E}}[C^2] \lt \infty$ . The proof of Theorem 3.2(a) then follows almost immediately from the proof of Theorem 3.1 by considering the embedded random walk and discrete-time epidemic jump process. From Lemma 4.2, (4.28), we have that $\textbf{Y}_{u_n} = (Y_1, Y_2, \ldots, Y_{u_n})$ and $\textbf{Y}_{u_n}^{(n)} = \big(Y_1^{(n)}, Y_2^{(n)}, \ldots, Y_{u_n}^{(n)}\big)$ can be constructed so that
for $u_n = \lfloor n^\zeta \rfloor$ and $\zeta \gt 0$ satisfying (4.27) with $\alpha =1$ . Hence we can couple the process $\mathcal{S}^{(n)}$ to $\mathcal{R}$ over the first $u_n$ steps, and Theorem 3.2(a) follows, as we now show.
Proof of Theorem 3.2 (a). Since T is the total size of a branching process with m initial ancestors, it follows that
Following the proof of Corollary 3.1, $T \lt \log n$ if there exists $h_n \leq 2 \log n -m$ such that $Y_{M_{h_n}}=0$ . Let $\tilde{\Omega}$ and $\hat{\Omega}$ be as in the proofs of Lemma 4.2 and Corollary 3.1, respectively, and note that ${\mathbb{P}} (\tilde{\Omega} \cap \hat{\Omega}) =1$ . Fix $\omega \in \tilde{\Omega} \cap \hat{\Omega} $ . Then
for all sufficiently large n, where $M_k$ is given in (4.36). It follows using Lemma 4.2 that $1_{\{T(\omega)\lt\log n\}}=1_{\{T^{(n)}(\omega)\lt\log n\}}$ for all sufficiently large n. Thus, $1_{\{T\lt \log n\}}-1_{\{T^{(n)}\lt\log n\}}$ converges almost surely (and hence in probability) to 0 as $n \to \infty$ . Therefore,
5.3. Coupling of the lower-bound random walk to the epidemic
We turn to the proof of Theorem 3.2(b) and note that we now need to consider the epidemic process and any approximation over $\lfloor \delta_1 n \rfloor$ events for some $\delta_1 \gt 0$ . The couplings utilised thus far do not extend to $\lfloor \delta_1 n \rfloor$ events in the limit as $n \rightarrow \infty$ . However, we can still utilise the couplings over the first $u_n = \lfloor n^\zeta \rfloor$ events. Hence, given that the embedded discrete epidemic jump process $\mathcal{S}^{(n)}$ reaches $u_n$ events without hitting 0 (the epidemic process $\mathcal{E}^{(n)}$ does not go extinct), we can show, following the proof of Theorem 3.2(a), that, with probability tending to 1 as $n \rightarrow \infty$ , $T^{(n)} \geq v_n$ , where $v_n = {\mathbb{P}} (Z_1 \neq 0) u_n/3$ . It immediately follows using the coupling in Lemma 4.2 (see (5.1)) that ${\mathbb{P}} \big(T \geq v_n \vert T^{(n)} \geq v_n\big) \rightarrow 1$ as $n \rightarrow \infty$ . We have that if $T \geq v_n$ , then $\min_{k \leq v_n} \{ Y_k \} \gt 0$ , and by the weak law of large numbers,
Hence, under the assumption $R_0 \gt 1$ , which is required for Theorem 3.2(b), we have that ${\mathbb{E}} [Z] \gt 0$ and, since $Y_{v_n} = m + \sum_{k=1}^{v_n} Z_k$ , that
Now (5.3) implies that if the branching process $\mathcal{B}$ has at least $v_n$ individuals ever alive, then the number of individuals alive in $\mathcal{B}$ (the position of the random walk $\mathcal{R}$ ) after $v_n$ events exceeds $v_n {\mathbb{E}}[Z]/2$ with probability tending to 1 as $n \to \infty$ . Combined with (5.1) the same holds true for $Y^{(n)}_{v_n}$ in $\mathcal{E}^{(n)}$ and $\mathcal{S}^{(n)}$ .
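The link between $R_0 \gt 1$ and positive drift of $\mathcal{R}$ can be made concrete with a short calculation. The Python sketch below computes the mean step of the random walk from the step description in Section 4.4 (a downward step with probability $\gamma/(\gamma+\beta)$ , otherwise an upward step of size i with probability $\varphi_i$ ). It assumes $R_0 = \beta {\mathbb{E}}[\tilde{Z}]/\gamma$ , which is consistent with the Malthusian parameter $r = \gamma(R_0-1)$ but is an assumption of the sketch, as are the function names.

```python
def walk_drift(gamma, beta, phi):
    """Mean step E[Z] of the embedded random walk: -1 with probability
    gamma/(gamma+beta), otherwise +i with probability phi[i]."""
    mean_births = sum(i * p for i, p in enumerate(phi))
    return (beta * mean_births - gamma) / (gamma + beta)

def basic_reproduction_number(gamma, beta, phi):
    """R_0 under the assumption R_0 = beta * E[Z~] / gamma, consistent with
    the Malthusian parameter r = gamma * (R_0 - 1)."""
    return beta * sum(i * p for i, p in enumerate(phi)) / gamma
```

Under this assumption the drift equals $\gamma (R_0 -1)/(\gamma+\beta)$ , so it is strictly positive exactly when $R_0 \gt 1$ .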
The next step is to show that, given that the epidemic $\mathcal{E}^{(n)}$ ( $\mathcal{S}^{(n)}$ ) has not gone extinct in $v_n$ events, there exists $\delta^\ast\gt 0$ such that, with probability tending to 1 as $n \rightarrow \infty$ , at least $\lfloor \delta^\ast n \rfloor$ events occur in $\mathcal{S}^{(n)}$ . In order to do this, we introduce a lower-bound random walk $\mathcal{L}^{(n)}$ indexed by the population size n. Lower-bound branching processes (random walks) for epidemic processes go back to [Reference Whittle17], and the main idea is along similar lines to [Reference Whittle17], in that we set up the lower-bound random walk so that the number of infectives in the discrete epidemic jump process $\mathcal{S}^{(n)}$ is at least the number of individuals alive in the branching process with embedded random walk $\mathcal{L}^{(n)}$ for the initial stages of the epidemic process.
The key features in setting up $\mathcal{L}^{(n)}$ are as follows. Let $L^{(n)}_k$ denote the position of the random walk $\mathcal{L}^{(n)}$ after k steps. The random walk $\mathcal{L}^{(n)}$ is set identical to the random walk $\mathcal{R}$ for the first $v_n$ steps; that is, the distribution of steps is according to Z given in (4.1). Hence $L_0^{(n)} = m$ , and for $k=1,2,\ldots,v_n$ , $L_k^{(n)} = Y_k$ . For $k=v_n + 1, v_n +2, \ldots$ , the steps in $\mathcal{L}^{(n)}$ are i.i.d. according to $\hat{Z}^{(n)}$ defined below in (5.9), with ${\mathbb{E}} [ \hat{Z}^{(n)}] \gt 0$ so that the lower-bound random walk has positive drift. Therefore we can show that, as $n \rightarrow \infty$ , if $\mathcal{L}^{(n)}$ has not hit 0 in the first $v_n$ steps when it is coupled to $\mathcal{R}$ , and hence reached $L_{v_n}^{(n)} = Y_{v_n} \geq v_n {\mathbb{E}} [Z]/2$ (cf. (5.3)), with probability tending to 1 it will not hit 0 in the first $\lfloor \delta_1 n \rfloor$ steps for $\delta_1 \gt 0$ .
It is difficult to directly couple $\mathcal{S}^{(n)}$ and $\mathcal{L}^{(n)}$ , owing to differences in the distribution of steps caused by the changing rate of events in $\mathcal{E}^{(n)}$ and hence the probability of events occurring in $\mathcal{S}^{(n)}$ . Therefore we introduce an intermediary process $\mathcal{G}^{(n)}$ . The intermediary process $\mathcal{G}^{(n)}$ is a bivariate (epidemic) process, indexed by the population size n, whose steps are state-dependent, with the dependence corresponding to the number of susceptibles and infectives in the population. For $k=1,2,\ldots$ , let $\big(A_k^{(n)}, G_k^{(n)}\big)$ denote the state of $\mathcal{G}^{(n)}$ after k steps (events), with $A_k^{(n)}$ and $G_k^{(n)}$ denoting the numbers of susceptibles and infectives, respectively. For the first $v_n$ steps, $\mathcal{G}^{(n)}$ is set identical to $\mathcal{S}^{(n)}$ , so that for $k=1,2,\ldots, v_n$ , $\big(A_k^{(n)}, G_k^{(n)}\big) = \big(X_k^{(n)},Y_k^{(n)}\big)$ . After $v_n$ steps have occurred in both $\mathcal{G}^{(n)}$ and $\mathcal{S}^{(n)}$ , we allow the two processes to differ as follows. The process $\mathcal{G}^{(n)}$ is associated with an epidemic-type process, $\mathcal{E}^{(n)}_G$ , which has a higher rate of events than the epidemic process $\mathcal{E}^{(n)}$ , but in such a way that the additional events in $\mathcal{E}^{(n)}_G$ , which do not occur in $\mathcal{E}^{(n)}$ , are infection events where no infections occur. In this way we can construct $\mathcal{E}^{(n)}_G$ from $\mathcal{E}^{(n)}$ so that all events in $\mathcal{E}^{(n)}$ occur in $\mathcal{E}^{(n)}_G$ , but there are additional ghost events which occur in $\mathcal{E}^{(n)}_G$ where there is no change in state (no infection or removal occurs); the only change is to increment the counter of the number of events. 
Similarly, we can reverse this process and generate $\mathcal{E}^{(n)}$ from $\mathcal{E}^{(n)}_G$ by eliminating, with an appropriate probability, some of the events where no change in state occurs. Therefore, for $k=1,2,\ldots$ , there exists $\kappa_n (k) \leq k$ such that
$$\big(X^{(n)}_{\kappa_n (k)}, Y^{(n)}_{\kappa_n (k)}\big) = \big(A_k^{(n)}, G_k^{(n)}\big). \tag{5.4}$$
Note that for $k=1,2,\ldots,v_n$ , $\kappa_n (k) = k$ .
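The ghost-event construction can be illustrated by a small simulation. The following is a minimal sketch under stated assumptions, not the construction used in the proofs: the names `q_toy` and `ghost_chain`, the toy rate function, and all numerical values are hypothetical. Only the idea is taken from the text, namely that events are proposed at the constant per-infective rate $\gamma + \beta^{(n)}$ and that surplus events with no change of state are counted in the dominating chain but not in the original chain.

```python
import random

def q_toy(x, n, beta):
    """Hypothetical per-infective rates q(x, y, w) for w = 0, 1, 2.
    Their sum never exceeds beta, mirroring the bound q <= beta in the text."""
    s = x / n                                  # fraction of susceptibles
    return {0: 0.2 * beta * s,                 # genuine mixing event, no infection
            1: 0.5 * beta * s,                 # one new infective
            2: 0.2 * beta * s * s}             # two new infectives

def ghost_chain(n, y0, gamma, beta, steps, rng):
    """Run the dominating chain G and record kappa(k): the number of events
    of the original chain S among the first k events of G."""
    x, y, kappa = n - y0, y0, 0
    path = [(x, y, kappa)]
    for _ in range(steps):
        if y == 0:                             # no infectives: nothing more happens
            break
        q = q_toy(x, n, beta)
        infect = {w: r for w, r in q.items() if w >= 1}
        null_rate = beta - sum(infect.values())  # no-infection events in G
        u = rng.random() * (gamma + beta)
        if u < gamma:                          # removal: occurs in both chains
            y -= 1
            kappa += 1
        else:
            u -= gamma
            w_hit = 0
            for w, r in sorted(infect.items()):
                if u < r:
                    w_hit = w
                    break
                u -= r
            if w_hit > 0:                      # infection: occurs in both chains
                w_hit = min(w_hit, x)
                x -= w_hit
                y += w_hit
                kappa += 1
            elif rng.random() < q[0] / null_rate:
                kappa += 1                     # genuine event without infection
            # otherwise: ghost event, counted in G only
        path.append((x, y, kappa))
    return path
```

In this sketch `kappa` plays the role of $\kappa_n(k)$: it increases by one at every event that also occurs in the original chain and is left unchanged at a ghost event, so that $\kappa_n(k) \leq k$ throughout.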
We couple $\mathcal{G}^{(n)}$ and $\mathcal{L}^{(n)}$ so that, with probability tending to 1 as $n \to \infty$ , we have $G_k^{(n)} \geq L_k^{(n)}$ for $k=1,2,\ldots,\lfloor \delta_1 n \rfloor$ , where $\delta_1 \gt 0$ is given in (5.7). That is, for the first $\lfloor \delta_1 n \rfloor$ events, the number of infectives in the process $\mathcal{G}^{(n)}$ is at least the number of individuals alive in the random walk $\mathcal{L}^{(n)}$ . It then follows that, for $k=1,2,\ldots, \lfloor \delta_1n \rfloor$ , we have
$$Y^{(n)}_{\kappa_n (k)} = G_k^{(n)} \geq L_k^{(n)} \tag{5.5}$$
with probability tending to 1 as $n \rightarrow \infty$ . The proof of Theorem 3.2(b) follows almost immediately after we have established (5.5).
5.4. Lower bound for the size of a major epidemic outbreak
In this section we formally define the lower-bound random walk $\mathcal{L}^{(n)}$ . Then, in Lemma 5.1, we show that $\mathcal{L}^{(n)}$ has positive drift, so that after $\lfloor \delta_1 n \rfloor$ steps we have, for some $\delta \gt 0$ (defined in (5.10)), that $L_{\lfloor \delta_1 n \rfloor}^{(n)} \geq \delta n$ with probability tending to 1 as $n \rightarrow \infty$ . This is followed by the construction of $\mathcal{G}^{(n)}$ . We show that whilst fewer than $\lfloor \delta_1 n \rfloor$ events have occurred, the number of susceptibles (with high probability) remains above $(1- \epsilon_1) n$ , where $\epsilon_1 \gt 0$ is given in (5.7), which enables us to show, in Lemma 5.2, that the inequality in (5.5) holds with probability tending to 1 as $n \rightarrow \infty$ . We then establish the equality in (5.5) through the coupling of $\mathcal{S}^{(n)}$ and $\mathcal{G}^{(n)}$ , with the proof of Theorem 3.2(b) following.
By (4.18), we have, for all $\big(x^{(n)}, y^{(n)}\big)$ , that $q^{(n)} \big(x^{(n)}, y^{(n)} \big) \leq \lambda \mu_C^{(n)} = \beta^{(n)}$ , say. Since $R_0 \gt 1 $ , and by (3.9), $ {\mathbb{E}} [ C (C -1) \nu (C)] \lt \infty$ , we can define
We can then fix $\delta_1$ and $\epsilon_1 \lt \epsilon$ such that
$$0 \lt \delta_1 \lt \epsilon_1 \mu_C/{\mathbb{E}}[C^2]. \tag{5.7}$$
Throughout the remainder of the section we assume that $\delta_1$ and $\epsilon_1$ satisfy (5.7). For $n=1,2,\ldots$ and $w=1,2,\ldots,n-1$ , let
and $\psi_0^{(n)} = 1- \sum_{v=1}^{n-1} \psi_v^{(n)}$ . Note that there exists $n_0 \in \mathbb{N}$ such that for all $n \geq n_0$ , $2 \epsilon_1 \lfloor n/2 \rfloor \gt 1$ , and hence $\psi_w^{(n)} =0$ for all $w \gt \lfloor n/2 \rfloor$ . We define
Let $\hat{Z}^{(n)}_{v_n +1}, \hat{Z}^{(n)}_{v_n +2}, \ldots$ , be i.i.d. according to $\hat{Z}^{(n)}$ given in (5.9). Then, given $L_{v_n}^{(n)} = Y_{v_n}$ , for $v_n +1 \leq k \leq \lfloor \delta_1 n \rfloor$ we set
$$L_k^{(n)} = L_{k-1}^{(n)} + \hat{Z}^{(n)}_k.$$
Lemma 5.1. Let $b_n$ be any sequence of positive integers such that $b_n \to \infty$ as $n \to \infty$ . Let $\delta_1$ satisfy (5.7), and let $\delta$ satisfy
where $\epsilon_1 \lt \epsilon$ ensures that the right-hand side of (5.10) is positive. Then
where $v_n = {\mathbb{P}}(Z_1 \neq 0) u_n/3$ .
Recall the expression (3.6) for $R_0$ and the fact that by (3.9), ${\mathbb{E}}[C(C-1) \nu (C)] \lt\infty$ . Since $ \lambda {\mathbb{E}} [C^{(n)}\nu \big(C^{(n)}\big)] \rightarrow R_0 \gamma$ and $\beta^{(n)} \to \beta = \lambda \mu_C$ as $n \rightarrow \infty$ , we have that
where the final inequality follows from (5.10). It also follows from (5.8) that $\beta^{(n)} \psi_w^{(n)} \leq \lambda \sum_{c=w+1}^n c p_C^{(n)} (c) \pi_c (w;\, c-1,1)$ , for all $w=1,2,\ldots, n-1$ , so
where ${\mathbb{E}} [ C (C-1) \nu (C)]\lt\infty$ ensures that the right-hand side of (5.12) is finite.
Let $z_n$ be the probability that the random walk $\mathcal{L}^{(n)}$ ever hits 0 given $L^{(n)}_{v_n} =1$ . Since $\liminf_{n \to \infty}{\mathbb{E}} [\hat{Z}^{(n)} ] \gt 0$ and $\sup_n {\mathbb{E}} [(\hat{Z}^{(n)})^2 ] \lt \infty $ , it follows from Ball and Neal [Reference Ball and Neal4, Lemma A.3] that $\limsup_{n \to \infty} z_n \lt 1$ . Therefore, for any sequence $b_n \to \infty$ , we have that $z_n^{b_n} \to 0$ as $n \rightarrow \infty$ . Moreover, since $\sup_n {\mathbb{E}} [(\hat{Z}^{(n)})^2 ] \lt \infty $ , it follows by the weak law of large numbers for triangular arrays (e.g. Durrett [Reference Durrett9, Theorem 2.2.4]) that
Hence, for any $b_n \to \infty$ and $\delta$ satisfying (5.10), we have that (5.11) holds.
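The role of the hitting probability in this argument can be checked numerically for a fixed step distribution. The sketch below is illustrative only and relies on a standard fact that is not restated in the text: for a walk whose downward steps have size 1, the probability $z$ of ever hitting 0 from 1 is the smallest root in $[0,1]$ of $z = \sum_w {\mathbb{P}}(\text{step} = w)\, z^{w+1}$ , and the probability of hitting 0 from $b$ is $z^b$ . The function name `hitting_probability` and the example step distribution are hypothetical.

```python
def hitting_probability(step_pmf, tol=1e-12, max_iter=100_000):
    """Smallest root in [0, 1] of z = sum_w P(step = w) * z**(w + 1).

    step_pmf maps step sizes w in {-1, 0, 1, 2, ...} to probabilities.
    Iterating from z = 0 converges monotonically to the smallest root.
    """
    z = 0.0
    for _ in range(max_iter):
        z_new = sum(p * z ** (w + 1) for w, p in step_pmf.items())
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

# Hypothetical walk with positive drift: down 1 w.p. 1/4, up 1 w.p. 3/4.
z = hitting_probability({-1: 0.25, 1: 0.75})   # smallest root of z = 1/4 + 3/4 z^2
survival_from_30 = 1.0 - z ** 30               # start from height b = 30
```

For this example the fixed-point equation has roots $1/3$ and $1$ , so $z = 1/3$ and $z^{b}$ decays geometrically in the starting height $b$ , which is the mechanism behind $z_n^{b_n} \to 0$ .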
Turning to the intermediary (ghost) process $\mathcal{G}^{(n)}$ , we define independent random variables $W^{(n)}_k \big(x^{(n)}, y^{(n)}\big)$ $(k=v_n +1, v_n + 2, \ldots)$ to define the transitions given the current state $(A^{(n)}, G^{(n)}) = \big(x^{(n)}, y^{(n)}\big)$ after event $v_n$ . For $k=v_n + 1,v_n +2,\ldots$ , let $W_k^{(n)} \big(x^{(n)}, y^{(n)}\big)$ satisfy
Then for $\Big(A^{(n)}_{k-1}, G_{k-1}^{(n)}\Big) = \big(x^{(n)}, y^{(n)}\big)$ , we set
The continuous-time epidemic-type process $\mathcal{E}_G^{(n)}$ is constructed from $\mathcal{G}^{(n)}$ as follows. If $G_k^{(n)} = y^{(n)}$ , then the time from the kth to the $(k+1)$ th event is drawn from ${\rm Exp} \big((\gamma + \beta^{(n)}) y^{(n)}\big)$ , regardless of $A^{(n)}$ , the number of susceptibles. Therefore, if there are $y^{(n)}$ infectives in the population, mixing events occur at rate $\beta^{(n)} y^{(n)}$ , with the number of individuals infected in such a mixing event depending on the number of susceptibles, $A^{(n)}$ .
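As an illustration of this time change, the event times of $\mathcal{E}_G^{(n)}$ can be generated from a path of the embedded chain as follows. This is a minimal sketch; the function name `event_times` and the example path are hypothetical, and the input is assumed to list the numbers of infectives $G_0^{(n)}, G_1^{(n)}, \ldots$ after successive events.

```python
import random

def event_times(infectives, gamma, beta, rng):
    """Attach continuous-time event times to an embedded-chain path.

    infectives[k] is the number of infectives after the k-th event; the
    time from the k-th to the (k+1)-th event is Exp((gamma + beta) * y),
    whatever the number of susceptibles.
    """
    t, times = 0.0, []
    for y in infectives:
        if y <= 0:                  # no infectives left: no further events
            break
        t += rng.expovariate((gamma + beta) * y)   # argument is a rate
        times.append(t)
    return times

# Hypothetical path of infective counts, ending in extinction.
times = event_times([3, 4, 3, 2, 1, 0], gamma=1.0, beta=2.0,
                    rng=random.Random(0))
```

Because the holding-time rate depends only on the number of infectives, the same exponential clocks can be reused for any process sharing the path of $G^{(n)}$ , which is what makes this construction convenient for coupling.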
We consider the coupling of $\mathcal{G}^{(n)}$ and $\mathcal{L}^{(n)}$ in Lemma 5.2 before finalising the coupling between $\mathcal{G}^{(n)}$ and $\mathcal{S}^{(n)}$ .
Lemma 5.2. There exists a coupling of $\mathcal{G}^{(n)}$ and $\mathcal{L}^{(n)}$ such that, for any $\delta_1$ satisfying (5.7),
Proof. By the construction of $\mathcal{G}^{(n)}$ and $\mathcal{L}^{(n)}$ and Lemma 4.2, (4.28), we have that with probability tending to 1, $G_{k}^{(n)} = Y_{k} = L_{k}^{(n)}$ $(k=1,2,\ldots,v_n)$ .
The first step is to show that ${\mathbb{P}} \Big(A_{\lfloor \delta_1 n \rfloor}^{(n)} \geq (1-\epsilon_1) n\Big) \to 1$ as $n \to \infty$ , which, since $A_k^{(n)}$ is non-increasing in k, implies that, for all $k=1,2,\ldots,\lfloor \delta_1 n \rfloor$ , $A_k^{(n)} \geq (1-\epsilon_1) n$ with probability tending to 1 as $n \to \infty$ .
It is straightforward to show, for all $\big(x^{(n)}, y^{(n)}\big)$ , that $W^{(n)} \big(x^{(n)}, y^{(n)}\big)$ is stochastically smaller than ( $ \stackrel{st}{\le}$ ) a random variable $\tilde{W}^{(n)}$ , where ${\mathbb{P}} (\tilde{W}^{(n)} = -1) = \gamma/\big(\gamma + \beta^{(n)}\big)$ and for $k=1,2,\ldots$ , ${\mathbb{P}} (\tilde{W}^{(n)} = k) = \beta^{(n)} {\mathbb{P}} (\tilde{C}^{(n)} = k+1) /\big(\gamma + \beta^{(n)}\big)$ , with
Note that $\tilde{C}^{(n)}$ is the size-biased distribution of mixing group sizes, and for $c \geq 2$ , ${\mathbb{P}} (\tilde{W}^{(n)}=c-1)$ is the probability that an event is a mixing event multiplied by the probability that a mixing event involving a given infective is of size c. It is then assumed that a mixing group of size c involving an infective produces $c-1$ new infections, the maximum number of new infections that can be produced from a mixing group of size c.
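The distribution of $\tilde{W}^{(n)}$ is easily computed for a given group-size distribution. The sketch below assumes the standard size-biased form ${\mathbb{P}}(\tilde{C}^{(n)} = c) = c\, p_C^{(n)}(c)/\mu_C^{(n)}$ ; the function names and the example distribution `p_C` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def size_biased(p_C):
    """Size-biased version of a group-size distribution {c: P(C = c)}."""
    mu = sum(c * p for c, p in p_C.items())
    return {c: c * p / mu for c, p in p_C.items()}

def w_tilde_pmf(p_C, gamma, beta):
    """Distribution of the dominating step: -1 for a removal, c - 1 for a
    mixing event of size c in which everyone else is assumed infected."""
    pmf = {-1: gamma / (gamma + beta)}
    for c, p in size_biased(p_C).items():
        pmf[c - 1] = beta * p / (gamma + beta)
    return pmf

# Hypothetical example: groups of size 2, 3, 4 with probabilities 1/2, 1/4, 1/4.
p_C = {2: Fraction(1, 2), 3: Fraction(1, 4), 4: Fraction(1, 4)}
tilde_C = size_biased(p_C)          # {2: 4/11, 3: 3/11, 4: 4/11}
pmf = w_tilde_pmf(p_C, gamma=Fraction(1), beta=Fraction(2))
```

In the example, size biasing shifts mass towards larger groups ( $\mu_C = 11/4$ ), reflecting that a given infective is more likely to be found in a large mixing event than in a small one.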
The proof that $W^{(n)} \big(x^{(n)}, y^{(n)}\big) \stackrel{st}{\le} \tilde{W}^{(n)}$ is as follows. Remember that $Q_c^{(n)} \big(l,i|x^{(n)}, y^{(n)}\big) $ , defined in (4.2), is the probability that a mixing group of size c in a population of size n containing $x^{(n)}$ susceptibles and $y^{(n)}$ infectives contains l susceptibles and i infectives. Note that for $v\gt 0$ , the probability that such a mixing group results in v new infectives, $\pi_c (v;\, l,i)$ , is not equal to 0 only if $i \gt 0$ and $v \leq l$ . Given that there are $y^{(n)}$ infectives in the population, the probability that a mixing group of size c includes at least one infective is at most $c y^{(n)}/n$ , and therefore, for $w=0,1,\ldots$ ,
as required.
Hence we can couple $\Big\{W^{(n)}_k \Big(A^{(n)}_{k-1}, G^{(n)}_{k-1}\Big) \Big\}$ and $\big\{\tilde{W}^{(n)}_k \big\} $ so that, for all $k=v_n + 1,v_n+ 2,\ldots$ ,
Recall that ${\mathbb{E}} [(C^{(n)})^2] \to {\mathbb{E}} [C^2]$ as $n \to \infty$ , where ${\mathbb{E}} [C^2] \lt \infty$ . By the weak law of large numbers for triangular arrays, we have that
where the right-hand side of (5.15) is less than ${\mathbb{E}}[C^2]/\mu_C$ . As noted at the start of the proof, $G_{k}^{(n)} = Y_{k} = L_{k}^{(n)}$ $(k=1,2,\ldots,v_n)$ with probability tending to 1 as $n \to \infty$ , where $Y_k = m + \sum_{j=1}^k \tilde{Z}_j$ . Therefore, for any $\epsilon_2$ satisfying $\delta_1 {\mathbb{E}}[C^2]/\mu_C \lt \epsilon_2 \lt \epsilon_1$ , we have that
In Section 4.3, we showed that, during the early stages of the epidemic, the contribution to the spread of the disease from mixing events containing more than one non-susceptible individual is negligible, and whilst the number of susceptibles remains above $(1-\epsilon_1) n$ we can similarly bound the contribution from mixing events with multiple non-susceptible individuals. Following the proof of Lemma 4.1, we have for $w=1,2,\ldots, \lfloor n/2 \rfloor$ that $q^{(n)} \big(x^{(n)},y^{(n)},w\big) \geq q^{(n)}_1 \big(x^{(n)},y^{(n)},w\big) $ , and using (4.12) and (4.13), for $x^{(n)} \geq (1- \epsilon_1) n$ , we have that
for all sufficiently large n, as, for such n, $\psi_w^{(n)} = 0$ for all $w \gt \lfloor n/2 \rfloor$ . Hence, for $k=v_n +1, v_n +2, \ldots, \lfloor \delta_1 n \rfloor$ , provided that $A_{k-1}^{(n)} \geq (1-\epsilon_1)n$ , we can couple $W^{(n)}_k \Big(A_{k-1}^{(n)}, G_{k-1}^{(n)}\Big)$ and $\hat{Z}^{(n)}_k$ , defined in (5.13) and (5.9) respectively, so that
for all sufficiently large n. Specifically, we couple deaths (downward steps) in $\mathcal{L}^{(n)}$ with removals in $\mathcal{G}^{(n)}$ , so $W^{(n)}_k \big(A_{k-1}^{(n)}, G_{k-1}^{(n)}\big) =-1$ if and only if $\hat{Z}^{(n)}_k =-1$ . For $w =1,2,\ldots$ , if $W^{(n)}_k \big(A_{k-1}^{(n)}, G_{k-1}^{(n)}\big) = w$ then we set $\hat{Z}^{(n)}_k =w$ with probability $\psi_w^{(n)}/q^{(n)} \big(x^{(n)}, y^{(n)},w\big)$ and set $\hat{Z}^{(n)}_k =0$ otherwise. Provided that $A_{\lfloor \delta_1 n \rfloor}^{(n)} \geq (1-\epsilon_1) n$ , it then immediately follows by induction that for $k=v_n +1, v_n +2, \ldots, \lfloor \delta_1 n \rfloor$ ,
and (5.14) holds.
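The thinning step in this coupling can be sketched in a few lines. The code below is a simplified illustration, not the coupling of the proof: it uses fixed step distributions rather than state-dependent ones, ignores absorption at 0, and treats the hypothetical dictionaries `q` and `psi` as probabilities on a common scale with `psi[w] <= q[w]` for every `w`. The names `coupled_step` and `run_coupling` are likewise hypothetical.

```python
import random

def coupled_step(q, psi, p_removal, rng):
    """Return one coupled pair (W, Z_hat).

    q[w]   : P(W = w) for w >= 1 (hypothetical values)
    psi[w] : P(Z_hat = w) for w >= 1, assumed to satisfy psi[w] <= q[w]
    Both variables equal -1 with probability p_removal.
    """
    u = rng.random()
    if u < p_removal:               # removal in G <=> downward step in L
        return -1, -1
    u -= p_removal
    for w in sorted(q):
        if u < q[w]:                # W = w new infectives
            keep = rng.random() < psi.get(w, 0.0) / q[w]
            return w, (w if keep else 0)   # thin the step of L
        u -= q[w]
    return 0, 0                     # no infection in either process

def run_coupling(start, steps, q, psi, p_removal, seed=0):
    """Run both walks from a common start; report whether G >= L throughout."""
    rng = random.Random(seed)
    G = L = start
    dominated = True
    for _ in range(steps):
        w, z = coupled_step(q, psi, p_removal, rng)
        G, L = G + w, L + z
        dominated = dominated and G >= L
    return G, L, dominated
```

Since every coupled pair satisfies $\hat{Z} \leq W$ , the difference between the two walks is non-decreasing, which is the induction step used to pass from equality at step $v_n$ to the inequality at all later steps.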
The final step is to couple $\mathcal{S}^{(n)}$ to $\mathcal{G}^{(n)}$ . By definition, the processes $\mathcal{S}^{(n)}$ and $\mathcal{G}^{(n)}$ coincide for the first $v_n$ events, so $\big(X_{v_n}^{(n)}, Y_{v_n}^{(n)}\big) = \big(A_{v_n}^{(n)}, G_{v_n}^{(n)}\big)$ . Remember, for $k=v_n +1, v_n +2, \ldots, \lfloor \delta_1 n \rfloor$ , that $\kappa_n (k)$ , defined in (5.4), denotes the number of events that have occurred in $\mathcal{S}^{(n)}$ up to and including the kth event in $\mathcal{G}^{(n)}$ , with $\kappa_n (v_n) = v_n$ by definition. It is helpful to consider the epidemic processes $\mathcal{E}^{(n)}$ $(\mathcal{S}^{(n)})$ and $\mathcal{E}_G^{(n)}$ $(\mathcal{G}^{(n)})$ and to note that the rates at which events occur are $\Big\{\gamma + \sum_{v=0}^{n-1} q^{(n)} (x^{(n)},y^{(n)},v)\Big\} y^{(n)}$ and $\big\{\gamma + \beta^{(n)}\big\} y^{(n)}$ , respectively. Therefore, the rate of occurrence of an event which results in $w =1,2,\ldots, n-1$ infections, when the population is in state $\big(x^{(n)}, y^{(n)}\big)$ , is $q^{(n)} \big(x^{(n)},y^{(n)},w\big) y^{(n)}$ in both $\mathcal{E}^{(n)}$ and $\mathcal{E}_G^{(n)}$ . Similarly, the rate at which a removal occurs in state $\big(x^{(n)}, y^{(n)}\big)$ is $\gamma y^{(n)}$ . Thus, the only difference in event rates is for infection events which produce no infections where the rates are $q^{(n)} \big(x^{(n)},y^{(n)},0\big)$ and $\beta^{(n)} - \sum_{v=1}^{n-1} q^{(n)} \big(x^{(n)},y^{(n)},v\big)$ in $\mathcal{E}^{(n)}$ and $\mathcal{E}_G^{(n)}$ , respectively. Hence, if $W_k^{(n)} \Big(A_{k-1}^{(n)},G_{k-1}^{(n)}\Big) \neq 0$ , we set $\kappa_n (k) = \kappa_n (k-1) +1$ , and
That is, each event which leads to a change in the state of the population in $\mathcal{E}^{(n)}_G$ $(\mathcal{G}^{(n)})$ has a corresponding event in $\mathcal{E}^{(n)}$ $(\mathcal{S}^{(n)})$ . Similarly, if $W_k^{(n)} \Big(A_{k-1}^{(n)},G_{k-1}^{(n)}\Big) = 0$ , we set $\kappa_n (k) = \kappa_n (k-1) +1$ with probability $q^{(n)} \big(x^{(n)},y^{(n)},0\big)/ \Big\{\beta^{(n)} - \sum_{v=1}^{n-1} q^{(n)} (x^{(n)},y^{(n)},v)\Big\}$ , and (5.16) holds with $\Big(X_{\kappa_n (k) }^{(n)}, Y_{\kappa_n (k) }^{(n)}\Big) = \Big(X_{\kappa_n (k-1) }^{(n)}, Y_{\kappa_n (k-1) }^{(n)}\Big)$ ; otherwise we set $\kappa_n (k) = \kappa_n (k-1) $ , corresponding to no event in $\mathcal{E}^{(n)}$ $(\mathcal{S}^{(n)})$ and a ghost event in $\mathcal{E}_G^{(n)}$ $(\mathcal{G}^{(n)})$ . Thus there exists $\kappa_n (\lfloor \delta_1 n \rfloor) \leq \lfloor \delta_1 n \rfloor$ such that
Proof of Theorem 3.2 (b). Let $v_n = \lfloor \log n \rfloor$ and $u_n = \lfloor 3 v_n /{\mathbb{P}} (Z_1 \neq 0) \rfloor$ , so $(u_n)$ satisfies the conditions stated in Lemma 4.2. Let $\epsilon$ satisfy (5.6), and fix $\delta_1$ and $\epsilon_1 \lt \epsilon$ such that $0 \lt \delta_1 \lt \epsilon_1 \mu_C/{\mathbb{E}}[C^2]$ (i.e. satisfying (5.7)). Since $R_0\gt 1$ , we have ${\mathbb{E}} [Z]\gt 0$ , and it follows from (5.3) that, given $T^{(n)} \geq v_n$ , $Y_{v_n}\gt v_n {\mathbb{E}} [Z]/2$ with probability tending to 1 as $n \rightarrow \infty$ .
Given $\delta \gt 0$ satisfying (5.10), it follows from (5.17) and Lemma 5.2 that, with probability tending to 1 as $n \to \infty$ , for all $k = v_n +1, v_n +2, \ldots, \lfloor \delta_1 n \rfloor$ ,
By setting $b_n = v_n {\mathbb{E}} [Z]/2$ in Lemma 5.1, we have that, as $n \to \infty$ ,
Given that $T^{(n)} \geq Y_{\kappa_n (\lfloor \delta_1 n \rfloor) }$ , (3.14) follows immediately.
Finally, note that Corollary 3.2 follows immediately from (5.18) as $\sup_{t \geq 0} I^{(n)} (t) \geq Y_{\kappa_n (\lfloor \delta_1 n \rfloor) }$ .
6. Concluding comments
As noted in the introduction, the aims of this paper are to provide a rigorous justification for the approximating branching process introduced in Ball and Neal [Reference Ball and Neal5] and a proof of a key result (Theorem 3.2 of this paper) required for a central limit theorem in [Reference Ball and Neal5] for the size of a major outbreak for epidemics with few initial infectives. The latter clearly requires a limit theorem. A limit theorem is also a common approach to rigorously justifying a branching process, but for practical purposes it is often useful to have information concerning the accuracy of the approximation for finite population size n, as given for example by a bound on the total variation distance between the epidemic $\mathcal{E}^{(n)}$ in a population of size n and the limiting branching process $\mathcal{B}$ . A detailed analysis of such accuracy of approximation is beyond the scope of the paper, so here we make a few very brief comments.
Recall from Section 4.5 that $H_n$ denotes the first event at which a mismatch occurs between the embedded discrete jump processes of $\mathcal{E}^{(n)}$ and $\mathcal{B}$ . It follows immediately from results in the proof of Lemma 4.2, using the notation in that lemma and its proof, that
thus yielding a bound on the total variation distance between $\mathcal{E}^{(n)}$ and $\mathcal{B}$ over the first $u_n$ events for quantities that do not depend on the times of those events. The latter can be included by adding ${\mathbb{P}}\big(A_{n,3}^c\big)$ to the right-hand side of (6.1). Bounds for ${\mathbb{P}}(A_{n,i}^c)$ ( $i=0,1,2,3$ ) can be obtained using results given in the proof of Lemma 4.2. For approximation purposes a source of inaccuracy can be removed by using a branching process defined analogously to $\mathcal{B}$ but with C replaced by $C^{(n)}$ .
Acknowledgement
We would like to thank two anonymous reviewers for their helpful comments, which have significantly improved the presentation. In particular, comments by one reviewer motivated the exact coupling of event times between the epidemic and branching processes.
Funding information
There are no funding bodies to thank in relation to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.