1. Introduction
Shot noise models built on one-dimensional Poisson processes are very popular in applied probability. Because of their versatility and mathematical tractability, they find application in many fields, such as insurance, finance, queueing theory, and neuroscience (see e.g. [Reference Brémaud7, Reference Brigham and Destexhe10, Reference Ganesh, Macci and Torrisi17, Reference Ganesh, Macci and Torrisi18, Reference Konstantopoulos and Lin24, Reference Klüppelberg and Mikosch27, Reference Klüppelberg and Mikosch28, Reference Leonardi and Torrisi31, Reference Macci, Stabile and Torrisi33, Reference Møller and Torrisi34, Reference Privault36, Reference Torrisi43, Reference Torrisi and Leonardi46]). Shot noise models whose underlying point processes are spatial Poisson processes (hereafter called spatial Poisson shot noise models; see Section 2 for a formal definition) are somewhat less popular, but they play an important role in wireless communication, where they are exploited as models of the interference in ad hoc networks (see e.g. [Reference Baccelli and Błaszczyszyn1–Reference Baccelli and Błaszczyszyn3, Reference Ganesh and Torrisi16, Reference Privault and Torrisi38, Reference Torrisi and Leonardi42]). Furthermore, as explained in detail in the next section, spatial Poisson shot noise models encompass spatial Poisson cluster point processes, which are widely used in many research areas, such as spatial statistics (see e.g. [Reference Møller and Waagepetersen35]). Since spatial Poisson shot noise models are stochastic integrals with respect to a Poisson random measure, Gaussian approximation bounds for the Wasserstein and Kolmogorov distances between such random variables (properly standardized) and the standard normal law can easily be obtained by applying the general theory developed in the seminal paper [Reference Last, Peccati and Schulte30].
One of the main achievements of the present article is a set of explicit bounds for the Wasserstein and Kolmogorov distances between a properly standardized compound sum, which extends Poisson cluster and Hawkes point processes, and the standard normal law (see Corollaries 1, 3 and 5). These results improve upon and go beyond the findings in [Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26], exploiting a considerably simpler approach (see also the discussion in Section 7.3.1). Using a well-known link between cumulants and large deviation theory (see [Reference Saulis and Statulevicius40]), we also provide sufficient conditions which guarantee moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term (see Definition 1 for details) for sequences of random variables which belong to the first chaos on the Poisson space (see Theorem 2). We then transfer such results to sequences of spatial Poisson shot noise models. As a main application, we provide moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for sequences of compound sums, which extend Poisson cluster and Hawkes point processes (see Corollaries 2, 4, and 6). Remarkably, the result on moderate deviations recovers, under an alternative condition on the fertility function, the moderate deviations for the number of points of a classical Hawkes process on the time interval (0, t] proved in [Reference Zhu47] (see Section 7.3.2).
The paper is structured as follows. In Section 2 we introduce the Poisson shot noise models considered in the paper, and we show that compound Poisson cluster point processes and generalized compound Hawkes processes are indeed Poisson shot noise models. Furthermore, we recall a simple model of wireless communication, which accounts for interference effects described by a Poisson shot noise. In Section 3 we provide an informal description of our results. In Section 4 we give Gaussian approximation bounds for the Wasserstein and Kolmogorov distances between a random variable belonging to the first chaos of the Poisson space and the standard normal law, and we give moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for sequences of random variables belonging to the first chaos on the Poisson space. Applications of the results of Section 4 to spatial Poisson shot noise models and compound Poisson cluster point processes are provided in Sections 5 and 6, respectively. The general results on Gaussian approximation and moderate deviations are applied to generalized compound Hawkes processes in Sections 7 and 8, respectively.
2. Poisson shot noise random variables
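Throughout this article, if x is a point in some set E and $C \subset E$ , then $C-x$ denotes the set $\{y-x, y \in C\}.$ A Poisson shot noise random variable is a real-valued random variable of the form
\begin{equation}S(C)\;:\!=\;\sum_{n\geq 1}H(C-X_n,Z_n),\quad C\in\mathcal{B}({\mathbb R}^d).\tag{1}\end{equation}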
Here $\mathcal{B}({\mathbb R}^d)$ denotes the Borel $\sigma$ -field on ${\mathbb R}^d$ , $d\geq 1$ , $\mathcal{P}\equiv\{(X_n,Z_n)\}_{n\geq 1}$ is a Poisson process on $\mathbb R^d\times{\textbf{Z}}$ with mean measure $\lambda(x)\mathrm{d}x\mathbb{Q}(\mathrm{d}z)$ , $({\textbf{Z}},\mathcal Z)$ is a measurable space, $\lambda\;:\;\mathbb R^d\to [0,\infty)$ is a locally integrable intensity function, $\mathbb{Q}$ is a probability measure on ${\textbf{Z}}$ , and $H\;:\;\mathcal{B}({\mathbb R}^d)\times{\textbf{Z}}\to{\mathbb R}$ is a mapping such that, for each fixed $C\in\mathcal{B}(\mathbb R^d)$ , the function
\[(x,z)\mapsto H(C-x,z),\quad (x,z)\in\mathbb R^d\times{\textbf{Z}},\]
is measurable. Poisson shot noise random variables encompass a variety of important stochastic models.
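To make the definition concrete, the following is a minimal simulation sketch, assuming the usual form $S(C)=\sum_{n\geq 1}H(C-X_n,Z_n)$ in dimension $d=1$. Everything specific here is an illustrative assumption and not taken from the paper: the constant intensity on a bounded window, the exponential mark law, and the response function `response`, which spreads each mark uniformly over a unit interval to the right of its point.

```python
import random


def sample_poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals before time `mean`."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def response(shifted_set, mark):
    """Hypothetical H(C - x, z): the mark times the length of (C - x) intersected with [0, 1]."""
    a, b = shifted_set
    return mark * max(0.0, min(b, 1.0) - max(a, 0.0))


def shot_noise(c, rate, window, rng):
    """One draw of S(C) = sum_n H(C - X_n, Z_n) for an interval C = (c[0], c[1])."""
    lo, hi = window
    total = 0.0
    for _ in range(sample_poisson(rate * (hi - lo), rng)):
        x = rng.uniform(lo, hi)      # point location X_n (constant intensity on the window)
        z = rng.expovariate(1.0)     # mark Z_n, drawn from an assumed Q = Exp(1)
        total += response((c[0] - x, c[1] - x), z)
    return total


rng = random.Random(0)
draws = [shot_noise((0.0, 2.0), 5.0, (-1.0, 3.0), rng) for _ in range(4000)]
sample_mean = sum(draws) / len(draws)
```

Under these assumptions, Campbell's formula gives $\mathbb E S(C)=5\cdot\mathbb E Z\cdot|C|=10$ for $C=(0,2)$, so `sample_mean` should be close to 10.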
2.1. Compound Poisson cluster point processes
Let $\{X_n\}_{n\geq 1}$ be the points of a Poisson process on $\mathbb R^d$ , $d\geq 1$ , with a locally integrable intensity function $\lambda\;:\;\mathbb{R}^d\to [0,\infty)$ , and let $\{Z_n(\cdot,\cdot)\}_{n\geq 1}$ be a sequence of independent and identically distributed simple point processes on ${\mathbb R}^d \times {\mathbb R}$ , independent of $\{X_n\}_{n\geq 1}$ . More concretely, for $(C_1,C_2) \in \mathcal B ({\mathbb R}^d) \times \mathcal B ({\mathbb R})$ , $Z_n(C_1,C_2)$ counts the number of points of the nth point process that fall in $C_1$ and whose marks are in $C_2$ . For each $n\geq 1$ , we denote the points of $Z_n(\cdot, \cdot)$ by $\{(Y_{n,k},M_{n,k})\}_{k\geq 0}$ , and we assume that $Y_{n,0}\;:\!=\; \textbf {0}$ (which implies $Z_n(\{\textbf{0}\},{\mathbb R})\;:\!=\;1$ ) and that the sequence $\{M_{n,k}\}_{k\geq 0}$ is independent of $\{Y_{n,k}\}_{k\geq 0}$ . Furthermore, we suppose that the random variables $\{M_{n,k}\}_{n\geq 1,\,k\geq 0}$ are independent and identically distributed. Throughout the paper we denote by M the generic random variable $M_{n,k}$ .
One naturally interprets the first component of each point of $Z_n(\cdot,\cdot)$ as a ‘location’, and the second component as a ‘mark’ which describes some characteristic of the location to which it is attached. Hereafter, for $n\geq 1$ , we consider the point processes $\theta_{X_n}Z_n(\cdot,\cdot)\equiv\{(X_n+Y_{n,k},M_{n,k})\}_{k\geq 0}$ .
For arbitrarily fixed $n\geq 1$ and $C \in \mathcal B ({\mathbb R}^d)$ , we define the random variable
which aggregates the marks attached to the locations that fall in C. It turns out that the random variable, say V(C), which aggregates all the marks attached to the points which fall in C of the Poisson cluster point process
is a Poisson shot noise random variable. Indeed,
is a random variable of the form (1) with $H(C-x,z)\;:\!=\;v(z)(C-x)$ , $x\in\mathbb{R}^d$ , $z\in\textbf{Z}\;:\!=\;\textbf{N}_{\mathbb{R}^d\times\mathbb{R}}$ . Here $\textbf{N}_{\mathbb R^d\times\mathbb R}$ denotes the space of $\sigma$ -finite counting measures on $(\mathbb R^d\times\mathbb{R},\mathcal B(\mathbb R^d)\otimes\mathcal B(\mathbb R))$ equipped with the usual $\sigma$ -field (see Section 4 for details), and
Note that if $M_{n,k}\;:\!=\;1$ for every $n\geq 1$ and $k\geq 0$ , then the random variable
equals the number of points of the Poisson cluster point process N which fall in $C\in\mathcal{B}({\mathbb R}^d)$ .
2.2. Generalized Hawkes processes and generalized compound Hawkes processes
Let $N\equiv\{N(C)\}_{C\in\mathcal{B}({\mathbb R}^d)}$ be the Poisson cluster point process defined by (5). We shall refer to N as a generalized Hawkes process if the random variable $Z\;:\!=\;Z_1({\mathbb R}^d,{\mathbb R})$ is distributed as the total progeny of a sub-critical Galton–Watson process with one ancestor. It is worthwhile to note the following:
(i) Classical Hawkes processes on $(0,\infty)$ (respectively, on $\mathbb R$ ) with parameters $(\lambda,g)$ , introduced in the seminal papers [Reference Hawkes21, Reference Hawkes and Oakes22], are particular examples of generalized Hawkes processes. Indeed, they are Poisson cluster point processes defined as follows. (1) The process of cluster centers $\{X_n\}_{n\geq 1}$ is a Poisson process on $(0,\infty)$ (respectively, on $\mathbb R$ ) with constant intensity equal to $\lambda>0$ . (2) The points of the cluster $\theta_{X_n}Z_n(\cdot,{\mathbb R})$ are partitioned into generations and generated recursively as follows. The ancestor constitutes the generation 0 of the cluster and is located at $X_n$ . Given $X_n$ , the ancestor generates points of the first generation of the cluster according to a non-homogeneous Poisson process on $(X_n,\infty)$ with intensity function $g(\cdot-X_n)$ , where $g\;:\;\mathbb R\to [0,\infty)$ is a measurable function which is null on $(\!-\!\infty,0]$ and such that $h\;:\!=\;\int_0^\infty g(x)\mathrm{d}x<1$ . In turn, given the points of the first generation of the cluster, a point of this generation, which is located at X, generates points of the second generation of the cluster according to a non-homogeneous Poisson process on $(X,\infty)$ with intensity function $g(\cdot-X)$ ; and so on and so forth. Note that $Z_n(\mathbb R,{\mathbb R})=\theta_{X_n}Z_n(\mathbb R,{\mathbb R})=\theta_{X_n}Z_n([X_n,\infty),{\mathbb R})=Z_n([0,\infty),{\mathbb R})$ is distributed as the total progeny of a sub-critical Galton–Watson process with one ancestor and Poisson offspring law with mean h.
(ii) Spatial Hawkes processes on ${\mathbb R}^d$ , $d\geq 1$ , with parameters $(\lambda,g)$ , introduced in [Reference Brémaud, Massoulié and Ridolfi9] and further studied in [Reference Møller and Torrisi34], are also particular examples of generalized Hawkes processes. Indeed, they are Poisson cluster point processes defined as follows. (1) The process of cluster centers $\{X_n\}_{n\geq 1}$ is a Poisson process on $\mathbb R^d$ , $d\geq 1$ , with constant intensity equal to $\lambda>0$ . (2) The points of the cluster $\theta_{X_n}Z_n(\cdot,{\mathbb R})$ are partitioned into generations and generated recursively as follows. The ancestor constitutes the generation 0 of the cluster and is located at $X_n$ . Given $X_n$ , the ancestor generates points of the first generation of the cluster according to a non-homogeneous Poisson process on $\mathbb R^d$ with intensity function $g(\cdot-X_n)$ , where $g\;:\;\mathbb R^d\to [0,\infty)$ is a measurable function such that $h\;:\!=\;\int_{\mathbb R^d}g(x)\mathrm{d}x<1$ . In turn, given the points of the first generation of the cluster, a point of this generation, which is located at X, generates points of the second generation of the cluster according to a non-homogeneous Poisson process on $\mathbb R^d$ with intensity function $g(\cdot-X)$ ; and so on and so forth. Note that $\theta_{X_n}Z_n(\mathbb R^d,{\mathbb R})=Z_n(\mathbb R^d,{\mathbb R})$ is distributed as the total progeny of a sub-critical Galton–Watson process with one ancestor and Poisson offspring law with mean h.
Note that, according to these definitions, classical Hawkes processes on $\mathbb R$ are different from spatial Hawkes processes on $\mathbb R$ .
The collection of random variables $V\equiv\{V(C)\}_{C\in\mathcal{B}({\mathbb R}^d)}$ , where V(C) is defined by (3), will be called a generalized compound Hawkes process if the random variable Z is distributed as the total progeny of a sub-critical Galton–Watson process with one ancestor. Note that V(C) aggregates the marks attached to the points of a generalized Hawkes process which fall in C.
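The cluster-size assumption can be illustrated with a short sketch that samples the total progeny of a sub-critical Galton–Watson process with one ancestor and Poisson offspring law with mean $h<1$, as in the classical and spatial Hawkes examples above. The helper names are hypothetical, and the check relies on the standard fact that the mean total progeny equals $1/(1-h)$.

```python
import random


def sample_poisson(mean, rng):
    """Poisson(mean) draw: count unit-rate exponential arrivals before time `mean`."""
    count, arrival = 0, rng.expovariate(1.0)
    while arrival < mean:
        count += 1
        arrival += rng.expovariate(1.0)
    return count


def total_progeny(h, rng):
    """Total number of individuals (ancestor included) in a Galton-Watson tree
    with Poisson(h) offspring law, h < 1."""
    size, generation = 1, 1
    while generation > 0:
        # Independent Poisson(h) litters sum to a single Poisson(h * generation) draw.
        generation = sample_poisson(h * generation, rng)
        size += generation
    return size


rng = random.Random(1)
sizes = [total_progeny(0.5, rng) for _ in range(20000)]
mean_size = sum(sizes) / len(sizes)
```

With $h=0.5$ the empirical mean `mean_size` should be close to $1/(1-h)=2$.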
2.3. Interference in wireless communication
Consider the following simple model of wireless communication, which accounts for interference effects that arise when several nodes transmit at the same time. Suppose that transmitting nodes (e.g., antennas) are located according to $\{X_n\}_{n\geq 1}$ , a Poisson process on the plane with intensity function $\lambda(\cdot)$ —i.e., $X_n$ is the location of node n—and denote by $Z_n\in (0,\infty)$ the signal power of the transmitting node n. Suppose that the sequence $\{Z_n\}_{n\geq 1}$ is independent of the Poisson process, and that the random variables $Z_n$ , $n\geq 1$ , are independent and identically distributed. Assume that a receiver is located at the origin $\textbf{0}\in\mathbb R^2$ and that a new transmitter is added at $x\in\mathbb R^2$ and has signal power $y\in (0,\infty)$ . Suppose that the physical propagation of the signal is described by a measurable positive function $A\;:\;\mathbb R^2\to (0,\infty)$ , which gives the attenuation or path loss of the signal power. For simplicity, we assume that random fading (due to occluding objects, reflections, multipath interference, etc.) is encoded in the random variables $Z_n\in (0,\infty)$ . Thus, $Z_n A(X_n)$ is the power received at the origin from the transmitting node at $X_n$ , and the total interference at the origin, due to simultaneous transmissions, is equal to
Note that this is a Poisson shot noise random variable of the form (1). Indeed, let $H\;:\;\mathcal{B}({\mathbb R}^2)\times (0,\infty)\to (0,\infty)$ be a mapping which, restricted to ${\mathbb R}^2\times (0,\infty)$ , coincides with $\widetilde{H}(x,z)\;:\!=\;zA(\!-x)$ . Then
We refer the reader to [Reference Baccelli and Błaszczyszyn2, Reference Baccelli and Błaszczyszyn3] for more insight into this model, and limit ourselves to observing that the receiver at the origin can decode the signal of power $y\in (0,\infty)$ from the transmitter at $x\in\mathbb R^2$ if and only if the signal-to-interference-plus-noise ratio (SINR) is bigger than a given threshold, i.e.,
where w is e.g. a thermal noise near the receiver at the origin and $\tau$ is the given threshold.
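The decoding rule can be sketched as follows, assuming the interference is $\sum_{n\geq 1}Z_nA(X_n)$ and the SINR is $yA(x)/(w+I(\{\textbf{0}\}))$. The power-law attenuation `path_loss` and all numerical values are hypothetical choices for illustration, and the strict inequality follows the wording "bigger than".

```python
import math


def path_loss(x, beta=4.0):
    """Hypothetical attenuation A(x) = (1 + |x|)^(-beta), positive on the plane."""
    return (1.0 + math.hypot(x[0], x[1])) ** (-beta)


def interference_at_origin(nodes):
    """Sum of Z_n * A(X_n) over (location, power) pairs."""
    return sum(power * path_loss(loc) for loc, power in nodes)


def can_decode(x, y, nodes, noise, threshold):
    """SINR test for a transmitter at x with power y: y * A(x) / (w + I) > tau."""
    sinr = y * path_loss(x) / (noise + interference_at_origin(nodes))
    return sinr > threshold


# Two interfering nodes given as ((x1, x2), power).
nodes = [((3.0, 4.0), 2.0), ((0.0, 1.0), 1.0)]
total = interference_at_origin(nodes)
```

For instance, `can_decode((1.0, 0.0), 1.0, nodes, noise=0.01, threshold=0.5)` evaluates the criterion for a new transmitter at distance 1 from the receiver.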
3. Informal description of the results
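We start by noting that some results in this paper refer to sequences of Poisson shot noise random variables of the form
\begin{equation}S_\ell(C)\;:\!=\;\sum_{n\geq 1}H(C-X_n^{(\ell)},Z_n^{(\ell)}),\quad C\in\mathcal{B}({\mathbb R}^d),\quad\ell\geq 1,\tag{6}\end{equation}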
where, for each $\ell\geq 1$ , $\mathcal{P}_\ell=\{(X_n^{(\ell)},Z_n^{(\ell)})\}_{n\geq 1}$ is a Poisson process on $\mathbb{R}^d\times{\textbf{Z}}$ with mean measure $\lambda_\ell(x)\mathrm{d}x\mathbb{Q}_\ell(\mathrm{d}z)$ , $\lambda_\ell\;:\;\mathbb{R}^d\to [0,\infty)$ is a locally integrable function, and $\mathbb{Q}_\ell(\cdot)$ is a probability measure on a measurable space $({\textbf{Z}},\mathcal{Z})$ .
The achievements of the paper are the following:
(i) We provide bounds for the Wasserstein and Kolmogorov distances (hereafter denoted by $d_W$ and $d_K$ , respectively; see Section 4.1 for a formal definition of these probability metrics) between a standard Gaussian random variable G and
\begin{equation}T(C)\;:\!=\;\frac{S(C)-\mathbb E S(C)}{\sqrt{\mathbb{V}\mathrm{ar}(S(C))}}, \quad C\in\mathcal{B}(\mathbb R^d);\tag{7}\end{equation}
see Theorem 3. As special cases (see Remark 1), we get bounds for the Wasserstein and Kolmogorov distances between G and
\begin{equation}W(C)\;:\!=\;\frac{V(C)-\mathbb E V(C)}{\sqrt{\mathbb{V}\mathrm{ar}(V(C))}}, \quad C\in\mathcal{B}(\mathbb R^d),\tag{8}\end{equation}
where V(C) is defined by (3), and between G and
\begin{equation}L(\{\textbf{0}\})\;:\!=\;\frac{I(\{\textbf{0}\})-\mathbb E I(\{\textbf{0}\})}{\sqrt{\mathbb{V}\mathrm{ar}(I(\{\textbf{0}\}))}}, \quad\textbf{0}\in\mathbb R^2,\tag{9}\end{equation}
where $I(\{\textbf{0}\})$ is defined in Section 2.3.
(ii) We provide moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for the sequence $\{T_\ell(C_\ell)\}_{\ell\geq 1}$ , where
\[T_\ell(C_\ell)\;:\!=\;\frac{S_\ell(C_\ell)-\mathbb E S_\ell(C_\ell)}{\sqrt{\mathbb{V}\mathrm{ar}(S_\ell(C_\ell))}}, \quad C_\ell\in\mathcal{B}(\mathbb R^d);\]
see Theorem 4. As particular cases (see Remark 2), we get moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for the sequences
\begin{equation}W_\ell(C_\ell)\;:\!=\;\frac{V_\ell(C_\ell)-\mathbb E V_\ell(C_\ell)}{\sqrt{\mathbb{V}\mathrm{ar}(V_\ell(C_\ell))}}, \quad C_\ell\in\mathcal{B}(\mathbb R^d),\quad\ell\geq 1,\tag{10}\end{equation}
where $V_\ell(C_\ell)$ is defined by (3), with obvious modifications, and
\begin{equation}L_\ell(\{\textbf{0}\})\;:\!=\;\frac{I_\ell(\{\textbf{0}\})-\mathbb E I_\ell(\{\textbf{0}\})}{\sqrt{\mathbb{V}\mathrm{ar}(I_\ell(\{\textbf{0}\}))}}, \quad\textbf{0}\in\mathbb R^2,\tag{11}\end{equation}
where $I_\ell(\{\textbf{0}\})$ is defined in Section 2.3, with obvious modifications.
(iii) If the Poisson process $\{X_n\}_{n\geq 1}$ has intensity function of the form $\lambda(x)\;:\!=\;\lambda\textbf{1}_{B}(x)$ , $x\in\mathbb R^d$ , for some positive constant $\lambda>0$ and a suitable Borel set $B\subseteq\mathbb R^d$ , then the bounds on $d_W(W(C),G)$ and $d_K(W(C),G)$ are particularly simple and depend only on $\lambda$ , the Lebesgue measure of $B\cap C$ , and a few moments of M and Z; see Corollary 1. If Z is distributed as the total progeny of a sub-critical Galton–Watson process with one ancestor, then we are able to compute the moments of Z in terms of the moments of the offspring law; see Proposition 1. This allows for explicit bounds when $V=\{V(C)\}_{C\in\mathcal{B}({\mathbb R}^d)}$ is a generalized compound Hawkes process with Poisson or binomial offspring laws; see Corollaries 3 and 5, respectively.
(iv) If the Poisson process $\{X_n^{(\ell)}\}_{n\geq 1}$ has intensity function of the form $\lambda_\ell(x)=\lambda_\ell \boldsymbol 1_{B_\ell}(x)$ , $x\in\mathbb R^d$ , for positive constants $\lambda_\ell>0$ and suitable Borel sets $B_\ell\in\mathcal{B}(\mathbb R^d)$ , and $\mathbb Q_\ell\equiv\mathbb Q$ for each $\ell\geq 1$ , then the condition which guarantees a moderate deviation principle, a Bernstein-type concentration inequality, and a normal approximation bound with Cramér correction term for the sequence $\{W_\ell(C_\ell)\}_{\ell\geq 1}$ is quite simple (see the condition (22) of Corollary 2), and it allows for applications to generalized compound Hawkes processes with Poisson and binomial offspring distributions; see Corollaries 4 and 6, respectively.
We conclude this section by emphasizing that the basic idea of this paper is very simple (especially if compared with the techniques exploited in [Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26] for the Gaussian approximation of classical Hawkes processes on $(0,\infty)$ ). Since the random variable S(C) can be rewritten as the Poisson integral
we are able to do the following:
(i) We apply the quantitative central limit theorems, in the Wasserstein and Kolmogorov metrics, for functionals of the Poisson measure proved in the seminal paper [Reference Last, Peccati and Schulte30], to provide normal approximation bounds for first chaoses on the Poisson space (see Theorem 1), and then we apply such bounds to the random variable T(C).
(ii) We prove moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction for a sequence of first chaoses on the Poisson space (see e.g. Theorem 2), and then we apply such results to the sequence $\{T_\ell(C_\ell)\}_{\ell\geq 1}$ .
4. Gaussian approximation and moderate deviations of the first chaos on the Poisson space
Let $(A,\mathcal A,\alpha)$ be a measure space with $\alpha(\cdot)$ a $\sigma$ -finite measure, and let ${\textbf{N}}_A$ be the set of all $\sigma$ -finite counting measures on $(A,\mathcal A)$ equipped with the $\sigma$ -field generated by the mappings $\nu\mapsto\nu(B)$ , $B\in\mathcal A$ . We say $\Pi$ is a Poisson measure on $(A,\mathcal A)$ with mean measure $\alpha(\cdot)$ if it is a measurable mapping from an underlying probability space $(\Omega,\mathcal F,\mathbb P)$ to ${\textbf{N}}_A$ such that (i) for any $B\in\mathcal A$ , $\Pi(B)$ is Poisson distributed with mean $\alpha(B)$ , and (ii) if $B_1,\ldots,B_n\in\mathcal A$ , $n\in\mathbb N$ , are pairwise disjoint, then the random variables $\Pi(B_1),\ldots,\Pi(B_n)$ are independent.
Let $\{\Pi_\ell\}_{\ell\in\mathbb N}$ be a sequence of Poisson measures on $(A,\mathcal A)$ , defined on the probability space $(\Omega,\mathcal F,\mathbb P)$ . Suppose that $\Pi_\ell$ has a $\sigma$ -finite mean measure $\alpha_\ell(\cdot)$ , $\ell\in\mathbb N$ . We denote by $L^m(\alpha_\ell)$ the space of measurable functions $f\;:\;A\to{\mathbb R}$ such that $\int_A|f(a)|^m\alpha_\ell({\mathrm d} a)<\infty$ , $m\in\mathbb N$ , and, for $f_\ell\in L^2(\alpha_\ell)$ , $\ell\in\mathbb N$ , we consider
\[I^{(\ell)}(f_\ell)\;:\!=\;\int_A f_\ell(a)\,(\Pi_\ell({\mathrm d} a)-\alpha_\ell({\mathrm d} a)),\]
the first chaos on the Poisson space, i.e., the first-order stochastic integral of $f_\ell$ with respect to the compensated Poisson measure $\Pi_\ell({\mathrm d} a)-\alpha_\ell({\mathrm d} a)$ . If the law of $\Pi_\ell$ does not depend on $\ell$ , we suppress the dependence on $\ell$ of the related quantities, and for $f\in L^2(\alpha)$ we simply write
\[I(f)\;:\!=\;\int_A f(a)\,(\Pi({\mathrm d} a)-\alpha({\mathrm d} a)).\]
4.1. Bounds on the Wasserstein and Kolmogorov distances between the law of a first chaos on the Poisson space and the standard normal distribution
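Let X and Y be two real-valued random variables defined on $(\Omega,\mathcal{F},\mathbb P)$ . The Wasserstein and Kolmogorov distances between the law of X and the law of Y, written $d_W(X,Y)$ and $d_K(X,Y)$ , respectively, are defined by
\[d_W(X,Y)\;:\!=\;\sup_{g\in\mathrm{Lip}(1)}|\mathbb E g(X)-\mathbb E g(Y)|\]
and
\[d_K(X,Y)\;:\!=\;\sup_{x\in\mathbb R}|\mathbb P(X\leq x)-\mathbb P(Y\leq x)|.\]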
Here, $\mathrm{Lip}(1)$ denotes the set of Lipschitz functions $g\;:\;{\mathbb R}\to{\mathbb R}$ with Lipschitz constant less than or equal to 1. We recall that throughout this paper, G denotes a random variable distributed according to the standard normal law.
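As a small numerical companion to these definitions, the sketch below computes the Kolmogorov distance between the empirical distribution of a finite sample and the standard normal law, i.e. the supremum over $x$ of the absolute difference of the two distribution functions. The function names are hypothetical; the empirical law stands in for the law of X and is not part of the paper's argument.

```python
import math


def normal_cdf(t):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def kolmogorov_to_normal(sample):
    """sup_x |F_n(x) - Phi(x)| for the empirical distribution function F_n."""
    xs = sorted(sample)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        phi = normal_cdf(x)
        # F_n jumps from i/n to (i + 1)/n at x, so the supremum is attained
        # just before or exactly at one of the sample points.
        worst = max(worst, (i + 1) / n - phi, phi - i / n)
    return worst
```

For a one-point sample at the origin the distance is exactly 0.5, since the empirical distribution function jumps from 0 to 1 where the normal one equals 1/2.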
Theorem 1. Let $f\in L^2(\alpha)$ be such that $\|f\|_{L^2(\alpha)}=1$ . Then
\[d_W(I(f),G)\leq\int_A|f(a)|^3\alpha(\mathrm d a)\]
and
Proof. We refer the reader to [Reference Last and Penrose29] for all the notions of stochastic analysis on the Poisson space used in this proof. Let F be a functional of $\Pi$ , i.e., $F=\mathfrak{f}(\Pi)$ , where $\mathfrak f$ is a real-valued measurable function defined on ${\textbf{N}}_A$ . We recall that the difference operator D is defined by
where $\delta_a$ is the Dirac measure at $a\in A$ , and that the second difference operator $D^2$ is defined by
We also recall that the domain of D, denoted by $\mathrm{dom}(D)$ , is the family of square-integrable random variables $F=\mathfrak{f}(\Pi)$ such that
Setting $F\;:\!=\;I(f)$ , we have that ${\mathbb E} F = 0$ with $\mathbb{V}\mathrm{ar}(F)=1$ (as follows by applying the isometry formula for Poisson chaoses) and that $F\in\mathrm{dom}(D)$ . Using Theorem 1.1 in [Reference Last, Peccati and Schulte30] we have that
where
Since for $F=I(f)$ we have that $D_a F=f(a)$ and $D_{a_1,a_2}F=0$ , $a_1,a_2\in A$ , the terms $\gamma_1$ and $\gamma_2$ vanish and $\gamma_3= \int_A |f(a)|^3 \alpha( \mathrm d a)$ . Therefore, we obtain the bound on the Wasserstein distance between I(f) and G.
Similarly, Theorem 1.2 in [Reference Last, Peccati and Schulte30] gives
Using Lemma 4.2 in [Reference Last, Peccati and Schulte30] we have
which yields the upper bound on the Kolmogorov distance.
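As a sanity check of the Wasserstein part of the argument, consider the simplest kernel $f=\textbf{1}_B/\sqrt{\mu}$ with $\mu\;:\!=\;\alpha(B)\in(0,\infty)$, so that $I(f)=(\Pi(B)-\mu)/\sqrt{\mu}$ is a standardized Poisson random variable and $\int_A|f(a)|^3\alpha(\mathrm d a)=\mu^{-1/2}$. The sketch below, with hypothetical function names, evaluates $d_W(I(f),G)$ numerically through the one-dimensional identity $d_W(X,Y)=\int_{\mathbb R}|\mathbb P(X\leq t)-\mathbb P(Y\leq t)|\mathrm d t$ and compares it with $\mu^{-1/2}$. The truncation range and grid step are ad hoc choices.

```python
import math


def normal_cdf(t):
    """Distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def poisson_cdf_table(mu, kmax):
    """List of P(N <= k), k = 0..kmax, for N ~ Poisson(mu)."""
    pmf, acc, cdf = math.exp(-mu), 0.0, []
    for k in range(kmax + 1):
        acc += pmf
        cdf.append(min(acc, 1.0))
        pmf *= mu / (k + 1)          # pmf(k + 1) from pmf(k)
    return cdf


def wasserstein_to_normal(mu, half_width=12.0, step=1e-3):
    """Midpoint-rule value of the integral of |F(t) - Phi(t)| over [-half_width, half_width],
    where F is the distribution function of (N - mu) / sqrt(mu)."""
    sd = math.sqrt(mu)
    kmax = int(mu + half_width * sd) + 1
    cdf = poisson_cdf_table(mu, kmax)
    total = 0.0
    for i in range(int(2 * half_width / step)):
        t = -half_width + (i + 0.5) * step
        k = math.floor(mu + t * sd)  # {(N - mu)/sd <= t} = {N <= k}
        f = 0.0 if k < 0 else cdf[min(k, kmax)]
        total += abs(f - normal_cdf(t)) * step
    return total


mu = 25.0
distance = wasserstein_to_normal(mu)
bound = 1.0 / math.sqrt(mu)
```

In this example the computed `distance` should lie below `bound`, which equals 0.2 for $\mu=25$.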
4.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for first chaoses on the Poisson space
We start with a definition.
Definition 1. Let $\{Y_\ell\}_{\ell\in\mathbb N}$ be a sequence of real-valued random variables, $\gamma\geq 0$ a non-negative constant, and $\{\Delta_\ell\}_{\ell\in\mathbb N}$ a positive numerical sequence. We make the following definitions:
(1) The sequence $\{Y_\ell\}_{\ell\in\mathbb N}$ satisfies a moderate deviation principle with parameters $\gamma$ and $\{\Delta_\ell\}_{\ell\in\mathbb N}$ ( $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ for short) if, for any sequence of positive numbers $\{a_\ell\}_{\ell\in\mathbb N}$ such that $\lim_{\ell\to\infty}a_\ell=+\infty$ and $\lim_{\ell\to\infty}\frac{a_{\ell}}{\Delta_\ell^{1/(1+2\gamma)}}=0$ , the sequence $\{Y_\ell/a_\ell\}_{\ell\in\mathbb N}$ satisfies a large deviation principle with speed $a_\ell^2$ and rate function $J(x)\;:\!=\;x^2/2$ , i.e., for any Borel set $B\subset{\mathbb R}$ ,
\[-\inf_{x\in\overset{\circ}B}J(x)\leq\liminf_{\ell\to\infty}a_\ell^{-2}\log\mathbb{P}\left(Y_\ell/a_\ell\in B\right)\leq\limsup_{\ell\to\infty}a_\ell^{-2}\log\mathbb{P}\left(Y_\ell/a_\ell\in B\right)\leq-\inf_{x\in\overline B}J(x),\]
where $\overset{\circ}B$ denotes the interior of B and $\overline B$ denotes the closure of B.
(2) The sequence $\{Y_\ell\}_{\ell\in\mathbb N}$ satisfies a Bernstein-type concentration inequality with parameters $\gamma$ and $\{\Delta_\ell\}_{\ell\in\mathbb N}$ ( $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ for short) if, for all $\ell\in\mathbb N$ and $x\geq 0$ , we have
\[\mathbb{P}(|Y_\ell |\geq x)\leq 2\exp\left(-\frac{1}{4}\min\left\{\frac{x^2}{2^{1+\gamma}},(x\Delta_\ell)^{1/(1+\gamma)}\right\}\right).\]
(3) The sequence $\{Y_\ell\}_{\ell\in\mathbb N}$ satisfies a normal approximation bound with Cramér correction term with parameters $\gamma$ and $\{\Delta_\ell\}_{\ell\in\mathbb N}$ ( $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ for short), if there exist positive constants $c_0,c_1,c_2>0$ depending only on $\gamma$ such that for all $\ell\in\mathbb N$ and $x\in [0,c_0\Delta_\ell^{1/(1+2\gamma)}]$ ,
\[\mathbb{P}(Y_\ell\geq x)=\mathrm{e}^{L_{\ell,x}^+}[1-\mathbb P(G\leq x)]\left(1+c_1\theta_{\ell,x}^+\frac{1+x}{\Delta_\ell^{1/(1+2\gamma)}}\right)\]
and
\[\mathbb{P}(Y_\ell\leq -x)=\mathrm{e}^{L_{\ell,x}^-}\,\mathbb P(G\leq -x)\left(1+c_1\theta_{\ell,x}^-\frac{1+x}{\Delta_\ell^{1/(1+2\gamma)}}\right),\]
where $\theta_{\ell,x}^{\pm}\in [\!-\!1,1]$ and $L_{\ell,x}^{\pm}\in\left(-c_2\frac{x^3}{\Delta_\ell^{1/(1+2\gamma)}},c_2\frac{x^3}{\Delta_\ell^{1/(1+2\gamma)}}\right)$ .
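To get a feeling for the Bernstein-type inequality in item (2), the following sketch evaluates its right-hand side for given $x$, $\gamma$ and $\Delta_\ell$. The function name is hypothetical; the formula is transcribed directly from the definition.

```python
import math


def bernstein_tail_bound(x, gamma, delta):
    """Right-hand side of the BCI: 2 exp(-(1/4) min{x^2 / 2^(1+gamma), (x delta)^(1/(1+gamma))})."""
    gaussian_part = x * x / 2.0 ** (1.0 + gamma)
    stretched_part = (x * delta) ** (1.0 / (1.0 + gamma))
    return 2.0 * math.exp(-0.25 * min(gaussian_part, stretched_part))
```

The minimum switches from the Gaussian-type term $x^2/2^{1+\gamma}$ to the term $(x\Delta_\ell)^{1/(1+\gamma)}$ as $x$ grows relative to $\Delta_\ell$, so a larger $\Delta_\ell$ extends the range of $x$ over which the tail bound is sub-Gaussian.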
As a preliminary result, we provide moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for sequences of first chaoses on the Poisson space.
Theorem 2. Assume the following: (i) $f_\ell\in L^m(\alpha_\ell)$ for any $m\geq 2$ and any $\ell\in\mathbb N$ with $\|f_\ell\|_{L^2(\alpha_\ell)}=1$ for any $\ell\in\mathbb N$ ; (ii) there exist a constant $\gamma\geq 0$ and a positive numerical sequence $\{\Delta_\ell\}_{\ell\in\mathbb N}$ such that
Then the sequence $\{I^{(\ell)}(f_\ell)\}_{\ell\in\mathbb N}$ satisfies an $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , a $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , and an $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ .
Proof. We recall that for real-valued random variables $X_1,\ldots,X_m$ , $m\in\mathbb N$ , the joint cumulant of $X_1,\ldots,X_m$ is defined as
where ${\textbf{i}}$ is the imaginary unit and $\varphi_{X_1,\ldots,X_m}$ is the joint characteristic function of $(X_1,\ldots,X_m)$ . For a real-valued random variable X and $m\in\mathbb N$ we shall write $\mathrm{cum}_m(X)\;:\!=\;\mathrm{cum}(X,\ldots,X)$ for the mth cumulant of X.
For an arbitrarily fixed $\ell\in\mathbb N$ , set $X_\ell\;:\!=\;I^{(\ell)}(f_\ell)$ . Clearly, $\mathbb E X_\ell=0$ and $\mathbb E X_\ell^2=1$ (which is a consequence of the isometry formula for Poisson chaoses). Then the claim follows by the theory developed in [Reference Saulis and Statulevicius40] (see e.g. Proposition 2.1 in [Reference Schulte and Thäle41]; see also [Reference Döring and Eichelsbacher13, Reference Döring, Jansen and Schubert14]) if we prove that
and
To this end we are going to apply Theorem 3.6 of [Reference Schulte and Thäle41]. A partition $\sigma$ of $\{1,\ldots,m\}$ , $m\geq 3$ , is a collection $\{B_1,\ldots,B_k\}$ of $1\leq k\leq m$ pairwise disjoint non-empty sets, called blocks, such that $B_1\cup\ldots\cup B_k=\{1,\ldots,m\}$ . The number k of blocks of a partition $\sigma$ is denoted by $|\sigma|$ . Let $J_j\;:\!=\;\{j\}$ , $j\in\{1,\ldots,m\}$ . Letting $\Pi(\textbf{1}_m)$ , $\textbf{1}_m\;:\!=\;(1,\ldots,1)\in{\mathbb R}^m$ , denote the set of all partitions $\sigma$ of $\{1,\ldots,m\}$ whose blocks B are such that $\mathrm{Card}(B\cap J_{j})\leq 1$ for every $j\in\{1,\ldots,m\}$ , we clearly have that $\Pi(\textbf{1}_m)$ is the set of all partitions of $\{1,\ldots,m\}$ . Letting $\widetilde{\Pi}(\textbf{1}_m)$ denote the set of all partitions $\sigma\in\Pi(\textbf{1}_m)$ with $|\sigma|=1$ , we clearly have $\widetilde\Pi(\textbf{1}_m)=\{\{1,\ldots,m\}\}$ , $m\geq 3$ . Letting $\widetilde{\Pi}_{\geq 2}(\textbf{1}_m)$ denote the set of all partitions $\sigma\in\widetilde\Pi(\textbf{1}_m)$ whose blocks have cardinality bigger than or equal to 2, since $m\geq 3$ , we clearly have $\widetilde\Pi_{\geq 2}(\textbf{1}_m)=\widetilde\Pi(\textbf{1}_m)=\{\{1,\ldots,m\}\}$ . We denote by $\Pi_{\geq 2}(\textbf{1}_m)$ , $m\geq 2$ , the family of all partitions $\sigma\in\Pi(\textbf{1}_m)$ , i.e., the family of all partitions $\sigma$ of $\{1,\ldots,m\}$ whose blocks have cardinality bigger than or equal to 2.
For a function $g\;:\;A\to\mathbb R$ , set
For $\sigma\in\Pi(\textbf{1}_m)$ , define the function $(\!\otimes_{j=1}^{m}g)_\sigma\;:\;A^{|\sigma|}\to{\mathbb R}$ by replacing in $(\!\otimes_{j=1}^{m}g)(x_1,\ldots,x_m)$ all the variables whose indexes belong to the same block of $\sigma$ by a new common variable. Note that for $\sigma\in\Pi(\textbf{1}_m)$ , $(\!\otimes_{j=1}^{m}g)_\sigma\;:\;A^{|\sigma|}\to{\mathbb R}$ can be represented as
where $B_1,\ldots,B_{|\sigma|}$ are the blocks of $\sigma$ and $|B_i|$ , $i=1,\ldots,|\sigma|$ , is the cardinality of the block $B_i$ . In particular, for $\sigma\in\widetilde\Pi_{\geq 2}(\textbf{1}_m)$ , $(\!\otimes_{j=1}^{m}g)_\sigma(a)\;:\!=\;g(a)^m,$ $a\in A$ , and, for $\sigma\in\Pi _{\geq 2}(\textbf{1}_m)$ ,
where $B_1,\ldots,B_{|\sigma|}$ are the blocks of $\sigma$ and $|B_i|\geq 2$ for any $i=1,\ldots,|\sigma|$ . Therefore the hypothesis (i) implies the assumptions $(3.4)$ and $(3.5)$ of Theorem 3.6 in [Reference Schulte and Thäle41], and so (13) and (14) hold.
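The role of the cumulants can be illustrated numerically. The sketch below rests on two assumptions that are not spelled out in the text above: the standard fact that the mth cumulant of a first chaos $I(f)$ equals $\int_A f(a)^m\alpha(\mathrm d a)$ for $m\geq 2$, and the assumption that the cumulant condition has the usual Statulevičius form $|\mathrm{cum}_m|\leq (m!)^{1+\gamma}/\Delta^{m-2}$, $m\geq 3$. The exponential kernel on $[0,\infty)$ with $\alpha(\mathrm d a)=\lambda\,\mathrm d a$ and all function names are hypothetical.

```python
import math


def chaos_cumulant(kernel, m, rate, upper, step=1e-3):
    """Midpoint-rule value of the integral of kernel(a)^m * rate over [0, upper]."""
    n = int(upper / step)
    return rate * step * sum(kernel((i + 0.5) * step) ** m for i in range(n))


def cumulant_condition_holds(cumulants, gamma, delta):
    """Check |cum_m| <= (m!)^(1 + gamma) / delta^(m - 2) for every supplied m >= 3."""
    return all(
        abs(c) <= math.factorial(m) ** (1.0 + gamma) / delta ** (m - 2)
        for m, c in cumulants.items()
    )


rate = 8.0


def kernel(a):
    """f(a) = sqrt(2 / rate) * exp(-a), normalized to unit L^2(alpha) norm."""
    return math.sqrt(2.0 / rate) * math.exp(-a)


second = chaos_cumulant(kernel, 2, rate, upper=40.0)
cumulants = {m: chaos_cumulant(kernel, m, rate, upper=40.0) for m in range(3, 9)}
```

For this kernel the integrals are available in closed form, $\int f^m\mathrm d\alpha=(2/m)(2/\lambda)^{(m-2)/2}$, so the condition holds with $\gamma=0$ and $\Delta=\sqrt{\lambda/2}$, which equals 2 here.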
5. Application to Poisson shot noise random variables
5.1. Gaussian approximation
In this section, we apply Theorem 1 to the standardized random variables T(C), $C\in\mathcal{B}(\mathbb R^d)$ , defined by (7). Note that, for $C\in\mathcal{B}({\mathbb R}^d)$ ,
and that, by the isometry formula for Poisson chaoses, if $\int_{\mathbb R^d}\lambda(x)\mathbb{E}H(C-x,Z_1)^2\mathrm{d}x<\infty$ , then
Therefore the random variable T(C) is well defined and finite for any $C \in \mathcal{B}(\mathbb R^d)$ such that
The following theorem holds.
Theorem 3. Let $C \in \mathcal{B}(\mathbb R^d)$ be such that (15) holds.
Then
and
Proof. By (1) we have
$C\in\mathcal{B}({\mathbb R}^d)$ . Therefore T(C) belongs to the first chaos of $\mathcal P$ with kernel
The claim follows from applying Theorem 1.
Remark 1. As particular cases of Theorem 3 we have the following:
(i) If $H(C-x,z)\;:\!=\;v(z)(C-x)$ , with $C\in\mathcal{B}(\mathbb R^d)$ , $x\in\mathbb{R}^d$ , $z\in\textbf{Z}\;:\!=\;\textbf{N}_{\mathbb{R}^d\times\mathbb{R}}$ , and $v(z)(C)$ is defined by (4), then $S(C)=V(C)$ where the random variable V(C) is defined by (3). So Theorem 3 provides Gaussian approximation bounds for the random variable W(C) defined by (8). An interesting special case is investigated in Section 6.1.
(ii) If $H\;:\;\mathcal{B}({\mathbb R}^2)\times (0,\infty)\to (0,\infty)$ is a mapping such that its restriction to ${\mathbb R}^2\times (0,\infty)$ coincides with $\widetilde{H}(x,z)\;:\!=\;zA(-x)$ , $x\in\mathbb R^2$ and $z\in\textbf{Z}\;:\!=\;(0,\infty)$ , then $S(\{\textbf{0}\})=I(\{\textbf{0}\})$ where the random variable $I(\{\textbf{0}\})$ is defined in Section 2.3. So Theorem 3 provides Gaussian approximation bounds for the random variable $L(\{\textbf{0}\})$ defined by (9). An interesting special case is investigated in Section 8.1.
5.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term
In this section, we apply Theorem 2 to the sequence of standardized random variables $\{T_\ell(C_\ell)\}_{\ell\geq 1}$ , $\{C_\ell\}_{\ell\geq 1}\subset\mathcal{B}(\mathbb R^d)$ , defined by (6).
The following theorem holds.
Theorem 4. Let $\{C_\ell\}_{\ell \in \mathbb N}\subset\mathcal{B}(\mathbb R^d)$ be a sequence of Borel sets such that
and assume that there exist a constant $\gamma\geq 0$ and a positive numerical sequence $\{\Delta_\ell\}_{\ell \in \mathbb N}$ such that
Then the sequence $\{T_\ell(C_\ell)\}_{\ell\geq 1}$ satisfies an $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , a $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , and an $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ .
Proof. Similarly to the proof of Theorem 3, by (6) we have
$C_\ell\in\mathcal{B}({\mathbb R}^d)$ . Therefore $T_\ell(C_\ell)$ belongs to the first chaos of $\mathcal P_\ell$ with kernel
The claim follows by Theorem 2.
Remark 2. As particular cases of Theorem 4 we have the following:
(i) If $H(C_\ell-x,z)\;:\!=\;\upsilon(z)(C_\ell-x)$, with $C_\ell\in\mathcal{B}(\mathbb R^d)$, $x\in\mathbb{R}^d$, $z\in\textbf{Z}\;:\!=\;\textbf{N}_{\mathbb{R}^d\times\mathbb{R}}$, and $\upsilon(z)(\cdot)$ is defined by (4), then $S_\ell(C_\ell)=V_\ell(C_\ell)$, where the random variable $V_\ell(C_\ell)$ is defined by (3) with $\lambda_\ell$, $\mathbb Q_\ell$, and $C_\ell$ in place of $\lambda$, $\mathbb Q$, and C, respectively. So Theorem 4 provides moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for the sequence $\{W_\ell(C_\ell)\}_{\ell\geq 1}$, where $W_\ell(C_\ell)$ is defined by (8), with obvious modifications. An interesting special case is investigated in Section 6.2.
(ii) If $H\;:\;\mathcal{B}({\mathbb R}^2)\times (0,\infty)\to (0,\infty)$ is a mapping such that its restriction to ${\mathbb R}^2\times (0,\infty)$ coincides with $\widetilde{H}(x,z)\;:\!=\;zA(-x)$, $x\in\mathbb R^2$ and $z\in\textbf{Z}\;:\!=\;(0,\infty)$, then $S_\ell(\{\textbf{0}\})=I_\ell(\{\textbf{0}\})$, where the random variable $I_\ell(\{\textbf{0}\})$ is defined as in Section 2.3, with $\lambda_\ell$ and $\mathbb Q_\ell$ in place of $\lambda$ and $\mathbb Q$, respectively. So Theorem 4 provides moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for the sequence $\{L_\ell(\{\textbf{0}\})\}_{\ell\geq 1}$, where $L_\ell(\{\textbf{0}\})$ is defined by (9), with obvious modifications. An interesting special case is investigated in Section 8.2.
6. Application to a class of compound Poisson cluster point processes
6.1. Gaussian approximation
In this section we apply Theorem 3 to the class of standardized compound Poisson cluster point processes $\{W(C)\}_{C\in\mathcal{B}({\mathbb R}^d)}$ with $\{X_n\}_{n\geq 1}$ having a piecewise constant intensity function. In such a case we have more explicit upper bounds on the Wasserstein and Kolmogorov distances, which pave the way to explicit bounds for some classes of generalized compound Hawkes processes (see Corollaries 3 and 5). We recall that Z denotes the random total number of points of the progeny process, i.e. $Z=Z_1({\mathbb R}^d,{\mathbb R})$ , and that M is a generic random variable with the same distribution as the independent and identically distributed marks.
Hereafter, we denote by $\mathrm{Leb}(\cdot)$ the Lebesgue measure on $\mathbb R^d$ .
Corollary 1. Let $(B,C)\in\mathcal{B}(\mathbb R^d)^2$ be such that $0<\mathrm{Leb}(B\cap C)<+\infty$ . If $\lambda(x)=\lambda \boldsymbol 1 _{B}(x)$ for any $x\in{\mathbb R}^d$ and some positive constant $\lambda>0$ , $\mathbb E Z^2<\infty$ , and ${\mathbb E} M^2\in (0,\infty)$ , then
and
We point out that many articles in the literature (e.g. [Reference Bacry, Delattre, Hoffmann and Muzy4, Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26]) consider Hawkes processes with an empty history, that is, with no points in $(\!-\!\infty,0]$ , which corresponds to the piecewise constant intensity function $\lambda(x)=\boldsymbol 1_{[0,+\infty)}(x)$ in Corollary 1.
The proof of Corollary 1 exploits the following lemma, which is proved in Appendix A.
Lemma 1. For any $(B,C)\in\mathcal{B}(\mathbb R^d)^2$ and $m\in\mathbb N$ , we have
Proof of Corollary 1. In order to apply Theorem 3 we need to verify (15) with $H(C-x,z)\;:\!=\;\upsilon(z)(C-x)$ , $C\in\mathcal{B}(\mathbb R^d)$ , $x\in\mathbb R^d$ , $z\in\textbf{Z}\;:\!=\;\textbf{N}_{\mathbb R^d\times{\mathbb R}}$ , and $\lambda(\cdot)\equiv\lambda \boldsymbol 1 _B(\cdot)$ . For the lower bound we note that
Expanding the square of the sum and using the independence we have that
For $x\in B\cap C$ , we have $x\in C$ , and so the set $C-x$ contains the origin. Since $Z_1(\{\textbf{0}\},{\mathbb R})=1$ , we then have
Therefore,
As far as the upper bound is concerned, we note that by the bound on the Wasserstein distance in Theorem 3 and the inequalities (21) and (20), it immediately follows that
Similarly, the upper bound on the Kolmogorov distance follows from the upper bound on the Kolmogorov distance in Theorem 3, and again the inequalities (21) and (20).
6.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term
In this section we apply Theorem 4 to the sequence $\{W_\ell(C_\ell)\}_{\ell\geq 1}$, when the Poisson processes $\{X_n^{(\ell)}\}$, $\ell\geq 1$, have a piecewise constant intensity function and $\mathbb Q_\ell\equiv\mathbb Q$ for every $\ell\geq 1$. In such a case the assumption (17) is greatly simplified. Moreover, the next corollary paves the way to the application of the theorem to some classes of generalized compound Hawkes processes (see Corollaries 4 and 6).
Corollary 2. Let $\{(B_\ell,C_\ell)\}_{\ell\in\mathbb N}\subset\mathcal{B}(\mathbb R^d)^2$ be such that $0<\mathrm{Leb}(B_\ell \cap C_\ell)<+\infty$ , $\ell\in\mathbb N$ . Assume that $\lambda_\ell(x)=\lambda_\ell \boldsymbol 1_{B_\ell}(x)$ , $x\in\mathbb R^d$ , for positive constants $\lambda_\ell>0$ , $\ell\in\mathbb N$ , $\mathbb Q_\ell\equiv \mathbb Q$ , ${\mathbb E} M^2>0$ , and
for some $\gamma\geq 0$ and a positive numerical sequence $\{\Delta_\ell\}_{\ell \in \mathbb N}$ . Then the sequence $\{W_\ell(C_\ell)\}_{\ell\geq 1}$ satisfies an $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , a $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , and an $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ .
Proof. By (21), (20), the choice of the Borel sets $B_\ell$ and $C_\ell$ , $\ell\in\mathbb N$ , the assumption (22), and the fact that ${\mathbb E} M^2>0$ , we have
and
Therefore the condition (16) holds, and using again the assumption (22), we have
The claim follows by Theorem 4.
7. Application to generalized compound Hawkes processes
We start with a proposition which expresses the moments of the total progeny of a Galton–Watson process with one ancestor in terms of moments of the offspring distribution.
Proposition 1. Suppose that Z is distributed as the total progeny of a Galton–Watson process with one ancestor, and let P be a random variable distributed according to the offspring law of the Galton–Watson process. Assume that the Galton–Watson process is sub-critical, i.e. \[\mathbb E P<1.\]
Then, for any $n\in \mathbb N$ such that
we have that $\mathbb E Z^n<\infty$ and
where $\mathbb E(P)_1=\mathbb E P$ ,
and the third sum in (25) is taken over all the $m_1,\ldots,m_i\in\mathbb N$ such that $m_1+\ldots+m_i=k$ .
In particular,
and
As an immediate consequence of this proposition and Corollary 1, we have that if the point processes $Z_i(\cdot,{\mathbb R})$ are such that Z is distributed as the total progeny of a Galton–Watson process with one ancestor and offspring law satisfying (23), then the following two statements hold: (i) if the offspring law satisfies (24) with $n=3$ and ${\mathbb E} M^2\in (0,\infty)$ , then the relation (18) holds and the upper bound on $d_W(W(C),G)$ is explicit and depends only on $\lambda$ , the Lebesgue measure of $B\cap C$ , the first three moments of the offspring law and the second and third moments of $|M|$ ; (ii) if the offspring law satisfies (24) with $n=4$ and ${\mathbb E} M^2\in (0,\infty)$ , then the relation (19) holds and the upper bound on $d_K(W(C),G)$ is explicit and depends only on $\lambda$ , the Lebesgue measure of $B\cap C$ , the first four moments of the offspring law, and the second, third, and fourth moments of $|M|$ .
The cases of the Poisson offspring law (which includes compound Hawkes processes) and the binomial offspring law are treated in detail in Sections 7.1.1 and 7.4.1, respectively.
The proof of Proposition 1 exploits the following lemma. Hereafter, for a sufficiently smooth function f, we denote by $f^{(n)}$ its derivative of order $n\in\mathbb N$.
Lemma 2. (Faà di Bruno formula.) For any sufficiently smooth functions g and h,
where the second sum is taken over all the $m_1,\ldots,m_i\in\mathbb N$ such that $m_1+\ldots+m_i=j$ .
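As a numerical sanity check of the Faà di Bruno formula, the following minimal sketch evaluates the composition form $(g\circ h)^{(j)}=\sum_{i=1}^{j}\frac{j!}{i!}\,g^{(i)}(h)\sum_{m_1+\ldots+m_i=j}\prod_{l=1}^{i}\frac{h^{(m_l)}}{m_l!}$. We assume that this is the form intended in Lemma 2; the helper names `compositions` and `faa_di_bruno` are ours and purely illustrative.

```python
import math


def compositions(total, parts):
    """Yield all tuples (m_1, ..., m_parts) of positive integers summing to total."""
    if parts == 1:
        if total >= 1:
            yield (total,)
        return
    # The first entry can be at most total - (parts - 1), so the rest stay >= 1.
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def faa_di_bruno(j, g_derivs, h_derivs):
    """j-th derivative of g(h(x)) at a fixed point x (assumed composition form).

    g_derivs[i] must hold g^{(i)}(h(x)) and h_derivs[m] must hold h^{(m)}(x)
    for indices 1..j; index 0 is unused.
    """
    total = 0.0
    for i in range(1, j + 1):
        inner = 0.0
        for comp in compositions(j, i):
            term = 1.0
            for m in comp:
                term *= h_derivs[m] / math.factorial(m)
            inner += term
        total += math.factorial(j) / math.factorial(i) * g_derivs[i] * inner
    return total


# Illustration: g = h = exp at x = 0, so every derivative of h equals 1
# and every derivative of g at h(0) = 1 equals e.
e = math.e
print([faa_di_bruno(j, [e] * 6, [1.0] * 6) / e for j in range(1, 5)])
```

With $g=h=\exp$ the ratios printed above should be the Bell numbers, since the derivatives of $\exp(\mathrm{e}^x)$ at zero are $\mathrm{e}$ times the Bell numbers; this gives an independent check of the combinatorial weights.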
Proof of Proposition 1. We divide the proof into three steps. In the first step we provide a functional equation for $\mathbb E\mathrm{e}^{\theta Z}$ , $\theta\in (\!-\!\infty,0)$ . In the second step we prove the finiteness of the moments of Z and the formula (25). In the third step we compute the second, third, and fourth moments of Z.
Step 1: A functional equation for $\mathbb E \mathrm{e}^{\theta Z}$ , $\theta\in(\!-\!\infty,0)$ .
We note that Z can be represented as $Z=\sum_{n\geq 0}K_n,$ where $K_0=1$ and $K_n$ is the number of offspring in the nth generation of the related Galton–Watson process. Let $\{Z_j\}_{j\geq 1}$ be independent copies of Z. For any $\theta\in(\!-\!\infty,0)$ , by standard computations we have
where $\{p_k\}_{k\geq 0}$ is the law of P and $G_P$ is the probability generating function of P.
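The functional equation of Step 1 can be explored numerically. The sketch below assumes that it takes the classical form $\varphi(\theta)=\mathrm{e}^{\theta}G_P(\varphi(\theta))$ with $\varphi(\theta)\;:\!=\;\mathbb E\mathrm{e}^{\theta Z}$, which follows from the representation $Z=1+\sum_{j=1}^{P}Z_j$, and solves it by fixed-point iteration. The function name `total_progeny_mgf` and the parameter values are hypothetical.

```python
import math


def total_progeny_mgf(theta, pgf, tol=1e-14, max_iter=100_000):
    """Solve phi = exp(theta) * pgf(phi) on [0, 1] by fixed-point iteration.

    Assumes theta < 0 and a sub-critical offspring law, so that the map is a
    contraction on [0, 1] (its derivative is at most E[P] < 1).
    """
    if theta >= 0:
        raise ValueError("theta must be negative")
    phi = 0.0
    for _ in range(max_iter):
        nxt = math.exp(theta) * pgf(phi)
        if abs(nxt - phi) <= tol:
            return nxt
        phi = nxt
    raise RuntimeError("fixed-point iteration did not converge")


# Bernoulli(p) offspring: Z is geometric, P(Z = k) = p^(k-1) (1 - p), so the
# moment generating function is available in closed form for comparison.
p, theta = 0.4, -0.3
numeric = total_progeny_mgf(theta, lambda s: 1 - p + p * s)
closed_form = (1 - p) * math.exp(theta) / (1 - p * math.exp(theta))
print(numeric, closed_form)

# Poisson(h) offspring: probability generating function exp(h (s - 1)).
h = 0.5
phi_poisson = total_progeny_mgf(theta, lambda s: math.exp(h * (s - 1)))
print(phi_poisson)
```

The Bernoulli case is chosen only because the answer is explicit; for other offspring laws the same routine applies once the probability generating function is supplied.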
Step 2: Proof of $\mathbb E Z^n<\infty$ , $n\in\mathbb N$ , and of (25).
As far as the moments of Z are concerned, we start by showing that they coincide with the left derivatives of the moment generating function at zero. For any $\theta < 0$, differentiation under the expectation sign yields, for any non-negative integer n,
The family $(Z^n \mathrm{e}^{\theta Z})_{\theta <0}$ is non-negative and increasing in $\theta$; hence, by the Beppo Levi (monotone convergence) theorem,
where the equality holds whether the quantities are finite or infinite. Next, we combine the Faà di Bruno formula with the following elementary relation:
for sufficiently smooth functions f and g. For any $\theta\in (\!-\!\infty,0)$ , by (26) and (28), for any non-negative integer n we have
By the Faà di Bruno formula we have
where the sum is taken over all the $m_1,\ldots,m_i\in\mathbb N$ such that $m_1+\ldots+m_i=k$ . Then, for any $\theta\in (\!-\!\infty,0)$ and $n\in\mathbb N\cup\{0\}$ ,
Therefore,
Letting $\theta\uparrow 0$ in this relation we have
Since
we have
Reasoning by induction on $n\in\mathbb N$ , by the relation (31) we immediately have that $\mathbb E Z^n<\infty$ (note that, for $n=1$ , we have $\mathbb E Z=1/(1-\mathbb E P)<\infty$ , and that, for $n\geq 2$ , all the moments of Z involved in the right-hand side of (31) are of order less than or equal to $n-1$ ). The formula (25) follows by letting $\theta\uparrow 0$ in (29) and using the equalities in (30).
Step 3: Computing $\mathbb E Z^2$ , $\mathbb E Z^3$ , and $\mathbb E Z^4$ . The claimed expressions for the second, third, and fourth moments of Z easily follow by (25) (or (31)). For instance, for the second moment, the formula gives
from which the claimed expression for $\mathbb E Z^2$ immediately follows (recalling that $\mathbb E Z=(1-\mathbb E P)^{-1}$ ). We omit the computations for the third and fourth moments of Z.
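The first two moments can be cross-checked numerically in the Poisson case. The sketch below uses two classical facts that we take as assumptions here, since they are not restated in this form above: the total progeny with Poisson offspring of mean $h\in(0,1)$ has the Borel law $\mathbb P(Z=k)=\mathrm{e}^{-hk}(hk)^{k-1}/k!$, and for a sub-critical offspring law with mean $m$ and variance $\sigma^2$ one has $\mathbb E Z=(1-m)^{-1}$ and $\mathbb E Z^2=\sigma^2(1-m)^{-3}+(1-m)^{-2}$. All function names are illustrative.

```python
import math


def borel_pmf(k, h):
    """P(Z = k) for the Borel law with parameter h, computed in log-space."""
    return math.exp(-h * k + (k - 1) * math.log(h * k) - math.lgamma(k + 1))


def borel_moment(n, h, kmax=5000):
    """Truncated series for E[Z^n]; the tail is negligible unless h is close to 1."""
    return sum(k**n * borel_pmf(k, h) for k in range(1, kmax + 1))


def total_progeny_moments(mean_p, var_p):
    """Classical closed forms for E[Z] and E[Z^2] (assumed, see lead-in)."""
    ez = 1.0 / (1.0 - mean_p)
    ez2 = var_p / (1.0 - mean_p) ** 3 + ez**2
    return ez, ez2


h = 0.5  # Poisson offspring: mean and variance both equal h
series = (borel_moment(1, h), borel_moment(2, h))
closed = total_progeny_moments(h, h)
print(series, closed)
```

The truncation level `kmax` is a convenience choice; the summands decay geometrically at rate $h-1-\log h$, so a larger value would be needed when $h$ approaches one.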
7.1. Generalized compound Hawkes processes with Poisson offspring distribution
7.1.1. Gaussian approximation
In this paragraph we suppose that Z is distributed as the total progeny of a Galton–Watson process with one ancestor and offspring distribution the Poisson law with mean $h\in (0,1)$ , and that $\{X_n\}_{n\geq 1}$ is a Poisson process on ${\mathbb R}^d$ with intensity function $\lambda(x)=\lambda\textbf{1}_B(x)$ , $x\in\mathbb R^d$ , for some $\lambda>0$ and some Borel set $B\subseteq\mathbb R^d$ . We denote by $V_{\mathrm{Poisson}}$ the corresponding generalized compound Hawkes process and by $W_{\mathrm{Poisson}}$ the functional (8) with $V_{\mathrm{Poisson}}$ in place of V.
Corollary 3. Under the foregoing assumptions and notation, if the Borel sets B and C are such that $0<\mathrm{Leb}(B\cap C)<+\infty$ and ${\mathbb E} M^2\in (0,\infty)$ , then the bounds (18) and (19) hold with $W_{\mathrm{Poisson}}$ in place of W,
and
7.1.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term
In this paragraph we suppose that, for each $\ell\in\mathbb N$ , $Z=Z_1^{\ell}({\mathbb R}^d,{\mathbb R})$ is distributed as the total progeny of a Galton–Watson process with one ancestor and offspring distribution the Poisson law with mean $h\in (0,1)$ , and that $\{X_n^{(\ell)}\}_{n\geq 1}$ is a Poisson process on ${\mathbb R}^d$ with intensity function $\lambda_\ell (x)=\lambda_\ell \boldsymbol 1_{B_\ell}(x)$ , $x\in\mathbb R^d$ , for positive constants $\lambda_\ell>0$ and Borel sets $B_\ell\subseteq\mathbb{R}^d$ , $\ell \in \mathbb N$ . We denote by $V_{\mathrm{Poisson}}^{(\ell)}$ the corresponding generalized compound Hawkes process and by $W_{\mathrm{Poisson}}^{(\ell)}$ the functional (10) with $V_{\mathrm{Poisson}}^{(\ell)}$ in place of $V_\ell$ .
Corollary 4. Let the foregoing assumptions and notation prevail, and let the Borel sets $B_\ell$ and $C_\ell$ , $\ell \in \mathbb N$ , be such that $0<\mathrm{Leb}(B_\ell \cap C_\ell)<+\infty$ , $\ell\in\mathbb N$ , ${\mathbb E} M^2>0$ , and
Then the following hold:
(i) If $h-1-\log h\geq 1$, then the sequence $\{W_{\mathrm{Poisson}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell\;:\!=\;h\sqrt{\lambda_\ell\mathrm{Leb}(B_\ell \cap C_\ell)}.\]
(ii) If $h-1-\log h<1$, then the sequence $\{W_{\mathrm{Poisson}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell \;:\!=\; h(h-1-\log h)^3\sqrt{\lambda_\ell \mathrm{Leb}(B_\ell \cap C_\ell)}.\]
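The two regimes of Corollary 4 are easy to evaluate in practice. The following minimal sketch selects the regime from the value of $h-1-\log h$ and returns the corresponding $\Delta_\ell$; the function name `delta_poisson` and the numerical inputs are hypothetical.

```python
import math


def delta_poisson(h, lam, leb):
    """Delta_l from Corollary 4 for offspring mean h, intensity lam = lambda_l
    and leb = Leb(B_l intersect C_l)."""
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    if lam <= 0.0 or leb <= 0.0:
        raise ValueError("lam and leb must be positive")
    nu = h - 1.0 - math.log(h)  # strictly positive for h in (0, 1)
    base = math.sqrt(lam * leb)
    if nu >= 1.0:
        return h * base          # Part (i)
    return h * nu**3 * base      # Part (ii)


# Illustrative values only: a small h falls in Part (i), a larger one in Part (ii).
print(delta_poisson(0.1, 4.0, 25.0))
print(delta_poisson(0.5, 4.0, 25.0))
```

Since $h\mapsto h-1-\log h$ is decreasing on $(0,1)$, Part (i) corresponds to small values of $h$, and the factor $(h-1-\log h)^3$ in Part (ii) makes $\Delta_\ell$ shrink quickly as $h$ approaches one.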
Example 1. Note that the condition (34) is trivially satisfied with $\gamma=0$ if M is a constant different from zero. Similarly, if M is a uniform variable on [0, D] with $D> 0$ , the moments of M satisfy
Assumption (34) holds with $\gamma=1$ even if M is exponentially distributed with mean $\mu^{-1}$ , for some $\mu>0$ . Indeed, in such a case we have
Another example is M Gaussian distributed with mean zero and variance $\sigma^2$ . Indeed, in such a case we have
By distinguishing the two cases $m=2p+1$ and $m=2p+2$ for $p=1,2,\ldots$, we conclude that
which implies that the condition (34) is satisfied with $\gamma=\frac{1}{2}.$
The proof of the corollary exploits the following lemmas, which are proved in Appendix A.
Lemma 3. For any $\nu>0$ and any integer $m\geq 2$ we have
with
Lemma 4. The function
is such that $f(x)\in (0,1)$ .
Proof of Corollary 4. It is well known that the total progeny Z of a sub-critical Galton–Watson process with one ancestor and Poisson offspring law with mean $h\in (0,1)$ follows the Borel distribution (cf. [Reference Privault37] and the references therein), i.e., \[\mathbb P(Z=k)=\frac{\mathrm e^{-hk}(hk)^{k-1}}{k!},\quad k\in\mathbb N.\]
Therefore, by Stirling’s inequality, for $m\geq 3$ we have
Using Lemma 3 with $\nu\;:\!=\;h-1-\log h>0$ , we have
We now give separately the proofs of Parts (i) and (ii); in both cases we shall apply Corollary 2.
Proof of Part (i).
If $\nu\geq 1$ , then, for any $m\geq 3$ ,
Combining this inequality with (35), for $m\geq 3$ and $\ell\in\mathbb N$ , we have
since $h<1$ and $\sqrt{2\pi}>1$ . Combining this latter inequality with the assumption (34) we have that the condition (22) is satisfied with $\Delta_\ell=h\sqrt{\lambda_\ell \text{Leb}(B_\ell \cap C_\ell)}$ , and the claim follows by Corollary 2.
Proof of Part (ii).
Now we suppose $\nu < 1$ . Since
by (35) we have
Since $\nu <1$ , we have
Therefore
By Lemma 4 we have $u_h^{-1} <1$ . Therefore, for any $m\geq 3$ and $\ell\in\mathbb N$ , we have
Combining this latter inequality with the assumption (34) we have that the condition (22) is satisfied with $\Delta_\ell \;:\!=\; h\nu^3\sqrt{\lambda_\ell \text{Leb}(B_\ell \cap C_\ell)}$ , and the claim follows by Corollary 2.
7.2. On the Gaussian approximation bound in the Kolmogorov distance and the normal approximation with Cramér correction term
The aim of this section is to illustrate, by means of a simple example, the differences and the analogies between Gaussian approximation bounds in the Kolmogorov distance and normal approximations with Cramér correction term.
Let $W_{\mathrm{Poisson}}^{(\ell)}$ , $\ell\in\mathbb N$ , be defined as at the beginning of Section 7.1.2, with $\{(B_\ell,C_\ell)\}_{\ell\in\mathbb N}\subset\mathcal{B}(\mathbb{R}^d)^2$ a sequence of Borel sets such that $0<\mathrm{Leb}(B_\ell\cap C_\ell)<\infty$ , $\ell\in\mathbb N$ , and $\lambda_\ell\mathrm{Leb}(B_\ell\cap C_\ell)\to+\infty$ as $\ell\to+\infty$ . Assume $M\equiv 1$ (so that (34) holds with $\gamma=0$ ), and let $\Delta_\ell$ be defined as in Part (i) or (ii) of Corollary 4. We know that $\{W_{\mathrm{Poisson}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{NACC}(0,\{\Delta_\ell\}_{\ell\in\mathbb N})$ .
For a fixed $0<r<1/3$ , let $\ell^*$ be sufficiently large so that
where $c_0$ is the positive constant which appears in the definition of $\textbf{NACC}(0,\{\Delta_\ell\}_{\ell\in\mathbb N})$ . Setting $x_\ell\;:\!=\;\Delta_{\ell}^r$ , for all $\ell\geq\ell^*$ , we have
Since $|L_{\ell,x_\ell}^{+}|\leq c_2\Delta_\ell^{3r-1}$ and $|\theta_{\ell,x_\ell}^+|\leq 1$ , we have
Bounding the Gaussian tail from above, we obtain
where either $u_h\;:\!=\;h$ or $u_h\;:\!=\;h(h-1-\log h)^3$ . On the other hand, the bound (19) and the relations (32) and (33) yield
Clearly the rate (38) is much faster than that of (39). Now let $x\in (0,\infty)$ be arbitrarily fixed. For $\ell$ large enough we have $x\in [0,c_0\Delta_\ell]$ , and so by Corollary 4, for all $\ell$ large enough, we have
Clearly, the same rate is provided by the bound (19) and the relations (32) and (33). We emphasize that (i) the inequality (19) and the relations (32) and (33) do indeed yield an explicit bound on the quantity $|\mathbb{P}(W_{\mathrm{Poisson}}^{(\ell)}(C_\ell)\geq x)-\mathbb{P}(G\geq x)|$, for any $x\in\mathbb R$ and any $\ell\in\mathbb N$; (ii) such an explicit bound, valid for any $x\in\mathbb R$ and any $\ell\in\mathbb N$, cannot be obtained from the normal approximation with Cramér correction term, which is restricted to $x\in[0,c_0\Delta_\ell]$.
7.3. Comparison with some related literature
7.3.1. Gaussian approximation
Let N be a classical Hawkes process on $(0,\infty)$ with parameters $(\lambda,g)$ . Then Corollary 3 with $B=(0,\infty)$ and $C=(0,t]$ , $t>0$ , gives explicit bounds for the Gaussian approximation of
both in the Wasserstein and in the Kolmogorov distance. Note that for the fertility function g we assume only the standard stability condition $h\;:\!=\;\int_0^\infty g(t)\,\mathrm{d}t\in (0,1)$ .
It is worthwhile to compare these bounds with the ones in [Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26]. Theorem 3.13 in [Reference Hillairet, Huang, Khabou and Réveillac23] gives a bound of the kind
for some constant $c>0$ which is not explicitly computed and for specific choices of g (exponential and Erlang). This result has been extended in [Reference Khabou, Privault and Réveillac26] to fertility functions $g\;:\; [0,\infty)\to [0,\infty)$ such that $h\in (0,1)$ and $\int_0^\infty t g(t)\mathrm{d}t<\infty$ . The techniques used in [Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26] are based on the Poisson embedding construction of Hawkes processes and the Malliavin calculus on the Poisson space. These ideas were previously used in [Reference Torrisi44, Reference Torrisi45] for the purpose of Gaussian and Poisson approximation of some classes of nonlinear Hawkes processes. Note that, in contrast with classical Hawkes processes, nonlinear Hawkes processes (introduced in [Reference Brémaud and Massoulié8]) do not have a Poisson cluster representation. For Poisson cluster processes (such as classical Hawkes processes), the number of points on some measurable set can be represented as an integral with respect to a suitable Poisson random measure. As a consequence, results on the Gaussian approximation of the number of points on a measurable set can be obtained by applying the general results in [Reference Last, Peccati and Schulte30].
7.3.2. Moderate deviations
In this section we compare Corollary 4 with a couple of related results in the literature. Firstly, we prove that Corollary 4, when specialized to a classical Hawkes process on $(0,\infty)$ , implies the same moderate deviation principle provided in [Reference Zhu47] (see Theorem 1 therein), with an alternative assumption on the fertility function of the process. We refer the reader to [Reference Gao and Wang19] for sample-path moderate deviation principles, on the space of càdlàg functions on [0, 1] equipped with the Skorokhod topology, for Poisson cluster point processes on the line; see [Reference Gao and Wang19, Theorem 2]. Secondly, we compare Corollary 4 (again specialized to a classical Hawkes process on $(0,\infty)$ ) with Theorem 8 in [Reference Gao and Zhu20]. Hereafter, N denotes a classical Hawkes process on $(0,\infty)$ with parameters $(\lambda,g)$ , and we assume that g satisfies the usual stability condition $h\;:\!=\;\int_0^\infty g(t)\,\mathrm{d}t\in (0,1)$ .
Corollary 4 with $\lambda_\ell=\lambda>0$ , $B_\ell=(0,\infty)$ , and $C_\ell=(0,\ell]$ , $\ell\in\mathbb N$ , and $M\equiv 1$ yields that, for any sequence of positive numbers $\{a_\ell\}_{\ell\in\mathbb N}$ such that $\lim_{\ell\to\infty}a_\ell=+\infty$ and $\lim_{\ell\to\infty}\frac{a_{\ell}}{\sqrt{\ell}}=0$ , for any Borel set $B\subseteq{\mathbb R}$ ,
Reasoning by contradiction, one has that, for any function $a(\cdot)$ such that $\lim_{t\to\infty}a(t)=+\infty$ and $a(t)=o(\sqrt t)$ , as $t\to+\infty$ , and any Borel set $B\subseteq\mathbb R$ ,
It is easily realized (cf. [Reference Bacry, Delattre, Hoffmann and Muzy4] for example) that, under the stability condition,
So, letting $b(\cdot)$ denote a function such that $\sqrt t=o(b(t))$ and $b(t)=o(t)$ , as $t\to\infty$ , and setting $a(t)\;:\!=\;b(t)/\sqrt{\mathbb{V}\mathrm{ar}(N((0,t]))}$ , we have that, for any Borel set $B\subseteq\mathbb R$ ,
By Lemma 5 in [Reference Bacry, Delattre, Hoffmann and Muzy4], we have that, if in addition to the stability condition $h\in (0,1)$ we assume
then
So, for an arbitrarily fixed $\delta>0$ , there exists $t_\delta$ such that for any $t>t_\delta$ it holds that
Therefore, for an arbitrarily fixed $\delta>0$ , there exists $t_\delta$ such that for any $t>t_\delta$ we have
Hence the processes $\{\frac{N((0,t])-\mathbb E N((0,t])}{b(t)}\}_{t>0}$ and $\{\frac{N((0,t])-\lambda t/(1-h)}{b(t)}\}_{t>0}$ are exponentially equivalent (see [Reference Dembo and Zeitouni12, Definition 4.2.10, p. 130]). Therefore, by [Reference Dembo and Zeitouni12, Theorem 4.2.13, p. 130], the relation (40) holds with $\mathbb E N((0,t])$ replaced by $\lambda t/(1-h)$ . Thus we recover the moderate deviation principle proved in [Reference Zhu47] under an alternative condition on g (the latter paper assumes the stability condition and $\sup_{t>0}t^{3/2}g(t)<\infty$ , which is clearly different from (41)).
Theorem 8 in [Reference Gao and Zhu20] states that, under the assumption
(which is clearly stronger than (41)), for any $y(t)=o(t^{1/2-1/m})$ as $t\to+\infty$ , any integer $m\geq 3$ , and any positive function $b(\cdot)$ , it holds that
where $\{c_i\}_{i=1,\cdots,m-2}$ are real coefficients that can be computed explicitly; for instance, one has $c_2=\frac 12 $ . In particular, if $b(t)=o( t^{2/3})$ , as $t\to+\infty$ , then by choosing
we have
which is a more precise form of the relation (40) for the Borel set $B=[K,+\infty)$ . Note that, unlike the formula (40), which is valid for any Borel set B, the formula (42) gives asymptotic estimates only for half-lines.
7.3.3. Bernstein-type concentration inequalities
In this section we present some consequences of Corollary 4 concerning stationary compound Hawkes processes on the line, i.e., $B_T\;:\!=\;{\mathbb R}$ , observed on the time interval $C_T\;:\!=\;(0,T]$ ; here T replaces $\ell$ to emphasize the dependence on time.
If we interpret the mark M as the claim that an insurer must pay to an insurance policy holder, then the variable V((0, T]) (defined by (3)) represents the total loss incurred by the insurer in the time interval (0, T].
Assume that the claim arrivals are modeled by the points of a Hawkes process with baseline intensity $\lambda>0$ and Poisson offspring distribution with mean $h\in (0,1)$ satisfying $h-1-\log h \geq 1$. Assume moreover that the mark M follows the exponential distribution with mean $\mu^{-1} \in (0,+\infty)$. Then Corollary 4 yields
where $\Delta_T=h\sqrt{\lambda T}$ and $\gamma=1$ by virtue of Example 1. By stationarity, this inequality can be rewritten as
which yields a non-asymptotic lower bound on the probability that the total loss stays within x standard deviations of its expected value.
Another quantity of interest for insurers is the probability that the total loss greatly exceeds its expected value. The Bernstein-type concentration inequality, being valid for any $x\geq 0$ , yields an upper bound on this probability. Indeed, by choosing
we have
If the time horizon T satisfies $T \geq \frac{2^{11/2}h}{(k-1)^3 \lambda (1-h)^{3/2}}$ , then the inequality (43) simplifies to
A similar (non-asymptotic) inequality appears in Proposition 2.1 of [Reference Reynaud-Bouret and Roy39], albeit working only for stationary Hawkes processes on the line whose fertility functions have compact support, and involving quantities that are not explicitly known. We also point out that, by specializing the inequality (44) to the simple Hawkes process, that is, with constant marks ( $\gamma=0$ by virtue of Example 1), we can find a decay rate for the tail probability similar to the one given in [Reference Reynaud-Bouret and Roy39], but with explicit constants.
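For a concrete portfolio, the quantities entering this discussion can be evaluated directly. The sketch below computes $\Delta_T=h\sqrt{\lambda T}$ and the smallest time horizon allowed by the condition $T \geq \frac{2^{11/2}h}{(k-1)^3 \lambda (1-h)^{3/2}}$. We assume $k>1$ so that the threshold is positive; the function names and the numerical values are hypothetical.

```python
import math


def delta_T(lam, h, T):
    """Delta_T = h * sqrt(lam * T), valid in the regime h - 1 - log(h) >= 1."""
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1)")
    if h - 1.0 - math.log(h) < 1.0:
        raise ValueError("h is outside the regime h - 1 - log(h) >= 1")
    return h * math.sqrt(lam * T)


def min_time_horizon(k, lam, h):
    """Smallest T satisfying T >= 2^(11/2) h / ((k-1)^3 lam (1-h)^(3/2)).

    k > 1 is an assumption made here so that the right-hand side is positive.
    """
    if k <= 1.0:
        raise ValueError("k must exceed 1")
    if lam <= 0.0 or not 0.0 < h < 1.0:
        raise ValueError("need lam > 0 and h in (0, 1)")
    return 2.0**5.5 * h / ((k - 1.0) ** 3 * lam * (1.0 - h) ** 1.5)


# Illustrative parameters: baseline intensity 1 and offspring mean 0.1.
lam, h = 1.0, 0.1
for k in (1.5, 2.0, 3.0):
    T_min = min_time_horizon(k, lam, h)
    print(k, T_min, delta_T(lam, h, T_min))
```

The threshold decreases like $(k-1)^{-3}$, so the simplified inequality becomes available over short horizons as soon as $k$ is moderately larger than one.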
7.4. Generalized compound Hawkes processes with binomial offspring distribution
7.4.1. Gaussian approximation
In this paragraph we suppose that Z is distributed as the total progeny of a Galton–Watson process with one ancestor and offspring distribution the binomial law with parameters (h, p), with $h\in\mathbb N$ and $p\in (0,1)$ such that $hp\in (0,1)$ . We assume that $\{X_n\}_{n\geq 1}$ is a Poisson process on ${\mathbb R}^d$ with intensity function $\lambda(x)=\lambda\textbf{1}_B(x)$ , $x\in\mathbb R^d$ , for some positive constant $\lambda>0$ and some Borel set $B\subseteq\mathbb{R}^d$ . We denote by $V_{\mathrm{binomial}}$ the corresponding generalized compound Hawkes process and by $W_{\mathrm{binomial}}$ the functional (8) with $V_{\mathrm{binomial}}$ in place of V.
Corollary 5. Under the foregoing assumptions and notation, if the Borel sets B and C are such that $0<\mathrm{Leb}(B\cap C)<+\infty$ and ${\mathbb E} M^2\in (0,\infty)$ , then the bounds (18) and (19) hold with $W_{\mathrm{binomial}}$ in place of W,
and
Here $(h)_n\;:\!=\;h(h-1)\ldots (h-(n-1))\textbf{1}_{\{h\geq n\}}$ .
Proof. Similar to the proof of Corollary 3.
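The moments of Z entering these bounds can be checked numerically in the binomial case. The sketch below is built on two assumptions stated here rather than taken from the text above: the law of the total progeny is computed from the Otter–Dwass formula, which gives $\mathbb P(Z=k)=\frac{1}{k}\binom{kh}{k-1}p^{k-1}(1-p)^{k(h-1)+1}$, and the result is compared with the classical identities $\mathbb E Z=(1-hp)^{-1}$ and $\mathbb{V}\mathrm{ar}(Z)=hp(1-p)(1-hp)^{-3}$. Function names and parameter values are illustrative.

```python
import math


def binomial_progeny_pmf(k, h, p):
    """P(Z = k) for binomial(h, p) offspring, via the Otter-Dwass formula."""
    # log of binom(k h, k - 1) = log (kh)! - log (k-1)! - log (kh - k + 1)!
    log_binom = (math.lgamma(k * h + 1) - math.lgamma(k)
                 - math.lgamma(k * h - k + 2))
    log_pmf = (-math.log(k) + log_binom + (k - 1) * math.log(p)
               + (k * h - k + 1) * math.log(1.0 - p))
    return math.exp(log_pmf)


def binomial_progeny_moment(n, h, p, kmax=2000):
    """Truncated series for E[Z^n]; adequate when h * p is not close to 1."""
    return sum(k**n * binomial_progeny_pmf(k, h, p) for k in range(1, kmax + 1))


h, p = 3, 0.2                      # sub-critical: h * p = 0.6 < 1
mean_series = binomial_progeny_moment(1, h, p)
second_series = binomial_progeny_moment(2, h, p)
mean_closed = 1.0 / (1.0 - h * p)
second_closed = h * p * (1.0 - p) / (1.0 - h * p) ** 3 + mean_closed**2
print(mean_series, mean_closed)
print(second_series, second_closed)
```

For $h=1$ the same probability mass function reduces to the geometric law $p^{k-1}(1-p)$, which provides a quick consistency check.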
7.4.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term
In this paragraph we suppose that, for each $\ell\in\mathbb N$ , $Z=Z_1^{\ell}({\mathbb R}^d,{\mathbb R})$ is distributed as the total progeny of a Galton–Watson process with one ancestor and offspring distribution the binomial law with parameters (h, p), with $h\in\mathbb N$ and $p\in (0,1)$ such that $hp\in (0,1)$ . We assume that $\{X_n^{(\ell)}\}_{n\geq 1}$ is a Poisson process on ${\mathbb R}^d$ with intensity function $\lambda_\ell (x)=\lambda_\ell \boldsymbol 1_{B_\ell}(x)$ , $x\in\mathbb R^d$ , for positive constants $\lambda_\ell>0$ and Borel sets $B_\ell\subseteq\mathbb R^d$ , $\ell \in \mathbb N$ . We denote by $V_{\mathrm{binomial}}^{(\ell)}$ the corresponding generalized compound Hawkes process and by $W_{\mathrm{binomial}}^{(\ell)}$ the functional (10) with $V_{\mathrm{binomial}}^{(\ell)}$ in place of $V_\ell$ .
Corollary 6. Let the foregoing assumptions and notation prevail, and let the Borel sets $B_\ell$ and $C_\ell$ , $\ell \in \mathbb N$ , be such that $0<\mathrm{Leb}(B_\ell \cap C_\ell)<+\infty$ , $\ell\in\mathbb N$ , ${\mathbb E} M^2>0$ . Assume (34). Then the following hold:
(i.1) If $h=1$ and $p\leq\mathrm{e}^{-1}$, then the sequence $\{W_{\mathrm{binomial}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell\;:\!=\; \frac{p}{1.05(1-p)}\sqrt{\lambda_\ell \mathrm{Leb}(B_\ell \cap C_\ell)}.\]
(i.2) If $h=1$ and $p>\mathrm{e}^{-1}$, then the sequence $\{W_{\mathrm{binomial}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell\;:\!=\;\frac{p(\log p)^4}{1.05(1-p)}\sqrt{\lambda_\ell \mathrm{Leb}(B_\ell \cap C_\ell)}.\]
(ii.1) If $h\geq 2$ and $ph(h(1-p)/(h-1))^{h-1}\leq\mathrm{e}^{-1}$, then the sequence $\{W_{\mathrm{binomial}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell\;:\!=\;\left(1+\sqrt{1+\frac{1}{h-1}} \frac{\mathrm e^{\frac{1}{24\cdot25}}(1-p)}{p(h-1)\sqrt {2\pi}}\right)^{-1}\sqrt{\lambda_\ell \mathrm{Leb}(B_\ell \cap C_\ell)}.\]
(ii.2) If $h\geq 2$ and $ph(h(1-p)/(h-1))^{h-1}>\mathrm{e}^{-1}$, then the sequence $\{W_{\mathrm{binomial}}^{(\ell)}(C_\ell)\}_{\ell\geq 1}$ satisfies $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$, where
\[\Delta_\ell \;:\!=\;-\frac{\left(1+\sqrt{1+\frac{1}{h-1}} \frac{\mathrm e^{\frac{1}{24\cdot25}}(1-p)}{p(h-1)\sqrt {2\pi}}\right)^{-1}\left[\log\left(ph\left( \frac{h(1-p)}{h-1}\right) ^{h-1}\right)\right]^3}{1.16}\sqrt{\lambda_\ell \mathrm{Leb}(B_\ell \cap C_\ell)}.\]
Proof. It is well known that the total progeny Z of a sub-critical Galton–Watson process with one ancestor and binomial offspring law with parameters (h, p) follows the Consul distribution, i.e., \[\mathbb P(Z=k)=\frac{1}{k}\binom{kh}{k-1}p^{k-1}(1-p)^{k(h-1)+1},\quad k\in\mathbb N.\]
By Stirling’s upper and lower bounds on the factorial, for $k\geq 1$ and $h\geq 2$ , we have
where the inequality (46) follows if we notice that $kh/[k(h-1)+1]\leq h/(h-1)$ , and the last inequality follows if we notice that
and that $\left(1\pm\frac{1}{n}\right)^n \leq\mathrm e^{\pm 1},$ $n\geq 1$ . By this latter inequality (with the sign $+$ ) and the Stirling lower bound on the factorial, we have
Therefore, for $k\geq 2$ and $h\geq 2$ , we have
We now distinguish between two cases: $h=1$ and $h\geq 2$ .
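Both cases start from the law of the total progeny $Z$. The following is a minimal numerical sanity check, assuming the standard form of the Consul probability mass function, $\mathbb P(Z=k)=\frac{1}{k}\binom{kh}{k-1}p^{k-1}(1-p)^{k(h-1)+1}$, $k\geq 1$. It checks that the masses sum to one, that the mean equals $1/(1-hp)$ (the classical mean of the total progeny of a sub-critical Galton–Watson process), and that the law reduces to a geometric law when $h=1$. The function names are illustrative.

```python
import math

def consul_log_pmf(k: int, h: int, p: float) -> float:
    """log P(Z = k) for the Consul law, computed via log-gamma to avoid overflow."""
    log_binom = math.lgamma(k * h + 1) - math.lgamma(k) - math.lgamma(k * h - k + 2)
    return (-math.log(k) + log_binom
            + (k - 1) * math.log(p) + (k * h - k + 1) * math.log1p(-p))

def consul_pmf(k: int, h: int, p: float) -> float:
    return math.exp(consul_log_pmf(k, h, p))

def total_mass_and_mean(h: int, p: float, k_max: int = 5000):
    """Truncated sums of P(Z = k) and k * P(Z = k) over k = 1, ..., k_max."""
    mass = sum(consul_pmf(k, h, p) for k in range(1, k_max + 1))
    mean = sum(k * consul_pmf(k, h, p) for k in range(1, k_max + 1))
    return mass, mean

mass, mean = total_mass_and_mean(h=2, p=0.3)   # here h*p = 0.6 < 1
```

The truncation level `k_max` is a tuning parameter of the sketch; it needs to grow as $hp$ approaches 1, since the tail of $Z$ then decays more slowly.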
Case $\textit{h=1.}$
Since $\binom{kh}{k-1}=k$ , by (45), for any $m\in\mathbb N$ we have
Using Lemma 3 we have
Proof of Part (i.1).
If $p\leq\mathrm{e}^{-1}$ , then $\nu_1 \geq 1$ ; therefore, setting $u_p\;:\!=\;1.05 \frac{1-p}{p}$ , by (48), for any $m\geq 3$ and $\ell\in\mathbb N$ , we have
where the latter inequality follows if we notice that $u_p^{-1}\leq 1$ (indeed $\frac{1}{p}-1 \geq \log \left( \frac{1}{p}\right) \geq 1$ ), and so $u_p\geq 1$ . Combining (49) with (34), we have that the condition (22) is satisfied with $\Delta_\ell \;:\!=\; \frac{p}{1.05(1-p)}\sqrt{\lambda_\ell \text{Leb}(B_\ell\cap C_\ell)}$ , and the claim follows by Corollary 2.
Proof of Part (i.2).
If $p>\mathrm{e}^{-1}$ , then $\nu_1<1$ ; therefore, setting $u_p\;:\!=\;\frac{1-p}{p\nu_1^3}$ , by (48), for any $m\geq 3$ and $\ell\in\mathbb N$ ,
Since the function $(\mathrm{e}^{-1},1)\ni p\mapsto u_p=\frac{1-p}{-p(\log p)^3}$ is increasing and $\lim_{p\to\mathrm{e}^{-1}}u_p=\mathrm{e}-1>1$ , we have that $u_p>1$ and so $1.05 u_p > 1$ , $p\in (\mathrm{e}^{-1},1)$ . Therefore, for all $m\geq 3$ and $\ell\in\mathbb N$ , we have
Combining this latter inequality with (34), we have that the condition (22) is satisfied with $\Delta_\ell \;:\!=\; \frac{p(\log p)^4}{1.05(1-p)}\sqrt{\lambda_\ell \text{Leb}(B_\ell \cap C_\ell)}$ , and the claim follows by Corollary 2.
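The monotonicity claim used in the proof of Part (i.2) can be checked on a grid. The sketch below is only a numerical illustration (the grid and its endpoints are arbitrary choices): it evaluates $u_p=\frac{1-p}{-p(\log p)^3}$ on $(\mathrm{e}^{-1},1)$ and compares its value at $p=\mathrm{e}^{-1}$ with $\mathrm{e}-1$.

```python
import math

def u(p: float) -> float:
    """u_p = (1 - p) / (-p (log p)^3), defined for p in [1/e, 1)."""
    return (1.0 - p) / (-p * math.log(p) ** 3)

lo = math.exp(-1)
# Arbitrary grid on (1/e, 0.999]; the right endpoint stays away from the pole at p = 1.
grid = [lo + (0.999 - lo) * i / 2000 for i in range(1, 2001)]
values = [u(p) for p in grid]

is_increasing = all(a < b for a, b in zip(values, values[1:]))
value_at_lo = u(lo)          # should agree with e - 1 up to rounding
smallest_on_grid = min(values)
```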
$\textit{Case}\,\,{h\geq 2.}$
If $h\geq 2$ , then by (47) we have
Combining this relation with (45) we have
where
Now we are going to verify that $\nu_2>0$ , i.e.,
Setting $x\;:\!=\;ph \in (0,1)$ , we have
The relation (50) follows if we notice that the mapping $x\in (0,1)\mapsto x\mathrm e^{1-x}$ is an increasing bijection from (0,1) to itself. Therefore, by Lemma 3, for any $m\geq 3$ , we have
where
Proof of Part (ii.1).
If $ph(h(1-p)/(h-1))^{h-1}\leq\mathrm{e}^{-1}$ , i.e., $\nu_2\geq 1$ , then combining (36) (with $\nu_2$ in place of $\nu$ ) with (51), we have
where we used that $(u_{p,h})^{-1}<1$ . Combining (52) with (34), we have that the condition (22) is satisfied with $\Delta_\ell\;:\!=\;(u_{p,h})^{-1}\sqrt{\lambda_\ell \text{Leb}(B_\ell \cap C_\ell)}$ , and the claim follows by Corollary 2.
Proof of Part (ii.2).
If $ph(h(1-p)/(h-1))^{h-1}>\mathrm{e}^{-1}$ , i.e., $\nu_2<1$ , then combining (37) (with $\nu_2$ in place of $\nu$ ) with (51), we have
Since $\nu_2 <1$ , we have
Therefore
Using Lemma 4, we have that $(\widetilde{u}_{p,h})^{-1} <(u_{p,h})^{-1}\nu_2^2<(u_{p,h})^{-1}<1$ . Therefore
Combining this latter inequality with (34), we have that the condition (22) is satisfied with $\Delta_\ell \;:\!=\;\sqrt{\lambda_\ell \text{Leb}(B_\ell \cap C_\ell)}\nu_2(\widetilde{u}_{p,h})^{-1}$ , and the claim follows by Corollary 2.
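The positivity of $\nu_2$ used in the case $h\geq 2$ can also be illustrated numerically. Judging from the equivalence stated in the proof of Part (ii.1), $\nu_2$ corresponds to $-\log q$ with $q\;:\!=\;ph(h(1-p)/(h-1))^{h-1}$; this identification is an assumption of the sketch. With $x\;:\!=\;ph$, the elementary bound $(1+a/n)^n\leq \mathrm e^{a}$ gives $q\leq x\mathrm e^{1-x}<1$, and the code scans a grid to confirm both inequalities.

```python
import math

def q(h: int, p: float) -> float:
    """The quantity p*h*(h*(1-p)/(h-1))^(h-1), assumed to equal exp(-nu_2)."""
    return p * h * (h * (1.0 - p) / (h - 1)) ** (h - 1)

def scan(h_max: int = 30, steps: int = 200):
    """Largest q, and largest gap q - x*e^(1-x), over a grid with x = h*p in (0, 1)."""
    worst_q, worst_gap = 0.0, -math.inf
    for h in range(2, h_max + 1):
        for i in range(1, steps):
            x = i / steps
            val = q(h, x / h)
            worst_q = max(worst_q, val)
            worst_gap = max(worst_gap, val - x * math.exp(1.0 - x))
    return worst_q, worst_gap

worst_q, worst_gap = scan()
```

A value `worst_q` below 1 and a non-positive `worst_gap` are consistent with $\nu_2>0$ on the whole grid.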
8. Application to a class of interferences in a wireless communication model
8.1. Gaussian approximation
In this section we apply Theorem 3 to the interference $I(\{\textbf{0}\})$ (see e.g. Remark 1) when the Poisson process of node locations has a piecewise constant intensity function of the form $\lambda(x)\;:\!=\;\lambda\textbf{1}_B(x)$ , for some $\lambda>0$ and $B\in\mathcal{B}(\mathbb R^2)$ . In such a case we have quite explicit upper bounds on the Wasserstein and Kolmogorov distances. The following corollary (whose proof is straightforward, and therefore omitted) allows for explicit bounds for some classes of signal power distributions and attenuation functions.
Corollary 7. Let $\lambda(x)\;:\!=\;\lambda\textbf{1}_B(x)$ , $x\in\mathbb R^2$ , for some $\lambda>0$ and $B\in \mathcal{B}(\mathbb R^2)$ such that
Then
and
Example 2. If the path loss function is the Hertzian attenuation function, i.e., $A(x)\;:\!=\;\max\{R,\|x\|\}^{-\alpha}$ , $x\in{\mathbb R}^2$ , for some $R>0$ and $\alpha>1$ , $B\;:\!=\;{\mathbb R}^2$ , and ${\mathbb E} Z_1^2\in (0,\infty)$ , then Corollary 7 applies with
8.2. Moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term
In this section we apply Theorem 4 to the sequence $\{I_\ell(\{\textbf{0}\})\}_{\ell\geq 1}$ (defined in Remark 2) when the Poisson processes of node locations have piecewise constant intensity functions of the form $\lambda_\ell(x)\;:\!=\;\lambda_\ell\textbf{1}_{B_\ell}(x)$ , $x\in\mathbb R^2$ , for some sequences $\{\lambda_\ell\}_{\ell\geq 1}\subset (0,\infty)$ and $\{B_\ell\}_{\ell\geq 1}\subset\mathcal{B}(\mathbb R^2)$ . In such a case the assumption (17) is greatly simplified. The following corollary (whose proof is straightforward, and therefore omitted) holds.
Corollary 8. Let $\{B_\ell\}_{\ell \in \mathbb N}\subset\mathcal{B}(\mathbb R^2)$ and $\{\mathbb Q_\ell\}_{\ell\geq 1}$ be such that
and assume that there exist a constant $\gamma\geq 0$ and a positive numerical sequence $\{\Delta_\ell\}_{\ell \in \mathbb N}$ such that
Then the sequence $\{I_\ell(\{\textbf{0}\})\}_{\ell\geq 1}$ satisfies an $\textbf{MDP}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , a $\textbf{BCI}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , and an $\textbf{NACC}(\gamma,\{\Delta_\ell\}_{\ell\in\mathbb N})$ .
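When $B_\ell=\mathbb R^2$ and the attenuation is Hertzian, $A(x)=\max\{R,\|x\|\}^{-\alpha}$ as in Example 2, integrals of powers of $A$ are available in closed form. A direct computation in polar coordinates (not taken from the source, so it should be read as an assumption of this sketch) gives $\int_{\mathbb R^2}A(x)^m\,\mathrm dx=\pi R^{2-\alpha m}\frac{\alpha m}{\alpha m-2}$ whenever $\alpha m>2$. The code compares this expression with a simple quadrature; the function names are illustrative.

```python
import math

def hertz_power_integral(R: float, alpha: float, m: int) -> float:
    """Closed form of the integral of A(x)^m over R^2, A(x) = max(R, |x|)^(-alpha)."""
    if alpha * m <= 2:
        return math.inf  # the radial tail r^(1 - alpha*m) is not integrable
    return math.pi * R ** (2 - alpha * m) * alpha * m / (alpha * m - 2)

def hertz_power_integral_numeric(R: float, alpha: float, m: int, n: int = 20000) -> float:
    """Midpoint rule in polar coordinates; the tail r > R is mapped to (0, 1] via r = R / t.

    Accurate when alpha*m >= 3; for 2 < alpha*m < 3 the integrand t^(alpha*m - 3)
    is singular at t = 0 and the rule converges slowly.
    """
    disc = math.pi * R ** 2 * R ** (-alpha * m)   # contribution of the disc |x| <= R
    step = 1.0 / n
    tail = sum(((i + 0.5) * step) ** (alpha * m - 3) for i in range(n)) * step
    return disc + 2.0 * math.pi * R ** (2 - alpha * m) * tail

closed = hertz_power_integral(1.5, 2.0, 3)
numeric = hertz_power_integral_numeric(1.5, 2.0, 3)
```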
Example 3. Under the notation of Corollary 8, let us set $B_\ell\equiv\mathbb R^2$ for any $\ell\geq 1$ , suppose that $\mathbb Q_\ell$ is the exponential law with mean $\mu^{-1}$ , for some $\mu>0$ , and assume that the attenuation of the signal is Hertzian, i.e., $A(x)\;:\!=\;\max\{R,\|x\|\}^{-\alpha}$ , $x\in\mathbb R^2$ , for some constants $R>0$ and $\alpha>1$ . Then
So the assumption (53) of Corollary 8 is satisfied, and the left-hand side of the relation (54) reads, for any $\ell\in\mathbb N$ and $m\geq 3$ ,
Since $m\geq 3$ we have $\frac{\alpha m-m}{\alpha m-2}<1.$ So Corollary 8 yields that the sequence $\{I_\ell(\{\textbf{0}\})\}_{\ell\geq 1}$ satisfies an $\textbf{MDP}(0,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , a $\textbf{BCI}(0,\{\Delta_\ell\}_{\ell\in\mathbb N})$ , and an $\textbf{NACC}(0,\{\Delta_\ell\}_{\ell\in\mathbb N})$ with
9. Conclusion
Exploiting the theory developed in [Reference Last, Peccati and Schulte30], we have provided explicit bounds on the Wasserstein and Kolmogorov distances between random variables lying in the first chaos of the Poisson space and the standard normal distribution. Relying on the findings in [Reference Saulis and Statulevicius40] and on a fine control of the cumulants of the first chaos on the Poisson space, we have also provided moderate deviations, Bernstein-type concentration inequalities, and normal approximation bounds with Cramér correction term for the same random variables. We have applied these results to Poisson shot noise random variables, and in particular to generalized compound Hawkes point processes. As far as Hawkes processes are concerned, the results proven in this paper generalize many of the asymptotic theorems found in the literature [Reference Gao and Zhu20, Reference Hillairet, Huang, Khabou and Réveillac23, Reference Khabou, Privault and Réveillac26, Reference Zhu47] to the spatial case, possibly with a varying baseline intensity and with less constraining assumptions on the excitation kernels.
We point out that some Hawkes processes have a Galton–Watson representation but cannot easily be expressed as a Poisson integral of the type (12). The main example is that of a multivariate Hawkes process exhibiting both self-excitation and cross-excitation between many interacting nodes. Indeed, such a process does have a branching structure [Reference Embrechts, Liniger and Lin15], but a priori it does not fall within the context of this paper. To the best of our knowledge, the only available bounds concern the Wasserstein metric between multivariate Hawkes processes with exponential kernels and their multivariate Gaussian limit, and they are of order $O\left ( 1/\sqrt t \right)$ [Reference Khabou25].
Another interesting development of the results proven in this paper would be their extension to the whole path of the process, rather than the process evaluated at one instant. More specifically, we would like to find upper bounds on the distance between the centered and normalized path of the Poisson shot noise process, and its limiting Gaussian process in the space of càdlàg functions equipped with the Skorokhod metric, for example using the results provided in [Reference Barbour, Ross and Zheng5]. These approximation results are obviously more delicate to obtain, and to the best of our knowledge they have been studied only in a few works, such as [Reference Besançon, Coutin, Decreusefond and Moyal6].
Appendix A. Proofs of Lemmas 1, 3, and 4
A.1. Proof of Lemma 1
The claim is clearly true if $\max\{\mathrm{Leb}(B\cap C),\mathbb E Z^m,{\mathbb E} M^m\}=+\infty$ . Therefore we assume $\max\{\mathrm{Leb}(B\cap C),{\mathbb E} Z^m,{\mathbb E} M^m\}<+\infty$ . We start with the obvious inequality
Using Hölder’s inequality we have
Raising this to the $m$th power, we obtain
Using the independence between $Z_1(C-x,{\mathbb R})$ and $(|M_{k,1}|)_{k \in \mathbb N}$ and Wald’s identity, we have
and finally the inequality (55) yields
Recalling that we denote by $\{Y_{1,k}\}_{k\geq 0}$ , $Y_{1,0}\;:\!=\;\textbf{0}$ the first components of the points of $Z_1(\cdot,\cdot)$ , we have
The $m$th power of the sum of indicators can be expanded by using the multinomial theorem, which yields
Here
denotes the multinomial coefficient. By (57) we have
where the latter inequality follows from another application of the multinomial theorem. The claim easily follows by (55), (56), and (58).
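The multinomial theorem invoked twice in this proof states that $(a_1+\cdots+a_n)^m=\sum_{k_1+\cdots+k_n=m}\binom{m}{k_1,\ldots,k_n}\prod_{i=1}^n a_i^{k_i}$. The sketch below, with illustrative function names, expands the power term by term with exact integer arithmetic and can be compared with the direct power, including for $\{0,1\}$-valued summands such as indicators.

```python
import math
from itertools import product

def multinomial(ks) -> int:
    """Multinomial coefficient m! / (k_1! ... k_n!) with m = k_1 + ... + k_n."""
    coef = math.factorial(sum(ks))
    for k in ks:
        coef //= math.factorial(k)
    return coef

def expand_power(values, m: int) -> int:
    """Evaluate (sum of values)^m through the multinomial expansion."""
    total = 0
    for ks in product(range(m + 1), repeat=len(values)):
        if sum(ks) != m:
            continue  # keep only exponent vectors summing to m
        term = multinomial(ks)
        for v, k in zip(values, ks):
            term *= v ** k   # note 0 ** 0 == 1 in Python, as required here
        total += term
    return total
```

The enumeration is exponential in the number of summands, so it is meant only for small illustrative inputs.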
A.2. Proof of Lemma 3
Set $D\;:\!=\;\{z\in\mathbb C:\,\,\mathrm{Re}z>0\}$ and define $f(z)\;:\!=\;z^{m-1}\mathrm{e}^{-\nu z} $ , $z\in D$ , $m\geq 2$ , $\nu>0$ .
Clearly, $f$ is analytic on $D$; we shall check later on that the following statements hold:
Therefore, by the Abel–Plana formula (see e.g. [Reference Butzer, Ferreira, Schmeisser and Stens11]) we have (note that $f(0)=0$ )
where we used that
and that the Euler gamma function $\Gamma(\cdot)$ computed at the integer $m$ is equal to $(m-1)!$ . We proceed by bounding $|R_m|$ from above. We distinguish two cases: $m=2p$ and $m=2p+1$ , $p\in \mathbb N$ . If $m=2p$ we have
Thus
where the latter inequality follows from (62) with $\pi$ in place of $\nu$ . Similarly, we have
The claim follows by the relations (61), (63) and (64).
It remains to prove (59) and (60). We start by proving (59). Let $K\subset (0,\infty)$ be an arbitrary compact set. We have
Finally we prove (60). For any $x,y>0$ , we have
Therefore, for any $x>0$ , we have
where the latter equality follows from the relation (62) with $m-k$ in place of $m$ and $2\pi$ in place of $\nu$ . Clearly, the right-hand side of the relation (65) is finite and tends to zero as $x\to+\infty$ . The proof is completed.
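The Abel–Plana decomposition underlying this proof can be illustrated numerically. The sketch below assumes the classical form of the formula, $\sum_{k\geq 0}f(k)=\int_0^\infty f(x)\,\mathrm dx+\frac{f(0)}{2}+\mathrm i\int_0^\infty\frac{f(\mathrm iy)-f(-\mathrm iy)}{\mathrm e^{2\pi y}-1}\,\mathrm dy$, applied to $f(z)=z^{m-1}\mathrm e^{-\nu z}$, for which the first integral equals $(m-1)!/\nu^m$ and the last term reduces to $-2\int_0^\infty \mathrm{Im}f(\mathrm iy)/(\mathrm e^{2\pi y}-1)\,\mathrm dy$. Whether this last term coincides exactly with the remainder $R_m$ of the proof is not verified here. The truncation levels and function names are choices made for the illustration.

```python
import cmath
import math

def lhs_sum(m: int, nu: float, k_max: int = 2000) -> float:
    """Truncated series sum_{k>=1} k^(m-1) exp(-nu k)."""
    return sum(k ** (m - 1) * math.exp(-nu * k) for k in range(1, k_max + 1))

def abel_plana_rhs(m: int, nu: float, y_max: float = 10.0, n: int = 100000) -> float:
    """(m-1)!/nu^m plus the Abel-Plana correction, by the midpoint rule on (0, y_max)."""
    main = math.factorial(m - 1) / nu ** m
    step = y_max / n
    correction = 0.0
    for i in range(n):
        y = (i + 0.5) * step                      # midpoints avoid the point y = 0
        f_iy = (1j * y) ** (m - 1) * cmath.exp(-1j * nu * y)
        correction += -2.0 * f_iy.imag / math.expm1(2.0 * math.pi * y)
    return main + correction * step
```

For moderate values of $\nu$ the two functions agree to several digits, for example for $(m,\nu)=(2,1)$.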
A.3. Proof of Lemma 4
A simple computation shows
Since $x-1-\log x>0$ for every $x\in (0,1)$ , the sign of $f'$ coincides with the sign of $g(x)\;:\!=\;3x-3-\log x$ , $x\in (0,1)$ . Studying the derivative of $g$, we see that $g$ is increasing on $(1/3,1)$ and decreasing on $(0,1/3)$ , with a minimum at $x=1/3$ . Since $\lim_{x\to 0^+}g(x)=+\infty$ , $g(1/3)<0$ , and $\lim_{x\to 1^-}g(x)=0$ , we then have that there exists a unique $x^*\in (0,1/3)$ with $g(x^*)=0$ , $g$ is positive on $(0,x^*)$ , and it is negative on $(x^*,1)$ . Therefore $f$ has a maximum at $x^*$ , and consequently, for any $x\in (0,1)$ , we have
Here we used that $g(x^*)=0$ , that the mapping $(0,1)\ni x\mapsto x(1-x)$ is increasing on $(0,1/2)$ , and that $x^*<1/3$ .
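The location of $x^*$ can be approximated numerically. The following sketch uses plain bisection on $g(x)=3x-3-\log x$ over $(0,1/3)$, where $g$ changes sign exactly once; the bracket $[10^{-6},1/3]$ and the tolerance are arbitrary choices.

```python
import math

def g(x: float) -> float:
    return 3.0 * x - 3.0 - math.log(x)

def bisect_root(f, a: float, b: float, tol: float = 1e-14) -> float:
    """Bisection for a sign change of f on [a, b]; assumes f(a) and f(b) differ in sign."""
    fa = f(a)
    while b - a > tol:
        mid = 0.5 * (a + b)
        fm = f(mid)
        if (fa > 0) == (fm > 0):
            a, fa = mid, fm   # root lies in the right half
        else:
            b = mid           # root lies in the left half
    return 0.5 * (a + b)

x_star = bisect_root(g, 1e-6, 1.0 / 3.0)
```

The computed root lies well inside $(0,1/3)$, in line with the bound $x^*<1/3$ used above.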
Acknowledgements
We wish to thank the editor and the anonymous referees for their careful reading and constructive comments, and Prof. Matthias Schulte for useful suggestions.
Funding information
M. Khabou was supported by the project EDDA (ANR-20-IADJ-0003) of the French National Research Agency (ANR); G. L. Torrisi was supported by group GNAMPA of INdAM. For the purpose of open access, the authors have applied a Creative Commons Attribution (CC BY) license to any author-accepted manuscript of this paper.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.