1. Introduction
Let $\mathcal{S}_n$ be the symmetric group of permutations on the set of integers $\{1,\ldots,n\}$ where $n \geq 1$ . A permutation $\pi_{{n}} \in \mathcal{S}_n$ is said to have a descent at position $k \in \{1,\ldots,n-1\}$ if $\pi_{ n}(k)>\pi_{ n}(k+1)$ . Denote by $D_n=D_n(\pi_{ n})$ the random variable counting the number of descents of a permutation $\pi_{ n}$ chosen uniformly at random from $\mathcal{S}_n$ . We clearly have $D_1=0$ and, for all $n \geq 2$ ,
A host of results are available on the asymptotic behavior of the sequence $(D_n)$ . More precisely, we can find in [Reference Bóna3] that, for all $n \geq 2$ , $\mathbb{E}[D_n]= ({n-1})/{2}$ and $\text{Var}(D_n)=({n+1})/{12}$ . In addition, it is possible to get a connection with the generalized Pólya urn with two colors, also known as Friedman’s urn; see [Reference Friedman10] and Remark 2.1. In particular, for this construction we have, by [Reference Freedman9, Corollary 5.2], the almost sure (a.s.) convergence
Following the approach of [Reference Tanny17], see Section 3.2, it is also possible to construct a different sequence $(D_n)$ with the same marginal distribution using a sequence of independent random variables sharing the same uniform distribution on [0, 1]. For this construction, we directly obtain the same almost sure convergence (1.2), as noticed in [Reference Gnedin and Olshanski12, Section 7.3]. Nevertheless, the distribution of the process $(D_n)$ does not correspond to the one investigated in Section 2.
Four different approaches have been reported in [Reference Chatterjee and Diaconis5] to establish the asymptotic normality
We also refer the reader to the recent contribution of [Reference Garet11] that relies on the method of moments, as well as to the recent proof in [Reference Özdemir15] using a rather complicated martingale approach. Furthermore, denote by $L_n$ the number of leaves in a random recursive tree of size n. It is well known [Reference Zhang19] that $L_{n+1}=D_n+ 1$ . Hence, it has been proven in [Reference Bryc, Minda and Sethuraman4] that the sequence $(D_n/n)$ satisfies a large-deviation principle (LDP) with good rate function given by
where the asymptotic cumulant-generating function is of the form
The purpose of this paper is to go further in the analysis of the behavior of the number of descents by proving a sharp large-deviation principle (SLDP) for the sequence $(D_n)$ . We shall also establish a sharp concentration inequality involving the rate function I given by (1.4).
To be more precise, we propose two different approaches that lead us to an SLDP and a concentration inequality for the sequence $(D_n)$ . The first one relies on a martingale approach while the second one uses a miraculous link between the distribution of $(D_n)$ and the Irwin–Hall distribution, as pointed out in [Reference Tanny17]. On the one hand, the second method is more direct and simpler in establishing our results. On the other hand, the first approach is much more general and we are strongly convinced that it can be extended to other statistics on random permutations that share the same kind of iterative structure, such as the number of alternating runs [Reference Bóna3, Reference Stanley16] or the length of the longest alternating subsequence in a random permutation [Reference Houdré and Restrepo14, Reference Widom18]. Moreover, we have intentionally kept these two proof strategies in the manuscript in order to highlight that the martingale approach is as efficient and powerful as the direct method in terms of results.
The paper is organized as follows. Section 2 is devoted to our martingale approach which allows us to again find a direct proof of (1.2) and (1.3) and to propose new standard results for the sequence $(D_n)$ such as a law of iterated logarithm, a quadratic strong law, and a functional central limit theorem. The main results of the paper are given in Section 3. We establish an SLDP for the sequence $(D_n)$ as well as a sharp concentration inequality involving the rate function I. Three keystone lemmas are analyzed in Section 4. All the technical proofs are postponed to Sections 5–8.
2. Our martingale approach
We start by describing precisely the construction of the sequence $(D_n)$ on a unique probability space. Let us remark that this construction can be naturally linked to generalized Pólya urns; see Remark 2.1. We consider a sequence $(V_n)$ of independent random variables uniformly distributed on $\{1,\ldots,n\}$ . Then, we set $\pi_1=(1)$ and, for each $n\geq 1$ , we recursively define the permutation $\pi_{n+1}$ as
By a direct recursive argument, it is clear that, for each $n\geq 1$ , $\pi_n$ is uniformly distributed on $\mathcal{S}_n$ . Moreover, as explained in [Reference Özdemir15], it follows from (1.1) and (2.1) that, for all $n \geq 1$ ,
with $\mathcal{F}_n=\sigma(D_1, \ldots,D_n)$ . This means that
where the conditional distribution of $\xi_{n+1}$ given $\mathcal{F}_n$ is the Bernoulli $\mathcal{B}(p_n)$ distribution with parameter $p_n=({n-D_n})/({n+1})$ . Since $\mathbb{E}[\xi_{n+1} \mid \mathcal{F}_n]=p_n$ and $\mathbb{E}[\xi_{n+1}^2 \mid \mathcal{F}_n]=p_n$ , we deduce from (2.2) that
Moreover, let $(M_n)$ be the sequence defined for all $n \geq 1$ by
We obtain from (2.3) that
which means that $(M_n)$ is a locally square integrable martingale. We deduce from (2.4) that its predictable quadratic variation is given by
The martingale decomposition (2.5) allows us to again find all the asymptotic results previously established for the sequence $(D_n)$ such as the almost sure convergence (1.2) and the asymptotic normality (1.3). Some improvements to these standard results are as follows. To the best of our knowledge, the quadratic strong law and the law of iterated logarithm are new.
Proposition 2.1. We have the quadratic strong law:
Moreover, we also have the law of iterated logarithm:
In particular,
Denote by $D([0,\infty[)$ the Skorokhod space of right-continuous functions with left-hand limits. The functional central limit theorem extends the asymptotic normality (1.3); see a similar result in [Reference Gouet13] using generalized Pólya urns.
Proposition 2.2. We have the distributional convergence in $D([0,\infty[)$
where $(W_t)$ is a real-valued centered Gaussian process starting at the origin with covariance given, for all $0<s \leq t$ , by $\mathbb{E}[W_s W_t]= {s}/{12 t^2}$ . In particular, we again find the asymptotic normality (1.3).
The proofs are postponed to Section 8.
Remark 2.1. Relation (2.2) allows us to see the sequence $(D_n)$ as the sequence of the number of white balls in a two-color generalized Pólya urn [Reference Friedman10] with the following rule: at each step, one ball is drawn at random and then replaced with an additional ball of the opposite color.
3. Main results
3.1. Sharp large deviations and concentration
Our first result concerns the SLDP for the sequence $(D_n)$ , which nicely extends the LDP previously established in [Reference Bryc, Minda and Sethuraman4]. For any positive real number x, write $\{x\}=\lceil x \rceil - x$ .
Theorem 3.1. For any x in $\big]\frac12,1\big[$ , we have, on the right side,
where the value $t_x$ is the unique solution of $L^\prime(t_x)=x$ and $\sigma_x^2=L^{\prime \prime}(t_x)$ .
Our second result is devoted to an optimal concentration inequality involving the rate function I.
Theorem 3.2. For any $x \in \big]\frac12,1\big[$ and for all $n \geq 1$ , we have the concentration inequality
where the prefactor can be taken as
Remark 3.1. Let us denote by $A_n = A_n(\pi_{ n})$ the random variable counting the number of ascents of a permutation $\pi_{ n} \in S_n$ . Then, it is clear that $D_n(\pi_{ n})+A_n(\pi_{ n})=n-1$ . Moreover, by a symmetry argument, $D_n$ and $A_n$ share the same distribution. In particular, $D_n$ has the same distribution as $(n-1) - D_n$ . Consequently, for all $x \in \big]\frac12,1\big[$ , we have
which allows us to immediately extend the previous results to the left side.
Remark 3.2. One can observe from (3.1) or (3.2) that, for all $\varepsilon >0$ ,
That is the complete convergence of $(D_n/n)$ to $\frac12$ , which directly implies the almost sure convergence (1.2) for any construction of the sequence $(D_n)$ .
3.2. A more direct approach
An alternative approach to proving the SLDP and concentration inequalities for the sequence $(D_n)$ relies on a famous result from [Reference Tanny17] which says that the distribution of $D_n$ is nothing other than that of the integer part of the sum $S_n$ of independent and identically distributed random variables. More precisely, let $(U_n)$ be a sequence of independent random variables sharing the same uniform distribution on [0, 1]. Write $S_n= \sum_{k=1}^n U_k$ . Then we have, from [Reference Tanny17], that, for all $k \in [ 0, n-1 ]$ ,
This simply means that the distribution of $D_n$ is that of the integer part of the the Irwin–Hall distribution. The identity (3.3) is somewhat miraculous and it is really powerful in order to carry out a sharp analysis of the sequence $(D_n)$ . Once again, we would like to emphasize that this direct approach is only relevant for the study of $(D_n)$ , while our martingale approach is much more general. A direct proof of Theorem 3.1 is provided in Section 6, relying on the identity (3.3). It is also possible to use this direct approach in order to establish a sharp concentration inequality with the same shape as (3.2); see Remark 6.1.
3.3. Further considerations on concentration inequalities
We wish to compare our concentration inequality (3.2) with some classical ones. The first one is given by the well-known Azuma–Hoeffding inequality [Reference Bercu, Delyon and Rio2]. It follows from (2.6) that the predictable quadratic variation $\langle M \rangle_n$ of the martingale $(M_n)$ satisfies $\langle M \rangle_n \leq {s_n}/{4}$ where $s_n=\sum_{k=2}^n k^2$ . In addition, its total quadratic variation reduces to $[M]_n=\sum_{k=1}^{n-1}(M_{k+1}-M_{k})^2=s_n$ . Consequently, we deduce from an improvement of the Azuma–Hoeffding inequality given by [Reference Bercu, Delyon and Rio2, (3.20)] that, for any $x \in \big]\frac12,1\big[$ and for all $n \geq 1$ ,
We can observe that (3.2) is much sharper than (3.4) for all values of $x \in \big]\frac12,1\big[$ . Furthermore, by using (3.3), we can also infer a concentration inequality by means of Chernoff’s inequality. Indeed, for any $x \in \big]\frac12,1\big[$ and for all $n \geq 1$ ,
which is also rougher than (3.2).
4. Three keystone lemmas
Denote by $m_n$ the Laplace transform of $D_n$ defined, for all $t \in \mathbb{R}$ , by
We can observe that $m_n(t)$ is finite for all $t \in \mathbb{R}$ and all $n \geq 1$ since $D_n$ is finite. Let us introduce the generating function defined, for all $t \in \mathbb{R}$ and for all $z \in \mathbb{C}$ , by $F(t,z)=\sum_{n=0}^\infty m_n(t) z^n$ , where the initial value is such that, for all $t\in \mathbb{R}$ , $m_0(t)=1$ . Notice that the radius of convergence, denoted $R^F(t)$ , should depend on t and is positive since $|m_n(t)| \leq \textrm{e}^{n|t|}$ . Moreover, we easily have, for all $|z|<R^F(0)=1$ , $F(0,z) = {1}/({1-z})$ . Our first lemma is devoted to the calculation of the generating function F; see also [Reference Bryc, Minda and Sethuraman4, p. 865], where a similar expression was given without proof. We can observe that $k_0$ should be replaced by $1-k_0$ . Let us also remark that the recursive equation (4.4) was already given in [Reference Friedman10, Section 4].
Lemma 4.1. For all $t\in \mathbb{R}$ , $R^F(t) = {t}/({\textrm{e}^t-1})$ . Moreover, for all $t \in \mathbb{R}$ and for all $z \in \mathbb{C}$ such that $|z| < R^F(t)$ ,
Proof. It follows from (2.2) that, for all $t\in \mathbb{R}$ and for all $n \geq 1$ ,
However, we already saw that $p_n= ({n-D_n})/({n+1})$ , which implies that
Consequently, we obtain from (4.3) that, for all $t \in \mathbb{R}$ and for all $n\geq 1$ ,
We can observe that (4.4) remains true for $n=0$ . We deduce from (4.4) that, for all $|z|<R^F(t)$ ,
where the last equality comes from the fact that $|m_n^\prime(t)| \leq n m_n(t)$ allows us to apply the dominated convergence theorem in order to differentiate the series in t. Hence, we have shown that the generating function F is the solution of the partial differential equation
with initial value
We now proceed as in [Reference Flajolet, Gabarró and Pekari8] in order to solve the partial differential equation (4.5) via the classical method of characteristics; see, e.g., [Reference Zwillinger20]. Following this method, we first associate with the linear first-order partial differential equation (4.5) the ordinary differential system given by
where w stands for the generating function F. We assume in the following that $t>0$ , inasmuch as the proof for $t<0$ follows exactly the same lines. The equation binding w and t can be easily solved, and we obtain
The equation binding z and t leads to the ordinary differential equation
We find by the variation of constant method that
According to the method of characteristics, the general solution of (4.5) is obtained by coupling (4.7) and (4.8), namely
where f is a function which can be explicitly calculated from the boundary value in (4.6). We deduce from the conjunction of (4.7), (4.8), and (4.9) that, for all $t>0$ and for all $z \in \mathbb{C}$ such that $|z|<R^F(t)$ ,
It only remains to determine the exact value of the function f by taking into account the initial condition (4.6). We obtain from (4.10) with $z=0$ and replacing $-t$ by t that
Finally, the explicit solution (4.2) clearly follows from (4.10) and (4.11). Moreover, we can observe that the radius of convergence comes immediately from (4.2), which completes the proof of Lemma 4.1.
The global expression (4.2) of the generating function F allows us to deduce a sharp expansion of the Laplace transform $m_n$ of $D_n$ , as follows.
Lemma 4.2. For any $t\neq 0$ ,
where the remainder term $r_n(t)$ goes exponentially fast to zero as
Proof. Throughout the proof, we assume that $t \neq 0$ . It follows from (4.2) that F is a meromorphic function on $\mathbb{C}$ with simple poles given, for all $\ell \in \mathbb{Z}$ , by
By a slight abuse of notation, we still denote by F this meromorphic extension. Hereafter, for the sake of simplicity, we consider the function $\mathcal{F}$ defined, for all $z \in \mathbb{C}$ , by
where the function f was previously defined in (4.11) and the function $\xi$ is given, for all $z \in \mathbb{C}$ , by
By the same token, we also introduce the functions $\mathcal{G}$ and $\mathcal{H}$ defined, for all $z \in \mathbb{C}$ , by
where g and h are given, for all $z \in \mathbb{C}^*$ , by
We can immediately observe from (4.17) that $\mathcal{H} = \mathcal{F}-\mathcal{G}$ , which means that we have subtracted from $\mathcal{F}$ its simple pole at 0 to get $\mathcal{H}$ . Given a function $\mathcal{K}(t,z)$ analytic in z on some set $\{(t,z)\in \mathbb{R}\times \mathbb{C}, |z|\leq R^\mathcal{K}(t) \}$ , we denote by $m_n^\mathcal{K}(t)$ the coefficient of its Taylor series at point (t, 0), i.e. $\mathcal{K}(t,z)= \sum_{n=0}^\infty m_n^\mathcal{K}(t) z^n$ . Thanks to this notation, we clearly have $m_n^{\mathcal{F}} (t)= m_n^{\mathcal{G}} (t)+ m_n^{\mathcal{H}} (t)$ . Moreover, we deduce from (4.14) that
The first coefficient $m_n^{\mathcal{G}} (t)$ can be explicitly computed by
As a matter of fact, for all $z \in \mathbb{C}$ such that $|z| < R^{\mathcal{G}}(t)=t(\textrm{e}^t-1)^{-1}$ , it follows from (4.15) and (4.16) that
Consequently, as $m_n(t)=m_n^F(t)$ , we obtain from (4.18) that
which leads via (4.19) to
where the remainder term $r_n(t)$ is the ratio $r_n(t)=m_n^{\mathcal{H}} (t)/m_n^{\mathcal{G}} (t)$ . From now on, we focus our attention on a sharp upper bound for $m_n^{\mathcal{H}} (t)$ . The function h is meromorphic with simple poles at the points $2\textrm{i}\pi \mathbb{Z}^*$ . Moreover, for a given $t \neq 0$ , z is a pole of $\mathcal{H}$ if and only if $(\textrm{e}^t-1)z - t$ is a pole of h. Hence, the poles of $\mathcal{H}$ are given, for all $\ell \in \mathbb{Z}^*$ , by
In addition, its radius of convergence $R^\mathcal{H} (t)$ is nothing more than the shortest distance between 0 and one of these poles. Consequently, we obtain
Furthermore, it follows from Cauchy’s inequality that, for any $0<\rho(t)<R^\mathcal{H}(t)$ ,
where the norm in the numerator is
Since $\xi(t,\mathcal{C}(0,\rho(t)))$ coincides with the circle $\mathcal{C}( -t, |\textrm{e}^t- 1| \rho(t))$ , we deduce from the identity $\mathcal{H}(t,z)= h( \xi(t,z))$ that $\Vert \mathcal{H}(t,\cdot) \Vert_{\infty, \mathcal{C}(0,\rho(t))} = \Vert h \Vert_{\infty, \mathcal{C}({-}t, |\textrm{e}^t-1|\rho(t))}$ . Hereafter, we introduce a radial parameter
where $\alpha$ is a real number in the interval $]{-}t^2/4\pi^2, 1[$ . We also define the distance between the circle $\mathcal{C}({-}t, |\textrm{e}^t-1|\rho(t, \alpha))$ and the set of the poles of h,
We clearly have from the Pythagorean theorem that $\delta(t,\alpha)= \sqrt{t^2+ 4\pi^2} - \sqrt{t^2+ 4\alpha \pi^2}$ . In addition, we can easily check that
which ensures that
It follows from the maximum principle that $\Vert h \Vert_{\infty, \mathcal{C}({-}t, |\textrm{e}^t-1|\rho(t, \alpha))} \leq \Vert h \Vert_{\infty,\partial \mathcal{D} (L,\Lambda, \delta(t,\alpha))}$ where, for $L>0$ and $\Lambda>0$ large enough, $\mathcal{D} (L,\Lambda,\delta(t,\alpha))=\mathcal{B}(L,\Lambda)\cap{\mathcal{A}_h(\delta(t,\alpha))}^{\textrm{c}}$ is the domain given by the intersection of the box $\mathcal{B} (L,\Lambda)=\{z\in\mathbb{C},|\textrm{Re}(z)|<L,|\textrm{Im}(z)|<\Lambda\}$ and the complementary set of
On the one hand we have, for all $y \in \mathbb{R}$ , $|\textrm{e}^{L+\textrm{i}y}-1|\geq \textrm{e}^{L}-1$ and $|L+iy| \geq L$ , implying that, for all $y \in \mathbb{R}$ ,
By the same token, we also have, for all $y \in \mathbb{R}$ , $|\textrm{e}^{-L+\textrm{i}y}-1|\geq 1- \textrm{e}^{-L}$ and $|-L+iy| \geq L$ , leading, for all $y \in \mathbb{R}$ , to
On the other hand, we can choose $\Lambda$ of the form $\Lambda=(2 k+1) \pi $ for a value $k\in \mathbb{N}^*$ large enough. Then, for all $x \in \mathbb{R}$ , $\exp\!(x+(2k+1)\textrm{i}\pi)=-\exp\!(x)$ and $|x+(2k+1)\textrm{i}\pi|\geq(2k+1)\pi$ , implying that, for all $x \in \mathbb{R}$ ,
By letting L and $\Lambda$ go to infinity, we obtain
where $\mathcal{D} (\delta(t,\alpha))$ is the domain $\mathcal{D} (\delta(t,\alpha))= {\mathcal{A}_h(\delta(t,\alpha))}^{\textrm{c}}$ . We clearly have from (4.17) that, for all $z\in \partial \mathcal{D} (\delta(t,\alpha))$ with $|\textrm{Im}(z)| > \pi$ ,
Moreover, it follows from tedious but straightforward calculations that
which ensures that
In addition, we obtain from (4.23) that, for all $z\in \partial \mathcal{D} (\delta(t,\alpha))$ with $|\textrm{Im}(z)|=\pi$ ,
Hence, we find from (4.24), (4.25), and (4.26) that
We were not able to find an explicit maximum for the previous upper bound. However, it is not hard to see that
which gives us
Consequently, we deduce from (4.20), (4.21), (4.22), and (4.27) that, for all $t \neq 0$ and for all $n \geq 1$ ,
where
For the sake of simplicity, let $\Phi$ be the function defined, for all $\alpha \in\! ]{-}t^2/4\pi^2,1[$ , by
We can easily see that $\Phi$ is a convex function reaching its minimum for the value
Some numerical experiments show that this explicit value seems to be not far from being the optimal value that minimizes $\varphi_n(t,\alpha)$ . By plugging $\alpha$ into (4.29), we obtain from (4.28) that, for all $t \neq 0$ and for all $n \geq 1$ ,
Finally, (4.13) follows from (4.19) together with (4.30), which completes the proof of Lemma 4.2.
We now focus our attention on a complex estimate of the Laplace transform $m_n$ of $D_n$ since $m_n$ clearly extends to an analytic function on $\mathbb{C}$ . More precisely, our goal is to compute an estimate of $m_n(t+\textrm{i}v)$ for $t \neq 0$ and for $v \in \mathbb{R}$ such that $|v|< \pi$ . Note that $m_n$ is $2\textrm{i}\pi$ periodic.
Lemma 4.3. For any $t \neq 0$ and for all $v \in \mathbb{R}$ such that $|v|<\pi$ ,
where the remainder term $r_n(t+\textrm{i}v)$ is exponentially negligible and satisfies
Moreover, for any $t \neq 0$ and for all $v \in \mathbb{R}$ such that $|v|\leq \pi$ , we also have the alternative upper bound
where the second derivative of the asymptotic cumulant-generating function L is the positive function given by
Proof. We still assume in the following that $t \neq 0$ . We also extend F(t, z) in the complex plane with respect to the first variable, $F(t+\textrm{i}v,z) = \sum_{n= 0}^\infty m_n(t+\textrm{i}v) z^n$ , where the initial value is such that $m_0(t+\textrm{i}v)=1$ . Since $|m_n(t+\textrm{i}v)| \leq m_n(t)$ , the radius of convergence in z of $F(t+\textrm{i}v,\cdot)$ is at least the one for $v=0$ . Moreover, the poles of $F(t+\textrm{i}v,\cdot)$ are given, for all $\ell \in \mathbb{Z}$ , by
Consequently, for all $v \in \mathbb{R}$ such that $|v|< \pi$ ,
As in the proof of Lemma 4.2, we can split $F(t+\textrm{i}v,z)$ into two terms,
where we recall from (4.16) that, for all $z \in \mathbb{C}$ and for all $v \in \mathbb{R}$ such that $|v|< \pi$ ,
where g and h are given, for all $z \in \mathbb{C}^*$ , by
and the function $\xi$ is such that $\xi(t+\textrm{i}v,z)= (\textrm{e}^{t+\textrm{i}v}-1)z-(t+\textrm{i}v)$ . By holomorphic extension, we deduce from (4.19) that
Moreover, the poles of $\mathcal{H}(t+\textrm{i}v)$ are given, for all $\ell \in \mathbb{Z}^*$ , by
Hence, we obtain that, for all $v \in \mathbb{R}$ such that $|v|< \pi$ ,
It follows once again from Cauchy’s inequality that, for any $0<\rho(t+\textrm{i}v)<R^\mathcal{H}(t+\textrm{i}v)$ ,
where the norm in the numerator is
Since the image of the circle $\mathcal{C}(0,\rho(t+\textrm{i}v))$ by the application $\xi(t+\textrm{i}v,\cdot)$ coincides with the circle $\mathcal{C}({-}(t+\textrm{i}v),|\textrm{e}^{t+\textrm{i}v}-1|\rho(t+\textrm{i}v))$ , we obtain from $\mathcal{H}(t+\textrm{i}v,z)=h(\xi(t+\textrm{i}v,z))$ that
Hereafter, since $|v|< \pi$ , we can take the radius
Moreover, as in the proof of Lemma 4.2, denote by $\delta(t+\textrm{i}v)$ the distance between the circle $\mathcal{C}({-}(t+\textrm{i}v),|\textrm{e}^{t+\textrm{i}v}-1|\rho(t+\textrm{i}v)))$ and the set of poles of h,
It follows from (4.35) and the Pythagorean theorem that
We can observe that
which leads to
Using (4.34) together with (4.27) and (4.36), we obtain
Hence, we find that $m_n(t+\textrm{i}v) = (1-\textrm{e}^{-(t+\textrm{i}v)})m_n^{\mathcal{G}}(t+\textrm{i}v)(1+r_n(t+\textrm{i}v))$ , where the remainder term $r_n(t)$ is the ratio
that satisfies
Hereafter, we go further in the analyses of $m_n(t+\textrm{i}v)$ by providing a different upper bound for $m_n^\mathcal{H}(t+\textrm{i}v)$ . Our motivation is that the factor ${1}/({\pi -|v|})$ in (4.37) becomes very large when $|v|$ is close to $\pi$ . Our strategy is not to obtain the best exponent by taking the largest radius, close to the radius of convergence. Instead, we consider a smaller radius in order to stay away from the poles, but not too small in order to still have an exponential term with respect to $m_n^\mathcal{F}(t)$ . Let $\beta$ be the function defined, for all $|v| < \pi$ , by
It is clear that $\beta$ is an even function, increasing on $[{-}\pi,0]$ and decreasing on $[0,\pi]$ with a maximum value $\beta(0)=1$ , and such that $\beta(\pi)=4/\pi^2$ . We replace the radius previously given by (4.35) by the new radius
We can observe that we only replaced $\pi^2$ by $\beta(v)v^2=2(1-\cos\!(v))$ . As before, denote by $\delta(t+\textrm{i}v)$ the distance between the circle $\mathcal{C}({-}(t+\textrm{i}v),|\textrm{e}^{t+\textrm{i}v}-1|\rho(t+\textrm{i}v))$ and the set of the poles of h,
As in the proof of (4.36), we obtain
which ensures that
It follows from straightforward calculation that
Moreover, we also have from (1.5) that, for all $t \neq 0$ ,
In addition, by using that for $|v| \leq \pi$ we have
we deduce from the elementary inequality $1-x \leq \exp\!({-}x)$ that
We also recall that
We also have from straightforward calculation that
since $\beta(v) \leq 1$ . Hence, we obtain from (4.38) that
Finally, we already saw that $m_n(t+\textrm{i}v) = (1-\textrm{e}^{-(t+\textrm{i}v)})(m_n^{\mathcal{G}}(t+\textrm{i}v) + m_n^{\mathcal{H}}(t+\textrm{i}v))$ . Consequently, (4.39) together with (4.42) clearly lead to (4.33), which achieves the proof of Lemma 4.3.
5. Proof of the sharp large-deviation principle
Let us start with an elementary lemma concerning the asymptotic cumulant-generating function L defined by (1.5).
Lemma 5.1. The function $L \,:\, \mathbb{R} \to \mathbb{R}$ is twice differentiable and strictly convex, and its first derivative $L^{\prime}\,:\,\mathbb{R} \to ]0,1[$ is a bijection. In particular, for each $x\in ]0,1[$ , there exists a unique value $t_x\in \mathbb{R}$ such that
where I is the Fenchel–Legendre transform of L. The value $t_x$ is also characterized by the relation $L^{\prime}(t_x) =x$ where, for all $t \neq 0$ ,
Moreover, for all $x \in \big]\frac12,1\big[$ , $t_x >0$ , while for all $x \in \big]0,\frac12\big[$ , $t_x<0$ . In addition, for all $t \in \mathbb{R}$ , $L^{\prime\prime}(t)>0$ as the second derivative of L is given, for all $t \neq 0$ , by
Finally, the function L can be extended to a function $L\,:\,\mathbb{C}\setminus2\textrm{i}\pi\mathbb{Z}^*\to\mathbb{C}$ satisfying, for all $v \in \mathbb{R}$ such that $|v|\leq \pi$ , $\textrm{Re}(L(t+\textrm{i}v)) \leq L(t) - C(t) {v^2}/{2}$ , where
Proof. We saw in the previous section that the calculation of the first two derivatives (5.2) and (5.3) of L follows from straightforward calculation. Let us remark that $\lim_{t \to 0}L^{\prime}(t)=\frac12$ and $\lim_{t \to 0}L^{\prime\prime}(t)=\frac{1}{12}$ , which means that L can be extended as a $C^2$ function on $\mathbb{R}$ . The above computation also gives $\lim_{t\to-\infty}L^{\prime}(t)=0$ and $\lim_{t\to+\infty} L^{\prime}(t)=1$ .
We now focus our attention on the complex extension of L. We deduce from (4.41) and (5.3) together with the elementary inequality $\ln(1-x) \leq -x$ that, for all $t\neq 0$ and $|v|\leq \pi$ ,
which completes the proof of Lemma 5.1.
We continue with an elementary lemma which can be seen as a slight extension of the usual Laplace method.
Lemma 5.2. Let us consider two real numbers $a< 0 <b$ , and two functions $f\,:\,[a,b]\rightarrow \mathbb{C}$ and $\varphi\,:\,[a,b] \rightarrow \mathbb{C}$ such that, for all $\lambda$ large enough, $\int_{a}^{b}\textrm{e}^{-\lambda\textrm{Re}\,\varphi(u)}|f(u)|\,\textrm{d} u < +\infty$ . Assume that f is a continuous function in $0, f(0)\neq 0$ , $\varphi$ is a $C^2$ function in $0, \varphi^{\prime}(0)=0$ , $\varphi^{\prime\prime}(0)$ is a real positive number, and there exists a constant $C>0$ such that $\textrm{Re}\,\varphi(u) \geq \textrm{Re}\,\varphi(0) + Cu^2$ . Then,
Proof. First of all, we can assume without loss of generality that $\varphi(0)=0$ . We can observe that, for all $\lambda$ large enough,
However, it follows from the assumptions on the function $\varphi$ that
together with
Consequently, according to the dominated convergence theorem, we obtain
Furthermore, by using the usual Laplace method, we find that
Finally, (5.4), (5.5), and (5.6) allow us to conclude the proof of Lemma 5.2.
Proof of Theorem 3.1. We now proceed to the proof of Theorem 3.1. Our goal is to estimate, for all $x \in \big]\frac12,1\big[$ , the probability
To do so, we extend the Laplace transform $m_n$ of $D_n$ , defined in (4.1), into an analytic function on the complex plane. For all $t,v\in \mathbb{R}$ ,
Therefore, for all $t,v\in \mathbb{R}$ and for all $k \geq 0$ ,
We can observe that (5.7) is also true for $k \geq n$ , allowing us to recover that $\mathbb{P}(D_n=k)=0$ . Consequently, since $|m_n(t+\textrm{i}v)|\leq\textrm{e}^{tn}$ , it follows from Fubini’s theorem that, for all $t>0$ ,
In the following we choose $t=t_x$ . In particular, $t_x > 0$ since $x> \frac12$ . Then, we deduce from (5.8) that
where the integral $I_n$ can be separated into two parts, $I_n=J_n+K_n$ , with
where $\varepsilon_n= n^{-3/4}$ . On the one hand, we obtain from (4.12) that
by using (4.12), (5.1), and the fact that $t_x>0$ . Consequently, (4.13) together with the definition of $\varepsilon_n$ ensure that
It only remains to evaluate the integral $J_n$ . We deduce from (4.31) and (5.1) that
where the functions $\varphi$ and g are given, for all $|v| < \pi$ , by
Thanks to Lemma 5.1, we have $\varphi(0)=0$ , $\varphi^{\prime}(0)=0$ , and $\varphi^{\prime\prime}(0)=\sigma_x^2$ , which is a positive real number. In addition, there exists a constant $C>0$ such that $\textrm{Re}\,\varphi(v) \geq Cv^2$ . Therefore, via the extended Laplace method given by Lemma 5.2, we obtain
It follows from (4.32) that there exist positive constants $a_x, b_x, c_x$ such that, for all $|v|\leq \pi-\varepsilon_n$ ,
which ensures that
Then, we obtain
where
since $\mathbf{1}_{|v|\geq\pi -\varepsilon_n} \leq |v|$ . By the standard Laplace method, $\lim_{n\rightarrow\infty}\sqrt{n}\Lambda_n=0$ . Consequently, we deduce from (5.11) and (5.12) that
Finally, (5.9) together with (5.10) and (5.13) clearly lead to (3.1).
6. An alternative proof
Alternative proof of Theorem 3.1. We already saw from (3.3) that the distribution $D_n$ is nothing other than that of the integer part of the Irwin–Hall distribution, which is the sum $S_n=U_1+\cdots+U_n$ of independent and identically distributed random variables sharing the same uniform distribution on [0, 1]. It follows from some direct calculation that, for any $x\in \big]\frac12, 1\big[$ ,
where $\mathbb{E}_n$ is the expectation under the new probability $\mathbb{P}_n$ given by
and $\varepsilon_n= \{nx\}/n$ . Let
and denote by $f_n$ and $\Phi_n$ the probability density function and the characteristic function of $V_n$ under the new probability $\mathbb{P}_n$ , respectively. Let us remark that, under $\mathbb{P}_n$ , we know that $(V_n)$ converges in distribution to the standard Gaussian measure. Using the Parseval identity, we have
Recalling that L is also the logarithm of the Laplace transform of the uniform distribution in [0, 1], we obtain from (6.2) that
Let A be a positive constant chosen later. We can split the integral in (6.3) into two parts: we call $J_n$ the integral on $[{-}A\sigma_x\sqrt{n}, +A\sigma_x\sqrt{n}]$ and $K_n$ the integral on the complementary set. On the one hand, as (4.41) also holds when replacing $\pi^2$ by $v^2$ , which is smaller than $A^2 \sigma_x^2 n$ , and since $L^{\prime\prime}(t_x)=\sigma_x^2$ , we get, for all $v \in \mathbb{R}$ ,
Then, we deduce from Lebesgue’s dominated convergence theorem that
On the other hand, concerning $K_n$ , since now v is large in the integral we use (4.40) to get
leading to
Moreover, for all $n>2$ ,
By taking $A^2 = t_x^2+8t_x^2{\textrm{e}^{t_x}}/{(\textrm{e}^{t_x}-1)^2}$ , we obtain from (6.6) and (6.7) that
exponentially fast. Finally, (6.1) together with (6.3), (6.5), and (6.8) allow us to conclude the alternative proof of Theorem 3.1.
Remark 6.1. We can use the previous computations to also get a new concentration inequality. More precisely, by using the upper bound (6.4) to get an upper bound for $J_n$ instead of a limit, we are able to prove that
where the prefactor can be taken as
We can observe that (6.9) is similar to (3.2). Note that the constants in (3.2) as well as in (6.9) are not sharp. It is in fact possible to improve them by more precise cuttings in the integrals.
7. Proof of the concentration inequalities
We now proceed to prove the concentration inequalities. Recalling that $x \in \big]\frac12,1\big[$ , which implies that $t_x>0$ , we obtain from the equality in (5.8) that
Consequently, we deduce from the alternative upper bound in (4.33) that
where
Hereafter, we recall from (1.4) that $I(x)=x t_x -L(t_x)$ and we write $\sigma_x^2=L^{\prime\prime}(t_x)$ . It follows from standard Gaussian calculation that
Finally, we find from (7.1) together with (7.2) that
where
which is exactly what we wanted to prove.
8. Proofs of the standard results
We now focus our attention on the more standard results concerning the sequence $(D_n)$ such as the quadratic strong law, the law of iterated logarithm, and the functional central limit theorem.
Proof of Proposition 2.1. First of all, we can observe from (2.2) and (2.5) that the martingale $(M_n)$ can be rewritten in the additive form $M_n=\sum_{k=1}^{n-1} (k+1) (\xi_{k+1}-p_k)$ . It follows from the almost sure convergence (1.2) together with (2.6) and the classical Toeplitz lemma that the predictable quadratic variation $\langle M \rangle_n$ of $(M_n)$ satisfies
Denote by $f_n$ the explosion coefficient associated with $(M_n)$ ,
We obtain from (1.2) and (8.1) that $\lim_{n \rightarrow \infty}nf_n=3$ a.s., which implies that $f_n$ converges to zero almost surely as n goes to infinity. In addition, we clearly have for all $n\geq 1$ , $| \xi_{n+1} -p_n | \leq 1$ . Consequently, we deduce from the quadratic strong law for martingales given, e.g., by [Reference Bercu1, Theorem 3] that
which ensures that
However, it follows from (2.5) that
Therefore, we obtain once again from (1.2) together with (8.2) and (8.3) that
which is exactly the quadratic strong law (2.7). It only remains to prove the law of iterated logarithm given by (2.8). It immediately follows from the law of iterated logarithm for martingales given, e.g., by [Reference Duflo6, Corollary 6.4.25] that
which leads via (8.1) to
Finally, we find from (2.5) and (8.4) that
which completes the proof of Proposition 2.1.
Proof of Proposition 2.2. We now proceed to prove the functional central limit theorem given by the distributional convergence (2.10). On the one hand, it follows from (8.1) that, for all $t \geq 0$ ,
On the other hand, it is quite straightforward to check that $(M_n)$ satisfies Lindeberg’s condition given, for all $t \geq 0$ and for any $\varepsilon>0$ , by
where $\Delta M_n=M_n-M_{n-1}=n(\xi_n - p_{n-1})$ . As a matter of fact, we have, for all $t \geq 0$ and for any $\varepsilon>0$ ,
However, we already saw that, for all $n \geq 2$ , $|\Delta M_n|\leq n$ . Consequently, for all $t \geq 0$ and for any $\varepsilon>0$ ,
which immediately implies (8.6). Therefore, we deduce from (8.5) and (8.6) together with the functional central limit theorem for martingales given, e.g., in [Reference Durrett and Resnick7, Theorem 2.5] that
where $(B_t, t \geq 0)$ is a real-valued centered Gaussian process starting at the origin with covariance given, for all $0<s \leq t$ , by $\mathbb{E}[B_s B_t]= s^{3}/12$ . Finally, (2.5) and (8.7) lead to (2.10) where $W_t=B_t/t^2$ , which is exactly what we wanted to prove.
Acknowledgements
The authors would like to thank the two anonymous reviewers and the associate editor for their careful reading and helpful comments which helped to improve the paper substantially.
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.