
Asymptotic behaviour of the first positions of uniform parking functions

Published online by Cambridge University Press:  20 April 2023

Etienne Bellin*
Affiliation:
Ecole Polytechnique
*Postal address: Centre de Mathématiques Appliquées, Ecole Polytechnique, Palaiseau, France. Email address: [email protected]

Abstract

In this paper we study the asymptotic behaviour of a random uniform parking function $\pi_n$ of size n. We show that the first $k_n$ places $\pi_n(1),\ldots,\pi_n(k_n)$ of $\pi_n$ are asymptotically independent and identically distributed (i.i.d.) and uniform on $\{1,2,\ldots,n\}$, for the total variation distance when $k_n = {\rm{o}}(\sqrt{n})$, and for the Kolmogorov distance when $k_n={\rm{o}}(n)$, improving results of Diaconis and Hicks. Moreover, we give bounds for the rate of convergence, as well as limit theorems for certain statistics such as the sum or the maximum of the first $k_n$ parking places. The main tool is a reformulation using conditioned random walks.

Type
Original Article
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

A parking function of size n is a function $\pi_n\,:\, [\![ 1,n ]\!] \to [\![ 1,n ]\!]$ such that, if $\pi^{\prime}_n(1) \leq \cdots \leq \pi^{\prime}_n(n)$ is the non-decreasing rearrangement of $(\pi_n(1),\ldots,\pi_n(n))$ , then $\pi^{\prime}_n(i) \leq i$ for all i. Konheim and Weiss [Reference Konheim and Weiss16] first introduced parking functions, in the context of information storage, to study hashing functions, and they have shown that there are $(n+1)^{n-1}$ parking functions of size n. Since then, parking functions have become a subject of interest in the fields of combinatorics, probability, group theory, and computer science. More precisely, parking functions are linked to the enumerative theory of trees and forests [Reference Chassaing and Marckert8], to coalescent processes [Reference Broutin and Marckert6, Reference Chassaing and Louchard7], to the analysis of set partitions [Reference Stanley20], hyperplane arrangements [Reference Shi19, Reference Stanley21], polytopes [Reference Chebikin and Postnikov9, Reference Stanley and Pitman22], and sandpile groups [Reference Cori and Rossin10]. Finally, the study of probabilistic properties of parking functions has recently attracted some interest [Reference Diaconis and Hicks11, Reference Kenyon and Yin15, Reference Yin25]. We refer to [Reference Yan24] for an extensive survey. Our initial interest in parking functions comes from the study of minimal factorisations of cycles [Reference Biane3].
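As a concrete illustration of the definition (this sketch is ours, not part of the original text), the following short Python snippet checks the parking-function condition by brute force and recovers the count $(n+1)^{n-1}$ for a small n; the function name is an arbitrary choice.

```python
from itertools import product

def is_parking_function(pi):
    """Defining condition: the non-decreasing rearrangement pi' satisfies pi'(i) <= i."""
    rearranged = sorted(pi)
    return all(rearranged[i] <= i + 1 for i in range(len(pi)))  # i + 1 because Python is 0-indexed

n = 3
count = sum(is_parking_function(pi) for pi in product(range(1, n + 1), repeat=n))
print(count, (n + 1) ** (n - 1))  # both equal 16 for n = 3
```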

For all $n\geq 1$ , consider a random parking function $(\pi_n(i))_{1 \leq i \leq n}$ chosen uniformly among all the $(n+1)^{n-1}$ possible parking functions of size n. For all $1 \leq k \leq n$ , let

(1) \begin{equation}d_{\mathit{TV}}(k,n) \,:\!=\, \dfrac12 \sum_{i_1,\ldots,i_k=1}^n \biggl| \mathbb{P}(\pi_n(1)=i_1,\ldots,\pi_n(k)=i_k) - \dfrac{1}{n^k} \biggr|\end{equation}

denote the total variation distance between $(\pi_n(1),\ldots,\pi_n(k))$ and $(U_n(1),\ldots,U_n(k))$ , where $(U_n(i))_{1 \leq i \leq n}$ are independent and identically distributed (i.i.d.) and uniform in $[\![ 1,n ]\!]$ . Diaconis and Hicks [Reference Diaconis and Hicks11, Corollary 6] have shown that $d_{\mathit{TV}}(1,n)$ tends to 0 as n tends to infinity, and conjectured that for any fixed k, $d_{\mathit{TV}}(k,n)$ should be ${\rm{O}}(k/\sqrt{n})$ . In the same paper they studied the Kolmogorov distance

(2) \begin{equation}d_K(k,n) \,:\!=\, \max_{1 \leq i_1, \ldots, i_k \leq n} \biggl| \mathbb{P}(\pi_n(1)\leq i_1,\ldots,\pi_n(k) \leq i_k) - \dfrac{i_1\cdots i_k}{n^k} \biggr| ,\end{equation}

and have shown [Reference Diaconis and Hicks11, Theorem 3] that, for $1\leq k \leq n$,

\begin{align*}d_K(k,n) = {\rm{O}}\biggl( k\sqrt{\dfrac{\log{n}}{n}} + \dfrac{k^2}{n} \biggr).\end{align*}

They also discuss the growth threshold of k at which $d_K$ no longer converges towards 0, and find that for k of order n the convergence fails. We prove a stronger version of Diaconis and Hicks’ conjecture when k is allowed to grow with n at rate at most $\sqrt{n}$ . Moreover, the Kolmogorov distance converges towards 0 when $k = {\rm{o}}(n)$ , as shown in the following result.

Theorem 1. (i) If $k_n = {\rm{o}}(\sqrt{n})$ , then

(3) \begin{equation}d_{\mathit{TV}}(k_n,n) = {\rm{O}} \biggl( \dfrac{k_n}{\sqrt{n}} \biggr).\end{equation}
(ii) If $k_n = {\rm{o}}(n)$ and $\sqrt{n} = {\rm{o}}(k_n)$ , then

    (4) \begin{equation}d_K(k_n,n) = {\rm{O}}\biggl( \dfrac{\sqrt{n}}{k_n} + \biggl(\dfrac{k_n}{n}\biggr)^{0.19} \biggr).\end{equation}

Remark 1. In Theorem 1(ii), $\sqrt{n}$ is assumed to be ${\rm{o}}(k_n)$ . Since the function $k \mapsto d_K(k,n)$ is non-decreasing for fixed n, the distance $d_K(k_n,n)$ still tends towards 0 as long as $k_n = {\rm{o}}(n)$ . Thus the sequence $a_n = n$ is a threshold for $d_K$: $d_K(k_n,n) \to 0$ if $k_n = {\rm{o}}(a_n)$ and $d_K(k_n,n) \not\to 0$ if $a_n = {\rm{O}}(k_n)$ . It would be very interesting to identify such a sequence for $d_{\mathit{TV}}$ instead of $d_K$ , and, in particular, to see if $d_{\mathit{TV}}(k_n,n) \to 0$ when $k_n={\rm{o}}(n)$ .

The main idea in proving Theorem 1 is to express the law of $\pi_n$ in terms of a conditioned random walk (Proposition 2 below) and to control its moments, uniformly in time (Proposition 4). Such uniform estimates on conditioned random walks are delicate to establish, and we believe them to be of independent interest. As an application, we obtain limit theorems for the maximum and the sum of the first $k_n$ parking places. Namely we obtain the following corollary (whose proof is postponed to the last section).

Corollary 1. (i) If $k_n = {\rm{o}}(\sqrt{n})$ and $k_n \to \infty$ , then the convergence

\begin{align*}\sqrt{\dfrac{12}{k_n}}\biggl(\dfrac{\pi_n(1)+\cdots+\pi_n(k_n)}{n}-\dfrac{k_n}{2}\biggr) \longrightarrow \mathcal{N}(0,1)\end{align*}

holds in distribution, where $\mathcal{N}(0,1)$ is a standard normal distribution.

(ii) If $k_n = {\rm{o}}(n)$ and $k_n \to \infty$ , then the convergence

    \begin{align*}k_n \biggl( 1 - \dfrac{1}{n}\max \{ \pi_n(1),\ldots,\pi_n(k_n) \} \biggr) \longrightarrow \mathcal{E}(1)\end{align*}
    holds in distribution, where $\mathcal{E}(1)$ is an exponential distribution with mean 1.

Remark 2. The complete sum $\pi_n(1)+\cdots+\pi_n(n)$ has been studied, and converges, after renormalisation, towards a more complicated distribution involving zeros of the Airy function (see [Reference Diaconis and Hicks11, Theorem 14]).

When $k_n \sim cn$ we obtain the following limit theorem for the first $k_n$ parking places. The proof uses other techniques and Proposition 2 (or rather its proof).

Proposition 1. If $k_n \sim cn$ with $c \in (0,1]$ , then for all $a \in \mathbb N$ there exists an integer-valued random variable $S_a^*$ such that $0 \leq S_a^* \leq a$ almost surely and

\begin{align*}\mathbb P (n-\max\{\pi_n(1),\ldots,\pi_n(k_n)\} \geq a) \longrightarrow \mathbb E \bigl[(1-c)^{a - S_a^*}\bigr].\end{align*}

In Section 2 we use a bijection between parking functions and Cayley trees and use it to reformulate the law of $\pi_n$ in terms of conditioned random walks. Then in Section 3 we bound the moments of a conditioned random walk in order to control the probability mass function of $\pi_n$ and prove Theorem 1(i). In Section 4 we prove (ii) using arguments developed in the previous sections. The last section is devoted to the proof of Corollary 1 and Proposition 1.

In the following, C denotes a constant which may vary from line to line.

2. Bijection between parking functions and Cayley trees

Here the goal is to use the bijection found by Chassaing and Marckert [Reference Chassaing and Marckert8] between parking functions of size n and Cayley trees with $n+1$ vertices (i.e. acyclic connected graphs with $n+1$ vertices labelled from 0 to n). This bijection will allow us to express the joint distribution of the first positions of a uniform parking function in terms of random walks. To this end, we start with some notation and definitions. Let $\mathfrak{C}_{n+1}$ be the set of Cayley trees with $n+1$ vertices labelled from 0 to n, where the vertex labelled 0 is distinguished from the others (we call it the root of the tree). Also, let $P_n$ be the set of parking functions of size n. We consider the breadth-first search on a tree $t \in \mathfrak{C}_{n+1}$ by ordering the children of each vertex of t in increasing order of their labels (thus t is viewed as a plane tree) and then taking the regular breadth-first search associated with the plane order (see [Reference Chassaing and Marckert8] for a detailed definition and see Figure 1 for an example). For $t \in \mathfrak{C}_{n+1}$ and $1 \leq i \leq n$ , define r(i, t) to be the rank for the breadth-first search of the parent of the vertex labelled i in t. The bijection of Chassaing and Marckert is described in the following theorem.

Figure 1. Example of a Cayley tree t with 10 vertices. For every vertex, its label is represented in black on the left and its rank for the breadth-first search in red on the right. The parking function associated with this tree by (5) is (2, 2, 1, 1, 2, 1, 8, 4, 8).

Theorem 2. (Chassaing and Marckert.) The map

(5) \begin{equation}t \mapsto (r(1,t),\ldots,r(n,t))\end{equation}

is a bijection between $\mathfrak{C}_{n+1}$ and $P_n$ .
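To make the map (5) concrete, here is a small Python sketch (ours, not from the paper) that computes $(r(1,t),\ldots,r(n,t))$ from a rooted Cayley tree given as a parent table. The tiny tree used in the example is hypothetical (it is not the tree of Figure 1), and the convention that the root receives rank 1 is our reading of the breadth-first search described above.

```python
from collections import deque

def parking_function_from_tree(parent):
    """parent[v] = label of the parent of vertex v (vertices 1..n); the root is labelled 0.
    Returns (r(1,t), ..., r(n,t)), where r(i,t) is the breadth-first-search rank of the
    parent of vertex i, children being visited in increasing label order (root has rank 1)."""
    n = len(parent)
    children = {v: [] for v in range(n + 1)}
    for v, p in parent.items():
        children[p].append(v)
    rank, queue, next_rank = {}, deque([0]), 1
    while queue:
        v = queue.popleft()
        rank[v] = next_rank
        next_rank += 1
        queue.extend(sorted(children[v]))
    return tuple(rank[parent[i]] for i in range(1, n + 1))

# hypothetical tree with 4 vertices: 1 and 3 attached to the root 0, and 2 attached to 3
print(parking_function_from_tree({1: 0, 2: 3, 3: 0}))  # (1, 3, 1), a parking function of size 3
```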

Remark 3. Chassaing and Louchard [Reference Chassaing and Louchard7] described a similar bijection using what they call the standard order instead of breadth-first search.

Let $(X_i)_{i \geq 1}$ be i.i.d. random variables with Poisson distribution of parameter 1. For all $n \geq 0$ we set $S_n \,:\!=\, \sum_{i=1}^n (X_i -1)$ and, for all $a \in \mathbb{Z}$ , $\tau_{a} \,:\!=\, \min \{ n \geq 1\,:\, S_n=a \}$ , the first time that the random walk $(S_n)_n$ reaches a. Consider the probability measure $\mathbb{P}_n \,:\!=\, \mathbb{P} ({\cdot} |\tau_{-1}=n+1)$ and set $\mathbb{E}_n \,:\!=\, \mathbb{E} [\cdot|\tau_{-1}=n+1]$ . It is well known that a Bienaymé–Galton–Watson tree with a critical Poisson reproduction law conditioned on having n vertices has the same distribution, when we uniformly randomly label the vertices from 1 to n, as a uniform Cayley tree with n vertices (see e.g. [Reference Janson14, Example 10.2]). From this, Chassaing and Marckert deduce the following corollary.
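For readers who wish to experiment with the measure $\mathbb{P}_n$, the following sketch samples the conditioned walk by plain rejection. This is only practical for small n, since the acceptance probability decays like $n^{-3/2}$, and it is our illustration rather than a method used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_conditioned_walk(n, max_tries=10**6):
    """Sample (S_0, ..., S_{n+1}) under P_n, i.e. the Poisson(1) walk with steps X_i - 1
    conditioned on tau_{-1} = n + 1 (first visit to -1 exactly at time n + 1)."""
    for _ in range(max_tries):
        steps = rng.poisson(1.0, size=n + 1) - 1
        s = np.concatenate(([0], np.cumsum(steps)))
        hits = np.flatnonzero(s[1:] == -1) + 1   # times k >= 1 with S_k = -1
        if hits.size and hits[0] == n + 1:
            return s
    raise RuntimeError("no accepted sample; increase max_tries")

print(sample_conditioned_walk(8))  # a non-negative path of length n + 1 ending at -1 at time n + 1
```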

Corollary 2. (Chassaing and Marckert.) Let $T_{n+1}$ be a random Cayley tree in $\mathfrak{C}_{n+1}$ with uniform distribution. The random vector $( \# \{1\leq j \leq n\,:\, r(j,T_{n+1})=i\} )_{1 \leq i \leq n+1}$ has the same distribution as $(X_i)_{1 \leq i \leq n+1}$ under $\mathbb{P}_n$ .

We are now able to state and prove the main result of this section.

Proposition 2. Fix $1 \leq k \leq n$ and $1 \leq i_1, \ldots ,i_k \leq n$ . Let $j_1<\cdots<j_r$ be such that $\{i_1,\ldots,i_k\} = \{j_1,\ldots,j_r\}$ (so the vector $(j_1,\ldots,j_r)$ is the vector $(i_1,\ldots,i_k)$ where repeated elements have been reduced to a single one, and r is the number of distinct elements of $(i_1,\ldots,i_k)$ ). Define $m_s = \# \{u\,:\, i_u = j_s \}$ for all $1 \leq s \leq r$ . Then

(6) \begin{equation}\mathbb{P}(\pi_n(1)=i_1,\ldots,\pi_n(k)=i_k) = \dfrac{(n-k)!}{n!} \mathbb{E}_n \Biggl[ \prod_{s=1}^r (X_{j_s})_{m_s} \Biggr] ,\end{equation}

where $(x)_m \,:\!=\, x(x-1)\cdots(x-m+1)$ .

Proof. Let $T_{n+1}$ be a random Cayley tree in $\mathfrak{C}_{n+1}$ with uniform distribution. Let $\mathfrak{S}(k,n)$ denote the set of all injections between $[\![ 1,k ]\!]$ and $[\![ 1,n ]\!]$ . We have

\begin{align*}\mathbb{P}(\pi_n(1)=i_1,\ldots,\pi_n(k)=i_k) &= \dfrac{(n-k)!}{n!} \sum_{\sigma \in \mathfrak{S}(k,n)} \mathbb{P}(\pi_n(\sigma(1))=i_1,\ldots,\pi_n(\sigma(k))=i_k) \\ &= \dfrac{(n-k)!}{n!} \, \mathbb{E} \bigl[ \# \{ \sigma \in \mathfrak{S}(k,n) \,:\, r(\sigma(u),T_{n+1})=i_u \text{ for all } 1 \leq u \leq k \} \bigr] \\ &= \dfrac{(n-k)!}{n!} \, \mathbb{E}_n \Biggl[ \prod_{s=1}^r (X_{j_s})_{m_s} \Biggr].\end{align*}

The first equality comes from the fact that any permutation of a parking function is still a parking function, thus any permutation induces a bijection in $P_n$ . The second equality comes from Theorem 2 and the last from Corollary 2. This completes the proof.
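Equation (6) also lends itself to a brute-force numerical check for very small n (this sketch is ours, not part of the paper): the left-hand side is computed by enumerating all parking functions, and the right-hand side by enumerating all step sequences compatible with the event $\{\tau_{-1}=n+1\}$, weighted by Poisson(1) probabilities.

```python
from itertools import product
from math import factorial

def is_parking_function(pi):
    s = sorted(pi)
    return all(s[i] <= i + 1 for i in range(len(pi)))

def lhs(n, values):
    """Exact P(pi_n(1)=values[0], ..., pi_n(k)=values[k-1]) for a uniform parking function."""
    pfs = [p for p in product(range(1, n + 1), repeat=n) if is_parking_function(p)]
    hits = sum(p[:len(values)] == tuple(values) for p in pfs)
    return hits / len(pfs)

def rhs(n, values):
    """Right-hand side of (6): ((n-k)!/n!) E_n[prod_s (X_{j_s})_{m_s}]."""
    k = len(values)
    multiplicities = {j: values.count(j) for j in set(values)}
    num = den = 0.0
    # enumerate (x_1, ..., x_{n+1}): the walk must stay >= 0 up to time n and end at -1
    for x in product(range(n + 1), repeat=n + 1):
        if sum(x) != n or any(sum(x[:i + 1]) < i + 1 for i in range(n)):
            continue
        weight = 1.0
        for xi in x:
            weight /= factorial(xi)        # Poisson(1) weights; the constant e^{-(n+1)} cancels
        den += weight
        falling = 1.0
        for j, m in multiplicities.items():
            for t in range(m):
                falling *= x[j - 1] - t    # falling factorial (X_j)_m
        num += weight * falling
    return factorial(n - k) / factorial(n) * num / den

n = 3
for values in [(1,), (2,), (1, 1), (1, 3)]:
    print(values, lhs(n, values), rhs(n, values))  # the two columns agree
```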

3. Convergence for the total variation distance

In this section we suppose that $k_n = {\rm{o}}(\sqrt{n})$ . We will write k instead of $k_n$ to ease notation, but keep in mind that k depends on n. The goal of this section is to show item (i) of Theorem 1.

3.1. Probability that the parking places are distinct

The first step is to reduce the problem to distinct parking places; in this case equation (6) becomes easier. To this end we introduce the set of distinct indices

\begin{align*}D_n \,:\!=\, \{ (u_1,\ldots,u_k) \in [\![ 1,n ]\!]^k\,:\, i \neq j \Rightarrow u_i \neq u_j \}.\end{align*}

We also introduce the set

\begin{align*}G_n \,:\!=\, \{ (u_1,\ldots,u_k) \in [\![ 1,n ]\!]^k\,:\, \mathbb{P}(\pi_n(1)=u_1,\ldots,\pi_n(k)=u_k) \geq (n-k)!/n! \}\end{align*}

and the quantity

(7) \begin{equation}\delta(k,n) \,:\!=\, \sum_{\substack{(i_1,\ldots,i_k) \\ \in D_n \cap G_n}} \biggl[\mathbb{P}(\pi_n(1)=i_1,\ldots,\pi_n(k)=i_k) - \dfrac{(n-k)!}{n!}\biggr].\end{equation}

The next lemma shows that the first k parking places of a uniform parking function are all distinct with high probability. It also shows that if $\delta(k,n)$ is ${\rm{O}}(k/\sqrt{n})$ then so is $d_{\mathit{TV}}(k,n)$ . Recall that $(U_n(i))_{1 \leq i \leq n}$ are i.i.d. uniformly distributed in $[\![ 1,n ]\!]$ .

Lemma 1. We have

(i) $\mathbb{P}((U_n(1),\ldots,U_n(k)) \in D_n) = 1 + {\rm{O}}\biggl( \dfrac{k}{\sqrt{n}} \biggr)$ ,

(ii) $\mathbb{P}((\pi_n(1),\ldots,\pi_n(k)) \in D_n) = 1 + {\rm{O}}\biggl( \dfrac{k}{\sqrt{n}} \biggr)$ ,

(iii) $\delta(k,n) = {\rm{O}}\biggl( \dfrac{k}{\sqrt{n}} \biggr) \implies d_{\mathit{TV}}(k,n) = {\rm{O}}\biggl( \dfrac{k}{\sqrt{n}} \biggr)$ .

Proof. Let $\mu_n$ be the law of $(\pi_n(1),\ldots,\pi_n(k))$ and $\nu_n$ the law of $(U_n(1),\ldots,U_n(k))$ , with support on the same finite space $E_n \,:\!=\, [\![ 1,n ]\!]^k$ . First we check (i). By Markov’s inequality,

\begin{align*}\mathbb{P}((U_n(1),\ldots,U_n(k)) \notin D_n) \leq \mathbb{E} \bigl[ \# \{ 1 \leq i < j \leq k \,:\, U_n(i)=U_n(j) \} \bigr] = \dfrac1n \dfrac{k(k-1)}{2}.\end{align*}

Since $k={\rm{o}}(\sqrt{n})$ , we have

\begin{align*}\dfrac1n \dfrac{k(k-1)}{2} = {\rm{O}} \biggl( \dfrac{k^2}{n} \biggr) = {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}} \biggr).\end{align*}

Now we check (ii). To do so, we will use the Prüfer encoding (or a slight variant thereof) of a rooted Cayley tree $t \in \mathfrak{C}_{n}$ into a sequence $(p_1,\ldots,p_{n-2}) \in [\![ 0,n-1 ]\!]^{n-2}$ , which we now explain. For $a \in [\![ 1,n-1 ]\!]$ , define p(t, a) to be the label of the parent of the vertex labelled a in t. Also, define $\ell(t)$ as the biggest leaf label of t, and $t^*$ the tree t obtained after removing the leaf labelled $\ell(t)$ and its adjacent edge. Finally we define the sequence of trees $t_1 \,:\!=\, t$ , $t_i \,:\!=\, t_{i-1}^*$ for $2 \leq i \leq n-2$ . The Prüfer encoding of t is then defined as $p_i \,:\!=\, p(t_i,\ell(t_i))$ . For example, the Prüfer encoding of the tree in Figure 1 is (8, 8, 6, 0, 3, 0, 3, 3). The key property of this encoding is that it is a bijection between the sets $\mathfrak C_n$ and $[\![ 0,n-1 ]\!]^{n-2}$ . Now, let $T_{n+1}$ be a uniform Cayley tree in $\mathfrak{C}_{n+1}$ . Theorem 2 implies that $\mu_n(D_n)$ is equal to the probability that the vertices labelled 1 to k in $T_{n+1}$ have distinct parents. Let $(v_1,\ldots,v_k)$ be a random vector with uniform distribution in $D_n$ independent of $T_{n+1}$ . Since the distribution of $T_{n+1}$ is invariant under permutation of the labels, the previous probability is also equal to the probability that the vertices labelled $v_1,\ldots,v_k$ have distinct parents in $T_{n+1}$ . Let $(p_1,\ldots,p_{n-1})$ be the Prüfer encoding of the tree $T_{n+1}$ . We complete this vector with $p_n \,:\!=\, 0$ (this comes from the fact that $t_{n-2}$ has two vertices, one of them being the root labelled 0). Since $T_{n+1}$ is uniformly distributed in $\mathfrak{C}_{n+1}$ , the vector $(p_1,\ldots,p_{n-1})$ is uniformly distributed in $[\![ 0,n ]\!]^{n-1}$ . From the previous discussion and the definition of the Prüfer encoding, we deduce that

\begin{align*}\mu_n(D_n) = \mathbb{P} ( (p_{v_1},\ldots,p_{v_k})\in D_n).\end{align*}

Consider the event $Z_n \,:\!=\, \{ v_1,\ldots,v_k \neq n \}$ . Under this event, it is easy to see that $(p_{v_1},\ldots,p_{v_k})$ has the same law as k i.i.d. random variables uniformly distributed in $[\![ 0,n ]\!]$ . So from (i) we have

\begin{align*}\mu_n(D_n) \geq \mathbb{P} ( (p_{v_1},\ldots,p_{v_k})\in D_n \mid Z_n) \,\mathbb{P}(Z_n) = \biggl( 1 + {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}}\biggr) \biggr)\,\mathbb{P}(Z_n).\end{align*}

To conclude, notice that $\mathbb{P}(Z_n) = 1-k/n$ .

Finally we show (iii). Assume that $\delta(k,n) = {\rm{O}}( k/\sqrt{n} )$ . For all $i_1,\ldots,i_k$ , let $\Delta_{i_1,\ldots,i_k}$ denote the quantity $(\mathbb P (\pi_n(1)=i_1,\ldots,\pi_n(k)=i_k)-(n-k)!/n!)$ . Notice that $n^k(n-k)!/n!-1 = {\rm{O}}(k/\sqrt{n})$ , so

\begin{align*}2d_{\mathit{TV}}(k,n) = \sum_{i_1,\ldots,i_k=1}^n | \Delta_{i_1,\ldots,i_k} | + {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}} \biggr).\end{align*}

Let $d^+_n$ denote the sum of $\Delta_{i_1,\ldots,i_k}$ over the indices in $G_n$ and $d^-_n$ the opposite of the sum over the indices in $E_n \setminus G_n$ . We have

\begin{align*}d^+_n - d^-_n = \sum_{i_1,\ldots,i_k=1}^n \Delta_{i_1,\ldots,i_k} = 1 - \dfrac{n^k(n-k)!}{n!} = {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}} \biggr).\end{align*}

The last two equalities imply that

\begin{align*}2d_{\mathit{TV}}(k,n) = d^+_n + d^-_n + {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}} \biggr) = 2d^+_n + {\rm{O}} \biggl( \dfrac{k}{\sqrt{n}} \biggr).\end{align*}

In conclusion we just need to show that $d^+_n$ is ${\rm{O}}(k/\sqrt n)$ . Notice that

\begin{align*}d^+_n = \delta(k,n) + \mu_n(D_n^{\rm c} \cap G_n) - \nu_n(D_n^{\rm c} \cap G_n) \times \dfrac{n^k(n-k)!}{n!}.\end{align*}

From (i), (ii), and the assumption on $\delta(k,n)$ , we deduce that $d^+_n$ is indeed ${\rm{O}}(k/\sqrt{n})$ .

To prove (i) of Theorem 1 it remains to show that $\delta(k,n) = {\rm{O}}(k/\sqrt{n})$ . This is the goal of the next three sections.

3.2. A monotonicity argument

In this section we bound the terms $\mathbb{E}_n [ X_{i_1} \cdots X_{i_k} ]$ that appear in (6) when $i_1,\ldots,i_k$ are distinct with terms involving $\mathbb{E}_n [ S_{i_1} \cdots S_{i_k} ]$ , since the latter are more manageable. More precisely, the aim of this section is to prove the following result.

Proposition 3. Fix $i_1,\ldots,i_k \in [\![ 1,n ]\!]$ pairwise distinct. We have

(8) \begin{align} i_1 \cdots i_k \, \mathbb{E}_n [ X_{i_1} \cdots X_{i_k} ] \leq \mathbb{E}_n [ (S_{i_1}+i_1) \cdots (S_{i_k}+i_k) ].\end{align}

To prove Proposition 3 we first state a really useful lemma which, put in simple terms, says that the steps of the random walk S tend to decrease under $\mathbb{P}_n$ .

Lemma 2. Fix $n \geq k \geq 1$ and $m_1, \ldots, m_k \geq 1$ . Let $1 \leq i_1 < \cdots < i_k \leq n$ and $1\leq j_1 < \cdots < j_k \leq n$ be such that $j_r \leq i_r$ for all $1 \leq r \leq k$ . Let $f\,:\, \mathbb{N} \times (\mathbb{N}\setminus \{0\}) \to [0,\infty)$ be a non-negative function such that $f(0,m)=0$ for all m. Then

(9) \begin{equation}\mathbb{E}_n [ f(X_{i_1},m_1) \cdots f(X_{i_k},m_k) ] \leq \mathbb{E}_n [ f(X_{j_1},m_1) \cdots f(X_{j_k},m_k) ].\end{equation}

Proof of Lemma 2.To ease notation, we define the random vector $\textbf{X} \,:\!=\, (X_1,\ldots,X_{n+1})$ and write $\textbf{x}$ as a shortcut to designate an element $(x_1,\ldots,x_{n+1})$ of $\mathbb N^{n+1}$ . Let $s \,:\!=\, \min \{ r \geq 1\,:\, j_r < i_r \}$ . We only need to treat the case where $i_r = j_r$ for all $r \neq s$ and $j \,:\!=\, j_s = i_s-1$ (the general result can then be obtained by induction). Let $\sigma = (j\,j+1) \in \mathfrak{S}_{n+1}$ be the permutation that transposes j and $j+1$ . Let

and let

\begin{align*}\mathcal{E}^{\prime}_n \,:\!=\, \{ \textbf x \in \mathcal{E}_n \,:\, x_{j+1} > 0 \text{ or } (x_1-1)+\cdots+(x_{j-1}-1)>0 \}.\end{align*}

Note that $(x_1,\ldots,x_{n+1}) \mapsto (x_{\sigma(1)},\ldots,x_{\sigma(n+1)})$ is a bijection on $\mathcal{E}^{\prime}_n$ . Then

If $\textbf x \in \mathcal{E}_n \setminus \mathcal{E}^{\prime}_n$ , then $f(x_{j+1},m_{j+1}) = f(0,m_{j+1}) = 0$ ; in particular, since f is non-negative,

\begin{align*}f(x_{i_1},m_1) \cdots f(x_{i_k},m_k) \leq f(x_{\sigma(i_1)},m_1) \cdots f(x_{\sigma(i_k)},m_k).\end{align*}

Finally

Remark 4. In Lemma 2 we can for instance take $f(x,m) = x^m$ or $f(x,m) = (x)_m$ . Note that in Lemma 2 the indices $(i_r)_r$ must be pairwise distinct as well as the indices $(j_r)_r$ . In the proof of Proposition 3, we extend the result for $f(x,m)=x^m$ to the case where only the $(i_r)_r$ are pairwise distinct.

Proof of Proposition 3.First we show the following inequality. Fix $n \geq k \geq 1$ . Let $1 < i_1 < \cdots < i_k \leq n$ , $1 \leq j_1 \leq \cdots \leq j_k \leq n$ be such that $j_r \leq i_r$ for all $1 \leq r \leq k$ . Then

(10) \begin{equation}\mathbb{E}_n [ X_{i_1} \cdots X_{i_k} ] \leq \mathbb{E}_n [ X_{j_1} \cdots X_{j_k} ].\end{equation}

To show (10) it is actually enough to show the following result. Let $J \subset [\![ 1,n ]\!]$ and $2 \leq i \leq n$ such that i and $i-1$ do not belong to J. Let $m_j \geq 1$ for $j \in J$ and $m \geq 1$ . Then

(11) \begin{equation}\mathbb{E}_n \Biggl[ X_{i-1}^mX_{i} \prod_{j \in J} X_j^{m_j} \Biggr] \leq \mathbb{E}_n \Biggl[ X_{i-1}^{m+1} \prod_{j \in J} X_j^{m_j} \Biggr].\end{equation}

Inequality (10) can then be obtained by induction using Lemma 2 and (11). By Young’s inequality,

\begin{align*}X_{i-1}^mX_{i} \leq \dfrac{m}{m+1}X_{i-1}^{m+1} + \dfrac{1}{m+1}X_i^{m+1}.\end{align*}

Combining this with Lemma 2 gives (11) and concludes the proof of (10). Now, using inequality (10), we obtain

\begin{align*} i_1 \cdots i_k \, \mathbb{E}_n [ X_{i_1} \cdots X_{i_k} ] \leq \sum_{l_1=1}^{i_1}\cdots\sum_{l_k=1}^{i_k} \mathbb{E}_n [ X_{l_1} \cdots X_{l_k} ] = \mathbb{E}_n [ (X_1+\cdots+X_{i_1}) \cdots (X_1+\cdots+X_{i_k}) ] = \mathbb{E}_n [ (S_{i_1}+i_1) \cdots (S_{i_k}+i_k) ],\end{align*}

which concludes the proof of Proposition 3.

3.3. Bounding the moments of a random walk conditioned to be an excursion

The goal of this section is to find bounds for the moments of the random walk S conditioned to be an excursion. More precisely, the aim of this section is to show the following result.

Proposition 4. There exists a constant $C > 0$ such that for all $n\geq2$ , $1 \leq k \leq n-1$ and $d \geq 1$ ,

(i) we have

    (12) \begin{equation} \mathbb{E}_n \bigl[ S_k^d \bigr] \leq ( C d n )^{d/2}, \end{equation}
(ii) and

    (13) \begin{equation} \mathbb{E}_n \bigl[ S_{n-k}^d \bigr] \leq \biggl( \dfrac{n}{n-k} \biggr)^{3/2}(Cdk)^{d/2}, \end{equation}
(iii) as well as

    (14) \begin{equation} \mathbb{E}_n \bigl[ S_k^d \bigr] \leq \biggl( \dfrac{n}{n-k} \biggr)^{3/2} \bigl( C d \sqrt{k} \bigr)^d. \end{equation}

Remark 5. Items (i) and (ii) of Proposition 4 actually hold true for any random walk $(S_n)_{n \geq 0}$ starting from 0 with i.i.d. increments all having the distribution $\mu$ whose support is $\mathbb{N}\cup \{-1\}$ and such that $\mu$ has mean 0 and finite variance. However, (iii) uses in addition the fact that a Poisson random walk has a sub-exponential tail (see e.g. [Reference Wellner23, Example 3]), namely, for all $k \geq 1$ and $x \geq 0$ ,

(15) \begin{equation}\mathbb{P}(S_k \geq x) \leq \exp \biggl( -\dfrac{x^2}{2(k+x)} \biggr).\end{equation}
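A quick Monte Carlo check of the bound (15) (our illustration; the parameter values and sample size below are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
k, x = 50, 20
walks = rng.poisson(1.0, size=(200_000, k)) - 1   # i.i.d. copies of the steps X_1 - 1, ..., X_k - 1
empirical_tail = (walks.sum(axis=1) >= x).mean()  # Monte Carlo estimate of P(S_k >= x)
bound = np.exp(-x**2 / (2 * (k + x)))             # right-hand side of (15)
print(empirical_tail, bound)                      # the estimate should sit below the bound
```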

To prove Proposition 4 we will use the following lemma, whose proof is postponed to the end of this section. This lemma is similar to the cyclic lemma in spirit, but instead of conditioning the walk to be an excursion we only condition it to stay positive.

Lemma 3. Let $n \geq 1$ and $F\,:\, \mathbb{R}^n \rightarrow [0,+\infty)$ be invariant under cyclic shifts. Then

(16) \begin{equation} \mathbb{E} \bigl[ F(X_1,\ldots,X_n) \, \mathbf{1}_{\{S_1>0,\ldots,S_n>0\}} \bigr] \leq \dfrac{1}{n} \, \mathbb{E} \bigl[ F(X_1,\ldots,X_n) \max(S_n,0) \bigr]. \end{equation}

Proof of Proposition 4.We recall that C denotes a constant which may vary from line to line. For (i), according to [Reference Addario-Berry, Devroye and Janson2, equation (32)], the maximum of the excursion of S has a sub-Gaussian tail, namely there exist constants $C, \alpha > 0$ such that, for all $n \geq 1$ and $x \geq 0$ ,

\begin{align*} \mathbb{P}_n ( M_n \geq x ) \leq C{\rm{e}}^{-\alpha x^2/n}, \end{align*}

where $M_n \,:\!=\, \max \{ S_0,\ldots,S_n \}$ is the maximum of the walk S on [0, n]. So we have

\begin{align*} \mathbb{E}_n \bigl[ n^{-d/2} S_k^d \bigr] &\leq \mathbb{E}_n \bigl[ n^{-d/2} M_n^d \bigr]\\ & = \int_0^\infty d \, x^{d-1}\,\mathbb{P}_n \bigl( M_n \geq \sqrt{n}x \bigr) \,{\rm{d}} x \\ & \leq \int_0^\infty C d \, x^{d-1} \,{\rm{e}}^{-\alpha x^2}\,{\rm{d}} x \\ & \leq C^d d^{d/2}. \end{align*}

This shows (i).

The following computation is a common starting point to show both (ii) and (iii). Let $H(S_k,x)$ be either the indicator function $\mathbf{1}_{\{S_k = x\}}$ or $\mathbf{1}_{\{S_k \geq x\}}$ , with $x>0$. Using the fact that $\mathbb{P} ( \tau_{-1}=n+1)$ is equivalent to a constant times $n^{-3/2}$ (see e.g. [Reference Le Gall17, equation (10)]), and then using the Markov property and finally the cyclic lemma (see e.g. [Reference Pitman18, Section 6.1]), we get

where $S^{\prime}$ is independent of S and has the same distribution. Now we use Janson’s inequality [Reference Janson13, Lemma 2.1], which states that $\mathbb{P} (S_r = -m) \leq Cr^{-1/2}\,{\rm{e}}^{-\lambda m^2/r}$ for all $r \geq 1$ and $m \geq 0$ , to get

(17)

To prove (ii) we first define $T_k \,:\!=\, \max \{ 0 \leq i \leq k\,:\, S_i = 0 \}$ . Let $p_k(x)$ denote the probability of the event $\{ S_k = x, \, S_1,\ldots,S_k \geq 0 \}$ . Using the Markov property, we obtain

\begin{align*}p_k(x) &= \sum_{i=0}^{k-1} \mathbb{P} ( S_k = x, \, S_1,\ldots,S_k \geq 0, \, T_k=i ) \\ &= \sum_{i=0}^{k-1} \mathbb{P} (S_1,\ldots,S_{i} \geq 0, \, S_i = 0 ) \,\mathbb{P} (S_{k-i} = x, \, S_1,\ldots,S_{k-i} > 0 ).\end{align*}

We apply Lemma 3 and the local limit theorem (see e.g. [Reference Ibragimov and Linnik12, Theorem 4.2.1]), which gives a constant $C>0$ such that $\mathbb{P}(S_{k-i}=x) \leq C(k-i)^{-1/2}$ , for every $k>i$ and $x \in \mathbb Z$ , so that

\begin{align*}p_k(x) &\leq \sum_{i=0}^{k-1} \mathbb{P} (S_1,\ldots,S_{i} \geq 0, \, S_i = 0 ) \dfrac{x}{k-i} \mathbb{P} (S_{k-i}=x) \\ &\leq C \, x\sum_{i=0}^{k-1} \mathbb{P} (S_1,\ldots,S_{i} \geq 0, \, S_i = 0 ) \dfrac{1}{(k-i)^{3/2}}.\end{align*}

Notice that

\begin{align*}\mathbb{P} (S_1,\ldots,S_{i} \geq 0, \, S_i = 0 ) = e \, \mathbb{P}(\tau_{-1}=i+1) \leq \dfrac{C}{(i+1)^{3/2}}.\end{align*}

So, finally we have

\begin{align*}p_k(x) \leq C \, x\sum_{i=0}^{k-1} \dfrac{1}{((i+1)(k-i))^{3/2}} = \dfrac{C\,x}{(k+1)^{3/2}}\sum_{i=1}^{k-1} \biggl( \dfrac{1}{i+1} + \dfrac{1}{k-i} \biggr)^{3/2} \leq \dfrac{C\,x}{k^{3/2}}.\end{align*}

Putting the previous inequality in (17) with $H(S_k,x) = \mathbf{1}_{\{S_k = x\}}$ , and replacing k with $n-k$ , gives

\begin{align*}\mathbb{P}_n(S_{n-k}=x) \leq C \biggl[ \dfrac{n}{k(n-k)} \biggr]^{3/2} x^2 \,{\rm{e}}^{-\lambda {{x^2}/{k}}}.\end{align*}

We can now bound the dth moment of $S_{n-k}$ :

\begin{align*} \mathbb{E}_n\bigl[S_{n-k}^d\bigr] &\leq C \biggl[ \dfrac{n}{k(n-k)} \biggr]^{3/2} \int_0^\infty x^{d+2}\,{\rm{e}}^{-\lambda {{x^2}/{k}}}\,{\rm{d}} x \\ &\leq C^d k^{d/2} \biggl[ \dfrac{n}{n-k} \biggr]^{3/2} \int_0^\infty x^{d+2}\,{\rm{e}}^{-x^2}\,{\rm{d}} x \\ &\leq C^d k^{d/2} \biggl[ \dfrac{n}{n-k} \biggr]^{3/2} d^{d/2}.\end{align*}

This concludes the proof of (ii).

To prove (iii) we follow the same principle. Using the Markov property and Lemma 3, we have

Then we apply the Cauchy–Schwarz inequality:

The last inequality comes from an explicit computation of the fourth central moment of a Poisson distribution. We combine the last inequality with (15) to get

Putting everything together, we obtain

We combine the last inequality with (17) to get

\begin{align*} \mathbb{E}_n\bigl[k^{-d/2}S_k^d\bigr] &= \int_0^\infty d \, x^{d-1} \,\mathbb{P}_n\bigl(S_k \geq \sqrt{k}x\bigr)\,{\rm{d}} x\\ & \leq C \biggl[ \dfrac{n}{n-k} \biggr]^{3/2} \int_0^\infty d \, x^{d-1} \,\exp\biggl({-}\frac{kx^2}{4(k+x\sqrt k)}\biggr)\,{\rm{d}} x.\end{align*}

We cut the last integral into two parts and obtain

\begin{align*}\int_0^\infty d \, x^{d-1} \,\exp\biggl({-}\frac{kx^2}{4(k+x\sqrt k)}\biggr)\,{\rm{d}} x \leq \int_0^{\sqrt k} d \, x^{d-1} \,{\rm{e}}^{-{{x^2}/{8}}}\,{\rm{d}} x + \int_{\sqrt k}^\infty d \, x^{d-1} \,{\rm{e}}^{-{{x}/{8}}}\,{\rm{d}} x.\end{align*}

Since the last two integrals are both smaller than $C^dd^d$ , this concludes the proof of (iii).

We now prove Lemma 3.

Proof of Lemma 3.For $x = (x_1,\ldots,x_n) \in \mathbb{R}^n$ and $i \in [\![ 0,n-1 ]\!]$ , let $x^i$ denote the ith cyclic permutation of x, namely $x^i \,:\!=\, (x_{1+i},\ldots,x_{n},x_1,\ldots,x_i)$ . Consider the set

\begin{align*} A_n \,:\!=\, \Biggl\{ (x_1,\ldots,x_n) \in \mathbb{N}^n\,:\, \text{for all } k \in [\![ 1,n ]\!], \sum_{i=1}^k (x_i-1) > 0 \Biggr\} \end{align*}

and write $X \,:\!=\, (X_1,\ldots,X_n)$ . Then

(18) \begin{equation} n \, \mathbb{E} \bigl[ F(X) \, \mathbf{1}_{\{S_1>0,\ldots,S_n>0\}} \bigr] = \sum_{i=0}^{n-1} \mathbb{E} \bigl[ F(X^i) \, \mathbf{1}_{\{X^i \in A_n\}} \bigr] = \mathbb{E} \Biggl[ F(X) \sum_{i=0}^{n-1} \mathbf{1}_{\{X^i \in A_n\}} \Biggr] \leq \mathbb{E} \bigl[ F(X) \max(S_n,0) \bigr]. \end{equation}

The first equality holds because $X^i$ has the same distribution as X and $\{X \in A_n\} = \{S_1>0,\ldots,S_n>0\}$ , the second because F is invariant under cyclic shifts, and the inequality comes from the fact that the number of cyclic shifts of X such that $X^i \in A_n$ is almost surely bounded by $\max(S_n,0)$ .

3.4. Proof of Theorem 1(i)

Recall we want to show that $d_{\mathit{TV}}(k,n) = {\rm{O}}(k/\sqrt{n})$ . In Section 3.1 we have shown that it is enough to show $\delta(k,n) = {\rm{O}}(k/\sqrt{n})$ , where the definition of $\delta(k,n)$ is given by (7). Thanks to Proposition 2, this quantity can be rewritten as

\begin{align*}\delta(k,n) = \dfrac{n^k(n-k)!}{n!} \int_{\Lambda_k} [ \mathbb{E}_n [ X_{\lceil nt_1 \rceil} \cdots X_{\lceil nt_k \rceil} ] -1 ] \,{\rm{d}} t_1 \cdots {\rm{d}} t_k,\end{align*}

where

\begin{align*}\Lambda_k \,:\!=\, \{(t_1,\ldots,t_k) \in (0,1]^k\,:\, (\lceil nt_1 \rceil,\ldots,\lceil nt_k \rceil) \in D_n \cap G_n\}.\end{align*}

As was already mentioned in Section 3.1, $n^k(n-k)!/n! = 1 + {\rm{O}}(k/\sqrt n)$ , so for our purpose it is sufficient to bound the integral

\begin{align*}I(k,n) \,:\!=\, \int_{\Lambda_k} [ \mathbb{E}_n [ X_{\lceil nt_1 \rceil} \cdots X_{\lceil nt_k \rceil} ] -1 ] \,{\rm{d}} t_1 \cdots {\rm{d}} t_k.\end{align*}

Using inequality (8), we obtain

(19) \begin{align} I(k,n) \leq & \int_{[0,1]^k} \biggl[ \mathbb{E}_n \biggl[ \biggl( \dfrac{S_{\lceil nt_1 \rceil}}{\lceil nt_1 \rceil} + 1 \biggr) \cdots \biggl( \dfrac{S_{\lceil nt_k \rceil}}{\lceil nt_k \rceil} + 1 \biggr) \biggr] -1 \biggr] \,{\rm{d}} t_1 \cdots {\rm{d}} t_k.\end{align}

Notice that (i) and (iii) of Proposition 4 imply that for all $0 \leq k \leq n$

(20) \begin{equation}\mathbb{E}_n \bigl[ S_k^d \bigr] \leq \bigl( C d \sqrt{k} \bigr)^d.\end{equation}

Indeed, Proposition 4(i) covers the case $k \geq n/2$ of equation (20) and (iii) covers the case $k < n/2$ (notice that the cases $k=0$ and $k=n$ are trivial, since, conditionally on $\tau_{-1}=n+1$ , $S_0 = S_n = 0$ ). Hölder’s inequality shows that for $0 \leq i_1,\ldots,i_d \leq n$

(21) \begin{equation}\mathbb{E}_n [S_{i_1} \cdots S_{i_d} ] \leq (Cd)^d \sqrt{i_1 \cdots i_d}.\end{equation}

Expanding the products in (19) and using (21) gives

\begin{align*}I(k,n) \leq \sum_{d=1}^k \binom{k}{d} (Cd)^d n^{-d/2} \int_{[0,1]^d} \dfrac{{\rm{d}} t_1 \cdots {\rm{d}} t_d}{(t_1 \cdots t_d)^{1/2}} .\end{align*}

Notice that $t \mapsto t^{-1/2}$ is integrable, so

\begin{align*}I(k,n) \leq \sum_{d=1}^k \binom{k}{d} (Cd)^d n^{-d/2}.\end{align*}

Using the bound $\binom{k}{d} \leq (ke/d)^d$ ,

\begin{align*}I(k,n) \leq \sum_{d=1}^k \biggl(Ce\dfrac{k}{\sqrt{n}}\biggr)^d.\end{align*}

Since $k = {\rm{o}}(\sqrt{n})$ , we conclude that $I(k,n) = {\rm{O}}(k/\sqrt{n})$ .

4. Convergence for the Kolmogorov distance

In this section we suppose that $k_n = {\rm{o}}(n)$ and $\sqrt{n}={\rm{o}}(k_n)$ . We will write k instead of $k_n$ to ease notation, but keep in mind that k depends on n. The goal of this section is to show Theorem 1(ii). The following lemma allows us to replace the cumulative probability in (2) with the term $\mathbb{E}_n[(S_{i_1}+i_1)\cdots(S_{i_k}+i_k)]$ , which is more manageable.

Lemma 4. There is a constant $C>0$ such that

(22) \begin{equation} d_K(k,n) \leq \dfrac{1}{n^k} \max_{1 \leq i_1, \ldots, i_k \leq n} | \mathbb{E}_n [ (S_{i_1}+i_1)\cdots(S_{i_k}+i_k) ] - i_1\cdots i_k | + \dfrac{Ck}{n}.\end{equation}

Before proving this lemma we show how it implies Theorem 1(ii). We will also need the following simple lemma, which extends [Reference Billingsley4, equation (27.5)].

Lemma 5. Let $r \geq 1$ , $w_1,\ldots,w_r$ , and $z_1,\ldots,z_r$ be complex numbers of modulus smaller than or equal to $a>0$ and $b>0$ , respectively. Then

(23) \begin{equation} \Biggl| \prod_{i=1}^r w_i - \prod_{i=1}^r z_i \Biggr| \leq \sum_{i=1}^r |w_i - z_i| a^{r-i} b^{i-1}.\end{equation}

Proof of Lemma 5.The result readily follows from the identity

\begin{equation*} \prod_{i=1}^r w_i - \prod_{i=1}^r z_i = (w_1-z_1)\prod_{i=2}^r w_i + z_1\Biggl(\prod_{i=2}^r w_i - \prod_{i=2}^r z_i\Biggr). \end{equation*}

Proof of Theorem 1(ii).Let $1/2 < \alpha < 1$ . We define a sequence of intervals $I_1,\ldots,I_{M+3}$ (depending on n) in the following way:

\begin{align*} &I_1 \,:\!=\, [1,n-k),\quad I_2 \,:\!=\, [n-k,n-k^\alpha),\quad I_3 \,:\!=\, \bigl[n-k^\alpha,n-k^{\alpha^2}\bigr),\quad \ldots\\ &\qquad I_{M+1} \,:\!=\, \bigl[n-k^{\alpha^{M-1}},n-k^{\alpha^M}\bigr),\quad I_{M+2} \,:\!=\, \bigl[n-k^{\alpha^M},n-n/k\bigr),\quad I_{M+3}\,:\!=\, [n-n/k,n] ,\end{align*}

where M is the biggest integer such that $n-k^{\alpha^M} \leq n-n/k$ . Let $1 \leq i_1,\ldots,i_k \leq n$ . Using Lemma 5 (with $a=b=1$ and noticing that $(S_i+i)/n$ under $\mathbb{P}_n$ is almost surely smaller than 1 for every i), we decompose the quantity $\mathbb{E}_n [ (S_{i_1}+i_1)\cdots(S_{i_k}+i_k) - i_1\cdots i_k]$ depending on those intervals to which the $i_j$ belong, so that

(24) \begin{equation}\mathbb{E}_n \biggl[ \biggl(\dfrac{S_{i_1}+i_1}{n}\biggr)\cdots \biggl(\dfrac{S_{i_k}+i_k}{n}\biggr) - \dfrac{i_1}{n}\cdots\dfrac{i_k }{n} \biggr] \leq \sum_{m=1}^{M+3} \mathbb{E}_n \Biggl[\prod_{i_j \in I_m} \dfrac{S_{i_j}+i_j}{n} - \prod_{i_j \in I_m} \dfrac{i_j}{n} \Biggr].\end{equation}

Fix $m \in \{1,\ldots,M+3\}$ and let $\iota_1,\ldots,\iota_{r_m}$ denote the $i_j$ that belong to $I_m$ . If $m=1$ , then by Lemma 5 and Proposition 4(i),

\begin{align*}\mathbb{E}_n \Biggl[\prod_{i_j \in I_1} \dfrac{S_{i_j}+i_j}{n} - \prod_{i_j \in I_1} \dfrac{i_j}{n} \Biggr] & \leq \dfrac{1}{n}\sum_{j=1}^{r_1} \mathbb{E}_n [S_{\iota_j}] \biggl(\dfrac{n-k}{n}\biggr)^{j-1} \\ & \leq \dfrac{C\sqrt{n}}{n} \dfrac{n}{k} \\ &= \dfrac{C\sqrt{n}}{k}.\end{align*}

If $2 \leq m \leq M+2$ , we follow the same principle but we use Proposition 4(ii) instead:

\begin{align*}\mathbb{E}_n \Biggl[\prod_{i_j \in I_m} \dfrac{S_{i_j}+i_j}{n} - \prod_{i_j \in I_m} \dfrac{i_j}{n} \Biggr] & \leq \dfrac{1}{n}\sum_{j=1}^{r_m} \mathbb{E}_n [S_{\iota_j}] \biggl(\dfrac{n-k^{\alpha^{m-1}}}{n}\biggr)^{j-1} \\ & \leq \dfrac{Ck^{\alpha^{m-2}/2}}{n} \dfrac{n}{k^{\alpha^{m-1}}}\\ &= \dfrac{C}{k^{(\alpha-1/2)\alpha^{m-2}}}.\end{align*}

The previous computation works in the case $m=M+2$ because the maximality of M implies that $n-n/k \leq n-k^{\alpha^{M+1}}$ . Finally, if $m=M+3$ ,

\begin{align*}\mathbb{E}_n \Biggl[\prod_{i_j \in I_{M+3}} \dfrac{S_{i_j}+i_j}{n} - \prod_{i_j \in I_{M+3}} \dfrac{i_j}{n} \Biggr] & \leq \dfrac{1}{n}\sum_{j=1}^{r_{M+3}} \mathbb{E}_n [S_{\iota_j}] \\ & \leq r_{M+3}\dfrac{C}{n} \biggl(\dfrac{n}{k}\biggr)^{1/2} \\ &\leq C \biggl( \dfrac kn \biggr)^{1/2}.\end{align*}

Notice that $k^{\alpha^M} \geq n/k$ , so for all $0 \leq m \leq M$ , $k^{\alpha^{M-m}} \geq (n/k)^{\alpha^{-m}}$ . Summing the preceding bounds for $2 \leq m \leq M+2$ gives

\begin{align*} \sum_{m=2}^{M+2} \mathbb{E}_n \Biggl[\prod_{i_j \in I_m} \dfrac{S_{i_j}+i_j}{n} - \prod_{i_j \in I_m} \dfrac{i_j}{n} \Biggr] &\leq C\sum_{m=0}^{M} \dfrac{1}{k^{(\alpha-1/2)\alpha^{m}}} \\ &= C\sum_{m=0}^{M} \dfrac{1}{k^{(\alpha-1/2)\alpha^{M-m}}} \\ &\leq C\sum_{m=0}^{M} \biggl( \dfrac{k}{n} \biggr)^{(\alpha-1/2)\alpha^{-m}} \\ &\leq C\biggl( \dfrac{k}{n} \biggr)^{(\alpha-1/2)} +C\sum_{m=1}^{M} \biggl( \dfrac{k}{n} \biggr)^{-(\alpha-1/2)e\ln(\alpha)m} \\ &\leq C\biggl( \dfrac{k}{n} \biggr)^{(\alpha-1/2)} + C\biggl( \dfrac{k}{n} \biggr)^{-(\alpha-1/2)e\ln(\alpha)},\end{align*}

where we used the fact that $-e\ln(\alpha)m \leq 1/\alpha^m$ for all $m \geq 1$ . If we take $\alpha$ such that $\alpha -1/2 = \exp({-}{\rm{e}}^{-1}) -1/2 \simeq 0.1922$ , then the last quantity is ${\rm{O}}((k/n)^{0.19})$ . Putting everything together, we finally obtain

\begin{align*}\dfrac{1}{n^k} \max_{1 \leq i_1, \ldots, i_k \leq n} \Biggl| \mathbb{E}_n \Biggl[ \prod_{r=1}^k (S_{i_r}+i_r) \Biggr] - \prod_{r=1}^k i_r \Biggr| \leq C \biggl[\dfrac{\sqrt{n}}{k} + \biggl( \dfrac{k}{n} \biggr)^{0.19}+\biggl( \dfrac{k}{n} \biggr)^{1/2}\biggr].\end{align*}

Combining the last display with Lemma 4 gives the desired result.

Now we prove Lemma 4.

Proof of Lemma 4. Define the empirical distribution function of $\pi_n$ , namely, for all $i \in \{1,\ldots,n\}$ ,

\begin{align*}F_n(i) \,:\!=\, \dfrac{1}{n} \# \{ 1 \leq j \leq n \,:\, \pi_n(j) \leq i \}.\end{align*}

As suggested in [Reference Diaconis and Hicks11], we can use a result of Bobkov [Reference Bobkov5, Theorem 1.1]. The sequence $(\pi_n(1),\ldots,\pi_n(n))$ is an exchangeable extension of $(\pi_n(1),\ldots,\pi_n(k))$ , meaning that the distribution of $(\pi_n(1),\ldots,\pi_n(n))$ stays the same after any permutation of its coordinates. So by Theorem 1.1 of [Reference Bobkov5] we have

\begin{align*}\max_{1 \leq i_1, \ldots, i_k \leq n}| \mathbb{P}(\pi_n(1)\leq i_1,\ldots,\pi_n(k)\leq i_k) - \mathbb{E}[ F_n(i_1)\cdots F_n(i_k) ] | \leq C\dfrac{k}{n},\end{align*}

where C is a universal constant. Using Proposition 2 and Corollary 2, we find

\begin{align*}\mathbb{E}[ F_n(i_1)\cdots F_n(i_k) ] = \dfrac{1}{n^k} \mathbb{E}_n [ (S_{i_1}+i_1)\cdots(S_{i_k}+i_k) ].\end{align*}

Indeed, Proposition 2 implies that $(F_n(1),\ldots,F_n(n))$ and $(G_n(1),\ldots,G_n(n))$ have the same distribution, with

\begin{align*}G_n(i) \,:\!=\, \dfrac{1}{n} \# \{ 1 \leq j \leq n \,:\, r(j,T_{n+1}) \leq i \},\end{align*}

where $T_{n+1}$ is a uniform random tree of $\mathfrak{C}_{n+1}$ . Then Corollary 2 shows that

\begin{align*}G_n(i) = \dfrac{1}{n} \# \{ 1 \leq j \leq n\,:\, r(j,T_{n+1}) \leq i \} \stackrel{(d)}{=} \dfrac{1}{n}(S_i+i)\end{align*}

jointly for $i \in \{1,\ldots,n\}$ . This concludes the proof.

5. Sum and maximum of the first parking places

We begin this section with the proof of Corollary 1. Then we finish by proving Proposition 1.

Proof of Corollary 1.(i) Recall that $(U_n(i))_{1 \leq i \leq n}$ are i.i.d. uniformly distributed in $[\![ 1,n ]\!]$ . By the central limit theorem, the convergence

\begin{align*}\sqrt{\dfrac{12}{k_n}}\biggl(\dfrac{U_n(1)+\cdots+U_n(k_n)}{n} - \dfrac{k_n}{2}\biggr) \longrightarrow \mathcal{N}(0,1)\end{align*}

holds in distribution. Using the first item of Theorem 1, we deduce that the total variation distance between the distributions of $\sum_{i=1}^{k_n} U_n(i)$ and $\sum_{i=1}^{k_n} \pi_n(i)$ tends to 0. Thus the above convergence still holds when $U_n(i)$ is replaced by $\pi_n(i)$ .

(ii) Let $x>0$ . Then

\begin{align*}\mathbb{P} \biggl[ k_n\biggl( 1-\dfrac1n \max \{ U_n(1),\ldots,U_n(k_n) \} \biggr) \geq x \biggr] = \biggl( 0 \vee \dfrac1n \biggl\lfloor n\biggl(1-\dfrac{x}{k_n}\biggr) \biggr\rfloor \biggr)^{k_n} \xrightarrow[n \to \infty]{} \,{\rm{e}}^{-x}.\end{align*}

Using Theorem 1(ii), we deduce that the above convergence still holds when $U_n(i)$ is replaced by $\pi_n(i)$ .
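Corollary 1(i) is easy to illustrate numerically. The sketch below (ours, not from the paper) samples uniform parking functions by rejection, which is feasible for moderate n since the acceptance probability is $(n+1)^{n-1}/n^n \approx {\rm e}/(n+1)$, and then standardises the sum of the first k places as in the statement; the sample mean and standard deviation should be close to 0 and 1.

```python
import numpy as np

rng = np.random.default_rng(2)

def sample_parking_function(n):
    """Uniform parking function of size n by rejection sampling."""
    while True:
        pi = rng.integers(1, n + 1, size=n)
        if np.all(np.sort(pi) <= np.arange(1, n + 1)):
            return pi

n, k, reps = 200, 12, 1000
stats = [np.sqrt(12 / k) * (sample_parking_function(n)[:k].sum() / n - k / 2)
         for _ in range(reps)]
print(np.mean(stats), np.std(stats))  # roughly 0 and 1 if the normal limit of Corollary 1(i) holds
```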

Proof of Proposition 1.In this proof we write k instead of $k_n$ to ease notation. For every $a \geq 0$ :

(25) \begin{equation}\mathbb P (\pi_n(1),\ldots,\pi_n(k) \leq n-a) = \dfrac{(n-k)!}{n!}\mathbb E_n [ (S_{n-a}+n-a)_k ].\end{equation}

Indeed, following the same computation as in the proof of Proposition 2, we have

\begin{align*}\mathbb P (\pi_n(1),\ldots,\pi_n(k) \leq n-a) = \dfrac{(n-k)!}{n!}\mathbb E_n [ (X_1+\cdots+X_{n-a})_k ],\end{align*}
which leads to (25) since $X_1+\cdots+X_{n-a} = S_{n-a}+n-a$. Let $\tau_n$ be a Bienaymé–Galton–Watson tree with a critical Poisson offspring distribution $\mu$ conditioned on having n vertices, and define $S^n$ to be the associated Łukasiewicz path. More precisely, if $v_1,\ldots,v_n$ are the vertices of $\tau_n$ ordered according to the lexicographic order (see e.g. [Reference Le Gall17, Section 1.1]), then for all $0 \leq k \leq n$,

\begin{align*}S^n_k \,:\!=\, \# \{ e \,:\, e \text{ is an edge adjacent to a vertex $ v_i $ with $ i \leq k$} \} - k.\end{align*}

In the previous definition $S^n_k$ is deduced from the first k vertices $v_1,\ldots,v_k$ , but it is possible to see $S^n_k$ in terms of the last $n-k$ vertices $v_{k+1},\ldots,v_n$ :

\begin{align*}S^n_k = n-1-k-\# \{ e \,:\, e \text{ is an edge between two vertices $ v_i$ and $ v_j $ with $ i,j > k$} \}.\end{align*}

It is known that $S^{n+1}$ and S under $\mathbb P_n$ have the same distribution. Thus equality (25) can be rewritten in the following way:

(26) \begin{equation}\mathbb P (\pi_n(1),\ldots,\pi_n(k) \leq n-a) = \dfrac{(n-k)!}{n!}\mathbb E \bigl[ \bigl(S_{n-a}^{n+1}+n-a\bigr)_k \bigr].\end{equation}

Let $\tau^*$ be the so-called Kesten’s tree associated with $\mu$ (see e.g. [Reference Abraham and Delmas1, Section 2.3]). Let $\preceq$ denote the lexicographic order on the set of vertices of $\tau^*$ . It is always possible to find a unique infinite sequence $u_1,u_2,\ldots$ of distinct vertices of $\tau^*$ such that for all $i \geq 1$ , $\{u \,:\, u \text{ is a vertex of } \tau^* \text{ such that } u_i \preceq u \} = \{ u_1,\ldots,u_i\}$ . In other words, $u_1,u_2,\ldots$ are the last vertices of $\tau^*$ for the lexicographic order, which, necessarily, lie on the right of the infinite spine. Similarly to the Łukasiewicz path we can define the quantity

\begin{align*}S_a^* \,:\!=\, a-\# \{ e \,:\, e \text{ is an edge between two vertices $ u_i$ and $ u_j$ with $ i,j \leq a+1$} \}.\end{align*}

It is known that $\tau_n$ converges in distribution, for the local topology, towards $\tau^*$ (see e.g. [Reference Abraham and Delmas1, Section 3.3.5]). Making use of Skorokhod’s representation theorem, suppose that the latter convergence holds almost surely. Thus $S^{n+1}_{n-a}$ converges almost surely towards $S_a^*$ . Consequently, the convergence

\begin{align*} \dfrac{(n-k)!}{n!} \bigl(S_{n-a}^{n+1}+n-a\bigr)_k & = \dfrac{(n-k)!}{\bigl(n-k+S_{n-a}^{n+1}-a\bigr)!} \dfrac{\bigl(n+S_{n-a}^{n+1}-a\bigr)!}{n!} \\ & \sim (n-cn)^{a-S_a^*}\dfrac{1}{n^{a-S_a^*}} \\ &\longrightarrow (1-c)^{a - S_a^*}\end{align*}

holds almost surely. Since the above sequence is bounded by 1, we deduce that the convergence of the expectation holds, which concludes the proof.

Acknowledgements

I am really grateful to Igor Kortchemski for useful suggestions and the careful reading of the manuscript.

Funding information

There are no funding bodies to thank relating to the creation of this article.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Abraham, R. and Delmas, J.-F. (2020). An introduction to Galton–Watson trees and their local limits. Available at arXiv:1506.05571.
Addario-Berry, L., Devroye, L. and Janson, S. (2013). Sub-Gaussian tail bounds for the width and height of conditioned Galton–Watson trees. Ann. Prob. 41, 1072–1087.
Biane, P. (2002). Parking functions of types A and B. Electron. J. Combin. 9, N7.
Billingsley, P. (2012). Probability and Measure (Wiley Series in Probability and Statistics). Wiley.
Bobkov, S. G. (2005). Generalized symmetric polynomials and an approximate de Finetti representation. J. Theoret. Prob. 18, 399–412.
Broutin, N. and Marckert, J.-F. (2016). A new encoding of coalescent processes: applications to the additive and multiplicative cases. Prob. Theory Relat. Fields 166, 515–552.
Chassaing, P. and Louchard, G. (2002). Phase transition for parking blocks, Brownian excursion and coalescence. Random Structures Algorithms 21, 76–119.
Chassaing, P. and Marckert, J.-F. (2001). Parking functions, empirical processes, and the width of rooted labeled trees. Electron. J. Combin. 8, R14.
Chebikin, D. and Postnikov, A. (2010). Generalized parking functions, descent numbers, and chain polytopes of ribbon posets. Adv. Appl. Math. 44, 145–154.
Cori, R. and Rossin, D. (2000). On the sandpile group of dual graphs. European J. Combinatorics 21, 447–459.
Diaconis, P. and Hicks, A. (2017). Probabilizing parking functions. Adv. Appl. Math. 89, 125–155.
Ibragimov, I. A. and Linnik, Y. V. (1971). Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff, Groningen.
Janson, S. (2006). Random cutting and records in deterministic and random trees. Random Structures Algorithms 29, 139–179.
Janson, S. (2012). Simply generated trees, conditioned Galton–Watson trees, random allocations and condensation. Prob. Surveys 9, 103–252.
Kenyon, R. and Yin, M. (2023). Parking functions: from combinatorics to probability. Methodology Comput. Appl. Prob. 25, 32.
Konheim, A. G. and Weiss, B. (1966). An occupancy discipline and applications. SIAM J. Appl. Math. 14, 1266–1274.
Le Gall, J.-F. (2005). Random trees and applications. Prob. Surveys 2, 245–311.
Pitman, J. (2006). Combinatorial Stochastic Processes (Lecture Notes Math. 1875). Springer, New York.
Shi, J. Y. (1986). The Kazhdan–Lusztig Cells in Certain Affine Weyl Groups (Lecture Notes Math. 1179). Springer, Berlin.
Stanley, R. P. (1997). Parking functions and noncrossing partitions. Electron. J. Combin. 4, R20.
Stanley, R. P. (1998). Hyperplane arrangements, parking functions and tree inversions. In Mathematical Essays in Honor of Gian-Carlo Rota (Progress Math. 161), ed. B. E. Sagan and R. P. Stanley, pp. 359–375. Birkhäuser, Boston.
Stanley, R. P. and Pitman, J. (2002). A polytope related to empirical distributions, plane trees, parking functions, and the associahedron. Discrete Comput. Geom. 27, 603–634.
Wellner, J. A. (2018). The Cramér–Chernoff method and some exponential bounds. Available at https://sites.stat.washington.edu/jaw/COURSES/580s/581/HO/Chernoff-Cramer.pdf
Yan, C. H. (2015). Parking functions. In Handbook of Enumerative Combinatorics (Discrete Math. Appl.), ed. M. Bóna, pp. 835–893. CRC Press, Boca Raton, FL.
Yin, M. (2023). Parking functions: interdisciplinary connections. Adv. Appl. Prob. Available at doi:10.1017/apr.2022.49.