The number of K-tons in the coupon collector problem

John C. Saunders

doi:10.1017/jpr.2022.94

The number of K-tons in the coupon collector problem

Part of: Distribution theory - Probability Combinatorial probability

Published online by Cambridge University Press: 07 February 2023

John C. Saunders

Show author details

John C. Saunders*: Affiliation:
Middle Tennessee State University
*: *Postal address: Middle Tennessee State University Department of Mathematical Sciences. Email: [email protected]

Article contents

Abstract
Introduction
Proofs
Future work
Funding information
Competing interests
References

Rights & Permissions

Abstract

Consider the coupon collector problem where each box of a brand of cereal contains a coupon and there are n different types of coupons. Suppose that the probability of a box containing a coupon of a specific type is $1/n$, and that we keep buying boxes until we collect at least m coupons of each type. For $k\geq m$ call a certain coupon a k-ton if we see it k times by the time we have seen m copies of all of the coupons. Here we determine the asymptotic distribution of the number of k-tons after we have collected m copies of each coupon for any k in a restricted range, given any fixed m. We also determine the asymptotic joint probability distribution over such values of k, and the total number of coupons collected.

Keywords

Probability distributions combinatorial probability

MSC classification

Primary: 60E05: Distributions

Secondary: 60C05: Combinatorial probability

Type: Original Article
Information: Journal of Applied Probability , Volume 60 , Issue 3 , September 2023 , pp. 723 - 736

DOI: https://doi.org/10.1017/jpr.2022.94 [Opens in a new window]
Copyright: © The Author(s), 2023. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

\begin{equation*}H_n=1+\frac{1}{2}+\frac{1}{3}+\cdots+\frac{1}{n}=\log n+\gamma+\frac{1}{2n}+O\bigg(\frac{1}{n^2}\bigg),\end{equation*}

where $\gamma$ is the Euler–Mascheroni constant. The expected value of the total number of boxes collected has been extensively studied. For instance, it was proved in [Reference Newman and Shepp9] that, if you continue to collect boxes until you have at least m coupons of each type, then the expected number of boxes collected is $n\log n+(m-1)n\log\log n+n\cdot C_m+o(n)$ , where $C_m$ is a constant depending on m.

Definition 1.1. The Gumbel distribution with parameters $\mu$ and $\beta$ , which is denoted as $\mathrm{Gumbel}(\mu,\beta)$ , is defined as the probability distribution with cumulative density function $\exp\![-\mathrm{e}^{-(x-\mu)/\beta}]$ for $-\infty<x<\infty$ .

The above result was improved in [Reference Erdös and Rényi5], which proved that, if $v_m(n)$ is the total number of boxes collected, then $({v_m(n)}/{n})-\log n-(m-1)\log\log n\sim\mathrm{Gumbel}(-\log((m-1)!),1)$ as $n\rightarrow\infty$ . The probability distribution of the number of boxes that need to be collected to collect $a_n+1$ coupons, where $0\leq a_n<n$ , was studied in [Reference Baum and Billingsley1]. The expected number of boxes needing to be collected if the types of coupons have different probabilities of being in a box was examined in [Reference Berenbrink and Sauerwald2], and [Reference Doumas and Papanicolaou3, Reference Neal8] looked at the actual probability distribution of the number of boxes collected if the types of coupons have different probabilities. The case of collecting at least m coupons of each type has also been examined [Reference Doumas and Papanicolaou4]. Further variations of the problem, where you have multiple collectors, were studied in [Reference Foata and Han6, Reference Myers and Wilf7].

Here we examine the situation where we continue to collect boxes until we have at least m copies of each type of coupon for any fixed $m\in\mathbb{N}$ .

Definition 1.2. For all $k\geq m$ , call a certain coupon a k-ton if we see it k times by the time we have seen m copies of all of the coupons. Let the number of k-tons be denoted as $S_k$ .

In the case of $m=1$ , [Reference Myers and Wilf7] determined the expected number of 1-tons, which they called singletons, to be $H_n$ . Singletons were also studied in [Reference Penrose10], but in the case of unequal coupon probabilities and where the number of coupons collected in total is proportional to the number of types of coupons; many applications to their study, such as in databases, biological particles, and communication channels, were noted here. We extend this singleton result by determining the asymptotic joint probability distribution over values of k in a restricted range and the total number of coupons collected assuming, as in [Reference Myers and Wilf7], equal coupon probabilities. We also determine the asymptotic distribution of the number of k-tons after we have collected m copies of each coupon for any such values of k, given any fixed $m\in\mathbb{N}$ . While making an analogous extension to [Reference Penrose10] is certainly interesting, we leave it for a future project.

Theorem 1.1. Fix $m\in\mathbb{N}$ , let $T_m(n)$ be the total number of coupons collected to see at least m copies of each coupon, and let $n\rightarrow\infty$ . Let $k=o(\!\log n)$ if $m=1$ and $k=o(({\log n})/({\log\log n}))$ if $m\geq 2$ , with $k\geq m$ in either case. Then the joint probability distribution of

\begin{align*} \bigg(\frac{T_m(n)-n\log n-(m-1)n\log\log n}{n},\frac{k!S_k}{(\!\log n)^{k-m+1}}\bigg) \end{align*}

converges to $(X,\mathrm{e}^{-X})$ , where $X\sim\mathrm{Gumbel}(-\log((m-1)!),1)$ .

A simple heuristic argument for Theorem 1.1 may be given as follows. First, from [Reference Erdös and Rényi5], we know that

\begin{align*}\frac{T_m(n)-n\log n-(m-1)n\log\log n}{n}\sim\mathrm{Gumbel}(-\log((m-1)!),1).\end{align*}

Later on, we also see, unsurprisingly, that the number of times a specific type of coupon is seen is essentially independent of whether or not all types of coupons have been collected at least m times. Also, unsurprisingly, it turns out that the number of times two different types of coupons are seen is asymptotically independent. Since the probability of a type of coupon being seen k times is

\begin{equation*} \binom{T_m(n)}{k}\bigg(\frac{1}{n}\bigg)^{k}\bigg(1-\frac{1}{n}\bigg)^{T_m(n)-k} \sim \frac{(n\log n)^k\mathrm{e}^{-\log n-(m-1)\log\log n-x}}{k!n^k} = \frac{(\!\log n)^{k-m+1}}{n\mathrm{e}^xk!},\end{equation*}

where $x=({T_m(n)-n\log n-(m-1)n\log\log n})/{n}$ , the expected number of k-tons is $\sim{(\!\log n)^{k-m+1}}/{\mathrm{e}^xk!}$ , so that ${k!S_k}/{(\!\log n)^{k-m+1}}\sim \mathrm{e}^{-x}$ .

While Theorem 1.1 provides us with the joint distribution between the total number of coupons collected and the number of k-tons for minimal k, we can also ask what happens for maximum values of k. The answer to this is provided in the next theorem.

Theorem 1.2. Fix $m\in\mathbb{N}$ and let $T_m(n)$ be the total number of coupons collected to see at least m copies of each coupon.

(i) Fix $d\in\mathbb{R}$ . Pick an increasing sequence $(n_j)_j\subset\mathbb{N}$ and a sequence $(d_j)_j\subset\mathbb{R}$ such that $\lim_{j\rightarrow\infty}d_j=d$ and $k_j=\mathrm{e}\log n_j+\big((\mathrm{e}-1)(m-1)-\frac{1}{2}\big)\log\log n_j+d_j\in\mathbb{N}$ for all $j\in\mathbb{N}$ . Then the joint probability distribution of
\begin{align*} \bigg(\frac{T_m(n_j)-n_j\log n_j-(m-1)n_j\log\log n_j}{n_j},S_{k_j}\bigg) \end{align*}
converges to
\begin{align*} \bigg(X,\mathrm{Pois}\bigg(\frac{\mathrm{e}^{(\mathrm{e}-1)X-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)\bigg), \end{align*}
where $X\sim\mathrm{Gumbel}(-\log((m-1)!),1)$ .
(ii) Let $g(n)=o(\!\log\log n)$ with $\lim_{n\rightarrow\infty}g(n)=\infty$ , such that $k=\mathrm{e}\log n+\big((\mathrm{e}-1)(m-1)$ $-\frac{1}{2}\big)\log\log n+g(n)\in\mathbb{N}$ for all $n\in\mathbb{N}$ . Then $S_k$ converges to 0 with probability 1.
(iii) Let $g(n)=o(\!\log\log n)$ with $\lim_{n\rightarrow\infty}g(n)=\infty$ , such that $k=\mathrm{e}\log n+\big((\mathrm{e}-1)(m-1)$ $-\frac{1}{2}\big)\log\log n-g(n)\in\mathbb{N}$ for all $n\in\mathbb{N}$ . Then the value of $\mathbb{P}(S_k=0)$ converges to 0.

A simple heuristic argument for Theorem 1.2 may be given as follows. The probability of a type of coupon being seen $k_j$ times is

\begin{equation*} \binom{T_m(n_j)}{k_j}\bigg(\frac{1}{n_j}\bigg)^{k_j}\bigg(1-\frac{1}{n_j}\bigg)^{T_m(n_j)-k_j}.\end{equation*}

Assuming that $x=({T_m(n_j)-n_j\log n_j-(m-1)n_j\log\log n_j})/{n_j}$ is constant as $j\rightarrow\infty$ and $k_j$ is as defined in Theorem 1.2, we have that the above is asymptotic to

\begin{align*} \frac{1}{\sqrt{2\pi k_j}\,}\bigg(\frac{\mathrm{e} T_m(n_j)}{n_jk_j}\bigg)^{k_j}\exp&\bigg[{-}\frac{T_m(n_j)}{n_j}\bigg] \\[5pt] & \sim \frac{1}{\sqrt{2\pi \mathrm{e}\log n_j}\,}\exp\bigg[\frac{T_m(n_j)\mathrm{e}-k_jn_j-T_m(n_j)}{n_j}\bigg] \\[5pt] & = \frac{1}{\sqrt{2\pi \mathrm{e}\log n_j}\,}\exp\bigg[{-}\log n_j+\frac{\log\log n_j}{2}+(\mathrm{e}-1)x-d_j\bigg] \\[5pt] & = \frac{1}{\sqrt{2\pi \mathrm{e}}n_j}\mathrm{e}^{(\mathrm{e}-1)x-d_j}.\end{align*}

Therefore, by similar reasoning to the heuristic for Theorem 1.1, we can see that the probability that there are y k-tons is asymptotic to

\begin{equation*} \binom{n}{y}\bigg(\frac{\lambda}{n}\bigg)^y\bigg(1-\frac{\lambda}{n}\bigg)^{n-y}\sim\frac{\lambda^y\mathrm{e}^{-\lambda}}{y!},\end{equation*}

where $\lambda = {\mathrm{e}^{(\mathrm{e}-1)x-d}}/({\sqrt{2\pi \mathrm{e}}})$ , which gives the Poisson distribution in Theorem 1.2(i). Parts (ii) and (iii) can be heuristically argued by letting $d_j$ approach $\infty$ or $-\infty$ .

From Theorem 1.1 we can derive the following corollaries.

Corollary 1.1. Fix $m\in\mathbb{N}$ , let $n\rightarrow\infty$ , and let $k=o(\!\log n)$ if $m=1$ and $k=o(({\log n})/({\log\log n}))$ if $m\geq 2$ , with $k\geq m$ in either case. Suppose we keep collecting coupons until we see at least m copies of each coupon. Then

\begin{align*} \frac{k!S_k}{(\!\log n)^{k-m+1}} \quad \overset{\mathrm{D}}{\longrightarrow} \quad \mathrm{Exp}\bigg(\frac{1}{(m-1)!}\bigg). \end{align*}

Corollary 1.1 gives the asymptotic probability distribution of the number of k-tons. If we fix k, we can also determine the joint asymptotic distribution between the number of m-tons, $m+1$ -tons, …, and k-tons. We can see that asymptotically all of these variables are linearly dependent.

Corollary 1.2. Fix $k,m\in\mathbb{N}$ with $k\geq m$ , and let $n\rightarrow\infty$ . Suppose we keep collecting coupons until we see at least m copies of each coupon. Then the joint probability distribution

\begin{align*} \bigg(\frac{m!S_m}{(\!\log n)},\frac{(m+1)!S_{m+1}}{(\!\log n)^2},\ldots,\frac{k!S_k}{(\!\log n)^{k-m+1}}\bigg) \end{align*}

converges to $X\cdot(1,1,\ldots,1)$ , where

\begin{align*} X\sim\mathrm{Exp}\bigg(\frac{1}{(m-1)!}\bigg). \end{align*}

2. Proofs

To prove Theorem 1.1 we require the following notation and lemma.

Note 2.1. Suppose we collect x coupons in total, regardless of whether we have collected all n types of coupons or not. Let $S_{i,x}$ denote the number of types of coupons we collected exactly i times.

Note 2.2. Throughout this paper, let $N(n,f(n))\;:\!=\;n\log n+(m-1)n\log\log n+f(n)$ .

Lemma 2.1. Let $f\colon\mathbb{N}\rightarrow\mathbb{R}$ be such that $-\frac{1}{2}n\log\log n \leq f(n) \leq \frac{1}{2}n\log\log n$ , and $N(n,f(n))\in\mathbb{N}$ for all $n\in\mathbb{N}$ . Suppose we collect N(n,f(n)) coupons in total, regardless of whether we collect m copies of all n different kinds of coupons or not. Let $k=o(\!\log n)$ if $m=1$ and $k=o({\log n}/{\log\log n})$ if $m\geq 2$ . If $m\leq k$ , then as $n\rightarrow\infty$

\begin{align*}\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\end{align*}

is asymptotic to $\mathrm{e}^{-{f(n)}/{n}}$ with probability 1, and

\begin{equation*} \mathrm{Var}\bigg(\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg) = O\bigg(\frac{1}{\sqrt{\log n}\,}\bigg). \end{equation*}

Proof. First, assume that $m\leq k$ . We can see that, as $n\rightarrow\infty$ ,

\begin{align*} \mathbb{E}(S_{k,N(n,f(n))}) & = n\cdot \mathbb{P}(c_0\text{ occurs exactly }k\text{ times in the collection of }N(n,f(n))\text{ coupons}) \\[5pt] & = \binom{N(n,f(n))}{k}\bigg(1-\frac{1}{n}\bigg)^{N(n,f(n))-k}\frac{1}{n^{k-1}}\nonumber \\[5pt] & \sim \frac{N(n,f(n))^k}{k!n^{k-1}}\exp\bigg[{-}\log n-(m-1)\log\log n-\frac{f(n)}{n}\bigg] \\[5pt] & \sim \frac{(\!\log n)^{k-m+1}}{k!\mathrm{e}^{{f(n)}/{n}}}. \end{align*}

Thus, we can deduce that

\begin{equation*} \mathbb{E}\bigg(\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg) = \mathrm{e}^{-{f(n)}/{n}}(1+o(1)) \end{equation*}

as $n\rightarrow\infty$ . Also,

(2.1)

\begin{align} \mathbb{E}(&S_{k,N(n,f(n))}^2) \nonumber \\[5pt] & = n(n-1)\cdot \mathbb{P}(c_0,c_1\text{ both occur exactly }k\text{ times in the collection of }N(n,f(n))\text{ coupons}) \nonumber \\[5pt] & \quad + n\cdot \mathbb{P}(c_0\text{ occurs exactly }k\text{ times in the collection of }N(n,f(n))\text{ coupons}) \nonumber \\[5pt] & = n(n-1)\binom{N(n,f(n))}{k}\binom{N(n,f(n))-k}{k}\bigg(1-\frac{2}{n}\bigg)^{N(n,f(n))-2k}\frac{1}{n^{2k}} \nonumber \\[5pt] & \quad + \binom{N(n,f(n))}{k}\bigg(1-\frac{1}{n}\bigg)^{N(n,f(n))-k}\frac{1}{n^{k-1}} \nonumber \\[5pt] & \leq (\mathbb{E}(S_{k,N(n,f(n))}))^2\bigg(1-\frac{1}{n}\bigg)^{-2k} + \mathbb{E}(S_{k,N(n,f(n))}) . \end{align}

Notice that

(2.2)

\begin{equation} \frac{k!\mathrm{e}^{-{f(n)}/{n}}}{(\!\log n)^{k-m+1}} \leq \frac{k!}{(\!\log n)^{k-m+\frac{1}{2}}}. \end{equation}

If k is bounded as $n\rightarrow\infty$ , then the right-hand side of (2.2) $O({1}/{\sqrt{\log n}})$ . On the other hand, for k sufficiently large we have $k!<k^{k-m}$ , so even if values of k tend toward $\infty$ , the right-hand side of (2.2) is $O({1}/{\sqrt{\log n}})$ . Thus, from (2.1), we obtain

\begin{align*} \mathbb{E}\bigg(\bigg(&\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg)^2\bigg) - \bigg(\mathbb{E}\bigg(\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg)\bigg)^2 \\[5pt] & \leq \bigg(\mathbb{E}\bigg(\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg)\bigg)^2 \cdot O\bigg(\frac{k}{n}\bigg) + \frac{(k!)^2\mathbb{E}(S_{k,N(n,f(n))})}{(\!\log n)^{2k-2m+2}} \\[5pt] & \leq O \bigg(\frac{k\sqrt{\log n}}{n}\bigg) + O\bigg(\frac{1}{\sqrt{\log n}\,}\bigg) = O \bigg(\frac{1}{\sqrt{\log n}\,}\bigg) \end{align*}

as $n\rightarrow\infty$ . Thus,

\begin{equation*} \mathrm{Var}\bigg(\frac{k!S_{k,N(n,f(n))}}{(\!\log n)^{k-m+1}}\bigg) = O\bigg(\frac{1}{\sqrt{\log n}\,}\bigg) \end{equation*}

as $n\rightarrow\infty$ , so that we obtain our result for $m\leq k$ .

Proof of Theorem 1.1. $X=({T_m(n)-n\log n-(m-1)n\log\log n})/{n}$ was proved to be asymptotically Gumbel distributed in [Reference Erdös and Rényi5]. To prove Theorem 1.1, it therefore suffices to show that$

(2.3)

\begin{equation} \lim_{n\rightarrow\infty} \mathbb{P}\bigg(\frac{T_m(n)-n\log n-(m-1)n\log\log n}{n}<x,\frac{k!S_k}{(\!\log n)^{k-m+1}} < \mathrm{e}^{-y}\bigg) = 0 \end{equation}

for any fixed $x<y$ , and that$

(2.4)

\begin{equation} \lim_{n\rightarrow\infty} \mathbb{P}\bigg(\frac{T_m(n)-n\log n-(m-1)n\log\log n}{n}>x,\frac{k!S_k}{(\!\log n)^{k-m+1}} > \mathrm{e}^{-y}\bigg) = 0 \end{equation}

for any fixed $y<x$ . To prove (2.3), we first calculate an upper bound for

\begin{equation*} \mathbb{P}\bigg(T_m(n)\leq N(n,nx),\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \end{equation*}

for all sufficiently large $n\in\mathbb{N}$ . Let $M(n)=n\log n+\big(m-\frac{3}{2}\big)n\log\log n$ . We have

\begin{align*} \mathbb{P}\bigg(T_m&(n)\leq N(n,nx),\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \\[5pt] & = \mathbb{P}\bigg(T_m(n)< M(n),\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \\[5pt] & \quad + \mathbb{P}\bigg(M(n)\leq T_m(n)\leq N(n,nx),\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \\[5pt] & \leq \mathbb{P}(T_m(n)< M(n)) + \mathbb{P}\bigg(M(n)\leq T_m(n)\leq N(n,nx),\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg). \end{align*}

We have

(2.5)

\begin{align*}\mathbb{P}\bigg(M(n)\leq T_m(n)\leq N(n,nx),\frac{k!S_k}{(\!\log n)^{k-m+1}}&<\mathrm{e}^{-y}\bigg)\\ &= \sum_{M=\lceil M(n)\rceil}^{\lfloor N(n,nx)\rfloor}\mathbb{P}\bigg(T_m(n)=M,\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg). \end{align*}

Clearly,

(2.6)

\begin{align} \mathbb{P}\bigg(T_m(n)=M,\frac{k!S_k}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) & \leq \mathbb{P}\bigg(S_{m-1,M-1}=1,S_{m-1,M}=0,\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \nonumber \\[5pt] & = \frac{1}{n}\mathbb{P}\bigg(S_{m-1,M-1}=1,\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \nonumber \\[5pt] & \leq\frac{1}{n}\mathbb{P}\bigg(\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg). \end{align}

Combining (2.5) and (2.6), we therefore have

(2.7)

\begin{align*}\mathbb{P}\bigg(M(n)\leq T_m(n)\leq N(n,nx),\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}&<\mathrm{e}^{-y}\bigg)\\ &\leq \sum_{M=\lceil M(n)\rceil}^{\lfloor N(n,nx)\rfloor}\frac{1}{n}\mathbb{P}\bigg(\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg).\end{align*}

From Chebyshev’s inequality and Lemma 2.1 we have, for sufficiently large n,

\begin{equation*} \mathbb{P}\bigg(\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \leq \frac{C}{\sqrt{\log n}\,} \end{equation*}

for some constant $C>0$ , depending only on x and y. From (2.7), we can thus deduce that$

(2.8)

\begin{align} \mathbb{P}\bigg(M(n)\leq T_m(n)\leq N(n,nx),&\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg) \nonumber \\[5pt] & \leq \sum_{M=\lceil M(n)\rceil}^{\lfloor N(n,nx)\rfloor}\frac{C}{n\sqrt{\log n}\,} \nonumber \\[5pt] & \leq \frac{C\big(nx+\tfrac{1}{2}n\log\log n\big)}{n\sqrt{\log n}} = \frac{C(2x+\log\log n)}{2\sqrt{\log n}} \end{align}

for all sufficiently large n.

Now we let $n\rightarrow\infty$ . From the result quoted at the start of the proof, we can deduce that $\lim_{n\rightarrow\infty}\mathbb{P}(T< M(n))=0$ . From (2.8) we can also see that

\begin{align*}\lim_{n\rightarrow\infty}\mathbb{P}\bigg(M(n) \leq T_m(n)\leq N(n,nx),\frac{k!S_{k,M}}{(\!\log n)^{k-m+1}}<\mathrm{e}^{-y}\bigg)=0.\end{align*}

These two limits give us (2.3).

We can prove (2.4) similarly, replacing $M(n)=n\log n+\big(m-\frac{3}{2}\big)n\log\log n$ with $M'(n)=n\log n+\big(m-\frac{1}{2}\big)n\log\log n$ and reversing the appropriate inequalities.

Proof of Corollaries 1.1 and 1.2. Corollaries 1.1 and 1.2 follow from Theorem 1.1 by showing that, if $X\sim\mathrm{Gumbel}(-\log((m-1)!),1)$ , then

\begin{align*}\mathrm{e}^{-X}\sim\mathrm{Exp}\bigg(\frac{1}{(m-1)!}\bigg).\end{align*}

Indeed, we have, for any $r>0$ ,

\begin{align*} \mathbb{P}(\mathrm{e}^{-X}\leq r) = \mathbb{P}(X\geq -\log r) & = 1-\mathbb{P}(X\leq -\log r) \\[5pt] & = 1 - \exp\big[{-}\mathrm{e}^{-(-\log r+\log(m-1)!)}\big] \\[5pt] & = 1 - \mathrm{e}^{-{r}/{(m-1)!}}. \end{align*}

Hence, the result follows.

To prove Theorem 1.2, we require the following lemma.

Lemma 2.2. Fix $m\in\mathbb{N}$ . Let $f\colon\mathbb{N}\rightarrow\mathbb{R}$ be such that there exists $x\in\mathbb{R}$ such that $\lim_{n\rightarrow\infty}f(n)/n=x$ and $N(n,f(n))\in\mathbb{N}$ for all $n\in\mathbb{N}$ . Suppose we collect N(n, f(n)) coupons in total regardless of whether we collect m copies of all n different kinds of coupons or not. Pick an increasing sequence $(n_j)_j\subset\mathbb{N}$ and a sequence $(d_j)_j\subset\mathbb{R}$ such that $\lim_{j\rightarrow\infty}d_j=d$ for some $d\in\mathbb{R}$ and $k=\mathrm{e}\log n_j+\big((\mathrm{e}-1)(m-1)-\frac{1}{2}\big)\log\log n_j+d_j\in\mathbb{N}$ for all $j\in\mathbb{N}$ . Then, as $j\rightarrow\infty$ ,

\begin{align*} \mathbb{P}(S_{m-1,N(n_j,f(n_j))}=r_1,S_{k,N(n_j,f(n_j))}=r_2) & = \frac{1}{r_1!r_2!}\bigg(\frac{\mathrm{e}^{-{f(n_j)}/{n_j}}}{(m-1)!}\bigg)^{r_1} \exp\bigg[\frac{-\mathrm{e}^{-{f(n_j)}/{n_j}}}{(m-1)!}\bigg] \\[5pt] & \quad \times \bigg(\frac{\exp\![({(\mathrm{e}-1)f(n_j)}/{n_j})-d]}{\sqrt{2\pi \mathrm{e}}}\bigg)^{r_2} \\[5pt] & \quad \times \exp\bigg[\frac{-\exp\![({(\mathrm{e}-1)f(n_j)}/{n_j})-d]}{\sqrt{2\pi \mathrm{e}}}\bigg] \\[5pt] & \quad \times (1+o(1)), \end{align*}

where the implied constant in the error term only depends upon $r_1$ and $r_2$ .

Proof. Take some $j\in\mathbb{N}$ . Let the probability that $r_1$ prescribed types of coupons are $(m-1)$ -tons and $r_2$ prescribed types of coupons are k-tons be denoted as $W_{r_1,r_2}(n)$ . Then, as $j\rightarrow\infty$ ,

\begin{align*} & \binom{n_j}{r_1,r_2,n_j-r_1-r_2}W_{r_1,r_2}(n_j) \\[5pt] & \quad = \binom{n_j}{r_1,r_2,n-r_1-r_2} \frac{N(n_j,f(n_j))!}{(m-1)!^{r_1}k!^{r_2}(N(n_j,f(n_j))-(m-1)r_1-kr_2)!n_j^{(m-1)r_1+kr_2}} \\[5pt] & \qquad \times \bigg(1-\frac{r_1+r_2}{n_j}\bigg)^{N(n_j, f(n_j))-(m-1)r_1-kr_2} \\[5pt] & \quad = \binom{n_j}{r_1,r_2,n_j-r_1-r_2}\frac{N(n_j,f(n_j))!\mathrm{e}^{-(r_1+r_2)N(n_j,f(n_j))/n_j}}{(m-1)!^{r_1}k!^{r_2}(N(n_j, f(n_j))-(m-1)r_1-kr_2)!n_j^{(m-1)r_1+kr_2}} \\[5pt] & \qquad \times \bigg(1+O\bigg(\frac{\log n_j}{n_j}\bigg)\bigg) \\[5pt] & \quad = \frac{N(n_j,f(n_j))^{(m-1)r_1+kr_2}\mathrm{e}^{-{(r_1+r_2)f(n_j)}/{n_j}}}{r_1!r_2!(m-1)!^{r_1}k!^{r_2}(\!\log n_j)^{(r_1+r_2)(m-1)}n_j^{(m-1)r_1+kr_2}}\bigg(1+O\bigg(\frac{\log n_j}{n_j}\bigg)\bigg) \\[5pt] & \quad = \frac{1}{r_1!r_2!}\bigg(\frac{\mathrm{e}^{-{f(n_j)}/{n_j}}}{(m-1)!}\bigg)^{r_1} \bigg(\frac{\mathrm{e}^{({(\mathrm{e}-1)f(n_j)}/{n_j})-d_j}}{\sqrt{2\pi \mathrm{e}}}\bigg)^{r_2} \bigg(1+O\bigg(\frac{(\!\log\log n_j)^2}{\log n_j}\bigg)\bigg), \end{align*}

where the implied constant in the error term only depends on $r_1$ and $r_2$ . By the inclusion–exclusion formula, we have

(2.9)

\begin{align} & \mathbb{P}(S_{m-1,N(n_j,f(n_j))}=r_1,S_{k,N(n_j,f(n_j))}=r_2) \nonumber \\[5pt] & = \sum_{0\leq j_1+j_2\leq n_j-r_1-r_2}\bigg[(-1)^{j_1+j_2}\binom{n_j}{r_1+j_1,r_2+j_2,n_j-r_1-r_2-j_1-j_2}\binom{r_1+j_1}{r_1}\binom{r_2+j_2}{r_2} \nonumber \\[5pt] & \qquad \qquad \qquad \qquad \qquad \times W_{r_1+j_1,r_2+j_2}(n_j)\bigg]. \end{align}

For $t\leq n_j-r_1-r_2$ even:

(2.10)

\begin{align} & \mathbb{P}(S_{m-1,N(n_j,f(n_j))}=r_1,S_{k,N(n_j,f(n_j))}=r_2) \nonumber \\[5pt] & \leq \sum_{0\leq j_1+j_2\leq t}\bigg[(-1)^{j_1+j_2}\binom{n_j}{r_1+j_1,r_2+j_2,n_j-r_1-r_2-j_1-j_2}\binom{r_1+j_1}{r_1}\binom{r_2+j_2}{r_2} \nonumber \\[5pt] & \qquad \qquad \qquad \quad \times W_{r_1+j_1,r_2+j_2}(n_j)\bigg]. \end{align}

For $t\leq n_j-r_1-r_2$ odd:

(2.11)

\begin{align} & \mathbb{P}(S_{m-1,N(n_j,f(n_j))}=r_1,S_{k,N(n_j,f(n_j))}=r_2) \nonumber \\[5pt] & \geq \sum_{0\leq j_1+j_2\leq t}\bigg[(-1)^{j_1+j_2}\binom{n_j}{r_1+j_1,r_2+j_2,n_j-r_1-r_2-j_1-j_2}\binom{r_1+j_1}{r_1}\binom{r_2+j_2}{r_2} \nonumber \\[5pt] & \qquad \qquad \qquad \quad \times W_{r_1+j_1,r_2+j_2}(n_j)\bigg]. \end{align}

Combining (2.9)–(2.11), we have

\begin{align*} \mathbb{P}(&S_{m-1,N(n_j,f(n_j))}=r_1,S_{k,N(n_j,f(n_j))}=r_2) \\[5pt] & \sim \frac{1}{r_1!r_2!}\sum_{j_1,j_2=0}^{\infty}\frac{(-1)^{j_1+j_2}}{j_1!j_2!} \bigg(\frac{\mathrm{e}^{-x}}{(m-1)!}\bigg)^{r_1+j_1} \bigg(\frac{\mathrm{e}^{((\mathrm{e}-1)x)-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^{r_2+j_2} \\[5pt] & = \frac{1}{r_1!r_2!}\bigg(\frac{\mathrm{e}^{-x}}{(m-1)!}\bigg)^{r_1} \exp\bigg[\frac{-\mathrm{e}^{-x}}{(m-1)!}\bigg] \bigg(\frac{\mathrm{e}^{((e-1)x)-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^{r_2}\exp\bigg[\frac{-\mathrm{e}^{((e-1)x)-d}}{\sqrt{2\pi \mathrm{e}}}\bigg] , \end{align*}

as $j\rightarrow\infty$ .

Proof of Theorem 1.2. We first prove Theorem 1.2(i). Let $d\in\mathbb{R}$ and pick increasing sequences $(n_j)_j$ and $(d_j)_j$ such that $\lim_{j\rightarrow\infty}d_j=d$ and $k_j \;:\!=\; \mathrm{e}\log n_j + \big((\mathrm{e}-1)(m-1)-\frac{1}{2}\big)\log\log n_j + d_j\in\mathbb{N}$ for all $j\in\mathbb{N}$ . Let $x<y$ and $r\in\mathbb{N}$ be constants. Consider the probability $\mathbb{P}(N(n_j,n_jx)\leq T_m(n)\leq N(n_j,n_jy),S_k=r)$ . We have

\begin{equation*} \mathbb{P}(N(n_j,n_jx)\leq T_m(n)\leq N(n_j,n_jy),S_k=r) = \sum_{M=\lceil N(n_j,n_jx)\rceil }^{\lfloor N(n_j,n_jy)\rfloor}\mathbb{P}(T=M,S_{k,M-1}=r). \end{equation*}

Clearly,

(2.12)

\begin{align} \mathbb{P}(T_m(n)=M,S_{k,M-1}=r) & \leq \mathbb{P}(S_{m-1,M-1}=1,S_{m-1,M}=0,S_{k,M-1}=r) \nonumber \\[5pt] & = {\mathbb{P}(S_{m-1,M-1}=1,S_{k,M-1}=r)}/{n}. \end{align}

Also, for $M\geq N(n_j,n_jx)$ , if we have $S_{i,M}>0$ for some $0\leq i\leq m-2$ , then we have $S_{l,\lceil N(n_j,n_jx)\rceil}>0$ for some $0\leq l\leq i$ . Thus, we can also deduce that

(2.13)

\begin{align*}\mathbb{P}(N(n_j,n_jx)\leq T_m(n)&\leq N(n_j,n_jy),S_k=r)\\ &\geq \Bigg(\sum_{M=\lceil N(n_j,n_jx)\rceil }^{\lfloor N(n_j,n_jy)\rfloor}\frac{\mathbb{P}(S_{m-1,M-1}=1,S_{k,M-1}=r)}{n_j}\Bigg) \\ &\qquad - \mathbb{P}(\text{there} \, \text{exists}\ 0\leq l\leq m-2\ \text{such} \, \text{that} S_{l,\lceil N(n_j,n_jx)\rceil}>0).\end{align*}

For any $0\leq l\leq m-2$ , notice that, as $j\rightarrow\infty$ ,

\begin{align*} \mathbb{E}(S_{l,\lceil N(n_j,n_jx) \rceil}) & = n \cdot \mathbb{P}(c_0\text{ occurs exactly }l\text{ times in the collection of }\lceil N(n_j,n_jx)\rceil \text{ coupons}) \\[5pt] & = \binom{\lceil N(n_j,n_jx)\rceil}{l}\bigg(1-\frac{1}{n_j}\bigg)^{\lceil N(n_j,n_jx)\rceil-l}\frac{1}{n_j^{l-1}} \\[5pt] & \sim \frac{N(n_j,n_jx)^l}{l!n_j^{l-1}}\mathrm{e}^{-\log n_j-(m-1)\log\log n_j-x} \sim \frac{(\!\log n_j)^{l-m+1}\mathrm{e}^{-x}}{l!}. \end{align*}

Since $l\leq m-2$ , we therefore have $\lim_{j\rightarrow\infty}\mathbb{E}(S_{l,\lceil N(n_j,n_jx)\rceil})=0$ , and so$

(2.14)

\begin{equation} \lim_{j\rightarrow\infty}\mathbb{P}(\text{there exists } 0 \leq l \leq m-2\text{ such that }S_{l,\lceil N(n_j,n_jx)\rceil}>0) = 0. \end{equation}

For every $N(n_j,n_jx) \leq M \leq N(n_j,n_jy)$ , let $c_{M,j} \;:\!=\; ({M-n_j\log n_j-(m-1)n_j\log\log n_j})/{n_j}$ . By Lemma 2.2, we have$

(2.15)

\begin{align} & \sum_{M=\lceil N(n_j,n_jx)\rceil}^{\lfloor N(n_j,n_jy)\rfloor} \frac{\mathbb{P}(S_{m-1,M-1}=1,S_{k,M-1}=r)}{n} \nonumber \\[5pt] & = \sum_{M=\lceil N(n_j,n_jx)\rceil}^{\lfloor N(n_j,n_jy)\rfloor} \frac{1}{r!n} \bigg(\frac{\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg) \exp\bigg[\frac{-\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg] \bigg(\frac{\mathrm{e}^{(\mathrm{e}-1)c_{M,j}-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^r \exp\bigg[\frac{-\mathrm{e}^{(\mathrm{e}-1)c_{M,j}-d}}{\sqrt{2\pi \mathrm{e}}}\bigg] \nonumber \\[5pt] & \quad \times (1+o(1)) \nonumber \\[5pt] & < \bigg(\frac{\mathrm{e}^{(\mathrm{e}-1)y-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^r \exp\bigg[\frac{-\mathrm{e}^{(\mathrm{e}-1)x-d}}{\sqrt{2\pi \mathrm{e}}}\bigg] \nonumber \\[5pt] & \quad \times \sum_{M=\lceil N(n_j,n_jx)\rceil}^{\lfloor N(n_j,n_jy)\rfloor} \frac{1}{r!n} \bigg(\frac{\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg) \exp\bigg[\frac{-\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg] (1 + o(1)). \end{align}

Note that

(2.16)

\begin{align} \lim_{j\rightarrow\infty}\sum_{M=\lceil N(n_j,n_jx)\rceil}^{\lfloor N(n_j,n_jy)\rfloor} \frac{1}{n} \bigg(\frac{\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg) & \exp\bigg[\frac{-\mathrm{e}^{-c_{M,j}}}{(m-1)!}\bigg] \nonumber \\[5pt] & = \int_x^y\frac{\mathrm{e}^{-t}\exp\left[\frac{-\mathrm{e}^{-t}}{(m-1)!}\right]}{(m-1)!} \, \mathrm{dd} t \nonumber \\[5pt] & = \exp\bigg[\frac{-\mathrm{e}^{-y}}{(m-1)!}\bigg] - \exp\bigg[\frac{-\mathrm{e}^{-x}}{(m-1)!}\bigg] \nonumber \\[5pt] & = \lim_{j\rightarrow\infty}\mathbb{P}(N(n_j,x)\leq T_m(n_j)\leq N(n_j,y)). \end{align}

Combining (2.12)–(2.16), we obtain

\begin{align*}\limsup_{j\rightarrow\infty} \, &\mathbb{P}(N(n_j,n_jx)\leq T_m(n_j)\leq N(n_j,n_jy),S_k=r)\\ &\leq \frac{1}{r!}\bigg(\frac{\mathrm{e}^{(\mathrm{e}-1)y-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^r \exp\bigg[\frac{-\mathrm{e}^{(\mathrm{e}-1)x-d}}{\sqrt{2\pi \mathrm{e}}}\bigg] \lim_{j\rightarrow\infty}\mathbb{P}(N(n_j,n_jx)\leq T_m(n_j)\leq N(n_j,n_jy)).\end{align*}

By similar reasoning, we can also obtain

\begin{align*}\liminf_{j\rightarrow\infty} \, &\mathbb{P}(N(n_j,n_jx)\leq T_m(n_j)\leq N(n_j,n_jy),S_k=r)\\ &\geq \frac{1}{r!}\bigg(\frac{\mathrm{e}^{(\mathrm{e}-1)x-d}}{\sqrt{2\pi \mathrm{e}}}\bigg)^r \exp\bigg[\frac{-\mathrm{e}^{(\mathrm{e}-1)y-d}}{\sqrt{2\pi \mathrm{e}}}\bigg] \lim_{j\rightarrow\infty}\mathbb{P}(N(n_j,n_jx)\leq T_m(n_j)\leq N(n_j,n_jy)).\end{align*}

Since the choices of $x<y$ were arbitrary, we can see that the joint limiting distribution holds.

For Theorem 1.2(ii), again let $x<y$ be constants. We will show that

\begin{equation*} \lim_{n\rightarrow\infty}\mathbb{P}(N(n,nx)\leq T_m(n_j)\leq N(n,ny),S_k\geq 1) = 0, \end{equation*}

which implies the desired result since $x<y$ are arbitrary. We have

\begin{align*} & \mathbb{P}(N(n,nx)\leq T_m(n)\leq N(n,ny),S_k\geq 1) \\[5pt] & \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{\mathbb{P}(S_{m-1,M-1}=1,S_{k,M-1}\geq 1)}{n} \\[5pt] & \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{\mathbb{P}(S_{k,M-1}\geq 1)}{n} \\ & \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{n\cdot \mathbb{P}(c_0\text{ occurs exactly }k\text{ times in the collection of }M-1\text{ coupons})}{n} \end{align*}

\begin{align*} & = \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \mathbb{P}(c_0\text{ occurs exactly }k\text{ times in the collection of }M\text{ coupons})\\[5pt] & = \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \binom{M}{k}\bigg(1-\frac{1}{n}\bigg)^{M-k}\frac{1}{n^k} \\[5pt] & < \frac{1}{k!} \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \bigg(1-\frac{1}{n}\bigg)^{M-k}\frac{M^k}{n^k} \\[5pt] & < \frac{n(y-x)}{k!}\bigg(1-\frac{1}{n}\bigg)^{N(n,nx)-k}(\!\log n+(m-1)\log\log n+y)^k. \end{align*}

As $n\rightarrow\infty$ , we have$

(2.17)

\begin{align} \frac{n(y-x)}{k!}&\bigg(1-\frac{1}{n}\bigg)^{N(n,nx)-k}(\!\log n+(m-1)\log\log n+y)^k \nonumber \\[5pt] & \sim \frac{(y-x)(\!\log n+(m-1)\log\log n+y)^k}{k!(\!\log n)^{m-1}\mathrm{e}^x} \nonumber \\[5pt] & \sim \frac{y-x}{\sqrt{2\pi \mathrm{e}}(\!\log n)^{m-\frac{1}{2}}\mathrm{e}^x} \bigg(\frac{\mathrm{e}\log n+\mathrm{e}(m-1)\log\log n+\mathrm{e} y}{k}\bigg)^k \nonumber \\[5pt] & \sim \frac{(y-x)\mathrm{e}^{\left(m-\frac{1}{2}\right)\log\log n+\mathrm{e} y-g(n)}} {\sqrt{2\pi \mathrm{e}}(\!\log n)^{m-\frac{1}{2}}\mathrm{e}^x} \nonumber \\[5pt] & = \frac{(y-x)\mathrm{e}^{\mathrm{e} y-x-g(n)}}{\sqrt{2\pi \mathrm{e}}}. \end{align}

We have

\begin{equation*} \lim_{n\rightarrow\infty}\frac{(y-x)\mathrm{e}^{\mathrm{e} y-x-g(n)}}{\sqrt{2\pi \mathrm{e}}} = 0, \end{equation*}

giving us our result.

For Theorem 1.2(iii), again let $x<y$ be constants. We will show that

\begin{equation*} \lim_{n\rightarrow\infty}\mathbb{P}(N(n,nx)\leq T_m(n)\leq N(n,ny),S_k=0) = 0, \end{equation*}

which implies the desired result since $x<y$ are arbitrary. We have$

(2.18)

\begin{align} \mathbb{P}(N(n,nx)\leq T_m(n)\leq N(n,ny),S_k=0) & \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{\mathbb{P}(S_{m-1,M-1}=1,S_{k,M-1}=0)}{n} \nonumber \\[5pt] & \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{\mathbb{P}(S_{k,M-1}=0)}{n}. \end{align}

Note that

\begin{align*} \mathbb{E}(S_{k,M}) & = \binom{M}{k}\bigg(1-\frac{1}{n}\bigg)^{M-k}\frac{1}{n^{k-1}} , \\[5pt] \mathbb{E}(S_{k,M}^2) & = \binom{M}{k}\binom{M-k}{k}\bigg(1-\frac{2}{n}\bigg)^{M-2k}\frac{(n-1)}{n^{2k-1}} + \binom{M}{k}\bigg(1-\frac{1}{n}\bigg)^{M-k}\frac{1}{n^{k-1}} \\[5pt] & < \binom{M}{k}^2\bigg(1-\frac{1}{n}\bigg)^{2M-4k}\frac{1}{n^{2k-2}} + \binom{M}{k}\bigg(1-\frac{1}{n}\bigg)^{M-k}\frac{1}{n^{k-1}}. \end{align*}

We have

\begin{align*} \mathrm{Var}(S_{k,M}) = \mathbb{E}(S_{k,M}^2) - \mathbb{E}(S_{k,M})^2 \leq \mathbb{E}(S_{k,M})^2\bigg(\bigg(1-\frac{1}{n}\bigg)^{-2k}-1\bigg) + \mathbb{E}(S_{k,M}). \end{align*}

Since $k<n$ for sufficiently large n, we have that there exists $C_1>0$ such that, for sufficiently large n,

\begin{equation*} \mathrm{Var}(S_{k,M}) \leq \frac{C_1k\mathbb{E}(S_{k,M})^2}{n} + \mathbb{E}(S_{k,M}). \end{equation*}

Similarly to the derivation of (2.17), we can deduce that there exists $C_2>0$ depending only on x and y such that, for sufficiently large n, $\mathbb{E}(S_{k,M})<C_2\mathrm{e}^{g(n)}<C_2\log n$ . Therefore, for sufficiently large n, we have $\mathrm{Var}(S_{k,M}) \leq \mathbb{E}(S_{k,M}) + 1 < 2\mathbb{E}(S_{k,M})$ . Thus, for sufficiently large n, $\sigma(S_{k,M}) < \sqrt{2\mathbb{E}(S_{k,M})}$ . By Chebyshev’s inequality, we therefore have$

(2.19)

\begin{align} \mathbb{P}(S_{k,M}=0) & \leq \mathbb{P}(|S_{k,M}-\mathbb{E}(S_{k,M})| \geq \mathbb{E}(S_{k,M})) \nonumber \\[5pt] & \leq \mathbb{P}\bigg(|S_{k,M}-\mathbb{E}(S_{k,M})| \geq \frac{\sigma(S_{k,M})\sqrt{\mathbb{E}(S_{k,M})}}{\sqrt{2}}\bigg) \leq \frac{2}{\mathbb{E}(S_{k,M})} \end{align}

for sufficiently large n. Also, similarly to the derivation of (2.17), we have that there exists $C_3>0$ depending on only x and y such that $C_3\mathrm{e}^{g(n)}<\mathbb{E}(S_{k,M})$ . Thus, from (2.18) and (2.19), we have

\begin{equation*} \mathbb{P}(N(n,nx)\leq T_m(n)\leq N(n,ny),S_k=0) \leq \sum_{M=\lceil N(n,nx)\rceil}^{\lfloor N(n,ny)\rfloor} \frac{2}{C_3n\mathrm{e}^{g(n)}} = \frac{2(y-x)}{C_3\mathrm{e}^{g(n)}} \end{equation*}

for sufficiently large n. Since $\lim_{n\rightarrow\infty}g(n)=\infty$ , we have our result.

3. Future work

There are a few open questions that are worth exploring with respect to the coupon collector’s problem and the number of k-tons. For instance, can we extend our ranges for k? We have results for $k=o(\!\log n)$ if $m=1$ and $k=o({\log n}/{\log\log n})$ if $m\geq 2$ , as well as $k=\mathrm{e}\log n+\big((\mathrm{e}-1)(m-1)-\frac{1}{2}\big)\log\log n+d$ , but we can also ask similar questions for values of k between these minimum and maximum values. For instance, we can consider $k=\theta(\!\log n)$ or $k=\theta({\log n}/{\log\log n})$ . We can also ask what happens if we have m increasing with n. Also, [Reference Doumas and Papanicolaou3] studied the limiting distribution of $T_m(n)$ in the case of unequal coupon probabilities. We could also study the limiting distribution of k-tons in the case of unequal coupon probabilities.

Acknowledgement

The author would like to thank Dr. Daniel Berend for helpful comments on the paper.

Funding information

The research of J. C. Saunders is supported by an Azrieli International Postdoctoral Fellowship, as well as a Postdoctoral Fellowship at the University of Calgary.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

Baum, L. and Billingsley, P. (1965). Asymptotic distributions for the coupon collector’s problem. Ann. Math. Statist. 36, 1835–1839.CrossRef Google Scholar

Berenbrink, P. and Sauerwald, T. (2009). The weighted coupon collector’s problem and applications. In Computing and Combinatorics (LNCS 5609), ed H. Q. Ngo. Springer, Berlin, pp. 449–458.CrossRef Google Scholar

Doumas, A. and Papanicolaou, V. (2012). The coupon collector’s problem revisited: Asymptotics of the variance. Adv. Appl. Prob. 44, 166–195.CrossRef Google Scholar

Doumas, A. and Papanicolaou, V. (2016). The coupon collector’s problem revisited: Generalizing the double Dixie cup problem of Newman and Shepp. ESIAM Prob. Statist. 20, 367–399.CrossRef Google Scholar

Erdös, P. and Rényi, A. (1961). On a classical problem of probability theory. Magyar Tudományos Akadémia Matematikai Kutató Intézetének Közleményei 6, 215–220.Google Scholar

Foata, D. and Han, G. (2001). Les nombres hyperharmoniques et la fratrie du collectionneur de vignettes. Séminaire Lotharingien de Combinatoire 47, B47a.Google Scholar

Myers, A. and Wilf, H. (2006). Some new aspects of the coupon collector’s problem. SIAM Rev. 48, 549–565.10.1137/060654438CrossRef Google Scholar

Neal, P. (2008). The generalised coupon collector problem. J. Appl. Prob. 45, 621–629.CrossRef Google Scholar

Newman, D. and Shepp, L. (1960). The double Dixie cup problem. Amer. Math. Monthly 67, 58–61.10.2307/2308930CrossRef Google Scholar

Penrose, M. (2009). Normal approximation for isolated balls in an urn allocation model. Electron. J. Prob. 14, 2155–2181.CrossRef Google Scholar

Article contents

The number of K-tons in the coupon collector problem

Abstract

Keywords

MSC classification

1. Introduction

2. Proofs

3. Future work

Acknowledgement

Funding information

Competing interests

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests