Covariances in Pólya urn schemes

Hosam Mahmoud

doi:10.1017/S0269964821000450

Covariances in Pólya urn schemes

Part of: Limit theorems Special processes Combinatorial probability

Published online by Cambridge University Press: 08 October 2021

Hosam Mahmoud

Show author details

Hosam Mahmoud*: Affiliation:
Department of Statistics, The George Washington University, Washington, D.C. 20052, USA. E-mail: [email protected]

Article contents

Abstract
Introduction
Pólya urns
Notation
Covariance recursion
Growing tenable balanced urns
Limitations
Conflicts of interest
Footnotes
References

Rights & Permissions

Abstract

By now there is a solid theory for Polya urns. Finding the covariances is somewhat laborious. While these papers are “structural,” our purpose here is “computational.” We propose a practicable method for building the asymptotic covariance matrix in tenable balanced urn schemes, whereupon the asymptotic covariance matrix is obtained by solving a linear system of equations. We demonstrate the use of the method in growing tenable balanced irreducible schemes with a small index and in critical urns. In the critical case, the solution to the linear system of equations is explicit in terms of an eigenvector of the scheme.

Keywords

L2-limit theorems Pólya urn

MSC classification

Primary: 60C05: Combinatorial probability 60F25: $L^p$-limit theorems

Secondary: 60K40: Other physical applications of random processes

Type: Research Article
Information: Probability in the Engineering and Informational Sciences , Volume 37 , Issue 1 , January 2023 , pp. 60 - 71

DOI: https://doi.org/10.1017/S0269964821000450 [Opens in a new window]
Copyright: Copyright © The Author(s), 2021. Published by Cambridge University Press

1. Introduction

Pólya urns are a versatile modeling tool. They have been proposed as a backbone for myriad applications, including contagion (Eggenberger-Pólya urn [Reference Eggenberger and Pólya5]), gas diffusion (Ehrenfest urn [Reference Ehrenfest and Ehrenfest6]), and random data structures (Bagchi-Pal urn [Reference Bagchi and Pal2]). For a survey of urn models and applications, see [Reference Kotz and Balakrishnan15], and for a textbook discussion, see [Reference Johnson and Kotz14] or [Reference Mahmoud16]. It is important to understand the urn process. Indeed, many classic papers are focused on theoretical aspects of Pólya urn schemes. Over time, a comprehensive theory emerged for single-ball drawing [Reference Athreya and Karlin1,Reference Chauvin, Pouyanne and Sahnoun4,Reference Freedman7,Reference Janson10,Reference Smythe18].

One common feature of the various theoretical formulations is the complexity of the computation of the covariances among the number of balls of different colors in the urn. For instance, Smythe [Reference Smythe18] gives a central limit theorem for an extended class of Pólya urns in which the result he finds an asymptotic multivariate joint Gaussian law. This source does not specify the covariance matrix.

There are recent developments concerning this point [Reference Janson10,Reference Janson12,Reference Janson and Pouyanne13,Reference Pouyanne17]. These latter sources describe algorithms by which one gets the covariance matrix associated with a Pólya urn. However, these algorithms are rather elaborate and fall back on the deep structure of the Pólya urn scheme. While these papers are “structural,” our purpose here is “computational.” We propose a practicable method for building the asymptotic covariance matrix. While admitting random replacements, we can get the method to work for tenable balanced irreducible urns, when the so-called index is small or critical. We can even coax the method to deliver results for some tenable balanced reducible urn schemes, as we illustrate in Example 5.2. It appears that the method does not deliver for the so-called large-index schemes. Perhaps, it needs finer asymptotics that are not yet available in the literature of urns.

The method proposed in this manuscript is straightforward and does not go beyond solving a system of first-order algebraic equations. A rudiment of the method is discussed in [Reference Bhutani, Kalpathy and Mahmoud3], applied only to deterministic schemes with a small index, which appear in an application in hooking networks.

The method is based on an idea commonly used in calculus. If a limit exits, the value of the limit is determined by solving an equation satisfied by the limit. From the theoretical studies, the existence of such a limit is ascertained for many classes of urn schemes. This is discussed further in the sequel.

This note is organized in sections. In Section 2, we outline the mechanics of a Pólya urn scheme. In that section, three subsections (Subsections 2.1–2.3) are set aside for specifying the notions of tenability, balance and irreducibility and classes. In Section 3, we collect notation needed for the analysis. Subsection 3.1 goes briefly over a classification of urn scheme by their index. In Section 4, we go over the basic covariance recurrence from which we extract asymptotics. The small-index scheme and critical schemes are treated in two separate subsections (Subsections 5.1 and 5.2). Limitations of the method are discussed in Section 6.

2. Pólya urns

A multicolor Pólya urn is initially nonempty and experiences evolution according to some scheme of ball drawing and replacements. Up to $c$ ball colors may appear in the urn. Suppose we numbered the $c$ distinct colors with elements of the set $[c]=\{1, 2, \ldots , c\}$. When $c$ is small, we can name the colors, as in white, blue, red, green, etc. When needed, we shall always go through the palette in this order—color 1 is white, color 2 is blue, color 3 is red, and color 4 is green.

At the $n$th discrete point of time, a ball is drawn at random from the urn (all balls being equally likely) and its color is noted. If the color of the ball withdrawn is $i\in [c]$, we put it back in the urn and add $a_{i,j}(n)$ balls of color $j\in [c]$. The execution of the replacement rules are taken to be instantaneous. These dynamics are captured in a $c \times c$ replacement matrix:

$${{\mathbf{A}}}_n = \begin{pmatrix} a_{1,1}(n) & a_{1,2}(n) & \ldots & a_{1,c}(n)\\ a_{2,1}(n) & a_{2,2} (n) & \ldots & a_{2,c}(n)\\ \vdots & \vdots & \ddots & \vdots\\ a_{c,1}(n) & a_{c,2}(n) & \ldots & a_{c,c}(n) \end{pmatrix}.$$

The replacement matrix is independent of the composition of the urn at any previous stage. In this matrix, the rows are indexed with the color of the ball drawn; the element $a_{i,j}(n)$ is the number of balls of color $j\in [c]$ that we add upon drawing a ball of color $i\in [c]$. We alert the reader to that some authors prefer to present the urn dynamics via the transpose of ${{\mathbf {A}}}_n$, see [Reference Idriss9,Reference Janson12].

In a general setting, the entries of the replacement matrix can be negative or even random. We only consider the case in which $\{{{\mathbf {A}}}_n\}_{n=1}^{\infty }$ is a process of independent identically distributed matrices. We can assume that ${{\mathbf {A}}}_n$, for all $n \ge 1$, are distributed like a generic matrix ${{\mathbf {A}}}$ (unindexed).

We call the expectation ${{\mathbb {E}}}[{{\mathbf {A}}}_n]$ the generator. We denote the generator by $\bar {{\mathbf {A}}}_n$. Since we are considering independent identically distributed matrices, we have ${{\mathbb {E}}}[{{\mathbf {A}}}_n] = {{\mathbb {E}}}[{{\mathbf {A}}}] =: \bar {{\mathbf {A}}}$, for all $n\ge 1$. We assume throughout that all the entries of $\bar {{\mathbf {A}}}$ are finite.

2.1. Tenability

An urn scheme is called tenable if it is always possible to draw balls and execute the rules. In a tenable scheme, the urn never becomes empty and the scheme never calls for taking out balls of a color when there is not a sufficient number of balls of that color present in the urn. For instance, a scheme with nonnegative entries in ${{\mathbf {A}}}$ is tenable. No matter which stochastic path is followed, in a tenable scheme the drawing can be continued ad infinitum.

Some elements of ${{\mathbf {A}}}$ could be negative and the urn is still tenable, such as in Ehrenfest's urn scheme, with the generic replacement matrix $\left (\begin {smallmatrix} -1 & 1 \\ 1 & -1 \end {smallmatrix}\right )$, when such an urn starts out nonempty. For an initially nonempty urn, untenability arises only when some entries at certain positions of ${{\mathbf {A}}}$ are negative. An instance of this is an urn scheme on white and blue colors with the generic replacement matrix $\left (\begin {smallmatrix} 0 & -1 \\ -1 & 0 \end {smallmatrix}\right )$, which underlies the OK Corral gunfight. Starting with two white and five blue balls, the drawing comes to a halt after three blue draws or six white draws, whichever comes first.

2.2. Balance

An urn scheme is said to be balanced, if the number of balls added at each step is constant, say $\theta \ge 0$, (that is, $\sum _{j=1}^{c} a_{i,j}(n) =\theta$). The parameter $\theta$ is called the balance factor. Note that an urn scheme can be balanced, even when the replacement matrix contains random elements. Take, for example, the balanced scheme with the generic replacement matrix

$${{\mathbf{A}}} = \begin{pmatrix} Y & 1 & 8-Y\\ B & B & 9 - 2B\\ 2 & 4 & 3 \end{pmatrix},$$

where $Y$ is a binomial random variable on $4$ independent identically distributed trials, with success probability $\frac 1 2$ in each, and $B$ is a Bernoulli random variable with success probability $0.473$.

A theorem of Perron and Frobenius [Reference Horn and Johnson8] asserts that the principal eigenvalue (of largest real part) of a matrix of nonnegative entries is the maximal sum across any row. By the balance condition, all the entries on a row (even if they are random) are bounded. Furthermore, the tenability necessitates that if there are negative elements at all, they appear on the diagonal of the replacement matrix; all off-diagonal elements are nonnegative. Hence, for large enough $b>0$, the matrix $\bar {{\mathbf {A}}} + b{{\mathbf {I}}}$ is all nonnegative, with principal eigenvalue $\theta +b$. Thus, $\theta$ is a principal eigenvalue of $\bar {{\mathbf {A}}}$ and of $\bar {{\mathbf {A}}}^{T}$.

2.3. Classes and irreducibility

Using the language of [Reference Janson10], we say that color $i\in [c]$ dominates color $j\in [c]$, if a ball of color $j$ appears at some point in time (including time 0), when we start with only one ball of color $i$. We follow the notation of [Reference Janson10] in representing the relation $i$ dominates $j$ symbolically by $i \succ j$. If $i\succ j$ and $j\succ i$, we say $i$ and $j$ are in the same class. Since additionally $i\succ i$, and $i\succ j,\ j\succ k \implies i \succ k$, being in the same class is a reflexive, symmetric and transitive relation (i.e., an equivalence relation) and the class to which $i$ and $j$ belong is an equivalence class. So, the relation “dominates” creates a partition of the colors into equivalent classes.

When all the colors in $[c]$ fall in the same class, we have one part in the partition and say the scheme is irreducible. When color $i$ dominates every other color, we say color $i$ is dominating in the urn.

To illustrate these concepts, take a scheme with the replacement matrix

\begin{align*}\begin{pmatrix} -1 & 2 & 3 & 1\\ 0 & 5 & 0 & 0\\ 0 & 2 & 3 & 8\\ 0 & 3 & 1 & 3 \end{pmatrix}.\end{align*}

In this instance, color 1 is dominating in the urn, but none of the other colors is. In this unbalanced scheme, we have three classes, partitioning $[4]$ as $\{1\}, \{2\}$ and $\{3,4\}$.

3. Notation

We typeset matrices and vectors in boldface. Toward less indexing, we leave the matrices and vectors unsubscripted by dimensions, but if changing over time, they carry the timestamp as a subscript. For unsubscripted matrices and vectors, the dimension is surmised from the context. For instance, the identity matrix is ${{\mathbf {I}}}$, and ${{\mathbf 0}}$ is a matrix or a vector of zeros (the context will tell), with appropriate dimensions discernible from the situation. Vectors are one-column matrices, and the transpose of a matrix is shown via $T$ as a superscript. The notation $\mathbf{\sim }\ g(n)$, $\textbf {o}(g(n))$ and $\textbf {O}(g(n))$ is for a matrix or a vector in which all the components are, respectively, $\sim g(n)$, $o(g(n))$ and $O(g(n))$ in the usual sense.

The notation ${\textbf {Diag}}(y_1, \ldots , y_c)$ is the usual representation of a diagonal matrix, with the numbers $y_1, \ldots , y_c$ on the diagonal, and the off-diagonal elements are all 0. For a column vector ${{\mathbf {y}}} = (y_1, \ldots , y_c)^{T}$, the notation ${\textbf {Diag}}({{\mathbf {y}}})$ stands for the matrix

$${\textbf{Diag}}(y_1, \ldots, y_c) = \begin{pmatrix} y_1 & 0 & 0 & \ldots & 0\\ 0 & y_2 & 0 & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \ldots & y_c \end{pmatrix}$$

that puts the elements of ${{\mathbf {y}}}$ on the diagonal.

Let $T_n$ be the total number of balls in an urn. In a tenable balanced scheme with the balance factor $\theta$, we have

$$T_n = \theta n + T_0,$$

For tenability, the initial number of balls $T_0$ is at least 1.

Let $X_{n,i}$ be the number of balls of color $i\in [c]$ in the urn after $n$ draws, and let ${{\mathbf {X}}}_n$ be the composition vector with these components after $n$ draws. We denote the covariance matrix of ${{\mathbf {X}}}_n$ by ${{\mathbb {C}\textrm {ov}}}({{\mathbf {X}}}_n)$. We arrange the eigenvalues of the generator according to the decreasing order of real parts:

$$\Re(\lambda_1) \ge \Re ( \lambda_2) \ge \Re ( \lambda_3) \ge \cdots \ge \Re (\lambda_c).$$

Note that some eigenvalues may be repeated and complex. When $\lambda _1$ is unique (of multiplicity 1), we call it a principal eigenvalue. The corresponding eigenvector of ${{\mathbf {A}}}^{T}$ is called a principal eigenvector and is denoted by ${{\mathbf {v}}}_1$.

The principal eigenvector is scaled to be of $L^{1}$-norm equal to 1. This is always possible, since for a tenable balanced irreducible urn scheme the components of the principle eigenvector are all positive [Reference Janson10]. The eigenvector corresponding to $\lambda _2$ is scaled to have the top component equal to 1. This is always possible, since not all the components of ${{\mathbf {v}}}_2$ are 0, and we can take as color 1 the one corresponding to a nonezero component, then choose a scale that renders its value 1.

3.1. Classification by the index

For a tenable balanced urn, the ratio $\Re (\lambda _2)/ \lambda _1 =: \Lambda$ is called the urn index. There is a standard classification of urn schemes by the regimes of the index. When the index is less than $\frac 1 2$, the urn scheme is considered “small,” and when the index equals $\frac 1 2$, the scheme is “critical”; the scheme is “large” when its index is greater than $\frac 1 2$. The reason for this particular classification is that the urn behavior is different in each regime.

4. Covariance recursion

The next lemma requires only tenability and balance.

Lemma 4.1. In a tenable balanced Pólya urn scheme with generator $\bar {{\mathbf {A}}}$, the covariance of the composition vector ${{\mathbf {X}}}_n$ satisfies the recurrence

\begin{align*} {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] & = {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \bar {{\mathbf{A}}}^{T}\,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] \bar {{\mathbf{A}}} \\ & \quad + \frac 1 {T_{n-1}}\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbb{E}}}[X_{n-1,1}], \ldots, {{\mathbb{E}}}[X_{n-1,c}]) \bar {{\mathbf{A}}} \\ & \quad - \frac 1 {T_{n-1}^{2}} \bar {{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] \bar {{\mathbf{A}}}. \end{align*}

Proof. Let $\mathbb {I}_{n,i}$ be the indicator of the event of picking a ball of color $i$ in the $n$th draw, and let ${{\mathbf {J}}}_n$ be the vector with these components. For $i=1, \ldots , c$, the random variable $X_{n,i}$ satisfies the stochastic recurrence

$$X_{n,i} = X_{n-1, i} + a_{1,i}(n) \mathbb{I}_{n,1} + \cdots + a_{c,i}(n) \mathbb{I}_{n,c}.$$

Putting these $c$ equations in matrix form, we obtain

(1)

\begin{equation} {{\mathbf{X}}}_n = {{\mathbf{X}}}_{n-1} + {{\mathbf{A}}}^{T}(n) {{\mathbf{J}}}_n. \end{equation}

The covariance matrix ${{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n]$ is ${{\mathbb {E}}}[{{\mathbf {X}}}_n{{\mathbf {X}}}_n^{T}] - {{\mathbb {E}}}[{{\mathbf {X}}}_n] \, {{\mathbb {E}}}[{{\mathbf {X}}}_n^{T}]$. The strategy in the proof is to develop a recurrence for ${{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n]$. Take the transpose of (1), and multiply the resulting equation by (1) to get

$${{\mathbf{X}}}_n{{\mathbf{X}}}_n^{T} = {{\mathbf{X}}}_{n-1}{{\mathbf{X}}}_{n-1}^{T} + {{\mathbf{A}}}^{T}(n) {{\mathbf{J}}}_n {{\mathbf{X}}}_{n-1}^{T} + {{\mathbf{X}}}_{n-1} {{\mathbf{J}}}_n^{T} {{\mathbf{A}}}(n) + {{\mathbf{A}}}^{T}(n) {{\mathbf{J}}}_n {{\mathbf{J}}}_n^{T}{{\mathbf{A}}}(n).$$

For $i \not = j$, the indicators $\mathbb {I}_{n,i}$ and $\mathbb {I}_{n,j}$ are mutually exclusive. So, the product ${{\mathbf {J}}}_n {{\mathbf {J}}}_n^{T}$ is the diagonal matrix ${\textbf {Diag}}(\mathbb {I}_{n,1}, \ldots , \mathbb {I}_{n,c}) = {\textbf {Diag}}({{\mathbf {J}}}_n)$. We now have

\begin{align*} {{\mathbb{E}}}[{{\mathbf{X}}}_n{{\mathbf{X}}}_n^{T}] - {{\mathbb{E}}}[{{\mathbf{X}}}_n]\, {{\mathbb{E}}}[{{\mathbf{X}}}_n^{T}] & = {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}{{\mathbf{X}}}_{n-1}^{T} + {{\mathbf{A}}}^{T}(n) {{\mathbf{J}}}_n {{\mathbf{X}}}_{n-1}^{T} + {{\mathbf{X}}}_{n-1} {{\mathbf{J}}}_n^{T}] {{\mathbf{A}}}(n) \\ & \quad + {{\mathbf{A}}}^{T}(n)\,{\textbf{Diag}}({{\mathbf{J}}}_n^{T}){{\mathbf{A}}}(n) \\ & \quad - {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1} + {{\mathbf{A}}}^{T}(n) {{\mathbf{J}}}_n] \, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T} + {{\mathbf{J}}}_n^{T} {{\mathbf{A}}}(n)]. \end{align*}

Using the independence of ${{\mathbf {A}}}(n)$ and ${{\mathbf {X}}}_{n-1}$, we collect the terms of the covariance toward a recursion:

(2)

\begin{align} {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] & = {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \bar {{\mathbf{A}}}^{T}{{\mathbb{E}}}[{{\mathbf{J}}}_n {{\mathbf{X}}}_{n-1}^{T}] + {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1} {{\mathbf{J}}}_n^{T}] \bar {{\mathbf{A}}} \nonumber\\ & \quad + \bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbb{E}}}[{{\mathbf{J}}}_n]) \bar {{\mathbf{A}}} - {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{J}}}_n^{T}] \bar {{\mathbf{A}}} \nonumber\\ & \quad - \bar {{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{J}}}_n] \, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] - \bar {{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{J}}}_n]\, {{\mathbb{E}}}[{{\mathbf{J}}}_n^{T}] \bar {{\mathbf{A}}}. \end{align}

We get the terms involving ${{\mathbf {J}}}_n$ by conditioning on ${{\mathbb {F}}}_{n-1}$, the sigma field generated by the first $n-1$ draws:

$${{\mathbb{E}}}[{{\mathbf{J}}}_n\, \vert \, {{\mathbb{F}}} _{n-1}] = \frac 1 {T_{n-1}} {{\mathbf{X}}}_{n-1}.$$

An iterated expectation gives ${{\mathbb {E}}}[{{\mathbf {J}}}_n] = (1/{T_{n-1}})\, {{\mathbb {E}}}[{{\mathbf {X}}}_{n-1}]$. In the same manner, we have

$${{\mathbb{E}}}[{{\mathbf{J}}}_n {{\mathbf{X}}}_{n-1}^{T}\, \vert \, {{\mathbb{F}}}_{n-1}] = {{\mathbb{E}}}[{{\mathbf{J}}}_n \, \vert \, {{\mathbb{F}}}_{n-1}] {{\mathbf{X}}}_{n-1}^{T} = \frac 1 {T_{n-1}}{{\mathbf{X}}}_{n-1} {{\mathbf{X}}}_{n-1}^{T} .$$

An iterated expectation gives ${{\mathbb {E}}}[{{\mathbf {J}}}_n {{\mathbf {X}}}_{n-1}^{T}] = (1/ {T_{n-1}})\,{{\mathbb {E}}}[{{\mathbf {X}}}_{n-1} {{\mathbf {X}}}_{n-1}^{T}]$. Similarly, we get ${{\mathbb {E}}}[{{\mathbf {X}}}_{n-1} {{\mathbf {J}}}_n ^{T}] = (1/{T_{n-1}})\,{{\mathbb {E}}}[{{\mathbf {X}}}_{n-1} {{\mathbf {X}}}_{n-1}^{T}]$. Plugging this back into (2), we get

\begin{align*} {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n]& = {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \bar {{\mathbf{A}}}^{T}{{\mathbb{E}}}[{{\mathbf{X}}}_{n-1} {{\mathbf{X}}}_{n-1}^{T}] + \frac 1 {T_{n-1}} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1} {{\mathbf{X}}}_{n-1}^{T}] \bar {{\mathbf{A}}} \\ & \quad + \frac 1 {T_{n-1}}\bar{{\mathbf{A}}}^{T}\,{\textbf{Diag}}({{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]) \bar{{\mathbf{A}}} \\ & \quad - \frac 1 {T_{n-1}} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] \bar{{\mathbf{A}}} - \frac 1 {T_{n-1}}\bar {{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}] \, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] \\ & \quad - \frac 1 {T_{n-1}^{2}} \bar{{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] \bar{{\mathbf{A}}} \\ & = {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \bar {{\mathbf{A}}}^{T}\,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] \bar {{\mathbf{A}}} \\ & \quad + \frac 1 {T_{n-1}}\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}])\bar {{\mathbf{A}}} \\ & \quad - \frac 1 {T_{n-1}^{2}} \bar{{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}]\bar {{\mathbf{A}}} . \end{align*}

5. Growing tenable balanced urns

Suppose the balance factor is $\theta > 0$. In such a case, a positive number of balls is added after each drawing. The urn size grows without limit. Janson [Reference Janson12] specifies sufficient conditions for small-index schemes under which ${{\mathbb {C}\textrm {ov}}}({{\mathbf {X}}}_n)$ is $\textbf {O}(n)$. Under the same conditions, a critical urn has a covariance matrix that is $\textbf {O}(n\ln n)$. These conditions are:

(1) The urn scheme is tenable.
(2) ${{\mathbb {E}}}[a_{i,j}^{2}(n)] < \infty$.
(3) $\lambda _1$ is real and simple.
(4) There is a dominating color and there are initially balls of that color in the urn.
(5) $\lambda _1$ belongs to a dominating class.
(6) The index $\Lambda = {\Re (\lambda _2)}/{\lambda _1}$ is at most $\frac 1 2$.

We call these conditions Janson's conditions. We deal only with tenable balanced urns. So conditions (1)–(2) are always satisfied, since in a balanced urn scheme the variables in the generic matrix are all bounded. The rest of the conditions require a special generator structure and special initial conditions adapted to tenability. Condition (5) means that the principal eigenvalue is determined by the submatrix remaining after removing the rows and columns corresponding to nondominant colors. Unless otherwise is stated explicitly, in what follows, we assume the scheme satisfies Janson's conditions (1)–(6).

5.1. Schemes with a small index

According to [Reference Janson12], for a tenable balanced scheme, when the balance factor is positive and the index is small ($\Lambda = {\Re ( \lambda _2)}/{\lambda _1} < \frac 1 2$), it follows that $(1/n)\,{{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n]$ converges to a $c\times c$ limit matrix, ${{\mathbf \Sigma }}$. In other words, we have

(3)

\begin{equation} {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] = n {{\mathbf \Sigma}}+ \textbf{R}_n, \end{equation}

where the remainder $\textbf {R}_n$ is a $c\times c$ matrix in which all the components are $o(n)$. The methods used in [Reference Janson10,Reference Janson12,Reference Pouyanne17] do not specify the rate of convergence.

Theorem 5.1. In a growing tenable balanced Pólya urn scheme (with the balance factor $\theta$) with a small index satisfying Janson's conditions, the covariance matrix is asymptotically given by ${{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n] \sim n {{\mathbf \Sigma }}$, and the limiting (matrix) coefficient ${{\mathbf \Sigma }}$ of linearity solves the equation

$$\theta\, {{\mathbf \Sigma}} = \bar {{\mathbf{A}}}^{T} {{\mathbf \Sigma}} + {{\mathbf \Sigma}}\bar {{\mathbf{A}}} +\theta \bar{{\mathbf{A}}}^{T}\, {\textbf{Diag}}({{\mathbf{v}}}_1) \bar{{\mathbf{A}}} - \theta^{3} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T},$$

where ${{\mathbf {v}}}_1$ is the principle eigenvector of $\bar {{\mathbf {A}}}^{T}$.

Proof. The urn index is small. That is $\Lambda = \Re ( \lambda _2)/\lambda _1 < \frac 1 2$. We use $1/T_n = (1/{\theta n}) (1 +O(1/n))$, and the known average relation ${{\mathbb {E}}}[{{\mathbf {X}}}_n] = \theta n {{\mathbf {v}}}_1 + \textbf {O} (n^{\Lambda })$, see [Reference Janson10]. By Lemma 4.1, these facts give rise to the asymptotic equation

\begin{align*} \lim_{n\to\infty} ( {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n]- {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] ) & = \frac 1 {\theta} \bar {{\mathbf{A}}}^{T} {{\mathbf \Sigma}} +\frac 1 {\theta} {{\mathbf \Sigma}} \bar {{\mathbf{A}}} \\ & \quad + \bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbf{v}}}_1) \bar{{\mathbf{A}}} - \bar {{\mathbf{A}}}^{T} {{\mathbf{v}}}_1{{\mathbf{v}}}_1^{T} \bar {{\mathbf{A}}}; \end{align*}

with the right-hand side being merely a constant, such a limit exists. As $\lambda _1$ is a unique principal eigenvalue, we have $\bar {{\mathbf {A}}}^{T} {{\mathbf {v}}}_1 = \lambda _1 {{\mathbf {v}}}_1= \theta {{\mathbf {v}}}_1$. From (3), the argument of the limit is $(n{{\mathbf \Sigma }} +\textbf {R}_n) - ((n-1){{\mathbf \Sigma }} +\textbf {R}_{n-1})$. We can write the latter equation as

\begin{align*} \lim_{n\to\infty} ( {{\mathbf{R}}}_n-{{\mathbf{R}}}_{n-1}) & ={-} {{\mathbf \Sigma}} + \frac 1 \theta \bar {{\mathbf{A}}}^{T} {{\mathbf \Sigma}} +\frac 1\theta {{\mathbf \Sigma}} \bar{{\mathbf{A}}} + \bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbf{v}}}_1) \bar {{\mathbf{A}}} - \theta^{2} {{\mathbf{v}}}_1{{\mathbf{v}}}_1^{T}. \end{align*}

This asserts that $\lim _{n\to \infty } ( {{\mathbf {R}}}_n-{{\mathbf {R}}}_{n-1})$ exists. Suppose we called this limit $\textbf {b}$. This limiting equation implies that

$${{\mathbf{R}}}_n = {{\mathbf{R}}}_{n-1} + \textbf{b} + \textbf{o}(1).$$

We conclude that $\textbf {b} =\textbf {0}$, for if it were not, iterating the recurrence for ${{\mathbf {R}}}_n$ would produce $n$ as the exact order for ${{\mathbf {R}}}_n$, contradicting the actual $\textbf {o} (n)$ order.

Example 5.1 (Friedman urn)

Consider Friedman's urn on white and blue balls, with the generic symmetric replacement matrix ${{\mathbf {A}}} =\left (\begin {smallmatrix} a & b\\ b & a \end {smallmatrix}\right )$, with $-\infty < a < 3b$. For tenability, if $a < 0$, both $b$ and the initial number of white and blue balls must be multiples of $|a|$.

Let $W_n$ and $B_n$ be, respectively, the number of white and blue balls after $n$ draws. With $T_n = W_n + B_n$ (deterministic), we have ${{\mathbb {V}\textrm {ar}}}[W_n] = {{\mathbb {V}\textrm {ar}}}[B_n] = - {{\mathbb {C}\textrm {ov}}}[W_n, B_n]$. The (matric) coefficient of linearity has the form ${\bf \Sigma} =\left (\begin {smallmatrix} x & -x\\ -x & x \end {smallmatrix}\right )$. In view of this perfect symmetry, we have ${{\mathbf {A}}}{{\mathbf \Sigma }} = {{\mathbf \Sigma }}{{\mathbf {A}}}$.

The eigenvalues of ${{\mathbf {A}}}^{T} = {{\mathbf {A}}}$ are $\lambda _1=a+b = \theta$ and $\lambda _2 = a-b$, and the principal eigenvector ${{\mathbf {v}}}_1$ of ${{\mathbf {A}}}^{T}$ is $\frac 1 2\left (\begin {smallmatrix}1\\ 1 \end {smallmatrix}\right )$. With $a < 3b$, we have $\lambda _2 < \frac 1 2 \lambda _1$. According to Theorem 5.1, the limit of $(1/n)\,{{\mathbb {C}\textrm {ov}}}[W_n, B_n]$ solves the equation stated there. With ${{\mathbf {A}}}{{\mathbf \Sigma }} = {{\mathbf \Sigma }}{{\mathbf {A}}}$, the equation simplifies to

$$\theta {{\mathbf \Sigma}} = 2 {{\mathbf{A}}}{{\mathbf \Sigma}} + \theta {{\mathbf{A}}} \,{\textbf{Diag}}({{\mathbf{v}}}_1) {{\mathbf{A}}} - \theta^{3} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T},$$

with the solution

\begin{align*} {{\mathbf \Sigma}} & = \theta(\theta\,{{\mathbf{I}}} - 2 {{\mathbf{A}}})^{{-}1} ({{\mathbf{A}}} \,{\textbf{Diag}}({{\mathbf{v}}}_1){{\mathbf{A}}} - \theta^{2} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T}) \\ & = (a+b)\left((a+b)\begin{pmatrix}1 & 0\\ 0 & 1 \end{pmatrix} -2 \begin{pmatrix} a & b\\ b & a \end{pmatrix}\right)^{{-}1} \\ & \quad \times\left( \begin{pmatrix}a & b\\ b & a\end{pmatrix} \begin{pmatrix}\frac 1 2 & 0\\ 0 & \frac 1 2\end{pmatrix} \begin{pmatrix}a & b\\ b & a\end{pmatrix} - (a+b)^{2} \begin{pmatrix}\frac 1 2 \\ \frac 1 2\end{pmatrix}\left(\frac 1 2\quad \frac 1 2\right) \right) \\ & = \frac {(b-a)^{2}(a + b)} {4(3b-a)} \begin{pmatrix}1 & -1\\ -1 & 1 \end{pmatrix}. \end{align*}

This result is obtained in [Reference Freedman7]; see also Example (2.4) in [Reference Janson10].

Example 5.2. Consider a scheme with the generic replacement matrix ${{\mathbf {A}}} =\left (\begin {smallmatrix} 1 & 1 & 3\\ 0 & 2 & 3\\ 0 & 4 & 1 \end {smallmatrix}\right )$. This replacement matrix has two classes in it, which are $\{1\}$ and $\{2,3\}$.

Let $W_n, B_n$ and $R_n$ be, respectively, the number of white, blue and red balls after $n$ draws. The eigenvalues of ${{\mathbf {A}}}^{T}$ are $\lambda _1=5$, $\lambda _2 = 1$ and $\lambda _3 = -2$. The principal eigenvector of ${{\mathbf {A}}}^{T}$ is $(0, \frac 4 7, \frac 3 7)^{T}$.

We have $\lambda _2 < \frac 1 2 \lambda _1$. According to Theorem 5.1, the limit of $(1/n)\, {{\mathbb {C}\textrm {ov}}}[W_n, B_n, R_n]$ solves the equation stated there:

$$5{{\mathbf \Sigma}} = \begin{pmatrix} 1 & 0 & 0\\ 1 & 2 & 4\\ 3 & 3 & 1 \end{pmatrix}{{\mathbf \Sigma}} + {{\mathbf \Sigma}} \begin{pmatrix} 1 & 1 & 3\\ 0 & 2 & 3\\ 0 & 4 & 1 \end{pmatrix} + \frac {240} {49}\begin{pmatrix} 0 & 0 & 0\\ 0 & 1 & -1\\ 0 & -1 & 1 \end{pmatrix} ,$$

with the solution

$${{\mathbf \Sigma}} = \frac {80} {147} \begin{pmatrix}0 & 0 & 0\\ 0 & 1 & -1\\ 0 & -1 & 1 \end{pmatrix}.$$

Interpretation 5.1. Suppose in Example 5.2, we combined the blue and red balls and painted them violet. We would then have an urn scheme on white and violet balls, with the replacement matrix $\left (\begin {smallmatrix} 1 & 4\\ 0 & 5 \end {smallmatrix}\right )$; the bottom row represents the additions upon drawing violet. White balls in this coupling behave as in the original 3-color scheme. This is a triangular scheme. Hence, the variance of the number of blue balls is sublinear [Reference Janson11,Reference Zhang, Chen and Mahmoud19], and the variance of the number of white balls and the covariances involving white are sublinear, too. In the original 3-color scheme, the variance of the number of white balls, as well as covariances involving white, are sublinear. Normalizing by $n$ kills all variances and covariances involving white in the $3\times 3$ covariance matrix.

5.2. Schemes with a critical index

According to [Reference Janson12], for a growing tenable balanced critical scheme ($\Lambda = \Re (\lambda _2)/ \lambda _1 = \frac 1 2$) satisfying Janson's conditions, the scaled covariance matrix $(1/{n\ln n})\, {{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n]$ converges to a $c\times c$ limit matrix, ${{\mathbf \Sigma }}$. Note that the normalizing factor is not linear as in the case of a small index. So, we have

(4)

\begin{equation} {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] = n\ln n {{\mathbf \Sigma}} + \textbf{R}_n, \end{equation}

with ${{\mathbf {R}}}_n = o(n \ln n)$.

Theorem 5.2. Consider a growing tenable balanced (with a balance factor $\theta$) irreducible critical Pólya urn scheme satisfying Janson's conditions. The covariance matrix is asymptotically given by

$${{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] = h\, n \ln n {{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} + \textbf{O} (\ln n),$$

where the constant $h$ is the top left element of $\bar {{\mathbf {A}}}^{T}\, {\textbf {Diag}}({{\mathbf {v}}}_1) \bar {{\mathbf {A}}} - \theta ^{2} {{\mathbf {v}}}_1 {{\mathbf {v}}}_1^{T}$.

Proof. Scale the equation in Lemma 4.1 with $(\ln n)^{-1}$. We use $1/T_n = (1/{\lambda _1 n}) (1 +O(1/n))$, and the known average relation ${{\mathbb {E}}}[{{\mathbf {X}}}_n] = \lambda _1 n {{\mathbf {v}}}_1 + \textbf {O} (n^{\Lambda }) = \lambda _1 n {{\mathbf {v}}}_1 + \textbf {O} (n^{\frac 1 2})$, see [Reference Janson10]. Equipped with these facts, Lemma 4.1 ascertains that

$$\lim_{n\to\infty} \frac 1 {\ln n}({{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_n] - {{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] ) = \frac 1 {\lambda_1}{{\mathbf{A}}}^{T} {{\mathbf \Sigma}} + \frac 1 {\lambda_1} {{\mathbf \Sigma}}{{\mathbf{A}}} + {{{\mathbf 0}}}=: \textbf{K}.$$

From (3), the argument of the limit is

\begin{align*} & \frac 1 {\ln n} ((n\ln n {{\mathbf \Sigma}} +\textbf{R}_n) - ((n-1)\ln (n-1){{\mathbf \Sigma}} +\textbf{R}_{n-1}))\\ & \quad = \frac 1 {\ln n} \left({-}n \ln \frac {n-1} n + \ln (n-1)\right){{\mathbf \Sigma}} +\frac {\textbf{R}_n -\textbf{R}_{n-1}} {\ln n} \\ & \quad = \frac 1 {\ln n} (\ln n + O(1)){{\mathbf \Sigma}} +\frac {\textbf{R}_n -\textbf{R}_{n-1}} {\ln n} . \end{align*}

So, we have

$$\lim_{n\to\infty} \frac 1 {\ln n}( {{\mathbf{R}}}_n-{{\mathbf{R}}}_{n-1})= \textbf{K} - \boldsymbol{\Sigma}.$$

This limiting equation implies that

$${{\mathbf{R}}}_n = {{\mathbf{R}}}_{n-1} + (\textbf{K} - \boldsymbol{\Sigma}) {\ln n} + \textbf{o}(\ln n).$$

We conclude that $\textbf {K} - \boldsymbol {\Sigma } =\textbf {0}$, for if it were not, iterating the recurrence for ${{\mathbf {R}}}_n$ would produce $n \ln n$ as the exact order for ${{\mathbf {R}}}_n$, contradicting its actual $\textbf {o} (n\ln n)$ order. We have thus established that the (matrix) coefficient in (4) satisfies the equation

(5)

\begin{equation} \lambda_1 \boldsymbol{\Sigma} = {\bar{\mathbf{A}}}^{T}\boldsymbol{\Sigma} + \boldsymbol{\Sigma}{\bar{\mathbf{A}}}. \end{equation}

In view of the criticality of the urn, for any $h\in \mathbb {R}$, the matrix $h {{\mathbf {v}}}_2{{\mathbf {v}}}_2^{T}$ satisfies this equation which on the right-hand side gives

$${\bar{\mathbf{A}}}^{T}\boldsymbol{\Sigma} + \boldsymbol{\Sigma}{\bar{\mathbf{A}}} = h {\bar{\mathbf{A}}}^{T}{{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} + h {{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T}{\bar{\mathbf{A}}} = h \lambda_2 {{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} + h \lambda _2{{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} = h \lambda _1{{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T}.$$

Having determined the right asymptotic form of the covariance matrix, we next determine the scale $h$. Write ${{\mathbb {C}\textrm {ov}}}({{\mathbf {X}}}_n) = (h n\ln n + r_n){{\mathbf {v}}}_2 {{\mathbf {v}}}_2^{T}$, for a scalar error term as an ansatz. We next verify that this form asymptotically satisfies the recurrence.

With this ansatz, we have

\begin{align*} {{\mathbb{C}\textrm{ov}}}({{\mathbf{X}}}_n) - {{\mathbb{C}\textrm{ov}}}({{\mathbf{X}}}_{n-1}) & = (h n\ln n + r_n) {{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} \\ & \quad - (h(n-1)\ln (n-1) + r_{n-1}) {{\mathbf{v}}}_2 {{\mathbf{v}}}_2^{T} \\ & = (h\ln (n-1)+ h + r_n - r_{n-1}) {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + \textbf{O} \left(\frac 1 n\right). \end{align*}

On the right-hand side, we have

\begin{align*} & \frac 1 {T_{n-1}} \bar {{\mathbf{A}}}^{T}\,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] + \frac 1 {T_{n-1}} \,{{\mathbb{C}\textrm{ov}}}[{{\mathbf{X}}}_{n-1}] \bar {{\mathbf{A}}} + \frac 1 {T_{n-1}}\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}])\bar {{\mathbf{A}}} \\ & \qquad - \frac 1 {T_{n-1}^{2}} \bar{{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}]\bar {{\mathbf{A}}} \\ & \quad = \frac {h (n-1)\ln (n-1) +r_{n-1}} {T_{n-1}} (\bar {{\mathbf{A}}}^{T}{{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} \bar {{\mathbf{A}}})+ {{\mathbf{B}}}_n \\ & \quad= \frac {h (n-1)\ln (n-1) +r_{n-1}} {T_{n-1}} \, (2\lambda_2{{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T}) + {{\mathbf{B}}}_n\\ & \quad= h \ln (n-1){{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + \textbf{O} \left(\frac {\ln n} n\right) + \frac {\lambda_1 r_{n-1}} {T_{n-1}} {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + {{\mathbf{B}}}_n, \end{align*}

where we used the criticality in $2\lambda _2= \lambda _1$, the fact that ${{\mathbf {v}}}_2$ is the eigenvector corresponding to $\lambda _2$, and collected the other terms in ${{\mathbf {B}}}_n$, for which we have an asymptotic equivalent:

\begin{align*} {{\mathbf{B}}}_n & = \frac 1 {T_{n-1}}\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]) \bar {{\mathbf{A}}} - \frac 1 {T_{n-1}^{2}} \bar{{\mathbf{A}}}^{T} {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}]\, {{\mathbb{E}}}[{{\mathbf{X}}}_{n-1}^{T}] \bar {{\mathbf{A}}} \\ & = \frac 1 {\lambda_1 (n-1) + T_0}\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}(\lambda_1 {{\mathbf{v}}}_1 n + \textbf{O} (\sqrt n))\bar {{\mathbf{A}}}\\ & \quad - \frac 1 {(\lambda_1 (n-1) + T_0)^{2}} \bar{{\mathbf{A}}}^{T} (\lambda_1{{\mathbf{v}}}_1 n + \textbf{O} (\sqrt n)) (\lambda_1{{\mathbf{v}}}_1^{T} n + \textbf{O} (\sqrt n)) \bar {{\mathbf{A}}}\\ & = \bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbf{v}}}_1) \bar {{\mathbf{A}}} - \bar {{\mathbf{A}}}^{T} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T} \bar{{\mathbf{A}}} + \textbf{O} \left(\frac 1 {\sqrt n}\right)\\ & =: {{\mathbf{B}}} + \textbf{O} \left( \frac 1 {\sqrt n}\right). \end{align*}

Putting the parts together, under such an ansatz we get

\begin{align*} & (h\ln (n-1) + h + r_n - r_{n-1}) {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + \textbf{O} \left(\frac 1 n\right)\\ & \quad = h \ln (n-1){{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + \textbf{O} \left(\frac {\ln n} n\right) + \frac {\lambda_1 r_{n-1}} {T_{n-1}} {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + {{\mathbf{B}}} + \textbf{O} \left( \frac 1 {\sqrt n}\right). \end{align*}

We see the cancellation of $\ln (n-1)$ on both sides, and after rearrangement the ansatz takes the form

\begin{align*} r_n {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} & = \left(1 + \frac {\lambda_1} {T_{n-1}} \right)\, r_{n-1} {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + {{\mathbf{B}}} - h {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T}+ \textbf{O} \left(\frac {\ln n} n\right)\\ & = \frac {T_n} {T_{n-1}} r_{n-1} {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T} + {{\mathbf{B}}} - h {{\mathbf{v}}}_2{{\mathbf{v}}}_2^{T}+ \textbf{O} \left(\frac {\ln n} n\right). \end{align*}

The ansatz is true, if $r_n$ is a function that solves the last recurrence. Extract the top left component in this matric recurrence. Write its recurrence as

\begin{align*} \frac { r_n} {T_n} = \frac {r_{n-1}} {T_{n-1}} + \frac {b_{1,1}-h} {T_n} + O \left(\frac {\ln n} {n^{2}}\right), \end{align*}

where $b_{1,1}$ is the top left element in ${{\mathbf {B}}}$. Unwinding the recurrence, we find $r_n = (b_{1,1} - h) n\ln n + O(\ln n)$. (Recall that the top component of ${{\mathbf {v}}}_2$ is 1.) This remainder would be of order $n \ln n$, unless $h = b_{1,1}$. We assumed $r_n$ to be $o(n\ln n)$. So, $h$ must be the top left element of

$${{\mathbf{B}}}=\bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbf{v}}}_1) \bar {{\mathbf{A}}} - \bar {{\mathbf{A}}}^{T} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T} = \bar {{\mathbf{A}}}^{T} \,{\textbf{Diag}}({{\mathbf{v}}}_1) \bar {{\mathbf{A}}} - \theta^{2} {{\mathbf{v}}}_1 {{\mathbf{v}}}_1^{T},$$

and $r_n = O(\ln n)$.

Example 5.3 (Critical Bagchi-Pal urn)

Consider the urn scheme with the generic replacement matrix ${{\mathbf {A}}} = \left (\begin {smallmatrix} a & b\\ \frac {a-b} 2 & \frac {a+3b} 2 \end {smallmatrix}\right )$, for some integers $a > b > 0$, with $a-b$ even.

Let $W_n$ and $B_n$ be, respectively, the number of white and blue balls after $n$ draws. The eigenvalues of ${{\mathbf {A}}}^{T}$ are $\lambda _1=a+b$ and $\lambda _2 = (a+b)/2$, and the principal eigenvector of ${{\mathbf {A}}}^{T}$ is $(1/{(a+b)})\left (\begin {smallmatrix}a-b\\ 2b \end {smallmatrix}\right )$, and the eigenvector of ${{\mathbf {A}}}^{T}$ corresponding to $\lambda _2$ is $\left (\begin {smallmatrix}1\\ -1 \end {smallmatrix}\right )$.

According to Theorem 5.2, we have

$$\boldsymbol{\Sigma} = h \begin{pmatrix} 1\\ -1 \end{pmatrix} (1\quad - 1) = h \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix},$$

and $h$ is the top left component in the matrix

\begin{align*}\begin{pmatrix} a & \dfrac {a-b} 2\\ b & \dfrac {a+3b} 2 \end{pmatrix} \begin{pmatrix} \dfrac {a-b}{a+b} & 0\\ 0 & \dfrac {2b}{a+b} \end{pmatrix} \begin{pmatrix} a & b\\ \dfrac {a-b} 2 & \dfrac {a+3b} 2 \end{pmatrix} - (a+b)^{2} \begin{pmatrix} \dfrac {a-b}{a+b}\\ \dfrac {2b}{a+b} \end{pmatrix} \left(\frac {a-b}{a+b}\quad \frac {2b}{a+b}\right);\end{align*}

that is, $h = \frac 1 2 b(a-b)$. So, we have

$${{\mathbb{C}\textrm{ov}}}[W_n , B_n] = \frac 1 2 b(a-b) \begin{pmatrix} 1 & -1\\ -1 & 1 \end{pmatrix} + \textbf{O} (\ln n).$$

This recovers the result in [Reference Bagchi and Pal2], where it is derived without the lower order term by an exceptionally long procedure.

Example 5.4 (A critical urn with random entries)

Consider an urn scheme with replacement matrix

$${{\mathbf{A}}}_n =\begin{pmatrix} 5 & Y & 3 -Y\\ 2 & 4 & 2\\ 1 & 2 & 5 \end{pmatrix},$$

where $Y$ is a binomial random variable on three independent identically distributed trials, with success probability $\frac 2 3$ in each.

The generator of this scheme is $\bar {{\mathbf {A}}} = \left (\begin {smallmatrix} 5 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 5 \end {smallmatrix}\right )$. The eigenvalues of the generator are $\lambda _1 = 8, \lambda _2 = 4$, and $\lambda _3 = 2$. The eigenvectors corresponding to $\lambda _1$ and $\lambda _2$ are, respectively, $\frac 1 3 \left (\begin {smallmatrix} 1\\ 1\\ 1 \end {smallmatrix}\right )$ and $\left (\begin {smallmatrix} 1\\ 0\\ -1 \end {smallmatrix}\right )$. Note that we followed the normalization and directions of the eigenvectors as mentioned in Section 3.

Let $W_n, B_n$ and $R_n$ be, respectively, the number of white, blue and red balls after $n$ draws.

According to Theorem 5.2, we have

$$\boldsymbol{\Sigma} = h \begin{pmatrix} 1\\ 0\\ -1 \end{pmatrix} (1\quad 0\quad - 1) = h \begin{pmatrix} 1 & 0 & -1\\ 0 & 0 & 0\\ -1 & 0 & 1 \end{pmatrix} ,$$

and $h$ is the top left component in the matrix

$$\frac 1 3 \begin{pmatrix} 5 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 5 \end{pmatrix} \begin{pmatrix} 1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 5 & 2 & 1\\ 2 & 4 & 2\\ 1 & 2 & 5 \end{pmatrix}- \frac {64} 9\begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix} (1\quad 1 \quad 1);$$

that is, $h = \frac {26} 9$. Hence, we have

$${{\mathbb{C}\textrm{ov}}} [W_n, B_n, R_n] = \frac {26} 9 \begin{pmatrix} 1 & 0 & -1\\ 0 & 0 & 0\\ -1 & 0 & 1 \end{pmatrix} n \ln n + \textbf{O}(\ln n).$$

Interpretation 5.2. Suppose in Example 5.4 we combined white and red balls and painted them pink. We would then have an urn scheme on pink and blue balls, with the generator $\left (\begin {smallmatrix} 6 & 2\\ 4 & 4 \end {smallmatrix}\right )$; the top row represents the average additions upon drawing pink. Blue balls in this coupling behave as in the original 3-color scheme. This is a scheme with a small index (the eigenvalues are 8 and 2). Hence, the variance of the number of blue balls is linear (cf. Theorem 5.1), and so is the covariance between the number of pink and blue balls. That is why in the original 3-color scheme, the variance of the number of blue balls is linear and the covariances involving blue are linear, too. Normalizing by $n\ln n$ kills all variances and covariances involving blue in the $3\times 3$ covariance matrix.

6. Limitations

The key to success in the cases of small index and critical urns is that the differencing of ${{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_n]$ and ${{\mathbb {C}\textrm {ov}}}[{{\mathbf {X}}}_{n-1}]$ produces a small remainder and information can be extracted from the toll function on the right-hand side. Variances in the large-index case are superlinear of order $n^{2\Lambda }$, with $2\Lambda > 1$. Differencing then produces a large error that takes over the toll function, and we cannot extract useful information.

Acknowledgment

The author is grateful for comments on the scope of this research by Svante Janson. The author also thanks Shuyang Gao for a careful reading of the manuscript.

7. Conflicts of interest

The authors declare no conflict of interest.

Footnotes

To Robert Smythe, a mentor, coauthor and friend, on his 80th birthday

References

Athreya, K. & Karlin, S. (1968). Embedding of urn schemes into continuous time Markov branching processes and related limit theorems. The Annals of Mathematical Statistics 39: 1801–1817.10.1214/aoms/1177698013CrossRef Google Scholar

Bagchi, A. & Pal, A. (1985). Asymptotic normality in the generalized Pólya-Eggenberger urn model with applications to computer data structures. SIAM Journal on Algebraic and Discrete Methods 6: 394–405.10.1137/0606041CrossRef Google Scholar

Bhutani, K., Kalpathy, R., & Mahmoud, H. (2021+). Degrees in random

$m$-ary hooking networks (submitted).Google Scholar

Chauvin, B., Pouyanne, N., & Sahnoun, R. (2011). Limit distributions for large Pólya urns. The Annals of Applied Probability 21: 1–32.10.1214/10-AAP696CrossRef Google Scholar

Eggenberger, F. & Pólya, G. (1923). Über die statistik verketteter vorgänge. Zeitschrift für Angewandte Mathematik und Mechanik 1: 279–289.10.1002/zamm.19230030407CrossRef Google Scholar

Ehrenfest, P. & Ehrenfest, T. (1907). Über zwei bekannte Einwände gegen das Boltzmannsche H-theorem. Physikalische Zeitschrift 8: 311–314.Google Scholar

Freedman, D. (1965). Bernard Friedman's urn. The Annals of Mathematical Statistics 36: 956–970.10.1214/aoms/1177700068CrossRef Google Scholar

Horn, R. & Johnson, C (1985). Matrix analysis. Cambridge, UK: Cambridge University Press.10.1017/CBO9780511810817CrossRef Google Scholar

Idriss, S. (2021). Nonlinear unbalanced urn models via stochastic approximation. Methodology and Computing in Applied Probability. doi:10.1007/s11009-021-09858-6.Google Scholar

Janson, S. (2004). Functional limit theorems for multitype branching processes and generalized Pólya urns. Stochastic Processes and Their Applications 110: 177–245.10.1016/j.spa.2003.12.002CrossRef Google Scholar

Janson, S. (2006). Limit theorems for triangular urn schemes. Probability Theory and Related Fields 134: 417–452.10.1007/s00440-005-0442-7CrossRef Google Scholar

Janson, S. (2020). Mean and variance of balanced Pólya urns. Advances in Applied Probability 52: 1224–1248.10.1017/apr.2020.38CrossRef Google Scholar

Janson, S. & Pouyanne, N. (2018). Moment convergence of balanced Pólya processes. Electrononic Journal of Probability 23: 1–13.Google Scholar

Johnson, N. & Kotz, S (1977). Urn models and their application. New York: John Wiley.Google Scholar

Kotz, S. & Balakrishnan, N. (1997). Advances in urn models during the past two decades. In N. Balakrishnan (ed.), Advances in Combinatorial Methods and Applications to Probability and Statistics. Boston, MA: Birkhäuser, pp. 203–257.10.1007/978-1-4612-4140-9_14CrossRef Google Scholar

Mahmoud, H (2008). Pólya urn models. Orlando, FL: Chapman-Hall.10.1201/9781420059847CrossRef Google Scholar

Pouyanne, N. (2008). An algebraic approach to Pólya processes. Annales de l'Institut Henri Poincaré, Probabilités et Statistiques 44: 293–323.10.1214/07-AIHP130CrossRef Google Scholar

Smythe, R. (1996). Central limit theorems for urn models. Stochastic Processes and Their Applications 65: 115–137.10.1016/S0304-4149(96)00094-4CrossRef Google Scholar

Zhang, P., Chen, C., & Mahmoud, H. (2015). Explicit characterization of moments of balanced triangular Pólya urns by an elementary approach. Statistics & Probability Letters 96: 149–153.10.1016/j.spl.2014.09.016CrossRef Google Scholar

Article contents

Covariances in Pólya urn schemes

Abstract

Keywords

MSC classification

1. Introduction

2. Pólya urns

2.1. Tenability

2.2. Balance

2.3. Classes and irreducibility

3. Notation

3.1. Classification by the index

4. Covariance recursion

5. Growing tenable balanced urns

5.1. Schemes with a small index

5.2. Schemes with a critical index

6. Limitations

Acknowledgment

7. Conflicts of interest

Footnotes

References

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests