Hostname: page-component-cd9895bd7-fscjk Total loading time: 0 Render date: 2024-12-26T04:52:06.622Z Has data issue: false hasContentIssue false

Disparity-persistence and the multistep friendship paradox

Published online by Cambridge University Press:  25 July 2023

Kenneth S. Berenhaut*
Affiliation:
Department of Statistical Sciences, Wake Forest University, Winston Salem, NC, USA
Chi M. Zhang
Affiliation:
Department of Statistical Sciences, Wake Forest University, Winston Salem, NC, USA
*
Corresponding author: Kenneth S. Berenhaut; Email: [email protected]
Rights & Permissions [Opens in a new window]

Abstract

In this paper, we consider the friendship paradox in the context of random walks and paths. Among our results, we give an equality connecting long-range degree correlation, degree variability, and the degree-wise effect of additional steps for a random walk on a graph. Random paths are also considered, as well as applications to acquaintance sampling in the context of core-periphery structure.

Type
Research Article
Creative Commons
Creative Common License - CCCreative Common License - BY
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press.

1. Introduction

The friendship paradox, introduced by Feld in [Reference Feld15], states roughly that your friends have more friends than you do on average (for an explicit statement, see Theorem 1.1). Following the work of [Reference Kramer, Cutler and Radcliffe24], extending the friendship paradox to multiple steps (i.e., iterated friendships), here we quantify an explicit connection between long-range degree correlation, degree variability, and the degree-wise effect of additional steps for random walks on a graph. Results for random paths are also considered.

Throughout, we suppose $G=(\mathcal{V},\mathcal{E})$ is a connected graph with node set $\mathcal{V}=\{v_1,v_2,\ldots,v_n\}$ and undirected edge set $\mathcal{E}$ and define the degree function $\mathrm{d}$, so that for $v\in \mathcal{V}$, $\mathrm{d}(v)$ is the degree of v (i.e., the number of neighbors of v) in G. Moreover, we denote by $\bf{A}=[A_{i,j}]$ the associated n × n adjacency matrix; the degree of a node vi can then be computed via

\begin{align*} \mathrm{d}(v_i)=\sum_{j=1}^{n}A_{i,j}. \end{align*}

As we will be interested in the expected degree over random sequences of nodes in the graph, it will be convenient to consider a time-homogeneous random walk $\bf{X}=(X_0,X_1,\ldots)$ dictated by a transition matrix, $\bf{P}=[P_{i,j}]$, with

\begin{align*} P_{i,j}\stackrel{\rm def}{=} \mathbb{P}(X_{k+1}=v_j\mid X_k=v_i)=\frac{A_{i,j}}{\mathrm{d}(v_i)}, \end{align*}

for $1\leq i$, $j\leq n$, and $k\geq 0$. Importantly, we will assume throughout that X 0 is uniformly selected from V.

We first restate the friendship paradox formalized in [Reference Feld15].

Theorem 1.1. ([Reference Feld15]) Suppose X 0 is a node selected uniformly at random from $\mathcal{V}$, and $E=\{V,W\}$ is an edge pair selected uniformly at random from $\mathcal{E}$. Then, selecting Y 1 from the nodes in E, each with half chance,

(1)\begin{align} \mathbb{E}(\mathrm{d}(Y_1))\geq \mathbb{E}(\mathrm{d}(X_0)). \end{align}

Similarly, several authors have employed the following counterpart for random walks with a uniformly selected initial node X 0 (see for instance [Reference Berenhaut and Jiang3, Reference Cao and Ross6, Reference Jackson21]).

Theorem 1.2. Suppose $\bf{X}=(X_0,X_1,X_2,\ldots)$ is a simple random walk on the graph G, where X 0 is selected uniformly at random from $\mathcal{V}$. Then

(2)\begin{align} \mathbb{E}(\mathrm{d}(X_1))\geq \mathbb{E}(\mathrm{d}(X_0)). \end{align}

The following two results regarding multiple step walks and paths can be found in [Reference Kramer, Cutler and Radcliffe24].

Theorem 1.3. ([Reference Kramer, Cutler and Radcliffe24]) Suppose $\bf{X}=(X_0,X_1,X_2,\ldots)$ is a simple random walk on the graph G, where X 0 is selected uniformly at random from $\mathcal{V}$. Then, for $k\geq 0$,

(3)\begin{align} \mathbb{E}(\mathrm{d}(X_k))\geq \mathbb{E}(\mathrm{d}(X_0)). \end{align}

Theorem 1.4. ([Reference Kramer, Cutler and Radcliffe24]) Suppose $k\geq 1$ is odd, X 0 is selected uniformly at random from $\mathcal{V}$, and $W=\{Y_0,Y_1,\ldots,Y_k\}$ is a path selected uniformly at random from the set of all length-k paths on G. Then, we have

(4)\begin{align} \mathbb{E}(\mathrm{d}(Y_k))\geq \mathbb{E}(\mathrm{d}(X_0)). \end{align}

For further recent theoretical results regarding the friendship paradox, see for instance [Reference Berenhaut, Jiang, McNab and Krizay4, Reference Cantwell, Kirkley and Newman5, Reference Dizon-Ross and Ross12, Reference Pal, Yu, Novick, Swami and Bar-Noy31].

In place of comparisons with $\mathbb{E}(\mathrm{d}(X_0))$, as in the results above, the primary concern of this paper is the study of the degree-wise benefit of additional steps on the graph (for both random walks and random paths). Motivation for such considerations is provided by the recent success of applications of the one-step friendship paradox (see [Reference Christakis and Fowler8, Reference Cohen, Havlin and Ben-Avraham9, Reference Eom and Jo14, Reference Garcia-Herranz, Moro, Cebrian, Christakis and Fowler18, Reference Herrera, Srinivasan, Brownstein, Galvani and Meyers20, Reference Kim, Hwong, Stafford, Hughes, O’Malley, Fowler and Christakis23, Reference Singer34]).

Several authors have alluded to weakness in expression of the friendship paradox in networks exhibiting positive degree correlation over edges (see for instance [Reference Cantwell, Kirkley and Newman5, Reference Lee, Lee, Eom, Holme and Jo25, Reference Momeni and Rabbat28, Reference Pal, Yu, Novick, Swami and Bar-Noy31]). Before formally stating our results, it will be convenient to consider longer-range degree correlation. It is common for networks to exhibit sharing of similar characteristics across edges (i.e., “homophily,” “birds of a feather flock together,” “assortativity”; see [Reference Newman29]). In social networks, in particular, individuals may prefer to be associated with others sharing, for instance, similar age, education, or occupations. In the case where the characteristic of interest is nodal degree, Pearson correlation $\rho=\rho(G)$ refers to the tendency of nodes in a network to be associated with others sharing similar (or different) degrees. Specifically, for the graph G, suppose $E=\{V,W\}$ is an edge selected uniformly at random, then

(5)\begin{align} \rho\stackrel{\rm def}{=}\mathrm{cor}(\mathrm{d}(V),\mathrm{d}(W))=\frac{\mathrm{cov}(\mathrm{d}(V),\mathrm{d}(W))}{\mathrm{sd}(\mathrm{d}(V))\ \mathrm{sd}(\mathrm{d}(W))}. \end{align}

A positive degree correlation, ρ > 0, indicates a tendency for high-degree nodes in the graph to connect to other high-degree nodes and similarly for low-degree nodes.

Values of ρ have been studied in various classes of networks (see [Reference Chen and Olvera-Cravioto7, Reference Foster, Foster, Grassberger and Paczuski16, Reference Li, Wang and Van Mieghem26, Reference Piraveenan, Prokopenko and Zomaya33]). For connections between assortativity and other topological network properties, see [Reference D’Agostino, Scala, Zlatić and Caldarelli11, Reference Ellens, Spieksma, Van Mieghem, Jamakovic and Kooij13, Reference Van Mieghem, Ge, Schumm, Trajanovski and Wang36, Reference Van Mieghem, Wang, Ge, Tang and Kuipers37]. For work considering assortativity in the context of the friendship paradox, see for instance [Reference Jo and Eom22, Reference Lee, Lee, Eom, Holme and Jo25, Reference Momeni and Rabbat28, Reference Pal, Yu, Novick, Swami and Bar-Noy31]. See [Reference Noldus and Van Mieghem30] and the references therein for some further work on assortativity.

Although much previous research has focused on degree correlation among nearest neighbors, some recent considerations of the concept of assortative mixing beyond first neighbors can be found in [Reference Allen-Perkins, Pastor and Estrada1, Reference Arcagni, Grassi, Stefani and Torriero2, Reference Fujiki, Takaguchi and Yakubo17, Reference Mayo, Abdelzaher and Ghosh27] and can in a general sense be referred to as long-range degree correlation (see [Reference Arcagni, Grassi, Stefani and Torriero2]).

Assortativity, as defined in [Reference Newman29], can be interpreted as a measure of the correlation between nodal degrees based on the first-order adjacency matrix. Here, we define the kth-order path-based degree correlation as the Pearson coefficient, measuring the correlation between degrees for the two end points of a randomly chosen path with length k.

Definition 1.5. For $k \geq 0$, let $Y_k^-$ and $Y_k^+$ be the beginning and terminal nodes for a path selected uniformly at random from the set of all length-k paths on G. The kth-order (path-based) degree correlation $\rho_{0,k}$ is given by

\begin{align*} \rho_{0,k}\stackrel{\rm def}{=} \mathrm{cor}(\mathrm{d}(Y_k^-),\mathrm{d}(Y_{k}^+))=\frac{\mathrm{cov}(\mathrm{d}(Y_k^-),\mathrm{d}(Y_k^+))}{\mathrm{sd}(\mathrm{d}(Y_k^-))\ \mathrm{sd}(\mathrm{d}(Y_{k}^+))}. \end{align*}

In parallel with the definition above, we also consider the walk-based degree correlation as that for the starting node, X 0, of a random walk $\bf{X}=(X_0,X_1,X_2,\ldots)$ (with uniform initial distribution) and the node reached at the walk’s kth step, Xk. Note that the use of the uniform distribution here is motivated by the desired applications. This is in contrast to recent work regarding long-range correlation, wherein the initial distribution is stationary for the corresponding Markov chain (see [Reference Arcagni, Grassi, Stefani and Torriero2, Reference Gutiérrez-Gómez and Delvenne19, Reference Peel, Delvenne and Lambiotte32]).

Definition 1.6. Consider a random walk, $\bf{X}=\{X_0, X_1, X_2, \ldots\}$, on the graph G (with uniform initial distribution). The kth-order (walk-based) degree correlation, $\phi_{0,k}=\phi_{0,k}(\bf{X})$, is given by

\begin{align*} \phi_{0,k}\stackrel{\rm def}{=} \mathrm{cor}(\mathrm{d}(X_0),\mathrm{d}(X_k))=\frac{\mathrm{cov}(\mathrm{d}(X_{0}),\mathrm{d}(X_{k}))}{\mathrm{sd}(\mathrm{d}(X_{0}))\ \mathrm{sd}(\mathrm{d}(X_{k}))}. \end{align*}

Note. As an aside, it may be of some value to consider applications for which $\phi_{0,1}$ as in Definition 1.6 may be of interest. The correlation computed involves the same node pairs as in (5) but with weighting in a manner that places equal emphases on each node regardless of degree. Ties in a sense are weaker (or diluted) for nodes of large degree.

Now, for the random walk $\bf{X}=(X_0,X_1,\dots)$, let $X_\infty$ be a node selected according to the stationary distribution $\pi=(\pi_1, \dots, \pi_n)$ of the random walk X, that is, $\pi_i=\mathrm{d}(v_i)/\sum_j \mathrm{d}(v_j)$. A main quantity of interest will be what we term the proportional residual benefit at time k for the walk, which measures the degree-wise (remaining) benefit of additional steps of a random walk.

Definition 1.7. The proportional residual benefit for the walk X at time k, $\tau_k=\tau_k(\bf{X})$ is given by

\begin{align*} \tau_k\stackrel{\rm def}{=}\frac{\mathbb{E}(\mathrm{d}(X_\infty))-\mathbb{E}(\mathrm{d}(X_k))}{\mathbb{E}(\mathrm{d}(X_k))}. \end{align*}

We now have the following quantitative relationship between the walk-based long-range degree correlation and the proportional residual benefit. The proof is provided in Section 2. Throughout, for a random variable U with finite second moment, we denote by $\mathrm{c_v}(U)$, the coefficient of variation of U, that is,

(6)\begin{align} \mathrm{c_v}(U)\stackrel{\rm def}{=}\frac{\mathrm{sd}(U)}{\mathbb{E}(U)}. \end{align}

Theorem 1.8. Suppose $\bf{X}=(X_0,X_1,\ldots)$ is a random walk on the graph G. The proportional residual benefit τk can be written as the product of the kth-order degree correlation $\phi_{0,k}$ and the two coefficients of variation, $c_v(\mathrm{d}(X_0))$ and $c_v(\mathrm{d}(X_k))$, that is

(7)\begin{align} \tau_k&=\mathrm{cor}(\mathrm{d}(X_{0}),d(X_{k}))\cdot\frac{\mathrm{sd}(\mathrm{d}(X_{0}))}{\mathbb{E}(\mathrm{d}(X_{0}))}\cdot\frac{\mathrm{sd}(\mathrm{d}(X_{k}))}{\mathbb{E}(\mathrm{d}(X_{k}))}\nonumber \\ &=\phi_{0,k}\ c_v(\mathrm{d}(X_0)) c_v(\mathrm{d}(X_k)). \end{align}

Now, as in Definition 1.5, for $k \geq 0$, let $Y_k^-$ and $Y_k^+$ be the beginning and terminal nodes for a path selected uniformly at random from the set of all length-k paths. We have the following definition and result.

Definition 1.9. Suppose $k\geq 0$. The proportional one-step benefit at length k, γk, is given by

\begin{align*} \gamma_k\stackrel{\rm def}{=}\frac{\mathbb{E}(\mathrm{d}(Y_{k+1}^+))-\mathbb{E}(\mathrm{d}(Y_k^+))}{\mathbb{E}(\mathrm{d}(Y_k^+)))}. \end{align*}

Theorem 1.10. The proportional one-step benefit, γk, at length k can be written as the product of the kth-order degree correlation $\rho_{0,k}$ and the two coefficients of variation, ${c_v}(\mathrm{d}(Y_{k}^-))$ and ${c_v}(\mathrm{d}(Y_{k}^+))$, that is,

(8)\begin{align} \gamma_k&= \mathrm{cor}(\mathrm{d}(Y_{k}^-),\mathrm{d}(Y_{k}^+))\cdot \frac{\mathrm{sd}(\mathrm{d}(Y_{k}^-))}{\mathbb{E} (\mathrm{d}(Y_{k}^-))}\cdot\frac{\mathrm{sd}(\mathrm{d}(Y_{k}^+))}{\mathbb{E}(\mathrm{d}(Y_{k}^+))} \nonumber \\ &=\rho_{0,k}\ {c_v}(\mathrm{d}(Y_{k}^-))\ {c_v}(\mathrm{d}(Y_{k}^+)). \end{align}

In terms of residual benefit, in the case of random paths, we will also prove the following.

Theorem 1.11. Suppose G is a non-bipartite (and connected) graph. If $\bf{(a)}$ $k\geq 0$ is even or $\bf{(b)}$ $k\geq 1$ is odd and the corresponding kth-order path-based degree correlation $\rho_{0,k}$ is nonnegative, then the limiting expected degree of $Y_i^+$ is no less than that of $Y_k^+$, that is,

(9)\begin{align} \lim_{i\rightarrow \infty} \mathbb{E}(\mathrm{d}(Y_i^+)) \geq \mathbb{E}(\mathrm{d}(Y_k^+)). \end{align}

Note. (Disparity persistence and core-periphery structure) Theorems 1.8, 1.10, and 1.11 provide insight into when it may be beneficial in acquaintance sampling and elsewhere to continue on to neighbors of neighbors in an iterated fashion. In fact, the benefit can be high in networks wherein the degree correlation is positive out to a longer range, but, globally, the degree variability is high. We refer to such a phenomenon as disparity-persistence, since disparity in degree persists over extended neighborhoods. A prime example wherein such behavior can occur is social networks exhibiting strong core-periphery structure, with a large core of high-degree nodes and a periphery of loosely connected nodes, with low degree.

The remainder of the paper proceeds as follows. In Section 2, we prove Theorem 1.8, while in Section 3, we prove Theorems 1.10 and 1.11.

2. Proof of Theorem 1.8 (The random walk case)

In this section, we will prove Theorem 1.8.

Proof of Theorem 1.8

Set $d_i=\mathrm{d}(v_i)$, $\bf{d}=(d_1,d_2,\ldots,d_n)^{\prime}$, and $\bf{1}=(1,1,\ldots,1)^{\prime}$. Let P be the transition matrix for the random walk, X, and note that $\bf{d}=\bf{A} \bf{1}$. Now, note that

\begin{align*} \mathbb{E}\left(\mathrm{d}(X_0)\ \mathrm{d}(X_k)\right)=\frac{1}{n}\sum_i\sum_j d_id_j (\bf{P}^k)_{i,j} =\frac{\bf{d}^{\prime}\bf{P}^k\bf{d}}{\bf{1}^{\prime}\bf{1}}. \end{align*}

Hence, letting D be the diagonal matrix with diagonal d and noting that $\bf{P}=\bf{D}^{-1}\bf{A}$ gives

\begin{align*} \bf{d}^{\prime}\bf{P}^{k}\bf{d}&=\bf{d}^{\prime}\bf{D}^{-1}\bf{A}\bf{P}^{k-1}\bf{d} = \bf{1}^{\prime}\bf{A}\bf{P}^{k-1}\bf{d} = \bf{d}^{\prime}\bf{P}^{k-1}\bf{d} \\ &=\cdots = \bf{d}^{\prime}\bf{d}. \end{align*}

Thus,

\begin{align*} \mathbb{E}(\mathrm{d}(X_0)\ \mathrm{d}(X_k))=(\bf{d}^{\prime}\bf{d})/(\bf{1}^{\prime}\bf{1}). \end{align*}

Now,

\begin{align*} \mathbb{E}(\mathrm{d}(X_\infty))&=\sum_i \left(\frac{d_i}{\sum_j d_j}\right)d_i=\frac{\bf{d}^{\prime}\bf{d}}{\bf{1}^{\prime}\bf{A}\bf{1}},\\ \mathbb{E}(\mathrm{d}(X_0))&=\frac{1}{n}\sum_i d_i=\frac{\bf{1}^{\prime} \bf{A} \bf{1}}{\bf{1}^{\prime}\bf{1}}, \end{align*}

and hence

\begin{align*} \mathbb{E}(\mathrm{d}(X_\infty))\cdot\mathbb{E}(\mathrm{d}(X_0))=\frac{\bf{d}^{\prime}\bf{d}}{\bf{1}^{\prime}\bf{A}\bf{1}}\frac{\bf{1}^{\prime} \bf{A} \bf{1}}{\bf{1}^{\prime}\bf{1}}=\frac{\bf{d}^{\prime}\bf{d}}{\bf{1}^{\prime}\bf{1}}= \mathbb{E}\left(\mathrm{d}(X_0)\ \mathrm{d}(X_k)\right). \end{align*}

Thus,

\begin{align*} \tau_k&=\frac{\mathbb{E}(\mathrm{d}(X_\infty))-\mathbb{E}(\mathrm{d}(X_k))}{\mathbb{E}(d(X_k))},\\ &= \frac{1}{\mathbb{E}(\mathrm{d}(X_k))}\left(\frac{\mathbb{E}(\mathrm{d}(X_0)\ \mathrm{d}(X_k))}{\mathbb{E}(\mathrm{d}(X_0))}-\mathbb{E}(\mathrm{d}(X_k))\right), \\ &= \frac{\mathrm{cov}(d(X_0),d(X_k))}{\mathbb{E}(\mathrm{d}(X_k))\cdot\mathbb{E}(\mathrm{d}(X_0))}, \\ &= \phi_{0,k}\left(\frac{\mathrm{sd}(\mathrm{d}(X_0))}{\mathbb{E}(\mathrm{d}(X_0))}\right)\cdot\left(\frac{\mathrm{sd}(\mathrm{d}(X_k))}{\mathbb{E}(\mathrm{d}(X_k))}\right)=\phi_{0,k}\ {c_v}(\mathrm{d}(X_0)) {c_v}(\mathrm{d}(X_k)).\\[-42pt] \end{align*}

We easily derive the following corollary.

Corollary 2.1. Suppose $\bf{X}=(X_0,X_1,\ldots)$ is a random walk on the graph $G=(\mathcal{V},\mathcal{E})$. For $k\geq 1$, the expected degree of $X_\infty$ is no less than that of Xk if and only if the kth-order degree correlation $\phi_{0,k}$ is nonnegative, that is,

(10)\begin{align} \mathbb{E}(\mathrm{d}(X_\infty))\geq \mathbb{E}(\mathrm{d}(X_k)) \quad \text{if and only if} \quad \phi_{0,k}=\mathrm{cor}(\mathrm{d}(X_{0}), \mathrm{d}(X_{k}))\geq 0. \end{align}

3. Proof of Theorems 1.10 and 1.11 (The random path case)

In this section, we will prove Theorem 1.10 and further discuss the relationship between the proportional one-step benefit γk and the path-based degree correlation $\rho_{0,k}$ in parallel with the random-walk case, $(X_k)_{k\geq 0}$. As per convention, throughout, we will take the zeroth power of the adjacency matrix, A, of a graph G (i.e., A0) to be the n × n identity matrix, and hence for $k\geq 0$, the entry $(A^k)_{i,j}$ counts the number of distinct paths of length k connecting nodes vi and vj.

As before, suppose for $i \geq 0$, $Y_{i}^-$ and $Y_{i}^+$ are the beginning and terminal nodes of a path selected uniformly at random from the set of all length-i paths on G. We have the following lemma (see also Lemma 4 in [Reference Kramer, Cutler and Radcliffe24]).

Lemma 3.1. Suppose $k\geq 0$. The expected degree of $Y_k^{+}$ can be written in terms of successive entries in the sequence $(N_0, N_1, \ldots)$, where Nj is the number of length j paths on G, that is,

(11)\begin{align} \mathbb{E}(\mathrm{d}(Y_k^{+}))=\frac{N_{k+1}}{N_k}. \end{align}

Proof. We have

\begin{align*} \mathbb{E}(\mathrm{d}(Y_k^{+}))=\sum_{j=1}^{n}d_j \frac{\sum_{i=1}^{n}A_{i,j}^{k}}{N_k}=\frac{\bf{d}^{\prime}(\bf{A}^{k})^{\prime}\bf{1}}{N_k}=\frac{\bf{1}^{\prime} \bf{A}\bf{A}^{k}\bf{1}}{N_k}=\frac{N_{k+1}}{N_k}, \end{align*}

where the third equality follows from the symmetry of the adjacency matrix, A.

Now we turn to a proof of Theorem 1.10.

Proof of Theorem 1.10

Suppose $k\geq 0$ and $Y_k^{-}$ and $Y_k^{+}$ are two nodes connected by a randomly chosen path with length k. We have

(12)\begin{align} \mathbb{E}(\mathrm{d}(Y_k^{-})\ \mathrm{d}(Y_k^{+}))=\sum_{i=1}^{n}\sum_{j=1}^{n}d_i d_j \frac{A_{i,j}^{k}}{N_k}=\frac{\bf{d}^{\prime}\bf{A}^{k}\bf{d}}{N_k}=\frac{N_{k+2}}{N_k}. \end{align}

Note that by Lemma 3.1 and symmetry,

(13)\begin{align} \mathbb{E}(\mathrm{d}(Y_k^{-}))=\frac{N_{k+1}}{N_k}=\mathbb{E}(\mathrm{d}(Y_k^{+})). \end{align}

Hence, using Eqs. (12) and (13),

(14)\begin{align} \mathrm{cov}(d(Y_k^{-}),d(Y_k^{+}))&=\frac{N_{k+2}}{N_k}-\left(\frac{N_{k+1}}{N_k}\right)^2=\frac{N_{k+1}}{N_k}\left(\frac{N_{k+2}}{N_{k+1}}-\frac{N_{k+1}}{N_k}\right),\nonumber \\ &=\mathbb{E}(d(Y_k^{+}))\cdot(\mathbb{E}(d(Y_{k+1}^+))-\mathbb{E}(d(Y_k^+))), \end{align}

and hence by the definition of γk, and Eqs. (13) and (14),

(15)\begin{align} \gamma_k&=\frac{\mathbb{E}(\mathrm{d}(Y_{k+1}^+))-\mathbb{E}(\mathrm{d}(Y_k^+))}{\mathbb{E}(\mathrm{d}(Y_k^+))}\cdot \frac{\mathbb{E}(\mathrm{d}(Y_{k}^+))}{\mathbb{E}(\mathrm{d}(Y_k^-))}\cdot\frac{\mathrm{sd}(\mathrm{d}(Y_{k}^-))}{\mathrm{sd}(\mathrm{d}(Y_k^-))}\cdot\frac{\mathrm{sd}(\mathrm{d}(Y_{k}^+))}{\mathrm{sd}(\mathrm{d}(Y_k^+))}, \nonumber\\ &=\frac{\mathrm{cov}(d(Y_k^-),d(Y_K^+))}{\mathrm{sd}(\mathrm{d}(Y_k^-))\ \mathrm{sd}(\mathrm{d}(Y_{k}^+))}\cdot \frac{\mathrm{sd}(\mathrm{d}(Y_k^-)}{\mathbb{E}(\mathrm{d}(Y_k^-))}\cdot \frac{\mathrm{sd}(\mathrm{d}(Y_{k}^+)}{\mathbb{E}(\mathrm{d}(Y_k^+))},\nonumber \\ &=\rho_{0,k}\ \mathrm{c_v}(\mathrm{d}(Y_k^-))\ \mathrm{c_v}(\mathrm{d}(Y_k^+)).\nonumber \\[-40pt] \end{align}

Therefore, the proportional increase in relative expected degree for an increase of one in the length of our random path can be determined by the corresponding path-based degree correlation, along with the two coefficients of variation. We have the following corollary.

Corollary 3.2. Suppose $k\geq 0$. The expected degree of $Y_{k+1}^{+}$ is no less than that of $Y_k^{+}$ if and only if the kth-order degree correlation $\rho_{0,k}$ is nonnegative, that is,

(16)\begin{align} \mathbb{E}(\mathrm{d}(Y_{k+1}^{+}))\geq \mathbb{E}(\mathrm{d}(Y_k^{+})) \quad\text{if and only if } \quad \rho_{0,k}=\mathrm{cor}(\mathrm{d}(Y_k^{-}), \mathrm{d}(Y_{k}^{+}))\geq 0. \end{align}

Note that employing an inequality on the number of paths, developed by [Reference Täubig, Weihmann, Kosub, Hemmecke and Mayr35] (with c = 1 and b = 0), we have

(17)\begin{align} \frac{N_{2a+1}}{N_{2a}}\leq \frac{N_{2a+2}}{N_{2a+1}} \end{align}

for $a\geq 0$, and hence by Lemma 3.1, when k is even, $\mathbb{E}(\mathrm{d}(Y_{k+1}^{+}))\geq \mathbb{E}(\mathrm{d}(Y_k^{+}))$. In the case k = 0, this is simply the result of Feld in Theorem 1.1. In addition, by Corollary 3.2, for any network G, G is assortative (i.e., $\rho_{0,1} \gt 0$) if and only if $\mathbb{E}(\mathrm{d}(Y_{2}^+)) \gt \mathbb{E}(\mathrm{d}(Y_1^+))$. For consideration of degree comparison for X 1 and X 2 under a random graph model, see [Reference Berenhaut, Jiang, McNab and Krizay4].

The following lemma follows directly from eigen decomposition of the matrix A and Lemma 3.1 (see for instance [Reference Cvetkovic, Cvetković, Rowlinson and Simic10]).

Lemma 3.3. ([Reference Cvetkovic, Cvetković, Rowlinson and Simic10]) Suppose $k\geq 0$ and the graph G is non-bipartite. The expected degree of $Y_k^{+}$ tends to λ 1, the largest eigenvalue of the adjacency matrix, A, as k tends to infinity, that is,

(18)\begin{align} \lim_{k\to\infty} E(\mathrm{d}(Y_k^+)) = \lambda_1. \end{align}

We now turn to a proof of Theorem 1.11

Proof of Theorem 1.11

Suppose $m\geq 0$. From Rayleigh’s inequality, we have

(19)\begin{align} \mathbb{E}(\mathrm{d}(Y_{2m}^+))=\frac{N_{2m+1}}{N_{2m}}=\frac{\left(\bf{A}^{m}\bf{1}\right)^{\prime}\bf{A}\left(\bf{A}^{m}\bf{1}\right)}{\left(\bf{A}^{m}\bf{1}\right)^{\prime}\cdot\left(\bf{A}^{m}\bf{1}\right)}\leq \lambda_1. \end{align}

The result for k even follows by Lemma 3.3. Now, suppose k is odd and $\rho_{0,k}\geq 0$. Employing Corollary 3.2 and Eq. (19) then gives

\begin{align*} \mathbb{E}(\mathrm{d}(Y_{k}^+))\leq \mathbb{E}(\mathrm{d}(Y_{k+1}^+))\leq \lambda_1, \end{align*}

and the result follows.

Competing interests

The authors declare no conflict of interest.

References

Allen-Perkins, A., Pastor, J.M., & Estrada, E. (2017). Two-walks degree assortativity in graphs and networks. Applied Mathematics and Computation 311: 262271.CrossRefGoogle Scholar
Arcagni, A., Grassi, R., Stefani, S., & Torriero, A. (2017). Higher order assortativity in complex networks. European Journal of Operational Research 262(2): 708719.CrossRefGoogle Scholar
Berenhaut, K.S. & Jiang, H. (2019). The friendship paradox for weighted and directed networks. Probability in the Engineering and Informational Sciences 33(1): 136145.CrossRefGoogle Scholar
Berenhaut, K.S., Jiang, H., McNab, K.M., & Krizay, E.J. (2018). The degree-wise effect of a second step for a random walk on a graph. Journal of Applied Probability 55(4): 12031210.CrossRefGoogle Scholar
Cantwell, G.T., Kirkley, A., & Newman, M. (2021). The friendship paradox in real and model networks. Journal of Complex Networks 9(2): .CrossRefGoogle Scholar
Cao, Y. & Ross, S.M. (2016). The friendship paradox. Mathematical Scientist 41(1): 6164.Google Scholar
Chen, N. & Olvera-Cravioto, M. (2013). Directed random graphs with given degree distributions. Stochastic Systems 3(1): 147186.CrossRefGoogle Scholar
Christakis, N.A. & Fowler, J.H. (2010). Social network sensors for early detection of contagious outbreaks. PLoS One 5(9): .CrossRefGoogle ScholarPubMed
Cohen, R., Havlin, S., & Ben-Avraham, D. (2003). Efficient immunization strategies for computer networks and populations. Physical Review Letters 91(24): .CrossRefGoogle ScholarPubMed
Cvetkovic, D., Cvetković, D.M., Rowlinson, P., & Simic, S. (1997). Eigenspaces of graphs 66, Cambridge: Cambridge University Press.CrossRefGoogle Scholar
D’Agostino, G., Scala, A., Zlatić, V., & Caldarelli, G. (2012). Robustness and assortativity for diffusion-like processes in scale-free networks. Europhysics Letters 97(6): .Google Scholar
Dizon-Ross, R. & Ross, S.M. (2022). A probabilistic friendship network model. Probability in the Engineering and Informational Sciences 36(1): 116.CrossRefGoogle Scholar
Ellens, W., Spieksma, F.M., Van Mieghem, P., Jamakovic, A., & Kooij, R.E. (2011). Effective graph resistance. Linear Algebra and its applications 435(10): 24912506.CrossRefGoogle Scholar
Eom, Y.-H. & Jo, H.-H. (2015). Tail-scope: Using friends to estimate heavy tails of degree distributions in large-scale complex networks. Scientific Reports 5: . doi:10.1038/srep09752.CrossRefGoogle ScholarPubMed
Feld, S.L. (1991). Why your friends have more friends than you do. American Journal of Sociology 96(6): 14641477.CrossRefGoogle Scholar
Foster, J.G., Foster, D.V., Grassberger, P., & Paczuski, M. (2010). Edge direction and the structure of networks. Proceedings of the National Academy of Sciences of the United States of America 107(24): 1081510820.CrossRefGoogle ScholarPubMed
Fujiki, Y., Takaguchi, T., & Yakubo, K. (2018). General formulation of long-range degree correlations in complex networks. Physical Review E 97(6): .CrossRefGoogle ScholarPubMed
Garcia-Herranz, M., Moro, E., Cebrian, M., Christakis, N.A., & Fowler, J.H. (2014). Using friends as sensors to detect global-scale contagious outbreaks. PLoS One 9(4): .CrossRefGoogle ScholarPubMed
Gutiérrez-Gómez, L. & Delvenne, J.-C. (2019). Multi-hop assortativities for network classification. Journal of Complex Networks 7(4): 603622.CrossRefGoogle Scholar
Herrera, J.L., Srinivasan, R., Brownstein, J.S., Galvani, A.P., & Meyers, L.A. (2016). Disease surveillance on complex social networks. PLoS Computational Biology 12(7): .CrossRefGoogle ScholarPubMed
Jackson, M.O. (2019). The friendship paradox and systematic biases in perceptions and social norms. Journal of Political Economy 127(2): 777818. doi:10.1086/701031.CrossRefGoogle Scholar
Jo, H.-H. & Eom, Y.-H. (2014). Generalized friendship paradox in networks with tunable degree-attribute correlation. Physical Review E 90(2): .CrossRefGoogle ScholarPubMed
Kim, D.A., Hwong, A.R., Stafford, D., Hughes, D.A., O’Malley, A.J., Fowler, J.H., & Christakis, N.A. (2015). Social network targeting to maximise population behaviour change: A cluster randomised controlled trial. The Lancet 386(9989): 145153.CrossRefGoogle ScholarPubMed
Kramer, J.B., Cutler, J., & Radcliffe, A. (2016). The multistep friendship paradox. American Mathematical Monthly 123(9): 900908.CrossRefGoogle Scholar
Lee, E., Lee, S., Eom, Y.-H., Holme, P., & Jo, H.-H. (2019). Impact of perception models on friendship paradox and opinion formation. Physical Review E 99(5): .CrossRefGoogle ScholarPubMed
Li, C., Wang, H., & Van Mieghem, P. (2013). Epidemic threshold in directed networks. Physical Review E 88(6): .CrossRefGoogle ScholarPubMed
Mayo, M., Abdelzaher, A., & Ghosh, P. (2015). Long-range degree correlations in complex networks. Computational Social Networks 2(1): 113.CrossRefGoogle Scholar
Momeni, N. & Rabbat, M.G. (2018). Effectiveness of alter sampling in social networks. Preprint, arXiv:1812.03096.Google Scholar
Newman, M.E. (2002). Assortative mixing in networks. Physical Review letters 89(20): .CrossRefGoogle ScholarPubMed
Noldus, R. & Van Mieghem, P. (2015). Assortativity in complex networks. Journal of Complex Networks 3(4): 507542.CrossRefGoogle Scholar
Pal, S., Yu, F., Novick, Y., Swami, A., & Bar-Noy, A. (2019). A study on the friendship paradox–quantitative analysis and relationship with assortative mixing. Applied Network Science 4(1): 126.CrossRefGoogle Scholar
Peel, L., Delvenne, J.-C., & Lambiotte, R. (2018). Multiscale mixing patterns in networks. Proceedings of the National Academy of Sciences of the United States of America 115(16): 40574062.CrossRefGoogle Scholar
Piraveenan, M., Prokopenko, M., & Zomaya, A.Y. (2012). Assortative mixing in directed biological networks IEEE/ACM Transactions on Computational Biology and Bioinformatics 9(1): 6678. doi:10.1109/TCBB.2010.80.CrossRefGoogle ScholarPubMed
Singer, Y. (2016). Influence maximization through adaptive seeding. ACM SIGecom Exchanges 15(1): 3259.CrossRefGoogle Scholar
Täubig, H., Weihmann, J., Kosub, S., Hemmecke, R., & Mayr, E.W. (2013). Inequalities for the number of walks in graphs. Algorithmica 66(4): 804828.CrossRefGoogle Scholar
Van Mieghem, P., Ge, X., Schumm, P., Trajanovski, S., & Wang, H. (2010). Spectral graph analysis of modularity and assortativity. Physical Review E 82(5): .CrossRefGoogle ScholarPubMed
Van Mieghem, P., Wang, H., Ge, X., Tang, S., & Kuipers, F.A. (2010). Influence of assortativity and degree-preserving rewiring on the spectra of networks. The European Physical Journal B 76(4): 643652.CrossRefGoogle Scholar