1. Introduction
Nonlinear Markov processes are a particular class of stochastic processes where the transition probabilities depend not only on the state, but also on the distribution of the process. McKean [Reference McKean13] introduced processes of this type to tackle mechanical transport problems. Thereafter they have been studied by several authors (see the monographs of Kolokoltsov [Reference Kolokoltsov10] and Sznitman [Reference Sznitman21]). Recently, the close connection to continuous-time mean-field games has led to significant progress in the analysis of McKean–Vlasov stochastic differential equations, in particular the control of these systems (see for example [Reference Carmona and Delarue5, Reference Pham and Wei18]).
In this paper, we consider a special class of these processes—namely, nonlinear Markov chains in continuous time with a finite state space—and provide first insights regarding their long-term behaviour. Nonlinear Markov chains with finite state space arise naturally, in particular in evolutionary biology, epidemiology, and game theory. Specifically, replicator dynamics, several infection models, and also the dynamics of learning procedures in game theory are nonlinear Markov chains [Reference Kolokoltsov10]. Moreover, the population dynamics in mean-field games with finite state and action space are also nonlinear Markov chains [Reference Neumann15].
Note that the sense in which we use the term ‘nonlinear’ here is different from the sense in which it occurs in the discussion of general (or nonlinear) birth–death processes, as for example in [Reference Bather3]. Indeed, in the latter context the term ‘nonlinear’ means that the transition rates depend on the current state not linearly but in a more general way (see [Reference Novozhilov, Karev and Koonin17] and the references therein for an overview). However, in our context the term ‘nonlinear’ refers to the Markov semigroup $\Phi^t(m_0)$ , which is linear for classical Markov chains and nonlinear in our setting, so that the transition probabilities can now depend on the current distribution of the process.
The main focus of this paper lies in the characterization of the long-term behaviour of these processes. We show that an invariant distribution always exists and provide a sufficient criterion for the uniqueness of this invariant distribution. We then turn to the long-term behaviour, for which we first illustrate by two examples that the limit behaviour is much more complex than for classical Markov chains. More precisely, we show that the marginal distributions of a nonlinear Markov chain may be periodic and that irreducibility of the generator does not necessarily imply ergodicity. Then we provide easy-to-verify sufficient criteria for ergodicity for small state spaces (two or three states). All of the conditions that we propose are simple and rely only on the shape of the nonlinear generator, not on the shape of the transition probabilities.
The long-term behaviour of general nonlinear Markov chains in continuous time with a finite state space has not been analysed before. The closest contributions in the literature are results for specific continuous-time Markov chains associated to pressure and resistance games [Reference Kolokoltsov and Malafeyev12], as well as ergodicity criteria for nonlinear Markov processes in discrete time [Reference Butkovsky4, Reference Saburov20]. These latter criteria are a generalization of Dobrushin’s ergodicity condition, and the proofs rely crucially on the sequential nature of the problem.
The rest of the paper is structured as follows. In Section 2 we review the relevant definitions and notation. In Section 3 we present the results on existence and uniqueness of the invariant distribution. In Section 4 we provide examples of limit behaviour that cannot arise in the context of classical Markov chains. In Section 5 we present the ergodicity results for small state spaces. Appendix A contains the proofs of two technical results.
2. Continuous-time nonlinear Markov chains with finite state space
This section gives a short overview of the relevant definitions, notation, and preliminary facts regarding nonlinear Markov chains. For more details about these processes we refer the reader to [Reference Kolokoltsov10, Chapter 1]. Moreover, this section introduces the relevant notions for characterizing the long-term behaviour of these processes.
Let $\mathcal{S}=\{1, \ldots, S\}$ be the state space of the nonlinear Markov chain, and denote by $\mathcal{P}(\mathcal{S})$ the probability simplex over $\mathcal{S}$ . A nonlinear Markov chain is characterized by a continuous family of nonlinear transition probabilities $P(t,m)= (P_{ij}(t,m))_{i,j \in \mathcal{S}}$ , which is a family of stochastic matrices that depends continuously on $t \ge 0$ and $m \in \mathcal{P}(\mathcal{S})$ , such that the nonlinear Chapman–Kolmogorov equation
\begin{equation*}\sum_{i \in \mathcal{S}} m_i P_{ij}(t+s,m) = \sum_{i,k \in \mathcal{S}} m_i P_{ik}(t,m) P_{kj}\!\left(s, \sum_{l \in \mathcal{S}} m_l P_{l \cdot}(t,m)\right)\end{equation*}
is satisfied. As usual, $P_{ij}(t,m_0)$ is interpreted as the probability that the process is in state j at time t given that the initial state was i and the initial distribution of the process was $m_0$ . Such a family yields a nonlinear Markov semigroup $(\Phi^t({\cdot}))_{t \ge 0}$ of continuous transformations of $\mathcal{P}(\mathcal{S})$ via
\begin{equation*}\Phi^t(m) = \left( \sum_{i \in \mathcal{S}} m_i P_{ij}(t,m) \right)_{j \in \mathcal{S}}.\end{equation*}
Also, $\Phi^t(m_0)$ has the usual interpretation that it represents the marginal distribution of the process at time t when the initial distribution is $m_0$ . A nonlinear Markov chain with initial distribution $m_0 \in \mathcal{P}(\mathcal{S})$ can then be identified with a time-inhomogeneous Markov chain with initial distribution $m_0$ and transition probabilities $p(s,i,t,j) = P_{ij}(t-s, \Phi^s(m_0))$ .
Before we move on, let us briefly connect nonlinear Markov chains with classical time-homogeneous Markov chains. A classical Markov chain is characterized by a family of transition probabilities $P(t)=(P_{ij}(t))_{i,j \in \mathcal{S}}$ which is a family of stochastic matrices that continuously depends on $t \ge 0$ and which satisfies the Chapman–Kolmogorov equations. Therefore, classical Markov chains are a special case of the nonlinear Markov chains we consider here; namely, they are nonlinear Markov chains where the family of transition probabilities does not depend on the distribution of the process. In this case the associated Markov semigroup reads
\begin{equation*}\Phi^t(m_0) = \left( \sum_{i \in \mathcal{S}} (m_0)_i P_{ij}(t) \right)_{j \in \mathcal{S}},\end{equation*}
which is linear in $m_0$ . This gives rise to the name nonlinear Markov chain for the objects we investigate in this paper.
As in the theory of classical continuous-time Markov chains, the infinitesimal generator will be the cornerstone of the description and analysis of such processes. Let $\Phi^t(m)$ be differentiable at $t=0$ for all $m \in \mathcal{P}(\mathcal{S})$; then the (nonlinear) infinitesimal generator of the semigroup $(\Phi^t({\cdot}))_{t \ge 0}$ is given by a transition rate matrix function $Q({\cdot})$ such that for $f(m)\,:\!=\, \left. \frac{\partial}{\partial t} \Phi^t(m)\right|_{t=0}$ we have $f_j(m) = \sum_{i \in \mathcal{S}} m_i Q_{ij}(m)$ for all $j \in \mathcal{S}$ and $m \in \mathcal{P}(\mathcal{S})$.
By [Reference Kolokoltsov10, Section 1.1], any differentiable nonlinear semigroup has a nonlinear infinitesimal generator. However, the converse problem is more important: given a transition rate matrix function (that is, a function $Q\,:\, \mathcal{P}(\mathcal{S}) \rightarrow \mathbb{R}^{S \times S}$ such that Q(m) is a transition rate matrix for all $m \in \mathcal{P}(\mathcal{S})$ ), is there a nonlinear Markov semigroup (and thus a nonlinear Markov chain) such that Q is the nonlinear infinitesimal generator of the process? Relying on the semigroup identity $\Phi^{t+s} = \Phi^t \Phi^s$ , this problem is equivalent to the following Cauchy problem: is there, for any $m_0 \in \mathcal{P}(\mathcal{S})$ , a solution $(\Phi^t(m_0))_{t \ge 0}$ of
\begin{equation*}\frac{\partial}{\partial t} \Phi^t(m_0) = \Phi^t(m_0) Q(\Phi^t(m_0)), \quad \Phi^0(m_0) = m_0,\end{equation*}
such that $\Phi^t({\cdot})$ is a continuous function ranging from $\mathcal{P}(\mathcal{S})$ to itself, and such that $\Phi^t(m) \in \mathcal{P}(\mathcal{S})$ for all $t \ge 0$ and $m \in \mathcal{P}(\mathcal{S})$ ?
In the monograph [Reference Kolokoltsov10] the problem of constructing a semigroup from a given generator is treated in a very general setting. Here, we present a result with easy-to-verify conditions tailored to the specific situation of nonlinear Markov chains with finite state space. The proof of the result, which relies on classical arguments from the theory of ordinary differential equations, is presented in the appendix.
Proposition 1. Let $Q\,:\, \mathcal{P}(\mathcal{S}) \rightarrow \mathbb{R}^{S \times S}$ be a transition rate matrix function such that $Q_{ij}(m)$ is Lipschitz continuous for all $i,j \in \mathcal{S}$ . Then there is a unique Markov semigroup $(\Phi^t({\cdot}))_{t \ge 0}$ such that Q is the infinitesimal generator for $(\Phi^t({\cdot}))_{t \ge 0}$ .
This proposition sheds more light on the additional modelling possibilities of nonlinear Markov chains compared to classical Markov chains: indeed, whereas classical Markov chains are characterized through a transition rate matrix $Q \in \mathbb{R}^{S \times S}$ , the nonlinear Markov chains that we consider here are described by a function $Q\,:\, \mathcal{P}(\mathcal{S}) \rightarrow \mathbb{R}^{S \times S}$ . This function now allows for the transition rates of the processes to depend on the current distribution of the process. This often occurs in applications, for example in evolutionary game dynamics or infection models (e.g. susceptible–infectious–recovered models) [Reference Kolokoltsov10]. Moreover, such processes naturally arise as the limit of weakly interacting particles or agents [Reference Kolokoltsov11], which is why these processes play a role in mean-field game theory.
Let us illustrate both aspects in a simple toy example. Namely, let the function $Q\,:\,\mathcal{P}(\{1,2\}) \rightarrow \mathbb{R}^{2 \times 2}$ be given by
\begin{equation*}Q(m) = \begin{pmatrix} -a m_1 & a m_1 \\ b & -b \end{pmatrix},\end{equation*}
where both a and b are positive. This is a simple nonlinear Markov chain, where the transition rate from state 2 to state 1 is as in a classical Markov chain given by the fixed constant b, and the transition rate from state 1 to state 2 increases linearly in the current probability of state 1. This Markov chain naturally arises as a limit of classical Markov chains as follows. Let us assume there are N agents/particles which can be in either state 1 or state 2. Let $n_i$ denote the number of agents/particles in state i, and let $E_N = \{(n_1,n_2)\,:\, n_1 + n_2 = N\}$ be the state space. Moreover, let the transition rate matrix of a classical Markov chain describing the motion of these particles be given by
for all $(n_1,n_2), (k_1,k_2) \in E_N$ . In this process, the more agents/particles are in state 1, the larger the transition rate is for other agents to go to state 2; i.e., the agents/particles face congestion effects in state 1. Renormalizing the state space to $\frac{1}{N} E_N \subseteq \mathcal{P}(\{1,2\})$ , one can then show that the Markov chain describing the N-particle system converges in distribution and in probability to the nonlinear Markov chain described above (see [Reference Kolokoltsov11] and the references therein for general results of this type). This means that for a large number of agents/particles the nonlinear Markov chain approximately describes the behaviour of the distribution of the agents/particles.
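As a rough numerical illustration of the toy example, the following sketch integrates the forward equation $\frac{\partial}{\partial t} \Phi^t(m_0) = \Phi^t(m_0) Q(\Phi^t(m_0))$ with a classical Runge–Kutta scheme and compares the result with the solution of $0 = m^T Q(m)$. It assumes that the generator has off-diagonal rates $Q_{12}(m) = a m_1$ and $Q_{21}(m) = b$, as described in the text; the function names, the step size, and the values $a=2$, $b=1$ are illustrative choices and not part of the original example.

```python
import math

def toy_generator(m, a=2.0, b=1.0):
    """Rate matrix Q(m) of the toy example (assumed form: Q_12 = a*m_1, Q_21 = b)."""
    m1 = m[0]
    return [[-a * m1, a * m1],
            [b, -b]]

def drift(m, generator):
    """Right-hand side f_j(m) = sum_i m_i * Q_ij(m) of the forward equation."""
    q = generator(m)
    n = len(m)
    return [sum(m[i] * q[i][j] for i in range(n)) for j in range(n)]

def flow(m0, t_end, generator, steps=10_000):
    """Approximate Phi^t(m0) at t = t_end with the fourth-order Runge-Kutta method."""
    h = t_end / steps
    m = list(m0)
    for _ in range(steps):
        k1 = drift(m, generator)
        k2 = drift([x + 0.5 * h * k for x, k in zip(m, k1)], generator)
        k3 = drift([x + 0.5 * h * k for x, k in zip(m, k2)], generator)
        k4 = drift([x + h * k for x, k in zip(m, k3)], generator)
        m = [x + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
             for x, p, q, r, s in zip(m, k1, k2, k3, k4)]
    return m

def toy_invariant(a=2.0, b=1.0):
    """Positive root of a*m1^2 + b*m1 - b = 0, i.e. the solution of 0 = m^T Q(m)."""
    m1 = (-b + math.sqrt(b * b + 4.0 * a * b)) / (2.0 * a)
    return [m1, 1.0 - m1]

# Start with all mass in state 1 and compare the flow with the invariant distribution.
m_T = flow([1.0, 0.0], t_end=20.0, generator=toy_generator)
m_bar = toy_invariant()
```

For these illustrative parameters the invariant distribution is $(0.5, 0.5)$, and the numerical marginal at $t=20$ started from $(1,0)$ should be very close to it.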
In this paper we are mainly interested in characterizing the long-term behaviour of nonlinear Markov chains. We say that $m \in \mathcal{P}(\mathcal{S})$ is an invariant distribution if $\left. \frac{\partial}{\partial t} \Phi^t(m) \right|_{t=0} = 0$, and thus also $\frac{\partial}{\partial t} \Phi^t(m) = 0$ for all $t \ge 0$. Equivalently, in terms of the generator, a vector $m \in \mathcal{P}(\mathcal{S})$ is an invariant distribution if and only if it solves $0 = m^TQ(m)$.
We say that a nonlinear Markov chain with nonlinear semigroup $(\Phi^t({\cdot}))_{t \ge 0}$ is strongly ergodic if there exists an $\bar{m} \in \mathcal{P}(\mathcal{S})$ such that for all $m_0 \in \mathcal{P}(\mathcal{S})$ we have
\begin{equation*}\lim_{t \rightarrow \infty} \left\lVert \Phi^t(m_0) - \bar{m} \right\rVert = 0.\end{equation*}
3. Existence and uniqueness of the invariant distribution
The invariant distributions of a nonlinear Markov chain are exactly the fixed points of the set-valued map
\begin{equation*}s\,:\, \mathcal{P}(\mathcal{S}) \rightarrow 2^{\mathcal{P}(\mathcal{S})}, \quad m \mapsto \{ x \in \mathcal{P}(\mathcal{S})\,:\, 0 = x^T Q(m) \}.\end{equation*}
Using Kakutani’s fixed point theorem, we directly obtain the existence of an invariant distribution for any generator, as follows.
Proposition 2. Let $Q({\cdot})$ be a nonlinear generator such that the map $Q\,:\, \mathcal{P}(\mathcal{S}) \rightarrow \mathbb{R}^{S\times S}$ is continuous. Then the nonlinear Markov chain with generator $Q({\cdot})$ has an invariant distribution.
Proof. By [Reference Iosifescu8, Theorem 5.3], the set of all invariant distributions given a fixed generator matrix Q(m) is the convex hull of the invariant distributions given the recurrent communication classes of Q(m). Therefore, the values of the map s are non-empty, convex, and compact. Moreover, the graph of the map s is closed: let $(m^n, x^n)_{n \in \mathbb{N}}$ be a converging sequence such that $x^n \in s(m^n)$ . Denote its limit by (m, x). Then $0 = (x^n)^T Q(m^n)$ for all $n \in \mathbb{N}$ . By continuity of $Q({\cdot})$ we have $0=x^TQ(m)$ , which implies $x \in s(m)$ . Thus, Kakutani’s fixed point theorem yields a fixed point of the map s, which is an invariant distribution given $Q({\cdot})$ .
If Q(m) is irreducible for all $m \in \mathcal{P}(\mathcal{S})$ , the sets s(m) will be singletons [Reference Asmussen1, Theorem 4.2]. Let x(m) denote this point. We remark that there are explicit representation formulas for x(m) (e.g. [Reference Neumann16, Reference Resnick19]). With these insights we provide the following sufficient criterion for the uniqueness of the invariant distribution.
Theorem 1. Assume that Q(m) is irreducible for all $m \in \mathcal{P}(\mathcal{S})$ . Furthermore, assume that $f(m)\,:\!=\,x(m)-m$ is continuously differentiable and that the matrix
is non-singular for all $m \in \mathcal{P}(\mathcal{S})$ . Then there is a unique invariant distribution.
Proof. We first note that any invariant distribution of a nonlinear Markov chain with generator $Q({\cdot})$ is an invariant distribution m of a classical Markov chain with generator Q(m). Since any invariant distribution of a classical Markov chain with generator Q(m) has to satisfy that all components are strictly positive [Reference Asmussen1, Theorem 4.2], no invariant distribution of $Q({\cdot})$ lies on the boundary of $\mathcal{P}(\mathcal{S})$ . Therefore, we only need to ensure the existence of a unique invariant distribution in the interior of $\mathcal{P}(\mathcal{S})$ .
The set $\mathcal{P}(\mathcal{S})$ is homeomorphic to $\bar{\Omega}$ with
where the continuous bijections are given as the restrictions of
Define $\bar{f}\,:\, \bar{\Omega} \rightarrow \bar{\Omega}$ by $m \mapsto \psi(f(\phi(m)))$ . By the chain rule we obtain
The matrix M(m) is, by assumption, non-singular for all $m \in \mathcal{P}(\mathcal{S})$ . Thus,
Since $\phi$ , $\psi$ , f, and det are continuous functions, we obtain that the function $m \mapsto \det\! \left( \frac{\partial \bar{f}(m)}{\partial m} \right)$ is also continuous. Thus, the intermediate value theorem yields that $\det\! \left( \frac{\partial \bar{f}(m)}{\partial m} \right)$ has uniform sign over $\bar{\Omega}$ .
Furthermore, we note that by assumption M(m) is in particular non-singular for all $m \in \phi ( \bar{f}^{-1}(\{0\}))$ . Thus, 0 is a non-critical value of $\bar{f}$ .
The map $\bar{h}\,:\,[0, 1] \times \bar{\Omega} \rightarrow \mathbb{R}^{S-1}$ given by
is continuous. Furthermore, $0 \notin \bar{h}(t, \partial \Omega)$: indeed, a point $m \in \partial \Omega$ satisfies either $m_i=0$ for some $i \in \{1, \ldots, S-1\}$ or $\sum_{i=1}^{S-1} m_i = 1$. However, by [Reference Asmussen1, Theorem 4.2], all components of the invariant distribution for an irreducible generator are strictly positive. Thus, we obtain in the first case that $\bar{h}_i(t,m)>0$ and in the second case that the sum of all components is strictly negative, which in both cases implies that $\bar{h}(t, m) \neq 0$.
With these preparations we can make use of the Brouwer degree (see [Reference Deimling6, Sections 1.1 and 1.2]); namely we obtain that
Since for continuously differentiable maps g and regular values $y \notin g(\partial \Omega)$ the degree is given by
we obtain that
Because the determinant has uniform sign over $\Omega \supseteq \bar{f}^{-1}(\{0\})$ , we obtain that $\bar{f}^{-1}(\{0\})$ consists of exactly one element. Thus, there is a unique stationary point for the nonlinear Markov chain with nonlinear generator $Q({\cdot})$ .
Example 1. We illustrate the use of the result in an example. Consider a nonlinear Markov chain with the following generator:
where all constants are strictly positive. This nonlinear Markov chain arises in a mean-field game model of consumer choice with congestion effects (see [Reference Neumann15], which also gives detailed calculations). In this setting the invariant distributions are given as the solution(s) of the nonlinear equation $0 = m^T Q(m)$ , for which closed-form solutions are hard or impossible to obtain. However, it is possible to verify that the matrix M(m) is non-singular for all $m \in \mathcal{P}(\mathcal{S})$ yielding a unique invariant distribution. This information can in particular be used to obtain certain characteristic properties of the solutions.
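When closed-form solutions of $0 = m^T Q(m)$ are not available, the map $m \mapsto x(m)$ from this section can at least be evaluated numerically. The sketch below solves $x^T Q(m) = 0$, $\sum_i x_i = 1$ by Gaussian elimination and then iterates $m \mapsto x(m)$. The three-state generator `example_generator` is a hypothetical illustration (it is not the generator of Example 1), and the plain fixed-point iteration is only a heuristic: it is not guaranteed to converge in general.

```python
def stationary_distribution(q):
    """Solve x^T Q = 0, sum(x) = 1 for an irreducible rate matrix Q (list of rows)."""
    n = len(q)
    # Rows 0..n-2: balance equations sum_i x_i Q_ij = 0; last row: normalisation.
    a = [[q[i][j] for i in range(n)] for j in range(n - 1)]
    a.append([1.0] * n)
    b = [0.0] * (n - 1) + [1.0]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (b[r] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x

def example_generator(m):
    """Hypothetical irreducible three-state generator (illustration only)."""
    m1, m2, _ = m
    return [[-(2.0 + m1), 1.0 + m1, 1.0],
            [1.0, -(2.0 + m2), 1.0 + m2],
            [1.0, 1.0, -2.0]]

def fixed_point_iteration(generator, m0, iterations=100):
    """Iterate m -> x(m); a fixed point is an invariant distribution."""
    m = list(m0)
    for _ in range(iterations):
        m = stationary_distribution(generator(m))
    return m

m_bar = fixed_point_iteration(example_generator, [1.0 / 3.0] * 3)
q_bar = example_generator(m_bar)
# Residual of the invariance equation 0 = m^T Q(m) at the computed point.
residual = [sum(m_bar[i] * q_bar[i][j] for i in range(3)) for j in range(3)]
```

For this hypothetical generator the first component of the fixed point solves $m_1^2 + 3 m_1 - 1 = 0$, which gives a simple sanity check for the iteration.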
4. Examples of peculiar limit behaviour
The following examples show that the limit behaviour of nonlinear Markov chains (even in the case of small state spaces) is more complex than that of classical continuous-time Markov chains. In particular, it may be that the marginal distributions do not converge, but are periodic; and a nonlinear Markov chain with an irreducible nonlinear generator may not be strongly ergodic, but may exhibit convergence towards several different invariant distributions.
4.1. An example with periodic marginal distributions
Let $B = \mathcal{P}(\{1,2,3\}) \cap \{m \in \mathbb{R}^3\,:\, \min \{m_1, m_2, m_3\} \ge \frac{1}{10} \}$ , and for all $m \in B$ define the matrix Q as follows:
where $\mathbb{I}_A$ is 1 if A is true and 0 otherwise. Since all transition rates on B are Lipschitz continuous functions, there is an extension of $Q_{ij}({\cdot})$ on $\mathcal{P}(\mathcal{S})$ for all $i, j \in \mathcal{S}$ , which is again Lipschitz continuous. Thus, a nonlinear Markov chain with generator Q exists, and whenever $\Phi^t(m_0) \in B$ , we have
Thus, for any neighbourhood $U \subseteq B$ of $\left( \frac{1}{3}, \frac{1}{3}, \frac{1}{3} \right)^T$ the first two components of the marginal behave like the classical harmonic oscillator. Therefore, there are initial distributions such that the marginals are periodic. An example is the initial distribution $m_0 = (0.2, 0.4, 0.4)$ , for which the marginals are plotted in Figure 1.
4.2. An example of a nonlinear Markov chain with irreducible generator that is not strongly ergodic
Let
\begin{equation*}Q(m) = \begin{pmatrix} -\left( \frac{29}{3} m_1^2 - 16 m_1 + \frac{22}{3} \right) & \frac{29}{3} m_1^2 - 16 m_1 + \frac{22}{3} \\ m_1^2 + m_1 + 1 & -\left( m_1^2 + m_1 + 1 \right) \end{pmatrix}.\end{equation*}
This matrix is irreducible for all $m \in \mathcal{P}(\{1,2\})$, since $m_1^2 + m_1 +1 \ge 1$ and $\frac{29}{3} m_1^2 - 16 m_1 + \frac{22}{3} \ge \frac{62}{87}$ for all $m_1 \ge 0$.
The ordinary differential equation describing the marginals for the initial condition $m_0 \in \mathcal{P}(\{1,2\})$ is given by
\begin{equation*}\frac{\partial}{\partial t} \Phi^t_1(m_0) = f(\Phi^t_1(m_0)), \quad f(m_1) \,:\!=\, -m_1 \left( \frac{29}{3} m_1^2 - 16 m_1 + \frac{22}{3} \right) + (1-m_1) \left( m_1^2 + m_1 + 1 \right) = -\frac{32}{3} \left( m_1 - \frac{1}{4} \right) \left( m_1 - \frac{1}{2} \right) \left( m_1 - \frac{3}{4} \right).\end{equation*}
We obtain that there are three stationary points, $m^1= (0.25,0.75)$ , $m^2 = (0.5,0.5)$ , and $m^3=(0.75,0.25)$ , and the following convergence behaviour:
- Since the function $f({\cdot})$ is strictly positive on $[0,0.25)$, the trajectories will converge towards $m_1 = 0.25$ for all initial conditions $(m_0)_1 \in [0,0.25)$.
- Since the function $f({\cdot})$ is strictly negative on $(0.25,0.5)$, the trajectories will converge towards $m_1 = 0.25$ for all initial conditions $(m_0)_1 \in (0.25,0.5)$.
- Since the function $f({\cdot})$ is strictly positive on $(0.5,0.75)$, the trajectories will converge towards $m_1 = 0.75$ for all initial conditions $(m_0)_1 \in (0.5,0.75)$.
- Since the function $f({\cdot})$ is strictly negative on $(0.75,1]$, the trajectories will converge towards $m_1 = 0.75$ for all initial conditions $(m_0)_1 \in (0.75,1]$.
This behaviour is illustrated in Figure 2, where several trajectories for different initial conditions are plotted.
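The claims of this example can be checked with a short computation. The sketch below assumes that the two rates quoted above are assigned as $Q_{12}(m) = \frac{29}{3} m_1^2 - 16 m_1 + \frac{22}{3}$ and $Q_{21}(m) = m_1^2 + m_1 + 1$; this assignment is inferred from the stated stationary points rather than given explicitly. It verifies the three stationary points in exact arithmetic, evaluates the minimum of $Q_{12}$ at the vertex of the parabola, and approximates the limits of two trajectories with an explicit Euler scheme (step size and time horizon are illustrative choices).

```python
from fractions import Fraction

def q12(m1):
    """Assumed transition rate from state 1 to state 2."""
    return Fraction(29, 3) * m1 * m1 - 16 * m1 + Fraction(22, 3)

def q21(m1):
    """Assumed transition rate from state 2 to state 1."""
    return m1 * m1 + m1 + 1

def f(m1):
    """Drift of the first component: f(m1) = -m1 * Q_12 + (1 - m1) * Q_21."""
    return -m1 * q12(m1) + (1 - m1) * q21(m1)

# Exact check of the three stationary points.
stationary = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
residuals = [f(m) for m in stationary]

# Minimum of Q_12, attained at the vertex m1 = 24/29 of the parabola.
min_q12 = q12(Fraction(24, 29))

def limit_of(m1_start, t_end=40.0, steps=4000):
    """Explicit Euler approximation of the trajectory at time t_end."""
    h = t_end / steps
    m1 = float(m1_start)
    for _ in range(steps):
        m1 += h * float(f(m1))
    return m1

left_limit = limit_of(0.4)    # starts in (0.25, 0.5)
right_limit = limit_of(0.6)   # starts in (0.5, 0.75)
```

Trajectories started on either side of $m_1 = 0.5$ approach different stationary points, which is the failure of strong ergodicity described above.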
5. Sufficient criteria for ergodicity for small state spaces
Although nonlinear Markov chains have more complex limit behaviour, we still obtain sufficient criteria for ergodicity in the case of a small number of states. Here we present these criteria and discuss applicability as well as the problems that occur for larger state spaces.
Proposition 3. Let $S=2$ , and assume that $f\,:\, [0, 1] \rightarrow \mathbb{R}$ defined via
\begin{equation*}f(m_1) = m_1 Q_{11}((m_1, 1-m_1)) + (1-m_1) Q_{21}((m_1, 1-m_1))\end{equation*}
is continuous. Furthermore, assume that $(\bar{m},1-\bar{m})$ is the unique stationary point given Q. Then the nonlinear Markov chain is strongly ergodic.
Proof. An equilibrium point is characterized by the property that $\frac{\partial}{\partial t} \Phi^t(m)=0$. By flow-invariance of $\mathcal{P}(\mathcal{S})$ for the ordinary differential equation $\frac{\partial}{\partial t} \Phi^t(m_0) = \Phi^t(m_0) Q(\Phi^t(m_0))$ (see the proof of Proposition 1 in Appendix A), which implies that $\frac{\partial}{\partial t} \Phi^t_1(m) + \frac{\partial}{\partial t} \Phi^t_2(m) =0$, this property is equivalent to the fact that $\frac{\partial}{\partial t} \Phi^t_1(m) = 0$.
Since $\frac{\partial}{\partial t} \Phi^t_1(m) = f(m_1)$ and since we have a unique equilibrium point, we obtain that $f(\bar{m})=0$ and $f(m_1) \neq 0$ for all $m_1 \neq \bar{m}$ . Since $f({\cdot})$ is continuous, we obtain that $f({\cdot})$ is non-vanishing on $[0,\bar{m})$ and $(\bar{m}, 1]$ and has uniform sign on each of these sets. Since $Q({\cdot})$ is a conservative generator we moreover obtain that $f(0)\ge 0$ and $f(1)\le 0$ . Thus, we obtain that $f(m_1)>0$ for all $m_1 \in [0,\bar{m})$ and $f(m_1)<0$ for all $m_1 \in (\bar{m},1]$ . This in turn yields that [0, 1] is flow-invariant for $\dot{m}_1 = f(m_1)$ .
Fix $m_0 \in \mathcal{P}(\mathcal{S})$. Then the systems $\frac{\partial}{\partial t} \Phi^t(m_0) = Q(\Phi^t(m_0))^T \Phi^t(m_0)$ and $\frac{\partial}{\partial t} \tilde{\Phi}^t(m_0) = f(\tilde{\Phi}^t(m_0))$ are equivalent in the sense that $\Phi^t_1(m_0)= \tilde{\Phi}^t(m_0)$ for all $t \ge 0$. Indeed, let $\Phi^t(m_0) = (\Phi^t_1(m_0),\Phi^t_2(m_0))$ be a solution of the differential equation $\frac{\partial}{\partial t} \Phi^t(m_0) = Q(\Phi^t(m_0))^T \Phi^t(m_0)$ with initial condition $\Phi^0(m_0)=m_0$. By flow-invariance of $\mathcal{P}(\mathcal{S})$ for $\frac{\partial}{\partial t} \Phi^t(m_0) = Q(\Phi^t(m_0))^T \Phi^t(m_0)$ (see the proof of Proposition 1 in Appendix A), we have $\Phi^t_2(m_0) = 1-\Phi^t_1(m_0)$ for all $t \ge 0$. Thus, $\frac{\partial}{\partial t} \Phi^t(m_0) = Q(\Phi^t(m_0))^T \Phi^t(m_0)$ is equivalent to
Therefore, $\Phi^t_1(m_0)$ is indeed a solution of $\frac{\partial}{\partial t} \Phi^t_1(m_0) = f(\Phi^t_1(m_0))$. For the converse implication we first note that, because Q(m) is conservative for all $m \in \mathcal{P}(\mathcal{S})$, the last equation of (1) is the first equation multiplied by $({-}1)$. If $\tilde{\Phi}^t(m_0)$ satisfies $\frac{\partial}{\partial t} \tilde{\Phi}^t(m_0) = f(\tilde{\Phi}^t(m_0))$, $\tilde{\Phi}^0(m_0) = (m_0)_1 \in [0, 1]$, then, by flow-invariance, $\tilde{\Phi}^t(m_0) \in [0, 1]$ for all $t\ge0$. Thus, the function $\Phi^t(m_0) = (\tilde{\Phi}^t(m_0), 1- \tilde{\Phi}^t(m_0))$ satisfies $\frac{\partial}{\partial t} \Phi^t(m_0) = Q(\Phi^t(m_0))^T \Phi^t(m_0)$.
The desired convergence statement follows directly from $f(m_1)>0$ for all $m_1 \in [0,\bar{m})$ and $f(m_1)<0$ for all $m_1 \in (\bar{m},1]$ .
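For a concrete two-state generator, the hypotheses of Proposition 3 can be explored numerically. The sketch below builds the scalar drift $f(m_1) = -m_1 Q_{12}((m_1,1-m_1)) + (1-m_1) Q_{21}((m_1,1-m_1))$, scans a grid for sign changes, and locates the zero by bisection, using the sign pattern $f>0$ to the left and $f<0$ to the right of $\bar{m}$ derived in the proof. The rates $Q_{12}(m) = 1 + m_1^2$ and $Q_{21}(m) = 1$ are hypothetical, and a grid scan is only a heuristic indication of uniqueness, not a proof.

```python
def make_drift(q12, q21):
    """Return f(m1) = -m1 * Q_12(m) + (1 - m1) * Q_21(m) with m = (m1, 1 - m1)."""
    def f(m1):
        m = (m1, 1.0 - m1)
        return -m1 * q12(m) + (1.0 - m1) * q21(m)
    return f

def sign_change_intervals(f, grid_size=1000):
    """Grid intervals [a, b] on which f leaves a strict sign (heuristic scan)."""
    intervals = []
    for k in range(grid_size):
        a, b = k / grid_size, (k + 1) / grid_size
        fa, fb = f(a), f(b)
        if (fa > 0 and fb <= 0) or (fa < 0 and fb >= 0):
            intervals.append((a, b))
    return intervals

def bisect_root(f, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection assuming f(lo) > 0 > f(hi), the sign pattern from the proof."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical rates: Q_12(m) = 1 + m_1^2 and Q_21(m) = 1.
f = make_drift(lambda m: 1.0 + m[0] ** 2, lambda m: 1.0)
intervals = sign_change_intervals(f)
m_bar = bisect_root(f)
```

Here $f(m_1) = 1 - 2 m_1 - m_1^3$ is strictly decreasing, so the scan finds a single sign change and the bisection returns the unique stationary point.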
We also obtain a sufficient criterion for the case of three states. The proof technique is similar to the two-state case. Indeed, we first show that our system is equivalent to a two-dimensional system, for which we can then use standard tools for two-dimensional dynamical systems, exploiting that the dynamical system has a particular shape since $Q({\cdot})$ is a conservative generator.
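As mentioned, for systems with three states we obtain that, given $m_0 \in \mathcal{P}(\mathcal{S})$, the function $\Phi^t(m_0) = (\Phi^t_1(m_0), \Phi^t_2(m_0), \Phi^t_3(m_0))$ is a solution of $\frac{\partial}{\partial t} \Phi^t(m_0) =Q(\Phi^t(m_0))^T \Phi^t(m_0)$, $\Phi^0(m_0)=m_0$ if and only if $(\Phi^t_1(m_0), \Phi^t_2(m_0))$ is a solution of
\begin{equation*}\frac{\partial}{\partial t} \begin{pmatrix} \Phi^t_1(m_0) \\ \Phi^t_2(m_0) \end{pmatrix} = f(\Phi^t_1(m_0), \Phi^t_2(m_0)), \quad \begin{pmatrix} \Phi^0_1(m_0) \\ \Phi^0_2(m_0) \end{pmatrix} = \begin{pmatrix} (m_0)_1 \\ (m_0)_2 \end{pmatrix},\end{equation*}
where
\begin{equation*}f(m_1,m_2) = \begin{pmatrix} m_1 Q_{11}(\hat{m}) + m_2 Q_{21}(\hat{m}) + (1-m_1-m_2) Q_{31}(\hat{m}) \\ m_1 Q_{12}(\hat{m}) + m_2 Q_{22}(\hat{m}) + (1-m_1-m_2) Q_{32}(\hat{m}) \end{pmatrix}\end{equation*}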
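and $\hat{m} = (m_1,m_2, 1- m_1-m_2)$. Indeed, the proof is analogous to the proof for the two-state case; the central adjustment is to prove the flow-invariance of $\{(m_1,m_2) \in [0,\infty)^2\,:\, m_1 + m_2 \le 1\}$ for
\begin{equation*}\frac{\partial}{\partial t} \begin{pmatrix} \Phi^t_1(m_0) \\ \Phi^t_2(m_0) \end{pmatrix} = f(\Phi^t_1(m_0), \Phi^t_2(m_0))\end{equation*}
instead of the flow-invariance of [0, 1] for $\frac{\partial}{\partial t} \Phi^t_1(m_0) = f(\Phi^t_1(m_0))$. This statement is proven in the appendix (Lemma 1).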
To show the desired convergence statement, we now rely on the Poincaré–Bendixson theorem [Reference Teschl22, Chapter 7], which characterizes the $\omega$ -limit sets $\omega_+(m_0)$ of a trajectory with initial condition $\Phi^0(m_0)=m_0$ .
Theorem 2. Let $O \supseteq \{(m_1,m_2) \in [0,\infty)^2 \,:\, m_1 + m_2 \le 1\}$ be a simply connected and bounded region such that there is a continuously differentiable function $f\,:\,O \rightarrow \mathbb{R}^2$ satisfying (2) on $\mathcal{P}(\mathcal{S})$ . Let $\bar{m}$ be the unique stationary point given $Q({\cdot})$ . Furthermore, assume that
(a) $\frac{\partial f_1}{\partial m_1} (m) + \frac{\partial f_2}{\partial m_2} (m)$ is non-vanishing for all $m \in O$ and has uniform sign on $O$, and
(b) it holds that
\begin{equation*}\frac{\partial f_1}{\partial m_1} (\bar{m}) \cdot \frac{\partial f_2}{\partial m_2} (\bar{m}) - \frac{\partial f_1}{\partial m_2} (\bar{m}) \cdot \frac{\partial f_2}{\partial m_1} (\bar{m})>0\end{equation*}
or that
\begin{equation*}\left(\frac{\partial f_1}{\partial m_1} (\bar{m}) + \frac{\partial f_2}{\partial m_2} (\bar{m}) \right)^2 - 4 \!\left( \frac{\partial f_1}{\partial m_1} (\bar{m}) \cdot \frac{\partial f_2}{\partial m_2} (\bar{m}) - \frac{\partial f_1}{\partial m_2} (\bar{m}) \cdot \frac{\partial f_2}{\partial m_1} (\bar{m}) \right) <0 .\end{equation*}
Then the nonlinear Markov chain is strongly ergodic.
Proof. Since the set $F\,:\!=\, \{ (m_1, m_2)^T \in \mathbb{R}^2\,:\, m_1, m_2 \ge 0 \wedge m_1 + m_2 \le 1\}$ is flow-invariant for
\begin{equation*}\frac{\partial}{\partial t} \begin{pmatrix} \Phi^t_1(m_0) \\ \Phi^t_2(m_0) \end{pmatrix} = f(\Phi^t_1(m_0), \Phi^t_2(m_0)),\end{equation*}
any trajectory will stay in this set. Since the set F is compact, we obtain by [Reference Teschl22, Lemma 6.6] that $\omega_+(m_0 )$ lies in F. Since there is, by assumption, only one stationary point, we can apply the Poincaré–Bendixson theorem [Reference Teschl22, Theorem 7.16], which yields that one of the following three cases holds:
(i) $\omega_+(m_0) = \{\bar{m}\}$,
(ii) $\omega_+(m_0)$ is a regular periodic orbit, or
(iii) $\omega_+(m_0)$ consists of (finitely many) fixed points $x_1, \ldots, x_k$ and non-closed orbits $\gamma(z)$ such that $\omega_\pm(z) \in \{x_1, \ldots, x_k\}$.
By the condition (a) and Bendixson’s criterion [Reference Jordan and Smith9, Theorem 3.5], the case (ii) is not possible. Since, by the condition (b), the point $\bar{m}$ is not a saddle point, there is no homoclinic path joining $\bar{m}$ to itself. Therefore, since $\bar{m}$ is the only stationary point, the case (iii) is also not possible. Thus, $\omega_+(m_0)=\{\bar{m}\}$ . Since the trajectory considered lies in the compact set F, we moreover obtain by [Reference Teschl22, Lemma 6.7] that
Remark 1. The equivalence of the systems considered above and $(S-1)$-dimensional systems on some subset of $\mathbb{R}^{S-1}$, as well as the construction performed in Section 4.1, hint at the general problem for a larger number of states ($S \ge 4$). It might happen that the dynamics of the nonlinear Markov chain describe a classical ‘chaotic’ nonlinear system like the Lorenz system. In other words, the difficulties that arise in the classical theory of dynamical systems might also arise here, for which reason criteria for a larger number of states are more complex.
Example 2. Theorem 2 now yields strong ergodicity of the nonlinear Markov chain introduced at the end of Section 3. In this setting the function f is given by
moreover, we have $\frac{\partial f_1}{\partial m_1} (m) + \frac{\partial f_2}{\partial m_2}(m) <0$ for all $m \in N_\epsilon([0, 1]^2)$ , as well as
for all $m \in [0, 1]^2$ and thus in particular for the unique invariant distribution. Therefore, by Theorem 2 we obtain strong ergodicity.
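The conditions (a) and (b) of Theorem 2 can also be examined numerically for a given generator. The sketch below does this for a hypothetical three-state generator (it is not the generator of Example 2): it forms the reduced drift $f(m_1,m_2)$ from the first two components of $\hat{m}^T Q(\hat{m})$, approximates the Jacobian by central finite differences, checks the sign of the divergence on a grid of the triangle, and evaluates the determinant and discriminant at the stationary point. The grid check is only indicative, since the theorem requires the sign condition on a whole neighbourhood $O$.

```python
import math

def generator(m):
    """Hypothetical three-state generator (illustration only); rows sum to zero."""
    m1, m2, _ = m
    return [[-(2.0 + m1), 1.0 + m1, 1.0],
            [1.0, -(2.0 + m2), 1.0 + m2],
            [1.0, 1.0, -2.0]]

def reduced_drift(x):
    """f(m1, m2): first two components of m_hat^T Q(m_hat), m_hat = (m1, m2, 1-m1-m2)."""
    m_hat = (x[0], x[1], 1.0 - x[0] - x[1])
    q = generator(m_hat)
    return [sum(m_hat[i] * q[i][j] for i in range(3)) for j in range(2)]

def jacobian(x, eps=1e-6):
    """Central finite-difference Jacobian of the reduced drift at x."""
    jac = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        up, down = list(x), list(x)
        up[k] += eps
        down[k] -= eps
        f_up, f_down = reduced_drift(up), reduced_drift(down)
        for j in range(2):
            jac[j][k] = (f_up[j] - f_down[j]) / (2.0 * eps)
    return jac

def condition_a(grid=50):
    """Divergence is non-zero with one common sign on a grid of the triangle."""
    signs = set()
    for i in range(grid + 1):
        for j in range(grid + 1 - i):
            jac = jacobian([i / grid, j / grid])
            div = jac[0][0] + jac[1][1]
            signs.add(0.0 if div == 0 else math.copysign(1.0, div))
    return len(signs) == 1 and 0.0 not in signs

def condition_b(m_bar):
    """Determinant positive or discriminant negative at the stationary point."""
    jac = jacobian(m_bar)
    trace = jac[0][0] + jac[1][1]
    det = jac[0][0] * jac[1][1] - jac[0][1] * jac[1][0]
    return det > 0 or trace * trace - 4.0 * det < 0

# Stationary point of this particular example in closed form:
# f_1 = -m1^2 - 3*m1 + 1 and f_2 = m1^2 - m2^2 - 3*m2 + 1.
m1_bar = (math.sqrt(13.0) - 3.0) / 2.0
m2_bar = (math.sqrt(13.0 + 4.0 * m1_bar ** 2) - 3.0) / 2.0
```

For this hypothetical generator the divergence equals $-2m_1 - 2m_2 - 6$ and the Jacobian determinant at the stationary point is positive, so both checks succeed.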
Appendix A. Proofs
Proof of Proposition 1. We first note that
is Lipschitz continuous on $\mathcal{P}(\mathcal{S})$ . Indeed, let L be a Lipschitz constant for all functions $Q_{ij}({\cdot})$ ( $i,j \in \mathcal{S}$ ) simultaneously. Moreover, since $\mathcal{P}(\mathcal{S})$ is compact there is a finite constant
Thus, we have
By McShane’s extension theorem [Reference McShane14], there is a Lipschitz continuous extension $\tilde{f}\,:\, \mathbb{R}^S \rightarrow \mathbb{R}^S$ of f. Let us fix an arbitrary $m_0 \in \mathcal{P}(\mathcal{S})$. By the classical existence and uniqueness theorem for ordinary differential equations, we obtain that there is a unique solution $\Phi^\cdot(m_0) \,:\,[0,\infty) \rightarrow \mathbb{R}^S$ of $\frac{\partial}{\partial t} \Phi^t(m_0) = \tilde{f}(\Phi^t(m_0)), \Phi^0(m_0) = m_0$.
As a next step we show that the vectors $f(m) = \tilde{f}(m)$ lie, for all $m \in \mathcal{P}(\mathcal{S})$ , in the Bouligand tangent cone
where the second line follows from [Reference Aubin and Cellina2, Proposition 5.1.7]. Indeed, since for all interior points of $\mathcal{P}(\mathcal{S})$ the condition is trivially satisfied, it suffices to consider the boundary points $m \in \partial \mathcal{P}(\mathcal{S})$. These points satisfy that there is at least one $j \in \mathcal{S}$ such that $m_j=0$. Since the only non-positive column entry of $Q_{\cdot j}$ (which is $Q_{jj}$) gets weight $m_j$, the vector $f(m) = (\sum_{i \in \mathcal{S}} m_i Q_{ij}(m))_{j \in \mathcal{S}}$ will have non-negative entries at each $j \in \mathcal{S}$ such that $m_j = 0$. Since Q is conservative, we moreover obtain that
Thus, $f(m) = \tilde{f}(m) \in T_{\mathcal{P}(\mathcal{S})}(m)$ for all $m \in \mathcal{P}(\mathcal{S})$ . Therefore, by the classical flow-invariance statement for ordinary differential equations ([Reference Walter23, Theorem 10.XVI]), we find that the solution satisfies $m(t) \in \mathcal{P}(\mathcal{S})$ for all $t \ge 0$ . Thus, $\Phi^\cdot(m_0) \,:\,[0,\infty) \rightarrow \mathbb{R}^S$ is also the unique solution of $\frac{\partial}{\partial t} \Phi^t(m_0) = f(\Phi^t(m_0))$ , $\Phi^0(m_0) = m_0$ . The continuity of $\Phi^t({\cdot})$ follows from a classical general dependence theorem [Reference Walter23, Theorem 12.VII].
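The tangency argument in this proof can be illustrated numerically. The sketch below samples boundary points of the simplex and checks the two properties used above: the components of $f(m)$ sum to zero, and $f_j(m) \ge 0$ whenever $m_j = 0$. The three-state generator, the helper names, and the sampling scheme are hypothetical choices made for illustration; passing the check on samples is of course not a proof.

```python
import random

def generator(m):
    """Hypothetical three-state generator (illustration only); rows sum to zero."""
    m1, m2, _ = m
    return [[-(2.0 + m1), 1.0 + m1, 1.0],
            [1.0, -(2.0 + m2), 1.0 + m2],
            [1.0, 1.0, -2.0]]

def drift(m):
    """f_j(m) = sum_i m_i * Q_ij(m)."""
    q = generator(m)
    n = len(m)
    return [sum(m[i] * q[i][j] for i in range(n)) for j in range(n)]

def in_tangent_cone(m, v, tol=1e-12):
    """Check sum(v) = 0 and v_j >= 0 for every coordinate with m_j = 0."""
    sums_to_zero = abs(sum(v)) <= tol
    inward = all(v[j] >= -tol for j in range(len(m)) if m[j] == 0.0)
    return sums_to_zero and inward

def random_boundary_point(rng, n=3):
    """Sample a point of the simplex with at least one zero coordinate."""
    weights = [rng.random() for _ in range(n)]
    weights[rng.randrange(n)] = 0.0
    total = sum(weights)
    return [w / total for w in weights]

rng = random.Random(0)  # fixed seed so that the check is reproducible
samples = [random_boundary_point(rng) for _ in range(1000)]
all_tangent = all(in_tangent_cone(m, drift(m)) for m in samples)

# The vertices of the simplex are boundary points with two zero coordinates.
vertices = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
vertices_tangent = all(in_tangent_cone(m, drift(m)) for m in vertices)
```

Both flags should come out true for any conservative generator, since the only non-positive entry in column $j$ carries the weight $m_j = 0$.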
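Lemma 1. The set $N= \{(m_1,m_2) \in [0,\infty)^2\,:\, m_1 + m_2 \le 1 \}$ is flow-invariant for
\begin{equation*}\frac{\partial}{\partial t} \begin{pmatrix} \Phi^t_1(m_0) \\ \Phi^t_2(m_0) \end{pmatrix} = f(\Phi^t_1(m_0), \Phi^t_2(m_0)).\end{equation*}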
Proof. The statement follows from [Reference Fernandes and Zanolin7, Lemma 1]. This lemma states that for an open set $O \subseteq \mathbb{R}^S$ and a family of continuously differentiable functions $g_i\,:\,O \rightarrow \mathbb{R}$ ( $i\in \{1, \ldots, k\}$ ), the set
is flow-invariant for $\dot{x} = f(x)$ whenever for any $x \in \partial M$ there is an $i \in \{1, \ldots, k\}$ such that $g_i(x)=0$ and
Indeed, in our case we have
and the boundary points of this set satisfy either $m_i=0$ for at least one $i \in \{1,2\}$ , or $m_1+m_2 =1$ . Since $Q({\cdot})$ is conservative and irreducible, we obtain
in the first case and
in the second case. Therefore, the claim follows.
Funding information
There are no funding bodies to thank in relation to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process for this article.