1. Introduction
This paper studies the stationary distribution in continuous-time upper block-Hessenberg Markov chains (UBH-MCs). This study focuses on establishing a theoretical procedure for constructing the exact stationary distribution in an algorithmic way, rather than establishing a practical procedure for computing an approximate stationary distribution (of course, such practicality is a significant issue, too). Additionally, this study would be a partial answer to the question, ‘What is a class of Markov chains whose exact stationary distributions are theoretically constructible by an algorithmic procedure?’
The class of UBH-MCs is characterized by the upper block-Hessenberg (UBH) structure of the infinitesimal generator matrix (called generator for short), and this class includes M/G/1-type Markov chains and level-dependent quasi-birth-and-death processes (LD-QBDs). Remarkably, even a multidimensional random walk on the nonnegative lattice is expressed as a UBH-MC such that its level variable is the sum of coordinate components and its phase variable is the set of all the coordinate components but one. Thus, the class of UBH-MCs is a powerful tool for modeling information and communication systems, inventory systems, transportation systems, etc. Moreover, the time-average performance of these systems is evaluated through the stationary distribution. That is why this study focuses on the stationary distribution in UBH-MCs.
We now introduce the definition of UBH-MCs in order to describe the background and purpose of this study. Let $\mathbb{S}$ denote a countable set such that
where $\mathbb{Z}_+=\{0,1,2,\dots\}$ and $\mathbb{M}_k = \{1,2,\dots,M_k\} \subset \mathbb{N}\,:\!=\,\{1,2,3,\dots\}$ . Let $\boldsymbol{Q}\,:\!=$ $\big(q\big(k,i;\,\ell,j\big)\big)_{(k,i;\,\ell,j) \in\mathbb{S}^2}$ denote an essential Q-matrix (see Definition B.1), and thus
Assume that $\boldsymbol{Q}$ is of UBH form:
where $\mathbb{L}_k \,:\!=\, \{k\} \times \mathbb{M}_k$ is called level k and an element i of $\mathbb{M}_k$ is called phase i (of level k). A Markov chain having this Q-matrix $\boldsymbol{Q}$ as its generator is said to be an upper block-Hessenberg Markov chain (UBH-MC). Note that a UBH-MC may be called a level-dependent M/G/1-type Markov chain. Note also that if $\boldsymbol{Q}_{k,k+m} = \boldsymbol{O}$ for all $m \ge 2$ and $k \in \mathbb{Z}_+$ then the generator is of block-tridiagonal form and thus the UBH-MC is reduced to an LD-QBD.
We provide a basic assumption and some definitions associated with the stationary distribution vector of the UBH generator $\boldsymbol{Q}$ given in (1.1). Assume that $\boldsymbol{Q}$ is ergodic (i.e., irreducible and positive recurrent) throughout the paper, unless otherwise stated. Let $\boldsymbol{\pi}\,:\!=\,(\pi(k,i))_{(k,i)\in\mathbb{S}}$ denote the unique and positive stationary distribution vector of the ergodic generator $\boldsymbol{Q}$ (see, e.g., [Reference Anderson1, Chapter 5, Theorems 4.4 and 4.5]). By definition,
where $\boldsymbol{e}$ denotes a column vector of ones of appropriate dimension. For later reference, let $\boldsymbol{\pi}_k=(\pi(k,i))_{i\in\mathbb{M}_k}$ for $k \in \mathbb{Z}_+$ , which leads to the level-wise partition of $\boldsymbol{\pi}$ :
Furthermore, let $\boldsymbol{\pi}^{(n)}$ , $n \in \mathbb{Z}_+$ , denote
which is referred to as the (finite-level) conditional stationary distribution vector. For any sufficiently large n, $\boldsymbol{\pi}^{(n)}$ can be considered an approximation to $\boldsymbol{\pi}$ .
The main purpose of this paper is to present a quasi-algorithmically constructible solution for the stationary distribution vector $\boldsymbol{\pi}$ of the ergodic UBH generator $\boldsymbol{Q}$ given in (1.1). We here provide our notion (not strict definition) of quasi-algorithmic constructions for the stationary distribution (the stationary distribution vector) in ergodic countable-state Markov chains, which are not, of course, restricted to UBH-MCs.
Notion 1.1 (Quasi-algorithmic constructions for the stationary distribution in countable-state Markov chains.) Consider an ergodic countable-state Markov chain with generator (essential Q-matrix) $\mathcal{Q}$ . Let $\pi$ denote the stationary distribution of the ergodic Markov chain. Let $\mathscr{P}$ denote a procedure for sequentially generating tentative solutions $\pi(0),\pi(1),\pi(2),\dots$ for the stationary distribution $\pi$ . The procedure $\mathscr{P}$ is said to be a quasi-algorithmic construction of $\pi$ , and furthermore $\pi$ is said to be quasi-algorithmically constructible (or have quasi-algorithmic constructibility), if $\mathscr{P}$ has the following features:
-
(i) The procedure $\mathscr{P}$ generates each tentative solution $\pi(n)$ by using at most a finite (not necessarily bounded) number of elements of $\mathcal{Q}$ together with (if required) some or all of the components of the previous tentative solutions $\{\pi(m);\,m \in \mathbb{Z}_{[0,n-1]}\}$ , where $\mathbb{Z}_{[0,k]} = \{0,1,\dots,k\}$ for $k \in \mathbb{Z}_+$ .
-
(ii) It takes at most finite complexity for $\mathscr{P}$ to generate each tentative solution $\pi(n)$ .
-
(iii) The sequence $\{\pi(n);\,n\in\mathbb{Z}_+\}$ converges to $\pi$ in some mathematical sense (e.g., in the $\ell_1$ -distance).
Remark 1.2. To the best of our knowledge, there is no universal formal definition of ‘algorithm’. Indeed, the term ‘algorithm’ is widely used with some ambiguity in various contexts. Nevertheless, it would be generally accepted, as an informal definition, that an ‘algorithm’ is a finite set of operations (of finite computational complexity) to accomplish a particular task. Considering this situation, we introduce the notion of quasi-algorithmic constructions in order to distinguish our new solution presented in this paper from the existing ones for the stationary distribution in UBH-MCs.
Remark 1.3. Any quasi-algorithmic construction can be implemented as an ‘algorithm’ if the construction is equipped with an appropriate stopping criterion so that it stops after finitely many iterations, and if it is possible to store the elements required to compute each tentative solution.
As far as we know, there have been no studies that achieve a quasi-algorithmic construction of the stationary distribution vector in UBH-MCs including LD-QBDs. There are, however, many related studies on the computation of the stationary distribution vector in these Markov chains.
Bright and Taylor [Reference Bright and Taylor5] presented a nice matrix-product form of the stationary distribution vector $\boldsymbol{\pi}=(\boldsymbol{\pi}_0,\boldsymbol{\pi}_1,\dots)$ of the LD-QBD generator:
The set of matrices $\{\boldsymbol{R}_k;\,k\in\mathbb{N}\}$ is the minimal nonnegative solution for the system of matrix equations
The matrix-product form (1.3) appears easy to compute. Thus, based on it, Bright and Taylor [Reference Bright and Taylor5] proposed a procedure for approximately computing the stationary distribution vector of the LD-QBD generator. In addition, some researchers have used the matrix-product form (1.3) as the foundation of their respective algorithms (see, e.g., [Reference Baumann and Sandmann3, Reference Phung-Duc, Masuyama, Kasahara and Takahashi17]).
However, the matrix-product form (1.3) does not lead to any quasi-algorithmic construction of the stationary distribution vector in LD-QBDs. Equation (1.4) yields
This recursive formula shows that each $\boldsymbol{R}_k$ can be computed provided that $\boldsymbol{R}_{k+1}$ is given. Therefore, to implement the algorithms based on (1.3), we have to truncate the infinite sequence $\{\boldsymbol{R}_k;\, k \in \mathbb{N}\}$ at a sufficiently large $K^* \in \mathbb{Z}_+$ by letting $\boldsymbol{R}_{K^*+1} = \boldsymbol{O}$ . Moreover, owing to this implementation, computing better approximations (to the stationary distribution) requires setting R-matrices of higher levels to be zero matrices, which implies that all the components of such an approximation are computed anew from scratch.
Several researchers have studied the approximate computation of the stationary distribution in UBH-MCs. Takine [Reference Takine20] presented an algorithm for the conditional stationary distribution vector $\boldsymbol{\pi}^{(n)}$ under some additional conditions, which were removed by Kimura and Takine [Reference Kimura and Takine6]. As mentioned in Section 1 of [Reference Takine20], the conditional stationary distribution vector $\boldsymbol{\pi}^{(n)}$ can be a good approximation to the stationary distribution vector $\boldsymbol{\pi}$ for all sufficiently large $n \in \mathbb{N}$ . Klimenok and Dudin [Reference Klimenok and Dudin10], Li et al. [Reference Li, Lian and Liu13], and Shin and Pearce [Reference Shin and Pearce19] proposed algorithms for approximately computing the stationary distribution vector $\boldsymbol{\pi}$ in UBH-MCs by making transition rates (or transition probabilities) eventually level-independent.
In summary, all the existing algorithms mentioned above were originally designed for approximately computing the stationary distribution vector $\boldsymbol{\pi}$ in UBH-MCs (including LD-QBDs). Thus, although those algorithms fulfill practical purposes, they do not make any theoretical contribution to the quasi-algorithmic construction of the stationary distribution vector $\boldsymbol{\pi}$ .
Unlike these previous studies, Masuyama [Reference Masuyama16] has taken the first step towards the quasi-algorithmic construction of $\boldsymbol{\pi}$ in UBH-MCs. To describe the numerical procedure proposed in [Reference Masuyama16] (and to facilitate later discussion), we introduce the following notation: a finite-dimensional matrix (resp. vector) is extended (if necessary) to an infinite-dimensional matrix (resp. vector) by appending zeros to it with its original elements fixed in their original positions. This notation enables us to define addition, subtraction, and product over vectors and matrices with different sizes. Furthermore, for $k,n \in \{0,\pm1, \pm2,\dots\}$ , the notation ‘ $\prod_{m=k}^n \downarrow$ ’ denotes a product operator for matrices (including scalars) such that
Masuyama [Reference Masuyama16] proposed a sequential update algorithm (Algorithm 1 in [Reference Masuyama16]) that generates a matrix-infinite-product form (MIP-form) solution for $\boldsymbol{\pi}=(\boldsymbol{\pi}_0,\boldsymbol{\pi}_1,\dots)$ :
where (i) $\boldsymbol{U}_n = \boldsymbol{Q}_{n+1,n}\boldsymbol{U}_n^*$ ; (ii) each $\boldsymbol{U}_n^*$ , $n\in\mathbb{Z}_+$ , is computed in a finite number of operations starting from $\boldsymbol{U}_0^*=\big({-}\boldsymbol{Q}_{0,0}\big)^{-1}$ ; and (iii) $\boldsymbol{\alpha}^{\dagger}_n$ is a $1 \times M_n$ probability vector that is an optimal solution for a certain linear fractional programming (LFP) problem, Problem 2.3 in Section 2.3. For the reader’s convenience, we summarize the sequential update algorithm [Reference Masuyama16, Algorithm 1] in Algorithm 1 in Section 4.
The existing MIP-form solution (1.5) (and thus the sequential update algorithm in [Reference Masuyama16]) requires Conditions 1.4 and 1.5 below (see [Reference Masuyama16, Conditions 1 and 2]).
Condition 1.4. (Foster–Lyapunov drift condition) There exist a constant $b \in (0,\infty)$ , a finite set $\mathbb{C} \subset \mathbb{S}$ , and a positive column vector $\boldsymbol{v}\,:\!=\,(v(k,i))_{(k,i)\in\mathbb{S}}$ such that $\inf_{(k,i)\in\mathbb{S}}v(k,i) > 0$ and
where $\textbf{1}_{\mathbb{A}}\,:\!=\,(1_{\mathbb{A}}(k,i))_{(k,i)\in\mathbb{S}}$ , $\mathbb{A} \subseteq \mathbb{S}$ , denotes a column vector given by
The space of parameter sets $(\boldsymbol{v},b,\mathbb{C})$ satisfying (1.6) is denoted by $\mathcal{V}$ , for convenience.
Condition 1.5. (Convergence condition for the sequential update algorithm in [Reference Masuyama16])
where, for any matrix $\boldsymbol{X}$ (resp. vector $\boldsymbol{x}$ ), the symbol $|\boldsymbol{X}|$ (resp. $|\boldsymbol{x}|$ ) denotes the matrix (resp. vector) obtained by taking the absolute value of each element of $\boldsymbol{X}$ (resp. $\boldsymbol{x}$ ).
Conditions 1.4 and 1.5 diminish the usefulness of the MIP-form solution (1.5). A parameter set $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ as in Condition 1.4 is required to describe the objective function of Problem 2.3 with optimal solution $\boldsymbol{\alpha}^{\dagger}_n$ . In fact, Condition 1.4 holds if and only if the irreducible generator $\boldsymbol{Q}$ is ergodic (see [Reference Kontoyiannis and Meyn11, Theorem 1.1]). However, even though we know that $\boldsymbol{Q}$ is ergodic, it may not be easy to find $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ in some cases. In addition, Condition 1.5 is required to prove the convergence of the MIP-form solution (1.5) (see [Reference Masuyama16, Theorem 3.2]), though this condition does not necessarily hold for all UBH-MCs (such an example is presented in Appendix A).
We note that the existing MIP-form solution (1.5) is not quasi-algorithmically constructible. For each $n \in \mathbb{Z}_+$ , the MIP-form solution (1.5) requires an optimal solution $\boldsymbol{\alpha}_n^{\dagger}$ for Problem 2.3, and its objective function includes the infinite sums:
where $\boldsymbol{v}_{\ell} =(v(\ell,i))_{i\in\mathbb{M}_{\ell}}$ is a subvector of $\boldsymbol{v}$ . Clearly, the infinite sum has infinite computational complexity, in general. Therefore, the existing MIP-form solution (1.5) is not quasi-algorithmically constructible, and computing it requires the truncation of the infinite sums (1.7), which causes truncation error.
This paper presents a new MIP-form solution for $\boldsymbol{\pi}=(\boldsymbol{\pi}_0,\boldsymbol{\pi}_1,\dots)$ that is quasi-algorithmically constructible:
where $K \in \mathbb{Z}_+$ is a free parameter, and where $\boldsymbol{\alpha}_n^*$ , $n\ge K$ , is a $1 \times M_n$ probability vector that is an optimal solution for the LFP problem (Problem 3.1 in Section 3) different from Problem 2.3 of the existing MIP-form solution (1.5). The objective function of Problem 3.1 does not include such a parameter set as $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ and is computed in a finite number of operations without any infinite sum like (1.7). Thus, the new MIP-form solution (1.8) is quasi-algorithmically constructible. In addition, the new solution holds without any additional conditions, such as Condition 1.5 for convergence, except for the ergodicity of $\boldsymbol{Q}$ .
The rest of this paper is divided into five sections. Section 2 provides preliminary results, including the existing MIP-form solution presented in [Reference Masuyama16]. Section 3 presents our new MIP-form solution together with its theoretical foundations, which is the main result of this paper. Section 4 considers the advantages of our MIP-form solution over the existing one. Section 5 discusses special cases where our solution can be established more effectively. Finally, Section 6 contains concluding remarks.
2. The existing MIP-form solution
This section reviews the existing MIP-form solution (1.5) to facilitate describing our new MIP-form solution (established in Section 3) and understanding its advantages over the existing one. First, we introduce the foundation of the existing MIP-form solution, the last-block-column linearly augmented (LBC-LA) truncation approximation to the stationary distribution vector $\boldsymbol{\pi}$ . We then provide a matrix-product form of the LBC-LA truncation approximation. Finally, as a certain limit of the matrix-product form, we express the existing MIP-form solution (1.5) for $\boldsymbol{\pi}$ .
For later reference, we define the $\ell_1$ -distance for vectors of different dimensions. Let $\mathbb{G}$ be a countable (possibly infinite) set. Let $\boldsymbol{h}_1 =(h_1(\,j))_{j\in\mathbb{G}_1}$ and $\boldsymbol{h}_2\,:\!=\,(h_2(\,j))_{j\in\mathbb{G}_2}$ be real-valued vectors, where $\mathbb{G}_1$ and $\mathbb{G}_2$ are subsets of $\mathbb{G}$ . For vectors $\boldsymbol{h}_1$ and $\boldsymbol{h}_2$ , let
this is referred to as the $\ell_1$ -distance between $\boldsymbol{h}_1$ and $\boldsymbol{h}_2$ .
2.1. Definition of the LBC-LA truncation approximation
We begin with some definitions to describe the LBC-LA truncation approximation. Let $\mathbb{S}_n = \bigcup_{k=0}^n \mathbb{L}_k$ for $n \in \mathbb{Z}_+$ and $\mathbb{S}_{-1} = \varnothing$ for convenience. For any set $\mathbb{X}$ , let $|\mathbb{X}|$ denote the cardinality of $\mathbb{X}$ . For any $n \in \mathbb{Z}_+$ , let ${}_{{(n)}}\boldsymbol{Q}$ denote a submatrix of $\boldsymbol{Q}$ consisting of the first $|\mathbb{S}_n|$ rows and columns, that is,
Furthermore, let ${}_{{(n)}}\widehat{\boldsymbol{\alpha}}$ , $n \in \mathbb{Z}_+$ , denote a $1 \times |\mathbb{S}_n|$ probability vector that has its probability masses on the last block (corresponding to $\mathbb{L}_n$ ), that is,
where $\boldsymbol{\alpha}_n\,:\!=\,(\alpha_n(\,j))_{j\in\mathbb{M}_n}$ is a probability vector.
We define the LBC-LA truncation approximation to $\boldsymbol{\pi}$ as a stationary distribution vector of the finite essential Q-matrix ${}_{(n)}\widehat{\boldsymbol{Q}}$ given by
The Q-matrix ${}_{(n)}\widehat{\boldsymbol{Q}}$ is referred to as the last-block-column linearly augmented (LBC-LA) truncation of generator $\boldsymbol{Q}$ . Clearly, the LBC-LA truncation ${}_{(n)}\widehat{\boldsymbol{Q}}$ has at least one stationary distribution vector. One of them is given by
This stationary distribution vector ${}_{(n)}\widehat{\boldsymbol{\pi}}$ is referred to as the LBC-LA truncation approximation to $\boldsymbol{\pi}$ , and is partitioned level-wise:
2.2. A matrix-product form of the LBC-LA truncation approximation
This subsection presents a matrix-product form of the LBC-LA truncation approximation ${}_{(n)}\widehat{\boldsymbol{\pi}}$ . We first introduce some vectors and matrices needed to express the matrix-product form of ${}_{(n)}\widehat{\boldsymbol{\pi}}$ . We then present the matrix-product form together with recursive formulas for the components of it. Finally, we mention a connection between the matrix-product form and the existing MIP-form solution (1.5) for $\boldsymbol{\pi}$ .
We define the component matrices of the matrix-product form of ${}_{(n)}\widehat{\boldsymbol{\pi}}$ . Let $\boldsymbol{U}_n^*$ , $n \in \mathbb{Z}_+$ , denote an $M_n \times M_n$ matrix such that its (i, j)th element represents the expected total sojourn time in state (n, j) before the first visit to $\overline{\mathbb{S}}_n\,:\!=\,\mathbb{S}\setminus \mathbb{S}_n=\cup_{k=n+1}^{\infty}\mathbb{L}_k$ (i.e., to any state in levels $n+1,n+2,\dots$ ) starting from state (n, i). The matrices $\boldsymbol{U}_n^*$ , $n \in \mathbb{Z}_+$ , are determined recursively:
and, for $n \in \mathbb{N}$ ,
Furthermore, let $\boldsymbol{U}^*_{n,k}$ , $n \in \mathbb{Z}_+$ , $k\in\mathbb{Z}_{[0,n]}$ , denote
It then follows from (2.2) that
For simplicity of notation, we define $\boldsymbol{U}_m$ , $m\in\mathbb{Z}_+$ , as
Using (2.5), we rewrite (2.3) as
Finally, $\boldsymbol{u}^*_n\,:\!=\,(u_n^*(i))_{i\in\mathbb{M}_n}$ , $n \in \mathbb{Z}_+$ , is defined as the row sum vector of the $|\mathbb{L}_n| \times |\mathbb{S}_n|$ matrix $\big(\boldsymbol{U}_{n,0}^*\, \boldsymbol{U}_{n,1}^*\, \cdots\, \boldsymbol{U}_{n,n}^*\big)$ :
where $\boldsymbol{u}_n^* > \textbf{0}$ follows from [Reference Masuyama16, Remark 2.3].
Remark 2.1. The probabilistic interpretation of $\boldsymbol{U}_n^*$ leads to that of $\boldsymbol{U}_{n,\ell}^*$ : its (i, j)th element represents the expected total sojourn time in state $(\ell,j)$ before the first visit to $\overline{\mathbb{S}}_{n}$ starting from state (n, i). This interpretation implies that the matrix $\big(\boldsymbol{U}_{n,0}^*\, \boldsymbol{U}_{n,1}^*\, \cdots\, \boldsymbol{U}_{n,n}^*\big)$ is identified with the rows in level n of $({-}{}_{(n)}\boldsymbol{Q})^{-1}$ , because this inverse matrix consists of the expected total sojourn times in the respective states in $\mathbb{S}_n$ before the first visit to $\overline{\mathbb{S}}_{n}$ starting from the states in $\mathbb{S}_n$ . We thus have (see [Reference Latouche and Ramaswami12, Theorem 5.5.1] and [Reference Masuyama16, Remark 2.2])
We are now ready to present the matrix-product form of ${}_{(n)}\widehat{\boldsymbol{\pi}}=\big({}_{(n)}\widehat{\boldsymbol{\pi}}_0,{}_{(n)}\widehat{\boldsymbol{\pi}}_1,\dots,{}_{(n)}\widehat{\boldsymbol{\pi}}_n\big)$ .
Proposition 2.2. ([Reference Masuyama16, Equation (2.21) and Lemma 2.2]) For $n \in \mathbb{Z}_+$ and $k \in \mathbb{Z}_{[0,n]}$ ,
The components $\boldsymbol{U}_{n,k}^*$ and $\boldsymbol{u}_n^*$ of the matrix-product form (2.9) are the following recursive formulas (see [Reference Masuyama16, Equations (3.23) and (3.24)]):
and
where $\boldsymbol{U}_n^*$ is given in (2.4).
Proposition 2.2 implies that an MIP-form solution for $\boldsymbol{\pi}$ can be obtained as a limit of the matrix-product form (2.9), that is,
However, as mentioned in [Reference Masuyama16, Section 2.3], achieving a limit like (2.12) convergent to $\boldsymbol{\pi}$ requires choosing the probability vectors $\boldsymbol{\alpha}_n$ , $n \in \mathbb{Z}_+$ , appropriately. To do this, Masuyama [Reference Masuyama16] formulated a certain series of LFP problems. The details are given in the next subsection.
2.3. The existing MIP-form solution for the stationary distribution vector
This subsection provides a brief summary of the existing MIP-form solution (1.5) for $\boldsymbol{\pi}$ . We first show the LFP problem whose optimal solution is the key component of the existing MIP-form solution. We then present the proposition that summarizes the theoretical results on the existing MIP-form solution.
To obtain an MIP-form solution for $\boldsymbol{\pi}$ , Masuyama [Reference Masuyama16] formulates the following LFP problem indexed with $n \in \mathbb{Z}_+$ .
Problem 2.3.
where $\boldsymbol{y}_n\,:\!=\,(y_n(i))_{i\in\mathbb{M}_n}$ is given by
Remark 2.4. Equation (2.7) implies that $\boldsymbol{\alpha}_n\boldsymbol{u}_n^* > 0$ for any feasible solution $\boldsymbol{\alpha}_n$ . Therefore, the objective function of Problem 2.3 is well-defined.
Remark 2.5. If the UBH generator $\boldsymbol{Q}$ satisfies $\boldsymbol{Q}_{k,k+m} = \boldsymbol{O}$ for $m \ge 2$ ( $\boldsymbol{Q}$ is an LD-QBD generator), then (2.13) is reduced to
Proposition 2.6 below shows how to find an optimal solution for Problem 2.3 and how to construct, with such a solution, the existing MIP-form solution (1.5) originally presented in [Reference Masuyama16].
Proposition 2.6. ([Reference Masuyama16, Theorems 3.1 and 3.2, and Corollary 3.1]) If Condition 1.4 is satisfied, then the following hold:
-
(i) For $n \in \mathbb{Z}_+$ , let $\boldsymbol{\alpha}_n^{\dagger}\,:\!=\,\big(\alpha_n^{\dagger}(\,j)\big)_{j\in\mathbb{M}_n}$ denote a probability vector such that
\begin{equation*}\alpha_n^{\dagger}(\,j)=\left\{\begin{array}{ll}1, & \quad j=j_n^{\dagger},\\[5pt]0, & \quad j \neq j_n^{\dagger},\end{array}\right.\end{equation*}where(2.14) \begin{equation}j_n^{\dagger}\in \mathop{\textrm{arg min}}\limits_{j\in\mathbb{M}_n} {\frac{y_n(\,j)}{u_n^*(\,j)}}.\end{equation}The probability vector $\boldsymbol{\alpha}_n^{\dagger}$ is an optimal solution for Problem 2.3. -
(ii) For $n \in \mathbb{Z}_+$ , let ${}_{(n)}\widehat{\boldsymbol{\pi}}^{\dagger}\,:\!=\, \big({}_{(n)}\widehat{\boldsymbol{\pi}}_{0}^{\dagger},{}_{(n)}\widehat{\boldsymbol{\pi}}_{1}^{\dagger},\dots,{}_{(n)}\widehat{\boldsymbol{\pi}}_{n}^{\dagger}\big)$ denote a probability vector such that
(2.15) \begin{equation}{}_{(n)}\widehat{\boldsymbol{\pi}}_{k}^{\dagger}= {\frac{\boldsymbol{\alpha}_n^{\dagger}\boldsymbol{U}_{n,k}^*}{\boldsymbol{\alpha}_n^{\dagger} \boldsymbol{u}_n^*}}= {\frac{\mathrm{row}_{j_n^{\dagger}}\big(\boldsymbol{U}_{n,k}^*\big)}{u_n^*\big(\,j_n^{\dagger}\big)}},\qquad k \in \mathbb{Z}_{[0,n]},\end{equation}where $\mathrm{row}_j({\cdot})$ denotes the jth row of the matrix in the parentheses. Furthermore, if a parameter set $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ as in Condition 1.4 satisfies Condition 1.5, then\begin{align*}\lim_{n\to\infty} \| {}_{(n)}\widehat{\boldsymbol{\pi}}^{\dagger} - \boldsymbol{\pi} \|_1 = 0,\end{align*}and(2.16) \begin{equation}\boldsymbol{\pi}=\lim_{n\to\infty}\!\left({\frac{\boldsymbol{\alpha}_n^{\dagger} \boldsymbol{U}_{n,k}^*}{\boldsymbol{\alpha}_n^{\dagger} \boldsymbol{u}_n^*}}\right)_{\!\!k\in\mathbb{Z}_{[0,n]}}= \lim_{n\to\infty}\!\left({\frac{\boldsymbol{\alpha}_n^{\dagger} \boldsymbol{U}_n^* \prod_{m=k}^{n-1} \downarrow \boldsymbol{U}_m}{\boldsymbol{\alpha}_n^{\dagger} \boldsymbol{U}_n^*\sum_{\ell=0}^n \!\big(\prod_{m=\ell}^{n-1} \downarrow \boldsymbol{U}_m\big) \boldsymbol{e}}}\right)_{\!\!k\in\mathbb{Z}_{[0,n]}}.\end{equation}
Remark 2.7. The symbols $\boldsymbol{\alpha}_n^{\dagger}$ , $j_n^{\dagger}$ , and ${}_{(n)}\widehat{\boldsymbol{\pi}}^{\dagger}$ correspond to $\boldsymbol{\alpha}_n^*$ $j_n^*$ , and ${}_{(n)}\widehat{\boldsymbol{\pi}}^*$ , respectively, in the original notation of [Reference Masuyama16].
Remark 2.8. Constructing the MIP-form solution (2.16) (and thus (1.5)) requires the sequence $\big\{\boldsymbol{\alpha}_n^{\dagger}\big\}$ . Each $\boldsymbol{\alpha}_n^{\dagger}$ is an optimal solution for Problem 2.3 with index n. Therefore, we only have to solve a single optimization problem for each $n \in \mathbb{Z}_+$ .
3. A new MIP-form solution
This section presents a new MIP-form solution for $\boldsymbol{\pi}$ . To construct the new MIP-form solution, we first formulate an LFP problem different from Problem 2.3. We then provide a way to find its optimal solution. Finally, we show the new MIP-form solution for $\boldsymbol{\pi}$ .
We introduce some definitions to describe our LFP problem for the new MIP-form solution. Let $\mathbb{Z}_{\ge m}\,:\!=\,\{m,m+1,\dots\}$ for any integer m. We then define $\boldsymbol{u}_{n,\mathbb{K}}^*\,:\!=\,\big(u_{n,\mathbb{K}}^*(i)\big)_{i\in\mathbb{M}_n}$ , $n \in \mathbb{Z}_{\ge K}$ , as
where the second equality is due to (2.6). We also define $\mathbb{I}_n^+$ , $n \in \mathbb{Z}_+$ , as
where $({\cdot})_j$ denotes the jth element of the (row or column) vector in the parentheses. Since the generator $\boldsymbol{Q}$ is irreducible, $\big(\boldsymbol{e}^{\top}\boldsymbol{Q}_{n+1,n}\big)_j > 0$ for at least one $j \in \mathbb{M}_n$ , and thus $\mathbb{I}_n^+ \neq \varnothing$ . Furthermore, since $\boldsymbol{\pi}_{n+1} > \textbf{0}$ and $\boldsymbol{Q}_{n+1,n} \ge \boldsymbol{O}, \neq \boldsymbol{O}$ ,
which is used later.
The following is our LFP problem, which is formulated for each $n \in \mathbb{Z}_{\ge K}$ .
Problem 3.1.
Remark 3.2. The objective function of Problem 3.1 is well-defined, as with Problem 2.3 (see Remark 2.4).
An optimal solution for Problem 3.1 is given by Lemma 3.3 below.
Lemma 3.3. For $n \in \mathbb{Z}_{\ge K}$ , fix $j_n^* \in \mathbb{I}_n^+ \neq \varnothing$ such that
and let $\boldsymbol{\alpha}_n^*\,:\!=\,(\alpha_n^*(\,j))_{j\in\mathbb{M}_n}$ denote a unit row vector such that
The vector $\boldsymbol{\alpha}_n^*$ is then an optimal solution for Problem 3.1, and the optimal (and thus maximum) value $r_n\big(\boldsymbol{\alpha}_n^*\big)=u_{n,\mathbb{K}}^*\big(\,j_n^*\big)/u_{n}^*\big(\,j_n^*\big)$ is positive.
Proof. First, we prove that $\boldsymbol{\alpha}_n^*$ is an optimal solution for Problem 3.1. It follows from (3.8) and (3.9) that
which leads to $u_{n,\mathbb{K}}^*(\,j) \le \xi_n u_n^*(\,j)$ for all $j \in \mathbb{I}_n^+$ . Thus, for any feasible solution $\boldsymbol{\alpha}_n$ satisfying (3.5)–(3.7), we have
Combining this result with (3.4) and (3.10) yields
which shows that $\boldsymbol{\alpha}_n^*$ is an optimal solution for Problem 3.1.
Next, we prove $r_n\big(\boldsymbol{\alpha}_n^*\big) > 0$ by contradiction. To this end, suppose that $r_n\big(\boldsymbol{\alpha}_n^*\big) = 0$ , or equivalently, that $\boldsymbol{\alpha}_n \boldsymbol{u}_{n,\mathbb{K}}^* = 0$ for any $\boldsymbol{\alpha}_n$ satisfying (3.5)–(3.7). We then have
Therefore, using (3.3), (3.1), and (2.8), we obtain
which contradicts $\boldsymbol{\pi}>\textbf{0}$ . The proof has been completed.
The next theorem shows that the optimal solution $\boldsymbol{\alpha}_n^*$ for Problem 3.1 leads to an MIP-form solution for $\boldsymbol{\pi}$ , which is different from the existing one (2.16) based on Problem 2.3.
Theorem 3.4. For each $n \in \mathbb{Z}_{\ge K}$ , let ${}_{(n)}\widehat{\boldsymbol{\pi}}^*= \big({}_{(n)}\widehat{\boldsymbol{\pi}}_{0}^*,{}_{(n)}\widehat{\boldsymbol{\pi}}_{1}^*,\dots,{}_{(n)}\widehat{\boldsymbol{\pi}}_{n}^*\big)$ , where ${}_{(n)}\widehat{\boldsymbol{\pi}}_{k}^*$ , $k\in\mathbb{Z}_{[0,n]}$ , is a $1 \times M_k$ vector obtained by replacing $\boldsymbol{\alpha}_n$ in (2.9) with the optimal solution $\boldsymbol{\alpha}_n^*$ for Problem 3.1, i.e.,
where $j_n^* \in \mathbb{J}_n^*$ defined in (3.8). We then have
and therefore
Remark 3.5. We only have to solve a single optimization problem (Problem 3.1) for each $n \in \mathbb{Z}_+$ to construct the new MIP-form solution (3.12), as with the existing one (2.16) (see Remark 2.8).
Proof of Theorem 3.4. According to Theorem B.10, it suffices to show that
To do this, we define $\widetilde{\boldsymbol{\alpha}}_n\,:\!=\,(\widetilde{\alpha}_n(\,j))_{j\in\mathbb{M}_n}$ as
Equations (3.3) and (3.14) imply that
and thus $\widetilde{\boldsymbol{\alpha}}_n$ is a feasible solution for Problem 3.1. Recall here that $\boldsymbol{\alpha}_n^*$ maximizes the objective function $r_n({\cdot})$ (see Lemma 3.3). Therefore, it follows from (3.4), (3.1), and (3.11) that
It also follows from (3.14), (3.1), and (2.7) that
where the second equality is due to (2.8). Combining (3.15) and (3.16) leads to
Consequently, (3.13) holds. The proof has been completed.
Theorem 3.4 shows that the new MIP-form solution (3.12) is obtained by finding the optimal solution $\boldsymbol{\alpha}_n^*$ of Problem 3.1, that is, by finding an element in $\mathbb{J}_n^*\subseteq \mathbb{I}_n^+$ . This task can be lightened by the following theorem.
Theorem 3.6. The set $\mathbb{J}_n^*$ is given by
where $\mathbb{O}_n^+$ is defined as
Proof. Lemma 3.3 states that, for any $ j \in \mathbb{J}_n^*$ , $u_{n,\mathbb{K}}^*(\,j)/u_{n}^*(\,j) = r_n\big(\boldsymbol{\alpha}_n^*\big) > 0$ . This implies that $\mathbb{J}_n^*$ does not include j such that $u_{n,\mathbb{K}}^*(\,j) = 0$ . It thus suffices to show that
It follows from (3.1) and (2.10b) that
It also follows from (3.18) that
These two equations show that (3.19) holds, which completes the proof.
4. Advantages of the new MIP-form solution
This section explains the advantages of our new MIP-form solution (3.12) over the existing one (2.16). We first point out the drawbacks of the existing MIP-form solution. We then show that our new solution (3.12) is quasi-algorithmically constructible (see Notion 1.1), unlike the existing one.
The existing MIP-form solution (2.16) has three drawbacks, compared with our new one. First, the existing solution (2.16) requires a parameter set $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ as in Condition 1.4 such that $\boldsymbol{v}$ satisfies Condition 1.5 (see Proposition 2.6). In some cases, Condition 1.5 does not hold; such an unfavorable case is provided in Appendix A. Second, finding a parameter set $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ may not be easy. Finally, it is of infinite computational complexity to obtain the optimal solution $\boldsymbol{\alpha}_n^{\dagger}$ for Problem 2.3. This is because the objective function of Problem 2.3 includes the infinite sums $\sum_{\ell=n+1}^{\infty} \boldsymbol{Q}_{k,\ell}\boldsymbol{v}_{\ell}$ , $k \in \mathbb{Z}_{[0,n]}$ , in (2.13).
Owing to these drawbacks, especially to the last one, the existing MIP-form solution (2.16) for $\boldsymbol{\pi}$ is not quasi-algorithmically constructible. Nevertheless, the construction of $\boldsymbol{\pi}$ via the existing solution (2.16) is formally expressed in an algorithm style equipped with a stopping criterion, as the sequential update algorithm [Reference Masuyama16, Algorithm 1]. This is described in Algorithm 1 below.
Remark 4.1. In the original description of Algorithm 1, the following operation is inserted as the first step: ‘Find $\boldsymbol{v}>\textbf{0}$ , $b>0$ , and $\mathbb{C} \in \mathbb{S}$ such that Conditions 1.4 and 1.5 hold’ (see [Reference Masuyama16, Algorithm 1]). However, in general, this cannot be performed algorithmically. Rather, $\boldsymbol{v}>\textbf{0}$ , $b>0$ , and $\mathbb{C} \in \mathbb{S}$ are input parameters of the algorithm, as described in Algorithm 1 above.
Remark 4.2. In general, we have to truncate the infinite sums $\sum_{\ell=n+1}^{\infty} \boldsymbol{Q}_{k,\ell}\boldsymbol{v}_{\ell}$ , $k \in \mathbb{Z}_{[0,n]}$ , in order to compute $\boldsymbol{y}_n$ (see Remark 2.5 for an exceptional case).
In contrast, our new MIP-form solution (3.12) is quasi-algorithmically constructible. The new solution uses the sequence of probability vectors $\big\{\boldsymbol{\alpha}_n^*\big\}$ , and each of them is an optimal solution for Problem 3.1. This problem is formulated without finding a parameter set $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ satisfying Condition 1.5. Moreover, the objective function of Problem 3.1 consists of only the vectors $\boldsymbol{u}_n^*$ and $\boldsymbol{u}_{n,\mathbb{K}}^*$ of finite computational complexity. Indeed, the finite computational complexity of $\boldsymbol{u}_n^*$ is guaranteed by the recursion (2.11), and that of $\boldsymbol{u}_{n,\mathbb{K}}^*$ is guaranteed by the following recursion:
which follows from (3.1) and (3.20). Therefore, it is of finite computational complexity to obtain the optimal solution $\boldsymbol{\alpha}_n^*$ for Problem 3.1. In summary, constructing our new MIP-form solution (3.12) requires $\big\{\boldsymbol{\alpha}_n^*;\,n\in\mathbb{Z}_{\ge K}\big\}$ , $\big\{\boldsymbol{u}_n^*;\,n\in\mathbb{Z}_+\big\}$ , and $\big\{\boldsymbol{U}_{n,k}^*;\,n\in\mathbb{Z}_+,k\in\mathbb{Z}_{[0,n]}\big\}$ , which are computed by iterating a recursive procedure consisting of an increasing but finite number of operations per iteration. Consequently, our new solution (3.12) for $\boldsymbol{\pi}$ is quasi-algorithmically constructible.
Shown below is an algorithm-style description of the computation of our new MIP-form solution (3.12) with a stopping criterion. This is the sequential update algorithm for our new solution.
Remark 4.3. To compute the probability vector ${}_{(n)}\widehat{\boldsymbol{\pi}}^*=\big({}_{(n)}\widehat{\boldsymbol{\pi}}_0^*,{}_{(n)}\widehat{\boldsymbol{\pi}}_1^*,\dots,{}_{(n)}\widehat{\boldsymbol{\pi}}_n^*\big)$ , Step (3.d.i) finds an element $j_n^*$ of $\mathbb{J}_n^*$ . Finding an element $j_n^*$ of $\mathbb{J}_n^*$ is equivalent to solving Problem 3.1, and it requires the construction of the sets $\mathbb{I}_n^+$ and $\mathbb{O}_n^+$ . This construction can easily be performed based on the probabilistic interpretation: the set $\mathbb{I}_n^+$ consists of the phases of $\mathbb{L}_n$ which accept direct (incoming) transitions from $\mathbb{L}_{n+1}$ ; the set $\mathbb{O}_n^+$ consists of the phases of $\mathbb{L}_n$ which are the starting points of (outgoing) paths leaving $\mathbb{L}_n$ eventually and reaching $\mathbb{L}_{n-1}$ while avoiding $\overline{\mathbb{S}}_n=\cup_{k=n+1}^{\infty}\mathbb{L}_k$ . Note that the set $\mathbb{O}_n^+$ is obtained by identifying the positive rows of $\boldsymbol{U}_n^*\boldsymbol{Q}_{n,n-1}$ , which is a crucial component of the recursion (2.10b) for $\big\{\boldsymbol{U}_{n,k}^*\big\}$ .
Remark 4.4. The choice of finite set $\mathbb{K}\subset \mathbb{Z}_+$ can affect the convergence speed of Algorithm 2. However, it would be difficult to discuss theoretically an optimal set $\mathbb{K}$ . A reasonable choice of $\mathbb{K}$ would be $\mathbb{Z}_{[0,K]} = \{0,1,\dots,K\}$ .
Remark 4.5. A single run of Algorithm 2 generates the probability vector ${}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}^*=\big({}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}_0^*,{}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}_1^*,\dots,{}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}_{n_{\ell}}^*\big)$ with $\ell$ being equal to the value fixed on completion of all the operations. Recall that ${}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}^*$ is an LBC-LA truncation approximation to the stationary distribution vector $\boldsymbol{\pi}$ . Thus, $\sum_{k=1}^{n_{\ell}}k^m{}_{{(n_{\ell})}}\widehat{\boldsymbol{\pi}}_k^*$ , $m\in\mathbb{N}$ , can be considered an approximation to the moment vector $\sum_{k=1}^{\infty}k^m\boldsymbol{\pi}_k$ of the stationary distribution, though, as the order m is larger, a smaller $\varepsilon > 0$ would need to be chosen (as the parameter of the stopping criterion) in order to achieve satisfactory accuracy of the approximation.
Finally, in the next two paragraphs, we mention the advantages of Algorithm 2 based on our new MIP-form solution (3.12), comparing with the existing algorithms [Reference Kimura and Takine6, Reference Klimenok and Dudin10, Reference Li, Lian and Liu13, Reference Shin and Pearce19, Reference Takine20] for UBH-MCs. These existing algorithms are classified into two types: (i) approximation by conditional stationary distribution [Reference Kimura and Takine6, Reference Takine20]; (ii) approximation by level-homogenizing [Reference Klimenok and Dudin10, Reference Li, Lian and Liu13, Reference Shin and Pearce19]. Both types were originally designed to compute a single approximation to the stationary distribution, but performing them iteratively can generate a convergent sequence of approximations to the stationary distribution. Considering such applications of the existing algorithms, we compare them with our algorithm in the following.
The algorithms [Reference Kimura and Takine6, Reference Takine20] compute a single conditional stationary distribution $\boldsymbol{\pi}^{(n)}$ , as an approximation to $\boldsymbol{\pi}$ , in a single run. To generate each approximation $\boldsymbol{\pi}^{(n)}$ , the algorithms [Reference Kimura and Takine6, Reference Takine20] have to perform a large (theoretically infinite) number of preliminary computations involving the levels above n. The preliminary numerical results can be used to construct better approximations $\boldsymbol{\pi}^{(N)}$ with $N > n$ , but actually to do this with satisfactory accuracy requires further preliminary computations involving higher levels. In contrast, Algorithm 2 computes each tentative solution ${}_{(n)}\widehat{\boldsymbol{\pi}}^*$ without any information on the higher levels above n except $\boldsymbol{Q}_{n+1,n}$ .
The other algorithms [Reference Klimenok and Dudin10, Reference Li, Lian and Liu13, Reference Shin and Pearce19] conduct the level-homogenizing of the transition law above some level, say level n, in order to compute an approximation to $\boldsymbol{\pi}$ , denoted by $\boldsymbol{\pi}(n)$ . More specifically, the approximation $\boldsymbol{\pi}(n)$ requires, as the input for computation, a (level-independent) G-matrix obtained by level-homogenizing the transition law above level n. Thus, each approximation $\boldsymbol{\pi}(n)$ is computed from scratch, which is time-consuming. In contrast, Algorithm 2 sequentially constructs tentative solutions $\{{}_{(n)}\widehat{\boldsymbol{\pi}}^*\}$ convergent to the stationary distribution vector $\boldsymbol{\pi}$ , making the most of the components of the ‘old’ tentative solutions.
5. Special cases not requiring the solution of Problem 3.1
This section discusses special cases where an MIP-form solution for $\boldsymbol{\pi}$ is constructed without solving Problem 3.1. We first show a basic theorem on this matter, and then derive its corollary, which implies what cases do not require us to solve Problem 3.1. Finally, we mention representative examples of such favorable cases.
The following theorem provides the basis for our discussion here.
Theorem 5.1. Let N be a nonnegative integer, and let $\boldsymbol{\varphi}_n$ , $n \in \mathbb{Z}_{\ge N+1}$ , denote a $1 \times M_n$ probability vector such that
where $\{\varepsilon_n \in [0,1);\,n \in \mathbb{Z}_{\ge N+1}\}$ is a sequence such that $\lim_{n\to\infty}\varepsilon_n = 0$ . Furthermore, for $n \in \mathbb{Z}_{\ge N}$ , let ${}_{(n)}\breve{\boldsymbol{\pi}}\,:\!=\,\big({}_{(n)}\breve{\boldsymbol{\pi}}_0,{}_{(n)}\breve{\boldsymbol{\pi}}_1,\dots,{}_{(n)}\breve{\boldsymbol{\pi}}_n\big)$ denote a probability vector such that
We then have
Proof. Fix arbitrary $n \in \mathbb{Z}_{\ge N}$ , and note that (5.3) holds if
Indeed, the triangle inequality and (1.2) yield
and combining this with (5.4) results in (5.3).
To complete the proof, we show that (5.4) holds. Equation (5.1) implies that
Applying (5.5) to (5.2) yields
Furthermore, substituting (2.8) into the above inequality leads to
and thus
which shows that (5.4) holds. The proof has been completed.
Remark 5.2. Substituting
into the matrix product form (2.9) of the LBC-LA truncation approximation ${}_{(n)}\widehat{\boldsymbol{\pi}}=\big({}_{(n)}\widehat{\boldsymbol{\pi}}_0,{}_{(n)}\widehat{\boldsymbol{\pi}}_1,\dots,{}_{(n)}\widehat{\boldsymbol{\pi}}_n\big)$ , and using (2.5), we obtain ${}_{(n)}\widehat{\boldsymbol{\pi}}={}_{(n)}\breve{\boldsymbol{\pi}}$ . Therefore, ${}_{(n)}\breve{\boldsymbol{\pi}}$ is an LBC-LA truncation approximation to $\boldsymbol{\pi}$ .
Theorem 5.1 implies in what cases we do not have to solve Problem 3.1. However, Theorem 5.1 requires us to identify $\boldsymbol{\pi}_n /(\boldsymbol{\pi}_n\boldsymbol{e})$ to some extent, which is not easy in general. We thus provide a more effective (but more restrictive) result that can actually contribute to the construction of MIP-form solutions that do not require solving Problem 3.1.
Corollary 5.3. Suppose that there exists some $N \in \mathbb{Z}_+$ such that $\mathbb{M}_n = \mathbb{M}\,:\!=\,\{1,2,\dots,M\} \subset \mathbb{N}$ for all $n \in \mathbb{Z}_{\ge N+1}$ . Furthermore, suppose that there exists some $1 \times M$ probability vector $\boldsymbol{\varpi}$ such that
We then have
Proof. According to (5.6) and Theorem 5.1, it suffices to show that (5.1) holds with $\boldsymbol{\varphi}_n = \boldsymbol{\varpi}$ for all $n \in \mathbb{Z}_{\ge N+1}$ . Equation (5.6) implies that there exists some $\big\{\delta_n \in [0,1);\, n \in \mathbb{Z}_{\ge N+1}\big\}$ such that $\lim_{n\to\infty}\delta_n=0$ and
This inequality is equivalent to
We now fix
We then have $\lim_{n\to\infty}\varepsilon_n=0$ and
Thus, (5.7) yields
which shows that (5.1) holds with $\boldsymbol{\varphi}_n = \boldsymbol{\varpi}$ for all $n \in \mathbb{Z}_{\ge N+1}$ .
A typical example covered by Corollary 5.3 is a UBH-MC with block monotonicity (see [Reference Masuyama15, Definition 3.2]). If the generator $\boldsymbol{Q}$ is block monotone, then, for all $k \in \mathbb{Z}_+$ , $\sum_{\ell=\max\!(k-1,0)}^{\infty}\boldsymbol{Q}_{k,\ell}$ is constant and thus is interpreted as the generator of the underlying Markov chain on the phase set $\mathbb{M}$ , which controls the level process (see [Reference Masuyama15, Lemma 3.1]). In this case, $\boldsymbol{\varpi}$ is equal to the stationary distribution vector of the underlying Markov chain.
Another example is the (level-independent) M/G/1-type Markov chain. For this chain, several sufficient conditions are established for the tail asymptotics of $\boldsymbol{\pi}=(\boldsymbol{\pi}_0,\boldsymbol{\pi}_1,\dots\!)$ (see [Reference Kimura, Daikoku, Masuyama and Takahashi7–Reference Kimura, Masuyama and Takahashi9, Reference Masuyama14]) such that, for some $c > 0$ , $\boldsymbol{\mu} > \textbf{0}$ , and random variable Y on $\mathbb{Z}_+$ ,
Moreover (assuming additional conditions, if required), we can obtain (see, e.g., [Reference Kimura, Masuyama and Takahashi8, Sections 4 and 5])
which implies that (5.6) holds.
6. Concluding remarks
We have established a quasi-algorithmic construction of the exact (not approximate) and whole (not partial) stationary distribution vector $\boldsymbol{\pi}$ in upper block-Hessenberg Markov chains (UBH-MCs). The core of this theoretical construction is that all we have to do is to solve just one linear fractional programming (LFP) problem (Problem 3.1) in each iteration for the construction. We have also presented some special cases that do not require solving the LFP problem. To find other such special cases is an interesting future task. A possible case would be the class of UBH-MCs with asymptotically block-Toeplitz structure. The generator $\boldsymbol{Q}$ of such a chain satisfies (1.1) and the following: $M_n = M \in \mathbb{N}$ for all sufficiently large n and
where $\sum_{k=-1}^{\infty}\boldsymbol{Q}_k$ is an essential Q-matrix. In [Reference Klimenok and Dudin10], this special UBH-MC is referred to as a multidimensional asymptotically quasi-Toeplitz Markov chain.
Appendix A. An example violating Condition 1.5
This section presents an example of UBH Q-matrices for which the existing MIP-form solution in [Reference Masuyama16] does not hold. To describe the example, we assume that $\boldsymbol{Q}$ is the generator of a specific M/G/1-type Markov chain with power-like level increments. For such a generator $\boldsymbol{Q}$ , we show that Condition 1.5 does not hold for any parameter set $(\boldsymbol{v},b,\mathbb{C})$ as in Condition 1.4, and therefore the existing MIP-form solution is not established.
Let $\mathbb{S} = \mathbb{Z}_+ \times \mathbb{M} = \mathbb{Z}_+ \times \{1,2,\dots,M\}$ , and suppose that $\boldsymbol{Q}=\big(q\big(k,i;\,\ell,j\big)\big)_{(k,i;\,\ell,j)\in\mathbb{S}}$ is an M/G/1-type generator given by
where $\boldsymbol{A}_k\,:\!=\,\big(A_{k,i.j}\big)_{i,j\in\mathbb{M}}$ , $k\in\mathbb{Z}_{\ge -1}$ , is an $M \times M$ matrix. We then assume the following.
Assumption A.1 (i) The generator $\boldsymbol{Q}$ in (A.1) is irreducible; (ii) $\sum_{k=-1}^{\infty}\boldsymbol{A}_k$ is an essential Q-matrix with stationary distribution vector $\boldsymbol{\varpi}$ ; (iii) $\boldsymbol{\varpi}\!\sum_{k=-1}^{\infty} k\boldsymbol{A}_k\boldsymbol{e} < 0$ ; and (iv) $\lim_{k\to\infty}k^3\boldsymbol{A}_k = \boldsymbol{C}_A$ for some nonnegative matrix $\boldsymbol{C}_A \neq \boldsymbol{O}$ .
We confirm that there exists some $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ as in Condition 1.4 under Assumption A.1. Uniformizing the generator $\boldsymbol{Q}$ (see, e.g., [Reference Asmussen2, Problem II.4.1]) yields an M/G/1-type stochastic matrix that satisfies the stability condition [Reference Asmussen2, Chapter XI, Proposition 3.1] for GI/G/1-type Markov chains (including M/G/1-type ones) thanks to the conditions (i)–(iii) of Assumption A.1. Therefore, Condition 1.4 holds for the generator $\boldsymbol{Q}$ satisfying Assumption A.1 (see, e.g., [Reference Kontoyiannis and Meyn11, Theorem 1.1]).
We confirm that Condition 1.5 does not hold for any $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ as in Condition 1.4 under Assumption A.1. Fix $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ arbitrarily. Since $\mathbb{C}$ is a finite subset of the state space $\mathbb{S} = \mathbb{Z}_+ \times \mathbb{M}$ , there exists some $N \in \mathbb{Z}_+$ such that $\mathbb{C} \subseteq \mathbb{S}_N=\mathbb{Z}_{[0,N]} \times \mathbb{M}$ . Fix such $N \in \mathbb{Z}_+$ , and let $\big\{(X_t,J_t);\, t \ge 0\big\}$ denote an M/G/1-type Markov chain on the state space $\mathbb{S}$ with the generator $\boldsymbol{Q}$ satisfying Assumption A.1. It then follows from [Reference Kontoyiannis and Meyn11, Theorem 1.1] that there exists some $\gamma > 0$ such that
where $T_{\mathbb{C}}(1) = \inf\{t \ge 1\,:\, (X_t,J_t) \in \mathbb{C}\}$ and $T_{\mathbb{C}} = \inf\{t > 0\,:\, (X_t,J_t) \in \mathbb{C}\}$ . It also follows from $\mathbb{C} \subseteq \mathbb{S}_N$ and the M/G/1-type structure of $\boldsymbol{Q}$ that there exists some $\delta > 0$ such that
Combining (A.2) and (A.3), we have
Furthermore, from [Reference Kimura, Masuyama and Takahashi8, Theorem 4.1.1] and Assumption A.1 (in particular the condition (iv)), we have
for some constant $c > 0$ . Using (A.1), (A.4), and (A.5), we obtain
Therefore, Condition 1.5 does not hold for any $(\boldsymbol{v},b,\mathbb{C}) \in \mathcal{V}$ in the present example.
Appendix B. Convergent approximations to the stationary distribution vector of an essential Q-matrix
The purpose of this section is to provide the fundamental results needed in the proof of the main theorem (Theorem 3.4). This section consists of two subsections. Section B.1 describes several notions associated with essential Q-matrices. Section B.2 presents a necessary and sufficient condition for the convergence of a sequence of approximations to the stationary distribution vector of an essential Q-matrix.
B.1. Definition of basic notions
This subsection introduces the notions used in the next subsection. We first redefine the symbols $\mathbb{S}$ and $\boldsymbol{Q}$ introduced in the body of this paper. We next define several notions, such as essential Q-matrices, subinvariant measures, irreducibility, and recurrence. We then present a proposition on the recurrence and subinvariant measures of an irreducible essential Q-matrix.
Let $\mathbb{S}$ denote an arbitrary countable set, and let $\boldsymbol{Q}\,:\!=\,(q(i, j))_{i,j\in\mathbb{S}}$ denote an arbitrary Q-matrix (see, e.g., [Reference Anderson1, p. 64]), that is, a diagonally dominant matrix such that
The following is the definition of an essential Q-matrix.
Definition B.1. The Q-matrix $\boldsymbol{Q}$ is said to be essential if and only if the following hold for all $i \in \mathbb{S}$ : (i) q(i, i) is finite; (ii) $\sum_{j \in \mathbb{S}} q(i, j) = 0$ ; (iii) $q(i,i) < 0$ . Note that any essential Q-matrix can be interpreted as the infinitesimal generator of a continuous-time Markov chain. Thus, an essential Q-matrix may be referred to as an infinitesimal generator or generator, depending on the context.
Remark B.2. The conditions (i) and (ii), respectively, imply that $\boldsymbol{Q}$ is stable and conservative (see, e.g., [Reference Brémaud4, Definition 13.3.10]).
Remark B.3. An essential Q-matrix is not necessarily regular (or equivalently, non-explosive). Indeed, provided that $\boldsymbol{Q}$ is an essential Q-matrix, $\boldsymbol{Q}$ is regular if and only if, for any $\lambda > 0$ , the system of equations
has no bounded solution other than $\boldsymbol{x}=\textbf{0}$ (see, e.g., [Reference Anderson1, Chapter 2, Theorem 2.7] and [Reference Brémaud4, Theorem 13.3.11]).
To proceed further, we assume that $\boldsymbol{Q}$ is an essential Q-matrix, and then define $\{\Phi(t);\,t\ge0\}$ as a Markov chain on $\mathbb{S}$ having this essential $\boldsymbol{Q}$ as its generator. For later use, we also define $\boldsymbol{P}$ as a stochastic matrix such that
where $\mathrm{diag}\{{-}\boldsymbol{Q}\}$ is a diagonal matrix whose diagonal elements are identical to those of $({-}\boldsymbol{Q})$ . Note that $\boldsymbol{P}$ is interpreted as the transition probability matrix of an embedded discrete-time Markov chain for $\{\Phi(t)\}$ with generator $\boldsymbol{Q}$ (see [Reference Brémaud4, Section 13.3.2]). Hence, we call $\boldsymbol{P}$ the embedded transition probability matrix of $\{\Phi(t)\}$ .
For the essential Q-matrix $\boldsymbol{Q}$ , we define subinvariant measures and stationary distribution vectors.
Definition B.4. Let $\boldsymbol{\mu}\,:\!=\,(\mu(\,j))_{j\in\mathbb{S}}$ denote an arbitrary nonnegative and nonzero vector.
-
(i) If $\boldsymbol{\mu}\boldsymbol{Q} \le \textbf{0}$ , $\boldsymbol{\mu}$ is said to be a subinvariant measure of $\boldsymbol{Q}$ or the Markov chain $\{\Phi(t)\}$ . Furthermore, if a subinvariant measure $\boldsymbol{\mu}$ satisfies $\boldsymbol{\mu}\boldsymbol{Q} = \textbf{0}$ , $\boldsymbol{\mu}$ is said to be invariant or stationary.
-
(ii) If an invariant measure $\boldsymbol{\mu}$ satisfies $\boldsymbol{\mu}\boldsymbol{e} = 1$ , $\boldsymbol{\mu}$ is said to be a stationary distribution vector of $\boldsymbol{Q}$ or the Markov chain $\{\Phi(t)\}$ .
Finally, we provide the definitions of irreducibility, recurrence, and their related notions, and present a proposition that summarizes basic results on the recurrence and subinvariant measures of an irreducible essential Q-matrix.
Definition B.5.
-
(i) An essential Q-matrix $\boldsymbol{Q}$ and its Markov chain $\{\Phi(t)\}$ are irreducible (resp. transient, recurrent) if and only if the embedded transition probability matrix $\boldsymbol{P}$ in (B.1) is irreducible (resp. transient, recurrent) (see [Reference Brémaud4, Definitions 13.4.1 and 13.4.2]).
-
(ii) An irreducible essential Q-matrix $\boldsymbol{Q}$ and its Markov chain $\{\Phi(t)\}$ are ergodic (i.e., positive recurrent) if and only if there exists a stationary distribution vector, or equivalently, a summable invariant measure unique up to constant multiples (see [Reference Brémaud4, Definition 13.4.8 and Theorem 13.4.10]).
Proposition B.6. Suppose that $\boldsymbol{Q}$ is an irreducible essential Q-matrix. The following statements hold:
-
(i) There always exists a subinvariant measure of $\boldsymbol{Q}$ .
-
(ii) Any subinvariant measure of $\boldsymbol{Q}$ is positive; that is, its elements are all positive.
-
(iii) If $\boldsymbol{Q}$ is recurrent, then it has an invariant measure, which is unique up to constant multiples.
-
(iv) Any subinvariant measure of recurrent $\boldsymbol{Q}$ is, in fact, an invariant measure.
-
(v) The matrix $\boldsymbol{Q}$ has a subinvariant measure that is not invariant if and only if it is transient.
Proof. Let $\boldsymbol{\mu}=(\mu_i)_{i\in\mathbb{S}}$ be a nonnegative and nonzero vector, and let
It then follows from (B.1) and (B.2) that
Thus, $\boldsymbol{\eta}$ is a subinvariant (resp. an invariant) measure of the embedded transition probability matrix $\boldsymbol{P}$ if and only if $\boldsymbol{\mu}\boldsymbol{Q} \le \textbf{0}$ (resp. $\boldsymbol{\mu}\boldsymbol{Q} = \textbf{0}$ ), or equivalently, $\boldsymbol{\mu}$ is a subinvariant (resp. an invariant) measure of the Q-matrix $\boldsymbol{Q}$ (see [Reference Seneta18, Definition 5.3]). Furthermore, since the Q-matrix $\boldsymbol{Q}$ is essential (that is, $-q(i,i) > 0$ for all $i \in \mathbb{S}$ ),
Therefore, the present proposition is an immediate consequence of [Reference Seneta18, Theorem 5.4] (together with its corollary) and [Reference Seneta18, Lemmas 5.5 and 5.6]. That implication is as follows:
[Reference Seneta18, Lemma 5.5] $\Longrightarrow$ Statement (i);
[Reference Seneta18, Lemma 5.6] $\Longrightarrow$ Statement (ii);
[Reference Seneta18, Theorem 5.4] $\Longrightarrow$ Statement (iii);
[Reference Seneta18, the corollary of Theorem 5.4] $\Longrightarrow$ Statement (iv);
[Reference Seneta18, Theorem 5.4] $\Longrightarrow$ Statement (v).
The proof has been completed.
B.2. Convergence condition for approximations
This subsection proves a single theorem (Theorem B.10), which provides a necessary and sufficient condition for the convergence of a sequence of approximations to the stationary distribution of an essential Q-matrix. That necessary and sufficient condition is crucial to the proof of Theorem 3.4, the main theorem of this paper.
We begin with the following assumption.
Assumption B.7.
-
(i) The essential Q-matrix $\boldsymbol{Q}$ is ergodic with the unique stationary distribution vector $\boldsymbol{\pi}\,:\!=\,(\pi(\,j))_{j\in\mathbb{S}} > \textbf{0}$ (the uniqueness and positivity of $\boldsymbol{\pi}$ are due to Proposition B.6).
-
(ii) For all $n \in \mathbb{Z}_+$ , $\boldsymbol{Q}_n\,:\!=\,(q_n(i, j))_{i,j\in\mathbb{S}}$ is a stable and conservative Q-matrix (see Remark B.2) such that $\lim_{n\to\infty}q_n(i, j) = q(i, j)$ for all $i,j \in \mathbb{S}$ .
-
(iii) Each $\boldsymbol{Q}_n$ has at least one stationary distribution vector, denoted by $\boldsymbol{\pi}_n\,:\!=\,(\pi_n(\,j))_{j\in\mathbb{S}}$ , which can be considered an approximation to $\boldsymbol{\pi}$ .
Remark B.8. The vector $\boldsymbol{\pi}_n$ in Assumption B.7 is redefined here and thus is completely different from the ‘ $\boldsymbol{\pi}_n$ ’ used as a subvector of $\boldsymbol{\pi}$ in the body of this paper.
Under Assumption B.7, we have the next lemma, which is used to prove the theorem of this subsection.
Lemma B.9. Suppose that Assumption B.7 holds. Fix arbitrary $j_0 \in \mathbb{S}$ , and let $\mathbb{H}$ denote an infinite subset of $\mathbb{Z}_+$ such that $\{\pi_n(\,j_0);\,n\in\mathbb{H}\}$ converges to some $\alpha \in [0,\infty)$ , that is,
Under these conditions, we have the statements (i) and (ii) below:
-
(i) It holds that
(B.4) \begin{align}\lim_{\scriptstyle n\to\infty \atop \scriptstyle n \in \mathbb{H}}\pi_n(\,j)= {\alpha \over \pi(\,j_0)} \pi(\,j)\quad{for\ all\ j \in \mathbb{S}},\end{align}(B.5) \begin{align}\alpha \le \pi(\,j_0).\end{align} -
(ii) Furthermore, if $\alpha > 0$ , then
(B.6) \begin{equation}\lim_{\scriptstyle n\to\infty \atop \scriptstyle n \in \mathbb{H}}\!\left\|{\boldsymbol{\pi}_n^{\mathbb{A}} \over \boldsymbol{\pi}_n^{\mathbb{A}} \boldsymbol{e} }-{\boldsymbol{\pi}^{\mathbb{A}} \over \boldsymbol{\pi}^{\mathbb{A}} \boldsymbol{e} }\right\|_1=0\quad for\ all\ finite\ \mathbb{A} \subset \mathbb{S},\end{equation}where, for any vector $\boldsymbol{x}\,:\!=\,(x(\,j))_{j\in\mathbb{S}}$ , $\boldsymbol{x}^{\mathbb{A}}\,:\!=\,\big(x^{\mathbb{A}}(\,j)\big)_{j\in\mathbb{S}}$ denotes a vector such that\begin{equation*}x^{\mathbb{A}}(\,j)=\left\{\begin{array}{ll}x(\,j), & \quad j \in \mathbb{A},\\[5pt]0, & \quad j \not\in \mathbb{A}.\end{array}\right.\end{equation*}
Proof. We begin with the proof of the statement (i). To prove (B.4) and (B.5), it suffices to show that, for all infinite $\mathbb{H}^{\prime} \subseteq \mathbb{H}$ ,
Indeed, for any $j \in \mathbb{S}$ , there exists some infinite $\mathbb{H}_j \subseteq \mathbb{H}$ such that
Note that $\mathbb{H}_j \subseteq \mathbb{H}$ and $\mathbb{H} \subseteq \mathbb{H}$ (any set is a subset of itself). Therefore, if (B.7) holds for all infinite $\mathbb{H}^{\prime} \subseteq \mathbb{H}$ , then
Combining (B.8) and (B.9) yields
which shows that (B.4) and (B.5) hold.
To complete the proof of (B.4) and (B.5), we prove that (B.7) holds for all infinite $\mathbb{H}^{\prime} \subseteq \mathbb{H}$ . Let $\mathbb{H}^{\prime}$ be an arbitrary infinite subset of $\mathbb{H}$ , and let
From (B.3) and (B.10), we then have
Furthermore, using (B.10), $q(\,j,j)=\lim_{n\to\infty}q_n(\,j,j)$ (by Assumption B.7(ii)), $\boldsymbol{\pi}_n\boldsymbol{Q}_n=\textbf{0}$ , and Fatou’s lemma, we obtain
which leads to
It thus follows from Proposition B.6(iii)–(iv) that if $\boldsymbol{\mu}\,:\!=\,(\mu(\,j))_{j\in\mathbb{S}} \neq \textbf{0}$ then $\boldsymbol{\mu}$ is a unique (up to constant multiples) invariant measure of the ergodic Q-matrix $\boldsymbol{Q}$ . Therefore, unifying the two cases $\boldsymbol{\mu} \neq \textbf{0}$ and $\boldsymbol{\mu} = \textbf{0}$ , we can see that there exists some $c \ge 0$ such that
Substituting (B.12) into (B.11) yields
Combining this with (B.10) and (B.12) results in
In addition, it follows from $\boldsymbol{\pi}_n\boldsymbol{e}=\boldsymbol{\pi}\boldsymbol{e}=1$ , Fatou’s lemma, and (B.13) that
This result and (B.13) imply (B.7). Consequently, (B.4) and (B.5) have been proved.
We move on to the proof of the statement (ii). Suppose $\alpha > 0$ , and let $\mathbb{A}$ be a nonempty and finite $\mathbb{A} \subset \mathbb{S}$ . Since $\mathbb{A}$ is finite, it follows from (B.4) that
where the positivity of the second limit is due to $\alpha > 0$ , $\boldsymbol{\pi}>\textbf{0}$ , and $\mathbb{A} \neq \varnothing$ . The above two equations yield
which implies that (B.6) holds. The proof has been completed.
The following theorem presents a necessary and sufficient condition for the sequence $\{\boldsymbol{\pi}_n\}$ of approximations to converge to the stationary distribution vector $\boldsymbol{\pi}$ .
Theorem B.10. Under Assumption B.7, $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ if and only if
Proof. We note that if $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ holds then
which implies (B.14). Therefore, to complete the proof, we prove that (B.14) implies (B.15), and then prove that (B.15) implies $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ .
We prove that (B.14) implies (B.15). To this end, suppose that (B.14) holds while (B.15) does not hold, that is,
Let $\mathbb{H}$ be an arbitrary infinite subset $\mathbb{H} \subset \mathbb{Z}_+$ such that
It then follows from Lemma B.9(i) that
It also follows from (B.17), the finiteness of $\mathbb{A}$ , and $\mathbb{H} \subset \mathbb{Z}_+$ that
which contradicts (B.14). Therefore, (B.14) implies (B.15).
We prove that (B.15) implies $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ . For any fixed $j \in \mathbb{S}$ , there exist some infinite subset $\mathbb{H}_j \subset\mathbb{Z}_+$ such that
Thus, applying Lemma B.9(i) to the sequence $\{\boldsymbol{\pi}_n;\,n\in\mathbb{H}_j\}$ yields
Combining this and (B.15) leads to
Note here that $\boldsymbol{\pi}_n$ and $\boldsymbol{\pi}$ are probability vectors. Therefore, by the dominated convergence theorem, (B.18) implies $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ . It has been proved that (B.15) implies $\lim_{n\to\infty}\|\boldsymbol{\pi}_n - \boldsymbol{\pi} \|_1 = 0$ , which completes the proof.
Remark B.11. Lemma B.9(i) is the Q-matrix version of [Reference Wolf21, Lemma 2.1] for stochastic matrices. Furthermore, Lemma B.9(ii) and Theorem B.10 are the Q-matrix versions of Parts (i) and (ii), respectively, of Corollary 2.2 in [Reference Wolf21].
Acknowledgement
The author thanks Dr. Masakiyo Miyazawa for valuable comments on the presentation of the notion of quasi-algorithmic constructions.
Funding information
The research of the author was supported in part by JSPS KAKENHI Grant Number JP21K11770.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process for this article.