1. Introduction
1.1. Dynamical systems and entropy
Topological pressure and its variational principle play an important role in several fields, including the dimension theory of dynamical systems. Recently, Feng and Huang introduced a new invariant called weighted topological pressure for factor maps between dynamical systems and proved its variational principle [FH16]. Their work inspired Tsukamoto to propose a new definition of this invariant [Tsu22]. He also established a variational principle for it, revealing the non-trivial coincidence of the two definitions. Tsukamoto focused on the simplest case of two dynamical systems.
In this paper, we extend Tsukamoto’s definition to an arbitrary number of dynamical systems and prove its variational principle. With our result, we can directly calculate the Hausdorff dimension of self-affine sponges, a topic studied by Kenyon and Peres [KP96]. Furthermore, we will show in §6 that we can determine the Hausdorff dimension of certain sofic sets embedded in higher-dimensional Euclidean space.
We review the basic notions of dynamical systems in this subsection. See the book of Walters [Wal82] for details.
A pair $(X, T)$ is called a dynamical system if X is a compact metrizable space and $T: X \rightarrow X$ is a continuous map. A map $\pi : X \rightarrow Y$ between dynamical systems $(X, T)$ and $(Y, S)$ is said to be a factor map if $\pi $ is a continuous surjection and $\pi \circ T = S \circ \pi $. We sometimes write $\pi : (X, T) \rightarrow (Y, S)$ to clarify the dynamical systems in question.
For a dynamical system $(X, T)$, denote its topological entropy by $h_{\mathrm {top}}(T)$. Let $P(f)$ be the topological pressure for a continuous function $f: X \rightarrow \mathbb {R}$ (see §2 for the definition of these quantities). Let $\mathscr {M}^T(X)$ be the set of T-invariant probability measures on X and $h_\mu (T)$ the measure-theoretic entropy for $\mu \in \mathscr {M}^T(X)$ (see §3.2). The variational principle then states that [Din70, Gm71, Gw69, Ru73, Wal75]
$$ \begin{align*} P(f) = \sup_{\mu \in \mathscr{M}^T(X)} \bigg( h_\mu(T) + \int_X f \, d\mu \bigg). \end{align*} $$
1.2. Background
We first look at self-affine sponges to understand the background of weighted topological entropy introduced by Feng and Huang. Let $m_1, m_2, \ldots , m_r$ be natural numbers with $m_1 \leq m_2 \leq \cdots \leq m_r$. Consider an endomorphism T on $\mathbb {T}^r = \mathbb {R}^r/\mathbb {Z}^r$ represented by the diagonal matrix $A = \mathrm {diag}(m_1, m_2, \ldots , m_r)$. For $D \subset \prod _{i=1}^r \{0, 1, \ldots , m_i-1\}$, define
$$ \begin{align*} K(T, D) = \bigg\{ \sum_{n=0}^{\infty} A^{-n-1} e_n \in \mathbb{T}^r \,\bigg|\, e_n \in D \text{ for all } n \geq 0 \bigg\}. \end{align*} $$
This set is compact and T-invariant, that is, $TK(T, D) = K(T, D)$ .
For $r = 2$, these sets are known as Bedford–McMullen carpets or self-affine carpets. Figure 1 exhibits a famous example, the case of $D = \{(0,0), (1,1), (0,2)\} \subset \{0, 1\} \times \{0, 1, 2\}$. The analysis of these sets is more complicated than that of ‘self-similar’ sets. Bedford [Bed84] and McMullen [McM84] independently studied these sets and showed that, in general, their Hausdorff dimension is strictly smaller than their Minkowski dimension (also known as box-counting dimension). The carpet in Figure 1 has Hausdorff dimension $\log _2{(1+2^{\log _3{2}})} = 1.349 \cdots $ and Minkowski dimension $1 + \log _3{\tfrac {3}{2}} = 1.369 \cdots $.
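The two values quoted for the carpet in Figure 1 can be checked numerically. The following is a minimal sketch, assuming the classical Bedford–McMullen closed forms for a carpet with $m_1 \leq m_2$; the function name `carpet_dimensions` is hypothetical and only serves this illustration.

```python
import math

def carpet_dimensions(m1, m2, digits):
    """Closed-form Hausdorff and Minkowski dimensions of a Bedford-McMullen carpet.

    digits is a set of pairs (i1, i2) with 0 <= i1 < m1 and 0 <= i2 < m2,
    where m1 <= m2 as in the text.
    """
    # fiber[i1] = number of digits whose first coordinate equals i1.
    fiber = {}
    for i1, _ in digits:
        fiber[i1] = fiber.get(i1, 0) + 1
    theta = math.log(m1) / math.log(m2)  # log_{m2} m1
    # Hausdorff dimension: log_{m1} of the sum of fiber counts raised to theta.
    hausdorff = math.log(sum(c ** theta for c in fiber.values())) / math.log(m1)
    # Minkowski dimension: log_{m1}(#occupied first coordinates)
    #                      + log_{m2}(#digits / #occupied first coordinates).
    occupied = len(fiber)
    minkowski = (math.log(occupied) / math.log(m1)
                 + math.log(len(digits) / occupied) / math.log(m2))
    return hausdorff, minkowski

D = {(0, 0), (1, 1), (0, 2)}
dim_h, dim_m = carpet_dimensions(2, 3, D)
print(f"Hausdorff: {dim_h:.4f}, Minkowski: {dim_m:.4f}")
```

For this digit set the Hausdorff value agrees with $\log _2{(1+2^{\log _3{2}})}$ and is strictly smaller than the Minkowski value, as stated above.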
The sets $K(T, D)$ for $r \geq 3$ are called self-affine sponges. Kenyon and Peres [KP96] calculated their Hausdorff dimension for the general case (see Theorem 1.5 in this section). In addition, they showed the following variational principle for the Hausdorff dimension of $K(T, D)$:
Here, the endomorphism $T_i$ on $\mathbb {T}^{r-i+1}$ is defined from $A_i = \mathrm {diag}(m_1, m_2, \ldots , m_{r-i+1})$ , and $\mu _i$ is defined as the push-forward measure of $\mu $ on $\mathbb {T}^{r-i+1}$ by the projection onto the first $r-i+1$ coordinates. Feng and Huang’s definition of weighted topological entropy of $K(T, D)$ equals $\mathrm {dim}_H K(T, D)$ with a proper setting.
1.3. Original definition of the weighted topological pressure
Motivated by the geometry of self-affine sponges described in the previous subsection, Feng and Huang introduced a generalized notion of pressure. Consider dynamical systems $(X_i, T_i)$ ( $i=1, 2, \ldots , r$ ) and factor maps $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ :
We refer to this as a sequence of dynamical systems. Let $\boldsymbol {w} = (w_1, w_2, \ldots , w_r)$ be a vector with $w_1 > 0$ and $w_i \geq 0$ for $i \geq 2$. Feng and Huang [FH16] defined the $\boldsymbol {w}$-weighted topological pressure $P^{\boldsymbol {w}}_{\mathrm {FH}}(f)$ for a continuous function $f:X_1 \rightarrow \mathbb {R}$ and established the variational principle [FH16, Theorem 1.4]:
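Here, $\pi ^{(i)}$ is defined by
$$ \begin{align*} \pi^{(0)} = \mathrm{id}_{X_1}: X_1 \rightarrow X_1, \quad \pi^{(i)} = \pi_i \circ \pi_{i-1} \circ \cdots \circ \pi_1: X_1 \rightarrow X_{i+1} \quad (1 \leq i \leq r-1), \end{align*} $$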
and ${\pi ^{(i-1)}}_* \mu $ is the push-forward measure of $\mu $ by $\pi ^{(i-1)}$ on $X_i$ . The $\boldsymbol {w}$ -weighted topological entropy $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1)$ is the value of $P^{\boldsymbol {w}}_{\mathrm {FH}}(f)$ when $f \equiv 0$ . In this case, equation (1.2) becomes
We explain here Feng and Huang’s method of defining $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1)$. For the definition of $P^{\boldsymbol {w}}_{\mathrm {FH}}(f)$, see their original paper [FH16].
Let n be a natural number and $\varepsilon $ a positive number. Let $d^{(i)}$ be a metric on $X_i$ . For $x \in X_1$ , define the nth $\boldsymbol {w}$ -weighted Bowen ball of radius $\varepsilon $ centered at x by
Consider $\Gamma = \{ B^{\boldsymbol {w}}_{n_j}(x_j, \varepsilon ) \}_j$ , an at-most countable cover of $X_1$ by weighted Bowen balls. Let $n(\Gamma ) = \min _j n_j$ . For $s \geq 0$ and $N \in \mathbb {N}$ , let
This quantity is non-decreasing as $N \to \infty $ . The following limit hence exists:
There is a value of s where $\Lambda ^{\boldsymbol {w}, s}_{\varepsilon }$ jumps from $\infty $ to $0$ , which we will denote by $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1, \varepsilon )$ :
The value $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1, \varepsilon )$ is non-decreasing as $\varepsilon \to 0$ . Therefore, we can define the $\boldsymbol {w}$ -weighted topological entropy $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1)$ by
An important point about this definition is that in some dynamical systems, such as self-affine sponges, the quantity $h^{\boldsymbol {w}}_{\mathrm {top}}(T_1)$ is directly related to the Hausdorff dimension of $X_1$ .
Example 1.1. Consider the self-affine sponges introduced in §1.2. Define $p_i: \mathbb {T}^{r-i+1} \rightarrow \mathbb {T}^{r-i}$ by
$$ \begin{align*} p_i(x_1, \ldots, x_{r-i}, x_{r-i+1}) = (x_1, \ldots, x_{r-i}). \end{align*} $$
Let $X_1 = K(T, D)$, $X_i = p_{i-1} \circ p_{i-2} \circ \cdots \circ p_1(X_1)$, and $T_i: X_i \rightarrow X_i$ be the endomorphism defined by $A_i = \mathrm {diag}(m_1, m_2, \ldots , m_{r-i+1})$. Define the factor maps $\pi _i: X_i \rightarrow X_{i+1}$ as the restrictions of $p_i$. Let
Then each nth $\boldsymbol {w}$ -weighted Bowen ball is approximately a square of side length $\varepsilon m_1^{-n}$ . Therefore,
1.4. Tsukamoto’s approach and its extension
Following the work of Feng and Huang [FH16] described in §1.3, Tsukamoto [Tsu22] published an intriguing approach to these invariants. There, he gave a new definition of the weighted topological pressure for a factor map between two dynamical systems:
He then proved the variational principle using his definition, showing the surprising coincidence of the two definitions. His definition of weighted topological entropy allowed for relatively easy calculations for sets like self-affine carpets.
We will extend Tsukamoto’s idea, redefine the weighted topological pressure for a sequence of dynamical systems of arbitrary length, and establish the variational principle. Here we will explain our definition in the case $f \equiv 0$ . See §2 for the general setting. We will not explain Tsukamoto’s definition itself since it is obtained by letting $r=2$ in the following argument.
Consider a sequence of dynamical systems:
Take a metric $d^{(i)}$ on $X_i$. Let ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ with $0 \leq a_i \leq 1$ for each i. Let N be a natural number and $\varepsilon $ a positive number. We define a new metric $d^{(i)}_N$ on $X_i$ by
$$ \begin{align*} d^{(i)}_N(x_1, x_2) = \max_{0 \leq n < N} d^{(i)}(T_i^n x_1, T_i^n x_2). \end{align*} $$
We inductively define a quantity $\#^{\boldsymbol {a}}_i(\Omega , N, \varepsilon )$ for $\Omega \subset X_i$ . For $\Omega \subset X_1$ , set
(The quantity $\#^{\boldsymbol {a}}_1(\Omega , N, \varepsilon )$ is independent of the parameter $\boldsymbol {a}$ . However, we use this notation for the convenience of what follows.) Let $\Omega \subset X_{i+1}$ . Suppose $\#^{\boldsymbol {a}}_i$ is already defined. We set
We define the topological entropy of ${\boldsymbol {a}}$ -exponent $h^{\boldsymbol {a}}(\boldsymbol {T})$ , where $\boldsymbol {T} = (T_i)_i$ , by
This limit exists since $\log {\#^{\boldsymbol {a}}_r(X_r, N, \varepsilon )}$ is sub-additive in N and non-decreasing as $\varepsilon $ tends to $0$ .
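The inductive counting can be made concrete in a symbolic toy model. The sketch below rests on several assumptions that go beyond the text: $r = 2$, $X_1$ is replaced by the full shift over the digit set $D$ of Figure 1, $X_2$ by the full shift over its first coordinates, cylinders of length N stand in for sets of small $d_N$-diameter, and the inductive step is taken to be a sum of $a_1$-th powers of the counts on preimages. The helper `weighted_word_count` is hypothetical. It evaluates the sum only for the cover by cylinders, so in general it gives an upper estimate rather than the infimum.

```python
import itertools
import math

def weighted_word_count(digits, a, N):
    """Sum over base words u of length N of (number of lifts of u) ** a."""
    base = sorted({i1 for i1, _ in digits})
    # fiber[e] = number of digits in D projecting onto the base symbol e.
    fiber = {e: sum(1 for i1, _ in digits if i1 == e) for e in base}
    total = 0.0
    for u in itertools.product(base, repeat=N):
        lifts = math.prod(fiber[e] for e in u)  # words over D lying above u
        total += lifts ** a
    return total

D = {(0, 0), (1, 1), (0, 2)}
a = math.log(2) / math.log(3)        # a_1 = log_3 2 for the carpet of Figure 1
z0 = sum(c ** a for c in (2, 1))     # sum of fiber counts raised to a_1
for N in (1, 2, 5, 8):
    rate = math.log(weighted_word_count(D, a, N)) / N
    print(N, rate, math.log(z0))
```

In this model the sum factorizes over the N coordinates, so the exponential growth rate is the same for every N. Dividing it by $\log 2$ gives the Hausdorff dimension $\log _2{(1+2^{\log _3{2}})}$ quoted in §1.2.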
From ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ , we define a probability vector (that is, all entries are non-negative, and their sum is 1) $\boldsymbol {w_a} = (w_1, \ldots , w_r)$ by
The following theorem is a direct consequence of our main result in Theorem 2.1.
Theorem 1.2. For ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ with $0 \leq a_i \leq 1$ for each i,
The strategy of the proof is adopted from Tsukamoto’s paper. However, there are some additional difficulties. Let $h^{\boldsymbol {a}}_{\mathrm {var}}(\boldsymbol {T})$ be the right-hand side of equation (1.6). We use the ‘zero-dimensional trick’ for proving $h^{\boldsymbol {a}}(\boldsymbol {T}) \leq h^{\boldsymbol {a}}_{\mathrm {var}}(\boldsymbol {T})$, meaning we reduce the proof to the case where all dynamical systems are zero-dimensional. Merely taking a zero-dimensional extension for each $X_i$ does not work. Therefore, we realize this by taking step-by-step extensions of the whole sequence of dynamical systems (see §3.3). Then we show $h^{\boldsymbol {a}}(\boldsymbol {T}) \leq h^{\boldsymbol {a}}_{\mathrm {var}}(\boldsymbol {T})$ by using an appropriate measure, the definition of which is quite sophisticated (see $\sigma _N$ in the proof of Theorem 4.1). In proving $h^{\boldsymbol {a}}(\boldsymbol {T}) \geq h^{\boldsymbol {a}}_{\mathrm {var}}(\boldsymbol {T})$, the zero-dimensional trick cannot be used. The proof, therefore, requires a detailed estimation of these quantities for arbitrary covers, which is more complicated than the original argument in [Tsu22].
Theorem 1.2 and Feng and Huang’s version of the variational principle in equation (1.3) yield the following corollary.
Corollary 1.3. For ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ with $0 < a_i \leq 1$ for each i,
This corollary is rather profound, connecting the two seemingly different quantities. We can calculate the Hausdorff dimension of self-affine sponges using this result as in the following example. Additionally, we will show in §6 that we can now determine the Hausdorff dimension of certain sofic sets in higher-dimensional Euclidean space.
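The passage from the exponents $\boldsymbol {a}$ to the weights $\boldsymbol {w_a}$ can be sketched numerically. The code assumes that the weights take the telescoping form $w_1 = a_1 a_2 \cdots a_{r-1}$, $w_{i+1} = (1-a_i) a_{i+1} \cdots a_{r-1}$ for $1 \leq i \leq r-2$, and $w_r = 1 - a_{r-1}$; this form is an assumption of the sketch, chosen because it always produces a probability vector. Both function names are hypothetical.

```python
import math

def weights_from_exponents(a):
    """Assumed form: w_1 = a_1...a_{r-1}, w_{i+1} = (1 - a_i) a_{i+1}...a_{r-1}."""
    r = len(a) + 1
    # tail[i] = a[i] * a[i+1] * ... * a[r-2] (0-based); tail[r-1] = 1 (empty product).
    tail = [1.0] * r
    for i in range(r - 2, -1, -1):
        tail[i] = a[i] * tail[i + 1]
    w = [tail[0]]
    for i in range(r - 1):
        w.append((1.0 - a[i]) * tail[i + 1])
    return w

def sponge_exponents(m):
    """a_i = log_{m_{r-i+1}} m_{r-i} for i = 1, ..., r-1, with m = (m_1, ..., m_r)."""
    r = len(m)
    return [math.log(m[r - i - 1]) / math.log(m[r - i]) for i in range(1, r)]

m = (2, 3, 5)
a = sponge_exponents(m)          # [log_5 3, log_3 2]
w = weights_from_exponents(a)
print(a, w, sum(w))
```

Under this assumption the products of logarithms telescope, so the exponents of a sponge with $m = (2, 3, 5)$ give $w_1 = \log _5 2$, $w_2 = \log _3 2 - \log _5 2$ and $w_3 = 1 - \log _3 2$, and the condition $0 < a_i \leq 1$ guarantees $w_1 > 0$.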
Example 1.4. Let us take another look at self-affine sponges. Kenyon and Peres [KP96, Theorem 1.2] calculated their Hausdorff dimension as follows. Recall the notation in §1.2 and that $m_1 \leq m_2 \leq \cdots \leq m_r$.
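Theorem 1.5. Define a sequence of real numbers $(Z_j)_j$ as follows. Let $Z_r$ be the indicator of D, namely, $Z_r(i_1, \ldots , i_r) = 1$ if $(i_1, \ldots , i_r) \in D$ and $0$ otherwise. Define $Z_{r-1}$ by
$$ \begin{align*} Z_{r-1}(i_1, \ldots, i_{r-1}) = \sum_{i_r=0}^{m_r-1} Z_r(i_1, \ldots, i_{r-1}, i_r). \end{align*} $$
More generally, if $Z_{j+1}$ is already defined, let
$$ \begin{align*} Z_j(i_1, \ldots, i_j) = \sum_{i_{j+1}=0}^{m_{j+1}-1} Z_{j+1}(i_1, \ldots, i_j, i_{j+1})^{\log_{m_{j+2}} m_{j+1}}. \end{align*} $$
Then
$$ \begin{align*} \mathrm{dim}_H K(T, D) = \log_{m_1} Z_0. \end{align*} $$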
We can prove this result in a fairly elementary way by Corollary 1.3, without explicitly using measure theory. Set $a_i = \log _{m_{r-i+1}} m_{r-i}$ for each $1 \leq i \leq r-1$; then $\boldsymbol {w_a}$ equals $\boldsymbol {w}$ in equation (1.4). Combining equation (1.5) and Corollary 1.3, we have
Hence, we need to show the following claim.
Claim 1.6. We have
Proof. Observe first that taking the infimum over closed covers instead of open ones in the definition of $h^{\boldsymbol {a}}(\boldsymbol {T})$ does not change its value. Define a metric $d^{(i)}$ on each $X_i$ by
Let
Define $p_i: D_{r-i+1} \rightarrow D_{r-i}$ by $p_i(e_1, \ldots , e_{r-i+1}) = (e_1, \ldots , e_{r-i})$ . Fix $0 < \varepsilon < {1}/{m_r}$ and take a natural number n with $m_1^{-n} < \varepsilon $ . Fix a natural number N and let $\psi _i: D_{r-i+1}^{N+n} \rightarrow D_{r-i}^{N+n}$ be the product map of $p_i$ , that is, $\psi _i(v_1, \ldots , v_{N+n}) = (p_i(v_1), \ldots , p_i(v_{N+n}))$ .
For $x \in D_{r-i+1}^{N+n}$ , define (recall that $A_i = \mathrm {diag}(m_1, m_2, \ldots , m_{r-i+1})$ )
Then $\{U^{(i)}_x\}_{x \in D^{N+n}_{r-i+1}}$ is a closed cover of $X_i$ with $\mathrm {diam}(U^{(i)}_x, d^{(i)}_N) < \varepsilon $ . For $x, y \in D^{N+n}_{r-i+1}$ , we write $x \backsim y$ if and only if $U^{(i)}_x \cap U^{(i)}_y \ne \varnothing $ . We have for any i and $x \in D_{r-i}^{N+n}$ ,
Notice that for each $x \in D_{r-i}^{N+n}$ , the number of $x' \in D_{r-i}^{N+n}$ with $x' \backsim x$ is not more than $3^r$ . Therefore, for every $v = (v_1^{(1)}, \ldots , v_{N+n}^{(1)}) \in D_{r-1}^{N+n}$ , there are $(v_1^{(k)}, \ldots , v_{N+n}^{(k)}) \in D_{r-1}^{N+n}$ , $k = 2, 3, \ldots , L$ , and $L \leq 3^r$ , with
We inductively continue while considering that the multiplicity is at most $3^r$ and obtain
Therefore,
Next, we prove $h^{\boldsymbol {a}}(\boldsymbol {T}) \geq \log {Z_0}$ . We fix $0 < \varepsilon < {1}/{m_r}$ and use $\varepsilon $ -separated sets. Take and fix $\boldsymbol {s} = (t_1, \ldots , t_r) \in D$ , and set $\boldsymbol {s}_i = (t_1, \ldots , t_{r-i+1})$ . Fix a natural number N and let $\psi _i: D_{r-i+1}^N \rightarrow D_{r-i}^N$ be the product map of $p_i$ as in the previous definition. Define
Then $Q_i$ is an $\varepsilon $ -separated set with respect to the metric $d^{(i)}_N$ on $X_i$ . Consider an arbitrary open cover $\mathscr {F}^{(i)}$ of $X_i$ for each i with the following properties (this $(\mathscr {F}^{(i)})_i$ is defined as a chain of open (N, $\varepsilon $ )-covers of $(X_i)_i$ in Definition 3.1).
(1) For every i and $V \in \mathscr {F}^{(i)}$, we have $\mathrm {diam}(V, d^{(i)}_N) < \varepsilon $.
(2) For each $1 \leq i \leq r-1$ and $U \in \mathscr {F}^{(i+1)}$, there is $\mathscr {F}^{(i)}(U) \subset \mathscr {F}^{(i)}$ such that
$$ \begin{align*} \pi_i^{-1}(U) \subset \bigcup \mathscr{F}^{(i)}(U) \end{align*} $$
and
$$ \begin{align*} \mathscr{F}^{(i)} = \bigcup_{U \in \mathscr{F}^{(i+1)}} \mathscr{F}^{(i)}(U). \end{align*} $$
We have $\#(V \cap Q_i ) \leq 1$ for each $V \in \mathscr {F}^{(i)}$ by (1). Let $(e^{(2)}_1, e^{(2)}_2, \ldots , e^{(2)}_N) \in D_{r-1}^N$ and suppose $U \in \mathscr {F}^{(2)}$ satisfies
Then $\pi _1^{-1}(U)$ contains at least $Z_{r-1}(e^{(2)}_1)\cdots Z_{r-1}(e^{(2)}_N)$ points of $Q_1$ . Hence,
We continue this reasoning inductively and get
This implies
We conclude that
We would like to mention the work of Barral and Feng [BF12, Fe11], and of Yayama [Ya11]. These papers studied the related invariants when $(X_i, T_i)$ ($i =1, \ldots , r$) are subshifts over finite alphabets. In this subshift case, our definition of $h^{\boldsymbol {a}}(\boldsymbol {T})$ (and its pressure version in §2) is essentially the same as that given in [BF12, Theorem 3.1]. Hence, we can say that our definition generalizes the approach in [BF12, Theorem 3.1] from subshifts to general dynamical systems.
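Returning to Example 1.4, the recursion of Theorem 1.5 is easy to evaluate by machine. The following is a minimal sketch under the assumption that the recursion has the standard Kenyon–Peres form: $Z_{r-1}$ counts the digits of D above each prefix, $Z_j$ sums $Z_{j+1}^{\log _{m_{j+2}} m_{j+1}}$ over the last coordinate, and the dimension is $\log _{m_1} Z_0$. The function name is hypothetical.

```python
import itertools
import math
from collections import defaultdict

def sponge_hausdorff_dimension(m, digits):
    """m = (m_1, ..., m_r) non-decreasing; digits = set of r-tuples. Returns log_{m_1} Z_0."""
    r = len(m)
    # Z_r is the indicator of D; only its support is stored.
    z = {tuple(d): 1.0 for d in digits}
    for j in range(r - 1, -1, -1):
        # Exponent applied to Z_{j+1}: 1 at the top level, log_{m_{j+2}} m_{j+1} below.
        theta = 1.0 if j == r - 1 else math.log(m[j]) / math.log(m[j + 1])
        collapsed = defaultdict(float)
        for key, value in z.items():
            collapsed[key[:-1]] += value ** theta  # sum over the last coordinate
        z = collapsed
    return math.log(z[()]) / math.log(m[0])

carpet = sponge_hausdorff_dimension((2, 3), {(0, 0), (1, 1), (0, 2)})
full = sponge_hausdorff_dimension(
    (2, 3, 5), set(itertools.product(range(2), range(3), range(5))))
print(carpet, full)
```

As a sanity check, the carpet of Figure 1 gives $\log _2{(1+2^{\log _3{2}})}$, and the full digit set for $(m_1, m_2, m_3) = (2, 3, 5)$ gives dimension 3, as it must for the whole torus $\mathbb {T}^3$.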
2. Weighted topological pressure
Here, we introduce the new, generalized definition of weighted topological pressure. Let $(X_i, T_i)$ ($i=1, 2, \ldots , r$) be dynamical systems and $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ factor maps. For a continuous function $f: X_1 \to \mathbb {R}$ and a natural number N, set
$$ \begin{align*} S_N f(x) = f(x) + f(T_1 x) + \cdots + f(T_1^{N-1} x). \end{align*} $$
Let $d^{(i)}$ be a metric on $X_i$. Recall that we defined a new metric $d^{(i)}_N$ on $X_i$ by
$$ \begin{align*} d^{(i)}_N(x_1, x_2) = \max_{0 \leq n < N} d^{(i)}(T_i^n x_1, T_i^n x_2). \end{align*} $$
We may write these as $S_N^{T_1}f$ or $d^{T_i}_N$ to clarify the maps $T_1$ and $T_i$ in the definitions above.
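The two ingredients $S_N f$ and $d_N$ can be illustrated on a concrete system. The sketch below assumes the usual definitions, namely the Birkhoff sum $S_N f(x) = \sum _{n=0}^{N-1} f(T^n x)$ and the Bowen metric $d_N(x, y) = \max _{0 \leq n < N} d(T^n x, T^n y)$, and uses the doubling map on the circle as a hypothetical example system that does not appear in the text.

```python
import math

def doubling(x):
    """Doubling map x -> 2x mod 1 on the circle R/Z."""
    return (2.0 * x) % 1.0

def circle_dist(x, y):
    """Distance on R/Z induced by the absolute value on R."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def birkhoff_sum(f, T, x, N):
    """S_N f(x) = f(x) + f(Tx) + ... + f(T^{N-1} x)."""
    total = 0.0
    for _ in range(N):
        total += f(x)
        x = T(x)
    return total

def bowen_metric(d, T, x, y, N):
    """d_N(x, y) = max over 0 <= n < N of d(T^n x, T^n y)."""
    best = 0.0
    for _ in range(N):
        best = max(best, d(x, y))
        x, y = T(x), T(y)
    return best

x, y = 0.25, 0.25 + 2.0 ** -10
for N in (1, 3, 5):
    print(N, bowen_metric(circle_dist, doubling, x, y, N))
print(birkhoff_sum(lambda t: math.cos(2 * math.pi * t), doubling, x, 4))
```

Because the doubling map expands small distances by a factor of 2, the Bowen distance between two nearby points doubles with each additional step until it saturates, which is why $d_N$-small sets shrink exponentially in N.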
Let ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ with $0 \leq a_i \leq 1$ for each i and $\varepsilon $ a positive number. We inductively define a quantity $P^{\boldsymbol {a}}_i(\Omega , f, N, \varepsilon )$ for $\Omega \subset X_i$ . For $\Omega \subset X_1$ , set
Let $\Omega \subset X_{i+1}$ . If $P^{\boldsymbol {a}}_i$ is already defined, let
We define the topological pressure of ${\boldsymbol {a}}$ -exponent $P^{\boldsymbol {a}}(f)$ by
This limit exists since $\log {P^{\boldsymbol {a}}_r(X_r, f, N, \varepsilon )}$ is sub-additive in N and non-decreasing as $\varepsilon $ tends to $0$. When $r=1$, this coincides with the standard definition of the topological pressure $P(f)$ on $(X_1, T_1)$. The topological entropy $h_{\mathrm {top}}(T_1)$ is the value of $P(f)$ when $f \equiv 0$. When we want to clarify the maps $T_i$ and $\pi _i$ used in the definition of $P^{\boldsymbol {a}}(f)$, we will denote it by $P^{\boldsymbol {a}}(f, \boldsymbol {T})$ or $P^{\boldsymbol {a}}(f, \boldsymbol {T}, \boldsymbol {\pi })$ with $\boldsymbol {T}=(T_i)_{i=1}^r$ and $\boldsymbol {\pi } = (\pi _i)_{i=1}^{r-1}$.
Recall that we defined a probability vector $\boldsymbol {w_a} = (w_1, \ldots , w_r)$ from ${\boldsymbol {a}}=(a_1, a_2, \ldots , a_{r-1})$ by
Let
We can now state the main result of this paper.
Theorem 2.1. Let $(X_i, T_i)$ ( $i=1, 2, \ldots , r$ ) be dynamical systems and $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ factor maps. For any continuous function $f: X_1 \to \mathbb {R}$ ,
We define $P^{\boldsymbol {a}}_{\mathrm {var}}(f)$ to be the right-hand side of this equation, where ‘var’ is the abbreviation of ‘variational’. Then we need to prove
3. Preparation
In this section, we prepare several tools which will be used in the proof of Theorem 2.1.
3.1. Basic properties and tools
Let $(X_i, T_i)$ ( $i=1, 2, \ldots , r$ ) be dynamical systems, $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ factor maps, $\boldsymbol {a} = (a_1, \ldots , a_{r-1}) \in [0, 1]^{r-1}$ , and $f: X_1 \to \mathbb {R}$ a continuous function.
We will use the following notions in §§3.3 and 5.
Definition 3.1. Consider a cover $\mathscr {F}^{(i)}$ of $X_i$ for each i. For a natural number N and a positive number $\varepsilon $ , the family $(\mathscr {F}^{(i)})_i$ is said to be a chain of ( $\boldsymbol {N}$ , $\boldsymbol {\varepsilon }$ )-covers of $(X_i)_i$ if the following conditions are true.
(1) For every i and $V \in \mathscr {F}^{(i)}$, we have $\mathrm {diam}(V,d^{(i)}_N) < \varepsilon $.
(2) For each $1 \leq i \leq r-1$ and $U \in \mathscr {F}^{(i+1)}$, there is $\mathscr {F}^{(i)}(U) \subset \mathscr {F}^{(i)}$ such that
$$ \begin{align*} \pi_i^{-1}(U) \subset \bigcup \mathscr{F}^{(i)}(U) \end{align*} $$
and
$$ \begin{align*} \mathscr{F}^{(i)} = \bigcup_{U \in \mathscr{F}^{(i+1)}} \mathscr{F}^{(i)}(U). \end{align*} $$
Moreover, if all the elements of each $\mathscr {F}^{(i)}$ are open/closed/compact, we call $(\mathscr {F}^{(i)})_i$ a chain of open/closed/compact ( $\boldsymbol {N}$ , $\boldsymbol {\varepsilon }$ )-covers of $(X_i)_i$ .
Remark 3.2. Note that we can rewrite $P^{\boldsymbol {a}}_r(X_r, f, N, \varepsilon )$ using chains of open covers as follows. For a chain of (N, $\varepsilon $ )-covers $(\mathscr {F}^{(i)})_i$ of $(X_i)_i$ , let
Then
Just like the classic notion of pressure, we have the following property.
Lemma 3.3. For any natural number m,
where $\boldsymbol {T}^m = ({T_i}^m)_{i=1}^r$ .
Proof. Fix $\varepsilon> 0$ . It is obvious from the definition of $P^{\boldsymbol {a}}_1$ that for any $\Omega _1 \subset X_1$ and a natural number N,
Let $\Omega _{i+1} \subset X_{i+1}$ . By induction on i, we have
Thus,
There exists $0 < \delta < \varepsilon $ such that for any $1 \leq i \leq r$ ,
Then
Letting $i=1$ in equation (3.2), we have for any $\Omega _1 \subset X_1$,
Take $\Omega _{i+1} \subset X_{i+1}$ . Again by induction on i and by equation (3.2), we have
Hence,
Combining with equation (3.1), we have
Therefore,
We will later use the following standard lemma of calculus.
Lemma 3.4.
(1) For $0 \leq a \leq 1$ and non-negative numbers $x, y$,
$$ \begin{align*} (x+y)^a \leq x^a+y^a. \end{align*} $$
(2) Suppose that non-negative real numbers $p_1, p_2, \ldots , p_n$ satisfy $\sum _{i=1}^n p_i = 1$. Then for any real numbers $x_1, x_2, \ldots , x_n$, we have
$$ \begin{align*} \sum_{i=1}^n ( -p_i \log{p_i} +x_i p_i) \leq \log{\sum_{i=1}^n e^{x_i}}. \end{align*} $$
In particular, letting $x_1=x_2=\cdots =x_n=0$ gives
$$ \begin{align*} \sum_{i=1}^n(-p_i \log{p_i}) \leq \log{n}. \end{align*} $$
Here, $0 \cdot \log {0}$ is defined as $0$.
The proof for item (1) is elementary. See [Wal82, §9.3, Lemma 9.9] for item (2).
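Both inequalities of Lemma 3.4 are easy to probe numerically. The sketch below is not a proof; it samples random inputs and also checks the equality case of item (2), which, as a standard fact, is attained by the Gibbs weights $p_i = e^{x_i} / \sum _j e^{x_j}$. The function names are hypothetical.

```python
import math
import random

def power_subadditivity_gap(x, y, a):
    """x^a + y^a - (x + y)^a, non-negative for 0 <= a <= 1 and x, y >= 0."""
    return x ** a + y ** a - (x + y) ** a

def gibbs_gap(p, x):
    """log(sum e^{x_i}) - sum(-p_i log p_i + x_i p_i), with 0 log 0 = 0."""
    lhs = sum((-pi * math.log(pi) if pi > 0 else 0.0) + xi * pi
              for pi, xi in zip(p, x))
    rhs = math.log(sum(math.exp(xi) for xi in x))
    return rhs - lhs

random.seed(0)
worst = min(power_subadditivity_gap(random.uniform(0, 10),
                                    random.uniform(0, 10),
                                    random.uniform(0, 1))
            for _ in range(1000))

xs = [0.3, -1.2, 2.0]
total = sum(math.exp(v) for v in xs)
gibbs = [math.exp(v) / total for v in xs]
print(worst, gibbs_gap(gibbs, xs), gibbs_gap([0.5, 0.5, 0.0], xs))
```

The gap in item (2) vanishes for the Gibbs weights and is strictly positive for the other probability vector tried here, which is the finite-dimensional shadow of the variational principle.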
3.2. Measure-theoretic entropy
In this subsection, we will introduce the classical measure-theoretic entropy (also known as Kolmogorov–Sinai entropy) and state some of the basic lemmas we need to prove Theorem 2.1. The main reference is the book of Walters [Wal82].
Let $(X, T)$ be a dynamical system and $\mu \in \mathscr {M}^T(X)$ . A set $\mathscr {A} = \{A_1, \ldots , A_n\}$ is called a finite partition of X with measurable elements if $X = A_1 \cup \cdots \cup A_n$ , each $A_i$ is a measurable set, and $A_i \cap A_j = \varnothing $ for $i \ne j$ . In this paper, a partition is always finite and consists of measurable elements.
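Let $\mathscr {A}$ and $\mathscr {A}'$ be partitions of X. We define a new partition $\mathscr {A} \vee \mathscr {A}'$ by
$$ \begin{align*} \mathscr{A} \vee \mathscr{A}' = \{ A \cap A' \mid A \in \mathscr{A} \text{ and } A' \in \mathscr{A}' \}. \end{align*} $$
For a natural number N, we define a refined partition $\mathscr {A}_N$ of $\mathscr {A}$ by
$$ \begin{align*} \mathscr{A}_N = \mathscr{A} \vee T^{-1}\mathscr{A} \vee \cdots \vee T^{-(N-1)}\mathscr{A}, \end{align*} $$
where $T^{-i}\mathscr {A} = \{ T^{-i}(A) | A \in \mathscr {A}\}$ is a partition for $i \in \mathbb {N}$.
For a partition $\mathscr {A}$ of X, let
$$ \begin{align*} H_\mu(\mathscr{A}) = - \sum_{A \in \mathscr{A}} \mu(A) \log{\mu(A)}. \end{align*} $$
We set
$$ \begin{align*} h_\mu(T, \mathscr{A}) = \lim_{N \to \infty} \frac{H_\mu(\mathscr{A}_N)}{N}. \end{align*} $$
This limit exists since $H_\mu (\mathscr {A}_N)$ is sub-additive in N. The measure-theoretic entropy $h_\mu (T)$ is defined by
$$ \begin{align*} h_\mu(T) = \sup \{ h_\mu(T, \mathscr{A}) \mid \mathscr{A} \text{ is a partition of } X \}. \end{align*} $$
Let $\mathscr {A}$ and $\mathscr {A}'$ be partitions. Their conditional entropy is defined by
$$ \begin{align*} H_\mu(\mathscr{A} | \mathscr{A}') = - \sum_{\substack{A \in \mathscr{A},\, A' \in \mathscr{A}' \\ \mu(A') \neq 0}} \mu(A \cap A') \log{\frac{\mu(A \cap A')}{\mu(A')}}. \end{align*} $$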
Lemma 3.5.
(1) $H_\mu (\mathscr {A})$ is sub-additive in $\mathscr {A}$: that is, for partitions $\mathscr {A}$ and $\mathscr {A}'$,
$$ \begin{align*} H_\mu(\mathscr{A} \vee \mathscr{A}') \leq H_\mu(\mathscr{A}) + H_\mu(\mathscr{A}'). \end{align*} $$
(2) $H_\mu (\mathscr {A})$ is concave in $\mu $: that is, for $\mu , \nu \in \mathscr {M}^T(X)$ and $0 \leq t \leq 1$,
$$ \begin{align*} H_{(1-t)\mu+t\nu}(\mathscr{A}) \geq (1-t)H_\mu(\mathscr{A}) + tH_\nu(\mathscr{A}). \end{align*} $$
(3) For partitions $\mathscr {A}$ and $\mathscr {A}'$,
$$ \begin{align*} h_\mu(T, \mathscr{A}) \leq h_\mu(T, \mathscr{A}') + H_\mu(\mathscr{A} | \mathscr{A}'). \end{align*} $$
For the proof, see [Wal82, Theorem 4.3(viii), §4.5] for item (1), [Wal82, Remark, §8.1] for item (2), and [Wal82, Theorem 4.12, §4.5] for item (3).
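The quantities of this subsection can be computed explicitly for a Markov measure. The sketch below assumes the standard definitions $H_\mu (\mathscr {A}) = -\sum _{A} \mu (A) \log \mu (A)$ and $\mathscr {A}_N = \mathscr {A} \vee T^{-1}\mathscr {A} \vee \cdots \vee T^{-(N-1)}\mathscr {A}$, takes T to be the shift on two symbols and $\mathscr {A}$ the partition into 1-cylinders, so that the elements of $\mathscr {A}_N$ are the cylinders of length N. The transition matrix and all function names are hypothetical choices for illustration.

```python
import itertools
import math

def cylinder_entropy(pi, P, N):
    """H_mu(A_N) for a stationary Markov measure with initial law pi and transitions P."""
    states = range(len(pi))
    H = 0.0
    for word in itertools.product(states, repeat=N):
        prob = pi[word[0]]
        for s, t in zip(word, word[1:]):
            prob *= P[s][t]
        if prob > 0:  # convention 0 log 0 = 0
            H -= prob * math.log(prob)
    return H

def entropy_rate(pi, P):
    """-sum_i pi_i sum_j P_ij log P_ij, the limit of H_mu(A_N) / N."""
    n = len(pi)
    return -sum(pi[i] * P[i][j] * math.log(P[i][j])
                for i in range(n) for j in range(n) if P[i][j] > 0)

P = [[0.9, 0.1], [0.4, 0.6]]
pi = [0.8, 0.2]  # stationary vector: pi P = pi
H = {N: cylinder_entropy(pi, P, N) for N in range(1, 9)}
rate = entropy_rate(pi, P)
for N in (1, 2, 4, 8):
    print(N, H[N] / N, rate)
```

For a stationary Markov measure one has $H_\mu (\mathscr {A}_N) = H_\mu (\mathscr {A}) + (N-1)h$ with h the entropy rate, so $H_\mu (\mathscr {A}_N)/N$ decreases to h. Sub-additivity in N follows from Lemma 3.5(1) applied to $\mathscr {A}_N$ and $T^{-N}\mathscr {A}_M$ together with the invariance of $\mu $.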
3.3. Zero-dimensional principal extension
Here we will see how we can reduce the proof of $P^{\boldsymbol {a}}(f) \leq P^{\boldsymbol {a}}_{\mathrm {var}}(f)$ to the case where all dynamical systems are zero-dimensional.
First, we review the definitions and properties of (zero-dimensional) principal extension. The introduction here closely follows Tsukamoto’s paper [Tsu22] and the book of Downarowicz [Dow11]. Suppose $\pi : (Y, S) \rightarrow (X, T)$ is a factor map between dynamical systems. Let d be a metric on Y. We define the conditional topological entropy of $\pi $ by
Here,
A factor map $\pi : (Y, S) \rightarrow (X, T)$ between dynamical systems is said to be a principal factor map if
Also, $(Y, S)$ is called a principal extension of $(X, T)$ .
The following theorem is from [Dow11, Corollary 6.8.9].
Theorem 3.6. Suppose $\pi : (Y, S) \rightarrow (X, T)$ is a principal factor map. Then $\pi $ preserves measure-theoretic entropy, namely,
$$ \begin{align*} h_\mu(S) = h_{\pi_*\mu}(T) \end{align*} $$
for any S-invariant probability measure $\mu $ on Y.
More precisely, it is proved in [Dow11, Corollary 6.8.9] that $\pi $ is a principal factor map if and only if it preserves measure-theoretic entropy, provided that $h_{\mathrm {top}}(X, T) < \infty $.
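Suppose $\pi : (X_1, T_1) \rightarrow (X_2, T_2)$ and $\phi : (Y, S) \rightarrow (X_2, T_2)$ are factor maps between dynamical systems. We define a fiber product $(X_1 \times _{X_2}^{} Y, T_1 \times S)$ of $(X_1, T_1)$ and $(Y, S)$ over $(X_2, T_2)$ by
$$ \begin{align*} X_1 \times_{X_2} Y = \{ (x, y) \in X_1 \times Y \mid \pi(x) = \phi(y) \}, \quad (T_1 \times S)(x, y) = (T_1 x, S y). \end{align*} $$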
We have the following commutative diagram:
Here, $\pi '$ and $\psi $ are restrictions of the projections onto Y and $X_1$, respectively:
$$ \begin{align*} \pi'(x, y) = y, \quad \psi(x, y) = x. \end{align*} $$
Since $\pi $ and $\phi $ are surjective, both $\pi '$ and $\psi $ are factor maps. The following lemma is proved in [Tsu22, Lemma 5.3].
Lemma 3.7. If $\phi $ is a principal extension in the diagram in equation (3.3), then $\psi $ is also a principal extension.
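The fiber product construction preceding Lemma 3.7 can be illustrated on finite systems, which are compact metrizable in the discrete topology. The sketch below assumes the usual description of the fiber product as the set of pairs $(x, y)$ with $\pi (x) = \phi (y)$, equipped with the product map. The three cyclic rotations are hypothetical example systems. Since finite systems have zero entropy, the sketch illustrates only the construction and the commutativity of the diagram, not the entropy statement of the lemma.

```python
def fiber_product(X1, Y, pi, phi):
    """All pairs (x, y) in X1 x Y with pi(x) == phi(y)."""
    return [(x, y) for x in X1 for y in Y if pi(x) == phi(y)]

# Hypothetical example: rotations on Z/4, Z/6 and Z/2, with reduction mod 2 as factor maps.
X1, Y, X2 = range(4), range(6), range(2)

def T1(x): return (x + 1) % 4
def S(y): return (y + 1) % 6
def T2(z): return (z + 1) % 2
def pi(x): return x % 2        # factor map X1 -> X2
def phi(y): return y % 2       # factor map Y -> X2

W = fiber_product(X1, Y, pi, phi)

def TS(p): return (T1(p[0]), S(p[1]))   # product map T1 x S
def psi(p): return p[0]                 # projection onto X1
def pi_prime(p): return p[1]            # projection onto Y

print(len(W))
print(all(TS(p) in W for p in W))                       # invariance of the fiber product
print(all(pi(psi(p)) == phi(pi_prime(p)) for p in W))   # the square commutes
```

Both projections are surjective and intertwine the product map with $T_1$ and S respectively, so they are factor maps in the sense of §1.1.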
A dynamical system $(Y, S)$ is said to be zero-dimensional if there is a clopen basis of the topology of Y, where clopen means any element in the basis is both closed and open. A basic example of a zero-dimensional dynamical system is the Cantor set $\{ 0, 1 \}^{\mathbb {N}}$ with the shift map.
A principal extension $(Y, S)$ of $(X, T)$ is called a zero-dimensional principal extension if $(Y, S)$ is zero-dimensional. The following important theorem can be found in [Dow11, Theorem 7.6.1].
Theorem 3.8. For any dynamical system, there is a zero-dimensional principal extension.
Let $(Y_i, R_i)$ ( $i=1, 2, \ldots , m$ ) be dynamical systems, $\pi _i: Y_i \rightarrow Y_{i+1}\, (i=1, 2, \ldots , m-1)$ factor maps, and $\boldsymbol {a} = (a_1, \ldots , a_{m-1}) \in [0, 1]^{m-1}$ . Fix $2 \leq k \leq m-1$ and take a zero-dimensional principal extension $\phi _k: (Z_k, S_k) \rightarrow (Y_k, R_k)$ . For each $1 \leq i \leq k-1$ , let $(Y_i \times _{Y_k} Z_k, R_i \times S_k)$ be the fiber product and $\phi _i: Y_i \times _{Y_k} Z_k \rightarrow Y_i$ be the restriction of the projection as in the earlier definition. We have
By Lemma 3.7, $\phi _i$ is a principal factor map. We define $\Pi _i: Y_i \times _{Y_k} Z_k \rightarrow Y_{i+1} \times _{Y_k} Z_k$ by $\Pi _i(x, y) = ( \pi _i(x), y )$ for each $1 \leq i \leq k-2$ , and $\Pi _{k-1}: Y_{k-1} \times _{Y_k} Z_k \rightarrow Z_k$ as the projection. Then we have the following commutative diagram:
Let
Lemma 3.9. In the settings above,
and
Here, $\boldsymbol {R} = (R_i)_i$ , $\boldsymbol {\pi } = (\pi _i)_i$ , $\boldsymbol {S} = (S_i)_i$ and $\boldsymbol {\Pi } = (\Pi _i)_i$ .
Proof. We remark that the following proof does not require $Z_k$ to be zero-dimensional. Let
and
Let $\nu \in \mathscr {M}^{S_1}(Z_1)$ and $1 \leq i \leq m$. Since all the horizontal maps in equation (3.4) are principal factor maps, we have
It follows that
(The reversed inequality is generally true by the surjectivity of factor maps, yielding equality. However, we do not use this fact.)
Let $d^i$ be a metric on $Y_i$ for each i and $\widetilde {d^k}$ a metric on $Z_k$ . We define a metric $\widetilde {d^i}$ on $(Z_i, S_i)$ for $1 \leq i \leq k-1$ by
Set $\widetilde {d^i} = d^i$ for $k+1 \leq i \leq m$ . Take an arbitrary positive number $\varepsilon $ . There exists $0 < \delta < \varepsilon $ such that for every $1 \leq i \leq m$ ,
Let N be a natural number. We claim that
Take $M> 0$ with
Then there exists a chain of open (N, $\delta $ )-covers $(\mathscr {F}^{(i)})_i$ of $(Z_i)_i$ (see Definition 3.1 and Remark 3.2) with
We can find a compact set $C_U \subset U$ for each $U \in \mathscr {F}^{(m)}$ such that $\bigcup _{U \in \mathscr {F}^{(m)}} C_U = Z_m$. Let $\mathscr {K}^{(m)} := \{ C_U | U \in \mathscr {F}^{(m)} \}$. Since $\Pi _{m-1}^{-1}(C_U) \subset \Pi _{m-1}^{-1}(U)$ is compact for each $U \in \mathscr {F}^{(m)}$, we can find a compact set $E_V \subset V$ for each $V \in \mathscr {F}^{(m-1)}(U)$ such that $\Pi _{m-1}^{-1}(C_U) \subset \bigcup _{V \in \mathscr {F}^{(m-1)}(U)} E_V$. Let $\mathscr {K}^{(m-1)}(C_U) := \{ E_V | V \in \mathscr {F}^{(m-1)}(U) \}$ and $\mathscr {K}^{(m-1)} := \bigcup _{C \in \mathscr {K}^{(m)}} \mathscr {K}^{(m-1)}(C)$. We continue likewise and obtain a chain of compact (N, $\delta $ )-covers $(\mathscr {K}^{(i)})_i$ of $(Z_i)_i$ with
Let $\phi _i(\mathscr {K}^{(i)}) = \{ \phi _i(C) | C \in \mathscr {K}^{(i)} \}$ for each i. Note that for any $\Omega \subset Z_i$ ,
This and equation (3.5) assure that $(\phi _i(\mathscr {K}^{(i)}))_i$ is a chain of compact (N, $\varepsilon $ )-covers of $(Y_i)_i$ . We have
Since f is continuous and each $\phi _i(\mathscr {K}^{(i)})$ is a closed cover, we can slightly enlarge each set in $\phi _i(\mathscr {K}^{(i)})$ and create a chain of open (N, $\varepsilon $ )-covers $(\mathscr {O}^{(i)})_i$ of $(Y_i)_i$ satisfying
Therefore,
Since $M> P^{\boldsymbol {a}}_r(f \circ \phi _1, \boldsymbol {S}, \boldsymbol {\Pi }, N, \delta )$ was chosen arbitrarily, we have
This implies
The following proposition reduces the proof of $P^{\boldsymbol {a}}(f) \leq P^{\boldsymbol {a}}_{\mathrm {var}}(f)$ in the next section to the case where all dynamical systems are zero-dimensional.
Proposition 3.10. For all dynamical systems $(X_i, T_i)$ ($i=1, 2, \ldots , r$) and factor maps $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$, there are zero-dimensional dynamical systems $(Z_i, S_i)\ (i=1, 2, \ldots , r)$ and factor maps $\Pi _i: Z_i \rightarrow Z_{i+1}\ (i=1, 2, \ldots , r-1)$ with the following property: for every continuous function $f: X_1 \rightarrow \mathbb {R}$, there exists a continuous function $g: Z_1 \rightarrow \mathbb {R}$ with
and
Proof. We will first construct zero-dimensional dynamical systems $(Z_i, S_i)$ ( $i=1, 2, \ldots , r$ ) and factor maps $\Pi _i: Z_i \rightarrow Z_{i+1}\ (i=1, 2, \ldots , r-1)$ alongside the following commutative diagram of dynamical systems and factor maps:
where all the horizontal maps are principal factor maps.
By Theorem 3.8, there is a zero-dimensional principal extension $\psi _r: (Z_r, S_r) \rightarrow (X_r, T_r)$ . The set $\{*\}$ is the trivial dynamical system, and the maps $X_r \rightarrow \{*\}$ and $Z_r \rightarrow \{*\}$ send every element to $*$ . For each $1 \leq i \leq r-1$ , the map $X_i \times _{X_r}^{} Z_r \rightarrow X_i$ in the following diagram is a principal factor map by Lemma 3.7:
For $1 \leq i \leq r-2$ , define $\pi _i^{(2)}: X_i \times _{X_r}^{} Z_r \rightarrow X_{i+1} \times _{X_r}^{} Z_r$ by
Then every horizontal map in the right two rows of diagram (3.6) is a principal factor map. Next, take a zero-dimensional principal extension $\psi _{r-1}: (Z_{r-1}, S_{r-1}) \rightarrow (X_{r-1} \times _{X_r}^{} Z_r, T_{r-1} \times S_r)$ and let $\Pi _{r-1} = \pi _{r-1}^{(2)} \circ \psi _{r-1}$ . The rest of diagram (3.6) is constructed similarly, and by Lemma 3.7, each horizontal map is a principal factor map.
Let $f: X_1 \rightarrow \mathbb {R}$ be a continuous map. Applying Lemma 3.9 to the right two rows of diagram (3.6), we get
and
for $\boldsymbol {\Pi ^{(2)}} = (\pi ^{(2)}_i)_i$ and $\boldsymbol {S^{(2)}} = (T_i \times S_r)_i$ . Again by Lemma 3.9,
and
where $\boldsymbol {\Pi ^{(3)}} = ( (\pi ^{(3)}_i)_{i=1}^{r-2}, \Pi _{r-1})$ , and $\boldsymbol {S^{(3)}}$ is the collection of maps associated with $Z_r$ and the third row from the right of diagram (3.6). We continue inductively and obtain the desired inequalities, where g is taken as $f \circ \phi _1 \circ \phi _2 \circ \cdots \circ \phi _r$ .
4. Proof of $P^{\boldsymbol {a}}(f) \leq P^{\boldsymbol {a}}_{\mathrm {var}}(f)$
Let $\boldsymbol {a} = (a_1, \ldots , a_{r-1}) \in [0, 1]^{r-1}$ . Recall that we defined $(w_1, \ldots , w_r)$ by
and $P^{\boldsymbol {a}}_{\mathrm {var}}(f)$ by
where
By Proposition 3.10, it suffices to prove the following theorem in order to obtain $P^{\boldsymbol {a}}(f) \leq P^{\boldsymbol {a}}_{\mathrm {var}}(f)$ for arbitrary dynamical systems.
Theorem 4.1. Suppose $(X_i, T_i)$ ( $i=1, 2, \ldots , r$ ) are zero-dimensional dynamical systems and $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ are factor maps. Then we have
for any continuous function $f: X_1 \rightarrow \mathbb {R}$ .
Proof. Let $d^{(i)}$ be a metric on $X_i$ for each $i=1, 2, \ldots , r$ . Take a positive number $\varepsilon $ and a natural number N. First, we define a finite clopen partition $\mathscr {A}^{(i)}$ of $X_i$ for each i by backward induction on i. Since $X_r$ is zero-dimensional, we can take a sufficiently fine finite clopen partition $\mathscr {A}^{(r)}$ of $X_r $ . That is, each $A \in \mathscr {A}^{(r)}$ is both open and closed, and $\mathrm {diam}(A, d^{(r)}_N) < \varepsilon $ . Suppose $\mathscr {A}^{(i+1)}$ is defined. For each $A \in \mathscr {A}^{(i+1)}$ , take a finite clopen partition $\mathscr {B}(A)$ of $\pi _{i}^{-1} (A) \subset X_{i}$ such that any $B \in \mathscr {B}(A)$ satisfies $\mathrm {diam}(B, d^{(i)}_N) < \varepsilon $ . We let $\mathscr {A}^{(i)} = \bigcup _{A \in \mathscr {A}^{(i+1)}} \mathscr {B}(A)$ . Then $\mathscr {A}^{(i)}$ is a finite clopen partition of $X_i$ . We define
We employ the following notation. For $i<j$ and $A \in \mathscr {A}^{(j)}_N$ , let $\mathscr {A}^{(i)}_N(A)$ be the set of ‘children’ of A:
Also, for $B \in \mathscr {A}^{(i)}_N$ and $i<j$ , we denote by $\widetilde {\pi }_j B$ the unique ‘parent’ of B in $\mathscr {A}^{(j)}_N$ :
We will evaluate $P^{\boldsymbol {a}}(f, N, \varepsilon )$ from above using $\{ \mathscr {A}^{(i)} \}$ . Let $A \in \mathscr {A}^{(2)}_N$ , and start by setting
Let $A \in \mathscr {A}^{(i+1)}_N$ . If $Z^{(i-1)}_N$ is already defined, set
We then define $Z_N$ by
It is straightforward from the construction that
Therefore, we only need to prove that there is a $T_1$ -invariant probability measure $\mu $ on $X_1$ such that
Since each $A \in \mathscr {A}^{(1)}_N$ is closed, we can choose a point $x_A \in A$ so that
We define a probability measure $\sigma ^{}_N$ on $X_1$ by
where $\delta _{x^{}_A}$ is the Dirac measure at $x_A$ . This is indeed a probability measure on $X_1$ since
Although $\sigma _N^{}$ is not generally $T_1$ -invariant, the following well-known trick allows us to create a $T_1$ -invariant measure $\mu $ . We begin by setting
Since $X_1$ is compact, we can take a sub-sequence of $(\mu _N^{})_N$ so that it weakly converges to a probability measure $\mu $ on $X_1$ . Then $\mu $ is $T_1$ -invariant by the definition of $\mu _N$ . We will show that this $\mu $ satisfies
We first prove
To simplify the notation, let
and
Claim 4.2. We have the following equations:
Here, the empty sum $\sum _{j=r}^{r-1}\ ((a_{j}-1)/Z_N) W_N^{(j)}$ is defined to be $0$ .
Proof. Let $A \in \mathscr {A}^{(1)}_N$ . We have
Then
For term $(\mathrm {I})$ , we have
We will show that $(\mathrm {I}\mathrm {I}) = W_N^{(j)}$ . Let $A' \in \mathscr {A}^{(j+1)}_N$ . Then any $A \in \mathscr {A}^{(1)}_N(A')$ satisfies $\widetilde {\pi }_{j+1} A = A'$ . Hence,
The term $(\mathrm {I}\mathrm {I})'$ can be calculated similarly to how we showed $\sigma _N^{}(X_1)=1$ . Namely,
Thus, we get
This completes the proof of the first assertion.
Next, let $2 \leq i \leq r$ . For any $A \in \mathscr {A}^{(i)}_N$ ,
As in the evaluation of term $(\mathrm {I}\hspace {-0.5pt}\mathrm {I})'$ , we have
Hence,
Therefore,
Note that we can calculate term $(\mathrm {I}\hspace {-0.5pt}\mathrm {I}\hspace {-0.5pt}\mathrm {I})$ as
We conclude that
This completes the proof of the claim.
By this claim,
However, we have
Indeed, the coefficient of $W_N^{(k)}$ ( $1 \leq k \leq r-1$ ) is
Thus, we have
Let $\mu ^{(i)} = {\pi ^{(i-1)}}_* \mu $ and $\mu ^{(i)}_N = {\pi ^{(i-1)}}_* \mu _N^{}$ .
Lemma 4.3. Let N and M be natural numbers. For any $1 \leq i \leq r$ ,
Here, $ \lvert {\mathscr {A}^{(i)}} \rvert $ is the number of elements in $\mathscr {A}^{(i)}$ .
Assume for the moment that Lemma 4.3 holds, and let N and M be natural numbers. Together with equation (4.1), we obtain the following evaluation:
Let $N = N_k \to \infty $ along the sub-sequence $(N_k)$ for which $\mu _{N_k}^{} \rightharpoonup \mu $ . This yields
We let $M \to \infty $ and get
Hence,
We are left to prove Lemma 4.3.
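The averaging step used above to pass from $\sigma _N^{}$ to an invariant limit can be checked numerically on a toy example. The following is a minimal sketch, not part of the proof: the finite state space, the map `T`, and the starting measure `sigma` are hypothetical stand-ins for $X_1$ , $T_1$ , and $\sigma _N^{}$ . It illustrates the telescoping identity ${T}_* \mu _N - \mu _N = (1/N)({T^N}_* \sigma - \sigma )$ , which bounds the invariance defect by $2/N$ .

```python
from fractions import Fraction as F

def pushforward(T, sigma):
    """Push a measure on {0, ..., n-1} forward by the map T (given as a list)."""
    out = [F(0)] * len(sigma)
    for x, mass in enumerate(sigma):
        out[T[x]] += mass
    return out

def averaged(T, sigma, N):
    """Return mu_N = (1/N) * sum_{k=0}^{N-1} (T^k)_* sigma."""
    total = [F(0)] * len(sigma)
    current = list(sigma)
    for _ in range(N):
        total = [a + b for a, b in zip(total, current)]
        current = pushforward(T, current)
    return [a / N for a in total]

def invariance_defect(T, mu):
    """l^1 distance between T_* mu and mu (zero iff mu is T-invariant)."""
    return sum(abs(a - b) for a, b in zip(pushforward(T, mu), mu))

# Hypothetical toy system: a map on five points and a Dirac mass at 0.
T = [1, 2, 3, 4, 2]
sigma = [F(1), F(0), F(0), F(0), F(0)]
```

For this toy system, `invariance_defect(T, averaged(T, sigma, N))` never exceeds `2/N`, so any limit of the averaged measures is invariant. The same telescoping argument applies when the starting measure itself depends on N, as $\sigma _N^{}$ does in the proof.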
Proof of Lemma 4.3
This statement appears in the proof of the variational principle in [Reference WaltersWal82, Theorem 8.6], and Tsukamoto also proves it in [Reference TsukamotoTsu22, Claim 6.3]. The following proof is taken from the latter. We give the argument for $i=1$ ; the same argument works for all i.
Let $\mathscr {A} = \mathscr {A}^{(1)}$ . Recall that $\mu _N^{} = ({1}/{N}) \sum _{k=0}^{N-1} {{T_1}^k}_* \sigma _N^{}$ . Since the entropy function is concave (Lemma 3.5), we have
Write $N = qM + r$ with $0 \leq r < M$ (in this proof, r denotes the remainder and not the number of dynamical systems). Then
We will evaluate $\sum _{s=0}^q H_{\sigma _N^{}}(T_1^{-sM-t}\mathscr {A}_M)$ from below for each $0 \leq t \leq M-1$ . First, observe that
We have
without multiplicity. Therefore,
This implies
Now, we sum over t and obtain
Combining with equation (4.2), this implies
It follows that
This completes the proof of Theorem 4.1.
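The index bookkeeping in the proof of Lemma 4.3 can be sanity-checked by a short script. This is a sketch under an assumption: we take the windows to be $\{sM+t, \ldots , sM+t+M-1\}$ for $s = 0, 1, \ldots , q$ , matching the terms $T_1^{-sM-t}\mathscr {A}_M$ , and the helper names `windows` and `check_cover` are hypothetical. The script verifies that these windows are pairwise disjoint and that, together with the at most $M-1$ leftover indices $\{0, \ldots , t-1\}$ , they cover $\{0, \ldots , N-1\}$ .

```python
def windows(N, M, t):
    """Index windows {sM+t, ..., sM+t+M-1} for s = 0, ..., q, where N = qM + r."""
    q, _remainder = divmod(N, M)
    return [range(s * M + t, s * M + t + M) for s in range(q + 1)]

def check_cover(N, M, t):
    """True if the windows are pairwise disjoint and, with {0..t-1}, cover {0..N-1}."""
    flat = [k for w in windows(N, M, t) for k in w]
    disjoint = len(flat) == len(set(flat))
    covered = set(range(t)) | set(flat)
    return disjoint and set(range(N)) <= covered
```

For example, `check_cover(17, 5, 3)` examines the windows starting at 3, 8, 13, and 18 together with the leftover indices 0, 1, 2.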
5. Proof of $P^{\boldsymbol {a}}_{\mathrm {var}}(f) \leq P^{\boldsymbol {a}}(f)$
It seems difficult to use the zero-dimensional trick to prove $P^{\boldsymbol {a}}_{\mathrm {var}}(f) \leq P^{\boldsymbol {a}}(f)$ . Hence, the proof in this section is more involved.
Theorem 5.1. Suppose that $(X_i, T_i)$ ( $i=1, 2, \ldots , r$ ) are dynamical systems and $\pi _i: X_i \rightarrow X_{i+1}\ (i=1, 2, \ldots , r-1)$ are factor maps. Then we have
for any continuous function $f: X_1 \rightarrow \mathbb {R}$ .
Proof. Take and fix $\mu \in \mathscr {M}^{T_1}(X_1)$ . Let $\mu _i= {\pi ^{(i-1)}}_* \mu $ . We need to prove
However, the following argument shows that an evaluation up to a constant is sufficient: suppose there is a positive number C, depending on neither f nor $(T_i)_i$ , satisfying
Applying this to $S_mf$ and $\boldsymbol {T}^m = ({T_i}^m)_i$ for $m \in \mathbb {N}$ yields
We employ Lemma 3.3 and get
Divide by m and let $m \to \infty $ . We obtain the desired inequality
Therefore, we only need to prove inequality (5.1).
Let $\mathscr {A}^{(i)} = \{ A^{(i)}_1, A^{(i)}_2, \ldots , A^{(i)}_{m_i} \}$ be an arbitrary partition of $X_i$ for each i. We will prove
We start by approximating elements of $\mathscr {A}^{(i)}$ with compact sets using backward induction. For $1\leq i \leq r$ , let
We will denote an element $(j_r, j_{r-1}, \ldots , j_i)$ in $\Lambda _i^{0}$ or $\Lambda _i$ by $j_r j_{r-1} \cdots j_i$ . For each $A^{(r)}_j \in \mathscr {A}^{(r)}$ , take a compact set $C^{(r)}_j \subset A^{(r)}_j$ such that
Define $C^{(r)}_0$ as the remainder of $X_r$ , which may not be compact:
Then $\mathscr {C}^{(r)} := \{ C^{(r)}_0, C^{(r)}_1, \ldots , C^{(r)}_{m_r} \}$ is a measurable partition of $X_r$ .
Next, consider the partition $\pi _{r-1}^{-1}(\mathscr {C}^{(r)}) \vee \mathscr {A}^{(r-1)}$ of $X_{r-1}$ . For $j_r j_{r-1} \in \Lambda _{r-1}$ , let
Then
and for each $j_r \in \Lambda _r^0$ ,
For each $j_rj_{r-1} \in \Lambda _{r-1}$ , take a compact set $C^{(r-1)}_{j_r j_{r-1}} \subset B^{(r-1)}_{j_r j_{r-1}}$ (which could be empty) such that
Define $C^{(r-1)}_{j_r 0}$ as the remainder of $\pi _{r-1}^{-1}(C^{(r)}_{j_r})$ :
Then $\mathscr {C}^{(r-1)} = \{ C^{(r-1)}_{j_r j_{r-1}} | j_r j_{r-1} \in \Lambda _{r-1}^{0} \} $ is a measurable partition of $X_{r-1}$ .
Continue in this manner, and suppose we have obtained the partition $\mathscr {C}^{(k)} = \{ C^{(k)}_J | J \in \Lambda _k^{0} \}$ of $X_k$ for $k = i+1, i+2, \ldots , r$ . We will define $\mathscr {C}^{(i)}$ . Each element in $\pi _i^{-1}(\mathscr {C}^{(i+1)}) \vee \mathscr {A}^{(i)}$ can be expressed using $J' \in \Lambda _{i+1}^0$ and $j_i \in \{1,2, \ldots , m_i\}$ by
Choose a compact set $C_J^{(i)} \subset B_J^{(i)}$ for each $J \in \Lambda _{i}$ so that
Finally, for $J' \in \Lambda _{i+1}^{0}$ , let
Set $\mathscr {C}^{(i)} = \{ C^{(i)}_J | J \in \Lambda _i^{0} \}$ . This is a partition of $X_i$ .
Lemma 5.2. For $\mathscr {C}^{(i)}$ constructed above, we have
Proof. By Lemma 3.5,
Since $C^{(i)}_J \subset B^{(i)}_J$ for $J \in \Lambda _i$ ,
By Lemma 3.4, we have
Therefore,
Recall the definition of $\boldsymbol {w}$ in equation (2.1). We have
Here, we used the relation
We fix N and evaluate from above the following terms using backward induction:
First, consider the term
For $C \in \mathscr {C}^{(i+1)}_N$ , let $\mathscr {C}^{(i)}_N(C) = \{ D \in \mathscr {C}^{(i)}_N | \pi _i(D) \subset C \}$ , then by Lemma 3.4,
Applying this inequality to equation (5.2), the following term appears:
This can be evaluated similarly using Lemma 3.4 as
Continue likewise and obtain the following upper bound for equation (5.2):
For $1\leq i \leq r$ , let $\mathscr {C}^{(i)}_c = \{ C \in \mathscr {C}^{(i)} | C \text { is compact} \}$ . There is a positive number $\varepsilon _i$ such that $d^{(i)}(y_1, y_2)> \varepsilon _i$ for any two distinct $C_1, C_2 \in \mathscr {C}^{(i)}_c$ and any $y_1 \in C_1, y_2 \in C_2$ . Fix a positive number $\varepsilon $ with
Let $\mathscr {F}^{(i)}$ be a chain of open (N, $\varepsilon $ )-covers of $X_i$ (see Definition 3.1). Consider
We will evaluate equation (5.4) from above by equation (5.6) up to a constant. We need the next lemma.
Lemma 5.3.
(1) For any $V \subset X_r$ with $\mathrm {diam}(V,d^{(r)}_N) < \varepsilon $ ,
$$ \begin{align*} | \{ D \in \mathscr{C}^{(r)}_N | D \cap V \ne \varnothing \} | \leq 2^N. \end{align*} $$
(2) Let $1\leq i \leq r-1$ and $C \in \mathscr {C}^{(i+1)}_N$ . For any $V \subset X_i$ with $\mathrm {diam}(V,d^{(i)}_N) < \varepsilon $ ,
$$ \begin{align*} | \{ D \in \mathscr{C}^{(i)}_N(C) | D \cap V \ne \varnothing \} | \leq 2^N. \end{align*} $$
Proof. (1) $D \in \mathscr {C}^{(r)}_N$ can be expressed using $C^{(r)}_{k_s} \in \mathscr {C}^{(r)}$ ( $s =0, 1, \ldots , N-1$ ) as
If $D \cap V \ne \varnothing $ , we have $T_r^{-s}(C^{(r)}_{k_s}) \cap V \ne \varnothing $ for every $0 \leq s \leq N-1$ . Then for each s,
By equation (5.5), for each s the index $k_s$ is either $0$ or a single element of $\{ 1, 2, \ldots , m_r \}$ determined by V and s. Therefore, there are at most $2^N$ such sets.
(2) The proof works in the same way as in item (1). C can be written using $J_k \in \Lambda _{i+1}^0$ ( $k=0, 1, \ldots , N-1$ ) as
Then any $D \in \mathscr {C}^{(i)}_N(C)$ is of the form
with $0 \leq k_l \leq m_i$ ( $l = 0, 1, \ldots , N-1$ ). If $D \cap V \ne \varnothing $ , then for each l the index $k_l$ is either $0$ or a single element of $\{ 1, 2, \ldots , m_i \}$ determined by V and l. Therefore, there are at most $2^N$ such sets.
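The counting in Lemma 5.3 can be illustrated on a toy system. The sketch below is not the setting of the lemma: it assumes the doubling map on the circle $\mathbb {R}/\mathbb {Z}$ , two hypothetical compact pieces $C_1 = [0, 3/10]$ and $C_2 = [1/2, 4/5]$ (at circle distance $1/5$ from each other), and the remainder $C_0$ . A sample set `V` of small Bowen diameter then meets, at each time s, at most one compact piece, so its points realize at most $2^N$ itineraries.

```python
from fractions import Fraction as F

def circle_dist(x, y):
    """Distance on the circle R/Z between points of [0, 1)."""
    d = abs(x - y) % 1
    return min(d, 1 - d)

def label(x):
    """Index of the piece containing x: C_1 = [0, 3/10], C_2 = [1/2, 4/5], else C_0."""
    if x <= F(3, 10):
        return 1
    if F(1, 2) <= x <= F(4, 5):
        return 2
    return 0

def orbit(x, N):
    """First N points of the orbit of x under the doubling map."""
    points = []
    for _ in range(N):
        points.append(x)
        x = (2 * x) % 1
    return points

def bowen_diameter(points, N):
    """Diameter of the sample with respect to d_N = max_{0 <= s < N} d(T^s x, T^s y)."""
    orbits = [orbit(x, N) for x in points]
    return max(circle_dist(a, b)
               for o1 in orbits for o2 in orbits for a, b in zip(o1, o2))

def itineraries(points, N):
    """Set of label sequences (k_0, ..., k_{N-1}) realized by the sample."""
    return {tuple(label(y) for y in orbit(x, N)) for x in points}

def nonzero_labels_by_time(points, N):
    """For each time s, the set of compact pieces met by T^s of the sample."""
    sets = [set() for _ in range(N)]
    for x in points:
        for s, y in enumerate(orbit(x, N)):
            if label(y) != 0:
                sets[s].add(label(y))
    return sets

N = 6
EPS = F(1, 10)  # smaller than the separation 1/5 between C_1 and C_2
# Sample of an arc of length 1/400 straddling the boundary point 3/10 of C_1.
V = [F(3, 10) - F(1, 800) + F(j, 40000) for j in range(101)]
```

Because the Bowen diameter of `V` is below `EPS`, every entry of `nonzero_labels_by_time(V, N)` has at most one element, which is exactly the mechanism behind the bound $2^N$ .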
For any $C^{(1)} \in \mathscr {C}^{(1)}_N$ , there is $V \in \mathscr {F}^{(1)}$ with $V \cap C^{(1)} \ne \varnothing $ and
Let $C^{(2)} \in \mathscr {C}^{(2)}_N$ , then by Lemma 5.3,
By Lemma 3.4,
For $C^{(3)} \in \mathscr {C}^{(3)}_N$ , we apply Lemmas 5.3 and 3.4 similarly and obtain
We continue this reasoning and get
Here, $\alpha $ stands for $\sum _{i=1}^{r-1} a_i a_{i+1}\cdots a_{r-1}$ . We take the logarithm of both sides; the left-hand side equals equation (5.4), which is an upper bound for equation (5.2). Furthermore, consider the infimum over the chain of open (N, $\varepsilon $ )-covers $(\mathscr {F}^{(i)})_i$ on the right-hand side. By Remark 3.2, this yields
Divide by N, then let $N \to \infty $ and $\varepsilon \to 0$ . We obtain
Lemma 5.2 yields
We take the supremum over the partitions $(\mathscr {A}^{(i)})_i$ :
By the argument at the beginning of this proof, we conclude that
6. Example: sofic sets
Kenyon and Peres [Reference Kenyon and PeresKP96-2] calculated the Hausdorff dimension of sofic sets in $\mathbb {T}^2$ . In this section, we will see that we can calculate the Hausdorff dimension of certain sofic sets in $\mathbb {T}^d$ with arbitrary d. We give an example for the case $d=3$ .
6.1. Definition of sofic sets
This subsection follows [Reference Kenyon and PeresKP96-2]. Weiss [Reference WeissWe82] defined sofic systems as subshifts which are factors of shifts of finite type. Boyle, Kitchens, and Marcus proved in [Reference Boyle, Kitchens and MarcusBKM85] that this is equivalent to the following definition.
Definition 6.1. [Reference Kenyon and PeresKP96-2, Proposition 3.6]
Consider a finite directed graph $G = \langle V, E \rangle $ in which loops and multiple edges are allowed. Suppose its edges are colored in l colors in a ‘right-resolving’ fashion: every two edges emanating from the same vertex have different colors. Then the set of color sequences that arise from infinite paths in G is called the sofic system.
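To make Definition 6.1 concrete, the following sketch enumerates the finite label words read along paths of a small labelled graph. The graph `EDGES` is a hypothetical example (it is not the graph of Figure 2), with two vertices and two colors, and the function names are ours. Since every vertex has an outgoing edge, each finite word listed extends to an infinite path, so these are the words that occur in the resulting sofic system.

```python
# Hypothetical labelled graph: each edge is (source, target, color).
EDGES = [(0, 0, "a"), (0, 1, "b"), (1, 0, "b")]

def is_right_resolving(edges):
    """True if no two edges leaving the same vertex share a color."""
    seen = set()
    for source, _target, color in edges:
        if (source, color) in seen:
            return False
        seen.add((source, color))
    return True

def words(edges, n):
    """Color sequences of length n read along paths in the graph."""
    vertices = {e[0] for e in edges} | {e[1] for e in edges}
    # Map each word to the set of vertices at which a path with that word can end.
    frontier = {(): set(vertices)}
    for _ in range(n):
        extended = {}
        for word, ends in frontier.items():
            for source, target, color in edges:
                if source in ends:
                    extended.setdefault(word + (color,), set()).add(target)
        frontier = extended
    return {"".join(word) for word in frontier}
```

In this example the word `aba` never appears, because the only edge colored `a` is the loop at vertex 0 while a single `b` read from vertex 0 ends at vertex 1. A forbidden pattern of this kind is typical of sofic systems.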
Let $m_1 \leq m_2 \leq \cdots \leq m_r$ be natural numbers, T an endomorphism on $\mathbb {T}^r = \mathbb {R}^r/\mathbb {Z}^r$ represented by the diagonal matrix $A = \mathrm {diag}(m_1, m_2, \ldots , m_r)$ , and $D = \prod _{i=1}^r \{0, 1, \ldots , m_i-1\}$ . Define a map $R_r: D^{\mathbb {N}} \rightarrow \mathbb {T}^r$ by
where $e^{(k)} = (e^{(k)}_1, \ldots , e^{(k)}_r) \in D$ for each k. Suppose the edges in some finite directed graph are labeled by the elements in D in the right-resolving fashion, and let $S \subset D^{\mathbb {N}}$ be the resulting sofic system. The image of S under $R_r$ is called a sofic set.
6.2. An example of a sofic set
Here we will look at an example of a sofic set and calculate its Hausdorff dimension via its weighted topological entropy. Let $D = \{0, 1\} \times \{0, 1, 2\} \times \{0, 1, 2, 3\}$ , and consider the directed graph $G = \langle V, E \rangle $ with $V = \{1, 2, 3\}$ and D-labeled edges E in Figure 2.
Let $Y_1 \subset D^{\mathbb {N}}$ be the resulting sofic system. Let $C = \{0, 1\} \times \{0, 1, 2\}$ and $B = \{0, 1\}$ . Define $p_1: D \rightarrow C$ and $p_2: C \rightarrow B$ by
Let $p_1^{\mathbb {N}}: D^{\mathbb {N}} \rightarrow C^{\mathbb {N}}$ and $p_2^{\mathbb {N}}: C^{\mathbb {N}} \rightarrow B^{\mathbb {N}}$ be the product maps of $p_1$ and $p_2$ , respectively. Set $Y_2 = p_1^{\mathbb {N}}(Y_1)$ and $Y_3 = p_2^{\mathbb {N}}(Y_2)$ . Note that $Y_2 = \{ (0, 0), (1,0), (0, 1) \}^{\mathbb {N}}$ and $Y_3 = \{0, 1\}^{\mathbb {N}}$ , meaning they are full shifts.
The sets $X_i = R_i(Y_i) (i = 1, 2, 3)$ are sofic sets. Define $\pi _1: X_1 \rightarrow X_2$ and $\pi _2: X_2 \rightarrow X_3$ by
Furthermore, let $T_1$ , $T_2$ , and $T_3$ be the endomorphisms on $X_1$ , $X_2$ , and $X_3$ represented by the matrices $\mathrm {diag}(2, 3, 4)$ , $\mathrm {diag}(2, 3)$ , and $\mathrm {diag}(2)$ , respectively. Then $(X_i, T_i)_i$ and $(\pi _i)_i$ form a sequence of dynamical systems.
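The factor-map condition $\pi _i \circ T_i = T_{i+1} \circ \pi _i$ for these diagonal endomorphisms can be checked with exact rational arithmetic. The sketch below assumes that $\pi _1$ and $\pi _2$ are the coordinate projections dropping the last coordinate; this is our reading of the setup, and the function names are hypothetical. The check is performed on sample points of the whole torus, which contains the sofic sets $X_i$ .

```python
from fractions import Fraction as F

def T1(p):
    """Endomorphism of the 3-torus represented by diag(2, 3, 4)."""
    x, y, z = p
    return ((2 * x) % 1, (3 * y) % 1, (4 * z) % 1)

def T2(p):
    """Endomorphism of the 2-torus represented by diag(2, 3)."""
    x, y = p
    return ((2 * x) % 1, (3 * y) % 1)

def T3(p):
    """Endomorphism of the circle represented by diag(2)."""
    (x,) = p
    return ((2 * x) % 1,)

def pi1(p):
    """Assumed projection (x, y, z) -> (x, y)."""
    return p[:2]

def pi2(p):
    """Assumed projection (x, y) -> (x,)."""
    return p[:1]

# A few rational sample points of the 3-torus.
SAMPLES = [(F(a, 7), F(b, 11), F(c, 13))
           for a in range(7) for b in range(0, 11, 3) for c in range(0, 13, 4)]
```

Each coordinate of $T_1$ acts independently, so dropping a coordinate commutes with the dynamics; the test on `SAMPLES` confirms this for both projections.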
For a natural number N, denote by $Y_i|_N$ the restriction of $Y_i$ to its first N coordinates, and let $p_{i, N}: Y_i|_N \rightarrow Y_{i+1}|_N$ be the projections for $i = 1, 2$ . As in Example 1.4, we have for any exponent $\boldsymbol {a} = (a_1, a_2) \in [0, 1]^2$ ,
Now, let us evaluate $| {p_{1, N}}^{-1}(v) |$ using matrix products. This idea of using matrix products is due to Kenyon and Peres [Reference Kenyon and PeresKP96-2]. Fix $(a, b) \in {\{0, 1\}}^2$ and let
Define a $3 \times 3$ matrix by $A_{(a, b)} = (a_{ij})_{ij}$ . Then we have
Note that ${A_{(0, 0)}}^2 = A_{(0, 1)}$ and ${A_{(0, 0)}}^3 = A_{(1, 0)}$ . For $v = (v_1, \ldots , v_N) \in Y_2|_N$ , we have
Here, $A \asymp B $ means there is a constant $c> 0$ independent of N with $c^{-1}B \leq A \leq cB$ . For $\alpha = ({1 + \sqrt {5}})/{2}$ , we have $\alpha ^2 = \alpha + 1$ and
Therefore,
where $\unicode{x3bb} _{(0,0)} = \alpha $ , $\unicode{x3bb} _{(0,1)} = \alpha ^2$ , $\unicode{x3bb} _{(1,0)} = \alpha ^3$ .
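The relation between powers of a matrix and powers of the golden ratio can be checked numerically. The sketch below uses the $2 \times 2$ Fibonacci matrix `FIB` as a hypothetical stand-in; it is not the matrix $A_{(0,0)}$ of this example, only a matrix whose largest eigenvalue is also $\alpha $ . It checks that the sum of the entries of the n-th power stays comparable to $\alpha ^n$ , and that squaring or cubing the matrix replaces the growth rate by $\alpha ^2$ or $\alpha ^3$ , mirroring the values of $\unicode{x3bb} _{(0,1)}$ and $\unicode{x3bb} _{(1,0)}$ .

```python
import math

ALPHA = (1 + math.sqrt(5)) / 2  # golden ratio, satisfies ALPHA**2 == ALPHA + 1

def matmul(A, B):
    """Product of two square integer matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, n):
    """n-th power of a square matrix for n >= 1."""
    result = A
    for _ in range(n - 1):
        result = matmul(result, A)
    return result

def entry_sum(A):
    """Sum of all entries, a convenient norm for nonnegative matrices."""
    return sum(sum(row) for row in A)

def growth_ratios(A, rate, n_max):
    """Ratios entry_sum(A^n) / rate^n for n = 1, ..., n_max."""
    return [entry_sum(matpow(A, n)) / rate ** n for n in range(1, n_max + 1)]

# Hypothetical stand-in with largest eigenvalue ALPHA (not the paper's matrix).
FIB = [[1, 1], [1, 0]]
FIB2 = matpow(FIB, 2)
FIB3 = matpow(FIB, 3)
```

If the ratios returned by `growth_ratios` stay between two positive constants, the entry sum is comparable to the chosen rate in the sense of $\asymp $ defined above.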
Take $u \in Y_3|_N = {\{0, 1\}}^{N}$ and suppose u contains n zeros. If $v = (v_1, \ldots , v_N) \in {p_{2, N}}^{-1}(u)$ contains k terms equal to $(0, 0)$ , then it contains $n - k$ terms equal to $(0, 1)$ and $N - n$ terms equal to $(1, 0)$ . Then,
Therefore (recall that $Y_2 = \{(0,0), (1,0), (0,1)\}^{\mathbb {N}}$ ),
This implies
We conclude that
As in Example 1.4, the Hausdorff dimension of $X_1$ is obtained by letting $a_1 = \log _4{3}$ and $a_2 = \log _3{2}$ :
Acknowledgements
I am deeply grateful to my mentor, Masaki Tsukamoto, who has not only reviewed this paper several times throughout the writing process but has also patiently helped me understand ergodic theory in general with his expertise. I also want to thank my family and friends for their unconditional support and everyone who has participated in my study for their time and willingness to share their knowledge. This work would not have been possible without their help.