1. Introduction
The Potts model, originally invented to study ferromagnetism [Reference Potts18], is a model from statistical physics; it also plays a central role in probability theory, combinatorics and computer science.
Let $G = (V,E)$ be a finite graph. The Potts model on the graph $G$ has two parameters, a number of states $q \in \mathbb{Z}_{\geq 2}$ and an interaction parameter $w\geq 0$ . The case $q=2$ is known as the Ising model. A configuration is a map $\sigma\;:\;V \to [q]\;:\!=\;\{1,\ldots,q\}$ . Associated with such a configuration is a weight Footnote 1 $w^{m(\sigma )}$ , where $m(\sigma )$ is the number of monochromatic edges in the configuration $\sigma$ . The $q$ -state partition function of the Potts model is the sum of the weights over all configurations; we denote it as $ Z(G, q, w) = \sum _{ \sigma:V \rightarrow [q]}\!w^{m(\sigma )}.$ In statistical physics one has $w=e^{kJ/T}$ , with $J$ being an interaction parameter, $k$ the Boltzmann constant and $T$ the temperature. We write $Z(G)$ to keep notation short.
The Gibbs measure is the probability measure $\mathbb{P}_G[\cdot]$ on the set of configurations of $G=(V,E)$ , where the probability of a random configurationFootnote 2 $\boldsymbol{\Phi }$ being equal to a given configuration $\phi\;:\;V\to [q]$ is proportional to the weight of $\phi$ :
The Potts model is said to be ferromagnetic if $w \gt 1$ and anti-ferromagnetic if $w \lt 1$ . The ferromagnetic Potts model favours configurations with a large number of monochromatic edges, while the anti-ferromagnetic Potts model favours configurations with a small number of monochromatic edges, i.e., configurations that are ‘close’ to proper colourings.
In statistical physics, models like the Potts model are typically considered on infinite graphs such as $\mathbb{Z}^d$ or the Bethe lattice $\mathbb{T}_d$ , also known as the infinite $(d+1)$ -regular tree. One can extend the notion of Gibbs measures for finite graphs to this infinite setting (see below for more details). The Gibbs measure on a finite graph is clearly unique. For infinite graphs, depending on the underlying parameter $w$ , there may however be multiple Gibbs measures. The transition from having a unique Gibbs measure to multiple Gibbs measures is referred to as a phase transition in statistical physics [Reference Friedli and Velenik10] and it is an important problem to determine when this happens in terms of the underlying parameters of the model. Moreover, for several $2$ -state models, the uniqueness region and the transition from uniqueness to non-uniqueness of the Gibbs measure on $\mathbb{T}_d$ have been connected to the tractability of approximately computing partition functions of these models. See e.g. [Reference Weitz24, Reference Sly21, Reference Sly and Sun22, Reference Galanis, Štefankovič and Vigoda13, Reference Sinclair, Srivastava and Thurley23, Reference Li, Lu and Yin15]. In the case of the anti-ferromagnetic Potts model it is known that in the uniqueness regime on $\mathbb{T}_d$ there is an efficient algorithm to approximately compute the partition function and sample from the Gibbs measure on random $(d+1)$ -regular graphs [Reference Blanca, Galanis, Goldberg, Štefankovič, Vigoda and Yang4]. See also [Reference Efthymiou9] for related results on Erdős-Rényi random graphs without any assumption on uniqueness. It is moreover expected that the uniqueness to non-uniqueness transition for the anti-ferromagnetic Potts model says something about the tractability of approximating the partition function for the entire class of bounded degree graphs. In particular, approximating the partition function of the Potts model is NP-hard on graphs of maximum degree $d+1$ when $0\lt w\lt 1 - \frac{q}{d+1}$ [Reference Galanis, Štefankovič and Vigoda12] (for even $q$ ). It is a major open problem to determine whether there exist efficient algorithms for all $w\in (1-\frac{q}{d+1},1]$ .
In the present paper we consider the problem of determining when the anti-ferromagnetic Potts model on the infinite $(d+1)$ -regular tree has a unique Gibbs measure. Before stating our main result, we first give a formal definition of Gibbs measures on the $(d+1)$ -regular tree.
Gibbs measures, uniqueness and main result
We follow Brightwell and Winkler [Reference Brightwell and Winkler7, Reference Brightwell and Winkler8] to introduce the notion of Gibbs measures on $\mathbb{T}_d$ , see also [Reference Rozikov20, Reference Friedli and Velenik10] for more details and background.
Throughout we fix a degree $d\geq 2$ and an integer $q\geq 2$ . We denote the vertex set of $\mathbb{T}_d$ by $V_d$ and we denote the space of all configurations $\{\psi\;:\;V_d\to [q]\}$ by $\Omega _{q,d}$ . For a set $U\subset V_d$ we denote by $\partial U$ the set of vertices in $U$ that are adjacent to some vertex in $V_d\setminus U$ . We refer to $\partial U$ as the boundary of $U$ . We denote by $U^{\circ }\;:\!=\;U\setminus \partial U$ the interior of $U$ . For $\psi \in \Omega _{q,d}$ and $U\subset V_d$ we denote the restriction of $\psi$ to $U$ by $\psi \!\restriction _U$ .
Definition 1.1. (Gibbs measure) We equip $\Omega _{q,d}$ with the sigma algebra generated by sets of the form $\{\psi \in \Omega _{q,d}\mid \psi \!\restriction _U=\phi \}$ where $U\subset V_d$ is a finite set and $\phi\;:\;U\to [q]$ a fixed colouring of $U$ . A probability measure $\mu$ on $\Omega _{q,d}$ is called a Gibbs measure if for any finite set $U\subset V_d$ and $\mu$ -almost every $\phi \in \Omega _{q,d}$ , we have
where the second probability $\mathbb{P}_{U}$ denotes the probability of seeing configuration $\phi$ on the finite graph $\mathbb{T}_d[U]$ induced by $U$ conditioned on the event of being equal to $\phi$ on $\partial U$ . This latter probability is obtained by dividing the weight of ${\boldsymbol \Phi }\!\restriction _U$ by the sum of the weights of all colourings of $U$ that agree with $\phi$ on $\partial U$ , cf. (1).
Remark 1.2. Note that the conditional probability on the left-hand side of (2) cannot be computed using the standard formula for conditional probabilities, as we in general condition on an event of measure zero. Therefore, the formalism of conditional expectations should be used to evaluate this conditional probability. See [Reference Friedli and Velenik10] for more details.
By a compactness argument one can show that there always is at least one Gibbs measure on $\Omega _{q,d}$ cf. [Reference Friedli and Velenik10, Reference Brightwell and Winkler7]. The question of whether there is a unique Gibbs measure can be reformulated in terms of a certain decay of correlations. To do so we require some definitions. We denote by $\mathbb{T}^n_{d}$ the finite tree obtained from $\mathbb{T}_d$ by fixing a root vertex $r_d$ , deleting all vertices at distance more than $n$ from the root, deleting one of the neighbours of $r_d$ and keeping the connected component containing $r_d$ . We denote the set of leaves of $\mathbb{T}^n_{d}$ by $\Lambda _{n,d}$ , except when $n=0$ , in which case we let $\Lambda _{d,0}=\{r_d\}$ . We omit the reference to $d$ when this is clear from the context. The next lemma reformulates uniqueness of the Gibbs measure in terms of the dependence on the distribution of the colours of the root vertex on the colouring of the leaves.
Lemma 1.3. The $q$ -state Potts model with parameter $w\geq 0$ on the infinite $(d+1)$ -regular tree has a unique Gibbs measure if and only if for all colours $c \in [q]$ it holds that
While this result is well known we will provide a proof for convenience of the reader in Appendix A based on Brightwel and Winkler’s proof [Reference Brightwell and Winkler8, Theorem 3.3] for the case $w=0$ . We moreover note that (3) is the property of uniqueness used in algorithmic applications [Reference Blanca, Galanis, Goldberg, Štefankovič, Vigoda and Yang4].
Define $w_c\;:\!=\;\max \{0,1-\frac{q}{d+1}\}$ . It is a folklore conjecture that this Gibbs measure is unique if and only if $w\geq w_c$ (if $q=d+1$ , the inequality should be read as a strict inequality). Non-uniqueness for $w\lt w_c$ has been known for a long time [Reference Peruggi, di Liberto and Monroy17, Reference Peruggi, di Liberto and Monroy16]
For $q\gt d+1$ and thus $w_c=0$ , a Gibbs measure is supported on proper $q$ -colourings and in this case the conjecture has been shown to be true by Jonasson [Reference Jonasson14]. For the case $q=3$ and $d\geq 2$ and the case $q=4$ and $d=4$ this has recently been proved by Galanis, Goldberg and Yang [Reference Galanis, Goldberg and Yang11]. Our main result confirms this conjecture for $q=4$ and $d\geq 4$ . Very recently Bencs together with the authors of the present paper [Reference Bencs, de Boer, Buys and Regts2] have confirmed this conjecture for all $q\geq 5$ provided $d$ is large enough.
Main Theorem. Let $d\in \mathbb{N}_{\geq 4}$ . Then the $4$ -state anti-ferromagnetic Potts model on $\mathbb{T}_d$ has a unique Gibbs measure if and only if $w\geq 1-\frac{4}{d+1}$ .
Our proof of this result follows a different approach than the one taken in [Reference Galanis, Goldberg and Yang11], which heavily relies on rigorous (but not easily verifiable) computer calculations. In particular, our approach allows us to recover the results from [Reference Galanis, Goldberg and Yang11], thereby removing the need for these computer calculations. See Theorem 3.8 below for the full statement of what we prove with our approach.
Organization. In the next section we discuss our approach towards proving our main theorem arriving at a geometric condition for uniqueness that we check in Section 3 to prove our main theorem, deferring the verification of a crucial inequality to Section 4. Finally, in Section 5 we finish with some concluding remarks and open questions.
2. Approach and setup
Our main goal in this section is to derive a geometric condition for ratios of probabilities that implies uniqueness of the Gibbs measure on the $(d+1)$ -regular tree. This condition will then be verified in the following sections. Along the way we will comment on how our approach relates to the approach of Galanis, Goldberg and Yang [Reference Galanis, Goldberg and Yang11].
2.1 Ratios of probabilities and the tree recursion
Instead of working directly with the probabilities we work with ratios of probabilities just as in [Reference Galanis, Goldberg and Yang11].
Let us introduce a few concepts to facilitate the discussion. Fix $n,d\in \mathbb{N}$ and write $\mathbb{T}^n_{d}=(V,E)$ . Let $\tau\;:\;\Lambda _{n,d}\to [q]$ . This will be called a boundary condition. We denote by
the restricted partition function. For $i \in [q]$ we denote by $Z_{i,\tau }(\mathbb{T}^n_{d})$ the sum (4) restricted to those $\sigma$ that associate colour $i$ to the root vertex. We define the ratio
Note that $R_{q,\tau }(\mathbb{T}^n_{d})=1$ . We moreover remark that $R_{i,\tau }(\mathbb{T}^n_{d})$ can be interpreted as the ratio of the probabilities that the root gets colour $i$ (resp. $q$ ) given the boundary condition $\tau$ .
We define for $n \geq 0$ , $\widehat{\mathbb{T}}^n_{d}$ to be the rooted tree obtained from $\mathbb{T}^n_{d}$ by adding a new root $\hat r_d$ connecting it to the original root $r_d$ with a single edge. Note that the set of non-root leaves of $\widehat{\mathbb{T}}^n_{d}$ is just $\Lambda _{n,d}$ . For any boundary condition on $\tau\;:\;\Lambda _{n,d}\to [q]$ we define the restricted partition function, $Z_{i,\tau }(\widehat{\mathbb{T}}^n_{d})$ and ratio $R_{i,\tau }(\widehat{\mathbb{T}}^n_{d})$ analogously as for ${\mathbb{T}}^n_{d}$ .
The next lemma provides a sufficient condition for ${\mathbb{T}}_{d}$ to have a unique Gibbs measure in terms of these ratios, which we prove at the end of this section.
Lemma 2.1. Let $q,d \in \mathbb{N}$ and $w\in (0,1)$ . Suppose that for all $i \in [q-1]$ and for all $\delta \gt 0$ there exists $N \gt 0$ such that for all $n \geq N$ and for all boundary conditions $\tau\;:\;\Lambda _{n,d} \to [q]$ we have
then the tree ${\mathbb{T}}_{d}$ has a unique Gibbs measure.
An advantage of working with the ratios of probabilities is that the well known tree recursion for the Potts model takes a convenient form.
Lemma 2.2. Let $n,d\in \mathbb{N}$ and let $\tau\;:\;\Lambda _{n,d}\to [q]$ be a boundary condition. Let for $i=1,\ldots,d$ , $T_i=\widehat{\mathbb{T}}^{n-1}_{d}$ be the components of $\mathbb{T}^n_d-r_d$ where we attach a new root vertex to $r_{d-1}$ . Let $\tau _i$ be the restriction of $\tau$ to $\Lambda _{n-1,d}\to [q]$ viewed as a subset of the vertices of $T_i$ . Then we have for each $i \in [q-1],$
For completeness we provide a proof for this lemma at the end of this section.
A direct analysis of the recursion in Lemma 2.2 is not straightforward, as it does not contract uniformly on a symmetric domain. In [Reference Galanis, Goldberg and Yang11] this is remedied by looking at the two-step recursion, that is they analyze the behaviour of the ratio at depth $n$ as a function of the ratios at depth $n+2$ . They show with substantial, yet rigorous, aid of a computer algebra package that this two-step recursion does contract on a symmetric domain (when $q=3$ and $w$ and $d$ are as they should be). We however take a different, more geometric approach and work instead with the one-step recursion, as described in the next subsection.
2.2 A geometric condition for uniqueness
To state a geometric condition, we first introduce some functions that allow us to treat the tree recursion from Lemma 2.2 more concisely. Let $q \in \mathbb{Z}_{\geq 2},d \in \mathbb{Z}_{\geq 1}$ and $w \in [0,1)$ . For $i \in [q]$ let $\mu _i$ be the map from $\mathbb{R}_{\gt 0}^{q}$ to $\mathbb{R}_{\gt 0}$ given by
Furthermore, we define
and
Both $\tilde{F}$ and $\tilde{G}$ are homogeneous maps from $\mathbb{R}^{q}_{\gt 0}$ to itself. For $x,y \in \mathbb{R}_{\gt 0}^{q}$ we define an equivalence relation $x\sim y$ if and only if $x = \lambda y$ for some $\lambda \gt 0$ . We define $\mathbb{P}^{q-1}_{\gt 0}= \mathbb{R}_{\gt 0}^{q}/ \sim$ and denote elements of $\mathbb{P}^{q-1}_{\gt 0}$ as $[x_1\;:\;\ldots\;:\;x_q]$ . We note that since $\tilde{F}$ and $\tilde{G}$ are homogeneous they are also well defined as maps from $\mathbb{P}^{q-1}_{\gt 0}$ to itself and from now on we consider them as such.
Let $\pi\;:\;\mathbb{P}_{\gt 0}^{q-1} \to \mathbb{R}_{\gt 0}^{q-1}$ be the projection map defined by $\pi ([x_1\;:\ldots:\,\,x_q]) = (x_1/x_{q}, \ldots, x_{q-1}/x_{q})$ with inverse $\iota\;:\;\mathbb{R}_{\gt 0}^{q-1} \to \mathbb{P}_{\gt 0}^{q-1}$ defined by $\iota (x_1, \ldots, x_{q-1})=[x_1\;:\;\ldots\;:\;x_{q-1}\;:\;1]$ . Note that $\pi$ and $\iota$ are continuous. We define the maps $G,F$ from $\mathbb{R}_{\gt 0}^{q-1}$ to itself by $\pi \circ \tilde{G} \circ \iota$ and $\pi \circ \tilde{F} \circ \iota$ . Explicitly we have
for $q=3$ . For $q=4$ we have
and $F(x_1,x_2,x_3) = G(x_1^d,x_2^d,x_3^d)$ .
With this definition the recursion from Lemma 2.2 can now be stated as follows. Following the notation of the lemma, denote by $x_l$ the following log convex combination of the ratios $R_{l,\tau _s}(\widehat{\mathbb{T}}^{n-1}_{d})$ ,
Then
We say that a subset $\mathcal{T} \subseteq \mathbb{R}_{\gt 0}^n$ is log convex if $\log \!\left (\mathcal{T}\right )$ is a convex subset of $\mathbb{R}^n$ , where $\log \!\left (\mathcal{T}\right )$ denotes the set consisting of elements of $\mathcal{T}$ with the logarithm applied to their individual entries. The next lemma gives sufficient conditions for uniqueness on the infinite regular tree.
Lemma 2.3. Suppose that $q\geq 2$ , $d\geq 2$ and $w\gt 0$ are such that there exists a sequence $\{\mathcal{T}_n\}_{n\geq 0}$ of log convex subsets of $\mathbb{R}^{q-1}_{\gt 0}$ with the following properties.
-
(1) Both the vector with every entry equal to $1/w$ and the vectors obtained from the all-ones vector with a single entry changed to $w$ are elements of $\mathcal{T}_0$ .
-
(2) For every $m$ we have $F(\mathcal{T}_m) \subseteq \mathcal{T}_{m+1}$ .
-
(3) For every $\epsilon \gt 0$ there is an $M$ such that for all $m\geq M$ every element of $\mathcal{T}_m$ has at most distance $\epsilon$ to the all-ones vector.
Then the anti-ferromagnetic Potts model with parameter $w$ has has a unique Gibbs measure on $\mathbb{T}_{d}$ .
Proof. By Lemma 2.1, it suffices to show that regardless of the boundary condition $\tau$ on $\Lambda _{n,d}$ , $R_{i,\tau }(\widehat{\mathbb{T}}_d^n)\to 1$ as $n\to \infty$ .
First of all we claim that for all $n \geq 0$ and all boundary conditions $\tau\;:\;\Lambda _{n,d} \to [q]$ we have $(R_{1,\tau }(\widehat{\mathbb{T}}^{n}_{d}),\ldots,R_{q-1,\tau }(\widehat{\mathbb{T}}^{n}_{d})) \in \mathcal{T}_n$ . We prove this by induction on $n$ . For the base case, $n=0$ , we note that $\widehat{\mathbb{T}}^0_{d}$ consists of one free root $\hat r_{d}$ , connected to a coloured vertex $v$ . If $v$ is coloured $i \in [q-1]$ , then $(R_{1,\tau }(\widehat{\mathbb{T}}^{0}_{d}),\ldots,R_{q-1,\tau }(\widehat{\mathbb{T}}^{0}_{d}))$ consists of a $w$ on position $i$ and ones everywhere else. If $v$ is coloured $q$ , then $(R_{1,\tau }(\widehat{\mathbb{T}}^{0}_{d}),\ldots,R_{q-1,\tau }(\widehat{\mathbb{T}}^{0}_{d})) = (1/w, \ldots, 1/w)$ . So the base case follows from item (1).
Suppose next that for some $n \geq 0$ the claim holds. Let $\tau\;:\;\Lambda _{d,n+1} \to [q]$ be any boundary condition. It then follows from (7), (8) and the assumptions that $\mathcal{T}_n$ is log convex and $F(\mathcal{T}_n)\subseteq (\mathcal{T}_{n+1})$ that
completing the induction.
From the claim we just proved and item (3) it then follows that given $\varepsilon \gt 0$ there exists $N\gt 0$ such that for all $n\geq N$ , any boundary condition $\tau$ and color $i$ , $|R_{i,\tau }(\widehat{\mathbb{T}}_d^n)- 1|\lt \varepsilon$ . This concludes the proof.
In the next section we will construct a sequence of regions $\{\mathcal{T}_n\}_{n \geq 0}$ satisfying the conditions of the lemma. In Subsection 3.1 we describe a certain symmetry that the map $F$ exhibits, corresponding to the symmetry of the colours in the Potts model. When a region $\mathcal{T}$ has a corresponding symmetry it is easier to understand the image $F(\mathcal{T})$ . This is explained in Lemma 3.2. In Subsection 3.2 we define a two parameter family of sets $\mathcal{T}_{a,b}$ that display the required symmetry. In Lemma 3.4 we prove that if simple analytic conditions in $a$ and $b$ are satisfied the sets $\mathcal{T}_{a,b}$ are log-convex. In Lemma 3.6 we give inner and outer approximations of the sets $\mathcal{T}_{a,b}$ with simple polytopes. This is convenient since the map $G$ is a fractional linear transformation and therefore preserves convex sets. These are used in Lemma 3.7 in Subsection 3.3 where we show that if more involved analytical conditions are satisfied $\mathcal{T}_{a,b}$ gets mapped strictly inside itself by $F$ . We then combine all ingredients to prove the Main Theorem as Theorem 3.8. In the proof of Theorem 3.8 we show that we can construct a sequence $\mathcal{T}_n = \mathcal{T}_{a_n,b_n}$ that satisfies the conditions of Lemma 2.3 using the fact that we can keep satisfying the analytic conditions on $a_n$ and $b_n$ . This uses a number of technical inequalities whose verification we have moved to Section 4 to preserve the flow of the text.
We finish this section be providing proofs of Lemmas 2.1 and 2.2.
2.3 Proofs of Lemmas 2.1 and 2.2
Proof of Lemma 2.1 . The ratios for $\widehat{\mathbb{T}}^n_{d}$ and $\mathbb{T}^n_{d}$ can easily be expressed in terms of each other. Fix any $\tau\;:\;\Lambda _{n,d}\to [q]$ . Then for any $i=1,\ldots,q-1$ ,
We may thus assume that for each $\delta \gt 0$ there exist $N'$ such that for all $n\geq N'$ , $|R_{i,\tau }(\mathbb{T}_d^n)-1|\lt \delta$ for all $\tau\;:\;\Lambda _{n,d}\to [q].$
For readability, we omit the reference to the subscript $d$ in what follows. For any $i=1,\ldots,q$ we have
Hence for any $i \in [q]$ , upon dividing both the numerator and denominator by $Z_{q,\tau }({\mathbb{T}}^{n})$ , we obtain
Now since the map
is continuous and maps $(1, \ldots, 1)$ to $0$ , it follows that for every $\epsilon \gt 0$ there is a $\delta \gt 0$ such that $|R_{i,\tau }(\mathbb{T}_{n})-1| \lt \delta$ for all boundary conditions $\tau$ and $i=1.\ldots,q-1$ , implies that
We conclude that the conditions of Lemma 1.3 are satisfied and hence $\mathbb{T}_d$ has a unique Gibbs measure.
We next provide a proof for the tree recursion.
Proof of Lemma 2.2 . For readability we omit $d$ from the notation. We have
as a factor $w$ is picked up when the unique neighbour of the root vertex is assigned the same colour as the root vertex, $\hat{r}_d$ , of $\widehat{\mathbb{T}}^{n}$ . Note that for any color $c \in [q]$ we have $Z_{c,\tau }({\mathbb{T}}^{n}) = \prod _{s=1}^{d} Z_{c,\tau _s}(\widehat{\mathbb{T}}^{n-1})$ . Plugging this in into (10) and dividing the numerator and denominator by $\prod _{s=1}^{d} Z_{q,\tau _s}(\widehat{\mathbb{T}}^{n-1})$ , we arrive at the desired expression.
3. Proof of the main theorem
In this section we use Lemma 2.3 to prove the Main Theorem.
3.1 Symmetry of the map $\textbf{F}$
In order to find suitable sets $\mathcal{T}$ such that $F(\mathcal{T}) \subseteq \mathcal{T}$ we will exploit a symmetry that the map $F$ exhibits, due to the inherent symmetry of permuting the colours $[q]$ in the Potts model. To make this formal we will define a few self-maps on, and regions of, the spaces $\mathbb{P}^{q-1}_{\gt 0}, \mathbb{R}^{q-1}_{\gt 0}$ and $\mathbb{R}^{q-1}$ . To avoid confusion, we will denote self-maps on and subsets of $\mathbb{P}^{q-1}_{\gt 0}$ with a tilde, self-maps on and subsets of $\mathbb{R}^{q-1}_{\gt 0}$ without additional notation and self-maps on and subsets of $\mathbb{R}^{q-1}$ with a hat. When a self-map or subset is used as an index, we will drop the hat or tilde in the index.
The three spaces $\mathbb{P}^{q-1}_{\gt 0}, \mathbb{R}^{q-1}_{\gt 0}$ and $\mathbb{R}^{q-1}$ are homeomorphic, with homeomorphisms $\pi\;:\;\mathbb{P}_{\gt 0}^{q-1} \to \mathbb{R}_{\gt 0}^{q-1}$ with inverse $\iota$ and $\log\;:\;\mathbb{R}_{\gt 0}^{q-1} \to \mathbb{R}^{q-1}$ with inverse $\exp$ . We define the self-maps $\hat{G},\hat{F}$ on $\mathbb{R}^{q-1}$ by $\hat{G} = \log \circ{G} \circ \exp$ and $\hat{F} = \log \circ{F} \circ \exp$ . To summarise, we have the following diagram of continuous maps
Let $S_q$ denote the symmetric group on $q$ elements. This group acts on $\mathbb{P}^{q-1}_{\gt 0}$ be permuting the entries, which corresponds to permuting the colours in the Potts model. For $\sigma \in S_q$ we denote the map from $\mathbb{P}^{q-1}_{\gt 0}$ to itself corresponding to this action by $\tilde{M}_\sigma$ . We use this action to also define an action on $\mathbb{R}_{\gt 0}^{q-1}$ by letting $M_\sigma (x) = (\pi \circ \tilde{M}_\sigma \circ \iota )(x)$ . It is easy to see that the action of $S_q$ on $\mathbb{P}_{\gt 0}^{q-1}$ commutes with $\tilde{G}$ and $\tilde{F}$ . It follows that the action of $S_q$ on $\mathbb{R}_{\gt 0}^{q-1}$ also commutes with $F$ and $G$ . Similarly, we define the map $\hat{M}_\sigma$ on $x \in \mathbb{R}^{q-1}$ by $\hat{M}_\sigma (x) = \left (\log \circ M_\sigma \circ \exp \right )\!(x)$ and we note that this action commutes with $\hat{G}$ and $\hat{F}$ .
Example 3.1. As an example we present the table of the action of $S_q$ for $q=3$ on a point in all the three coordinates.
The three spaces $\mathbb{P}^{q-1}_{\gt 0}, \mathbb{R}^{q-1}_{\gt 0}$ and $\mathbb{R}^{q-1}$ are homeomorphic, with homeomorphisms $\pi\;:\;\mathbb{P}_{\gt 0}^{q-1} \to \mathbb{R}_{\gt 0}^{q-1}$ with inverse $\iota$ and $\log\;:\;\mathbb{R}_{\gt 0}^{q-1} \to \mathbb{R}^{q-1}$ with inverse $\exp$ . We define the self-maps $\hat{G},\hat{F}$ on $\mathbb{R}^{q-1}$ by $\hat{G} = \log \circ{G} \circ \exp$ and $\hat{F} = \log \circ{F} \circ \exp$ . To summarise, we have the following diagram of continuous maps
Note that in general $\hat{M}_\sigma$ is a linear map for all $\sigma \in S_q$ . In fact, the map $\sigma \mapsto \hat{M}_\sigma$ is an irreducible representation of $S_q$ called the standard representation, but we will not use this.
For any permutation $\tau \in S_q$ we define the following subset of $\mathbb{P}_{\gt 0}^{q-1}$
Furthermore, we let $\mathcal{R}_\tau = \pi (\tilde{\mathcal{R}}_{\tau })$ and $\hat{\mathcal{R}}_\tau = \log\!(\mathcal{R}_\tau )$ . Note that if $x\in \mathbb{P}_{\gt 0}^{q-1}$ has the property that $x_i \geq x_j$ then $\mu _{i}(x) \leq \mu _{j}(x)$ , recalling that
It follows that the map $\tilde{G}$ maps $\tilde{\mathcal{R}}_\tau$ into $\tilde{\mathcal{R}}_{\tau \circ m}$ , where $m \in S_q$ denotes the permutation with $m(l) = k+1-l$ for $l \in [q]$ . The same is true for $\tilde{F}$ because $x \mapsto x^d$ maps any $\tilde{\mathcal{R}}_\tau$ to itself. It follows that $G$ and $F$ map $\mathcal{R}_\tau$ into $\mathcal{R}_{\tau \circ m}$ and that $\hat{G}$ and $\hat{F}$ map $\hat{\mathcal{R}}_\tau$ into $\hat{\mathcal{R}}_{\tau \circ m}$ . In Figure 1 the regions $\mathcal{R}_\tau$ and $\hat{\mathcal{R}}_\tau$ are depicted when $q =3$ .
The main purpose of the considerations of this section up until this point is to state and prove the following simple lemma.
Lemma 3.2. Suppose $\mathcal{T} \subseteq \mathbb{R}_{\gt 0}^{q-1}$ is a set such that $M_\sigma (\mathcal{T}) = \mathcal{T}$ for all $\sigma \in S_q$ . Suppose also that there is a permutation $\tau \in S_q$ such that
Then $F(\mathcal{T}) \subseteq \text{int}(\mathcal{T})$ .
Proof. Let $x \in \mathcal{T}$ . There is a $\sigma \in S_q$ such that $M_\sigma (x) \in \mathcal{R}_{\tau }$ and thus $M_\sigma (x) \in \mathcal{T} \cap \mathcal{R}_{\tau }$ . It follows from the assumption that $(F\circ M_\sigma )(x) \in \text{int}(\mathcal{T})$ . Because $M_\sigma$ commutes with $F$ we find that $(M_\sigma \circ F)(x) \in \text{int}(\mathcal{T})$ . We conclude that $F(x) \in M_\sigma ^{-1}(\text{int}(\mathcal{T}))=M_{\sigma ^{-1}}(\text{int}(\mathcal{T}))\subseteq \mathcal{T}$ . Because $M_\sigma$ is continuous it follows that $M_\sigma ^{-1}(\text{int}(\mathcal{T}))$ is an open subset of $\mathcal{T}$ and hence $F(x) \in \text{int}(\mathcal{T})$ .
In the next section we will define a family of regions $\mathcal{T}_{a,b}$ for $a,b\gt 1$ with the property $M_\sigma (\mathcal{T}_{a,b})=\mathcal{T}_{a,b}$ for all $\sigma \in S_q$ . Our goal will be to show that for certain choices of parameters $(a,b)$ we have $F(\mathcal{T}_{a,b}) \subseteq \text{int}(T_{a,b})$ . Because of Lemma 3.2 it will be enough to restrict ourselves to one well chosen region $\mathcal{R}_{\tau }$ .
3.2 Definition and properties of the sets $\boldsymbol{\mathcal{T}\!}_{\textbf{a,b}}$
For $q=3$ and $q=4$ we will define a family of log convex sets $\mathcal{T}_{a,b}\subseteq \mathbb{R}_{\gt 0}^{q-1}$ with the property that $M_\sigma (\mathcal{T}_{a,b}) = \mathcal{T}_{a,b}$ for all $\sigma \in S_q$ . We will do this by defining the convex sets $\hat{\mathcal{T}}_{a,b} \subseteq \mathbb{R}^{q-1}$ and then letting $\mathcal{T}_{a,b} = \exp\!(\hat{\mathcal{T}}_{a,b})$ .
Let $a,b\gt 1$ . To avoid having to write too many logarithms we let $\hat{a} = \log\!(a)$ and $\hat{b} = \log\!(b)$ . For $q=3$ we define the following half-space of $\mathbb{R}^2$
Subsequently, we define
Similarly, for $q=4$ , we define the half-space
and the region
For both $q=3$ and $q=4$ we let $\mathcal{T}_{a,b} = \exp\!(\hat{\mathcal{T}}_{a,b})$ . Figure 1 contains an image of $\hat{\mathcal{T}}_{a,b}$ and $\mathcal{T}_{a,b}$ for $q=3$ . Figure 2 contains an image of $\hat{\mathcal{T}}_{a,b}$ for $q=4$ ; we highlighted the region $\hat{\mathcal{R}}_{(243)} \cap \hat{H}_{a,b}$ in orange. We have chosen to give the sets $\hat{\mathcal{T}}_{a,b}$ the same name for $q=3$ and $q=4$ . This is because many of the properties of $\hat{\mathcal{T}}_{a,b}$ that we will prove hold for both cases and are proved in a similar way. Unless otherwise stated one should assume that any statement involving $\hat{\mathcal{T}}_{a,b}$ refers to the corresponding statement for both $q=3$ and $q=4$ .
We first state a basic lemma relating the half-space representation and the vertex representation of a polytope. This lemma will be used a number of times in the remainder of the section to derive useful properties of the sets $\mathcal{T}_{a,b}$ .
Lemma 3.3. Let $H_1, \ldots, H_n$ be closed half-spaces in $\mathbb{R}^{n-1}$ . Furthermore, let $p_1, \ldots, p_n \in \mathbb{R}^{n-1}$ with the property that for all $i \in [n]$ we have $p_i \in \text{int}(H_i)$ and $p_i \in \partial H_j$ for $j \neq i$ . Then
where $\text{Conv}(S)$ denotes the convex hull of the set $S$ .
Proof. We give a sketch of the proof. The conditions on the $p_i$ imply that the set $\{p_1, \ldots, p_n\}$ is affinely independent, i.e. the set $\{p_1-p_n, \ldots, p_{n-1}-p_n\}$ is linearly independent. Therefore there exists an invertible affine transformation $T$ with $T(v) = M(v-p_n)$ for some invertible linear transformation $M$ , such that $T(p_i) = e_i$ for $i \in [n-1]$ , where $e_i$ denotes a standard basis vector. From the conditions on the $p_i$ it follows that $T(H_i) = \{x \in \mathbb{R}^{n-1}\;:\;x_i \geq 0 \}$ for $i \in [n-1]$ and $T(H_n) = \{x \in \mathbb{R}^{n-1}\;:\;\sum _{i=1}^{n-1} x_i \leq 1\}$ . As affine transformations preserve convexity, we see
The lemma now follows from the fact that $T$ is invertible.
Lemma 3.4. For $a,b \in \mathbb{R}_{\gt 1}$ with $b\leq a \leq b^2$ we have that $\hat{\mathcal{T}}_{a,b}$ is convex, or equivalently, that $\mathcal{T}_{a,b}$ is log convex.
Proof. Recall that we let $\hat{a} = \log\!(a)$ and $\hat{b} = \log\!(b)$ and observe that these are two positive real numbers. Also recall that the action of $S_q$ on $\mathbb{R}^{q-1}$ is given by linear maps. It follows that the half-space $\hat{H}_{a,b}$ gets mapped to a half-space by $\hat{M}_\sigma$ for any $\sigma \in S_q$ . We will show that for the choices of parameters stated in the lemma we have
for both $q=3$ and $q=4$ . This equality implies that $\hat{\mathcal{T}}_{a,b}$ is convex because an intersection of half-spaces is convex. In fact, it implies that $\hat{\mathcal{T}}_{a,b}$ is a convex polytope.
We will first prove that the right-hand side of (10) is contained in the left-hand side. To that effect take an element $x \in \bigcap _{\sigma \in S_q}\hat{M}_\sigma \!\left (\hat{H}_{a,b}\right )$ . Because the collection $\{\hat{\mathcal{R}}_{\sigma }\}_{\sigma \in S_q}$ covers $\mathbb{R}^{q-1}$ , there is a $\tau \in S_q$ such that $x \in \hat{\mathcal{R}}_{\tau }$ . For $q = 3$ let $\sigma \in S_3$ such that $\sigma \cdot (23) = \tau$ . We see that $x \in \hat{\mathcal{R}}_{\tau }\cap \hat{M}_\sigma (\hat{H}_{a,b})$ and thus $x \in \hat{M}_\sigma \left (\hat{\mathcal{R}}_{(23)}\cap \hat{H}_{a,b}\right )$ , from which it follows $x \in \hat{\mathcal{T}}_{a,b}$ . Similarly, for $q=4$ we let $\sigma \in S_4$ such that $\sigma \cdot (243) = \tau$ . It follows in exactly the same way that $x \in \hat{\mathcal{T}}_{a,b}$ .
The proof that the left-hand side of (10) is contained in the right-hand side is slightly more involved. Assume that $q=3$ . We first show that
While this is easily seen to be true from Figure 1, we provide a formal proof. Note that $\hat{\mathcal{R}}_{(23)}$ is the intersection of $\hat{H}_{x\leq 0} = \{(x,y) \in \mathbb{R}^2\;:\;x \leq 0\}$ and $\hat{H}_{y \geq 0} = \{(x,y) \in \mathbb{R}^2\;:\;y \geq 0\}$ . One can check that $(0,0) \in \partial \hat{H}_{x\leq 0} \cap \partial \hat{H}_{y\geq 0}\cap \text{int}(\hat{H}_{a,b})$ , $(\!-\hat{a},0) \in \partial \hat{H}_{a,b} \cap \partial \hat{H}_{y\geq 0}\cap \text{int}(\hat{H}_{x\leq 0} )$ and $(0,\hat{b}) \in \partial \hat{H}_{a,b} \cap \partial \hat{H}_{x\leq 0}\cap \text{int}(\hat{H}_{y\geq 0} )$ . Equation (12) then follows from Lemma 3.3.
We obtain
We want to show that this is a subset of $\bigcap _{\sigma \in S_3}\hat{M}_\sigma \left (\hat{H}_{a,b}\right )$ . Because all these half-spaces are convex, it is enough to show that the set $P = \{(0,0)\} \cup \bigcup _{\sigma \in S_3} \{\hat{M}_\sigma (\!-\hat{a},0),\hat{M}_\sigma (0,\hat{b})\}$ is a subset of $\hat{M}_\tau (H_{a,b})$ for all $\tau \in S_3$ . Because the set $P$ is invariant under the action of $S_3$ it is sufficient to show that $P \subseteq \hat{H}_{a,b}$ . We can calculate $P$ explicitly to obtain
To check that these points lie in $\hat{H}_{a,b}$ we have to check that for each $(x,y) \in P$ we have $-\hat{b}\cdot x + \hat{a}\cdot y \leq \hat{a}\hat{b}$ . The inequality is trivially true for all but the points $(\hat{a}, \hat{a})$ and $(\!-\hat{b}, -\hat{b})$ . One can confirm that the inequalities obtained by filling in these two points are simultaneously satisfied if and only if $\hat{b}/2 \leq \hat{a} \leq 2\hat{b}$ . Because $\hat{a} = \log\!(a)$ and $\hat{b}= \log\!(b)$ this is equivalent to $\sqrt{b} \leq a \leq b^2$ . This shows that for these choices of $a$ and $b$ the left-hand side of (11) is contained in the right-hand side, which concludes the proof for $q=3$ .
The proof for $q=4$ follows the same path. One can show in very similar way to the $q=3$ case that
and thus that
It is again sufficient to show that $P = \{(0,0,0)\} \cup \bigcup _{\sigma \in S_4} \{\hat{M}_\sigma (\!-\hat{a},0,0),\hat{M}_\sigma (0,0,\hat{b}),\hat{M}_\sigma (0,\hat{b},\hat{b})\}$ is a subset of $\hat{H}_{a,b}$ . Explicitly we have
We need to check that for $(x,y,z) \in P$ we have $(x,y,z) \in \hat{H}_{a,b}$ , that is, we have $-\hat{b}\cdot x + \hat{a}\cdot z \leq \hat{a}\hat{b}$ . This inequality holds simultaneously for the points $(\!-\hat{b},-\hat{b},0)$ and $(\hat{a},\hat{a},\hat{a})$ if and only $\hat{b} \leq \hat{a} \leq 2\hat{b}$ . It can be seen that the inequalities obtained from the other points also hold for this regime of parameters. This concludes the proof that the left-hand side of (11) is contained in the right-hand side for $q=4$ , which is the final thing that we needed to show to prove the lemma.
Lemma 3.2 states that it is enough to understand $F(\mathcal{T}_{a,b} \cap \mathcal{R}_\sigma )$ for a specific $\sigma$ to understand the whole image $F(\mathcal{T}_{a,b})$ . In the following two lemmas we calculate $\mathcal{T}_{a,b} \cap \mathcal{R}_\sigma$ more explicitly for $\sigma = (123)$ and $\sigma =(134)$ for $q=3$ and $q=4$ respectively. Because $F$ maps $\mathcal{R}_{(123)}$ into $\mathcal{R}_{(23)}$ for $q=3$ and $\mathcal{R}_{(134)}$ into $\mathcal{R}_{(243)}$ for $q=4$ , we describe $\mathcal{T}_{a,b} \cap \mathcal{R}_\sigma$ for these instances of $\sigma$ too. The choice for these specific permutations $\sigma$ is arbitrary, but does seem to make the upcoming analysis more pleasant than for some other choices.
Lemma 3.5. Let $a,b \in \mathbb{R}_{\gt 1}$ and define
For $q = 3$ we have
and
For $q=4$ we have
and
Proof. We will first prove the statement for $q=3$ . Recall that $\hat{\mathcal{T}}_{a,b} \cap \hat{\mathcal{R}}_{(23)} = \hat{\mathcal{H}}_{a,b}\cap \hat{\mathcal{R}}_{(23)}$ , where $\hat{\mathcal{H}}_{a,b} = \{(x,y) \in \mathbb{R}^2\;:-\hat{b}\cdot x + \hat{a}\cdot y \leq \hat{a}\hat{b}\}$ and $\hat{\mathcal{R}}_{(23)} = \{(x,y) \in \mathbb{R}^2\;:\;x \leq 0 \leq y\}$ . Therefore, we can write
If we replace $\hat{a}$ by $\log\!(a)$ and $\hat{b}$ by $\log\!(b)$ we find that
By applying $\exp$ to the individual components of the inequalities and replacing $x = e^{\hat{x}}$ and $y = e^{\hat{y}}$ , we obtain the equality stated in the lemma. To prove the other equality for $q=3$ we note that for $\sigma = (12)$ we have $\sigma (23) = (123)$ and thus $M_\sigma (\mathcal{T}_{a,b} \cap \mathcal{R}_{(23)}) = \mathcal{T}_{a,b} \cap \mathcal{R}_{(123)}$ . For $(x,y) \in \mathbb{R}_{\gt 0}^2$ we have $M_{(12)}(x,y) = (y,x)$ and thus
To prove the statements given for $q=4$ we recall that in that case $ \hat{H}_{a,b} = \{(x,y,z) \in \mathbb{R}^3\;:-\;\hat{b}\cdot x + \hat{a}\cdot z \leq \hat{a}\hat{b}\}$ and $\hat{\mathcal{R}}_{(243)} = \{(x,y,z) \in \mathcal{R}^3\;:\;x \leq 0\leq y \leq z\}$ . Therefore,
Similarly, as in the $q=3$ case, it follows that
If we let $\sigma = (13)(24)$ we have $\sigma \cdot (243) = (134)$ . For this $\sigma$ and $(x,y,z) \in \mathbb{R}_{\gt 0}^{3}$ we have $M_\sigma (x,y,z) = (z/y,1/y,x/y)$ . We find that
The next lemma provides inner and outer approximations of the sets $\mathcal{T}_{a,b}$ with simple polytopes.
Lemma 3.6. Let $a,b \in \mathbb{R}_{\gt 1}$ with $a \geq b$ . Then for q = 3 we have
and
For q = 4 we have
and
Proof. Let $l_{a,b}$ be as in Lemma 3.5. We define $c = \log\!(b)/\log\!(a)$ so that we can write $l_{a,b}(x) = b \cdot x^c$ . By assumption $c\leq 1$ , therefore the function $l_{a,b}$ is concave and thus the sets
are convex. It follows now from Lemma 3.5 that the sets $\mathcal{T}_{a,b} \cap \mathcal{R}_{(23)}$ for $q=3$ and $\mathcal{T}_{a,b} \cap \mathcal{R}_{(243)}$ for $q=4$ are convex. It is easy to see that the former set contains the points $(1,1),(1/a,1)$ and $(1,b)$ and that the latter set contains the points $(1,1,1),(1/a,1,1),(1,b,b)$ and $(1,1,b)$ . This is enough to conclude that the first stated inclusions for $q=3$ and $q=4$ hold.
Because $l_{a,b}$ is concave we find that for all $x\gt 0$
Therefore, using Lemma 3.5, we have the following inclusion for $q=3$
Note that in the latter set we do not require $x$ and $y$ to be positive. This set can also be written as the intersection of the following three half-spaces
Note that $(1,1) \in \partial H_1 \cap \partial H_2 \cap \text{int}(H_3)$ , $(b,1) \in \partial H_1 \cap \text{int}(H_2) \cap \partial H_3$ and $(1,1-\frac{b-1}{bc}) \in \text{int}(H_1) \cap \partial H_2 \cap \partial H_3$ . The second inclusion for $q=3$ stated in the lemma follows from Lemma 3.3.
From Lemma 3.5 and equation (13) we deduce that for q = 4
So $\mathcal{T}_{a,b} \cap \mathcal{R}_{(134)}$ is contained in the intersection of the following half-spaces
We see that
Using Lemma 3.3 we can conclude that the last stated inclusion in the lemma indeed holds.
3.3 Proof of the Main Theorem
In this section we prove the Main Theorem. We utilize a number of inequalities for which the proofs can be found in the next section.
Lemma 3.7. Let $q \in \{3,4\}$ , $d \in \mathbb{Z}_{\geq 2}$ for $q=3$ and $d \in \mathbb{Z}_{\geq 4}$ for $q =4$ and let $1- q/(d+1) \leq w \lt 1$ . Let $a,b \in \mathbb{R}_{\gt 1}$ such that
Then $F(\mathcal{T}_{a,b}) \subseteq \text{int}(\mathcal{T}_{a,b})$ .
Proof. Recall from Section 2.2 that we can write $F$ as the composition $G \circ P$ , where $P(x_1,\ldots, x_{q-1}) = (x_1^d, \ldots, x_{q-1}^d)$ . In logarithmic coordinates the map $\hat{P} = \log \circ P\circ \exp$ acts as multiplication by $d$ . In the proof of Lemma 3.4 we showed that $\hat{\mathcal{T}}_{a,b}$ is a polytope whose vertices have entries $0$ , $\pm \hat{a}$ or $\pm \hat{b}$ . It follows that $\hat{P}(\hat{\mathcal{T}}_{a,b})$ is the same polytope where $\hat{a}$ and $\hat{b}$ are replaced by $d\cdot \hat{a}$ and $d\cdot \hat{b}$ respectively. Because $\hat{a} = \log\!(a)$ and $\hat{b} = \log\!(b)$ , we can conclude that $P(\mathcal{T}_{a,b}) = \mathcal{T}_{a^d,b^d}$ . It follows from Lemma 3.2 that it is enough to show that $G (\mathcal{T}_{a^d,b^d} \cap \mathcal{R}_{(123)}) = F(\mathcal{T}_{a,b} \cap \mathcal{R}_{(123)}) \subseteq \text{int}(\mathcal{T}_{a,b})$ for $q=3$ and $G (\mathcal{T}_{a^d,b^d} \cap \mathcal{R}_{(134)}) = F(\mathcal{T}_{a,b} \cap \mathcal{R}_{(134)}) \subseteq \text{int}(\mathcal{T}_{a,b})$ for $q=4$ .
We use Lemma 3.6 to conclude that it is enough to show that
for $q=3$ and
for $q=4$ . We have to be careful here because initially we defined $G$ as a map on $\mathbb{R}_{\gt 0}^{q-1}$ . We can extend $G$ to the half-space $H = \{(x_1, \ldots, x_{q-1})\;:\;x_1 + \cdots + x_{q-1} + w \gt 0\}$ . To show that the sets in equations (3.5) and (3.6) are contained in $H$ it is enough to show that the vertices of these convex hulls are contained in $H$ . This is clear for all but the last written vertex in either case. We will show that the equation $x_1 + \cdots + x_{q-1} + w \gt 0$ does indeed hold for these two points. Namely, by (3.4) we have
as desired.
The map $G$ is a linear-fractional function, which means that $G$ sends line segments to line segments (see e.g. Section 2.3.3 of [Reference Boyd and Vandenberghe6]). Thus, for any set of points $p_1, \ldots, p_n$ we have $G(\text{Conv}(\{p_1,\ldots, p_n\})) = \text{Conv}{(\{G(p_1),\ldots,G(p_n)\})}$ . Let
The left-hand side of (15) is equal to
and the left-hand side of (16) is equal to
We can use Lemma 3.6 to see that it is enough to show that
to conclude that these sets are contained in $\mathcal{T}_{a,b} \cap \mathcal{R}_{(23)}$ and $\mathcal{T}_{a,b} \cap \mathcal{R}_{(243)}$ respectively. The first inequality follows directly from the assumptions. The second inequality follows from item (4) of Theorem 4.1 below. For the last inequality we note that $f_q$ is strictly decreasing and $1-\frac{(b^d-1)\log\!(a)}{b^d\log\!(b)}$ is also strictly decreasing in $a$ . Therefore, it is enough to show the inequality for $a= b^{\frac{b^d(q-1+w)(b-1)}{(b^d-1)(b-w)}}$ . We obtain
Because the inequalities in (17) are strict we can even conclude that $\mathcal{T}_{a,b}$ gets mapped strictly inside itself by $F$ , i.e. $F(\mathcal{T}_{a,b}) \subseteq \text{int}(\mathcal{T}_{a,b})$ .
Theorem 3.8. Let $q \in \{3,4\}$ , $d \in \mathbb{Z}_{\geq 2}$ for $q=3$ and $d \in \mathbb{Z}_{\geq 4}$ for $q =4$ and let $1- q/(d+1) \leq w \lt 1$ with $w\gt 0$ . Then the $q$ -state Potts model with weight $w$ on the infinite $(d+1)$ -regular tree, $\mathbb{T}_d$ , has a unique Gibbs measure.
Proof. We will construct a sequence of subsets $\{\mathcal{T}_n\}_{n\geq 0}$ as is described in Lemma 2.3. Define the functions
It follows from items (1), (2) and (3) of Theorem 4.1 that $L(b) \lt U(b)$ for $b \gt 1$ . We define $M(b) = (L(b)+U(b))/2$ and note that $L(b) \lt M(b) \lt U(b)$ for $b\gt 1$ . Any element of $\mathbb{R}_{\gt 0}^{q-1}$ is contained in $\mathcal{T}_{b,b}$ for a large enough value of $b$ . It follows that we can choose $b_0\gt 1$ such that $\mathcal{T}_{b_0,b_0}$ contains both the vector with every entry equal to $1/w$ and the vectors obtained from the all-ones vector with a single entry changed to $w$ . Because $M(b_0)\gt b_0$ we have $\mathcal{T}_{b_0,b_0} \subset \mathcal{T}_{M(b_0),b_0}$ and thus $\mathcal{T}_{M(b_0),b_0}$ contains these vectors too. Inductively we now define $b_n$ for $n\geq 1$ by
Because $\mathcal{T}_{M(b),b}$ moves continuously with $b$ it follows from Lemma 3.7 that $\{b_n\}_{n \geq 1}$ is a strictly decreasing sequence. The sequence is clearly bounded below by $1$ and thus it must have a limit. We claim that this limit is $1$ . For the sake of contradiction assume that it has a limit $b_\infty \gt 1$ . The set $\mathcal{T}_{M(b_\infty ),b_\infty }$ gets mapped strictly inside itself by $F$ and thus there is a $b'\lt b_\infty$ such that $\mathcal{T}_{M(b_\infty ),b_\infty }$ also gets mapped strictly inside $\mathcal{T}_{M(b'),b'}$ . This is an open condition, so there is an $\epsilon \gt 0$ such that $\mathcal{T}_{M(b),b}$ gets mapped strictly inside $\mathcal{T}_{M(b'),b'}$ for all $b \in [b_\infty,b_\infty + \epsilon )$ . There must be an integer $N$ such that $b_N \in [b_\infty,b_\infty + \epsilon )$ , but then $b_{N+1} \lt b' \lt b_\infty$ , so $b_\infty$ cannot be the limit of the decreasing sequence $\{b_n\}_{n \geq 0}$ .
We define $\mathcal{T}_n = \mathcal{T}_{M(b_n),b_n}$ . We have $b \lt M(b) \lt b^2$ , so it follows from Lemma 3.4 that every $\mathcal{T}_n$ is log-convex. We have chosen $\mathcal{T}_0$ such that condition (1) of Lemma 2.3 is satisfied. By construction $F(\mathcal{T}_{m}) \subseteq \mathcal{T}_{m+1}$ for all $m$ and thus condition (2) of Lemma 2.3 is satisfied. Finally, because both $b_n$ and $M(b_n)$ converge to $1$ , it follows that the sequence of sets $\mathcal{T}_n$ converges to the set consisting of just the all-ones vector. This means that condition (3) of Lemma 2.3 is satisfied. We can conclude that $\mathbb{T}_d$ has a unique Gibbs measure.
Remark 3.9. The assumption $w\gt 0$ is critical in the case $q=3$ and $d=2$ , as it is well known there are multiple Gibbs measures at $w_c=0$ when $q=d+1$ . One sees this in our argument as well. For the base case of the induction, condition (1) of Lemma 2.3, we need $\mathcal{T}_0$ to contain the vectors $(1,w) =(1,0)$ , $(w,1)=(0,1)$ and $(1/w,1/w) = (\infty,\infty )$ . If we take the log convex hull of these vectors and apply $F$ , we obtain a region that again contains the vectors $(1,w) =(1,0)$ , $(w,1)=(0,1)$ and $(1/w,1/w) = (\infty,\infty )$ . It is thus possible to choose boundary conditions that yield unbounded ratios at an arbitrary distance from the leaves. This observation is closely related to the existence of so-called frozen colourings [Reference Brightwell and Winkler5]. These give distinct trivial Gibbs measures, each supported on a single colouring of $\mathbb{T}_2$ .
4. Proof of the inequalities
This section is dedicated to showing all the inequalities from the previous section are satisfied. We define the following functions
We mostly consider these as functions in $b$ and consider only $b \geq 1$ . Note that $h(q,d,w,b)$ has a removable singularity in $b=1$ with $h(q,d,w,1) = \frac{q-1+w}{d(1-w)}$ . The main theorem we prove in this section is the following.
Theorem 4.1. For $q=3, d \geq 2$ and $w \in [1 - \frac{3}{d+1}, 1)$ or for $q=4, d \geq 4$ and $w \in [1 - \frac{4}{d+1}, 1)$ we have for each $b \gt 1$
-
(1) $u(q,d,w,b) \gt l(q,d,w,b)$ ,
-
(2) $u(q,d,w,b) \gt b$ ,
-
(3) $b^2\gt l(q,d,w,b)$ .
And for all $b \gt 1$ and $d \geq 3$ and $w \in [1 - \frac{4}{d+1}, 1)$ we have
-
(4) $g(d,w,b) \lt b$ .
In the next section we show it is enough to prove Theorem 4.1 holds for $w = w_c = 1 - \frac{q}{d+1}$ where we take $q=4$ in inequality (4). Subsequently, inequality (2) is proved in Corollary 4.5, inequality (3) is proved in Lemma 4.6 and inequality (4) is proved in Lemma 4.3. The proof of inequality (1) is the most involved and is the result of Lemmas 4.7 and 4.8.
4.1 Reduction to $w = w_c$
Lemma 4.2. Let $q \geq 2, d \geq 1$ and $w \in [0,1)$ . For $b\gt 1$ we have $l(q,d,w,b)$ and $g(d,w,b)$ are decreasing in $w$ , while $u(q,d,w,b)$ is increasing in $w$ .
Proof. We compute
We see that for $b\gt 1$ we have $\frac{\partial }{\partial w} l(q,d,w,b)\lt 0$ , $\frac{\partial }{\partial w} g(d,w,b) \lt 0$ and $\frac{\partial }{\partial w} u(q,d,w,b)\gt 0$ , so the lemma follows.
From Lemma 4.2 it follows that if we can show Theorem 4.1 holds for $w = w_c$ , then it also holds for all $ w\in [w_c,1)$ . So from now on we will work with $l(b,w_c,d,q)$ , $u(b,w_c,d,q)$ and $h(b,w_c,d)$ . To shorten notation we write
We note that the function $h$ has a removable singularity in $1$ with $h(1)=1$ .
4.2 Inequalities $g(b) \lt b$ , $u(b) \gt b$ and $b^2 \gt l(b)$
We will start by showing $g(b) \lt b$ holds for $b \gt 1$ and $d \geq 2$ .
Lemma 4.3. Let $d \geq 2$ and $b\gt 1$ . Then we have $ g(b) \lt b.$
Proof. We have $g(1) = 1$ and $g'(1) = 1$ . Furthermore, one can see
for $d \geq 2$ and $b \gt 1$ . This implies $g(b) \lt b$ for $d \geq 2$ and $b \gt 1$ .
Next we show that $h$ is increasing in $b$ . This fact will immediately give us inequality (2). Furthermore, it is also helpful in proving a sufficient condition for inequality (1) to hold, see Lemma 4.7 below.
Lemma 4.4. For all $b \gt 1,d\geq 2$ and $q \geq 2$ we have $h'(b)\gt 0$ .
Proof. We compute
It suffices to show that
is positive for $b\gt 1$ . We compute
We see $m''(b) \gt 0$ for $b \gt 1,d\geq 2$ and $q \geq 2$ . Noting that $m'(1) = 0$ and $m(1) = 0$ , it follows that $m'(b)$ and $m(b)$ are strictly positive for $b\gt 1$ .
This immediately implies inequality (2).
Corollary 4.5. For $b\gt 1$ we have $u(b) \gt b$ .
Proof. Recall $u(b) = b^{h(b)}$ . As $h(1) = 1$ and $h'(b)\gt 0$ for $b \gt 1$ by Lemma 4.4, we see $u(b) \gt b$ for $b \gt 1$ follows.
Until this point, we did not need to assume $q=3$ or $q=4$ for the computations to work, but for inequality (3) to hold we do need some restrictions on $q$ and $d$ .
Lemma 4.6. For $q=3$ and $d \geq 2$ and for $q = 4$ and $d \geq 4$ we have $ l(b) \lt b^2,$ for all $b\gt 1$ .
Proof. Multiplying both sides of the inequality with the positive factor $(d-q+1)b^d+(d+1) (q-1)$ we obtain the equivalent inequality
To show that this inequality holds we show that the polynomial
is strictly positive for $b\gt 1$ . For $d\geq 2$ we compute
Because $d+1 \geq k$ we find that for all $b \geq 1$
For $q=3$ this quantity is nonnegative for $d \geq 2$ and for $q=4$ this quantity is nonnegative for $d \geq 4$ . So in our case we can conclude that $Q'''(b) \geq 0$ for all $b \geq 1$ . As we have $Q''(1) = d(d+1) \gt 0$ , $Q'(1) = 3d\gt 0$ and $Q(1) = 0$ , it follows that $Q''(b),Q'(b)$ and $Q(b)$ are strictly positive for $b\gt 1$ .
4.3 The inequality $u(b) \gt l(b)$
The following lemma contains a sufficient condition to prove this inequality. In the remainder of the section we prove that this condition is satisfied.
Lemma 4.7. Suppose for all $b\gt 1$ we have
Then $u(b) \gt l(b)$ for all $b \gt 1$ .
Proof. As $l(b)$ and $u(b)$ are strictly positive for $b \geq 1$ , we can define
for $b \geq 1$ . Then we have
For $b\gt 1$ we have that $h'(b)\gt 0$ by Lemma 4.4 and $\log\!(b) \gt 2(b-1)/(b+1)$ , therefore
which is positive by (18). It is easy to see $F'(1) = 0$ . Hence $F$ has a global minimum in $b=1$ . As $F(1) = 0$ , it follows that $u(b) \gt l(b)$ for all $b \gt 1$ , which is what we wanted to show.
This lemma is useful because proving the inequality $u(b) \gt l(b)$ for all $b \gt 1$ can now be reduced to proving inequalities involving rational functions and with some work to inequalities involving only polynomials. The next lemma shows that (18) holds. For this to work we do need to restrict to $q=3$ and $d \geq 2$ or $q=4$ and $d \geq 4$ .
Lemma 4.8. For $q=3$ and $d\geq 2$ and for $q=4$ and $d\geq 4$ and any $b\gt 1$ we have
Proof. We introduce the following polynomials
Thus, $l(b) = p(b)/q(b)$ and $h(b)=s(b)/t(b)$ . Furthermore, we define $r(b) = q(b)p'(b) - p(b)q'(b)$ and $v(b) = t(b)s'(b)-s(b)t'(b)$ . It is worth noting that $r(b)$ simplifies to $q^2 d^2 b^{d-1}$ . The inequality we want to prove can now be written as
For $b \gt 1$ the quantity $b(b+1)p(b)q(b)t(b)^2$ is strictly positive and thus it is equivalent to prove the inequality, where we have multiplied both sides by this term. We see that it is enough to prove that the following polynomial is strictly positive for all $b\gt 1$
It can be checked that the terms $s(b)$ , $b\cdot v(b)$ and $b\cdot r(b)$ all contain a factor $qdb^d$ and thus $P_0(b) = P(b)/(kdb^d)$ is a polynomial in $b$ whose coefficients are polynomials in $d$ . The remainder of the proof will be dedicated to showing that $P_0(b)$ is strictly positive for $b\gt 1$ .
To avoid ambiguity later, we prove this for $q=3$ in the two cases $d=2$ and $d=3$ separately. For $d=2$ we have
and for $d=3$ we have
In both cases all the coefficients of $P_0(b)$ are strictly positive when written as a polynomial in $b-1$ and thus the polynomials are strictly positive for $b\gt 1$ .
We will now assume that $d \geq 4$ . It can be seen by cross-multiplying the terms in the individual polynomials in (19) that the only coefficients of $P_0(b)$ that can be non-zero appear in the $b^{i\cdot d+j}$ terms where $i,j \in \{0,1,2,3\}$ . The exact coefficients are recorded in Table 1. For $n \in \{1,2,3\}$ we inductively define the polynomials $P_n(b) = P_{n-1}^{(4)}(b)/b^{d-4}$ . Note that in this way $P_n(b)$ is a polynomial whose only non-zero coefficients appear in the $b^{i\cdot d + j}$ term, where $0\leq i \leq 3-n$ and $j \in \{0,1,2,3\}$ .
The values of $P_{j}^{(i)}(1)$ as a polynomial in $x = d-4$ , up to a common positive multiplicative factor, for $q=3$ and $q=4$ are contained in Tables 2 and 3 respectively. These polynomials have only nonnegative coefficients, from which it follows that their values are nonnegative for all $d\geq 4$ .
The polynomial $P_3(b)$ is a cubic polynomial and thus its third derivative $P_3^{(3)}(b)$ is constant. Its exact value, which is recorded in Table 2 for $q=3$ and in Table 3 for $q=4$ , is strictly positive for all $x \geq 0$ , i.e. for all $d \geq 4$ . We claim that it now follows inductively that $P_j^{(i)}(b)$ is strictly positive for all $b \gt 1$ . Namely, suppose that for $i\in \{0,1,2,3\}$ we have shown that $P_{j}^{(i+1)}(b)$ is strictly positive for $b\gt 1$ . Then it follows that $P_{j}^{(i)}(b)$ is strictly increasing. Because $P_{j}^{(i)}(1) \geq 0$ (cf. Tables 2 and 3), we can conclude from this that $P_{j}^{(i)}(b)$ is also strictly positive for $b\gt 1$ . Furthermore, if $P_{j+1}(b) \gt 0$ for $b\gt 1$ then the same follows for $P_{j}^{(4)}$ because $b^{d-4}\cdot P_{j+1}(b) = P_{j}^{(4)}(b)$ . In conclusion, it follows that $P_0(b)\gt 0$ for $b\gt 1$ , which is what we set out to prove.
We have shown the inequalities (1), (2), (3) all hold, thus the conditions in Lemmas 3.7 and 3.4 are satisfied.
5. Concluding remarks
We conclude with some remarks concerning the possibility of expanding our approach and with some questions.
Generalisation
The biggest challenge to generalising our method to other values of $(q,d)$ comes from the fact that inequality (3) from Theorem 4.1 is not necessarily true for all $b\gt 1$ . This suggests that it might not be possible in all cases to find arbitrarily large log convex regions that get mapped into themselves. We suspect that in general this is indeed impossible when one requires the regions to have the symmetry that we use in this paper, that is regions $\mathcal{T} \subset \mathbb{R}^{q-1}_{\gt 0}$ with $M_\sigma \!(\mathcal{T}) = \mathcal{T}$ for all $\sigma \in S_q$ . A consequence is that in some cases we cannot make the region large enough to start the induction laid out in Lemma 2.3. Fortunately, inequality (3) does hold near $b\gt 1$ for all $(q,d)$ with $d \geq q-1$ and $w \geq w_c$ . This suggests that, at least when $w_c$ is close to $1$ , i.e. when $d$ is large enough compared to $q$ , our methods could still be applied. Moreover, it might be possible to find a separate argument to show that the ratios of $\hat{\mathbb{T}}_d^n$ get at least moderately close to $1$ for some $n$ . This could then be used to bootstrap the induction in Lemma 2.3.
There are two more complications that prevent us from applying our method directly to other values of $(q,d)$ . We suspect that these can be overcome with more thorough analysis. The first one comes from the fact that inequality (1) from Theorem 4.1 is no longer satisfied for most values of $(q,d)$ and $w_c$ . The precise form of this inequality highly depends on our method of proof and specifically on our choice of upper bound for $l_{a,b}$ in equation (3.3). Computer analysis suggests that by taking different upper bounds for $l_{a,b}$ , specifically taking tangent lines at different points, the proof that $\mathcal{T}_{a,b}$ gets mapped into itself for $a$ and $b$ near $1$ can be salvaged. The other complication appears when $q\geq 5$ . In this case one obtains more inequalities analogous to inequality (4) from Theorem 4.1. These are not all satisfied when we take a naive generalisation of the region $\mathcal{T}_{a,b}$ . We suspect that this can be remedied by letting the regions depend on more than just two parameters. This leads to the analysis specifically of the log convexity of the regions becoming more involved.
The case $(q,d)=(4,3)$
Unfortunately, our approach does not allows us to handle the case $(q,d)=(4,3)$ . We briefly explain the complications. In inequality (3.3) we use the tangent line of $l_{a,b}(x)$ at $x=1$ to upper bound $l_{a,b}(x)$ ; this makes the calculus easier and this choice works for $q=4$ and $d\geq 4$ . We have evidence that by using the tangent line at a different point in inequality (3.3) the calculations that follow from this upper bound also work for the case $q=4$ and $d=3$ . However, we can show that inequality (3) in Theorem 4.1 fails when $(q,d)=(4,3)$ and $b \gt 1$ is large enough, meaning that in that case the set $\mathcal{T}_{a,b}$ cannot both be log convex and satisfy $F(\mathcal{T}_{a,b})\subset \mathcal{T}_{a,b}$ . For $b\gt 1$ close enough to $1$ inequality (3) in Theorem 4.1 does hold. We suspect that our approach can be tweaked to show uniqueness for all $w \in (0,1)$ when $q=4$ and $d=3$ , possibly by finding a separate argument to show that the ratios of $\hat{\mathbb{T}}_3^n$ get at least moderately close to $1$ for some $n$ , bootstrapping the induction in Lemma 2.3.
Zero-free region
Our final comment is related to the following question. Given $q\in \mathbb{N}$ and $d\geq q-1$ . Does there exist a region $U$ in $\mathbb{C}$ containing the interval $(1-\frac{q}{d+1},1]$ such that for any $w\in U$ and any graph $G$ of maximum degree $d+1$ the partition function $Z(G,q,w)\neq 0$ ? (If so this would yield an efficient algorithm for approximately computing $Z(G,q,w)$ in this region by Barvinok’s method [Reference Barvinok1] combined with [Reference Patel and Regts19].)
Following [Reference Bencs, Davies, Patel and Regts3], to prove this, for say $q=3$ , we would essentially need to find a log convex set $S\subsetneq \mathbb{C}^2$ such that the map $F$ maps $S$ into $S$ and such that $S$ satisfies some additional properties that we will not discuss here. We suspect that the sets $\mathcal{T}_{a,b}$ we have constructed may be helpful in determining whether such a set $S$ can be constructed.
Acknowledgement
The authors would like to thank Han Peters for stimulating discussions. The authors are moreover grateful for constructive comments from the referees.
A. Proof of Lemma 1.3
We provide a proof of Lemma 1.3 here closely following Brightwell and Winkler’s proof for the case $w=0$ modifying it where appropriate.
We start with the ‘if’ part. Fix $d$ and $w\geq 0$ and let $\mu$ be any Gibbs measure on $\mathbb{T}=\mathbb{T}_d$ . Let $U\subset V=V(\mathbb{T})$ be a finite set. We aim to show that for any configuration $\psi\;:\;U\to [q]$ , the probability
does not depend on $\mu$ .
We may assume that $U$ induces a tree with each vertex of degree $d+1$ or $1$ by taking a larger finite set if needed. Suppose that $U$ has $\ell$ leaves; denote the set of leaves by $L$ . Choose $n$ large enough, to be specified later, and let $W$ denote the collection of all vertices of $\mathbb{T}$ at distance at most $n$ from $U$ . The graph induced by $(W\setminus U)\cup L$ is the disjoint union of $\ell$ copies of $\mathbb{T}^n$ each rooted at a leaf of $U$ .
Now generate a random sample from the Gibbs measure $\mu$ as follows: first take a random sample $\boldsymbol \Phi$ from $\mu$ , uncolor all vertices in $W\setminus \partial W$ and then choose a random coloring of $W\setminus \partial W$ consistent with the coloring of $\partial W$ from $\mathbb{P}_{W}$ . This yields a valid way of generating a sample $\boldsymbol \Phi '$ by the Gibbs property.
The probability that ${\boldsymbol \Phi '}\!\restriction _U$ is equal to $\psi$ is then given by
The second probability can be rewritten as
Now $\mathbb{P}[{\boldsymbol \Phi '}\!\restriction _{\partial U}=\psi \!\restriction _{\partial U} \,\big | \,{\boldsymbol \Phi '}\!\restriction _{\partial W}=\rho ]$ is the product over the elements of $\partial U=L$ of the probability that the root vertex of a copy of $\mathbb{T}^n$ gets a particular color conditioned on the leaves having their colors specified by $\rho$ . By assumption, for $n$ large enough this probability gets arbitrarily close to $1/q$ . Therefore (A.3) gets arbitrarily close to $\mathbb{P}[{\boldsymbol \Phi '}\!\restriction _U=\psi \big | \,{\boldsymbol \Phi '}\!\restriction _{\partial U}=\psi \!\restriction _{\partial U}\!]q^{-\ell }$ as $n\to \infty$ . Plugging this back into (A.2) we obtain that as $n\to \infty$ ,
The latter probability is independent of $\mu$ . So this concludes the proof.
For the ‘only if’ part we merely sketch the argument. Suppose the limsup is not equal to $0$ for some color $c\in [q]$ . Then there must be distinct colors $c$ and $c'$ , a number $\varepsilon \gt 0$ , a sequence $\{n_i\}$ of natural numbers and boundary conditions $\tau _i$ on the leaves of $\mathbb{T}^{n_i}_d$ such that the associated probabilities of the roots getting color $c$ (resp. $c'$ ) are at least $1/q+\varepsilon$ (resp. at most $1/q-\varepsilon$ ). Let $\tau '_{\!\!i}$ be the boundary condition on the leaves of $\mathbb{T}^{n_i}_d$ obtained from $\tau _i$ by flipping the colors $c$ and $c'$ . By symmetry, these respective probabilities are then reversed. We can then create two distinct Gibbs measures with a limiting process using the boundary conditions $\tau _i$ and $\tau'_{\!\!i}$ respectively.