1 Introduction
By definition, chaotic dynamical systems’ future states are generally impossible to predict over the long term. For this reason they must be studied probabilistically, that is, in terms of measures that evolve under the dynamics. Many probabilistic questions about topologically mixing chaotic systems $f: M \circlearrowleft $ are quantitatively related to fast mixing (that is, decay of correlations) for smooth observables with respect to some invariant measure $\mu $ [Reference Baladi2]. Mathematically, mixing means the following decay of sufficiently regular (for example, $C^\infty $ ) observables $A, B: M \to \mathbb {R}$ :
Alternatively, this can be reformulated as weak convergence of smoothly reweighted versions of $\mu $ back to $\mu $ under the action of the dynamics:
for any $A,B$ with $\int _M B\,\mathrm {d}\mu = 1$ , where $f_*$ pushes measures forward under f.
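A sketch of these two statements in symbols, assuming the usual conventions (limits taken as $n\to \infty $ , and $B\mu $ denoting the measure with density B with respect to $\mu $ ); the exact form of the displays is an assumption:

```latex
\begin{align*}
\int_M A\circ f^n \, B \,\mathrm{d}\mu - \int_M A\,\mathrm{d}\mu \int_M B\,\mathrm{d}\mu &\to 0, \\
\int_M A\,\mathrm{d}\bigl(f^n_*(B\mu)\bigr) = \int_M A\circ f^n \, B\,\mathrm{d}\mu &\to \int_M A\,\mathrm{d}\mu
\quad \text{whenever } \int_M B\,\mathrm{d}\mu = 1.
\end{align*}
```

The second line follows from the first by the change of variables defining the push-forward $f_*$ .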
A particularly important invariant measure is the Sinai–Ruelle–Bowen (SRB) measure $\rho $ . This is typically the most physically relevant invariant measure: among other things, assuming no periodicity in the system, measures with smooth enough Lebesgue densities commonly converge to the physical measure over time (typically exponentially quickly), that is,
for all $A,B \in C^\infty $ with $\int _M B\,\mathrm {d}\operatorname {\mathrm {Leb}} = 1$ . It is a standard meta-result in smooth ergodic theory that for sufficiently hyperbolic maps with $d_u$ positive Lyapunov exponents, the same result also holds for any measure $\mu $ with a regular conditional density along $d_u$ -dimensional submanifolds that are tangent to the expanding direction in phase space:
However, establishing fast mixing (or the related decay of iterates of a transfer operator) appears to be insufficient to answer various questions that depend on the long-term behaviour of a small problematic subset of the system’s attractor, such as the response problem in non-uniformly hyperbolic systems [Reference Ruelle21]. We might therefore ask if other measures $\mu $ also satisfy (1).
Perhaps the simplest example of a small subset of an attractor would be its restriction to a submanifold of phase space—which will generically be transverse to both stable and unstable manifolds. If this submanifold comes from some foliation (for example, of level sets of an observable), we can disintegrate the physical measure and study its conditional measure $\mu $ on this submanifold. An example of such a measure is shown in Figure 1: note that unlike Gibbs invariant measures, the measure is supported on a non-invariant Cantor set without a product structure.
If (1) obtains for such a conditional measure $\mu $ , we call it conditional mixing. For dissipative systems, conditional mixing appears to lie outside the scope of traditional study by transfer operators, and this basic problem has hitherto seen very little study, notwithstanding work for other classes of $\mu $ in the specific case of linear one-dimensional maps [Reference Bourgain and Dyatlov7, Reference Hochman and Shmerkin15, Reference Sahlsten and Stevens22].
Nevertheless, through a mixture of rigorous and numerical study we will show that conditional mixing holds for a range of maps, and, in particular, has some connection with the fractal-geometric theory of Fourier dimension [Reference Mosquera and Shmerkin19, Reference Sahlsten and Stevens22]. From our results it seems that conditional mixing is likely to hold for a large set of maps and submanifolds (perhaps even almost all those for which there is no ‘obvious’ reason why it could not hold).
We also describe some consequences of conditional mixing. One physically meaningful consequence of conditional mixing is that the capability of Bayesian filters to make predictions about chaotic systems in the long term is limited if they only make partial observations (see §3); it also implies set-filling results for chaotic attractors (see §2). Another physical application, to proving the widely believed existence of linear response outside of smooth hyperbolic systems, is considered in [Reference Wormell24].
We stress that, except for conservative systems, conditional mixing does not follow from traditional Banach space approaches [Reference Baladi2, Reference Baladi3]: these typically rely on some sort of regularity of the initial measure (for example, smoothness of a density along a foliation). The conditional measures we will study in this paper have no regularity of this sort, and indeed are Cantor measures supported on totally disconnected sets. Banach space approaches, furthermore, typically proceed by showing that this kind of regularity is preserved or improved, and results usually hold while the map’s structure remains the same: on the other hand, we present an example in §4.1 where one choice of map fails to have conditional mixing even though all the structurally identical maps around it do. We note that the Fourier dimension theory for nonlinear maps [Reference Bourgain and Dyatlov7, Reference Sahlsten and Stevens22] uses the Dolgopyat method (in dynamics more commonly used to prove rapid mixing for flows) [Reference Dolgopyat11], where one is interested in joint non-integrability.
The paper is structured as follows. In §2 we give a mathematical definition of conditional mixing and give a simple, illustrative consequence of it involving set filling. In §3 we present the application of conditional mixing to the area of forecasting. In §§4 and 5 we give evidence for conditional mixing in various systems, respectively presenting a theorem for a class of (potentially nonlinear) baker’s maps and numerical evidence for some piecewise hyperbolic maps. We discuss our results in §6. The novel algorithms we use to obtain our numerical results are given in Appendix A.
2 Definition and an illustrative consequence
Let $T: M\to M$ be a dynamical system with SRB measure $\rho $ , and let $H: M \to \mathbb {R}^d$ be a $C^2$ function with no critical points on the level set
Suppose that for some $c:\mathbb {R}^+ \to \mathbb {R}^+$ the following limit exists in the $C^0$ -weak topology:
Suppose also that $\mu $ is a finite measure. Any two sequences $c, c'$ will yield $\mu , \mu '$ identical up to scaling, as would changing the kernel
to another bounded, compactly supported decreasing function. If $\mu $ is a probability measure then we denote it by $\mathrm {d}\mu (x) = \mathrm {d}\rho (x \mid H(x) = 0)$ .
Definition 2.1. $(T,\rho )$ has conditional mixing with respect to H if for all $A, B \in C^\infty $ ,
This is to say that repeatedly pushing forward any weighted conditional measure $B\mu $ by T, we converge back to the SRB measure $\rho $ (up to a multiplicative constant) in the weak topology with respect to $C^\infty $ .
It is natural to wish to put some quantitative bounds on mixing. By analogy with the usual sort of mixing, a natural decay rate is exponential.
Definition 2.2. $(T,\rho )$ has exponential conditional mixing with respect to H if there exist $\xi <1$ , C and $r<\infty $ such that for all $A, B \in C^r$ ,
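A sketch of how Definitions 2.1 and 2.2 read in symbols, assuming the normalization suggested by the remark after Definition 2.1 (convergence to $\rho $ up to the multiplicative constant $\int _M B\,\mathrm {d}\mu $ ); the use of $C^r$ norms on the right-hand side of the second line is likewise an assumption:

```latex
\begin{align*}
\lim_{n\to\infty} \int_M A\circ T^n \, B\,\mathrm{d}\mu
  &= \int_M A\,\mathrm{d}\rho \int_M B\,\mathrm{d}\mu, \\
\left| \int_M A\circ T^n \, B\,\mathrm{d}\mu - \int_M A\,\mathrm{d}\rho \int_M B\,\mathrm{d}\mu \right|
  &\leq C \,\|A\|_{C^r}\,\|B\|_{C^r}\,\xi^n .
\end{align*}
```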
Note that (exponential) conditional mixing can already be expected to hold from the classical theory if T is conservative. For example, we have the following proposition (proven in Appendix D).
Proposition 2.1. Suppose $T: M \to M$ is a $C^3$ conservative topologically mixing Anosov diffeomorphism with M a compact $C^\infty $ manifold. Suppose $H \in C^2(M,\mathbb {R}^d)$ has no singular points on its zero set $\ell _H$ , the zero set is everywhere transverse to stable manifolds, and d is less than or equal to the number of stable directions. Then $(T,{\operatorname {\mathrm {Leb}}})$ has exponential conditional mixing with respect to H.
The measure-based conditional mixing implies an interesting set-convergence property of the intersection of the level set $\ell _H$ with the support of $\rho $ , which we denote by $\Lambda $ and which is often an attractor of T. We find that iterates of the intersection of the line and $\Lambda $ , that is, iterates of a slice of $\Lambda $ , converge back in Hausdorff distance to the full support of $\rho $ . An example of this phenomenon is plotted in Figure 2.
Proposition 2.2. Conditional mixing with respect to H implies that
Under a reasonably general assumption on the regularity of the SRB measure, exponential conditional mixing gives us a quantitative version of this result as well.
Proposition 2.3. Suppose $\rho $ is lower-Ahlfors regular: that is, there exist $C,d$ such that $\rho (B(x,\delta ))> C \delta ^d$ for all x. Exponential conditional mixing with respect to H implies that for some $\xi _1 < 1$ and $C_1$ ,
The proofs of these two propositions are given in Appendix D.
3 Forecasting with perfect partial observations
We now consider a fundamental practical problem to which the notion of conditional mixing is directly applicable: that of forecasting chaotic dynamics. Many forecasting methods have been developed that assimilate information obtained from observations. In general, these methods achieve this assimilation by approximating the Bayesian filter, also known as the optimal filter [Reference Doucet, De Freitas and Gordon12]. The stability of the Bayesian filter with small, finite observation error has recently been studied in Anosov systems [Reference Oljača, Kuna and Bröcker20].
To understand this in the simplest instance, let us suppose that our system $T: \mathcal {M} \circlearrowleft $ for $\mathcal {M} \subset \mathbb {R}^d$ has exponential mixing, and at time $n=0$ we have some prior probabilistic knowledge of the state of our system, given by some (presumably ‘nice’) measure $\mathrm {d}\mu ^{-}(x)$ . If we start with an unobserved system at statistical equilibrium, the natural choice of prior is $\mu ^{-} = \rho $ , the SRB measure.
We can now make a noisy and perhaps partial observation of our system, given as a value $y = H(x) + \zeta \in \mathbb {R}^{e}$ , where $\zeta $ is random with probabilities given by a smooth kernel $p(\zeta \mid x)\,\mathrm {d} \zeta $ .
Assimilating the observation y, the posterior probability measure of x is given by Bayes’ theorem as
with normalizing constant $Z(y) = \int _{M} p(y-H(w)\mid w)\,\mathrm {d}\mu ^{-}(w)$ . An example is given in Figure 3. Our best guess of the state of the system at future time n (that is, of $T^n(x)$ ) is then given by $T^n_*\mu $ ; the expected value of some (nice) observable A at time n is therefore
Depending on how effective a measurement $y=H(x)$ is of x, $\mu $ is likely to be concentrated on a smaller set than $\mu ^-$ , and this may improve forward estimates of the system’s state over the short to medium term. However, under our assumptions for $T, A, \mu ^-, p$ , exponential mixing results give that (4) will eventually converge at a fixed exponential rate to the SRB measure expectation $\int _M A\,\mathrm {d}\rho $ .
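To illustrate this washing-out of a noisy observation, here is a minimal sketch in a toy setting that is not one of the systems studied in this paper: the doubling map $T(x) = 2x \bmod 1$ on $[0,1]$ , whose SRB measure is Lebesgue, with a full observation $H(x) = x$ and Gaussian noise. The function name `posterior_forecast`, the observable $A(x) = \cos (2\pi x)$ , and all parameter values are illustrative assumptions.

```python
import math

def posterior_forecast(y_obs, sigma, n, grid=1 << 14):
    """Posterior expectation of A(T^n x) for T(x) = 2x mod 1, A(x) = cos(2*pi*x).

    Assumptions of this toy sketch:
      * prior = Lebesgue measure on [0, 1] (the SRB measure of the doubling map);
      * observation y = x + Gaussian noise with standard deviation sigma;
      * A(T^n x) = cos(2*pi*2^n*x), so no iteration of the map is needed;
      * integrals are approximated by a midpoint rule with `grid` cells.
    """
    h = 1.0 / grid
    numerator = 0.0
    normalizer = 0.0
    for m in range(grid):
        x = (m + 0.5) * h
        weight = math.exp(-0.5 * ((y_obs - x) / sigma) ** 2)  # likelihood of y given x
        numerator += weight * math.cos(2.0 * math.pi * (2 ** n) * x)
        normalizer += weight  # plays the role of Z(y), up to the common factor h
    return numerator / normalizer

# Forecasts of A at lead times n = 0, ..., 6 after observing y = 0.3 with sigma = 0.05.
forecasts = [posterior_forecast(0.3, 0.05, n) for n in range(7)]
```

At lead time zero the forecast is informative (clearly different from the SRB expectation $\int A\,\mathrm {d}\operatorname {\mathrm {Leb}} = 0$ ), while at later lead times it is numerically indistinguishable from zero.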
But as we make our observations more and more precise, that is, reduce the noise in H to zero, we might ask what posterior we end up with and what quality of forecast we can make with it. This is trivial if $H(x)$ specifies x: we will know our value of x exactly, and therefore $T^n(x)$ exactly for all time. However, it is typical in high-dimensional systems for H to be only a partial observation.
In a zero-noise limit the kernel p is no longer smooth, with $p(\zeta \mid w) = \delta (\zeta )$ . Our posterior measure $\mathrm {d}\mu (x)$ is then simply the conditional probability measure of $\mathrm {d}\mu ^{-}(x)$ given that $H(x) = y$ :
If conditional mixing (2) holds, this must converge to the expectation of A with respect to the SRB measure $\rho $ , that is, in the long term we end up back with the default no-information guess. Conditional mixing thus codifies the intuition that typical incomplete information on the system should wash out over time.
On the other hand, if conditional mixing did not hold generically, and thus partial observations could with some positive likelihood be predictively useful for all time, there might be significant practical consequences for chaotic systems. However, what our exploratory mathematical results in §§4 and 5 will suggest is that conditional mixing should in fact hold, excluding this possibility.
In practice, of course, physical observations will always have some random error to them. This is also true of the evolution of physical systems, which are nonetheless considered worth studying in their zero-noise limit. In any case, if the observation uncertainty is small enough it will take some time to manifest in the prediction, and the behaviour until then is described by the zero-noise limit we have discussed here (see Figure 4).
Nevertheless, prediction over the long term with small, non-zero observation errors does admit an interesting question for which general results can be proven using standard dynamics techniques. If T is a $C^\infty $ Anosov diffeomorphism and $A \in C^\infty (M)$ , then over the long term, the convergence of the forward predictions (4) is governed by how much $\mu $ picks up the transfer operator eigenfunctions of T. Mathematically, we have for all $K\in \mathbb {N}$ that as $n\to \infty $ [Reference Gouëzel and Liverani14],
where $r_k\in \mathbb {N}$ and $1> |\lambda _1| \geq |\lambda _2| \geq \cdots \to 0$ . Both the $\alpha _k$ and $\beta _k$ are independent of the Bayesian posterior $\mu $ : the $\alpha _k,\beta _k$ are functionals, and in particular $\beta _k$ is an integration against a hyperdistribution that is $C^\infty $ in the unstable direction. While the problem is technical and beyond the scope of this paper, this latter fact suggests that as the observation error decreases to zero (and thus the density ${\mathrm {d}\mu }/{\mathrm {d}\rho }$ converges to a conditional distribution supported on $\ell _H$ ), the integral $\beta _k({\mathrm {d}\mu }/{\mathrm {d}\rho })$ could potentially converge to a finite limit when $\ell _H$ is transverse to stable manifolds, at least for Anosov maps with a $C^\infty $ unstable foliation. If this were the case, it would suggest that reducing the observation error does not strengthen the asymptotic long-term predictive power of the Bayesian filter (that is, after any transients, which will at least initially approximate the zero-noise limit and may take arbitrarily long to dissipate).
However, $C^\infty $ uniformly hyperbolic systems are probably unique in that their transfer operators have no non-removable essential spectrum, and K can therefore be taken to infinity in (5). We can thus contrast the $C^\infty $ Anosov case to our numerical evidence for the Lozi map in Figure 4, where the predictive power appeared to be improved by smaller error at times when the conditional mixing regime had washed out. Note that the Lozi map has a rough unstable foliation [Reference Baladi and Gouëzel4] and can usually be expected to lack a finite Markov partition [Reference Misiurewicz and Štimac17], leading to essential spectrum for its transfer operator [Reference Butterley, Canestrari and Jain8].
This argument would not yield a proof of conditional mixing a priori, as bounds for the error in the asymptotic expansion (5) will blow up as the observation noise goes to zero.
4 Rigorous results for a toy model: baker’s map
As a simple model to study conditional mixing, let us consider baker’s maps $b: D := [0,1]^2 \circlearrowleft $ of the following form:
where $k\geq 2$ is an integer, the $v_i, i = 1,\ldots ,k$ , are (possibly nonlinear) contractions with all $|v_i'|\leq \mu < 1$ . We will also assume that the open images $v_i((0,1))$ are disjoint, and the contractions have bounded distortion (that is, the $\log |v_i'|$ are $C^1$ ). An example of such a map is plotted in Figure 5. For smooth enough transverse foliations, it is possible to define in a natural way conditional measures for all leaves. For simplicity, we assume we have a foliation $\Psi (t;y)$ of a subset of D into graphs $x = \psi (y) - t$ , with the parameter t translating the graph in the x direction.
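The baker’s maps described above can be sketched as follows, assuming the standard form $b(x,y) = (kx \bmod 1, v_i(y))$ on the i th vertical strip of width $1/k$ ; this form is consistent with the later remark that the x component of b is a tupling map, but the exact display is an assumption. The function name `baker`, the zero-based branch index, and the example contractions are illustrative choices. Exact rational arithmetic is used so that the chaotic x dynamics does not accumulate rounding error.

```python
from fractions import Fraction
import math

def baker(x, y, contractions):
    """One step of b(x, y) = (k*x mod 1, v_i(y)) for x in [i/k, (i+1)/k), i = 0, ..., k-1.

    `contractions` is the list [v_0, ..., v_{k-1}] (zero-based, unlike the text).
    The right edge x = 1 is assigned to the last branch by convention.
    """
    k = len(contractions)
    i = min(math.floor(k * x), k - 1)  # index of the vertical strip containing x
    return k * x - i, contractions[i](y)

# Example with k = 2 linear contractions whose open images (0, 1/3) and (2/3, 1) are disjoint.
example_contractions = [
    lambda y: y / 3,
    lambda y: y / 3 + Fraction(2, 3),
]

left_image = baker(Fraction(1, 4), Fraction(1, 2), example_contractions)
right_image = baker(Fraction(3, 4), Fraction(1, 2), example_contractions)
```

Here the two points $(\tfrac 14,\tfrac 12)$ and $(\tfrac 34,\tfrac 12)$ are sent to the same x coordinate but to different y strips, which is how the attractor acquires its Cantor structure in the y direction.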
Proposition 4.1. Suppose that $\Psi (t;y) = (\psi (y) -t,y)$ is a foliation of some subset $D_\Psi \subseteq D$ for $|t| \leq t_* \in [0,1]$ and $y \in [0,1]$ . Let $\rho $ be the (unique) SRB measure of b.
Then for every $t\in [-t_*,t_*]$ there exists a unique probability measure $\rho _t$ supported on $\Psi (t,[0,1])$ such that the following assertions hold.
(a) For all $t \in (-t_*,t_*)$ and all continuous functions $A: D\to \mathbb {R}$ ,
$$ \begin{align} \int_D A\,\mathrm{d}\rho_t = \lim_{\delta\to 0} \frac{1}{2\delta} \int_{\{\Psi(s,y) : |s-t|<\delta,y\in[0,1]\}} A\,\mathrm{d}\rho. \tag{7} \end{align} $$
(b) For all Borel sets $E \subseteq D_\Psi $ ,
$$ \begin{align*} \rho(E) = \int_{-t_*}^{t_*} \rho_t(E)\,\mathrm{d} t. \end{align*} $$
Let us be as general as we can about the functions against which the conditional measures weakly converge back to the full measure. For $\alpha ,\beta \in (0,1]$ we define the following norm on continuous functions $\phi : D \to \mathcal {C}$ :
where the directional Hölder semi-norms are given by
The Banach space $C^{\alpha ;\beta }$ will then consist of all continuous functions $\phi : D \to \mathcal {C}$ with $\| \phi \|_{\alpha ;\beta } < \infty $ : that is, functions that are $\alpha $ -Hölder in the x direction and $\beta $ -Hölder in the y direction. In particular, the Banach space of $C^1$ functions is continuously embedded in $C^{\alpha ;\beta }$ .
The following theorem, proved in Appendix B, says that for certain baker’s maps, exponential conditional mixing holds for all conditional SRB measures on a smooth foliation transversal to unstable lines:
Theorem 4.2. Suppose that $\Psi (t;y) = (\psi (y) -t,y)$ is a foliation of some subset of D for $|t| \leq t_* \in [0,1]$ and $y \in [0,1]$ , and $\psi $ is $C^2$ with $\psi ' \neq 0$ . Suppose one of the following conditions holds:
(I) The contractions $v_i$ are totally nonlinear and $C^2$ , and $\bigcup _i v_i([0,1]) = [0,1]$ .
(II) The contractions $v_i$ are totally nonlinear and analytic, as is $\psi $ .
(III) The contractions are linear with $v_i(x) := \mu x + o_i$ for $o_i \in [0,1-\mu ]$ , and $\psi '' \neq 0$ .
Let $\rho $ be the (unique) SRB measure of a modified baker’s map b and let $\{\rho _t\}_{t\in [-t_*,t_*]}$ be the conditional measures of $\rho $ on the foliation. Then $(b,\rho )$ has exponential conditional mixing with respect to the level set of $H(x,y) = \psi (y)-t-x$ for all $t \in (-t_*,t_*)$ .
More specifically, there exists $d_*> 0$ such that for all $\gamma \in (1-d_*,1],\beta \in (0,1], \alpha \in (\gamma -d_*,1]$ , there exist $C> 0$ , $\xi \in (0,1)$ such that for all $t \in (-t_*,t_*)$ , $A \in C^{\alpha ;\beta }$ , $B \in C^\gamma $ , $n \in \mathbb {N}$ ,
Intuitively it seems that exponential conditional mixing holds wherever there is no obvious reason for it not to hold: that is to say, wherever the conditional measure $\mu $ is not compatible with the algebraic or symbolic structure of the unstable dynamics. Indeed, the Fourier decay results we use to prove Theorem 4.2 rely on joint non-integrability properties [Reference Sahlsten and Stevens22]. This incompatibility can be obtained either if $\rho _t$ is generated by nonlinear contractions (where the expanding dynamics is linear), as in I–II, or if the leaf $x = \psi (y)-t$ transforms the self-affine stable measure in a nonlinear way (as in III). We give an example where the conditional measure picks up the algebraic structure of both the stable and unstable dynamics in §4.1.
From this and Proposition 2.3, exponential convergence in Hausdorff distance of the slice sets follows.
Corollary 4.3. Under the conditions of Theorem 4.2, there exist $C_1$ and $\xi _1$ such that for all $t \in (-t_*,t_*)$ ,
where $\Lambda _b$ is the attractor of b.
Note that the classic piecewise affine baker’s maps fall under condition III of the theorem, although for conservative maps the application of Fourier dimension theory is unnecessary due to the Lebesgue-absolute continuity of $\rho $ .
It can also be seen from the proof that the constant $d_*$ is half the Fourier dimension of $\rho _t$ projected onto the x coordinate. (A choice of test functions with more specific Fourier decay properties might yield $d_*$ to be exactly the Fourier dimension.) This Fourier dimension is bounded from above by the Hausdorff dimension of $\rho _t$ , which, given the product structure of the measure, is easily seen also to be the stable dimension of the system [Reference Schmeling and Troubetzkoy23]. Thus, a larger stable dimension suggests conditional mixing holds with respect to increasingly less regular observables, with consequences for linear response theory [Reference Ruelle21, Reference Wormell24].
In fact, for case III, both $d_*$ and the asymptotic rate of conditional mixing $\xi $ are independent of the conditioning foliation [Reference Mosquera and Shmerkin19, Theorem 3.1].
It is worth mentioning that it is also possible to decompose sufficiently regular (for example, real analytic) curves that are tangent to stable or unstable manifolds away from the support of the attractor into a finite set of curves that uniformly avoid tangencies. In fact, the stipulations that $\psi ' \neq 0$ or indeed that the foliation is a graph in y are mere artefacts of the proof and can almost certainly be relaxed.
To prove Theorem 4.2, we essentially use two facts. The first fact, used in Lemma B.6, is that the x component of b is a tupling map, whose action on Fourier coefficients of measures is well known. The second fact (Proposition B.3) is that the SRB measure is a product of uniform measure in the x direction and a Gibbs measure (in fact, a measure of maximal entropy) of an iterated function system in the y direction. This allows us to bring in some recent results on Fourier dimension of Gibbs measures [Reference Mosquera and Olivo18, Reference Mosquera and Shmerkin19, Reference Sahlsten and Stevens22].
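The first of these facts can be illustrated numerically. The sketch below assumes that the well-known action in question is the identity $\widehat {\kappa _*\nu }(j) = \hat \nu (kj)$ for the k-tupling map $\kappa (x) = kx \bmod 1$ , with the convention $\hat \nu (j) = \int e^{-2\pi i j x}\,\mathrm {d}\nu (x)$ ; the precise statement of Lemma B.6 is not reproduced here. The discrete test measure and the function names are illustrative.

```python
import cmath

def fourier_coefficient(points, weights, freq):
    """nu_hat(freq) = sum_m w_m * exp(-2*pi*i*freq*x_m) for nu = sum_m w_m * delta_{x_m}."""
    return sum(w * cmath.exp(-2j * cmath.pi * freq * x) for x, w in zip(points, weights))

def push_forward_tupling(points, k):
    """Support of the push-forward of a discrete measure under x -> k*x mod 1.

    The weights are unchanged, so only the atoms need to be moved.
    """
    return [(k * x) % 1.0 for x in points]

# An arbitrary discrete probability measure on [0, 1).
points = [0.1, 0.37, 0.5, 0.93]
weights = [0.1, 0.2, 0.3, 0.4]
k, freq = 3, 2

lhs = fourier_coefficient(push_forward_tupling(points, k), weights, freq)  # coefficient of kappa_* nu
rhs = fourier_coefficient(points, weights, k * freq)                       # nu_hat(k * freq)
gap = abs(lhs - rhs)
```

The two coefficients agree up to rounding error, because $e^{-2\pi i j (kx \bmod 1)} = e^{-2\pi i jkx}$ for integer j. Iterating, the mixing of the x component against a Fourier mode is therefore controlled by the decay of $\hat \nu $ along the frequencies $k^n j$ .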
The Fourier dimension of Gibbs measures is an area in progress whose results have not yet been consolidated, hence the somewhat particular set of alternatives. We remark that if $B(x,y)$ depends only on x then in case III the $\psi ' \neq 0$ restriction can be dropped: that is, quadratic tangencies with stable manifolds (lines of constant y) are allowed here.
4.1 A case where conditional mixing fails
However, it is clear that some stipulations on the contractions and foliation of a similar flavour to I–III must remain. In particular, either the contractions or the foliation should avoid preserving too good a linear structure, or conditional mixing will fail. An example where this occurs is given for the map $b_{\mathrm {bad}}$ , given by $k=3$ and
The attractor and dynamical picture for this map are plotted in Figure 6.
Here, we have the following proposition.
Proposition 4.4. $(b_{\mathrm {bad}},\rho _{\mathrm {bad}})$ has no conditional mixing with respect to $H(x,y) = y-x$ .
This is because the x and y directions, which are both linear, both preserve the middle-thirds Cantor set, which is copied across the directions using the linear foliation.
Proof. The contractions $v_1, v_2, v_3$ together generate the middle-thirds Cantor set $C_{1/3}$ : if $f_1(x) = x/3$ , $f_2(x) = (x+2)/3$ are the classical generators, then $v_1 = f_1^2$ , $v_2 = f_1 \circ f_2$ and $v_3 = f_2$ . The attractor $\Lambda _{b_{\mathrm {bad}}}$ of $b_{\mathrm {bad}}$ is therefore $[0,1]\times C_{1/3}$ (for example, using Proposition B.3). Our level set is $L = \{x-y=0\}$ so the support of our conditional measure is $L \cap \Lambda _{b_{\mathrm {bad}}} = \{(x,x) : x \in C_{1/3}\}$ .
The expanding dynamics of $b_{\mathrm {bad}}$ is just the tripling map $\kappa (x) = 3x \bmod 1$ , which has $C_{1/3}$ as an invariant set. This means $C_{1/3} \times [0,1]$ is a proper invariant closed subset for the full baker’s map. Our conditional measure’s support $L \cap \Lambda _{b_{\mathrm {bad}}}$ is contained in this set and so for all time, $b_{\mathrm {bad}}^n(L \cap \Lambda _{b_{\mathrm {bad}}}) \subseteq C_{1/3} \times [0,1]$ . We note that $p = (\tfrac {1}{2},\tfrac {1}{3})$ lies in $\Lambda _{b_{\mathrm {bad}}} = [0,1]\times C_{1/3}$ , but $d(p,b_{\mathrm {bad}}^n(L \cap \Lambda _{b_{\mathrm {bad}}})) \geq d(p,C_{1/3}\times [0,1]) = 1/6$ , so $b_{\mathrm {bad}}^n(L \cap \Lambda _{b_{\mathrm {bad}}})$ cannot converge in Hausdorff distance to $\Lambda _{b_{\mathrm {bad}}}$ . By the contrapositive of Proposition 2.2, no conditional mixing holds.
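The ingredients of this proof can be checked in exact rational arithmetic. The sketch below tests, on the endpoints of a finite-level approximation of $C_{1/3}$ , that the contractions $v_1, v_2, v_3$ and the tripling map preserve the Cantor set, and that the distance from $x = \tfrac 12$ to the Cantor set is $1/6$ . The helper names and the choice of depth are illustrative assumptions; a finite check of this kind illustrates the argument but does not replace it.

```python
from fractions import Fraction

def cantor_endpoints(depth):
    """Endpoints of the level-`depth` intervals of the middle-thirds Cantor set.

    Built by applying f_1(x) = x/3 and f_2(x) = (x+2)/3 repeatedly to {0, 1};
    every point produced this way lies in the Cantor set itself.
    """
    pts = {Fraction(0), Fraction(1)}
    for _ in range(depth):
        pts = {x / 3 for x in pts} | {(x + 2) / 3 for x in pts}
    return sorted(pts)

def in_cantor_level(x, depth):
    """Exact test that x lies in the level-`depth` approximation of the Cantor set."""
    for _ in range(depth):
        if x <= Fraction(1, 3):
            x = 3 * x
        elif x >= Fraction(2, 3):
            x = 3 * x - 2
        else:
            return False  # x fell into a removed open middle third
    return True

def tripling(x):
    """The expanding dynamics kappa(x) = 3x mod 1."""
    return (3 * x) % 1

# The three contractions of b_bad: v_1 = f_1 o f_1, v_2 = f_1 o f_2, v_3 = f_2.
v = [lambda y: y / 9, lambda y: (y + 2) / 9, lambda y: (y + 2) / 3]

pts = cantor_endpoints(6)
contractions_preserve = all(in_cantor_level(vi(x), 8) for vi in v for x in pts)
tripling_preserves = all(in_cantor_level(tripling(x), 5) for x in pts)
distance_from_half = min(abs(x - Fraction(1, 2)) for x in pts)
```

Since the nearest points of $C_{1/3}$ to $\tfrac 12$ are $\tfrac 13$ and $\tfrac 23$ , the computed distance is exactly $1/6$ .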
Nevertheless, if any nonlinear perturbation of either the level set or any of the $v_j$ is made, Theorem 4.2 will apply and conditional mixing will manifest. Conditional mixing relies on the lack of joint algebraic structure, rather than simply on regularizing properties.
5 Numerical example: Lozi map
While baker’s maps’ special structure (for example, as skew products) allows us to find rigorous results for them by borrowing existing theory, it is not immediately clear how to mathematically generalize our results. To study maps with less structure we now therefore turn to rigorously justified numerics, and consider the commonly studied and numerically amenable class of Lozi maps. These are piecewise hyperbolic affine maps $f: \mathbb {R}^2 \to \mathbb {R}^2$ with
For $a \in (1,2)$ and $b \in (0,\min \{a-1,4-2a\})$ the Lozi map f has chaotic dynamics on a compact region in phase space: when additionally $b \in (0, \sqrt {2}(a - \sqrt {2}))$ this has a single mixing SRB measure [Reference Misiurewicz16, Theorem 5]. All unstable manifolds have positive length when $b \in (0, a - \sqrt {2})$ [Reference Young, Hunt, Li, Kennedy and Nusse25, Theorem]. A Lozi attractor is shown in Figure 1.
Lozi maps are continuous, with a jump in the Jacobian across the singularity set $\mathcal {S} = \{x = 0\}$ .
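For concreteness, a minimal sketch of the Lozi map, assuming its standard form $f(x,y) = (1 + y - a|x|, bx)$ ; the exact display used in this paper is an assumption. The parameters $a = 1.7$ , $b = 0.2$ are an illustrative choice satisfying all of the parameter conditions quoted above, and the function names are hypothetical.

```python
def lozi(x, y, a=1.7, b=0.2):
    """One step of the Lozi map, assumed here to be f(x, y) = (1 + y - a*|x|, b*x)."""
    return 1.0 + y - a * abs(x), b * x

def orbit(x, y, n, a=1.7, b=0.2):
    """Return the first n iterates of (x, y) under the Lozi map as a list of points."""
    pts = []
    for _ in range(n):
        x, y = lozi(x, y, a, b)
        pts.append((x, y))
    return pts

# A long orbit started at the origin stays in a bounded region of the plane.
pts = orbit(0.0, 0.0, 10_000)
max_abs_x = max(abs(px) for px, _ in pts)
max_abs_y = max(abs(py) for _, py in pts)
```

Plain floating-point iteration of this kind is adequate for plotting an attractor but, because the dynamics is chaotic, it does not by itself give validated samples of the quantities studied below.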
In a similar fashion to those we defined for the baker’s map (Proposition 4.1), conditional measures of the SRB measure $\mathrm {d}\rho (x,y)$ on sets $\ell _{x_0} := \{x=x_0\}$ are well defined for all $x_0 \in \mathbb {R}$ intersecting the support of the Lozi attractor [Reference Wormell24, Theorem 2.1]. Let us denote these conditional measures by $\mathrm {d}\rho (y \mid x_0)$ . These conditional measures can be expected to have Hausdorff dimension strictly between $0$ and $1$ : in particular, they lack any manifold structure. A histogram of $\mathrm {d}\rho (y \mid 0) = \mathrm {d}\rho (y \mid (x,y) \in \mathcal {S})$ is plotted in Figure 1: the linear response for the Lozi map is determined from the mixing properties of the conditional measure on $\mathcal {S}$ [Reference Wormell24], so we will be most interested in this particular conditional measure.
We specifically conjecture that measures $\rho (\cdot \mid x_0)$ , when pushed forward under the Lozi map f, converge back to the full SRB measure $\rho $ , and that this convergence happens at an exponential rate.
Conjecture 5.1. For generic Lozi parameters $(a,b)$ and Lebesgue-almost all $x_0 \in \mathbb {R}$ , the Lozi map has conditional mixing with respect to level curves $x = x_0$ (that is, the measures $\rho (\cdot \mid x_0)$ ).
We have strong and direct numerical evidence in favour of this conjecture: Figure 7 shows exponential decay of the correlation
by four orders of magnitude, with reliable error quantification. In fact, it seems that A need only be piecewise $C^1$ . The consequence of Conjecture 5.1 holding on the singular line $x=0$ (up to a technical generalization to second-order mixing), as we will show in [Reference Wormell24, Theorem 2.3], is that the Lozi map has a formal linear response to bounded dynamical perturbations [Reference Wormell24].
It should be noted that obtaining valid samples of the quantities in (9) is tricky. To begin with, we are sampling from the conditional measure $\rho (\cdot \mid x_0)$ , which is a codimension-one object on the attractor. We then must iterate forward under the Lozi map, which is chaotic and thus unstable. To perform this sampling rigorously, and quantify the associated numerical error, we have developed novel algorithms, presented in Appendix A. These algorithms could, we imagine, with care be extended to general hyperbolic dynamics.
6 Conclusion
While in a numerical example, one can only consider a particular case, our results on baker’s maps suggest that conditional mixing is a very robust property. Indeed, Theorem 4.2 shows that for an open dense set of such baker’s maps (perhaps all baker’s maps), conditional mixing holds on an open dense set of analytic curves. While these skew product maps are rather special, it seems from the Fourier dimension results that conditional mixing occurs when structure is broken, and so the situation might yet be better for conditional mixing in more general maps. To this end, we make the following conjecture.
Conjecture 6.1. For all analytic Anosov diffeomorphisms on compact surfaces, conditional mixing holds on an open dense set of functions H with no critical points on their zero level set.
On the other hand, it is interesting that the exponential rates of decay for the conditional measures are substantially slower than for smooth observables against the full SRB measure: for example, in Figure 7 the rate of exponential convergence for a smooth observable $A(x,y) = x$ is much slower when the initial measure $\mu $ is a conditional measure than for the full SRB measure $\mu \sim \rho $ . Furthermore, it appears, at least for the baker’s map in some cases, that this decay rate is independent of the conditioning submanifold (see discussion in Appendix B). In a related fashion, mixing rates against the SRB measure, which depend on the essential spectrum of the transfer operator in relevant function spaces, tend to be slower for smaller Hölder regularities of observables [Reference Baladi1]: thinking of the conditional measure $\rho (x\mid H(x) = 0)$ as equivalent to the SRB measure multiplied by a distribution $\delta (H(x))$ provides some connection between these two phenomena. Therefore, while the main connection we have seen appears to be to Fourier dimension, perhaps an appropriate functional analytic approach could yield fruit in studying this basic property of a chaotic system.
One might also ask what happens when the codimension of the conditioning submanifold is increased (in our study it has always been one). This is natural for the Bayesian filter problem since one typically makes repeated observations when observing a system [Reference Doucet, De Freitas and Gordon12, Reference Oljača, Kuna and Bröcker20]: we expect to end up with, say, a vector of one-dimensional observations $y = (H(x), H(f(x)), H(f^2(x)),\ldots , H(f^m(x)))$ . If this dimension m is no greater than the unstable dimension (that is, the number of positive Lyapunov exponents), then the construction of the conditional measure as in Proposition 4.1 and [Reference Wormell24, Theorem 2.1] will go through, and we might feel empowered to say that we expect conditional mixing to hold generically. On the other hand, if m is more than twice the box-counting dimension of the attractor, then the probabilistic Takens embedding theorem would tell us that y specifies x exactly $\rho $ -almost surely [Reference Barański, Gutman and Śpiewak5]. In the intermediate case where m lies between the unstable dimension and the attractor dimension, it could be that some kind of generic intersection property à la blenders [Reference Bonatti, Crovisier, Díaz and Wilkinson6] holds to give a conditional set of positive fractal dimension, which could also allow for some kind of conditional mixing. Numerical study in higher-dimensional systems may shed light on the situation.
Acknowledgements
This research has been supported by the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement no. 787304), as well as by the ETH Zurich Institute for Theoretical Studies. The author thanks Viviane Baladi for her comments on early stages of the manuscript. The datasets generated and/or analysed during the current study are available from the author on reasonable request. The author has no conflicts of interest to declare.
A Appendix. Simulation of Lozi map dynamics
Rather than attempt to compute deterministic estimates of these systems we will proceed by Birkhoff Monte Carlo sampling of the quantities we are interested in. Because we need to sample from measures $\rho (\cdot \mid x_0)$ conditioned on a codimension-one manifold, it will be necessary to simulate not point dynamics but dynamics on sets of higher dimensions, the natural choice being local unstable manifolds. Helpfully, because the Lozi map is piecewise affine, local unstable manifolds are straight-line segments. A ‘segment dynamics’ is proposed in Appendix A.1 and a numerical implementation given in Appendix A.2.
However, we would like to be sure we are not merely perceiving artefacts of sampling error or numerical imprecision: to achieve this, we will also need to quantify the statistical and deterministic errors associated with our numerical simulations. Appendix A.3 gives, surprisingly, a stable algorithm to simulate the chaotic Lozi dynamics that is compatible with validated interval arithmetic, and Appendix A.4 explains the quantification of random sampling errors.
A.1 Segment dynamics
Let $\Lambda $ be the attractor of the Lozi map f. For points $p \in \Lambda $ whose orbits do not intersect the singular line $\mathcal {S}:= \{x = 0\}$ , define the local unstable manifold of p to be
where $\sigma _{(x,y)} := \operatorname {\mathrm {sign}} x$ . These are segments of the full unstable manifolds which have always remained on the same side of the singular line $\mathcal {S}$ .
Let $\vec {\mathcal {G}}$ be the set of directed open line segments in $\mathbb {R}^2$ , that is, open intervals where starting points and endpoints are distinguished. Then we can define a set of directed local unstable manifolds
which captures $\rho $ -almost all local unstable manifolds, since Lozi maps are piecewise affine, and almost all unstable manifolds are of positive length.
Let us also define the following product space
which we will use to parametrize each $\vec I \in \vec {\mathcal {L}}$ . $\vec \Lambda $ is almost everywhere a two-to-one cover of $\Lambda $ by the map $\pi (\vec I_{p,q},t) = (1-t)p+tq$ , where we denote the directed segment from point p to point q by $\vec I_{p,q}$ .
As a result, up to a set of $\rho $ -measure zero, we can lift the f-dynamics to $\vec \Lambda $ , by a map of the form
where $\lambda _{\vec I}: [0,1] \circlearrowleft $ is a full-branch expanding interval map. Let us define this a little more explicitly.
When $\vec I_{p,q} \cap \mathcal {S}$ is non-empty, we know it has exactly one element which we denote by $s \in \mathcal {S}$ with $s = \pi (\vec I_{p,q},t_*)$ for some $t_* \in (0,1)$ . The segment dynamics $\vec f$ can then be written explicitly as
It turns out that the SRB measure $\rho $ can also be lifted to an invariant measure of $\vec f$ by
Because $\pi ^{-1}$ is two-to-one $\rho $ -almost everywhere we have that $(\vec \Lambda , \vec f, \vec \rho )$ has at most two ergodic components (which are identical up to reversing the direction of the segments), and we will be able to sample $\vec \rho $ by iterating $\vec f$ .
In defining a stable numerical method it will be useful to us that almost every point’s local unstable manifold has endpoints originating from the singular line.
Proposition A.1. The set
has full $\vec \rho $ measure.
This proposition is proved in Appendix E.
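The lifted map of this subsection can be illustrated with a short sketch. It is not the implementation used for the figures: it assumes the standard Lozi form $f(x,y) = (1 - a|x| + y, bx)$ , which may differ from the normalisation used in the main text by a change of coordinates, and the names `lozi` and `segment_step` are hypothetical. A segment crossing $\mathcal {S}$ at parameter $t_*$ is split, the child containing $\pi (\vec I,t)$ is kept, and t is rescaled by the corresponding full branch.

```python
def lozi(p, a=1.7, b=0.5):
    """Standard Lozi map (assumed form): (x, y) -> (1 - a|x| + y, b x)."""
    x, y = p
    return (1.0 - a * abs(x) + y, b * x)


def segment_step(p, q, t, a=1.7, b=0.5):
    """One step of the lifted dynamics on (directed segment p -> q, t in (0, 1)).

    Returns the endpoints of the image segment and the rescaled parameter t.
    """
    (px, py), (qx, qy) = p, q
    if px * qx < 0.0:  # endpoints lie on opposite sides of the line x = 0
        t_star = px / (px - qx)  # solves (1 - t) px + t qx = 0
        s = (0.0, (1.0 - t_star) * py + t_star * qy)
        if t < t_star:  # keep the child running from p to s
            p_new, q_new, t_new = p, s, t / t_star
        else:  # keep the child running from s to q
            p_new, q_new, t_new = s, q, (t - t_star) / (1.0 - t_star)
    else:  # no cut: the whole segment is mapped affinely
        p_new, q_new, t_new = p, q, t
    return lozi(p_new, a, b), lozi(q_new, a, b), t_new
```

Because f is affine on each side of the singular line, the point at parameter `t_new` on the returned segment is the image under `lozi` of the point at parameter `t` on the input segment, which is the covering property of $\pi $ described above.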
A.2 Simulation of segment dynamics
Given our segment dynamics, we can sample the conditional measures using $\rho $ via the functions $\kappa _{x_0}(\vec I) := \vec I \cap \{x = x_0\}$ and $\ell (\vec I) = |\vec I|$ , the length of the segment. Because the SRB measure is uniformly distributed along unstable manifolds, we have that
assuming that no contribution to the sum is made when $\kappa _{x_0}(\vec I) = \emptyset $ (that is, $\vec I$ does not intersect with the line we want to sample a conditional measure from). Then for $\vec {\rho }$ -almost-all starting values $(\vec I_0,t_0)$ we can estimate these expectations via a Birkhoff sum
with the segment dynamics that can be simulated using (A.2). The convergence (A.4) for the $\Psi $ we are interested in uses the Birkhoff ergodic theorem and the fact that $\kappa _{x_0}, \ell $ do not depend on the direction of the interval (so the ergodic component of $\vec \Lambda $ we sample from is immaterial).
Because from (A.2) the t dynamics is generated by full-branch interval maps that preserve Lebesgue measure, the branch dynamics is Markovian, with transition probabilities that are explicitly given. Because of this, the random dynamics
generates the $\vec f$ dynamics at equilibrium, where the $T_n$ are uniformly distributed, dependent hidden variables which are given by
To sample the segments $\vec I$ we do not actually need to know the $T_n$ , but they can be reconstructed from a time series of the segments by sampling the final value $T_{n_{\mathrm {final}}} \sim \mathrm {Uniform}(0,1)$ and iterating backwards according to (A.2): inverting the $\lambda _{\vec I}$ yields a contraction. This means we do not have to directly simulate an expanding map (which would have been problematic for rigorously validated simulation of the dynamics). In practice this is a very effective way to simulate the segment dynamics.
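The backward reconstruction of the hidden variables can be sketched as follows. This is an illustration under stated assumptions rather than the author's code: it assumes each step records the cut parameter $t_*$ (or `None` when the segment was not cut) together with which child was kept, and the names `forward_t` and `reconstruct_t` are hypothetical.

```python
import random


def forward_t(t, t_star):
    """Full-branch map on [0, 1] induced by a cut at t_star (None = no cut)."""
    if t_star is None:
        return t
    return t / t_star if t < t_star else (t - t_star) / (1.0 - t_star)


def reconstruct_t(cuts, branches, t_final=None, rng=random):
    """Recover hidden variables T_0, ..., T_N from recorded cuts and branches.

    cuts[n] is the cut parameter t_* at step n (None if there was no cut);
    branches[n] is 0 if the child starting at p was kept, and 1 otherwise.
    """
    t = rng.random() if t_final is None else t_final
    ts = [t]
    for t_star, branch in zip(reversed(cuts), reversed(branches)):
        if t_star is not None:
            # Inverse branch of the expanding map: a contraction on [0, 1].
            t = t_star * t if branch == 0 else t_star + (1.0 - t_star) * t
        ts.append(t)
    ts.reverse()
    return ts
```

Each backward step multiplies any error in t by $t_*$ or $1 - t_*$ , both less than one, which is why this direction is the numerically benign one.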
A.3 Validated numerical implementation of segment dynamics
However, computers can only encode real numbers to finite precision. Thus, at every computation step the results must be rounded to a given tolerance, introducing small errors, which may invalidate fine numerical results such as we wish to obtain. Validated interval arithmetic provides a vehicle to quantify the errors, but to use it we must first produce a deterministically stable algorithm to simulate a generic long chaotic time series. In this subsection we will present such an algorithm, introducing first the notion of interval arithmetic.
Let the set of closed intervals in $\mathbb {R}$ be $\mathcal {I}$ . The idea of validated interval arithmetic is to represent real numbers $\alpha $ by an interval $\mathfrak {a} \in \mathcal {I}$ such that we know $\alpha \in \mathfrak {a} \subset \mathbb {R}$ . Such an interval $\mathfrak {a}$ is given by its upper and lower bounds, and we can restrict the set of allowed intervals so these upper and lower bounds are representable in the finite-precision computer encoding. A function $g: \mathbb {R}^d \to \mathbb {R}^e$ can be implemented in validated arithmetic through a function $\mathfrak {g}: \mathcal {I}^d \to \mathcal {I}^e$ such that $\mathfrak {g}(\mathfrak {a})$ will always contain $g(\mathfrak {a})$ . One can thus be absolutely certain that $\alpha $ is contained in a set $\mathfrak {g}(\mathfrak {a})$ and so on.
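To make the enclosure property concrete, here is a minimal sketch of interval addition and multiplication with outward rounding. It is a toy model, not a substitute for a validated-arithmetic library: the class name `Interval` is hypothetical, it relies on `math.nextafter` (Python 3.9 or later), and it assumes no overflow occurs.

```python
import math
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float

    def __add__(self, other):
        # Round-to-nearest is within half an ulp of the exact value, so
        # stepping one float outwards on each side encloses the exact result.
        return Interval(math.nextafter(self.lo + other.lo, -math.inf),
                        math.nextafter(self.hi + other.hi, math.inf))

    def __mul__(self, other):
        products = [self.lo * other.lo, self.lo * other.hi,
                    self.hi * other.lo, self.hi * other.hi]
        return Interval(math.nextafter(min(products), -math.inf),
                        math.nextafter(max(products), math.inf))

    def contains(self, value):
        # Comparisons between float and Fraction are exact in Python.
        return self.lo <= value <= self.hi


# The doubles nearest to 0.1 and 0.2 have an exact rational sum and product
# that need not be representable, but each is enclosed by the interval result.
x = Interval(0.1, 0.1) + Interval(0.2, 0.2)
exact_sum = Fraction(0.1) + Fraction(0.2)
y = Interval(0.1, 0.1) * Interval(0.2, 0.2)
exact_product = Fraction(0.1) * Fraction(0.2)
```

The intervals widen by at most one float on each side per operation, so the growth of their width over a long computation is governed by the stability of the underlying algorithm.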
The unstable manifolds $\vec I_{p,q}$ are defined by their endpoints, which update under the chaotic dynamics f. By definition, f is exponentially stretching, making it very difficult in general to obtain a rigorously validated time series. However, the fact that we are constantly resetting the segment endpoints to the singular line (A.2) makes efficiently obtaining such a time series quite possible.
We will store our segments $\vec I_{p,q}$ as $Q \vec I_{p',q'}$ , where Q is an orthogonal transformation of $\mathbb {R}^2$ (that is, a rotation matrix), and $p',q' \in \mathbb {R}^2$ have identical second coordinate. (Of course, a segment can be stored as a $2\times 2$ matrix of its endpoints’ coordinates.) Thus, Q rotates the phase space so that the unstable direction on the segment is along the first coordinate. If our segment at the next step is $\vec I_{p_1,q_1} = Q_1 \vec I_{p_1',q_1'}$ , we do not compute the quantities on the right-hand side from f explicitly, but rather, since we have on $f^{-1} (\vec I_{p_1,q_1})$ that the Lozi map is affine, having for some J that
we make the QR decomposition
where $Q_1$ is a rotation matrix and $R_1$ is upper triangular, and set
This means the dynamics in the endpoints’ shared second coordinate is contracting (and thus numerically stable), as in fact is the dynamics of Q, and that except for the (discrete) choice of branch $\mathcal {M}_{\pm }$ , these are both independent of the points’ first coordinates. This means the second coordinate as well as the rotation matrix Q can be stably approximated in interval arithmetic. (It is, however, necessary to explicitly code the QR decomposition (A.6) in a way optimized for interval arithmetic, as many standard qr routines give sub-par interval bounds that will lead to numerical blow-up of the algorithm.)
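As an illustration of coding the factorisation explicitly, the following sketch computes the QR decomposition of a $2\times 2$ matrix with Q a rotation, using a single Givens rotation. It works in ordinary floating point; the assumption is that a validated version would replace each arithmetic operation with its interval counterpart. The function name `qr_2x2` is hypothetical.

```python
import math


def qr_2x2(m):
    """QR factorisation of a 2x2 matrix with Q a rotation (Givens form).

    m is ((a, b), (c, d)); returns (Q, R) with Q R = m and R upper triangular.
    """
    (a, b), (c, d) = m
    r = math.hypot(a, c)
    if r == 0.0:
        raise ValueError("first column is zero; the rotation is undetermined")
    cs, sn = a / r, c / r
    q = ((cs, -sn), (sn, cs))
    # R = Q^T m, written out so that the lower-left entry is exactly zero.
    r_mat = ((r, cs * b + sn * d), (0.0, cs * d - sn * b))
    return q, r_mat
```

Writing the entries of R in closed form avoids the cancellation that a general-purpose routine would incur in the lower-left entry, which is one plausible source of the poor interval bounds mentioned above.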
The first coordinates of $p'$ and $q'$ , on the other hand, have expanding dynamics. However, when the segment $\vec I_{p,q}$ is cut by the singular line at a point s, we can define $s' = Q^\top s$ without reference to these first coordinates. In particular, the singular line in the transformed coordinates is $Q^\top \mathcal {S}$ which solves some equation $x' = \beta y'$ , with $\beta $ bounded because unstable manifolds are uniformly transversal to the singular line [Reference Young, Hunt, Li, Kennedy and Nusse25]. Then, if the shared second coordinate of $p', q'$ is $y'$ , we can write
Notably, this point $s'$ is generated using only quantities whose numerical error remains stable. It thus replaces either one of $p, q$ which contain dynamics where the error grows. By Proposition A.1, almost every $p, q$ will eventually be replaced by such an s, resetting the size of its error and therefore ensuring it does not grow too big.
Implementation of the algorithm described above in validated interval arithmetic is straightforward, because it is stable, but here we must be careful: when our local unstable manifold is split and the choice of child manifold is to be made, the $t_*$ used to determine the choice is interval-valued (that is, in this set $\mathcal {I}$ and likely of positive width). The natural way to choose the branch to continue with is to sample a uniform random variable U and compare it with $t_*$ : the choice of segment is then clear except where there is an overlap between U and $t_*$ . A simple way to deal with this problem is to choose the floating-point precision high enough to make an overlap unlikely enough to invite references to the age of the universe (twice the bits of the standard double-precision is enough). More comprehensive handling of this eschatological edge case may be done in various ways, including using importance sampling on multiple time series. (One compares U with some real number $t_{**} \in t_*$ and reweights the time series by $t_*/t_{**}$ or $(1-t_*)/(1-t_{**})$ as appropriate; note that the weights are also interval-valued.)
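The two strategies for the branch choice can be sketched as below. This is a simplified illustration: the enclosure of $t_*$ is passed as a pair of floats, the weight bounds are computed without outward rounding, taking $t_{**}$ to be the midpoint of the enclosure is an arbitrary choice, and both function names are hypothetical.

```python
def choose_branch(u, t_lo, t_hi):
    """Compare a uniform sample u with an enclosure [t_lo, t_hi] of t_*.

    Returns 0 (keep the child starting at p), 1 (keep the child ending at q),
    or None when u falls inside the enclosure and the comparison is undecided.
    """
    if u < t_lo:
        return 0
    if u > t_hi:
        return 1
    return None


def choose_branch_weighted(u, t_lo, t_hi):
    """Always decide, and return an interval-valued importance weight."""
    t_mid = 0.5 * (t_lo + t_hi)  # a representative t_** inside the enclosure
    if u < t_mid:
        return 0, (t_lo / t_mid, t_hi / t_mid)  # encloses t_* / t_**
    # encloses (1 - t_*) / (1 - t_**)
    return 1, ((1.0 - t_hi) / (1.0 - t_mid), (1.0 - t_lo) / (1.0 - t_mid))
```

In the weighted variant the weight interval always contains 1, and its width shrinks with the width of the enclosure of $t_*$ .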
We will therefore be able to find an interval hypercube containing an exact time series $\{\vec f^n(\vec I,t_0)\}_{n = 0,\ldots ,N}$ from (A.5), where $t_0$ is implicitly defined by the random selection when the segment is cut.
A.4 Quantification of statistical error
As the $\vec f$ dynamics we sample is just a two-to-one lift from the f dynamics, which have a spectral gap [Reference Baladi and Gouëzel4], we expect that for large N the error between Birkhoff means and true expectations (A.4) obeys a central limit theorem [Reference Chernov9]. If we have several long time series
for R independent samples $\vec I^{(r)} \sim \vec \rho $ , then for sufficiently large N, the sample mean $\bar \Psi $ of the $\Psi ^{(r)}$ will have expectation $\vec \rho (\Psi )$ (from the initialization of the time series at equilibrium), and will differ from this by an error of size $\mathcal {O}(1/\sqrt {RN})$ . Helpfully, we can quantify this deviation a posteriori: using the Gaussian behaviour of the $\Psi ^{(r)}$ we have that if $s_\Psi ^2$ is the sample variance, then for large N we have
where $t_{R-1}$ is Student’s t distribution with $R-1$ degrees of freedom. This allows us to put confidence intervals on our estimates, as in Figure 7. Such a principle has been used to test for linear response in previous work [Reference Gottwald, Wormell and Wouters13].
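A confidence interval of this kind can be computed from the R replicate Birkhoff means as in the sketch below. It assumes the usual studentised statistic, that is, that $(\bar \Psi - \vec \rho (\Psi ))/(s_\Psi /\sqrt {R})$ is approximately $t_{R-1}$ -distributed; the function name is hypothetical, and the critical value is supplied by the caller rather than computed.

```python
import math
import statistics


def t_confidence_interval(replicate_means, t_crit):
    """Two-sided confidence interval from R independent Birkhoff means.

    t_crit is the relevant quantile of Student's t distribution with R - 1
    degrees of freedom (about 2.262 for R = 10 at the 95% level).
    """
    r = len(replicate_means)
    mean = statistics.fmean(replicate_means)
    s = statistics.stdev(replicate_means)  # sample std. dev. (R - 1 divisor)
    half_width = t_crit * s / math.sqrt(r)
    return mean - half_width, mean + half_width
```

The replicates must come from independent time series for the t approximation to be reasonable; correlations within a single time series are absorbed into each replicate mean.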
To obtain an accurate sample from $\vec \rho $ in initializing our time series we begin with $(\vec I_{p_0,f(p_0)},t^{(r)}) \in \vec \Lambda $ , where $p_0 = ({2}/({2 + a - a^2 + 4b}),0)$ and $t^{(r)} \sim \textrm {Uniform}(0,1)$ (in fact, implicitly using the above random choice methods). This initial measure lies in a Banach space whose elements converge exponentially quickly to the physical measure under the dynamics [Reference Demers and Liverani10] and so, by making the spin-up time $n_{\mathrm {init}}$ sufficiently long, we can ensure that our sampling initializations $\vec I^{(r)} = \vec f^{n_{\mathrm {init}}}(\vec I_{p_0,f(p_0)},t^{(r)})$ come from a distribution exponentially close to $\vec \rho $ .
B Appendix. Proof of baker’s map result
In this appendix we will prove Theorem 4.2 on exponential conditional mixing for baker’s maps of the form (6).
For conciseness when quantitatively referring to Fourier dimension, let us say that a measure–function pair $(\nu ,\psi )$ has $(\eta ,C)$ Fourier decay if for all $j \in \mathbb {Z} \backslash \{0\}$ ,
and $\int _0^1 |\mathrm {d}\nu | \leq C$ . This implies that the Fourier dimension of $\psi _*\nu $ is at least equal to $2\eta $ .
Fourier decay is invariant under translations of $\psi $ .
Proposition B.1. Suppose that $(\nu ,\psi )$ has $(\eta ,C)$ Fourier decay. Then for all $t \in \mathbb {R}$ , so does $(\nu ,\psi + t)$ .
Proof. We have
and the integral of $|\nu |$ remains no greater than C, as required.
To prove Theorem 4.2 we will employ a separate Fourier dimension theorem [Reference Mosquera and Shmerkin19, Reference Sahlsten and Stevens22] for each of the conditions in the theorem’s statement. The common component is the following lemma (into which any new Fourier dimension results may also be substituted).
Lemma B.2. Suppose one has a modified baker’s map b with contracting maps $v_i$ . Let $\nu _0$ be the probability measure such that $v_i^* \nu _0 = k^{-1} \nu _0$ for all $i = 1,\ldots , k$ , and let $\psi \in C^2$ be such that $\psi ' \neq 0$ and $(\nu _0,\psi )$ has $(\eta ,C)$ Fourier decay.
Let $\gamma \in (1-\eta ,1)$ , $\beta \in (0,1]$ and $\alpha \in (2-\eta -\gamma ,1)$ .
Then there exist $\xi \in (0,1)$ and $C'$ depending only on $C,\eta ,\alpha ,\beta ,\gamma ,\psi '$ such that for all $A \in C^{\alpha ;\beta }$ and $B \in C^{\gamma }$ ,
where $\rho _0$ is defined as in Theorem 4.2.
Proof of Theorem 4.2
The measures $\nu _0$ from Lemma B.2 are the measures of maximal entropy of these expanding iterated function schemes: in particular, they are Gibbs (with constant weights) and atomless. If we have that $(\nu _0,\psi )$ has Fourier decay then so do the $(\nu _0,\psi + t)$ , uniformly in t, from Proposition B.1, and Lemma B.2 then secures us the theorem. Since $\nu _0$ is a probability measure, it is only necessary to check that (B.1) holds, which we do procedurally from existing results.
That (B.1) holds for option III is a simple application of [Reference Mosquera and Shmerkin19, Theorem 3.1] (and in fact here, $\eta $ is independent of $\psi $ ).
To see this for options I and II requires a little more cunning. We have that $\psi $ is a diffeomorphism onto its image. Let $\omega (x) = \omega _0 + \omega _1 x$ map $\psi ([0,1])$ linearly onto $[0,1]$ , so that $\tilde \psi = \omega \circ \psi : [0,1]\circlearrowleft $ is a diffeomorphism. If $\{v_{\mathbf {i}}\}_{\mathbf {i} \in \{1,\ldots , k\}^n}$ are n-fold compositions of the contractions $v_{i}$ then for some large enough n, the n-fold compositions $\{\tilde \psi \circ v_{\mathbf {i}} \circ \tilde \psi ^{-1}\}_{\mathbf {i} \in \{1,\ldots , k\}^n}$ are uniformly contracting. They are also totally nonlinear and $C^2$ with bounded distortion. If option I holds then their ranges fill $[0,1]$ , and if II holds then they are analytic. Furthermore, under either option they remain totally nonlinear. By [Reference Sahlsten and Stevens22, Theorem 1.1] their measure of maximal entropy $\tilde \nu _0$ (which is Gibbs and atomless) therefore has polynomial decay of its Fourier transform, that is, for all $l \in \mathbb {R}\backslash \{0\}$ ,
for some $\eta>0$ and $C<\infty $ .
Now, this measure $\tilde \nu _0$ is also the measure of maximal entropy of the conjugated single iterates $\{ \tilde \psi \circ v_i \circ \tilde \psi ^{-1} \}_{i = 1,\ldots ,k}$ ; from the conjugacy we therefore know that $\tilde \nu _0 = \tilde \psi ^* \nu _0$ . Hence,
so, setting $l = j/\omega _1$ , we obtain that
as required.
The relevance of $\nu _0$ is that it is the cross-section of the SRB measure along lines of constant x (that is, local stable manifolds).
Proposition B.3. Let $\nu _0$ be as in Lemma B.2. Then $\rho = \operatorname {\mathrm {Leb}} \times \nu _0$ is the SRB measure of b.
Henceforth we will find it useful to write the unstable dynamics as $\kappa (x) = kx \,\mod \! 1$ .
Proof of Proposition B.3
$\rho $ is conditionally absolutely continuous along unstable manifolds (which are lines of fixed y), and solves
Hence, it is an SRB measure.
This allows us to prove the existence of our conditional measures $\rho _t$ .
Proof of Proposition 4.1
To prove (7), noticing that $\rho $ is just a product measure of the uniform measure in x (therefore t) and $\nu _0$ in y, we have that
where $\nu _0$ is defined in Lemma B.2. This integral is absolutely bounded by $\sup |A|$ , and so by the dominated convergence theorem,
This means we must define
and so get (7).
To prove the second part, we have that for any $E \subseteq D_\Psi $ ,
so
By a change of coordinates $(x,y) = \Psi (t,y) = (\psi (y)-t,y)$ , and using Proposition B.3, we have
The following technical lemmas will be of use in following proofs. We will prove them in Appendix C.
Lemma B.4. Suppose that for some $\alpha \in (0,1]$ , $\phi : [0,1] \to \mathbb {R}$ is a piecewise $\alpha $ -Hölder function with a finite number of jumps, and $\nu $ is an integrable atomless measure. Let $\hat \phi _j, \hat \nu _j$ be the respective Fourier coefficients of $\phi $ and $\nu $ . Then
provided the sum is absolutely convergent.
For $\alpha \in (0,1]$ let the Hölder semi-norm on a set $E \subseteq [0,1]$ be defined as follows:
Lemma B.5. Suppose $\phi :[0,1] \to \mathbb {R}$ is piecewise $\alpha $ -Hölder with jumps on a set $S \subset (0,1]$ (including at $1$ if it is not periodic). Then for $j \in \mathbb {Z} \backslash \{0\}$ ,
With these lemmas in hand, we will prove exponential conditional mixing in the projection of the baker’s map onto the x coordinate. This next lemma is the heart of the proof.
Lemma B.6. Suppose $\phi : \mathbb {R}/\mathbb {Z} \to \mathbb {R}$ is as in Lemma B.5, and $\nu $ is an integrable atomless measure with Fourier coefficients $\hat \nu _j$ such that $(\nu ,\operatorname {\mathrm {id}})$ has $(\eta , C_\nu )$ Fourier decay, and $\alpha> 1 - \eta $ . Then
Proof. Recall that $\kappa ^n(x) = k^n x\,\mod \! 1$ . The Fourier coefficients of $\phi \circ \kappa ^n$ are zero except for those whose indices are multiples of $k^n$ :
These decay as $\mathcal {O}(|j|^{-\alpha })$ , whereas the Fourier coefficients of $\nu $ are $\mathcal {O}(|j|^{-\eta })$ , so, since $\alpha + \eta> 1$ , we know their convolution is summable. By Lemma B.4 we therefore have
This means
Elementary inequalities on the fractions, and the zeroth Fourier coefficient’s definition as the total integral give the required result.
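The key structural fact in this proof, that $\phi \circ \kappa ^n$ has non-zero Fourier coefficients only at multiples of $k^n$ , can be checked numerically. The sketch below is an illustration only: it uses the test function $\phi (x) = \cos (2\pi x)$ with $k = 2$ and $n = 3$ , and approximates coefficients by a Riemann sum; these choices and the function names are assumptions made for the example.

```python
import cmath
import math


def fourier_coefficient(func, j, n_grid=4096):
    """Riemann-sum approximation of the j-th Fourier coefficient on [0, 1)."""
    total = 0.0 + 0.0j
    for i in range(n_grid):
        x = i / n_grid
        total += func(x) * cmath.exp(-2j * math.pi * j * x)
    return total / n_grid


k, n = 2, 3


def phi(x):
    return math.cos(2 * math.pi * x)


def phi_kappa_n(x):
    # kappa^n(x) = k^n x mod 1
    return phi((k ** n * x) % 1.0)
```

For this choice, `fourier_coefficient(phi_kappa_n, k ** n)` is close to one half, the first Fourier coefficient of $\phi $ , while coefficients at indices that are not multiples of $k^n$ vanish up to rounding.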
We now connect this one-dimensional picture in $\kappa $ to the two-dimensional picture of the baker’s map. In the next lemma we define a one-dimensional observable $A_{m,y_0}(x)$ that, as the subsequent proposition shows, closely approximates $A(b^m(x,y))$ for any y when m is large enough.
Lemma B.7. Suppose that $\nu , \alpha $ are as in Lemma B.6. Suppose that $A: D \to \mathbb {R}$ has $|A|_{\alpha ,x} < \infty $ and let
Then
Proof. It is clear that $A_{m,y_0}$ is piecewise $\alpha $ -Hölder with jumps at $S_m := \{i/k^m : i = 1,\ldots , k^m\}$ .
We can also bound its Hölder constant. Suppose $[x,z] \subset (0,1] \backslash S_m$ . This means that $b^l(x,y_0)$ and $b^l(z,y_0)$ lie on the same piece of b for all $0 \leq l < m$ , and therefore that $b^m(x,y_0)$ and $b^m(z,y_0)$ have the same y component. As a result,
From (B.4), this means that $|A_{m,y_0}|_{C^\alpha (S_m^c)} \leq |A|_{\alpha ,x} k^{m\alpha }$ .
Applying Lemma B.6, we get that
as required.
Proposition B.8. For all $\beta> 0$ , $m \in \mathbb {N}$ , A with finite $|\cdot |_{\beta ,y}$ norm, and $x,y,y_0 \in [0,1]$ ,
Proof. We prove this by induction on m. We have that
which is bounded for all $y,y_0 \in [0,1]$ by $|A|_{\beta ,y} |y - y_0|^\beta $ . Suppose then that our proposition holds for some m. Then for any $y \in [0,1]$ ,
where the map branch $i = \lceil kx \rceil $ . As a consequence, $A_{m+1,y_0}(x) = A_{m,v_i(y_0)}(\kappa (x))$ , and so
as required for the inductive step, where we used that the $v_i$ contract points by a factor of $\mu $ .
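The recursion used in the inductive step can be illustrated with a toy model. The sketch assumes, as the proof above suggests, that the modified baker’s map acts as $b(x,y) = (\kappa (x), v_i(y))$ with $i = \lceil kx \rceil $ ; the affine contractions `v` are a hypothetical choice made only for illustration (they are not totally nonlinear), and all function names are invented for this example.

```python
import math

K = 3  # number of branches (illustrative)


def v(i, y):
    """Hypothetical affine contraction v_i on [0, 1], contracting by 1/(2K)."""
    return (i - 1 + 0.5 * y) / K


def baker(x, y):
    """Assumed form of the modified baker's map on (0, 1] x [0, 1]."""
    i = max(1, math.ceil(K * x))  # branch index i = ceil(kx)
    # First entry is kx mod 1 on branch i, represented in (0, 1].
    return (K * x - (i - 1), v(i, y))


def iterate(x, y, m):
    for _ in range(m):
        x, y = baker(x, y)
    return x, y


def A_m(A, m, y0):
    """One-dimensional observable x -> A(b^m(x, y0))."""
    return lambda x: A(*iterate(x, y0, m))
```

With the observable $A(x,y) = y$ , the values of `A_m(A, m, y)` and `A_m(A, m, y0)` at the same x differ by the contraction factor to the power m times $|y - y_0|$ , which is the geometric decay in m that the proposition quantifies.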
We can then put Lemma B.7 and Proposition B.8 together to prove a primitive version of Lemma B.2.
Proposition B.9. Suppose $\alpha + \eta> 1$ . Then there exists $\xi < 1$ depending only on $\eta , k, \mu $ and $\beta $ and there also exists C such that if $\psi $ is $C^1$ with $\psi '=0$ on a finite set, and $\nu $ is an atomless measure such that $(\nu ,\psi )$ has $(\eta ,C_{\nu ,\psi })$ Fourier decay, then
Proof. We will divide $n = m + l$ and the difference in (B.5) up into several pieces that we will bound largely using previous results.
To begin with, we have as an application of Proposition B.8 that for any $y_0$ ,
Now, $\psi _*\nu $ is atomless with $(\psi _*\nu ,\operatorname {\mathrm {id}})$ having $(\eta ,C_{\nu ,\psi })$ Fourier decay, so we can apply Lemma B.7 to obtain
Next,
because from Proposition B.3, the SRB measure $\rho $ projects to Lebesgue measure in the x coordinate. With this, we have that
using Proposition B.8. Recalling also that $\rho $ is b-invariant, and pushing $\nu $ forward preserves its total integral, we can say that
Putting (B.6), (B.7) and (B.8) together, we have that
By setting $l = \lceil (1 - {\eta \log k}/({\beta \log \mu ^{-1} + (1+\eta ) \log k})) n \rceil $ , we obtain that there exists a constant C depending on $\alpha , \beta , \eta , \mu , k$ such that (B.5) holds with
At this point, if we set $B \equiv 1$ , we could prove Lemma B.2 already. However, to incorporate a general B we need to show that we can multiply $\nu _0$ by sufficiently smooth Hölder functions and still retain adequate Fourier decay.
Lemma B.10. Suppose that $\psi :[0,1]\circlearrowleft $ is a $C^1$ diffeomorphism onto its image, and $\nu $ is an atomless measure such that $(\nu ,\psi )$ has $(\eta , C_{\nu ,\psi })$ Fourier decay. Then for all $\gamma \in (1-\eta ,1]$ there exists C such that for all $B \in C^\gamma (D)$ ,
that is, if the measure $\mathrm {d}\sigma (y) := B(y,\psi ^{-1}(y))\mathrm {d}\nu (y)$ , then $(\sigma ,\psi )$ has $(\eta +\gamma -1, C_{\nu ,\psi } C)$ decay; furthermore, $\sigma $ is atomless.
Proof. We have that $\sigma $ is atomless because it is defined as an atomless measure multiplied by a bounded function.
Define a function $B_\psi \in C^\gamma ([0,1])$ such that $B_\psi (x) = B(\psi ^{-1}(x),x)$ on $\psi ([0,1])$ . It is possible to do this so that
where $C^{\prime }_{\psi '} = 1 + \|1/\psi '\|_{L^\infty } \geq 1$ , and $\|B_\psi \|_{L^\infty } \leq \|B\|_{L^\infty }$ .
By Lemma B.5 we have that $\hat b_l$ , the Fourier coefficients of $B_\psi $ , have a certain bound,
and so in particular for any j, the Fourier coefficients of $e^{2\pi i j\cdot } B_\psi $ , which are just shifts of those of $B_\psi $ , decay as $\mathcal {O}(l^{-\gamma })$ .
Therefore, we can apply Lemma B.4 to get that
and so
for some $C''$ depending on $\gamma , \eta $ , giving what is required.
This is all we need to prove Lemma B.2.
Proof of Lemma B.2
By the definition of $\rho _t$ in (B.2),
and so
where $\sigma := B(\cdot ,\psi ^{-1}(\cdot ))\nu _0$ .
Since by assumption, $(\nu _0,\psi )$ has $(\eta , C_{\nu _0,\psi })$ Fourier decay, $\sigma $ has $(\eta + \gamma - 1, C_{\nu _0,\psi } C)$ decay for some C depending on $\eta , \gamma , \psi $ as a result of Lemma B.10. From this, $\sigma $ is also atomless. An application of Proposition B.9 gives us our result.
Finally, the construction of $\rho $ also allows us to prove the corollary.
Proof of Corollary 4.3
Using Proposition 2.3, it is enough to prove that the SRB measure $\rho $ is lower-Ahlfors regular. Since $\rho $ is a Cartesian product of Lebesgue measure with the Gibbs measure $\nu _0$ , $\rho $ is lower-Ahlfors regular if $\nu _0$ is.
Fix $\delta>0$ , and let $M = \lceil {\log \delta }/{\log \mu } \rceil $ . For any y in the support of $\nu _0$ we have that there exists a composition of M contractions in $\{v_j\}$ , which we denote by $v_{\mathbf j}$ , such that $y = v_{\mathbf j}(y_M)$ for some $y_M \in [0,1]$ . Set inclusion tells us that we must therefore have $y \in v_{\mathbf j}([0,1])$ .
The uniform contraction of the $v_j$ by a factor of $\mu $ means that the diameter of this set $v_{\mathbf j}([0,1])$ must be bounded by $\mu ^M$ , which by construction is smaller than $\delta $ . Hence, $v_{\mathbf j}([0,1])$ must be contained in the ball $B(y,\delta )$ .
On the other hand, the $\nu _0$ -measure of this set $v_{\mathbf j}([0,1])$ can be given using the constitutive relation of $\nu _0$ as $k^{-M} \nu _0([0,1]) = k^{-M}$ .
We can therefore say that
as required for lower-Ahlfors regularity.
Since the constants here are independent of t, we obtain the uniform-in-t convergence required.
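The chain of inequalities in this proof can be checked numerically. The sketch below assumes $\delta , \mu \in (0,1)$ and uses the lower bound $k^{-M} \geq k^{-1} \delta ^{\log k/\log \mu ^{-1}}$ , which follows from $M \leq \log \delta /\log \mu + 1$ ; the function name and the sample parameter values are hypothetical.

```python
import math


def ahlfors_check(delta, mu, k):
    """Return (M, diameter bound mu^M, cylinder mass k^-M, lower bound).

    Assumes 0 < delta < 1 and 0 < mu < 1, so that M is a positive integer.
    """
    M = math.ceil(math.log(delta) / math.log(mu))
    dim = math.log(k) / math.log(1.0 / mu)  # exponent in the lower bound
    return M, mu ** M, k ** (-M), delta ** dim / k
```

For example, with `delta=0.01`, `mu=0.25` and `k=3` the function gives `M = 4`, a diameter bound below `delta`, and a cylinder mass above the power-law lower bound.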
C Appendix. Proofs of some integration lemmas
Here we collect some proofs of lemmas used in Appendix B involving integration.
Proof of Lemma B.4
For $l \in \mathbb {N}^+$ the $1$ -periodic Fejér kernel is given by
It is non-negative with total integral equal to $1$ , and for all $u \in [0,1/2]$ ,
As a result, $\phi \ast F_l$ converges pointwise to $\phi $ as $l \to \infty $ at all points of continuity of $\phi $ : which is to say, $\nu $ -almost everywhere, since $\phi $ has a finite number of jumps and $\nu $ has no atoms. Furthermore, the functions $\phi \ast F_l$ are uniformly bounded by the constant function $\| \phi \|_\infty $ (which is $\nu $ -integrable). As a result, we can apply the dominated convergence theorem to say that
Now,
where in the last line we could interchange integration and (finite) summation. Now, (C.2) is none other than the lth Cesàro sum of $\hat \phi _{-j}\hat \nu _j$ , whose limit is therefore the full sum, the full sum being absolutely convergent. Substituting this limit into (C.1), we obtain (B.3) as required.
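The properties of the Fejér kernel used in this proof can be verified numerically. The sketch assumes the common normalisation $F_l(u) = l^{-1} (\sin (\pi l u)/\sin (\pi u))^2 = \sum _{|j| < l} (1 - |j|/l) e^{2\pi i j u}$ , whose indexing may differ by one from the convention intended above; the function names are hypothetical.

```python
import cmath
import math


def fejer_closed(l, u):
    """Closed form (1/l) (sin(pi l u) / sin(pi u))^2, equal to l at integers."""
    s = math.sin(math.pi * u)
    if abs(s) < 1e-15:
        return float(l)
    return (math.sin(math.pi * l * u) / s) ** 2 / l


def fejer_series(l, u):
    """Trigonometric form: sum over |j| < l of (1 - |j|/l) exp(2 pi i j u)."""
    total = sum((1.0 - abs(j) / l) * cmath.exp(2j * math.pi * j * u)
                for j in range(-l + 1, l))
    return total.real
```

The weights $1 - |j|/l$ in the series form are the Cesàro weights, which is why convolving with $F_l$ produces the lth Cesàro mean of a Fourier series.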
Proof of Lemma B.5
We can divide up the interval of integration into $|j|$ equal pieces as follows:
We always have on any of these segments that
Let the index set of segments with jumps be
Clearly we have that the cardinality of $J_j$ is no greater than that of S for all j. For $l \notin J_j$ , we can do the usual Hölder continuity bound on the integral, using to begin with that $e^{2\pi ijx}$ is mean zero:
and then that
to get that
Combining (C.3) for the segments with jumps and (C.4) otherwise, we get that
as required.
D Appendix. Proof of results in §2
Proof of Proposition 2.1
Since T is conservative and topologically mixing it must have one physical measure $\rho $ which is Lebesgue measure.
Because H has no critical points on $\ell _H$ , the conditional measure $\rho (x\mid H(x) = 0)$ can be well defined as a $C^0$ -weak limit of
for some bump function $\psi \in C^\infty _c$ , with $c(\delta ) = \mathcal {O}(\delta ^{d})$ . This is constant along manifolds $\{x : H(x) = c\}, |c| \leq \delta $ , which are $C^1$ manifolds for small enough $\delta $ .
We know that the stable vector bundle of T is continuous and $\ell _H$ is compact. Hence, the manifolds $\{x : H(x) = c\}, |c| \leq \delta $ , must decompose locally into submanifolds of the same dimensionality as the unstable dimension, which are uniformly transverse to stable vector fields, and therefore some set of admissible stable leaves in the sense of [Reference Gouëzel and Liverani14]. Furthermore, $c(\delta ) \sim C \delta ^{d}$ for some $C>0$ .
It can then be shown by a computation that the family (D.1) is convergent in the $\mathcal {B}^{1,1}$ norm of [Reference Gouëzel and Liverani14], including when multiplied by $C^1$ functions B, and so the conditional measure lies in $\mathcal {B}^{1,1}$ , on which the Perron–Frobenius operator of T has a spectral gap [Reference Gouëzel and Liverani14, Theorem 2.3]. Hence, exponential conditional mixing obtains with $r=1$ .
Proof of Proposition 2.2
We will consider convergence of $d_{\mathrm {Haus}}(\Lambda ,\overline {T^n(\ell _H \cap \Lambda )})$ , since taking the closure of a set does not change its Hausdorff distance to $\Lambda $ .
From its definition, the conditional measure’s support $\operatorname {\mathrm {supp}}\mu $ must be contained in $\ell _H \cap \Lambda $ , and so $\operatorname {\mathrm {supp}} T^n_*\mu $ is contained in $\overline {T^n(\ell _H \cap \Lambda )}$ .
Now, $\overline {T^n(\ell _H \cap \Lambda )} \subset \overline {T^n(\Lambda )} \subseteq \overline {\Lambda } = \Lambda $ , so it is enough to show that
Fix $\epsilon> 0$ , and let $\{B(\xi ,\epsilon )\}_{\xi \in \Xi }$ be a finite open cover of $\Lambda $ . Then, if for each $\xi \in \Xi $ we can show that $B(\xi ,\epsilon )$ has positive $T^n_*\mu $ -measure for every n large enough, we have that $d_{\mathrm {Haus}}(T^n(\ell _H\cap \Lambda ),\Lambda ) < 2\epsilon $ for these n, and so we are done.
Conditional mixing implies that if $\psi _{\xi ,\epsilon }$ is any $C^\infty $ non-negative bump function bounded by $1$ whose support is $B(\xi ,\epsilon )$ , then
because $\psi _{\xi ,\epsilon }>0$ on an open set overlapping with $\Lambda = \operatorname {\mathrm {supp}}\rho $ . This means that for n large enough,
as required.
Proof of Proposition 2.3
By translation and dilation we can construct $C^\infty $ bump functions $\psi _{\xi ,\epsilon }$ such that $\psi _{\xi ,\epsilon } = 1$ on $B(\xi ,\epsilon /2)$ , and their $C^r$ norms are bounded $\| \psi _{\xi ,\epsilon } \|_{C^r} \leq K \epsilon ^{-r}$ for constant K.
Lower-Ahlfors regularity of $\rho $ gives us that
and the exponential conditional mixing assumption then gives us that for all $\xi ,\epsilon $ ,
This will all be positive when $\epsilon \geq K_1 (\xi ^{1/(d+r)})^n$ for some constant $K_1$ , giving us uniform exponential decay in n of the bound $2\epsilon $ on the Hausdorff distance.
E Appendix. Proof of Proposition A.1
To prove Proposition A.1 we will first require the following result.
Proposition E.1. There exist $C> 0$ , $\lambda> 1$ such that for all $I \in \hat {\mathcal {L}}$ and $n \geq 0$ such that $f^m I \cap \mathcal {S} = \emptyset $ for $m < n$ ,
Proof. The segments I are unstable manifolds: because f is piecewise uniformly hyperbolic [Reference Young, Hunt, Li, Kennedy and Nusse25], these segments are eventually expanded by the action of f.
Proof of Proposition A.1
Define the observable on $\vec \Lambda $ ,
This measures whether $\vec I$ is cut by $\mathcal {S}$ on the left-hand side. Clearly, if $P(\vec f^{-n}(\vec I, t)) = 1$ , then $p_{\vec I} = f^n(s)$ for some $s \in \mathcal {S}$ . This holds a fortiori if
In the following we will show that this limit is almost always positive.
Fix $\epsilon> 0$ , and let $\vec \Lambda _{\mathrm {erg}}$ be an ergodic component of $\vec \Lambda $ . Because almost all unstable manifolds have positive length, by the definition of $\vec \Lambda _{\mathrm {erg}}$ , there exists a set $E \subset \vec \Lambda $ of positive measure such that $|\vec I|> \epsilon $ for all $(\vec I, t) \in E$ . It is clear that we can choose $E = E_I \times (0,1)$ , $E_I$ being a collection of directed segments.
We know that for any segment $\vec J$ , if $\vec J$ does not cross $\mathcal {S}$ then $\vec f(\vec J,t) = (f(\vec J),t)$ . By Proposition E.1, this means that for $\vec I \in E_I$ , $\vec f^n(\vec I, t) = (f^n(\vec I), t)$ with $ |f^n(\vec I)| \geq C \epsilon \lambda ^n$ , unless $f^m(\vec I)$ is cut by $\mathcal {S}$ for some $m < n$ . However, there is some $m_*$ sufficiently large such that $C \epsilon \lambda ^{m_*}$ is greater than the diameter of the attractor $\Lambda $ , which brings about a contradiction, as $f^n(I) = \mathcal {W}^u_{\mathrm {loc}}(f^n(\pi (\vec I,0.5)))$ is a segment subset of $\Lambda $ . Thus, $f^m(\vec I)$ is cut by $\mathcal {S}$ for some $m \leq m_*$ .
Now, recalling that our segments are open segments, $\mathcal {S}$ will cut $f^m \vec I$ at some point $t' \in (0,1)$ . As a result, $P(\vec f^{m+1}(I,t)) = 1$ for $t \in (0,t')$ . This holds for any $\vec I \in E_I$ .
The consequence is that, for at least one $m'$ between $0$ and $m_*$ , there exists $B_P \subset \vec f^{m'+1}(E) \subset \vec \Lambda _{\mathrm {erg}}$ of positive measure such that $P = 1$ on $B_P$ . By applying the Birkhoff ergodic theorem to $(\vec f^{-1},\vec \Lambda _{\mathrm {erg}},\vec \rho |_{\vec \Lambda _{\mathrm {erg}}})$ , the limit (E.2) must hold for $\vec \rho $ -almost all points on $\vec \Lambda _{\mathrm {erg}}$ .
By taking a union over all ergodic components of $\vec \Lambda $ we therefore have that for almost all $\vec I_{p,q} \in \vec \Lambda $ , $p_{\vec I} = f^{n}(s)$ for some $s \in \mathcal {S}$ . This is to say that $p_{\vec I}$ lies in the forward orbit of the singular line. Using that reversing the direction of segments is a measure isometry, this result equally holds true for q.