
An endpoint estimate for product singular integral operators on stratified Lie groups

Published online by Cambridge University Press:  27 January 2025

Michael G. Cowling
Affiliation:
School of Mathematics and Statistics, University of New South Wales, Sydney, NSW 2052, Australia
Ming-Yi Lee
Affiliation:
Department of Mathematics, National Central University, Chung-Li 320, Taiwan, Republic of China
Ji Li*
Affiliation:
Department of Mathematics, Macquarie University, Sydney, NSW, 2109, Australia
Jill Pipher
Affiliation:
Department of Mathematics, Brown University, Providence, RI 02912, USA

Abstract

We establish hyperweak boundedness of area functions, square functions, maximal operators, and Calderón–Zygmund operators on products of two stratified Lie groups.

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s), 2025. Published by Cambridge University Press on behalf of Canadian Mathematical Society

1 Introduction and statement of main results

Since the 1980s, the development of multiparameter harmonic analysis has proceeded apace; recent contributions in the area include [Reference Chang and Fefferman1, Reference Fefferman8–Reference Ferguson and Lacey11, Reference Hytönen and Martikainen18, Reference Journé20, Reference Lacey, Petermichl, Pipher and Wick22, Reference Nagel and Stein26, Reference Ou27, Reference Pipher28]. Much of the product space theory on $\mathbb {R}^m \times \mathbb {R}^n$ , including the duality of $\mathsf {H}^1$ with $\mathsf {BMO}$ , characterization of $\mathsf {H}^1$ by square functions and atomic decompositions, and interpolation, has been extended to more general products of spaces of homogeneous type.

In contrast to the classical theory of singular integrals, in multiparameter harmonic analysis product singular integrals are not of weak type $(1,1)$ . For functions supported on the unit cube, the classical weak type $(1,1)$ estimate was replaced by a $\mathsf {L\,log\,L} $ to $\mathsf {L}^{1,\infty }$ estimate by R. Fefferman [Reference Fefferman8]. More precisely, let $\mathcal {T}$ be a product Calderón–Zygmund operator on $\mathsf {L}^{2}(\mathbb {R}^m\times \mathbb {R}^n)$ as in [Reference Journé20] or [Reference Fefferman8]; then for every $\lambda \in \mathbb {R}_{+}$ ,

(1.1) $$ \begin{align} \begin{aligned} &|\{(x_1,x_2)\in {Q_1\times Q_2}\colon |\mathcal{T} f(x_1,x_2)|>\lambda\}| \leq \frac{C}{\lambda} F(f) \end{aligned} \end{align} $$

for all f supported in the product $Q_1 \times Q_2$ of the unit cubes in $\mathbb {R}^m$ and $\mathbb {R}^n$ , where

$$\begin{align*}F(f) = \iint_{Q_1\times Q_2} |f(x_1,x_2)| (1+\log^+(|f(x_1,x_2)|)) \,\mathrm{d} x_2 \,\mathrm{d} x_1; \end{align*}$$

here $\log ^+t :=\log (\max \{1,t\})$ . The proof in [Reference Fefferman8] relies on the boundedness of the strong maximal function and the area function from $\mathsf {L\,log\,L}$ to $\mathsf {L}^{1,\infty }$ , the local atomic decomposition of functions in $\mathsf {L\,log\,L}$ produced using the $\mathsf {L\,log\,L} $ to $\mathsf {L}^{1,\infty }$ boundedness of the area function, and the boundedness of $\mathcal {T}$ on $\mathsf {L\,log\,L}$ atoms.

Fefferman’s inequality is similar to the estimate for the strong maximal function [Reference Jessen, Marcinkiewicz and Zygmund19] of Jessen, Marcinkiewicz, and Zygmund.

A natural question arises: does a global version of (1.1) hold for area functions, square functions, maximal operators and singular integral operators, and in more general product settings? The global version of (1.1) cannot be of the same form. Indeed, take m and n to be $1$ and f to be the characteristic function of the unit square in $\mathbb {R} \times \mathbb {R}$ . Let $\mathcal {T}$ be either the double Hilbert transformation, with singular integral kernel $(xy)^{-1}$ , or the strong (rectangular) maximal operator. It is easy to check that

$$\begin{align*}\mathcal{T} (f)(x_1, x_2) \lesssim \frac{1}{ (|x_1| + 1)(|x_2|+1)} \qquad \forall x_1, x_2 \in \mathbb{R}, \end{align*}$$

and if $|x_1|> 1$ and $|x_2|> 1$ , then we may replace $\lesssim $ by $\eqsim $ . It follows that

(1.2) $$ \begin{align} | \{ (x_1, x_2) \in \mathbb{R}\times \mathbb{R} : \mathcal{T}(f)(x_1, x_2)> \lambda \} | \eqsim \frac{\log (\mathrm{e} + 1/\lambda)}{\lambda} F(f) \end{align} $$

for all $\lambda \in (0,1)$ . Hence in the global case, a right hand side of the form $F(f)/\lambda $ cannot be correct.
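The growth rate in (1.2) can be checked on the model function $g(x_1,x_2) = 1/((|x_1|+1)(|x_2|+1))$ that appears in the pointwise bound above. The following Python sketch is only an illustration and is not part of the original argument; the function names are hypothetical. Writing $L = 1/\lambda$, the level set $\{g > \lambda\}$ has measure $4(L \log L - L + 1)$, and the sketch compares this closed form with a midpoint-rule quadrature and with $\lambda^{-1}\log(\mathrm{e} + 1/\lambda)$.

```python
import math


def level_set_measure_exact(lam):
    """Measure of {(x1, x2): 1/((|x1|+1)(|x2|+1)) > lam} for 0 < lam < 1."""
    L = 1.0 / lam
    # Four symmetric quadrants; in each one, integrate L/(x+1) - 1 over [0, L-1].
    return 4.0 * (L * math.log(L) - L + 1.0)


def level_set_measure_quadrature(lam, n=200_000):
    """Same measure, computed with the midpoint rule as a cross-check."""
    L = 1.0 / lam
    h = (L - 1.0) / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += L / (x + 1.0) - 1.0
    return 4.0 * total * h


def ratio_to_log_bound(lam):
    """Ratio of the level-set measure to log(e + 1/lam) / lam."""
    return level_set_measure_exact(lam) / (math.log(math.e + 1.0 / lam) / lam)
```

For small $\lambda$ the ratio stays between fixed positive constants, which is the behaviour that rules out a bound of the form $F(f)/\lambda$.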

We are going to work with (generalizations of) the following global version of (1.1):

(1.3) $$ \begin{align} | \{ (x_1,x_2) \in \mathbb{R}^m\times \mathbb{R}^n\colon |\mathcal{T} f(x_1,x_2)|>\lambda \} | \leq C F(f / \lambda), \end{align} $$

where the integral defining the nonlinear functional F now extends over $\mathbb {R}^m \times \mathbb {R}^n$ . Later we review Orlicz spaces, and prove an inequality (2.11), which, when coupled with (1.3), implies that

(1.4) $$ \begin{align} | \{ (x_1,x_2) \in\mathbb{R}^m\times \mathbb{R}^n\colon |\mathcal{T} f(x_1,x_2)|>\lambda \} | \leq C \frac{\log (\mathrm{e} + 1/\lambda)}{\lambda} F(f ), \end{align} $$

which is of comparable size to the right hand side of (1.1) when $\lambda> 1$ and is of the right order to capture the global behavior of the examples of the previous paragraph.
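To see why an estimate of this kind lets one pass from (1.3) to (1.4), here is a short pointwise sketch. It is an illustration under our own choice of constants (the factor $2$ below is an artefact of the sketch) and is not claimed to be the argument used for (2.11). It uses only $\log^+(ab) \leq \log^+ a + \log^+ b$ and $\log(\mathrm{e} + 1/\lambda) \geq 1$.

```latex
\begin{align*}
1 + \log^+(s/\lambda)
  &\leq 1 + \log^+ s + \log^+(1/\lambda)
   \leq (1 + \log^+ s)\,(1 + \log^+(1/\lambda)) \\
  &\leq 2 \log(\mathrm{e} + 1/\lambda)\,(1 + \log^+ s)
   \qquad \forall s \geq 0, \ \forall \lambda \in \mathbb{R}_{+}, \\
F(f/\lambda)
  &= \iint \frac{|f|}{\lambda}\,\bigl(1 + \log^+(|f|/\lambda)\bigr)
   \leq \frac{2\log(\mathrm{e} + 1/\lambda)}{\lambda}\, F(f).
\end{align*}
```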

As far as we know, there are no previous global results of this kind in the literature. We therefore argue that the global estimate (1.3) is a new contribution to the theory of product singular integrals even in the Euclidean setting. Our approach here admits a broad generalization from Euclidean spaces to stratified Lie groups.

Indeed, in this article, we establish a global analog of (1.3) on product spaces of the form $\mathbf {G} := G_1 \times G_2$ , where $G_1$ and $G_2$ are stratified Lie groups; these spaces include, but are more general than, the product of Euclidean spaces $\mathbb {R}^m\times \mathbb {R}^n$ . This setting is in some respects like the Euclidean case, but in other respects is very different. For example, in general stratified Lie groups, there are no Cauchy–Riemann equations; these are used in the Euclidean case to relate square functions and maximal functions (see Merryfield [Reference Merryfield25]), and new ideas are needed here. Our new approach works for stratified Lie groups, but the group structure plays an essential role in our argument, and we cannot extend it to cover more general spaces of homogeneous type. Another difference is that the sub-Laplacian on a stratified Lie group is sometimes analytic hypoelliptic and sometimes not (see [Reference Helffer16] for more details). When the sub-Laplacian is analytic hypoelliptic, as in the Euclidean case, some arguments are much easier (our Lemma 2.4 shows that the claim proved in (4.9) is not needed). A third difference is that in our more general setting, there is no analog of the dyadic rectangles that are used by Fefferman and others in the classical setting: as a replacement, we use a lemma of Hytönen and Kairema [Reference Hytönen and Kairema17]; this part of our contribution is valid in the more general context of spaces of homogeneous type.

To state our results, we need a little notation. Details may be found later.

In this introduction, the auxiliary functions $\varphi ^{[i]}$ on $G_i$ satisfy standard decay and smoothness conditions and have integral $1$ . Likewise, the functions $\psi ^{[i]}$ on $G_i$ satisfy standard decay and smoothness conditions and have integral $0$ . We write $\zeta ^{[1]}_{t_1}$ and $\zeta ^{[2]}_{t_2}$ for normalized dilates of functions $\zeta ^{[1]}$ on $G_1$ and $\zeta ^{[2]}$ on $G_2$ , and $\varphi _{\mathbf {t}}$ and $\psi _{\mathbf {t}}$ for the product functions $\varphi ^{[1]}_{t_1} \otimes \varphi ^{[2]}_{t_2}$ and $\psi ^{[1]}_{t_1} \otimes \psi ^{[2]}_{t_2}$ .

We define $\mathbf {T} := \mathbb {R}_{+} \times \mathbb {R}_{+}$ . For $\mathbf {g} :=(g_1,g_2)\in \mathbf {G}$ , $\mathbf {t} \in \mathbf {T}$ and $\eta \in [0,\infty )$ , we denote by $\mathbf {P}(\mathbf {g}, \mathbf {t})$ the product of open balls $B_1(g_1, t_1)\times B_2(g_2, t_2)$ and by $\Gamma ^{\eta }(\mathbf {g})$ the product cone $\Gamma _1^{\eta }(g_1)\times \Gamma _2^{\eta }(g_2)$ , where

$$\begin{align*}\Gamma_i^{\eta }(g_i) :=\{(h_i ,t_i )\in G_i \times \mathbb{R}_{+}\colon\rho_i (g_i ,h_i ) \leq \eta t_i \}. \end{align*}$$

For $\eta \in [0,\infty )$ and $f \in \mathsf {L}^{1}(\mathbf {G})$ , we define the maximal function:

(1.5) $$ \begin{align} \mathcal{M}_{\varphi,\eta }(f)(\mathbf{g}) :=\sup \Big\{ \big| (f \ast \varphi_{\mathbf{t}}) (\mathbf{h}) \big| \colon (\mathbf{h},\mathbf{t}) \in \Gamma^\eta (\mathbf{g}) \Big\} \qquad\forall \mathbf{g}\in\mathbf{G}. \end{align} $$

This maximal function is called radial when $\eta = 0$ and nontangential when $\eta> 0$ .

For $\eta \in \mathbb {R}_{+}$ and $f \in \mathsf {L}^{1}(\mathbf {G})$ , we define the Lusin area function $\mathcal {S}_{\psi ,\eta }(f)$ by

(1.6) $$ \begin{align} \mathcal{S}_{\psi,\eta }(f)(\mathbf{g}) :=\left( \iint_{\Gamma^{\eta }(\mathbf{g})} \frac{\left| ( f \ast \psi_{\mathbf{t}})(\mathbf{h}) \right| ^{2}}{\left| \mathbf{P}(\mathbf{g},\eta \mathbf{t}) \right| } \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \right) ^{1/2} \qquad\forall \mathbf{g}\in\mathbf{G}. \end{align} $$

Here and throughout this article, $\,\mathrm {d} \mathbf {h}= \,\mathrm {d} h_1 \,\mathrm {d} h_2$ and $\,\frac {\mathrm {d}\mathbf {t}}{\mathbf {t}} = \frac {\,\mathrm {d} t_1 }{ t_1}\frac {\,\mathrm {d} t_2 }{ t_2}$ .

For $\eta = 0$ and $f \in \mathsf {L}^{1}(\mathbf {G})$ , we define the Littlewood–Paley square function $\mathcal {S}_{\psi ,0}(f)$ by

$$\begin{align*}\mathcal{S}_{\psi,0}(f)(\mathbf{g}) :=\left( \int_{\mathbf{T}} \left| (f \ast \psi_{\mathbf{t}})(\mathbf{g}) \right| ^{2} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \right) ^{1/2} \qquad\forall \mathbf{g}\in\mathbf{G}. \end{align*}$$

These are often written differently, but it is convenient to treat them together.

On a stratified Lie group G, we may take a basis $\{\mathcal {X}_1, \dots , \mathcal {X}_d\}$ for the space of left-invariant horizontal vector fields, and define the sub-Laplacian $\mathcal {L}$ to be $-\sum _{j=1}^{d} \mathcal {X}_j^2$ . The Riesz transformations are then the operators $\mathcal {X}_j \mathcal {L}^{-1/2}$ . The double Riesz transformations $\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$ on $\mathbf {G}$ are defined in the obvious way when $1 \leq j_i \leq d_i$ .

Theorem 1.1 Let $\mathcal {T}$ be a maximal operator $\mathcal {M}_{\varphi ,\eta }$ or a Littlewood–Paley operator $\mathcal {S}_{\psi ,\eta }$ , where $\eta \geq 0$ , or a double Riesz transformation $\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$ . Then

(1.7) $$ \begin{align} \big| \big\{\mathbf{g} \in \mathbf{G}\colon |\mathcal{T}f(\mathbf{g})|>\lambda\big\}\big| \lesssim F_\Phi({f/\lambda}) \qquad\forall \lambda \in \mathbb{R}_{+}, \end{align} $$

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ , where $\mathsf {L\,log\,L}(\mathbf {G})$ is the Orlicz space associated with the functional $F_\Phi $ , given by

$$ \begin{align*} F_{\Phi}(f) := \iint_{\mathbf{G}} |f(\mathbf{g})| \log(\mathrm{e} + |f(\mathbf{g})| ) \,\mathrm{d} \mathbf{g}. \end{align*} $$

We explain Orlicz spaces and Luxemburg norms later. We say that the operator $\mathcal {T}$ is hyperweakly bounded when an estimate of the form (1.7) holds.
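As a quick consistency check (a worked example added for illustration, not taken from the source), take $f = \chi_E$ for a set $E \subseteq \mathbf{G}$ of finite measure. The right-hand side of (1.7) can then be computed exactly, and it reproduces the order of growth found in (1.2) for the characteristic function of the unit square.

```latex
\begin{align*}
F_\Phi(\chi_E / \lambda)
  = \iint_{E} \frac{1}{\lambda} \log\Bigl(\mathrm{e} + \frac{1}{\lambda}\Bigr) \,\mathrm{d}\mathbf{g}
  = \frac{\log(\mathrm{e} + 1/\lambda)}{\lambda}\, |E|
  \qquad \forall \lambda \in \mathbb{R}_{+}.
\end{align*}
```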

It is easy to iterate estimates for one-parameter maximal, area or singular integral operators to prove a local version of this result, where the support of the function f is restricted to lie in a compact set. However, iteration does not seem to be able to deal with the global case, and this is the main difficulty that we need to confront in this article.

Our presentation has the following structure. In Section 2, we review some background on stratified Lie groups, product spaces and Orlicz spaces.

In Section 3, we show that the strong maximal operator $\mathcal {M}_{s}$ is hyperweakly bounded, using a covering lemma, Theorem 3.2, that goes back to [Reference Córdoba and Fefferman5]. We then prove Theorem 1.1 for the maximal operator $\mathcal {M}_{\varphi ,\eta }$ , by using group properties to dominate the maximal operator $\mathcal {M}_{\varphi ,\eta }$ by the strong maximal operator.

In Section 4, we construct the atomic decomposition. We use the gradient of the Poisson kernel p as our auxiliary function, and then have a key result of [Reference Cowling, Fan, Li and Yan6, 12 and following], namely, the good- $\lambda $ inequality: for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ ,

(1.8)

when $\eta $ is sufficiently large; here $L_{\eta }(\lambda ) :=\left \{ \mathbf {g}\in {\mathbf {G}}\colon \mathcal {M}_{p,\eta }(f)(\mathbf {g}) \leq \lambda \right \}$ . This implies that the area operator is hyperweakly bounded. By using this boundedness and the Calderón reproducing formula, we can decompose global $\mathsf {L\,log\,L}$ functions into atoms.

In Section 5, we apply our atomic decomposition and a version of Journé’s covering lemma for spaces of homogeneous type established in [Reference Han, Li and Lin14]. We prove Theorem 1.1 for general operators $\mathcal {S}_{\psi ,\eta }$ and the double Riesz transformations $\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$ . The same argument holds for general product Calderón–Zygmund operators, as in Journé [Reference Journé20].

Most of the arguments rely only on the theory of spaces of homogeneous type. However, we need the setting of a stratified Lie group in two places. First, it gives us the good $\lambda $ inequality (1.8). Second, it gives us the Calderón reproducing formula for $\mathsf {L}^{2}(G)$ functions, which is needed for the atomic decomposition.

“Constants” are positive real numbers, depending only on the geometry of $\mathbf {G}$ unless otherwise indicated; we write $A \lesssim B$ when there exists a constant C such that $A \leq C B$ . We write $\chi _{E} $ for the indicator function of a set E, and o denotes the identity of a group.

2 Preliminaries

2.1 Stratified nilpotent Lie groups

Let G be a (real and finite dimensional) stratified nilpotent Lie group of step s with Lie algebra $\mathfrak {g}$ . This means that we may write $\mathfrak {g}$ as a vector space direct sum $\bigoplus _{j = 1}^{s} \mathfrak {v}_{j} $ , where $[\mathfrak {v}_1, \mathfrak {v}_{j}] = \mathfrak {v}_{j + 1}$ when $1\leq j \leq s$ ; here $\mathfrak {v}_{s+1} := \{0\}$ . Let $\nu $ denote the homogeneous dimension of G; that is, $\sum _{j=1}^{s} j \dim \mathfrak {v}_{j}$ . There is a one-parameter family of automorphic dilations $\delta _t$ on $\mathfrak {g}$ , given by

$$ \begin{align*} \delta_t (\mathcal{X}_1 + \mathcal{X}_2+ \ldots + \mathcal{X}_{s}) := t\mathcal{X}_1 + t^2\mathcal{X}_2 + \ldots + t^{s} \mathcal{X}_{s}; \end{align*} $$

here each $\mathcal {X}_j \in \mathfrak {v}_{j}$ and $t \in \mathbb {R}_{+}$ . The exponential mapping $\exp \colon \mathfrak {g} \to G$ is a diffeomorphism, and we identify $\mathfrak {g}$ and G. The dilations extend to automorphic dilations of G, also denoted by $\delta _t$ , by conjugation with $\exp $ . The Haar measure on G, which is bi-invariant, is the Lebesgue measure on $\mathfrak {g}$ lifted to G using $\exp $ . Many more general facts about stratified Lie groups may be found in [Reference Folland and Stein12].
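A concrete instance may help. The Heisenberg group is a standard example of a stratified group of step $2$, with $\dim \mathfrak{v}_1 = 2$ and $\dim \mathfrak{v}_2 = 1$; it is not discussed in the source and is used here only as an illustration. The Python sketch below, with hypothetical function names, writes the group law in exponential coordinates $(x, y, z)$, implements the dilations $\delta_t(x,y,z) = (tx, ty, t^2 z)$, and computes the homogeneous dimension $\sum_j j \dim \mathfrak{v}_j$.

```python
def multiply(g, h):
    """Heisenberg group law in exponential coordinates (x, y, z)."""
    x, y, z = g
    u, v, w = h
    return (x + u, y + v, z + w + 0.5 * (x * v - y * u))


def dilate(t, g):
    """Dilation delta_t: weight 1 on the first layer (x, y), weight 2 on z."""
    x, y, z = g
    return (t * x, t * y, t * t * z)


def homogeneous_dimension(layer_dims):
    """Sum of j * dim(v_j), where layer_dims[j - 1] = dim(v_j)."""
    return sum(j * d for j, d in enumerate(layer_dims, start=1))


def is_automorphism_at(t, g, h, tol=1e-12):
    """Check delta_t(g h) = delta_t(g) delta_t(h) at one pair of points."""
    left = dilate(t, multiply(g, h))
    right = multiply(dilate(t, g), dilate(t, h))
    return all(abs(a - b) <= tol for a, b in zip(left, right))
```

For the Heisenberg group, `homogeneous_dimension([2, 1])` gives $\nu = 4$, while for $\mathbb{R}^n$ (step $1$) the same function returns the usual dimension $n$.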

By [Reference Hebisch and Sikora15], the group G may be equipped with a smooth subadditive homogeneous norm $\rho $ , a continuous function from G to $[0,\infty )$ that is smooth on $G\setminus \{o\}$ and satisfies

  1. (1) $\rho (g^{-1}) =\rho (g)$ ;

  2. (2) $\rho (g^{-1}h) \leq \rho (g) + \rho (h)$ ;

  3. (3) $\rho ({ \delta _t(g)}) =t\rho (g)$ for all $g\in G$ and all $t \in \mathbb {R}_{+}$ ;

  4. (4) $\rho (g) =0$ if and only if $g=o$ .

Abusing notation, we set $\rho (g, g') := \rho (g^{-1} g')$ for all $g, g' \in G$ ; this defines a metric on G. We write $B(g, r)$ for the open ball with centre g and radius r with respect to $\rho $ :

$$\begin{align*}B(g, r) = g B(o,r) = g \{ h \in G \colon \rho(h) < r \}. \end{align*}$$

Then

$$\begin{align*} \delta _r(B(o,1)) = B(o,r) \quad\text{and}\quad |B(o,r)| = r^\nu |B(o,1)|.\end{align*}$$
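The four properties of a homogeneous norm can be tested numerically in a familiar case. The sketch below is an illustration under assumptions of our own: it uses the Heisenberg group in exponential coordinates and the Korányi gauge $\rho(x,y,z) = ((x^2+y^2)^2 + 16 z^2)^{1/4}$, which is one classical subadditive homogeneous norm and is not claimed to be the norm constructed in the reference cited above. All function names are hypothetical.

```python
import random


def multiply(g, h):
    """Heisenberg group law in exponential coordinates (x, y, z)."""
    x, y, z = g
    u, v, w = h
    return (x + u, y + v, z + w + 0.5 * (x * v - y * u))


def inverse(g):
    """Group inverse; in exponential coordinates this is negation."""
    x, y, z = g
    return (-x, -y, -z)


def dilate(t, g):
    """Dilation delta_t(x, y, z) = (t x, t y, t^2 z)."""
    x, y, z = g
    return (t * x, t * y, t * t * z)


def koranyi_norm(g):
    """Koranyi gauge, homogeneous of degree 1 under dilate."""
    x, y, z = g
    return ((x * x + y * y) ** 2 + 16.0 * z * z) ** 0.25


def check_norm_properties(samples=1000, seed=0, tol=1e-9):
    """Test symmetry, subadditivity and homogeneity on random points."""
    rng = random.Random(seed)
    for _ in range(samples):
        g = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        h = tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
        t = rng.uniform(0.1, 10.0)
        symmetric = abs(koranyi_norm(inverse(g)) - koranyi_norm(g)) <= tol
        subadditive = (koranyi_norm(multiply(inverse(g), h))
                       <= koranyi_norm(g) + koranyi_norm(h) + tol)
        homogeneous = abs(koranyi_norm(dilate(t, g)) - t * koranyi_norm(g)) <= tol
        if not (symmetric and subadditive and homogeneous):
            return False
    return True
```

Since $\delta_r$ multiplies Lebesgue measure in these coordinates by $r \cdot r \cdot r^2 = r^4$, the scaling $|B(o,r)| = r^{\nu} |B(o,1)|$ holds here with $\nu = 4$.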

The metric space $(G,\rho )$ is geometrically doubling; that is, there exists $A \in \mathbb {N}$ such that every metric ball $B(x,2r)$ may be covered by at most A balls of radius r.

We remind the reader that a stratified Lie group is a space of homogeneous type in the sense of Coifman and Weiss [Reference Coifman and Weiss3, Reference Coifman and Weiss4], and analysis on stratified Lie groups uses much from the theory of such spaces. In particular, we frequently deal with molecules, that is, functions $\zeta $ that satisfy standard decay and smoothness conditions, by which we mean that there is a parameter $\epsilon \in (0,1]$ , which we fix once and for all, such that

(2.1) $$ \begin{align} \begin{gathered} \left| \zeta(g) \right| \lesssim \frac{1}{(1+ \rho(g))^{\nu+\epsilon}}, \\ \left| \zeta(g)-\zeta(g') \right| \lesssim \frac{\rho( g' g^{-1} )^\epsilon}{(1+ \rho(g)+ \rho(g'))^{\nu+2\epsilon} } \end{gathered} \end{align} $$

for all $g, g' \in G$ . We often impose an additional cancellation condition, namely

(2.2) $$ \begin{align} \int_G \zeta(g) \,\mathrm{d} g =0. \end{align} $$

The normalized dilate $f_t$ of a function f on G by $t \in \mathbb {R}_{+}$ is given by $f_{t} := t^{-\nu }f\circ \delta _{1/t}$ , and the convolution $f\ast f'$ of suitable functions f and $f'$ on G is defined by

$$ \begin{align*} (f \ast f')(g) :=\int_Gf(h)f'(h^{-1}g)\,\mathrm{d} h =\int_Gf(gh^{-1})f'(h)\,\mathrm{d} h. \end{align*} $$
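The normalization in $f_t$ is chosen so that dilation preserves integrals; this is why the dilates $\varphi_{\mathbf{t}}$ keep integral $1$ and the dilates $\psi_{\mathbf{t}}$ keep integral $0$. A one-line verification, added here as a worked step, uses the change of variables $g = \delta_t(g')$ together with the fact that $\delta_t$ scales Haar measure by $t^{\nu}$ (as in the formula for $|B(o,r)|$):

```latex
\begin{align*}
\int_G f_t(g) \,\mathrm{d} g
  = t^{-\nu} \int_G f(\delta_{1/t}(g)) \,\mathrm{d} g
  = t^{-\nu} \, t^{\nu} \int_G f(g') \,\mathrm{d} g'
  = \int_G f(g') \,\mathrm{d} g'
  \qquad \forall t \in \mathbb{R}_{+}.
\end{align*}
```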

Take left-invariant vector fields $\mathcal {X}_1$ , …, $\mathcal {X}_{n}$ on G that form a basis of $\mathfrak {v}_1$ , and define the sub-Laplacian $\mathcal {L} := -\sum _{j=1 }^{n} (\mathcal {X}_{j})^2 $ . Observe that each $\mathcal {X}_{j}$ is homogeneous of degree $1$ and $\mathcal {L}$ is homogeneous of degree $2$ , in the sense that

$$ \begin{align*} &\mathcal{X}_{j} \left( f \circ \delta_{t} \right) = t \left( \mathcal{X}_{j} f \right) \circ \delta_{t}, \\ &\mathcal{L} \left( f \circ \delta_{t} \right) = t^2 \left( \mathcal{L} f \right) \circ \delta_{t} \end{align*} $$

for all $t \in \mathbb {R}_{+}$ and all $f \in \mathsf {C}^{2}(G)$ .

Associated to the sub-Laplacian, Folland and Stein [Reference Folland and Stein12] defined the Riesz potential operators $\mathcal {L}^{-\alpha }$ , where $\alpha \in \mathbb {R}_{+}$ ; these are convolution operators with homogeneous kernels. The Riesz transformation $\mathcal {R}_{j} := \mathcal {X}_{j} \mathcal {L}^{-1/2}$ is a singular integral operator, and is bounded on $\mathsf {L}^{p}(G)$ when $1 < p < \infty $ as well as from the Folland–Stein Hardy space $\mathsf {H}^1(G)$ [Reference Folland and Stein12] to $\mathsf {L}^{1}(G)$ .

The Hardy–Littlewood maximal operator $\mathcal {M}$ on G is defined using the metric balls:

where the “average integral” is defined by

For future use, we note that the layer cake formula implies that, if $\mu $ is a radial decreasing function on G (that is, $\mu (g)$ depends only on $\rho (g)$ and decreases as $\rho (g)$ increases), then

(2.3) $$ \begin{align} \big(\left| f \right| \ast \mu_\epsilon\big) (g) \leq \left\Vert \mu \right\Vert{}_{\mathsf{L}^{1}(G)} \mathcal{M}f(g) \qquad\forall g \in G. \end{align} $$
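The layer cake argument behind (2.3) has an exact discrete counterpart, which the following Python sketch tests on the integers. This is a toy model under stated assumptions, not the setting of the source: the group is $\mathbb{Z}$, functions are extended by zero outside a finite window, and the maximal function is taken over centred intervals. The function names are hypothetical.

```python
def centred_maximal(f, max_radius):
    """Discrete centred maximal function: sup over radii r <= max_radius
    of the average of |f| on [i - r, i + r], with f = 0 off the grid."""
    n = len(f)
    result = []
    for i in range(n):
        best = 0.0
        for r in range(max_radius + 1):
            window = sum(abs(f[k]) for k in range(i - r, i + r + 1) if 0 <= k < n)
            best = max(best, window / (2 * r + 1))
        result.append(best)
    return result


def convolve_abs(f, mu):
    """(|f| * mu)(i), where mu[k] is the value of a symmetric kernel at +-k."""
    n = len(f)
    radius = len(mu) - 1
    return [sum(abs(f[i - k]) * mu[abs(k)]
                for k in range(-radius, radius + 1) if 0 <= i - k < n)
            for i in range(n)]


def l1_norm(mu):
    """l^1 norm of the symmetric kernel with profile mu[0], mu[1], ..."""
    return mu[0] + 2.0 * sum(mu[1:])


def layer_cake_bound_holds(f, mu, tol=1e-12):
    """Check (|f| * mu)(i) <= ||mu||_1 * M f(i) at every grid point."""
    conv = convolve_abs(f, mu)
    maximal = centred_maximal(f, len(mu) - 1)
    norm = l1_norm(mu)
    return all(c <= norm * m + tol for c, m in zip(conv, maximal))
```

The check relies on `mu` being nonnegative and nonincreasing, which is the discrete version of "radial decreasing"; for a profile that increases somewhere the bound can fail.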

2.2 Functional calculus for the sub-Laplacian

The sub-Laplacian $\mathcal {L}$ has a spectral resolution:

$$\begin{align*}\mathcal{L} (f)=\int_{\mathbb{R}_{+}}\lambda \,\mathrm{d} \mathcal{E}_{\mathcal{L} }(\lambda) f \qquad\forall f\in \mathsf{L}^{2}(G), \end{align*}$$

where $\mathcal {E}_{\mathcal {L}}(\lambda )$ is a projection-valued measure on $[0,\infty )$ , the spectrum of $\mathcal {L}$ . For a bounded Borel function $m\colon [0,\infty )\to \mathbb {C}$ , we define the operator $m(\mathcal {L})$ spectrally:

$$ \begin{align*} m(\mathcal{L})f := \int_{\mathbb{R}_{+}} m(\lambda)\,\mathrm{d} \mathcal{E}_{\mathcal{L}}(\lambda) f \qquad\forall f\in \mathsf{L}^{2}(G). \end{align*} $$

This operator is a convolution with a Schwartz distribution on G.

2.3 The heat and Poisson kernels

Let $h_t$ and $p_{t}$ , where $t \in \mathbb {R}_{+}$ , be the heat and Poisson kernels associated with the sub-Laplacian operator $\mathcal {L}$ , that is, the convolution kernels of the operators $e^{-t\mathcal {L}}$ and $e^{-t \sqrt {\mathcal {L}}}$ on G. We write ${{q}}_{t}$ for $t\partial _t p_{t}$ . We warn the reader that $p_{t}$ and ${{q}}_{t}$ are the normalized dilates of $p_1$ and ${{q}}_{1}$ by the factor t, but $h_t$ is the normalized dilate of $h_1$ by a factor of $t^{1/2}$ . Let $\nabla $ denote the horizontal subgradient on G, and let $(\nabla , \partial _{t})$ denote the full gradient on $G\times \mathbb {R}_{+}$ .

Lemma 2.1 The kernels $h_t$ and $p_{t}$ are $\mathbb {R}_{+}$ -valued. Further, $h_t$ and $p_{t}$ have integral $1$ , while ${{q}}_{t}$ has integral $0$ for all $t \in \mathbb {R}_{+}$ . Finally, there exists a constant c such that

for all $g\in G$ and $t \in \mathbb {R}_{+}$ .

Proof For the heat kernel estimates, see [Reference Varopoulos, Saloff-Coste and Coulhon30, Theorem IV.4.2]. Note that the first estimate has a version with the opposite inequality, with a different constant c.

The estimates for $p_{t}$ and ${{q}}_{t}$ follow from the subordination formula

$$ \begin{align*} e^{-t\sqrt{\mathcal{L}}} =\frac{1}{2\sqrt{\pi}}\int_{\mathbb{R}_{+}} \frac{te^{-{t^2}/{4v}}}{\sqrt{v}}e^{-v \mathcal{L}}\frac{\,\mathrm{d} v}{v}. \end{align*} $$

We leave the details to the reader.
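By the spectral theorem, the subordination formula reduces to a scalar identity in which $\mathcal{L}$ is replaced by a number $\lambda \geq 0$. The Python sketch below checks that scalar identity numerically; it is a sanity check added for illustration, with a hypothetical function name and quadrature parameters chosen by us. It substitutes $v = \mathrm{e}^{u}$, so that $\mathrm{d}v/v = \mathrm{d}u$, and applies the trapezoid rule in $u$.

```python
import math


def subordinated_exponential(t, lam, u_min=-40.0, u_max=40.0, steps=16000):
    """Approximate (1 / (2 sqrt(pi))) * int_0^inf t exp(-t^2 / (4 v)) v^(-1/2)
    exp(-v * lam) dv / v, which should equal exp(-t * sqrt(lam))."""
    h = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps + 1):
        u = u_min + i * h
        v = math.exp(u)  # substitution v = e^u, so dv / v = du
        value = t * math.exp(-t * t / (4.0 * v)) / math.sqrt(v) * math.exp(-v * lam)
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoid end weights
        total += weight * value
    return total * h / (2.0 * math.sqrt(math.pi))
```

Comparing `subordinated_exponential(t, lam)` with `math.exp(-t * math.sqrt(lam))` for a few positive values of `t` and `lam` confirms the constant $1/(2\sqrt{\pi})$ and the signs in the exponents.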

This lemma shows that the heat kernel $h_1$ and the Poisson kernel $p_1$ and their derivatives satisfy the standard decay and smoothness conditions (2.1); their derivatives also satisfy the cancellation condition (2.2).

2.4 Systems of pseudodyadic cubes

We use the Hytönen–Kairema [Reference Hytönen and Kairema17] families of “dyadic cubes” in geometrically doubling metric spaces. We state a version of [Reference Hytönen and Kairema17, Theorem 2.2] that is simpler, in that we work on metric spaces rather than pseudometric spaces. The Hytönen–Kairema construction builds on seminal work of Christ [Reference Christ7] and of Sawyer and Wheeden [Reference Sawyer and Wheeden29].

Theorem 2.2 ([Reference Hytönen and Kairema17])

Let c, C and $\kappa $ be constants such that $0 < c \leq C < \infty $ and $12 C\kappa \leq c$ , and let $(G,\rho )$ be a metric stratified group. Then, for all $k \in \mathbb {Z}$ , there exist families $\mathscr {Q}_k(G)$ of pseudodyadic cubes Q with centres $z(Q)$ , such that:

  1. (1) G is the disjoint union of all $Q \in \mathscr {Q}_k(G)$ , for each $k\in \mathbb {Z}$ ;

  2. (2) $B(z(Q),c\kappa ^k/3)\subseteq Q \subseteq B(z(Q),2C\kappa ^k)$ for all $Q \in \mathscr {Q}_k(G)$ ;

  3. (3) if $Q \in \mathscr {Q}_k(G)$ and $Q' \in \mathscr {Q}_{k'}(G)$ where $k\leq k'$ , then either $Q \cap Q'=\emptyset $ or $Q \subseteq Q'$ ; in the second case, $B(z(Q), 2C\kappa ^k) \subseteq B(z(Q'),2C\kappa ^{k'})$ .

We write $\mathscr {Q}(G)$ for the union of all $\mathscr {Q}_k(G)$ , and call this a system of pseudodyadic cubes. Given a cube $Q \in \mathscr {Q}_k(G)$ , we denote the quantity $\kappa ^k$ by $\operatorname {\ell }(Q)$ , by analogy with the side-length of a Euclidean cube.

A finite collection $\{\mathscr {Q}^{\tau }\colon {\tau }=1,2,\dots ,\mathrm {T}\}$ of systems of pseudodyadic cubes is called a collection of adjacent systems of pseudodyadic cubes with parameters $C'$ , c, C and $\kappa $ , if it has the following properties: individually, each $\mathscr {Q}^{\tau }$ is a system of pseudodyadic cubes with parameters c, C and $\kappa $ as in Theorem 2.2; collectively, for each ball $B(x,r)\subseteq G$ such that $\kappa ^{k+3}<r\leq \kappa ^{k+2}$ , where $k\in \mathbb {Z}$ , there exist ${\tau } \in \{1, 2, \dots , \mathrm {T}\}$ and $Q\in \mathscr {Q}^{\tau }_k$ with centre $z(Q)$ such that $d(x, z(Q)) < 2\kappa ^{k}$ and

(2.4) $$ \begin{align} B(x,r)\subseteq Q\subseteq B(x,C'r). \end{align} $$
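The role of adjacent systems is easiest to see on the real line, where the classical "one-third trick" plays the part of (2.4). The Python sketch below is a model under our own assumptions and is not the Hytönen–Kairema construction: it uses the ordinary dyadic intervals together with a second dyadic grid shifted by one third, and shows that every interval $[a,b)$ lies in an interval of one of the two grids of length less than $6(b-a)$. Exact rational arithmetic avoids rounding issues; the function name is hypothetical.

```python
import math
from fractions import Fraction


def covering_interval(a, b):
    """Return (alpha, k, left, side): [a, b) is contained in [left, left + side),
    an interval of generation k in the grid shifted by alpha in {0, 1/3},
    where side = 2**(-k) satisfies 3 (b - a) <= side < 6 (b - a)."""
    length = b - a
    k = 0
    while Fraction(2) ** (-k) < 3 * length:        # make the side long enough
        k -= 1
    while Fraction(2) ** (-(k + 1)) >= 3 * length:  # but not unnecessarily long
        k += 1
    side = Fraction(2) ** (-k)
    sign = 1 if k % 2 == 0 else -1  # alternating sign keeps the shifted grid nested
    for alpha in (Fraction(0), Fraction(1, 3)):
        shift = sign * alpha * side
        m = math.floor((a - shift) / side)
        left = shift + m * side
        if left <= a and b <= left + side:
            return alpha, k, left, side
    raise AssertionError("unreachable: one of the two grids always works")
```

The final `raise` is never reached: endpoints of the two grids at generation $k$ are at distance at least `side / 3` from each other, so an interval of length at most `side / 3` cannot contain an endpoint of both grids in its interior.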

The following construction is due to [Reference Hytönen and Kairema17].

Theorem 2.3 Suppose that $(G,\rho )$ is a metric stratified group. Then there exists a finite collection $\{\mathscr {Q}^{\tau }\colon {\tau } = 1,2,\dots ,\mathrm {T}\}$ of adjacent systems of pseudodyadic cubes with parameters $C'$ , c, C and $\kappa $ , where $\kappa := 1/100$ , $c := 12^{-1}$ , $C := 4$ and $C' := 8 \times 10^{6}$ . For each ${\tau }\in \{1,2,\dots ,\mathrm {T}\}$ , the centres $z(Q)$ of the cubes $Q \in \mathscr {Q}^{\tau }_k$ have the properties that

$$\begin{align*}d(z(Q), z(Q')) \geq \kappa^k /4 \qquad\text{when }Q\neq Q' \end{align*}$$

and

$$\begin{align*}\min \{ d(x, z(Q)) \colon Q \in \mathscr{Q}^{\tau}_k \} < 2\kappa^k \qquad\forall x\in G. \end{align*}$$

From [Reference Kairema, Li, Pereyra and Ward21, Remark 2.8], the number $\mathrm {T}$ of adjacent systems of pseudodyadic cubes in Theorem 2.3 may be taken to be at most $A^6 \kappa ^{-\log _2(A)}$ , where A is the geometric doubling constant of G. The constants c, C, $C'$ and $\kappa $ do not depend on the choice of the metric stratified group $(G,\rho )$ .

2.5 Products of stratified groups

We equip the product of two stratified groups $G_1$ and $G_2$ with a product structure. We carry forward the notation from Section 2.1, adding a subscript i or superscript $[i]$ to clarify that we are dealing with $G_i$ ; the parameter i is always $1$ or $2$ . To shorten the formulae, we often use bold face type to indicate a product object: thus we write $\mathbf {G}$ , $\mathbf {T}$ , $\mathbf {g}$ , $\mathbf {r}$ and $\mathbf {t}$ in place of $G_1 \times G_2$ , $\mathbb {R}_{+} \times \mathbb {R}_{+}$ , $(g_1,g_2)$ , $(r_1,r_2)$ and $(t_1,t_2)$ . For example, $B_i (g_i, r_i)$ denotes the open ball in $G_i$ with centre $g_i$ and radius $r_i$ , with respect to the homogeneous norm $\rho _i $ , and we write $\mathbf {P}(\mathbf {g},\mathbf {r})$ for the product $B_1(g_1, r_1) \times B_2(g_2, r_2)$ . We write $\pi _i$ for the projection of $\mathbf {G}$ onto $G_i$ .

Products of balls are basic geometric objects and we write $\mathscr {P}(\mathbf {G})$ for the family of all such products. In addition we deal with rectangles, by which we mean products of pseudodyadic cubes. We write $\mathscr {R}(\mathbf {G})$ for the family of all rectangles, and for adjacent pseudodyadic systems as constructed in Theorem 2.3, we let $\mathscr {R}^{{\tau }_1,{\tau }_2}(\mathbf {G})$ be the family of rectangles $\{Q_1\times Q_2\colon Q_i\in \mathscr {Q}^{{\tau }_i},\ {\tau }_i=1,2,\dots ,\mathrm {T}_i\}$ . We let $\operatorname {\boldsymbol {\ell }}\colon \mathscr {R}(\mathbf {G}) \to \mathbf {T}$ be the function such that $\operatorname {\ell }_i(Q_1 \times Q_2) = \operatorname {\ell }(Q_i)$ , the “side-length” of $Q_i$ . We also let $\mathbf {z}(R)$ be the center of R, that is, $\mathbf {z}(R)=(z_1(R), z_2(R))$ , where, when $R=Q_1\times Q_2$ , we set $z_j(R)=z(Q_j)$ , with $z(Q_j)$ as defined in Theorem 2.2.

The element of Haar measure on $\mathbf {G}$ is denoted $\,\mathrm {d}\mathbf {g}$ , but is often written $\,\mathrm {d} g_1\,\mathrm {d} g_2$ for calculations. The convolution $f\ast f'$ of suitable functions f and $f'$ on $\mathbf {G}$ is defined by

$$ \begin{align*} (f\ast f')(\mathbf{g}) :=\int_{\mathbf{G}}f(\mathbf{h})f'(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h}. \end{align*} $$

The strong maximal operator $\mathcal {M}_{s}$ is defined by

for all $f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G})$ . It is evident that $\mathcal {M}_{s}$ is dominated by the composition of the Hardy–Littlewood maximal operators in the factors:

$$\begin{align*}\mathcal{M}_{s}{f} \leq \mathcal{M}_1 \mathcal{M}_2 (f) \qquad\text{and}\qquad \mathcal{M}_{s}{f} \leq \mathcal{M}_2 \mathcal{M}_1(f). \end{align*}$$
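The domination of the strong maximal operator by iterated one-parameter maximal operators can be tested on a finite grid. The sketch below is a discrete model under our own assumptions, not the definition used in the source: functions live on a rectangle in $\mathbb{Z}^2$, are extended by zero, and all averages are over centred intervals or centred rectangles. The function names are hypothetical.

```python
def centred_average(line, centre, radius):
    """Average of |line| over [centre - radius, centre + radius], zero off the grid."""
    total = sum(abs(line[k]) for k in range(centre - radius, centre + radius + 1)
                if 0 <= k < len(line))
    return total / (2 * radius + 1)


def maximal_along_rows(f, max_radius):
    """One-parameter centred maximal function in the second variable."""
    return [[max(centred_average(row, j, r) for r in range(max_radius + 1))
             for j in range(len(row))] for row in f]


def transpose(f):
    return [list(column) for column in zip(*f)]


def maximal_along_columns(f, max_radius):
    """One-parameter centred maximal function in the first variable."""
    return transpose(maximal_along_rows(transpose(f), max_radius))


def strong_maximal(f, max_radius):
    """Sup of averages of |f| over centred rectangles with independent radii."""
    rows, cols = len(f), len(f[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            best = 0.0
            for r1 in range(max_radius + 1):
                for r2 in range(max_radius + 1):
                    total = sum(abs(f[a][b])
                                for a in range(i - r1, i + r1 + 1) if 0 <= a < rows
                                for b in range(j - r2, j + r2 + 1) if 0 <= b < cols)
                    best = max(best, total / ((2 * r1 + 1) * (2 * r2 + 1)))
            out[i][j] = best
    return out
```

Comparing `strong_maximal(f, R)` entrywise with `maximal_along_columns(maximal_along_rows(f, R), R)` illustrates the first inequality; swapping the order of the two one-parameter operators illustrates the second.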

When $1 < p \leq \infty $ , the operators $\mathcal {M}_1$ and $ \mathcal {M}_2$ in the factors are $\mathsf {L}^{p}$ -bounded, so the iterated maximal operators and hence the strong maximal operator are also $\mathsf {L}^{p}$ -bounded.

Given an open subset U of $\mathbf {G}$ with finite measure $\left | U \right | $ , we define the enlargement $U^{*}$ of U using the strong maximal operator $\mathcal {M}_{s}$ :

$$ \begin{align*} U^{*} := \Big\{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s} \chi_{U}(\mathbf{g})> \alpha \Big\} , \end{align*} $$

for some $\alpha \in (0,1)$ that varies from instance to instance. It is easy to see that $U \subset U^*$ , while the $\mathsf {L}^{2}(\mathbf {G})$ boundedness of the strong maximal operator together with Chebyshev’s inequality shows that $| U^*| \lesssim |U|$ . We write $\mathscr {M}(U)$ for the family of maximal rectangles contained in U.
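The measure estimate for the enlargement can be written out in one line. In the sketch below, $\Vert \mathcal{M}_s \Vert$ denotes the operator norm of $\mathcal{M}_s$ on $\mathsf{L}^2(\mathbf{G})$ (notation introduced here for the illustration); the computation shows that the implicit constant in $|U^*| \lesssim |U|$ depends on $\alpha$ through the factor $\alpha^{-2}$.

```latex
\begin{align*}
|U^{*}|
  = \bigl| \{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s}\chi_{U}(\mathbf{g}) > \alpha \} \bigr|
  \leq \frac{1}{\alpha^{2}} \int_{\mathbf{G}} \bigl( \mathcal{M}_{s}\chi_{U}(\mathbf{g}) \bigr)^{2} \,\mathrm{d}\mathbf{g}
  \leq \frac{\Vert \mathcal{M}_{s} \Vert^{2}}{\alpha^{2}} \, \Vert \chi_{U} \Vert_{\mathsf{L}^{2}(\mathbf{G})}^{2}
  = \frac{\Vert \mathcal{M}_{s} \Vert^{2}}{\alpha^{2}} \, |U|.
\end{align*}
```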

If $\varphi ^{[1]}$ on $G_1$ and $\varphi ^{[2]}$ on $G_2$ both satisfy the decay and smoothness conditions (2.1) and have integral $1$ , then

(2.5) $$ \begin{align} \left| ( f \ast \varphi_{\mathbf{t}})(\mathbf{g}) \right| \lesssim \mathcal{M}_{s}(f)(\mathbf{g}) \qquad\forall \mathbf{g} \in \mathbf{G}, \quad\forall f \in \mathsf{L}^{1}(\mathbf{G}), \end{align} $$

much as argued to prove (2.3), but with “biradial” in place of “radial”.

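Recall from (1.6) that ${{q}}_{\mathbf {t}}$ denotes $t_1\partial _{t_1} p_{t_1}^{[1]} \otimes t_2\partial _{t_2} p^{[2]}_{t_2}$ and

$$\begin{align*}\mathcal{S}_{q,\eta }(f)(\mathbf{g}) =\left( \iint_{\Gamma^{\eta }(\mathbf{g})} \frac{\left| ( f \ast {{q}}_{\mathbf{t}})(\mathbf{h}) \right| ^{2}}{\left| \mathbf{P}(\mathbf{g},\eta \mathbf{t}) \right| } \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \right) ^{1/2} \qquad\forall \mathbf{g}\in\mathbf{G}. \end{align*}$$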

The properties of the Poisson kernel are important for the next lemma.

Lemma 2.4 Suppose that $-\partial _{t_1}^2 - \partial _{t_2}^2 + \mathcal {L}^{[1]} + \mathcal {L}^{[2]}$ is analytic hypoelliptic on $\mathbf {T} \times \mathbf {G}$ . If $f \in \mathsf {L}^{2}(\mathbf {G})$ and $\mathcal {S}_{q,\eta }(f)(\mathbf {g}) = 0$ for some $\mathbf {g} \in \mathbf {G}$ and some $\eta \in \mathbb {R}_{+}$ , then $f = 0$ .

Proof The function $(\mathbf {t}, \mathbf {h}) \mapsto (f \ast {{q}}_{\mathbf {t}} )(\mathbf {h})$ is smooth in $ \mathbf {T} \times \mathbf {G}$ , and so the hypothesis $\mathcal {S}_{q,\eta }(f)(\mathbf {g}) = 0$ implies that $(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) = 0$ for all $(\mathbf {t}, \mathbf {h}) \in \Gamma ^\eta (\mathbf {g})$ . Hence for all $\mathbf {h} \in \mathbf {G}$ , the function $\mathbf {t} \mapsto (f \ast p_{\mathbf {t}})(\mathbf {h})$ is constant once $t_1$ and $t_2$ are both large enough. However,

$$\begin{align*}\left| (f \ast p_{\mathbf{t}})(\mathbf{h}) \right| \leq \left\Vert f \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \left\Vert p_{\mathbf{t}} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \to 0 \end{align*}$$

as $t_1$ or $t_2$ tends to infinity, so the only possible value for $(f \ast p_{\mathbf {t}})(\mathbf {h})$ when $t_1$ and $t_2$ are both large is $0$ . Since

$$\begin{align*}(-\partial_{t_1}^2 + \mathcal{L}^{[1]})( f \ast p_{\mathbf{t}}) = (- \partial_{t_2}^2 + \mathcal{L}^{[2]}) (f \ast p_{\mathbf{t}}) = 0, \end{align*}$$

analytic hypoellipticity implies that $f \ast p_{\mathbf {t}} = 0$ in $\mathbf {T} \times \mathbf {G}$ and hence $f = 0$ .
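The decay of $\Vert p_{\mathbf{t}} \Vert_{\mathsf{L}^{2}(\mathbf{G})}$ used in the proof follows from scaling. The short computation below is added as a worked step; it uses only the definition of the normalized dilate and the fact that $\delta_t$ scales Haar measure on $G_i$ by $t^{\nu_i}$, where $\nu_i$ is the homogeneous dimension of $G_i$.

```latex
\begin{align*}
\Vert p^{[i]}_{t_i} \Vert_{\mathsf{L}^{2}(G_i)}^{2}
  &= t_i^{-2\nu_i} \int_{G_i} \bigl| p^{[i]}_{1}(\delta_{1/t_i}(g)) \bigr|^{2} \,\mathrm{d} g
   = t_i^{-\nu_i} \, \Vert p^{[i]}_{1} \Vert_{\mathsf{L}^{2}(G_i)}^{2}, \\
\Vert p_{\mathbf{t}} \Vert_{\mathsf{L}^{2}(\mathbf{G})}
  &= t_1^{-\nu_1/2} \, t_2^{-\nu_2/2} \,
     \Vert p^{[1]}_{1} \Vert_{\mathsf{L}^{2}(G_1)} \, \Vert p^{[2]}_{1} \Vert_{\mathsf{L}^{2}(G_2)}.
\end{align*}
```

The right-hand side tends to $0$ as soon as one of $t_1$ and $t_2$ tends to infinity while the other stays bounded away from $0$.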

The hypothesis of analytic hypoellipticity certainly holds when $\mathbf {G} = \mathbb {R}^{m} \times \mathbb {R}^{n}$ , but in general it is not satisfied; see [Reference Helffer16] for more information.

2.6 Journé’s covering lemma

We recall a covering lemma for product spaces. Journé [Reference Journé20] first established this result on $\mathbb {R}\times \mathbb {R}$ . The fourth author [Reference Pipher28] extended it to higher dimensional Euclidean spaces and an arbitrary number of factors $\mathbb {R}^{n_1}\times \mathbb {R}^{n_2}\times \ldots \times \mathbb {R}^{n_m}$ . This was extended to products of spaces of homogeneous type in [Reference Han, Li and Lin14, Lemma 2.1].

Take $\alpha \in (0,1/2)$ . Let U be an open subset of $\mathbf {G}$ of finite measure and let $\mathscr {M}(U)$ be the set of all maximal subrectangles of U. Given $R=Q_1\times Q_2\in \mathscr {M}(U)$ , let $\tilde {Q}_2$ be the biggest pseudodyadic cube containing $Q_2$ such that

$$\begin{align*}\big|\big( Q_1\times \tilde{Q}_2\big) \cap U\big|> \alpha |Q_1\times \tilde{Q}_2|. \end{align*}$$

Similarly, given $R=Q_1\times Q_2\in \mathscr {M}(U)$ , let $\tilde {Q}_1$ be the biggest pseudodyadic cube containing $Q_1$ such that

$$\begin{align*}\big|\big( \tilde{Q}_1\times Q_2\big) \cap U\big|>\alpha |\tilde{Q}_1\times Q_2|. \end{align*}$$

We then set

$$\begin{align*}\gamma_1(R) := \frac{\operatorname{\ell}(\tilde{Q}_1)}{\operatorname{\ell}(Q_1)} \qquad\text{and}\qquad \gamma_2(R) := \frac{\operatorname{\ell}(\tilde{Q}_2)}{\operatorname{\ell}(Q_2)}. \end{align*}$$

Here is our Journé-type covering lemma on $\mathbf {G}$ .

Lemma 2.5 Suppose that U is an open subset in $\mathbf {G}$ of finite measure and $\delta \in \mathbb {R}_{+}$ . Then

$$ \begin{align*} \sum_{R\in \mathscr{M}(U)}\left| R \right| \left( \gamma_1(R)^{-\delta} + \gamma_2(R)^{-\delta} \right) \lesssim_\delta |U|. \end{align*} $$

Proof See [Reference Han, Li and Lin14, Lemma 2.1].

2.7 The Orlicz space L log L

We recall the definition of an Orlicz space on $\mathbf {G}$ . A Young function is a continuous, convex, increasing bijective function $\Phi \colon [0,\infty )\to [0,\infty )$ . To a Young function $\Phi $ , we associate the nonlinear functional $F_\Phi $ , given by

$$\begin{align*}F_{\Phi}(f) = \int_{\mathbf{G}} \Phi(|f(\mathbf{g})|) \,\mathrm{d} \mathbf{g} \qquad\forall f \in \mathsf{L}^{1}_{\mathrm{loc}}(\mathbf{G}). \end{align*}$$

The convexity of $\Phi $ implies that $\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon F_\Phi (f) \leq 1 \}$ is a closed convex symmetric set, and so the Luxemburg norm, given by

$$\begin{align*}\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})} := \inf\left\{ \lambda\in \mathbb{R}_{+} \colon F_{\Phi}(f/\lambda) \leq 1 \right\}\hspace{-1pt}, \end{align*}$$

is indeed a norm, and $\mathsf {L}^{\Phi }(\mathbf {G})$ , the set of functions f for which $\|f\|_{\mathsf {L}^{\Phi }(\mathbf {G})}$ is finite, is a Banach space. The sets $\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon F_\Phi (f) \leq 1 \}$ and $\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon \left \Vert f \right \Vert {}_{\mathsf {L}^{\Phi }(\mathbf {G})} \leq 1 \}$ coincide.

Suppose that $\Phi $ and $\Psi $ are Young functions such that

(2.6) $$ \begin{align} st \leq \Phi(s)+\Psi(t) \qquad\forall s,t \in \mathbb{R}_{+}. \end{align} $$

Then

(2.7) $$ \begin{align} \left| \int_{\mathbf{G}} f(\mathbf{g}) h(\mathbf{g}) \,\mathrm{d} \mathbf{g} \right| \leq \int_{\mathbf{G}} \Phi(f(\mathbf{g})) \,\mathrm{d} \mathbf{g} + \int_{\mathbf{G}} \Psi( h(\mathbf{g})) \,\mathrm{d} \mathbf{g} , \end{align} $$

and it follows that the corresponding Luxemburg norms are related by the generalized Hölder inequalities, namely,

(2.8) $$ \begin{align} \Big|\int_{\mathbf{G}} f(\mathbf{g}) h(\mathbf{g}) \,\mathrm{d} \mathbf{g}\Big| \leq 2\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})} \|h\|_{\mathsf{L}^{\Psi}(\mathbf{G})}. \end{align} $$
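
For the reader's convenience, here is a sketch of how (2.8) may be deduced from (2.7). It assumes that the two Luxemburg norms are finite and nonzero, writes $F_\Psi$ for the functional defined like $F_\Phi$ with $\Psi$ in place of $\Phi$, and uses the fact recorded above that $F_\Phi(f/\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})}) \leq 1$ (and similarly for $\Psi$). Applying (2.7) to the normalized functions $|f|/\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})}$ and $|h|/\|h\|_{\mathsf{L}^{\Psi}(\mathbf{G})}$ gives

$$\begin{align*}\frac{1}{\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})} \|h\|_{\mathsf{L}^{\Psi}(\mathbf{G})}} \Big|\int_{\mathbf{G}} f(\mathbf{g}) h(\mathbf{g}) \,\mathrm{d} \mathbf{g}\Big| \leq F_\Phi\Big(\frac{f}{\|f\|_{\mathsf{L}^{\Phi}(\mathbf{G})}}\Big) + F_\Psi\Big(\frac{h}{\|h\|_{\mathsf{L}^{\Psi}(\mathbf{G})}}\Big) \leq 1 + 1 = 2. \end{align*}$$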

We are particularly interested in a special pair of Young functions. Henceforth,

(2.9) $$ \begin{align} \Phi(s):=s[\log(\mathrm{e} + s)]\qquad\text{and}\qquad \Psi(t) := \exp(t)-1. \end{align} $$

By maximizing in s, it is straightforward to show that $1 + st - s \log (\mathrm {e}+s) \leq \mathrm {e}^t$ , so these functions satisfy (2.6), whence (2.7) and (2.8) hold. For this choice of $\Phi $ and $\Psi $ , the corresponding spaces are denoted by $\mathsf {L\,log\,L}(\mathbf {G})$ and $\mathrm {e}^{\mathsf {L}^{}}(\mathbf {G})$ . Clearly $\mathsf {L\,log\,L}(\mathbf {G}) \subseteq \mathsf {L}^{1}(\mathbf {G})$ . Further, $\Phi (\lambda t) \leq \Phi (\lambda ) \Phi (t)$ , whence

(2.10) $$ \begin{align} F_\Phi(\lambda f) \leq \Phi(\lambda) F_\Phi(f) \qquad\forall \lambda\in \mathbb{R}_{+} \quad\forall f \in \mathsf{L\,log\,L}(\mathbf{G}). \end{align} $$

This implies that $f \in \mathsf {L\,log\,L}(\mathbf {G})$ if and only if $F_\Phi (f)$ is finite, and that

(2.11) $$ \begin{align} F_\Phi(f/\lambda) \leq \Phi(1/\lambda) F_\Phi(f) , \end{align} $$

which we used in the introduction to prove (1.4) from (1.3).
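
The equivalence between membership of $\mathsf{L\,log\,L}(\mathbf{G})$ and finiteness of $F_\Phi(f)$ can be unpacked as follows; this is a sketch that uses only (2.10), (2.11), and the fact that $\Phi(1/\lambda) \to 0$ as $\lambda \to \infty$. If $F_\Phi(f)$ is finite, then

$$\begin{align*}F_\Phi(f/\lambda) \leq \Phi(1/\lambda) F_\Phi(f) \leq 1 \qquad\text{for all sufficiently large } \lambda, \end{align*}$$

so the Luxemburg norm of $f$ is finite. Conversely, if $F_\Phi(f/\lambda) \leq 1$ for some $\lambda \in \mathbb{R}_{+}$, then

$$\begin{align*}F_\Phi(f) = F_\Phi(\lambda (f/\lambda)) \leq \Phi(\lambda) F_\Phi(f/\lambda) \leq \Phi(\lambda) < \infty. \end{align*}$$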

Finally, since $\Psi $ is a Young function,

(2.12) $$ \begin{align} \Psi(t/\lambda) \leq \Psi(t)/\lambda \qquad\forall \lambda\in [1,\infty). \end{align} $$
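
A short sketch of why (2.12) holds: a Young function is an increasing bijection of $[0,\infty)$, so $\Psi(0) = 0$, and convexity applied to the convex combination $t/\lambda = (1/\lambda) t + (1 - 1/\lambda) 0$ gives

$$\begin{align*}\Psi(t/\lambda) \leq \frac{1}{\lambda} \Psi(t) + \Big(1 - \frac{1}{\lambda}\Big) \Psi(0) = \frac{\Psi(t)}{\lambda} \qquad\forall \lambda\in [1,\infty). \end{align*}$$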

For all this and much more about Orlicz spaces, see [Reference Luxemburg24].

We need a density result.

Proposition 2.6 $\mathsf {L}^{2} \cap \mathsf {L\,log\,L} (\mathbf {G})$ is dense in $\mathsf {L\,log\,L} (\mathbf {G})$ .

Proof For $f\in \mathsf {L\,log\,L} (\mathbf {G})$ , we define, for all $N \in \mathbb {N}$ ,

$$\begin{align*}f_N := \chi_{\{\mathbf{g}\in \mathbf{G}\colon |f(\mathbf{g})| \leq N\}} f. \end{align*}$$

Then $f_N\in \mathsf {L\,log\,L} (\mathbf {G})$ , $|f_N| \leq |f|$ , and $f_N \to f$ almost everywhere as $N \to \infty $ . Next,

$$ \begin{align*} \int_{\mathbf{G}} |f_N(\mathbf{g})|^2 \,\mathrm{d} \mathbf{g} \leq \int_{\mathbf{G}} N |f(\mathbf{g})| \log(\mathrm{e} + |f(\mathbf{g})|) \,\mathrm{d} \mathbf{g} = N F_{\Phi}(f), \end{align*} $$

which shows that $f_N \in \mathsf {L}^{2}(\mathbf {G})$ for all $N \in \mathbb {Z}^+$ . Finally, if $\lambda < 1$ , then

$$ \begin{align*} F_{\Phi}((f-f_N)/\lambda) &= \frac{1}{\lambda} \int_{\mathbf{G}} |f(\mathbf{g})-f_N(\mathbf{g})| \log\Big(\mathrm{e} + |(f(\mathbf{g})-f_N(\mathbf{g}))/\lambda|\Big) \,\mathrm{d} \mathbf{g}\\ &\leq \frac{1}{\lambda} \int_{\{\mathbf{g}\in \mathbf{G}\colon |f(\mathbf{g})|>N\}} |f(\mathbf{g})| \left( \log(\mathrm{e} + |f(\mathbf{g})|) + \log(1/\lambda) \right) \,\mathrm{d} \mathbf{g} \\ &\to 0 \end{align*} $$

as $N \to \infty $ . Consequently, $\left \Vert {f-f_N}\right \Vert {}_{\mathsf {L\,log\,L}(\mathbf {G})}$ tends to zero as N tends to infinity.
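
The last two steps may be unpacked as follows; this is a sketch that relies only on dominated convergence and the definition of the Luxemburg norm. Since $\log(\mathrm{e} + s) \geq 1$, the integrand in the second line of the preceding display is dominated by an integrable function,

$$\begin{align*}|f(\mathbf{g})| \left( \log(\mathrm{e} + |f(\mathbf{g})|) + \log(1/\lambda) \right) \leq \left( 1 + \log(1/\lambda) \right) \Phi(|f(\mathbf{g})|), \end{align*}$$

and the sets $\{\mathbf{g}\in \mathbf{G}\colon |f(\mathbf{g})|>N\}$ decrease to a null set, so the integral tends to $0$. Hence, for each fixed $\lambda \in (0,1)$, there exists $N_\lambda$ such that

$$\begin{align*}F_{\Phi}((f-f_N)/\lambda) \leq 1, \qquad\text{that is,}\qquad \left\Vert f-f_N \right\Vert{}_{\mathsf{L\,log\,L}(\mathbf{G})} \leq \lambda, \qquad\forall N \geq N_\lambda. \end{align*}$$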

We also have the following auxiliary result.

Lemma 2.7 For every $f \in \mathsf {L\,log\,L}(\mathbf {G})$ ,

$$\begin{align*}\sum_{k \leq 0} 2^{k}F_\Phi(2^{-k}f)^{1/2} \lesssim F_\Phi(f)^{1/2}. \end{align*}$$

Proof By definition, if $k\leq 0$ , then

$$ \begin{align*} 2^kF_{\Phi}(2^{-k}f)^{1/2} &= 2^{k} \bigg(\int_{\mathbf{G}} 2^{-k} |f(\mathbf{g})| \log\Big(\mathrm{e} + 2^{-k}|f(\mathbf{g})|\Big) \,\mathrm{d} \mathbf{g}\bigg)^{1/2}\\ &\leq 2^{k / 2} \bigg(\int_{\mathbf{G}} |f(\mathbf{g})| \log\Big( 2^{-k}(\mathrm{e} + |f(\mathbf{g})|)\Big) \,\mathrm{d} \mathbf{g}\bigg)^{1/2} \\ &= 2^{k / 2} \bigg(\int_{\mathbf{G}} |f(\mathbf{g})|\Big[ \log 2^{-k}+ \log\big( \mathrm{e} + |f(\mathbf{g})|\big) \Big]\,\mathrm{d} \mathbf{g}\bigg)^{1/2} \\ &\leq (-k)^{1 / 2}2^{k / 2} (\log 2)^{1 / 2} \bigg(\int_{\mathbf{G}} |f(\mathbf{g})| \,\mathrm{d} \mathbf{g}\bigg)^{1/2} \\ &\qquad\qquad\qquad+ 2^{k / 2} \bigg(\int_{\mathbf{G}} |f(\mathbf{g})| \log\big( \mathrm{e} + |f(\mathbf{g})|\big) \,\mathrm{d} \mathbf{g}\bigg)^{1/2}. \end{align*} $$

Summing over all such k yields the required estimate.
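
The summation can be made explicit; the following is a sketch. Since $\log(\mathrm{e} + s) \geq 1$, we have $\int_{\mathbf{G}} |f(\mathbf{g})| \,\mathrm{d} \mathbf{g} \leq F_\Phi(f)$, and writing $j = -k$ gives

$$\begin{align*}\sum_{k \leq 0} 2^{k}F_\Phi(2^{-k}f)^{1/2} \leq \bigg( (\log 2)^{1/2} \sum_{j=0}^{\infty} j^{1/2} 2^{-j/2} + \sum_{j=0}^{\infty} 2^{-j/2} \bigg) F_\Phi(f)^{1/2}. \end{align*}$$

Both series converge: the first by the ratio test, while the second is geometric with sum $1/(1 - 2^{-1/2}) = 2 + \sqrt{2}$.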

3 The strong maximal function and Proof of Theorem 1.1

The results in this section may be easily generalized to spaces of homogeneous type.

Before we tackle the main topic, we state and prove a covering lemma. We begin with a geometric observation.

Lemma 3.1 There is a geometric constant $C(\mathbf {G})$ with the property that

$$\begin{align*}R \subseteq \{ \mathbf{g}\in \mathbf{G} \colon \mathcal{M}_{s}(\chi_U)(\mathbf{g})> C(\mathbf{G}) \lambda \} \end{align*}$$

for all measurable subsets U of $\mathbf {G}$ , all $\lambda \in (0,1)$ and all rectangles $R \in \mathscr {P}$ such that

$$\begin{align*}|R \cap U| \geq \lambda \left| R \right|. \end{align*}$$

Proof By definition, R is a product of pseudodyadic cubes $Q_1 \times Q_2$ , and by Theorem 2.2,

$$\begin{align*}B(z(Q_i),c \ell(Q_i)/3)\subseteq Q_i \subseteq B(z(Q_i),2C\ell(Q_i)). \end{align*}$$

Define $P := B(z(Q_1),2C\ell (Q_1)) \times B(z(Q_2),2C\ell (Q_2))$ . Then $R \subseteq P$ and

$$ \begin{align*} |P \cap U| &\geq |R \cap U| \geq \lambda \left| R \right| \\ &\geq \lambda | B(z(Q_1),c\ell(Q_1)/3) \times B(z(Q_2),c\ell(Q_2)/3)| \\ &= \lambda \Big(\frac{c }{6C}\Big)^{\nu_1+\nu_2} | B(z(Q_1),2C\ell(Q_1)) \times B(z(Q_2),2C\ell(Q_2))| \\ &=: \lambda C(\mathbf{G}) |P|; \end{align*} $$

in the first equality we use the homogeneity of the measure of the balls on $G_1$ and $G_2$ . We could also prove a similar inequality by using the doubling property of the measure.

Hence R is a subset of the set

$$\begin{align*}U^* = \{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s} \chi_{U}(\mathbf{g})> C(\mathbf{G}) \lambda \}, \end{align*}$$

as required.
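
As a sketch of the two steps used in this proof, assume (as the text indicates) that balls in $G_i$ satisfy $|B(z,r)| = r^{\nu_i} |B(o,1)|$, where $\nu_i$ denotes the homogeneous dimension of $G_i$, and that $\mathcal{M}_{s}$ is the supremum of averages over products of balls containing the point. Then, for $i = 1, 2$,

$$\begin{align*}\frac{|B(z(Q_i),c\ell(Q_i)/3)|}{|B(z(Q_i),2C\ell(Q_i))|} = \Big(\frac{c\ell(Q_i)/3}{2C\ell(Q_i)}\Big)^{\nu_i} = \Big(\frac{c}{6C}\Big)^{\nu_i}, \end{align*}$$

and, for every $\mathbf{g} \in R \subseteq P$,

$$\begin{align*}\mathcal{M}_{s}(\chi_U)(\mathbf{g}) \geq \frac{|P \cap U|}{|P|} \geq \lambda C(\mathbf{G}). \end{align*}$$

Under these assumptions, the strict inequality in the statement may be arranged by replacing $C(\mathbf{G})$ with a slightly smaller constant.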

Theorem 3.2 [Reference Córdoba and Fefferman5]

Let $\{R_j\}_{j\in J}$ be a family of rectangles in $ \mathbf {G}$ such that $\big |\bigcup _{j\in J} R_j\big |$ is finite. Then there is a sequence of rectangles $\{\tilde {R}_k\}\subset \{R_j\}_{j\in J}$ such that

$$\begin{align*}\big|\bigcup_{j\in J} R_j\big|\lesssim \big|\bigcup_{k} \tilde{R}_k\big| \qquad\text{and}\qquad \Big\Vert \sum_k\chi_{\tilde{R}_k} \Big\Vert_{\mathrm{e}^{\mathsf{L}^{}}(\mathbf{G})} \lesssim \big|\bigcup_{j\in J} R_j\big|. \end{align*}$$

Proof The proof is a generalization of that in [Reference Córdoba and Fefferman5]. However, the pseudodyadic case is somewhat trickier than the dyadic case in $\mathbb {R}^2$ , and the argument in [Reference Córdoba and Fefferman5] is rather brief and not always precise, so it seems worthwhile to provide a complete proof.

Since there are countably many rectangles, we may assume that $J = \mathbb {N}$ and $j \in \mathbb {N}$ . Let

$$\begin{align*}E:=\bigcup_j R_j. \end{align*}$$

Choose a subsequence $\{R_{\sigma (j)} \colon j \in \mathbb {N}\}$ of $\{R_j \colon j \in \mathbb {N}\}$ , using the rules that $\sigma (1) = 1$ and $\sigma (k+1)$ is the least $j> \sigma (k)$ such that

$$\begin{align*}| R_j \cap (R_{\sigma(1)} \cup \ldots \cup R_{\sigma(k)}) | \leq \frac{1}{4} |R_j|; \end{align*}$$

the construction terminates if this is not possible. Let $E_K = \bigcup _{1 \leq k \leq K} R_{\sigma (k)} \subseteq E$ and let $E_\infty = \bigcup _{1 \leq k < \infty } R_{\sigma (k)}$ . We note that $E_\infty $ is a subset of E, and hence $|E_\infty |\leq |E|<\infty $ by assumption. If the construction does not terminate, then by monotone convergence, we may choose K such that $|E_K| \geq |E_\infty |/2$ ; if the construction terminates, we take K to be the last index of the finite sequence. The set E is the union of the rectangles $R_j$ that are not among the chosen rectangles $R_{\sigma (k)}$ together with the chosen rectangles, which are subsets of $E_\infty $ if the construction did not terminate, or of $E_K$ otherwise. Suppose that the construction did not terminate. Each rectangle $R_j$ that was not chosen satisfies $|R_j \cap E_\infty |> |R_j|/4$ , so by Lemma 3.1,

$$\begin{align*}R_j \subseteq E_\infty^* := \{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s} \chi_{E_\infty}(\mathbf{g})> {C(\mathbf{G})}/{4} \}. \end{align*}$$

Hence $E \subseteq E_\infty ^* \cup E_\infty $ , and the boundedness of $\mathcal {M}_{s}$ on $\mathsf {L}^{2}(\mathbf {G})$ shows that

$$\begin{align*}|E| \leq |E_\infty^*| + |E_\infty| \lesssim |E_\infty| \leq 2 |E_K|. \end{align*}$$
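
The bound $|E_\infty^*| \lesssim |E_\infty|$ can be justified by Chebyshev's inequality; the following is a sketch, writing $c := C(\mathbf{G})/4$ and using only the $\mathsf{L}^{2}(\mathbf{G})$ boundedness of $\mathcal{M}_{s}$:

$$\begin{align*}|E_\infty^*| \leq \frac{1}{c^{2}} \int_{\mathbf{G}} \left| \mathcal{M}_{s} \chi_{E_\infty}(\mathbf{g}) \right|^2 \,\mathrm{d} \mathbf{g} \lesssim \frac{1}{c^{2}} \int_{\mathbf{G}} \left| \chi_{E_\infty}(\mathbf{g}) \right|^2 \,\mathrm{d} \mathbf{g} = \frac{1}{c^{2}} |E_\infty|. \end{align*}$$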

When the construction terminates, a slightly easier argument shows that $|E| \lesssim |E_K|$ .

We now relabel the finite family of rectangles $\{R_{\sigma (1)}, \dots , R_{\sigma (K)}\}$ as $R_K, \dots , R_1$ and repeat the construction, setting $\tau (1) := 1$ and choosing $\tau (l+1)$ to be the least $k> \tau (l)$ such that

$$\begin{align*}| R_k \cap (R_{\tau(1)} \cup \ldots \cup R_{\tau(l)}) | \leq \frac{1}{4} |R_k|; \end{align*}$$

we end up with L terms, say. Since

$$\begin{align*}| R_k \cap (R_{k+1} \cup \ldots \cup R_{K}) | \leq \frac{1}{4} |R_k| \end{align*}$$

by construction, it follows that

(3.1) $$ \begin{align} | R_{\tau(l+1)} \cap ((R_{\tau(1)} \cup \dots \cup R_{\tau(l)}) \cup (R_{\tau(l+2)} \cup \ldots \cup R_{\tau(L)}))| \leq \frac{1}{2} | R_{\tau(l+1)} |. \end{align} $$
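
In more detail, here is a sketch of how the two selection rules combine. Write $k := \tau(l+1)$; since $\tau$ is increasing, the rectangles $R_{\tau(l+2)}, \dots, R_{\tau(L)}$ are among $R_{k+1}, \dots, R_K$. The second selection controls the overlap with the rectangles chosen earlier, and the first selection (after relabeling) controls the overlap with the rectangles of larger index, so

$$\begin{align*}\bigg| R_{k} \cap \bigg( \bigcup_{m \leq l} R_{\tau(m)} \cup \bigcup_{m \geq l+2} R_{\tau(m)} \bigg) \bigg| \leq \bigg| R_{k} \cap \bigcup_{m \leq l} R_{\tau(m)} \bigg| + \bigg| R_{k} \cap \bigcup_{j> k} R_{j} \bigg| \leq \frac{1}{4} |R_k| + \frac{1}{4} |R_k|. \end{align*}$$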

Write $\tilde {R}_l$ for $R_{\tau (l)}$ and $\tilde {E}$ for $\bigcup _{l} R_{\tau (l)}$ . Much as before, $|E| \lesssim \big |\tilde {E}\big |$ . Note that

(3.2) $$ \begin{align} \big| \tilde{E}\big| \leq \sum_{l} \big| \tilde{R}_l \big| \leq 2 \sum_{l} \bigg| \tilde{R}_l \setminus \Big( \bigcup_{l' \neq l} \tilde{R}_{l'} \Big) \bigg| = 2 \bigg| \bigcup_{l}{\bigg(} \tilde{R}_l \setminus \Big( \bigcup_{l' \neq l} \tilde{R}_{l'} \Big) {\bigg)}\bigg| \leq 2 \big| \tilde{E} \big|. \end{align} $$
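
The second inequality and the equality in (3.2) may be checked as follows; this is a sketch using (3.1) and the finiteness of $|\tilde{R}_l|$. Splitting $\tilde{R}_l$ into the part covered by the other rectangles and the remainder,

$$\begin{align*}\big| \tilde{R}_l \big| = \bigg| \tilde{R}_l \cap \bigcup_{l' \neq l} \tilde{R}_{l'} \bigg| + \bigg| \tilde{R}_l \setminus \bigcup_{l' \neq l} \tilde{R}_{l'} \bigg| \leq \frac{1}{2} \big| \tilde{R}_l \big| + \bigg| \tilde{R}_l \setminus \bigcup_{l' \neq l} \tilde{R}_{l'} \bigg|, \end{align*}$$

so $\big| \tilde{R}_l \big| \leq 2 \big| \tilde{R}_l \setminus \bigcup_{l' \neq l} \tilde{R}_{l'} \big|$. The equality holds because the sets $\tilde{R}_l \setminus \bigcup_{l' \neq l} \tilde{R}_{l'}$ are pairwise disjoint.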

We now claim that

(3.3) $$ \begin{align} \big| \{ \mathbf{g} \in \mathbf{G} \colon \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g}) \geq n \} \big| \lesssim 2^{-n/2} |E|. \end{align} $$

From this claim, it follows that

$$\begin{align*}\begin{aligned} \int_{\mathbf{G} } \Psi\Big( \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g}) / \lambda \Big) \,\mathrm{d}\mathbf{g} &\lesssim \sum_{n=1}^{\infty} \Psi(n/\lambda) 2^{-n/2} |E| \\ &\lesssim \sum_{n=1}^{\infty} \exp(n/\lambda) 2^{-n/2} |E| \\ &= \sum_{n=1}^{\infty} \exp( n[ 1/\lambda - \log(2)/2]) |E| \\ &\lesssim |E| \end{aligned} \end{align*}$$

as long as $1/\lambda < \log (2)/2$ , and so $\sum _{l=1}^{L} \chi _{\tilde {R}_l} \in \mathrm {e}^{\mathsf {L}^{}}(\mathbf {G})$ . It remains to prove (3.3).
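
Before turning to (3.3), we record a sketch of the two elementary facts behind the chain of inequalities above. First, the sum $\sum_{l} \chi_{\tilde{R}_l}$ takes nonnegative integer values and $\Psi(0) = 0$, so

$$\begin{align*}\int_{\mathbf{G} } \Psi\Big( \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g}) / \lambda \Big) \,\mathrm{d}\mathbf{g} = \sum_{n=1}^{\infty} \Psi(n/\lambda) \Big| \Big\{ \mathbf{g} \in \mathbf{G} \colon \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g}) = n \Big\} \Big| \leq \sum_{n=1}^{\infty} \Psi(n/\lambda) \Big| \Big\{ \mathbf{g} \in \mathbf{G} \colon \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g}) \geq n \Big\} \Big|. \end{align*}$$

Second, with $r := \mathrm{e}^{1/\lambda} / \sqrt{2}$, the condition $1/\lambda < \log(2)/2$ is equivalent to $r < 1$, and then the series is geometric:

$$\begin{align*}\sum_{n=1}^{\infty} \exp( n[ 1/\lambda - \log(2)/2]) = \sum_{n=1}^{\infty} r^n = \frac{r}{1-r} < \infty. \end{align*}$$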

Consider $\mathbf {g} \in \mathbf {G}$ that lies in two distinct rectangles from the family $\{ \tilde {R}_l\}$ , R and S say. If $\operatorname {\ell }_1(S) = \operatorname {\ell }_1(R)$ , then $\pi _1(R) = \pi _1(S)$ , since both $\pi _1(R)$ and $\pi _1(S)$ are pseudodyadic cubes containing $\pi _1(\mathbf {g})$ ; now $\pi _2(R) \subseteq \pi _2(S)$ or $\pi _2(R) \supseteq \pi _2(S)$ , so $R\cap S$ is either R or S, and this contradicts (3.1). Hence we may assume, without loss of generality, that $\operatorname {\ell }_1(S) < \operatorname {\ell }_1(R)$ . A similar argument then shows that $\operatorname {\ell }_2(S)> \operatorname {\ell }_2(R)$ . Thus, if there are n distinct rectangles in $\{ \tilde {R}_l\}$ that contain $\mathbf {g}$ , then we may label them $R_1(\mathbf {g})$ , …, $R_n(\mathbf {g})$ , in such a way that $\operatorname {\ell }_1(R_j(\mathbf {g}))$ decreases with j and $\operatorname {\ell }_2(R_j(\mathbf {g}))$ increases with j; then

$$\begin{align*}R_1(\mathbf{g}) \cap \ldots \cap R_n(\mathbf{g}) = \pi_1(R_n(\mathbf{g})) \times \pi_2(R_1(\mathbf{g})), \end{align*}$$

and

$$\begin{align*}\left| R_1(\mathbf{g}) \cap \ldots \cap R_n(\mathbf{g}) \right| = \left| \pi_1(R_n(\mathbf{g})) \times \pi_2(R_1(\mathbf{g})) \right| \lesssim \kappa ^{-n \nu_1} \left| R_1(\mathbf{g}) \right|. \end{align*}$$

We say that a rectangle $T \in \{ \tilde {R}_l \}$ is a descendant of a rectangle $R \in \{ \tilde {R}_l\}$ , and we write $T \succ R$ , if $R\cap T \neq \emptyset $ and both $\operatorname {\ell }_1(T)> \operatorname {\ell }_1(R)$ and $\operatorname {\ell }_2(T) < \operatorname {\ell }_2(R)$ , and we say that T is a child of R if $T \succeq R$ and if $S \in \{ \tilde {R}_l\}$ and $T \succeq S \succeq R$ then $S = T$ or $S= R$ . We may define ancestors and parents similarly, with the relations reversed.

Fix a rectangle $R \in \{ \tilde {R}_l\}$ , and consider $\mathbf {g} \in R$ that lies in at least n distinct rectangles of $\{ \tilde {R}_l\}$ . Then $\mathbf {g}$ lies in at least $\lceil n/2\rceil $ distinct rectangles S such that $S \succeq R$ , or $\mathbf {g}$ lies in at least $\lceil n/2\rceil $ distinct rectangles S such that $S \preceq R$ . Thus

$$\begin{align*}\begin{aligned} &\left|\left\{ \mathbf{g} \in R \colon \sum\nolimits_{l} \chi_{\tilde{R}_l}(\mathbf{g}) \geq n \right\}\right| \\ &\qquad\leq \left|\left\{ \mathbf{g} \in R \colon \sum\nolimits_{l}^{\succeq} \chi_{\tilde{R}_l}(\mathbf{g}) \geq \frac{n}{2} \right\}\right| + \left|\left\{ \mathbf{g} \in R \colon \sum\nolimits_{l}^{\preceq} \chi_{\tilde{R}_l}(\mathbf{g}) \geq \frac{n}{2} \right\}\right| , \end{aligned} \end{align*}$$

where $\sum ^\succeq _{l}$ indicates that we sum over l such that $\tilde {R}_l \succeq R$ , and $\sum ^\preceq _{l}$ is defined analogously. We estimate the measure of one of these two sets: the other may be estimated similarly.

Consider all $S \in \{ \tilde {R}_l\}$ such that $S \succeq R$ . Inside this collection of rectangles, we may identify the children $S_1$ , …, $S_p$ of R, which are pairwise disjoint, the children of the children of R, which are again pairwise disjoint, and so on. Observe that

$$\begin{align*}R \cap (S_1 \cup \ldots \cup S_p) = \pi_1(R) \times (\pi_2(S_1) \cup \ldots \cup \pi_2(S_p)) \subseteq \pi_1(R) \times \pi_2(R), \end{align*}$$

and the near disjointness condition (3.1) implies that

$$\begin{align*}\left| \pi_2(S_1) \cup \ldots \cup \pi_2(S_p) \right| \leq \frac{1}{2} \left| \pi_2(R) \right|. \end{align*}$$

Then the measure of the set of all $\mathbf {g} \in R$ that also belong to another rectangle $S \succ R$ is at most $ \left | R \right |/2$ .

Likewise, similar inequalities hold for the children of the children of R, which we may label as $T_1$ , …, $T_q$ , and so

$$\begin{align*}\left| \pi_2(T_1) \cup \ldots \cup \pi_2(T_q) \right| \leq \frac{1}{2} \left| \pi_2(S_1) \cup \ldots \cup \pi_2(S_p) \right| \leq \frac{1}{2^2} \left| \pi_2(R) \right|, \end{align*}$$

and the measure of the set of all $\mathbf {g} \in R$ that also belong to two more rectangles $S, T \succ R$ is at most $2^{-2} \left | R \right |$ . Continuing inductively, the measure of the set of all $\mathbf {g} \in R$ that lie in at least m distinct rectangles $S\succ R$ is at most $2^{-m} \left | R\right |$ .

Combining this estimate with an almost identical estimate for the measures of sets defined using ancestors and parents, we conclude that the measure of the set of all $\mathbf {g} \in R$ that belong to at least n rectangles of $\{ \tilde {R}_l \}$ is at most

$$\begin{align*}2 \times 2^{-\lceil (n-1)/2\rceil} \left| R\right| \lesssim 2^{-n/2} \left| R\right|. \end{align*}$$
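
The implied constant here can be made explicit; as a quick check, since $\lceil (n-1)/2\rceil \geq (n-1)/2$,

$$\begin{align*}2 \times 2^{-\lceil (n-1)/2\rceil} \leq 2 \times 2^{-(n-1)/2} = 2^{3/2} \, 2^{-n/2}. \end{align*}$$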

Then the measure of all $\mathbf {g} \in \mathbf {G}$ that belong to at least n rectangles of $\{ \tilde {R}_l \}$ is at most a multiple of

(3.4) $$ \begin{align} \sum_{l} 2^{-n/2} \big| \tilde{R}_l \big| = 2^{-n/2} \sum_{l} \big| \tilde{R}_l \big| \lesssim 2^{-n/2} |E|, \end{align} $$

which is what we needed to prove.

Now we can prove an endpoint estimate for the strong maximal function on $\mathbf {G}$ .

Theorem 3.3 The strong maximal function satisfies the following:

$$\begin{align*}\big| \big\{\mathbf{g}\in \mathbf{G}\colon |\mathcal{M}_{s} f(\mathbf{g})|>\lambda\big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall\lambda\in \mathbb{R}_{+} \end{align*}$$

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ .

Proof We recall a fundamental estimate on the strong maximal function and pseudodyadic strong maximal functions associated with adjacent pseudodyadic systems, the prototype of which (in the setting of the product $\mathbb {R}\times \mathbb {R}$ ) was proved in [Reference Li, Pipher and Ward23, Theorem 6.1]. More explicitly, we define the pseudodyadic strong maximal function as follows:

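$$\begin{align*}\mathcal{M}_{s}^{{\tau}_1,{\tau}_2}(f)(\mathbf{g}) := \sup_{R \ni \mathbf{g},\, R \in \mathscr{R}^{{\tau}_1,{\tau}_2}(\mathbf{G})} \frac{1}{|R|} \int_{R} |f(\mathbf{h})| \,\mathrm{d}\mathbf{h}, \end{align*}$$

where ${\tau }_i=1,2,\dots ,\mathrm {T}_i$ . Then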

(3.5) $$ \begin{align} \mathcal{M}_{s}(f)(\mathbf{g})\lesssim \sum_{{\tau}_1=1}^{\mathrm{T}_1}\sum_{{\tau}_2=1}^{\mathrm{T}_2} \mathcal{M}_{s}^{{\tau}_1,{\tau}_2}(f)(\mathbf{g}) \qquad\forall \mathbf{g}\in\mathbf{G}. \end{align} $$

This inequality means that instead of considering general products of balls, it suffices to consider rectangles, and Theorem 3.2 may be applied.

In light of (3.5), sublinearity and Proposition 2.6, it suffices to show that

(3.6) $$ \begin{align} \big| \big\{\mathbf{g} \in \mathbf{G}\colon |\mathcal{M}_{s}^{{\tau}_1,{\tau}_2} f(\mathbf{g})|>1\big\}\big| \lesssim F_\Phi(f) \qquad\forall f \in \mathsf{L}^{2}(\mathbf{G}) \cap \mathsf{L\,log\,L} (\mathbf{G}), \end{align} $$

when ${\tau }_i=1,2,\dots ,\mathrm {T}_i$ ; we may assume that f takes nonnegative real values.

Let $E=\big \{\mathbf {g} \in \mathbf {G}\colon |\mathcal {M}_{s}^{{\tau }_1,{\tau }_2} f(\mathbf {g})|>1\big \}$ . Since $f\in \mathsf {L}^{2}(\mathbf {G})$ and $\mathcal {M}_{s}^{{\tau }_1,{\tau }_2}$ is $\mathsf {L}^{2}$ bounded, $|E|$ is finite. For every $\mathbf {g}\in E$ there exists $R_{\mathbf {g}}\in \mathscr {R}^{{\tau }_1,{\tau }_2}(\mathbf {G})$ satisfying

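(3.7) $$ \begin{align} \mathbf{g} \in R_{\mathbf{g}} \qquad\text{and}\qquad \frac{1}{|R_{\mathbf{g}}|} \int_{R_{\mathbf{g}}} f(\mathbf{h}) \,\mathrm{d}\mathbf{h}> 1. \end{align} $$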

Then $E=\bigcup _{\mathbf {g}\in E} R_{\mathbf {g}}$ by definition.

By Theorem 3.2, there is a sequence of rectangles $\{\tilde {R}_l\} \subseteq \{R_{\mathbf {g}}\}_{\mathbf {g}\in E}$ such that

(3.8) $$ \begin{align} |E| = \Big|\bigcup_{\mathbf{g}\in E} R_{\mathbf{g}}\Big| \lesssim \Big|\bigcup_{l} \tilde{R}_l\Big| \qquad\text{and}\qquad \Big\Vert\sum_{l} \chi_{\tilde{R}_l}\Big\Vert_{\mathrm{e}^{\mathsf{L}^{}}(\mathbf{G})} \lesssim \Big|\bigcup_{\mathbf{g}\in E} R_{\mathbf{g}}\Big|. \end{align} $$

Write $\tilde {E}$ for $\bigcup _l \tilde {R}_l$ and $\tilde {E}_n$ for $\left \{ \mathbf {g} \in \mathbf {G} : \sum _{l} \chi _{\tilde {R}_l}(\mathbf {g}) = n \right \}$ . From (3.7), (2.7), (2.10), and (2.12), we deduce that $|E| \lesssim |\tilde {E}|$ and, for all $\lambda \in [1,\infty )$ ,

$$ \begin{align*} |\tilde{E}| &= \Big| \bigcup_{l} \tilde{R}_l \Big| \le \sum_{l} |\tilde{R}_l| < \sum_{l} \int_{\tilde{R}_l} f(\mathbf{g})\,\mathrm{d} \mathbf{g}\\ &= \int_{\tilde{E}} \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g})f(\mathbf{g})\,\mathrm{d} \mathbf{g} \\ &\leq \int_{\tilde{E}} \Phi(\lambda^2 f(\mathbf{g})) \,\mathrm{d} \mathbf{g} + \int_{\tilde{E}} \Psi( \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g})/\lambda^2) \,\mathrm{d} \mathbf{g}\\ &\leq \Phi(\lambda^2)\int_{\mathbf{G}} \Phi(f(\mathbf{g})) \,\mathrm{d} \mathbf{g} + \frac{1}{\lambda} \int_{\tilde{E}} \Psi( \sum_{l} \chi_{\tilde{R}_l}(\mathbf{g})/\lambda) \,\mathrm{d} \mathbf{g}. \end{align*} $$

Theorem 3.2 shows that $|\tilde {E}_n| \leq C_1 2^{-n/2} |\tilde {E}|$ , where $C_1$ is a geometric constant. See also the quantitative argument in (3.4). Hence

$$ \begin{align*} |\tilde{E}| &\leq \Phi(\lambda^2) F_\Phi(f) + \frac{1}{\lambda} \sum_{n=1}^{\infty} \mathrm{e}^{n/\lambda} \big| \tilde{E}_n\big| \\ &\leq \Phi(\lambda^2) F_\Phi(f) + \frac{C_1}{\lambda} \sum_{n=1}^{\infty} \mathrm{e}^{n/\lambda} 2^{-n/2} \big| \tilde{E}\big|. \end{align*} $$

We fix $\lambda $ that is large enough that the series converges and the right hand term is less than $\big | \tilde {E} \big |/2$ ; then $\big | \tilde {E} \big | \leq 2 \Phi (\lambda ^2) F_\Phi (f)$ , as required.
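
To illustrate that such a choice of $\lambda$ exists, here is one admissible choice, offered as a sketch rather than as the optimal one. If $\lambda \geq 4$, then $r := \mathrm{e}^{1/\lambda}/\sqrt{2} \leq \mathrm{e}^{1/4}/\sqrt{2} < 0.91$, so

$$\begin{align*}\sum_{n=1}^{\infty} \mathrm{e}^{n/\lambda} 2^{-n/2} = \frac{r}{1-r} < 10, \qquad\text{and hence}\qquad \frac{C_1}{\lambda} \sum_{n=1}^{\infty} \mathrm{e}^{n/\lambda} 2^{-n/2} \big| \tilde{E}\big| \leq \frac{1}{2} \big| \tilde{E}\big| \qquad\text{whenever } \lambda \geq \max\{4, 20 C_1\}. \end{align*}$$

Since $\tilde{E} \subseteq E$ and $|E|$ is finite, the term $\big| \tilde{E}\big|/2$ may be subtracted from both sides, which gives $\big| \tilde{E} \big| \leq 2 \Phi (\lambda ^2) F_\Phi (f)$.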

Corollary 3.4 For $\eta \in [0,\infty )$ , the maximal operator $\mathcal {M}_{\varphi ,\eta }$ defined in (1.5) satisfies the endpoint estimate:

$$\begin{align*}\big| \big\{\mathbf{g}\in \mathbf{G}\colon \left| \mathcal{M}_{\varphi,\eta }(f)(\mathbf{g})\right|> \lambda \big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall \lambda \in \mathbb{R}_{+} \end{align*}$$

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ .

Proof It follows from (2.5) with a little care that $\mathcal {M}_{\varphi ,\eta }(f)\lesssim \mathcal {M}_{s} f$ pointwise.

The following lemma is another well known consequence of the boundedness of the strong maximal operator. Recall that $\mathscr {M}(U)$ denotes the collection of all maximal rectangles contained in U.

Lemma 3.5 Let U be an open subset of $\mathbf {G}$ of finite measure and $\beta \in \mathbb {R}_{+}$ . Given R in $\mathscr {M}(U)$ , let $\beta R$ be the set $\mathbf {z}(R)\delta _\beta (\mathbf {z}(R)^{-1}R)$ , where $\mathbf {z}(R)$ is the center of R defined in Section 2.5. Then

$$\begin{align*}\bigg| \bigcup_{R \in \mathscr{M}(U)} \beta R \bigg| \lesssim (\beta+1)^{\nu_1 +\nu_2} \left| U\right|. \end{align*}$$

Proof Recall that $B(z(Q),c\operatorname {\ell }(Q) /3)\subseteq Q \subseteq B(z(Q),2C\operatorname {\ell }(Q))$ for all $Q \in \mathscr {Q}(G)$ and that $\mathbf {P}(\mathbf {z},\operatorname {\boldsymbol {\ell }})$ is the product $B_1(z_1,\operatorname {\ell }_1) \times B_2(z_2,\operatorname {\ell }_2)$ . It follows that, if $R \in \mathscr {R}$ and $\mathbf {g} \in \beta R$ , then

$$\begin{align*}\mathbf{P}(\mathbf{z}(R), c \operatorname{\boldsymbol{\ell}}(R) /3) \subseteq R \subseteq \mathbf{P}(\mathbf{g}, 2C(\beta+1) \operatorname{\boldsymbol{\ell}}(R)), \end{align*}$$

and we deduce, much as in the proof of Lemma 3.1, that

$$\begin{align*}\beta R \subseteq \bigg\{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s} (\chi_R)(\mathbf{g}) \geq C(\mathbf{G}) (\beta+1)^{-\nu_1-\nu_2} \bigg\}. \end{align*}$$

Hence

$$\begin{align*}\bigg| \bigcup_{R \in \mathscr{M}(U)} \beta R \bigg| \leq \bigg| \bigg\{ \mathbf{g} \in \mathbf{G} \colon \mathcal{M}_{s} (\chi_U)(\mathbf{g}) \geq C(\mathbf{G}) (\beta+1)^{-\nu_1-\nu_2} \bigg\} \bigg|, \end{align*}$$

and the hyperweak boundedness of $\mathcal {M}_{s}$ implies that

$$\begin{align*}\bigg| \bigcup_{R \in \mathscr{M}(U)} \beta R \bigg| \lesssim (\beta+1)^{\nu_1 + \nu_2} F_\Phi(\chi_U) \lesssim (\beta+1)^{\nu_1+\nu_2} \left| U\right|, \end{align*}$$

as claimed.

4 The atomic decomposition

In this section, we first prove a hyperweak boundedness estimate for the Lusin area function, and then use this estimate to decompose $\mathsf {L\,log\,L}(\mathbf {G})$ functions.

Our results in this and later sections need the more specific context of stratified Lie groups.

Recall that $p^{[i]}_{t_i}$ and $q^{[i]}_{t_i}$ denote the convolution kernels of the operators $e^{-t_i \sqrt {\mathcal {L}_i}}$ and $t_i \partial _{t_i} e^{-t_i \sqrt {\mathcal {L}_i}}$ ; then $q^{[i]}_{t_i} = t_i \partial _{t_i} p^{[i]}_{t_i}$ . We write $p_{\mathbf {t}}:=p^{[1]}_{t_1} \otimes p^{[2]}_{t_2}$ and ${{q}}_{\mathbf {t}}:=q^{[1]}_{t_1} \otimes q^{[2]}_{t_2}$ .

4.1 Hyperweak boundedness and the area function

The first step is to apply Corollary 3.4.

Proposition 4.1 The nontangential maximal operator $\mathcal {M}_{p,\eta }$ is hyperweakly bounded. That is,

$$\begin{align*}\big| \big\{\mathbf{g} \in \mathbf{G}\colon |\mathcal{M}_{p,\eta } f(\mathbf{g})|>\lambda \big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall \lambda \in \mathbb{R}_{+} \end{align*}$$

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ .

This weak type estimate for the Poisson maximal operator implies a similar estimate for the Lusin area function , which is the key to establishing the atomic decomposition for $\mathsf {L\,log\,L} (\mathbf {G})$ functions. More explicitly, we denote the tensor by and define

where

(4.1)

Theorem 4.2 The following estimate holds:

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ .

Proof A recent result of Fan, Yan and the first and third authors [Reference Cowling, Fan, Li and Yan6] shows that for f such that $\left \Vert \mathcal {M}_{p,\eta } f \right \Vert {}_{\mathsf {L}^{1}(\mathbf {G}) }<\infty $ ,

More precisely, define the sublevel set $L_{\eta }(\lambda ) :=\left \{ \mathbf {g}\in {\mathbf {G}}\colon \mathcal {M}_{p,\eta }(f)(\mathbf {g}) \leq \lambda \right \}$ . Then it is shown that, when $\eta $ is sufficiently large,

(4.2)

for all $\lambda \in \mathbb {R}_{+}$ . Now $\big | L_{\eta }(\lambda )^c \big | \lesssim F_\Phi (f/\lambda )$ by Proposition 4.1, so the layer-cake formula and (2.10) show that

This implies that the right-hand side of (4.2) is finite. By repeating the argument of [Reference Cowling, Fan, Li and Yan6], we see that (4.2) also holds for $f\in \mathsf {L\,log\,L} (\mathbf {G})$ , and we conclude that

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ , as required.

Recall that ${{q}}_{\mathbf {t}}$ denotes $t_1\partial _{t_1} p_{t_1}^{[1]} \otimes t_2\partial _{t_2} p^{[2]}_{t_2}$ and

Corollary 4.3 The area function $\mathcal {S}_{q,1} (f)$ satisfies the estimate:

$$ \begin{align*} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{S}_{q,1} (f)(\mathbf{g})>\lambda \big\} \big| \lesssim F_\Phi(f/\lambda) \qquad\forall \lambda \in \mathbb{R}_{+} \end{align*} $$

for all $f \in \mathsf {L\,log\,L}(\mathbf {G})$ .

Proof This holds as $f * {{q}}_{\mathbf {t}}$ is one of the components of the tensor .

4.2 The atomic decomposition for L log L

Recall that the $q^{[i]}_1$ satisfy the standard smoothness, decay and cancellation conditions (2.1) and (2.2), and by [Reference Geller and Mayeli13] there exist compactly supported smooth functions $\varphi ^{[i]}$ on $G_i$ with integral $0$ such that

$$ \begin{align*} f &= \int_{\mathbf{T}} f \ast {{q}}_{\mathbf{t}} \ast \varphi_{\mathbf{t}} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \end{align*} $$

for all $f\in \mathsf {L}^{2}(\mathbf {G})$ , where $\varphi _{\mathbf {t}}=\varphi ^{[1]}_{t_1}\otimes \varphi ^{[2]}_{t_2}$ . This is the Calderón reproducing formula. We suppose that $\operatorname {supp}(\varphi ^{[i]}) \subseteq B_i(o,1)$ , by rescaling if necessary.

Theorem 4.4 (Atomic decomposition for $\mathsf {L\,log\,L} $ )

Let $f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ . Then we may write

$$\begin{align*}f=\sum_{k \in \mathbb{Z}} a_k, \end{align*}$$

where the sum converges unconditionally in $\mathsf {L}^{2}(\mathbf {G})$ and the atoms $a_k$ satisfy:

  (1) $a_k$ vanishes outside a subset $U^{\dagger }_k$ of $\mathbf {G}$ such that $|U^{\dagger }_k|\lesssim F_\Phi (2^{-k}{f})$ ;

  (2) $\|a_k\|_{\mathsf {L}^{2}(\mathbf {G})}^2 \lesssim 2^{2k}F_\Phi (2^{-k}f)$ ;

  (3) each $a_k$ can be further decomposed: $a_k=\sum _{R\in \mathscr {M}(U^{*}_k)}a_{k,R} $ , where the sum converges unconditionally in $\mathsf {L}^{2}(\mathbf {G})$ , and

    (a) $\operatorname {supp} a_{k,R} \subseteq \beta R$ , for a suitable $\beta \in \mathbb {R}_{+}$ ;

    (b) $\int _{G_1}a_{k,R}(g_1,g_2) \,\mathrm {d} g_1=0$ for all $g_2 \in G_2$ ;

    (c) $\int _{G_2}a_{k,R}(g_1,g_2) \,\mathrm {d} g_2=0$ for all $g_1 \in G_1$ ;

    (d) $\sum _{R\in \mathscr {M}(U^*_k)} \left \Vert a_{k,R} \right \Vert {}^2_{\mathsf {L}^{2}(\mathbf {G})}\lesssim 2^{2k}F_\Phi (2^{-k} f)$ .

The sets $U^{\dagger }_k$ and $U^*_k$ are defined in the proof.

Proof From Corollary 4.3, $\mathcal {S}_{q,1}$ is hyperweakly bounded. Assume that $0 \neq f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ . Then $\mathcal {S}_{q,1} (f)\in \mathsf {L}^{2}(\mathbf {G})$ . Now for all $k\in \mathbb {Z}$ , we set

(4.3) $$ \begin{align} U_k &:= \{ \mathbf{g}\in\mathbf{G}\colon \mathcal{S}_{q,1}(f)(\mathbf{g})>2^k \}; \end{align} $$
(4.4) $$ \begin{align} U^{*}_k &:= \{\mathbf{g}\in\mathbf{G}\colon \mathcal{M}_{s} (\chi_{U_k})(\mathbf{g})> \alpha \}; \end{align} $$
(4.5) $$ \begin{align} U^{\dagger}_k &:= \bigcup\nolimits_{S \in \mathscr{M}(U^*_k)} \beta S; \end{align} $$
(4.6) $$ \begin{align} \mathscr{B}_k &:= \{ S \in \mathscr{R}(\mathbf{G}) \colon \left| S\cap U_k\right|> \left| S \right|/2, \left| S\cap U_{k+1}\right| \leq \left| S \right|/2 \}; \end{align} $$

here $\alpha $ and $\beta $ are positive numbers that will be specified during the proof. For each rectangle $S \in \mathscr {R}(\mathbf {G})$ , we define the tent $S_+$ over S to be the set

(4.7) $$ \begin{align} \Big\{ (\mathbf{g},\mathbf{t}) \in \mathbf{G} \times \mathbf{T}\colon \mathbf{g}\in S, 4C\operatorname{\ell}_1(S) < t_1 \leq 8C\operatorname{\ell}_1(S), 4C\operatorname{\ell}_2(S) < t_2 \leq 8C\operatorname{\ell}_2(S)\Big\}. \end{align} $$

The constant C is as in Theorem 2.2. This definition implies that, if $(\mathbf {h},\mathbf {t}) \in S_+$ , then

(4.8) $$ \begin{align} S \subseteq B_1(h_1,4C\operatorname{\ell}_1(S)) \times B_2(h_2,4C\operatorname{\ell}_2(S)) \subseteq B_1(h_1,t_1) \times B_2(h_2,t_2), \end{align} $$

and so $S_+ \subseteq \Gamma (\mathbf {g})$ for all $\mathbf {g} \in S$ .

Further, the “upper half space” $\mathbf {T} \times \mathbf {G}$ is the disjoint union of all tents $S_+$ as S runs over $\mathscr {R}(\mathbf {G})$ .

By definition $U_k \supseteq U_{k+1}$ , so $\left | S\cap U_k\right | / \left | S \right | \geq \left | S\cap U_{k+1}\right | / \left | S \right |$ for all $k \in \mathbb {Z}$ . This implies that the sets $\mathscr {B}_k$ are pairwise disjoint. Indeed, if $S \in \mathscr {B}_k$ , then $\left | S\cap U_k\right |/ \left | S \right |> 1/2$ , which implies that $\left | S\cap U_j\right |/ \left | S \right |> 1/2$ whenever $j \leq k$ , so $S \notin \mathscr {B}_j$ if $j < k$ . Likewise, $\left | S\cap U_{k+1} \right |/ \left | S \right | \leq 1/2$ , which implies that $\left | S\cap U_j\right |/ \left | S \right | \leq 1/2$ whenever $j \geq k+1$ , so $S \notin \mathscr {B}_j$ if $j> k$ .

By Chebyshev’s inequality, $\left | U_k \right | \lesssim 2^{-2k}$ , so $\lim _{k \to \infty } \left | S\cap U_k\right |/ \left | S \right | =0$ for all $S \in \mathscr {R}(\mathbf {G})$ . If $\mathbf {G} \setminus \bigcup _{k\in \mathbb {Z}} U_k$ were a null set, then it would follow that $\lim _{k \to -\infty } \left | S\cap U_k\right |/ \left | S \right | =1$ for all $S \in \mathscr {R}(\mathbf {G})$ , and $\mathscr {R}(\mathbf {G})$ would be the disjoint union of the sets $\mathscr {B}_k$ as k runs over $\mathbb {Z}$ . However, except in special cases, we do not know that $\mathbf {G} \setminus \bigcup _{k \in \mathbb {Z}} U_k$ is null, and we must take into account the possibility that some $S \in \mathscr {R}(\mathbf {G})$ do not belong to any $\mathscr {B}_k$ . When $S \notin \bigcup _{k \in \mathbb {Z}} \mathscr {B}_k$ , it follows, by downward induction on k starting from large k, that $\left | S\cap U_{k} \right |/ \left | S \right | \leq 1/2$ for all $k \in \mathbb {Z}$ , or equivalently,

$$\begin{align*}\left| S\cap U_k^c\right| \geq \left| S \right|/2 \qquad\mathrm{for\ all\ } k\in\mathbb Z. \end{align*}$$

We claim that, if $S \notin \bigcup _{k \in \mathbb {Z}} \mathscr {B}_k$ , then $(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) =0$ for almost all $(\mathbf {h},\mathbf {t})\in S_+$ . To see this, we define the set $Z =\{ \mathbf {g}\in \mathbf {G}: \mathcal {S}_{q,1}(f)(\mathbf {g})=0\}$ , and note that $\bigcap _{k\in \mathbb Z} U_k^c=Z$ . Since $\mathcal {S}_{q,1}(f)$ is lower semicontinuous, Z is closed. Then

$$\begin{align*}|S\cap Z| = \lim_{k\to-\infty }|S\cap U_k^c|\geq \frac{\left| S \right|}{2}. \end{align*}$$
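The limit in the last display can be justified by continuity of measure from above. As a brief sketch, using only that $\mathcal {S}_{q,1}(f) \geq 0$ and that $\left | S \right | < \infty $ :

$$\begin{align*} U_k^c = \{ \mathbf{g}\in\mathbf{G}\colon \mathcal{S}_{q,1}(f)(\mathbf{g}) \leq 2^k \} \supseteq U_{k-1}^c, \qquad \bigcap_{k\in\mathbb{Z}} U_k^c = \{ \mathbf{g}\in\mathbf{G}\colon \mathcal{S}_{q,1}(f)(\mathbf{g}) \leq 0 \} = Z, \end{align*}$$

so the sets $S \cap U_k^c$ decrease to $S \cap Z$ as $k \to -\infty $ , and each of them has measure at least $\left | S \right |/2$ .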

By definition, for every $\mathbf {g}\in Z$ ,

$$\begin{align*}\mathcal{S}_{q,1}(f)(\mathbf{g}) = 0, \end{align*}$$

which implies that $(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h})=0$ for almost all $(\mathbf {h},\mathbf {t})\in \Gamma (\mathbf {g})$ . Hence

(4.9) $$ \begin{align} \begin{aligned} &\frac{\left| S \right|}{2}\iint_{S_+} \left| (f \ast {{q}}_{\mathbf{t}} )(\mathbf{h}) \right|{}^{2} \,\mathrm{d}\mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &\qquad\leq \left| S \cap Z \right| \iint_{S_+} \left| (f \ast {{q}}_{\mathbf{t}} )(\mathbf{h}) \right|{}^{2} \,\mathrm{d}\mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}\\ &\qquad\leq\int_{S \cap Z} \iint_{\Gamma(\mathbf{g}) } \left| (f \ast {{q}}_{\mathbf{t}} )(\mathbf{h}) \right|{}^{2} \,\mathrm{d}\mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}\,\,\mathrm{d}\mathbf{g}\\ &\qquad\leq \int_{Z } \iint_{\Gamma(\mathbf{g})}\left| (f \ast {{q}}_{\mathbf{t}} )(\mathbf{h}) \right|{}^{2} \,\mathrm{d}\mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}\,\,\mathrm{d}\mathbf{g} \\ &\qquad=0, \end{aligned} \end{align} $$

so that $(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) =0$ for almost all $(\mathbf {h},\mathbf {t})\in S_+$ , as claimed.

Since $f\in \mathsf {L}^{2}(\mathbf {G})$ , the reproducing formula implies that

$$ \begin{align*} f(\mathbf{g}) &= \int_{\mathbf{T}} (f \ast {{q}}_{\mathbf{t}} \ast \varphi_{\mathbf{t}})(\mathbf{g}) \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &= \int_{\mathbf{T}} \int_{\mathbf{G}} (f \ast {{q}}_{\mathbf{t}})(\mathbf{h}) \varphi_{\mathbf{t}}(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &= \sum_{S\in \mathscr{R}(\mathbf{G})} \iint_{S_+} (f \ast {{q}}_{\mathbf{t}})(\mathbf{h}) \varphi_{\mathbf{t}}(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &= \sum_{k \in \mathbb{Z}}\sum_{S\in \mathscr{B}_k} \iint_{S_+} (f \ast {{q}}_{\mathbf{t}})(\mathbf{h}) \varphi_{\mathbf{t}}(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &= \sum_{k \in \mathbb{Z}} a_k(\mathbf{g}). \end{align*} $$

The third equality holds since the upper half-space $\mathbf {G} \times \mathbf {T}$ is the disjoint union of the tents $S_+$ . The fourth equality holds because the $\mathscr {B}_k$ are disjoint, and if $S \in \mathscr {R}(\mathbf {G}) \setminus \big (\bigcup _{k\in \mathbb {Z}} \mathscr {B}_k \big )$ , then $(f \ast q_{\mathbf {t}})(\mathbf {h}) = 0$ for almost all $(\mathbf {h}, \mathbf {t}) \in S_+$ .

If $\mathbf {g} \in S \in \mathscr {B}_k$ , then $|S \cap U_k | / \left | S \right |> 1/2$ so $\mathbf {g} \in U^{*}_k$ by Lemma 3.1 provided that $\alpha < C(\mathbf {G})/2$ ; coupled with (4.6), this shows that

(4.10) $$ \begin{align} \left| S \cap (U^{*}_{k} \setminus U_{k+1}) \right| = \left| S \setminus (S \cap U_{k+1}) \right| \geq \frac{1}{2} \left| S \right|. \end{align} $$

For future use, we note that, for any measurable set V in $\mathbf {G}$ ,

(4.11) $$ \begin{align} \begin{aligned} (\chi_V * \chi_{\mathbf{P}(\mathbf{o},\mathbf{t})})(\mathbf{h}) &= \int_{\mathbf{G}} \chi_V(\mathbf{g}) \chi_{\mathbf{P}(\mathbf{o},\mathbf{t})} (\mathbf{g}^{-1}\mathbf{h}) \,\mathrm{d}\mathbf{g} = \int_{\mathbf{G}} \chi_V(\mathbf{g}) \chi_{\mathbf{P}(\mathbf{h},\mathbf{t})} (\mathbf{g}) \,\mathrm{d}\mathbf{g} \\ &= \left| \mathbf{P}(\mathbf{h}, \mathbf{t}) \cap V \right|. \end{aligned} \end{align} $$

Next, for all $S\in \mathscr {B}_k$ , choose $\tilde {S}\in \mathscr {M}(U^{*}_k)$ such that $S\subseteq \tilde {S}$ . Then for all $R\in \mathscr {M}(U^{*}_k)$ , set $a_{k,R}=\sum _{S \in \mathscr {B}_k\colon \tilde {S}=R } b_{k,S}$ , where

$$\begin{align*}b_{k,S}(\mathbf{g}) = \iint_{S_+} (f \ast {{q}}_{\mathbf{t}})(\mathbf{h}) \varphi_{\mathbf{t}}(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}. \end{align*}$$

By construction,

$$\begin{align*}a_k= \sum_{R\in \mathscr{M}(U^{*}_k)} a_{k,R}. \end{align*}$$

Now

$$\begin{align*}\operatorname{supp}(\varphi_{\mathbf{t}}) \subseteq B_1(o,t_1) \times B_2(o,t_2)\subseteq B_1(o,8C\operatorname{\ell}_1(S)) \times B_2(o,8C\operatorname{\ell}_2(S)) , \end{align*}$$

and hence

$$ \begin{align*} \operatorname{supp}( b_{k,S}) &\subseteq B_1(z_1(S),10C\operatorname{\ell}_1(S)) \times B_2(z_2(S),10C\operatorname{\ell}_2(S)) \\ &= B_1(z_1(S),\beta c\operatorname{\ell}_1(S)/3) \times B_2(z_2(S),\beta c\operatorname{\ell}_2(S)/3) \\ &\subseteq \beta S, \end{align*} $$

where $\beta = 30 C/c$ . Thus $a_k$ vanishes on $(U^{\dagger }_k)^c$ (recall that $U^{\dagger }_k$ is defined in (4.5)). By Lemmas 3.5 and 3.1 and Corollary 4.3,

(4.12) $$ \begin{align} |U^{\dagger}_k|\lesssim|U^{*}_k|\lesssim|U_k|\lesssim F_\Phi(2^{-k}f). \end{align} $$

Next, we claim that for all $k\in \mathbb {Z}$ ,

(4.13) $$ \begin{align} \|a_k\|_{\mathsf{L}^{2}(\mathbf{G})}^2\lesssim 2^{2k} F_\Phi(2^{-k} f). \end{align} $$

To see this, take $h \in \mathsf {L}^{2}(\mathbf {G})$ such that $\|h\|_{\mathsf {L}^{2}(\mathbf {G})}=1$ ; then, writing $\check \varphi $ for the reflected version of $\varphi $ , that is, $\check \varphi (\mathbf {g}) = \varphi (\mathbf {g}^{-1})$ , we see that

$$ \begin{align*} \big|\langle a_k,h \rangle\big| &= \bigg|\sum_{S\in \mathscr{B}_k} \iint_{S_+} (f \ast {{q}}_{\mathbf{t}})(\mathbf{g}) (h \ast \check\varphi_{\mathbf{t}})(\mathbf{g}) \,\mathrm{d} \mathbf{g} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}\bigg|\\[5pt] &\leq \bigg( \sum_{S\in \mathscr{B}_k} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{g})|^2 \,\mathrm{d} \mathbf{g}\,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2} \bigg( \sum_{S\in \mathscr{B}_k} \iint_{S_+} |(h \ast \check\varphi_{\mathbf{t}})(\mathbf{g})|^2 \,\mathrm{d} \mathbf{g} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2} \\[5pt] &\lesssim \bigg( \sum_{S\in \mathscr{B}_k} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{g})|^2 \,\mathrm{d} \mathbf{g} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2}. \end{align*} $$

By definition, (4.10), (4.7), (4.8), the convolution identity (4.11) and (4.12),

$$ \begin{align*} &\sum_{S\in \mathscr{B}_k} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h}\,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &\qquad\leq 2\sum_{S\in \mathscr{B}_k} \iint_{S_+} \frac{\big| S \cap (U^{*}_{k} \setminus U_{k+1}) \big|}{\left| S \right|} \left| (f \ast {{q}}_{\mathbf{t}})(\mathbf{h})\right|{}^2 \,\mathrm{d} \mathbf{h}\,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &\qquad\lesssim \sum_{S\in \mathscr{B}_k} \iint_{S_+} \frac{\big| \mathbf{P}(\mathbf{h},\mathbf{t}) \cap (U^{*}_{k} \setminus U_{k+1}) \big|}{|\mathbf{P}(\mathbf{o},\mathbf{t})|} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h}\,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \\ &\qquad\leq \iint_{\mathbf{G} \times \mathbf{T}} \frac{\big| \mathbf{P}(\mathbf{h},\mathbf{t}) \cap (U^{*}_{k} \setminus U_{k+1}) \big|}{|\mathbf{P}(\mathbf{o},\mathbf{t})|} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \,\mathrm{d}\mathbf{h}\\ &\qquad= \iint_{\mathbf{G} \times \mathbf{T}} \Big(\chi_{U^{*}_{k} \setminus U_{k+1}} * \frac{\chi_{\mathbf{P}(\mathbf{o},\mathbf{t})}}{|\mathbf{P}(\mathbf{o},\mathbf{t})|}\Big)(\mathbf{h}) |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \,\mathrm{d}\mathbf{h} \\ &\qquad= \int_{\mathbf{G}}\chi_{U^{*}_{k} \setminus U_{k+1}} (\mathbf{h}) \int_{\mathbf{T}} \bigg(\left| f \ast {{q}}_{\mathbf{t}}\right|{}^2 * \frac{\chi_{\mathbf{P}(\mathbf{o},\mathbf{t})}}{|\mathbf{P}(\mathbf{o},\mathbf{t})|}\bigg)(\mathbf{h}) \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \,\mathrm{d} \mathbf{h} \\ &\qquad=\int_{U^{*}_k\setminus U_{k+1}} |\mathcal{S}_{q,1}(f)(\mathbf{h})|^2\,\mathrm{d} \mathbf{h} \\ &\qquad\leq 2^{2k+2}|U^{*}_k|\\ &\qquad\lesssim 2^{2k}F_\Phi(2^{-k}f). \end{align*} $$

This shows that $|\langle a_k,h\rangle |^2 \lesssim 2^{2k}F_\Phi (2^{-k}f)$ , whence $\|a_k\|_{\mathsf {L}^{2}(\mathbf {G})}^2\lesssim 2^{2k}F_\Phi (2^{-k}f)$ , and (4.13) holds.

We now claim that

(4.14) $$ \begin{align} \sum_{R\in \mathscr{M}(U^*_k)} \|a_{k,R}\|^2_{\mathsf{L}^{2}(\mathbf{G})} &\lesssim 2^{2k}F_\Phi(2^{-k}f). \end{align} $$

Arguing as above, we see that, if $h\in \mathsf {L}^{2}(\mathbf {G})$ and $\|h\|_{\mathsf {L}^{2}(\mathbf {G})}=1$ , then

$$ \begin{align*} \big|\langle a_{k,R},h\rangle\big| &= \bigg| \sum_{S \in \mathscr{B}_k(R)} \iint_{S_+} (f \ast {{q}}_{\mathbf{t}})(\mathbf{h})\ (h \ast \check\varphi_{\mathbf{t}})(\mathbf{h}) \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}\bigg|\\[5pt] &\leq \bigg( \sum_{S \in \mathscr{B}_k(R)} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2} \\&\hspace{2.5cm}\times \bigg( \sum_{S \in \mathscr{B}_k(R)}\iint_{S_+} |(h \ast \check\varphi_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2}\\[5pt] &\lesssim \bigg( \sum_{S \in \mathscr{B}_k(R)} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h}\,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2}, \end{align*} $$

where $\mathscr {B}_k(R) := \{ S \in \mathscr {B}_k\colon \tilde {S}=R \}$ . Hence

$$\begin{align*}\left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}\lesssim\bigg( \sum_{S \in \mathscr{B}_k(R)}\iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d}\mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}} \bigg)^{1/2}, \end{align*}$$

which implies, since the sets $\mathscr {B}_k(R)$ partition $\mathscr {B}_k$ as R runs over $\mathscr {M}(U^*_k)$ , that

$$\begin{align*}\sum_{R\in \mathscr{M}(U^*_k)} \left\Vert a_{k,R} \right\Vert{}^2_{\mathsf{L}^{2}(\mathbf{G})} \lesssim \sum_{S\in \mathscr{B}_k} \iint_{S_+} |(f \ast {{q}}_{\mathbf{t}})(\mathbf{h})|^2 \,\mathrm{d} \mathbf{h} \,\frac{\mathrm{d}\mathbf{t}}{\mathbf{t}}. \end{align*}$$

Then, much as argued to prove (4.13), we deduce (4.14). The unconditional convergence of the sum follows as in [Reference Chen, Cowling, Lee, Li and Ottazzi2].

5 Proof of Theorem 1.1

In this section, we use the atomic decomposition to prove Theorem 1.1. Recall that, if $f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ , then we may write

$$\begin{align*}f=\sum_{k \in \mathbb{Z}} a_k \qquad\text{and}\qquad a_k=\sum_{R\in \mathscr{M}(U^*_k)}a_{k,R}, \end{align*}$$

where $U^*_k$ is as in Theorem 4.4, and $a_k$ and $a_{k,R}$ satisfy support and cancellation conditions. In Theorem 4.4, we defined

$$\begin{align*}U_k := \{ \mathbf{g}\in\mathbf{G}\colon \mathcal{S}_{q,1}(f)(\mathbf{g})>2^k \}; \end{align*}$$

and

$$ \begin{align*} U^{*}_k := \{\mathbf{g}\in\mathbf{G}\colon \mathcal{M}_{s} (\chi_{U_k})(\mathbf{g})>\alpha \}, \end{align*} $$

where $\alpha < C(\mathbf {G})/2$ ; now we also define

(5.1) $$ \begin{align} U^{**}_k := \{\mathbf{g}\in\mathbf{G}\colon \mathcal{M}_{s} (\chi_{U^{*}_k})(\mathbf{g})> \alpha \} \end{align} $$

and

(5.2) $$ \begin{align} U^{***}_k := \{\mathbf{g}\in\mathbf{G}\colon \mathcal{M}_{s} (\chi_{U^{**}_k})(\mathbf{g})> \alpha \}. \end{align} $$

By definition, $U_k \subset U^{*}_k\subset U^{**}_k\subset U^{***}_k$ , and so $|U_k| \leq |U^{*}_k|\leq |U^{**}_k|\leq |U^{***}_k|$ . However,

$$\begin{align*}|U^{***}_k| \lesssim |U^{**}_k|\lesssim |U^{*}_k|\lesssim |U_k|, \end{align*}$$

where the implicit constants are geometric, from the $\mathsf {L}^{2}(\mathbf {G})$ -boundedness of the strong maximal operator $\mathcal {M}_{s}$ .
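As an illustration, the estimate $|U^{*}_k| \lesssim |U_k|$ may be obtained from Chebyshev’s inequality as sketched below; here $\Vert \mathcal {M}_{s} \Vert $ stands for the operator norm of $\mathcal {M}_{s}$ on $\mathsf {L}^{2}(\mathbf {G})$ , a notation introduced only for this sketch.

$$ \begin{align*} |U^{*}_k| = \big|\big\{ \mathbf{g}\in\mathbf{G}\colon \mathcal{M}_{s}(\chi_{U_k})(\mathbf{g}) > \alpha \big\}\big| &\leq \alpha^{-2} \left\Vert \mathcal{M}_{s}(\chi_{U_k}) \right\Vert{}^2_{\mathsf{L}^{2}(\mathbf{G})} \\ &\leq \alpha^{-2} \Vert \mathcal{M}_{s} \Vert^2 \left\Vert \chi_{U_k} \right\Vert{}^2_{\mathsf{L}^{2}(\mathbf{G})} = \alpha^{-2} \Vert \mathcal{M}_{s} \Vert^2 \, |U_k| . \end{align*} $$

The same argument, applied to $\chi _{U^{*}_k}$ and $\chi _{U^{**}_k}$ , gives the other two inequalities in the chain.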

5.1 Proof of Theorem 1.1 for area functions $\mathcal {S}_{\psi ,\eta }(f)$

Recall from Section 4.2 that $\mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ is dense in $ \mathsf {L\,log\,L} (\mathbf {G})$ . Now we take a general $\psi $ satisfying the decay, smoothness and cancellation conditions (2.1) and (2.2), and prove that

(5.3) $$ \begin{align} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{S}_{\psi,\eta }(f)(\mathbf{g})>\lambda \big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall \lambda \in \mathbb{R}_{+} \end{align} $$

for all $f\in \mathsf {L\,log\,L} (\mathbf {G})$ . By sublinearity and density, it suffices to prove that

(5.4) $$ \begin{align} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{S}_{\psi,\eta }(f)(\mathbf{g})>1 \big\}\big| \lesssim F_\Phi(f) \end{align} $$

for all $f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ . We may and shall suppose that $\eta> 1$ .

From Theorem 4.4, we may write

$$\begin{align*}f=\sum_{k\in \mathbb Z}a_k , \end{align*}$$

where the $a_k$ satisfy the conditions of that theorem.

Observe that

$$ \begin{align*} \Big\Vert \sum_{k \leq 0}a_k\Big\Vert_{\mathsf{L}^{2}(\mathbf{G})} &\le \sum_{k \leq 0} \left\Vert a_k \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \lesssim \sum_{k \leq 0} 2^{k}F_\Phi(2^{-k}f)^{1/2} \lesssim F_\Phi(f)^{1/2}, \end{align*} $$

where the last inequality follows from Lemma 2.7.

Thus, by the $\mathsf {L}^{2}(\mathbf {G})$ boundedness of $\mathcal {S}_{\psi ,\eta }$ ,

(5.5) $$ \begin{align} \bigg|\bigg\{\mathbf{g}\in \mathbf{G}\colon \mathcal{S}_{\psi,\eta }\bigg(\sum_{k \leq 0}a_k\bigg)(\mathbf{g})>1\bigg\}\bigg| \lesssim \Big\Vert \sum_{k \leq 0}a_k\Big\Vert_{\mathsf{L}^{2}(\mathbf{G})}^2 \lesssim F_\Phi(f). \end{align} $$

To handle $\mathcal {S}_{\psi ,\eta }(\sum _{k \geq 1} a_k)$ , we use the fine structure of the atoms $a_k$ : each $a_k$ is supported in $U_k^\dagger $ , where $|U_k^\dagger |\lesssim 2^{-k}F_\Phi (f)$ , and $a_k=\sum _{R\in \mathscr {M}(U^*_k)} a_{k,R}$ .

For all $R = {Q_1} \times {Q_2} \in \mathscr {M}(U^*_k)$ , let ${\tilde {Q}_1}$ be the biggest pseudodyadic cube containing ${Q_1}$ such that ${\tilde {Q}_1}\times {Q_2}\subset U^{**}_k$ , where $U^{**}_k$ is defined in (5.1). Next, let ${\tilde {Q}_2}$ be the biggest pseudodyadic cube containing ${Q_2}$ such that ${\tilde {Q}_1}\times {\tilde {Q}_2} \subseteq U^{***}_k$ , where $U^{***}_k$ is defined in (5.2). Finally, let $R^{\dagger }$ be $100\beta _k({\tilde {Q}_1} \times {\tilde {Q}_2})$ , where $\beta _k = 2^{k/ (2\nu _1+2\nu _2)}$ . By Lemmas 3.5 and 3.1 and Theorem 4.4,

(5.6) $$ \begin{align} \begin{aligned} \bigg| \bigcup_{R \in \mathscr{M}(U^*_k)} R^{\dagger}\bigg| &\lesssim2^{k/2} |U_k^{\dagger}| \lesssim2^{k/2} |U_k^{***}| \lesssim2^{k/2} |U_k^{**}| \\ &\lesssim2^{k/2} |U_k^*| \lesssim2^{k/2} |U_k| \lesssim 2^{k/2} F_\Phi(2^{-k} f). \end{aligned} \end{align} $$
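The factor $2^{k/2}$ in (5.6) reflects the choice of $\beta _k$ . As a sketch, assume (as is usual when $\nu _1$ and $\nu _2$ are the homogeneous dimensions of the two factors) that dilating a rectangle by a factor $\lambda \geq 1$ multiplies its measure by a constant multiple of $\lambda ^{\nu _1+\nu _2}$ . Then

$$ \begin{align*} |R^{\dagger}| = \big| 100\beta_k ({\tilde{Q}_1} \times {\tilde{Q}_2}) \big| \eqsim (100\beta_k)^{\nu_1+\nu_2} \, |{\tilde{Q}_1} \times {\tilde{Q}_2}|, \qquad \beta_k^{\nu_1+\nu_2} = 2^{k(\nu_1+\nu_2)/(2\nu_1+2\nu_2)} = 2^{k/2}. \end{align*} $$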

We claim that there exists $\delta \in \mathbb {R}_{+}$ such that

(5.7) $$ \begin{align} \sum_{R\in \mathscr{M}(U^*_k)} \int_{(R^{\dagger})^c} |\mathcal{S}_{\psi,\eta }(a_{k,R})(\mathbf{g})|\,\mathrm{d}\mathbf{g} \lesssim 2^{-\delta k} F_\Phi(f) \qquad\forall k \in \mathbb{N}. \end{align} $$

Assume (5.7) for the moment, and let $E^\dagger :=\bigcup _{k \geq 1}\bigcup _{R \in \mathscr {M}(U^*_k)} R^{\dagger }$ . Then on the one hand,

$$ \begin{align*} |E^\dagger| = \Big|\bigcup_{k \geq 1}\bigcup_{R \in \mathscr{M}(U^*_k)} R^{\dagger}\Big| &\lesssim\sum_{k \geq 1} 2^{k/2} F_\Phi(2^{-k} f) \lesssim F_\Phi(f) \end{align*} $$

by (5.6). On the other hand,

$$ \begin{align*} &\int_{(E^\dagger)^c}\Big|\mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1}a_k\Big)(\mathbf{g})\Big|\,\mathrm{d} \mathbf{g} \\ &\qquad\lesssim \int_{(E^\dagger)^c}\Big|\mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1}\sum_{R\in \mathscr{M}(U^*_k)} a_{k,R}\Big)(\mathbf{g})\Big|\,\mathrm{d} \mathbf{g} \\ &\qquad\lesssim \sum_{k \geq 1}\sum_{R\in \mathscr{M}(U^*_k)} \int_{(E^\dagger)^c}\Big|\mathcal{S}_{\psi,\eta }\Big( a_{k,R}\Big)(\mathbf{g})\Big|\,\mathrm{d} \mathbf{g} \\ &\qquad\lesssim F_\Phi(f), \end{align*} $$

where the last inequality follows by summing (5.7) over positive k.

By Chebyshev’s inequality,

$$ \begin{align*} \Big|\Big\{ \mathbf{g} \in \mathbf{G} \colon\mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1} a_k \Big)(\mathbf{g})>1 \Big\} \Big| &\leq |E^\dagger|+ \int_{(E^\dagger)^c}\Big|\mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1}a_k\Big) (\mathbf{g})\Big|\,\mathrm{d} \mathbf{g} \lesssim F_\Phi(f). \end{align*} $$

Together with (5.5), this implies (5.4), and hence (5.3).
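For the reader’s convenience, the two elementary steps used above may be spelled out as follows; this is only a sketch of the bookkeeping. First, since $(E^\dagger )^c \subseteq (R^\dagger )^c$ for every $R \in \mathscr {M}(U^*_k)$ and every $k \geq 1$ , summing (5.7) over k gives a geometric series:

$$ \begin{align*} \sum_{k \geq 1}\sum_{R\in \mathscr{M}(U^*_k)} \int_{(E^\dagger)^c} |\mathcal{S}_{\psi,\eta }(a_{k,R})(\mathbf{g})|\,\mathrm{d}\mathbf{g} \lesssim \sum_{k \geq 1} 2^{-\delta k} F_\Phi(f) = \frac{1}{2^{\delta}-1} \, F_\Phi(f). \end{align*} $$

Second, the level set is split into the part inside $E^\dagger $ and the part outside, and Chebyshev’s inequality is applied to the latter:

$$ \begin{align*} \Big|\Big\{ \mathbf{g} \in (E^\dagger)^c \colon \mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1} a_k \Big)(\mathbf{g})>1 \Big\} \Big| \leq \int_{(E^\dagger)^c}\Big|\mathcal{S}_{\psi,\eta }\Big( \sum_{k \geq 1}a_k\Big) (\mathbf{g})\Big|\,\mathrm{d} \mathbf{g}. \end{align*} $$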

It remains to prove (5.7). Now

$$ \begin{align*} &\int_{(R^{\dagger})^c} |\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1, g_2)|\,\mathrm{d} g_2 \,\mathrm{d} g_1\\ &\quad\leq \int_{(100\beta_k{\tilde{Q}_1})^c}\int_{100{Q_2}}|\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1, g_2)|\,\mathrm{d} g_2 \,\mathrm{d} g_1 \\ &\qquad +\int_{(100\beta_k{\tilde{Q}_1})^c}\int_{(100{Q_2})^c}|\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1, g_2)|\,\mathrm{d} g_2 \,\mathrm{d} g_1 \\ &\qquad +\int_{(100\beta_k{\tilde{Q}_2})^c}\int_{100{Q_1}} |\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1, g_2)| \,\mathrm{d} g_1 \,\mathrm{d} g_2 \\ &\qquad +\int_{(100\beta_k{\tilde{Q}_2})^c}\int_{(100{Q_1})^c} |\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1, g_2)| \,\mathrm{d} g_1 \,\mathrm{d} g_2 \\ &\quad= \mathrm{I}_1(R) + \mathrm{I}_2(R) + \mathrm{I}_3(R) + \mathrm{I}_4(R), \end{align*} $$

say, where $\beta _k=2^{k/(2\nu _1+2\nu _2)} \eta $ . It suffices to control $\mathrm {I}_1(R)$ and $\mathrm {I}_2(R)$ , as the other two terms are similar.

By Hölder’s inequality and the $ \mathsf {L}^{2}(G_2)$ -boundedness of $\mathcal {S}_{\psi ^{[2]},\eta }$ ,

$$ \begin{align*} \mathrm{I}_1(R) &\lesssim |{Q_2}|^{1/2} \int_{(100\beta_k {\tilde{Q}_1})^c} \bigg(\int_{100{Q_2}} \big( \mathcal{S}_{\psi^{[1]},\eta }(a_{k,R})(g_1,g_2)\big) ^2 \,\mathrm{d} g_2\bigg)^{1/2} \,\mathrm{d} g_1. \end{align*} $$

We use the cancellation and support restrictions on $a_{k,R}$ in the first variable, (2.1), homogeneity, the geometry (namely, $h_1, z({Q_1}) \in {Q_1}$ and $g_1 \in (100\beta _k {\tilde {Q}_1})^c$ ) and Hölder’s inequality to deduce that

$$ \begin{align*} &|a_{k,R} \ast_1 \psi_t^{[1]}(g_1,g_2)| \\ &\qquad=\bigg|\int_{G_1} a_{k,R}(h_1,g_2)t^{-\nu_1} \big(\psi^{[1]}(\delta_{1/t}(h_1^{-1}g_1))-\psi^{[1]}(\delta_{1/t}(z({Q_1})^{-1}g_1))\big) \,\mathrm{d}{h}_1\bigg|\\ &\qquad\lesssim \int_{G_1} \left| a_{k,R}(h_1,g_2)\right| \frac{t^{-\nu_1} \big( \rho(\delta_{1/t}(z({Q_1})^{-1}g_1)\delta_{1/t}(h_1^{-1}g_1)^{-1}) \big)^{\epsilon}} {\big( 1 + \rho(\delta_{1/t}(h_1^{-1}g_1)) + \rho(\delta_{1/t}(z({Q_1})^{-1}g_1))\big)^{\nu_1+2\epsilon}} \,\mathrm{d}{h}_1 \\ &\qquad= \int_{G_1} \left| a_{k,R}(h_1,g_2)\right| \frac{t^{\epsilon} \big( \rho( z({Q_1})^{-1} h_1) \big)^{\epsilon}} {\big( t + \rho(h_1^{-1}g_1) + \rho(z({Q_1})^{-1}g_1)\big)^{\nu_1+2\epsilon}} \,\mathrm{d}{h}_1 \\ &\qquad\lesssim \int_{G_1} \left| a_{k,R}(h_1,g_2)\right| \frac{t^{\epsilon} \operatorname{\ell}({Q_1})^{\epsilon}} {\big( t + \rho(z({Q_1})^{-1}g_1)\big)^{\nu_1+2\epsilon}} \,\mathrm{d}{h}_1 \\ &\qquad\eqsim \frac{t^{\epsilon} \operatorname{\ell}({Q_1})^{\epsilon}} {\big( t + \rho(z({Q_1})^{-1}g_1) \big)^{\nu_1+2\epsilon}} \left\Vert a_{k,R}(\cdot,g_2) \right\Vert{}_{\mathsf{L}^{1}(G_1)}, \end{align*} $$

where $z({Q_1})$ is the center of ${Q_1}$ . Hence

(recall that $⨍$ denotes the “average integral”). This estimate implies that

$$ \begin{align*} \mathrm{I}_1(R) &= \int_{(100\beta_k {\tilde{Q}_1})^c}\int_{100{Q_2}}\left|\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1,g_2)\right|\,\mathrm{d} g_2 \,\mathrm{d} g_1 \\ &\leq \int_{(100\beta_k {\tilde{Q}_1})^c} |100{Q_2}|^{1/2} \left( \int_{100{Q_2}}|\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1,g_2)|^2\,\mathrm{d} g_2 \right) ^{1/2} \,\mathrm{d} g_1\\ &\leq \int_{(100\beta_k {\tilde{Q}_1})^c} |100{Q_2}|^{1/2} \left( \int_{G_2}|\mathcal{S}_{\psi,\eta }(a_{k,R})(g_1,g_2)|^2\,\mathrm{d} g_2 \right) ^{1/2} \,\mathrm{d} g_1\\ &\lesssim \int_{(100\beta_k {\tilde{Q}_1})^c} |{Q_2}|^{1/2} \frac{\operatorname{\ell}({Q_1})^{\epsilon} |{Q_1}|^{1/2} }{\rho_1(z({Q_1})^{-1}g_1)^{\nu_1+\epsilon}} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \,\mathrm{d} g_1\\ &= \operatorname{\ell}({Q_1})^{\epsilon} \int_{(100\beta_k {\tilde{Q}_1})^c} \frac1{\rho_1(z({Q_1})^{-1}g_1)^{\nu_1+\epsilon}} \,\mathrm{d} g_1 \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \\ &\eqsim_\epsilon \beta_k ^{-\epsilon} \gamma_1(R)^{-\epsilon} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}. \end{align*} $$
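The integral in the penultimate line can be estimated by a decomposition into dyadic annuli. The following is a sketch under two assumptions that are not restated in this section: that $|B_1(z,s)| \eqsim s^{\nu _1}$ for all $s \in \mathbb {R}_{+}$ , and that $\rho _1(z({Q_1})^{-1}g_1) \geq r$ whenever $g_1 \in (100\beta _k {\tilde {Q}_1})^c$ , where r is a fixed multiple of $\beta _k \operatorname {\ell }({\tilde {Q}_1})$ . Writing $z = z({Q_1})$ ,

$$ \begin{align*} \int_{\rho_1(z^{-1}g_1) \geq r} \frac{1}{\rho_1(z^{-1}g_1)^{\nu_1+\epsilon}} \,\mathrm{d} g_1 &= \sum_{j \geq 0} \int_{2^j r \leq \rho_1(z^{-1}g_1) < 2^{j+1} r} \frac{1}{\rho_1(z^{-1}g_1)^{\nu_1+\epsilon}} \,\mathrm{d} g_1 \\ &\lesssim \sum_{j \geq 0} (2^j r)^{-\nu_1-\epsilon} (2^{j+1} r)^{\nu_1} = \frac{2^{\nu_1}}{1-2^{-\epsilon}} \, r^{-\epsilon}. \end{align*} $$

Multiplying by $\operatorname {\ell }({Q_1})^{\epsilon }$ then yields a multiple of $\beta _k^{-\epsilon } \big ( \operatorname {\ell }({\tilde {Q}_1}) / \operatorname {\ell }({Q_1}) \big )^{-\epsilon }$ , which agrees with the last line provided that $\gamma _1(R)$ is comparable to the ratio $\operatorname {\ell }({\tilde {Q}_1}) / \operatorname {\ell }({Q_1})$ .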

To estimate $\mathrm {I}_2(R)$ , we argue as follows. For $g_1\notin 100\beta _k {\tilde {Q}_1}$ and $g_2\notin 100 {Q_2}$ ,

and because of the cancellation of $a_{k,R}$ and properties of $\psi ^{[1]}$ and $\psi ^{[2]}$ ,

$$ \begin{align*} &\big| a_{k,R}\ast\psi_{\mathbf{t}}(\mathbf{g}) \big| \\ &\qquad= \bigg| \int_{\mathbf{G}} a_{k,R}(\mathbf{h}) \psi_{\mathbf{t}}(\mathbf{h}^{-1}\mathbf{g}) \,\mathrm{d} \mathbf{h} \bigg| \\ &\qquad= \bigg| \int_{\mathbf{G}} a_{k,R}(h_1,h_2) \psi^{[1]}_{t_1}(h_1^{-1}g_1) \psi^{[2]}_{t_2}(h_2^{-1}g_2) \,\mathrm{d} h_2 \,\mathrm{d} h_1 \bigg| \\ &\qquad= \bigg| \int_{\mathbf{G}} a_{k,R}(h_1,h_2) \big( \psi^{[1]}_{t_1}(h_1^{-1}g_1) - \psi^{[1]}_{t_1}(z({Q_1})^{-1}g_1) \big)\\ &\qquad\qquad\qquad\qquad \times\big( \psi^{[2]}_{t_2}(h_2^{-1}g_2) -\psi^{[2]}_{t_2}(z({Q_2})^{-1}g_2)\big)\,\mathrm{d} h_2 \,\mathrm{d} h_1 \bigg| \\ &\qquad\leq \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{1}(\mathbf{G})} \sup\left\{ \left| \psi^{[1]}_{t_1}(h_1^{-1}g_1) - \psi^{[1]}_{t_1}(z({Q_1})^{-1}g_1) \right| \colon h_1 \in {Q_1}, g_1 \in (100b {Q_1})^c \right\}\\ &\qquad\qquad\qquad \times\sup\left\{ \left| \psi^{[2]}_{t_2}(h_2^{-1}g_2) - \psi^{[2]}_{t_2}(z({Q_2})^{-1}g_2) \right| \colon h_2 \in {Q_2}, g_2 \in (100b {Q_2})^c \right\}\\ &\qquad\lesssim \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{1}(\mathbf{G})} \frac{t_1^{\epsilon} \operatorname{\ell}({Q_1})^{\epsilon}}{\big( t_1 + \rho_1(g_1^{-1}z({Q_1}))\big)^{\nu_1+2\epsilon}} \frac{t_2^{\epsilon}\operatorname{\ell}({Q_2})^{\epsilon}}{\big( t_2 + \rho_2(g_2^{-1}z({Q_2}))\big)^{\nu_2+2\epsilon}} \,, \end{align*} $$

by the same geometrical arguments as used to treat $\mathrm {I}_1(R)$ , applied in both variables. Hence $\big (\mathcal {S}_{\psi ,\eta }(a_{k,R})(\mathbf {g})\big )^2$ is dominated by a multiple of $\left \Vert a_{k,R} \right \Vert {}_{\mathsf {L}^{1}(\mathbf {G})} $ by

Our estimation of $\mathrm {I}_2(R)$ concludes with the observation that

$$ \begin{align*} \mathrm{I}_2(R) &= \int_{(100\beta_k {\tilde{Q}_1})^c}\int_{(100{Q_2})^c}| \mathcal{S}_{\psi,\eta }(a_{k,R})(g_1,g_2)|\,\mathrm{d} g_2 \,\mathrm{d} g_1 \\ &\lesssim \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{1}(\mathbf{G})} \int_{(100\beta_k {\tilde{Q}_1})^c}\int_{(100{Q_2})^c}\\&\qquad \frac{ \operatorname{\ell}({Q_1})^{\epsilon}}{\big( \rho_1(g_1^{-1}z({Q_1}))\big)^{\nu_1+\epsilon}} \frac{ \operatorname{\ell}({Q_2})^{\epsilon}}{\big( \rho_2(g_2^{-1}z({Q_2}))\big)^{\nu_2+\epsilon}} \,\mathrm{d} g_2 \,\mathrm{d} g_1 \\ &\lesssim \frac{ \operatorname{\ell}({Q_1})^{\epsilon}}{(100 \beta_k \operatorname{\ell}({\tilde{Q}_1}))^{\epsilon}} \frac{ \operatorname{\ell}({Q_2})^{\epsilon}}{(100 \operatorname{\ell}({Q_2}))^{\epsilon}} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})} \\ &\eqsim \beta_k ^{-\epsilon} \gamma_1(R)^{-\epsilon} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}. \end{align*} $$

Then, by combining the estimates of $\mathrm {I}_1(R)$ and $\mathrm {I}_2(R)$ , and then using Hölder’s inequality and Lemma 2.5, we see that for all $k \geq 1$ ,

$$ \begin{align*} &\sum_{R\in \mathscr{M}(U^*_k)} \big( \mathrm{I}_1(R) + \mathrm{I}_2(R) \big) \\ &\qquad\lesssim \beta_k ^{-\epsilon} \sum_{R\in \mathscr{M}(U^*_k)} \gamma_1(R)^{-\epsilon} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}\\ &\qquad\leq \beta_k ^{-\epsilon}\bigg(\sum_{R\in \mathscr{M}(U^*_k)} \gamma_1(R)^{-2\epsilon}\left| R \right| \bigg)^{1/2}\bigg(\sum_{R\in \mathscr{M}(U^*_k)} \left\Vert a_{k,R} \right\Vert{}^2_{\mathsf{L}^{2}(\mathbf{G})}\bigg)^{1/2}\\ &\qquad\lesssim_\epsilon \beta_k ^{-\epsilon} |U_k|^{1/2} \Big( 2^{2k} F_\Phi(2^{-k} f) \Big) ^{1/2}\\ &\qquad\lesssim \beta_k ^{-\epsilon} 2^{k} F_\Phi(2^{-k} f)\\ &\qquad\lesssim 2^{-\epsilon k/(2\nu_1+2\nu_2)} F_\Phi(f). \end{align*} $$

Similar estimates hold for $\mathrm {I}_3(R)$ and $\mathrm {I}_4(R)$ , but with $\gamma _2$ in place of $\gamma _1$ , and hence (5.7) holds, and the proof is complete.
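To make the exponent in (5.7) explicit: the last display suggests the value of $\delta $ shown below. This is a sketch that takes $\beta _k = 2^{k/(2\nu _1+2\nu _2)}$ , up to a constant factor independent of k, and uses the bound $2^{k} F_\Phi (2^{-k} f) \lesssim F_\Phi (f)$ for $k \geq 1$ , which is the last step of that display.

$$ \begin{align*} \beta_k^{-\epsilon} \, 2^{k} F_\Phi(2^{-k} f) \lesssim 2^{-\epsilon k/(2\nu_1+2\nu_2)} F_\Phi(f) = 2^{-\delta k} F_\Phi(f), \qquad\text{where}\qquad \delta = \frac{\epsilon}{2\nu_1+2\nu_2}. \end{align*} $$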

5.2 Proof of Theorem 1.1 for square functions $\mathcal {S}_{\psi ,0}(f)$

Arguing much as in Section 5.1, we may also check that

(5.8) $$ \begin{align} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{S}_{\psi,0}(f)(\mathbf{g})>\lambda \big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall\lambda\in\mathbb{R}_{+} \end{align} $$

for all $f\in \mathsf {L\,log\,L} (\mathbf {G})$ . We leave the details to the reader.

5.3 Hyperweak boundedness of Riesz transformations

We write $\mathcal {R}$ for a double Riesz transformation $\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$ .

Theorem 5.1 The double Riesz transformations satisfy the endpoint estimate:

(5.9) $$ \begin{align} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{R}(f)(\mathbf{g})>\lambda \big\}\big| \lesssim F_\Phi(f/\lambda) \qquad\forall\lambda\in\mathbb{R}_{+} \end{align} $$

for all $f\in \mathsf {L\,log\,L} (\mathbf {G})$ .

Proof By density and sublinearity, we need only prove that, for all $f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$ ,

(5.10) $$ \begin{align} \big|\big\{ \mathbf{g}\in \mathbf{G}\colon \mathcal{R}(f)(\mathbf{g})>1 \big\}\big| \lesssim F_\Phi(f). \end{align} $$

From Theorem 4.4, we may write

$$\begin{align*}f=\sum_{k\in \mathbb Z}a_k \end{align*}$$

where the $a_k$ satisfy the conditions specified there. By the $\mathsf {L}^{2}(\mathbf {G})$ boundedness of $\mathcal {R}$ ,

(5.11) $$ \begin{align} \Big|\Big\{\mathbf{g}\in \mathbf{G}\colon \mathcal{R}\bigg(\sum_{k \leq 0} a_k\bigg)(\mathbf{g})>1 \Big\}\Big| \lesssim \Big\|\sum_{k \leq 0}a_k\Big\|_{\mathsf{L}^{2}(\mathbf{G})}^2 \lesssim F_\Phi(f). \end{align} $$

To handle $\mathcal {R}(\sum _{k \geq 1} a_k)$ , we consider the $a_k$ in more detail. The atom $a_k$ is supported in $U^\dagger _k$ , where $|U^\dagger _k|\lesssim F_\Phi (2^{-k} f)$ , and we may write $a_k=\sum _{R\in \mathscr {M}(U^*_k)} a_{k,R}$ .

Again, for all $R = {Q_1} \times {Q_2} \in \mathscr {M}(U^*_k)$ , let ${\tilde {Q}_1}$ be the biggest pseudodyadic cube containing ${Q_1}$ such that ${\tilde {Q}_1}\times {Q_2}\subset U^{**}_k$ , where $U^{**}_k$ is defined in (5.1). Next, let ${\tilde {Q}_2}$ be the biggest pseudodyadic cube containing ${Q_2}$ such that ${\tilde {Q}_1}\times {\tilde {Q}_2} \subseteq U^{***}_k$ , where $U^{***}_k$ is defined in (5.2). Now let $R^{\dagger }$ be $100\beta _k({\tilde {Q}_1} \times {\tilde {Q}_2})$ , where $\beta _k = 2^{k/ (2\nu _1+2\nu _2)}$ . By Lemmas 3.5 and 3.1 and Theorem 4.4,

$$ \begin{align*} \bigg| \bigcup_{R \in \mathscr{M}(U^*_k)} R^{\dagger}\bigg| &\lesssim 2^{k/2} |U_k^{***}| \lesssim 2^{k/2} |U_k^{**}| \lesssim 2^{k/2} |U_k^*| \lesssim 2^{k/2} |U_k| \lesssim 2^{k/2} F_\Phi(2^{-k} f). \end{align*} $$

As in the proof of Theorem 1.1, to prove that

(5.12) $$ \begin{align} \bigg|\bigg\{\mathbf{g}\in \mathbf{G}\colon \mathcal{R}\bigg(\sum_{k \geq 1} a_k\bigg)(\mathbf{g})>1\bigg\}\bigg| \lesssim F_\Phi(f), \end{align} $$

it suffices to show that, for some $\delta \in \mathbb {R}_{+}$ ,

(5.13) $$ \begin{align} \sum_{R\in \mathscr{M}(U^*_k)} \int_{(R^{\dagger})^c} |\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g} \lesssim 2^{-\delta k} F_\Phi(f) \end{align} $$

for every $k \geq 1$ . If (5.13) holds, then (5.12) follows, and (5.11) and (5.12) together imply (5.10). We now prove (5.13).

$$ \begin{align*} \int_{(R^{\dagger})^c} |\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g} &\le \iint_{g_1\notin 100\beta_k{\tilde{Q}_1}} |\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g} + \iint_{g_2\notin100\beta_k{\tilde{Q}_2}} |\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g} \\ &=:\mathrm{I}_1(R) + \mathrm{I}_2(R), \end{align*} $$

say, where $\beta _k=2^{{k/(2\nu _1+2\nu _2)}}$ . Since the estimates of $\mathrm {I}_1(R)$ and $\mathrm {I}_2(R)$ are symmetric, we only estimate $\mathrm {I}_1(R)$ . Note that

$$ \begin{align*} \mathrm{I}_1(R)&=\int_{g_1\notin 100\beta_k{\tilde{Q}_1}}\int_{100{Q_2}}|\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g}+\int_{g_1\notin 100\beta_k{\tilde{Q}_1}}\int_{(100{Q_2})^c}|\mathcal{R}(a_{k,R})(\mathbf{g})|\,\mathrm{d} \mathbf{g} \\ &=:\mathrm{I}_{11}(R)+\mathrm{I}_{12}(R), \end{align*} $$

say. By Hölder’s inequality and the $\mathsf {L}^{2}(G_2)$ -boundedness of $\mathcal {R}^{[2]}_{j_2}$ ,

$$ \begin{align*} \mathrm{I}_{11}(R) &\lesssim |{Q_2}|^{1/2}\int_{g_1\notin 100\beta_k{\tilde{Q}_1}} \bigg(\int_{Q_2} \big( \mathcal{R}^{[1]}_{j_1}(a_{k,R})(g_1,g_2)\big) ^2 \,\mathrm{d} g_2\bigg)^{1/2} \,\mathrm{d} g_1. \end{align*} $$

By the cancellation condition on $a_{k,R}(\cdot ,g_2)$ and the smoothness of $\mathcal {R}^{[1]}_{j_1}$ ,

$$ \begin{align*} \left| \mathcal{R}^{[1]}_{j_1}(a_{k,R})(\mathbf{g}) \right| &= \left| \int_{{G_1}} \big( \mathcal{R}^{[1]}_{j_1}(g_1,h_1)-\mathcal{R}^{[1]}_{j_1}(g_1,z({Q_1}))\big) a_{k,R}(h_1,g_2) \,\mathrm{d} h_1 \right| \\ &\lesssim \frac{\operatorname{\ell}({Q_1})^{\epsilon}}{\rho_1(g_1,z({Q_1}))^{\nu_1+\epsilon}} \int_{{G_1}} \left| a_{k,R}(h_1,g_2) \right| \,\mathrm{d} h_1\\ &\lesssim \frac{\operatorname{\ell}({Q_1})^{\epsilon}}{\rho_1(g_1,z({Q_1}))^{\nu_1+\epsilon}} |{Q_1}|^{1/2}\bigg(\int_{{G_1}} |a_{k,R}(h_1,g_2)|^2 \,\mathrm{d} h_1\bigg)^{1/2}, \end{align*} $$

where $z({Q_1})$ is the center of ${Q_1}$ . Hence

$$ \begin{align*} \mathrm{I}_{11}(R) \lesssim \beta_k^{-\epsilon}\gamma_{1}(R)^{-\epsilon} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}. \end{align*} $$

To estimate $\mathrm {I}_{12}(R)$ , we use the cancellation of $a_{k,R}$ to write $\mathcal {R}(a_{k,R})(\mathbf {g})$ as

$$ \begin{align*} &\int_{\beta_k R} \big( \mathcal{R}^{[1]}_{j_1}(g_1,h_1) - \mathcal{R}^{[1]}_{j_1}(g_1, z({Q_1}))\big) \big( \mathcal{R}^{[2]}_{j_2}(g_2,h_2)\\&\quad-\mathcal{R}^{[2]}_{j_2}(g_2, z({Q_2}))\big) a_{k,R}(h_1,h_2) \,\mathrm{d} h_1 \,\mathrm{d} h_2, \end{align*} $$

where $z({Q_2})$ is the center of ${Q_2}$. Using the smoothness of $\mathcal {R}^{[1]}_{j_1}$ and $\mathcal {R}^{[2]}_{j_2}$, we deduce that

$$ \begin{align*} \mathrm{I}_{12}(R)&\lesssim \int_{g_1\notin 100\beta_k{\tilde{Q}_1}}\int_{(100{Q_2})^c} \frac{\operatorname{\ell}({Q_1})^{\epsilon}}{\rho_1(g_1,z({Q_1}))^{\nu_1+\epsilon}}\\&\qquad\qquad\quad \frac{\operatorname{\ell}({Q_2})^{\epsilon}}{\rho_2(g_2,z({Q_2}))^{\nu_2+\epsilon}} \,\mathrm{d} g_1 \,\mathrm{d} g_2 \left| R\right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}\\ &\lesssim \beta_k^{-\epsilon}\gamma_1(R)^{-\epsilon} \left| R \right|{}^{1/2} \left\Vert a_{k,R} \right\Vert{}_{\mathsf{L}^{2}(\mathbf{G})}. \end{align*} $$
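
The second inequality can be unpacked as follows. This is a sketch only, assuming the tail bound $\int_{\rho_i(g_i,z)\geq r} \rho_i(g_i,z)^{-\nu_i-\epsilon} \,\mathrm{d} g_i \lesssim r^{-\epsilon}$ for $i=1,2$ (valid when balls of radius $r$ in $G_i$ have measure comparable to $r^{\nu_i}$), and assuming that the two regions of integration lie at distance at least a constant multiple of $\beta_k\operatorname{\ell}({\tilde{Q}_1})$ from $z({Q_1})$ and of $\operatorname{\ell}({Q_2})$ from $z({Q_2})$, respectively. The integrand is a product of a function of $g_1$ and a function of $g_2$, so the double integral factorizes:

$$ \begin{align*} &\bigg(\int_{g_1\notin 100\beta_k{\tilde{Q}_1}} \frac{\operatorname{\ell}({Q_1})^{\epsilon}}{\rho_1(g_1,z({Q_1}))^{\nu_1+\epsilon}} \,\mathrm{d} g_1\bigg) \bigg(\int_{(100{Q_2})^c} \frac{\operatorname{\ell}({Q_2})^{\epsilon}}{\rho_2(g_2,z({Q_2}))^{\nu_2+\epsilon}} \,\mathrm{d} g_2\bigg) \\ &\qquad\lesssim \bigg(\frac{\operatorname{\ell}({Q_1})}{\beta_k\operatorname{\ell}({\tilde{Q}_1})}\bigg)^{\epsilon} \bigg(\frac{\operatorname{\ell}({Q_2})}{\operatorname{\ell}({Q_2})}\bigg)^{\epsilon} = \beta_k^{-\epsilon} \bigg(\frac{\operatorname{\ell}({\tilde{Q}_1})}{\operatorname{\ell}({Q_1})}\bigg)^{-\epsilon}. \end{align*} $$

Only the first variable contributes decay; the factor coming from $g_2$ is merely bounded.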

Then, by continuing as in Section 5.1, we establish (5.13) and complete the proof.

Acknowledgements

The authors thank the referees for their careful reading and helpful comments.

Footnotes

M.G.C. and J.L. are supported by ARC DP 220100285 and NNSF 12171221. M.-Y.L. is supported by NSTC 112-2115-M-008-001-MY2.

References

Chang, S.-Y. A. and Fefferman, R., A continuous version of duality of ${H}^1$ with BMO on the bidisc. Ann. Math. 112(1980), 179–201.
Chen, P., Cowling, M. G., Lee, M.-Y., Li, J., and Ottazzi, A., Flag Hardy space theory on Heisenberg groups and applications. Preprint, arXiv:2102.07371.
Coifman, R. R. and Weiss, G., Analyse harmonique non-commutative sur certains espaces homogènes, Lecture Notes in Mathematics, 242, Springer Verlag, Berlin, 1971.
Coifman, R. R. and Weiss, G., Extensions of Hardy spaces and their use in analysis. Bull. Amer. Math. Soc. 83(1977), 569–645.
Córdoba, A. and Fefferman, R., A geometric proof of the strong maximal theorem. Ann. Math. 102(1975), 95–100.
Cowling, M. G., Fan, Z., Li, J., and Yan, L., Characterizations of product Hardy spaces on stratified groups by singular integrals and maximal functions. To appear in The Mathematical Heritage of Guido Weiss. Preprint, arXiv:2210.01265.
Christ, M., A $T(b)$ theorem with remarks on analytic capacity and the Cauchy integral. Colloq. Math. 61(1990), 601–628.
Fefferman, R., A note on a lemma of Zo. Proc. Amer. Math. Soc. 96(1986), 241–246.
Fefferman, R., Harmonic analysis on product spaces. Ann. Math. 126(1987), 109–130.
Fefferman, R. and Stein, E. M., Singular integrals on product spaces. Adv. Math. 45(1982), 117–143.
Ferguson, S. H. and Lacey, M. T., A characterization of product BMO by commutators. Acta Math. 189(2002), 143–160.
Folland, G. B. and Stein, E. M., Hardy spaces on homogeneous groups. Princeton University Press, Princeton, NJ, 1982.
Geller, D. and Mayeli, A., Continuous wavelets and frames on stratified Lie groups I. J. Fourier Anal. Appl. 12(2006), 543–579.
Han, Y. S., Li, J., and Lin, C. C., Criterion of the ${L}^2$ boundedness and sharp endpoint estimates for singular integral operators on product spaces of homogeneous type. Ann. Scuola Norm. Sup. Pisa Cl. Sci. 16(2016), no. 5, 845–907.
Hebisch, W. and Sikora, A., A smooth subadditive homogeneous norm on a homogeneous group. Stud. Math. 96(1990), 231–236.
Helffer, B., Conditions nécessaires d’hypoanalyticité pour des opérateurs invariants à gauche homogènes sur un groupe nilpotent gradué. J. Differ. Equ. 44(1982), 466–481.
Hytönen, T. and Kairema, A., Systems of dyadic cubes in a doubling metric space. Colloq. Math. 126(2012), 1–33.
Hytönen, T. and Martikainen, H., Non-homogeneous $T1$ theorem for bi-parameter singular integrals. Adv. Math. 261(2014), 220–273.
Jessen, B., Marcinkiewicz, J., and Zygmund, A., Note on the differentiability of multiple integrals. Fund. Math. 25(1935), 217–234.
Journé, J.-L., Calderón–Zygmund operators on product spaces. Rev. Mat. Iberoam. 1(1985), 55–91.
Kairema, A., Li, J., Pereyra, M., and Ward, L. A., Haar bases on quasi-metric measure spaces, and dyadic structure theorems for function spaces on product spaces of homogeneous type. J. Funct. Anal. 271(2016), no. 7, 1793–1843.
Lacey, M., Petermichl, S., Pipher, J., and Wick, B. D., Multiparameter Riesz commutators. Amer. J. Math. 131(2009), 731–769.
Li, J., Pipher, J., and Ward, L., Dyadic structure theorems for multiparameter function spaces. Rev. Mat. Iberoam. 31(2015), 767–797.
Luxemburg, W. A. J., Banach function spaces. Ph.D. thesis, T.U. Delft, 1955.
Merryfield, K. G., On the area integral, Carleson measures and ${H}^p$ in the polydisc. Indiana Univ. Math. J. 34(1985), 663–685.
Nagel, A. and Stein, E. M., On the product theory of singular integrals. Rev. Mat. Iberoam. 20(2004), 531–561.
Ou, Y., A $T(b)$ theorem on product spaces. Trans. Amer. Math. Soc. 367(2015), 6159–6197.
Pipher, J., Journé’s covering lemma and its extension to higher dimensions. Duke Math. J. 53(1986), 683–690.
Sawyer, E. and Wheeden, R., Weighted inequalities for fractional integrals on Euclidean and homogeneous spaces. Amer. J. Math. 114(1992), 813–874.
Varopoulos, N. T., Saloff-Coste, L., and Coulhon, T., Analysis and geometry on groups. In: Cambridge tracts in mathematics 100, Cambridge University Press, Cambridge, 1992.