1 Introduction and statement of main results
Since the 1980s, the development of multiparameter harmonic analysis has proceeded apace; recent contributions in the area include [Reference Chang and Fefferman1, Reference Fefferman8–Reference Ferguson and Lacey11, Reference Hytönen and Martikainen18, Reference Journé20, Reference Lacey, Petermichl, Pipher and Wick22, Reference Nagel and Stein26, Reference Ou27, Reference Pipher28]. Much of the product space theory on
$\mathbb {R}^m \times \mathbb {R}^n$
, including the duality of
$\mathsf {H}^1$
with
$\mathsf {BMO}$
, characterization of
$\mathsf {H}^1$
by square functions and atomic decompositions, and interpolation, has been extended to more general products of spaces of homogeneous type.
In contrast to the classical theory of singular integrals, in multiparameter harmonic analysis product singular integrals are not of weak type
$(1,1)$
. For functions supported on the unit cube, the classical weak type
$(1,1)$
estimate was replaced by a
$\mathsf {L\,log\,L} $
to
$\mathsf {L}^{1,\infty }$
estimate by R. Fefferman [Reference Fefferman8]. More precisely, let
$\mathcal {T}$
be a product Calderón–Zygmund operator on
$\mathsf {L}^{2}(\mathbb {R}^m\times \mathbb {R}^n)$
as in [Reference Journé20] or [Reference Fefferman8]; then for every
$\lambda \in \mathbb {R}_{+}$
,

for all f supported in the product
$Q_1 \times Q_2$
of the unit cubes in
$\mathbb {R}^m$
and
$\mathbb {R}^n$
, where

here
$\log ^+t :=\log (\max \{1,t\})$
. The proof in [Reference Fefferman8] relies on the boundedness of the strong maximal function and the area function from
$\mathsf {L\,log\,L}$
to
$\mathsf {L}^{1,\infty }$
, the local atomic decomposition of functions in
$\mathsf {L\,log\,L}$
produced using the
$\mathsf {L\,log\,L} $
to
$\mathsf {L}^{1,\infty }$
boundedness of the area function, and the boundedness of
$\mathcal {T}$
on
$\mathsf {L\,log\,L}$
atoms.
Fefferman’s inequality is similar to the estimate for the strong maximal function [Reference Jessen, Marcinkiewicz and Zygmund19] of Jessen, Marcinkiewicz, and Zygmund.
A natural question arises: does a global version of (1.1) hold for area functions, square functions, maximal operators and singular integral operators, and in more general product settings? The global version of (1.1) cannot be of the same form. Indeed, take m and n to be
$1$
and f to be the characteristic function of the unit square in
$\mathbb {R} \times \mathbb {R}$
. Let
$\mathcal {T}$
be either the double Hilbert transformation, with singular integral kernel
$(xy)^{-1}$
, or the strong (rectangular) maximal operator. It is easy to check that

and if
$|x_1|> 1$
and
$|x_2|> 1$
, then we may replace
$\lesssim $
by
$\eqsim $
. It follows that

for all
$\lambda \in (0,1)$
. Hence in the global case, a right hand side of the form
$F(f)/\lambda $
cannot be correct.
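The growth rate behind this obstruction can be checked by hand for the strong maximal operator. The following is a minimal sketch, assuming the elementary fact that, for $x_1 > 1$ and $x_2 > 1$, the strong maximal function of the indicator of $[0,1]^2$ equals $1/(x_1 x_2)$; the part of the level set in that quadrant is then $\{x_1 x_2 < 1/\lambda \}$, whose area is $\lambda ^{-1}\log (1/\lambda ) - (\lambda ^{-1} - 1)$. The function names are illustrative only.

```python
import math

def quadrant_level_set_area(lam):
    """Area of {x1 > 1, x2 > 1, x1 * x2 < 1/lam} for 0 < lam < 1 (closed form)."""
    return math.log(1.0 / lam) / lam - (1.0 / lam - 1.0)

def quadrant_level_set_area_numeric(lam, steps=200_000):
    """Midpoint-rule check: integrate (1/(lam*x1) - 1) over 1 < x1 < 1/lam."""
    a, b = 1.0, 1.0 / lam
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x1 = a + (i + 0.5) * h
        total += 1.0 / (lam * x1) - 1.0
    return total * h

for lam in (1e-1, 1e-2, 1e-3, 1e-4):
    area = quadrant_level_set_area(lam)
    # lam * area keeps growing as lam decreases, so no bound C / lam can hold.
    print(lam, area, lam * area)
```

The product `lam * area` behaves like $\log (1/\lambda )$ for small $\lambda $, which is the logarithmic loss that a right hand side of the form $F(f)/\lambda $ cannot absorb.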
We are going to work with (generalizations of) the following global version of (1.1):

where the integral defining the nonlinear functional F now extends over
$\mathbb {R}^m \times \mathbb {R}^n$
. Later we review Orlicz spaces, and prove an inequality (2.11), which, when coupled with (1.3), implies that

which is of comparable size to the right hand side of (1.1) when
$\lambda> 1$
and is of the right order to capture the global behavior of the examples of the previous paragraph.
As far as we know, there are no previous global results of this kind in the literature. We therefore argue that the global estimate (1.3) is a new contribution to the theory of product singular integrals even in the Euclidean setting. Our approach here admits a broad generalization from Euclidean spaces to stratified Lie groups.
Indeed, in this article, we establish a global analog of (1.3) on product spaces of the form
$\mathbf {G} := G_1 \times G_2$
, where
$G_1$
and
$G_2$
are stratified Lie groups; these spaces include, but are more general than, the product of Euclidean spaces
$\mathbb {R}^m\times \mathbb {R}^n$
. This setting is in some respects like the Euclidean case, but in other respects is very different. For example, in general stratified Lie groups, there are no Cauchy–Riemann equations; these are used in the Euclidean case to relate square functions and maximal functions (see Merryfield [Reference Merryfield25]), and new ideas are needed here. Our new approach works for stratified Lie groups, but the group structure plays an essential role in our argument, and we cannot extend it to cover more general spaces of homogeneous type. Another difference is that the sub-Laplacian on a stratified Lie group is sometimes analytic hypoelliptic and sometimes not (see [Reference Helffer16] for more details). When the sub-Laplacian is analytic hypoelliptic, as in the Euclidean case, some arguments are much easier (our Lemma 2.4 shows that the claim proved in (4.9) is not needed). A third difference is that in our more general setting, there is no analog of the dyadic rectangles that are used by Fefferman and others in the classical setting: as a replacement, we use a lemma of Hytönen and Kairema [Reference Hytönen and Kairema17]; this part of our contribution is valid in the more general context of spaces of homogeneous type.
To state our results, we need a little notation. Details may be found later.
In this introduction, the auxiliary functions
$\varphi ^{[i]}$
on
$G_i$
satisfy standard decay and smoothness conditions and have integral
$1$
. Likewise, the functions
$\psi ^{[i]}$
on
$G_i$
satisfy standard decay and smoothness and have integral
$0$
. We write
$\zeta ^{[1]}_{t_1}$
and
$\zeta ^{[2]}_{t_2}$
for normalized dilates of functions
$\zeta ^{[1]}$
on
$G_1$
and
$\zeta ^{[2]}$
on
$G_2$
, and
$\varphi _{\mathbf {t}}$
and
$\psi _{\mathbf {t}}$
for the product functions
$\varphi ^{[1]}_{t_1} \otimes \varphi ^{[2]}_{t_2}$
and
$\psi ^{[1]}_{t_1} \otimes \psi ^{[2]}_{t_2}$
.
We define
$\mathbf {T} := \mathbb {R}_{+} \times \mathbb {R}_{+}$
. For
$\mathbf {g} :=(g_1,g_2)\in \mathbf {G}$
,
$\mathbf {t} \in \mathbf {T}$
and
$\eta \in [0,\infty )$
, we denote by
$\mathbf {P}(\mathbf {g}, \mathbf {t})$
the product of open balls
$B_1(g_1, t_1)\times B_2(g_2, t_2)$
and by
$\Gamma ^{\eta }(\mathbf {g})$
the product cone
$\Gamma _1^{\eta }(g_1)\times \Gamma _2^{\eta }(g_2)$
, where

For
$\eta \in [0,\infty )$
and
$f \in \mathsf {L}^{1}(\mathbf {G})$
, we define the maximal function:

This maximal function is called radial when
$\eta = 0$
and nontangential when
$\eta> 0$
.
For
$\eta \in \mathbb {R}_{+}$
and
$f \in \mathsf {L}^{1}(\mathbf {G})$
, we define the Lusin area function
$\mathcal {S}_{\psi ,\eta }(f)$
by

Here and throughout this article,
$\,\mathrm {d} \mathbf {h}= \,\mathrm {d} h_1 \,\mathrm {d} h_2$
and
$\,\frac {\mathrm {d}\mathbf {t}}{\mathbf {t}} = \frac {\,\mathrm {d} t_1 }{ t_1}\frac {\,\mathrm {d} t_2 }{ t_2}$
.
For
$\eta = 0$
and
$f \in \mathsf {L}^{1}(\mathbf {G})$
, we define the Littlewood–Paley square function
$\mathcal {S}_{\psi ,0}(f)$
by

These are often written differently, but it is convenient to treat them together.
On a stratified Lie group G, we may take a basis
$\{\mathcal {X}_1, \dots , \mathcal {X}_d\}$
for the space of left-invariant horizontal vector fields, and define the sub-Laplacian
$\mathcal {L}$
to be
$-\sum _{j=1}^{d} \mathcal {X}_j^2$
. The Riesz transformations are then the operators
$\mathcal {X}_j \mathcal {L}^{-1/2}$
. The double Riesz transformations
$\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$
on
$\mathbf {G}$
are defined in the obvious way when
$1 \leq j_i \leq d_i$
.
Theorem 1.1 Let
$\mathcal {T}$
be a maximal operator
$\mathcal {M}_{\varphi ,\eta }$
or a Littlewood–Paley operator
$\mathcal {S}_{\psi ,\eta }$
, where
$\eta \geq 0$
, or a double Riesz transformation
$\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$
. Then

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
, where
$\mathsf {L\,log\,L}(\mathbf {G})$
is the Orlicz space associated with the functional
$F_\Phi $
, given by

We explain Orlicz spaces and Luxemburg norms later. We say that the operator
$\mathcal {T}$
is hyperweakly bounded when an estimate of the form (1.7) holds.
It is easy to iterate estimates for one-parameter maximal, area or singular integral operators to prove a local version of this result, where the support of the function f is restricted to lie in a compact set. However, iteration does not seem to be able to deal with the global case, and this is the main difficulty that we need to confront in this article.
Our presentation has the following structure. In Section 2, we review some background on stratified Lie groups, product spaces and Orlicz spaces.
In Section 3, we show that the strong maximal operator
$\mathcal {M}_{s}$
is hyperweakly bounded, using a covering lemma, Theorem 3.2, that goes back to [Reference Córdoba and Fefferman5]. We then prove Theorem 1.1 for the maximal operator
$\mathcal {M}_{\varphi ,\eta }$
, by using group properties to dominate the maximal operator
$\mathcal {M}_{\varphi ,\eta }$
by the strong maximal operator.
In Section 4, we construct the atomic decomposition. We use the gradient of the Poisson kernel p as our auxiliary function, and then have a key result of [Reference Cowling, Fan, Li and Yan6, 12 and following], namely, the good-
$\lambda $
inequality: for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
,

when
$\eta $
is sufficiently large; here
$L_{\eta }(\lambda ) :=\left \{ \mathbf {g}\in {\mathbf {G}}\colon \mathcal {M}_{p,\eta }(f)(\mathbf {g}) \leq \lambda \right \}$
. This implies that the area operator
is hyperweakly bounded. By using this boundedness and the Calderón reproducing formula, we can decompose global
$\mathsf {L\,log\,L}$
functions into atoms.
In Section 5, we apply our atomic decomposition and a version of Journé’s covering lemma for spaces of homogeneous type established in [Reference Han, Li and Lin14]. We prove Theorem 1.1 for general operators
$\mathcal {S}_{\psi ,\eta }$
and the double Riesz transformations
$\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$
. The same argument holds for general product Calderón–Zygmund operators, as in Journé [Reference Journé20].
Most of the arguments rely only on the theory of spaces of homogeneous type. However, we need the setting of a stratified Lie group in two places. First, it gives us the good
$\lambda $
inequality (1.8). Second, it gives us the Calderón reproducing formula for
$\mathsf {L}^{2}(G)$
functions, which is needed for the atomic decomposition.
“Constants” are positive real numbers, depending only on the geometry of
$\mathbf {G}$
unless otherwise indicated; we write
$A \lesssim B$
when there exists a constant C such that
$A \leq C B$
. We write
$\chi _{E} $
for the indicator function of a set E, and o denotes the identity of a group.
2 Preliminaries
2.1 Stratified nilpotent Lie groups
Let G be a (real and finite dimensional) stratified nilpotent Lie group of step s with Lie algebra
$\mathfrak {g}$
. This means that we may write
$\mathfrak {g}$
as a vector space direct sum
$\bigoplus _{j = 1}^{s} \mathfrak {v}_{j} $
, where
$[\mathfrak {v}_1, \mathfrak {v}_{j}] = \mathfrak {v}_{j + 1}$
when
$1\leq j \leq s$
; here
$\mathfrak {v}_{s+1} := \{0\}$
. Let
$\nu $
denote the homogeneous dimension of G; that is,
$\sum _{j=1}^{s} j \dim \mathfrak {v}_{j}$
. There is a one-parameter family of automorphic dilations
$\delta _t$
on
$\mathfrak {g}$
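, given by
$$ \delta _t \Bigl ( \sum _{j=1}^{s} \mathcal {X}_j \Bigr ) := \sum _{j=1}^{s} t^{j} \mathcal {X}_j ; $$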
here each
$\mathcal {X}_j \in \mathfrak {v}_{j}$
and
$t \in \mathbb {R}_{+}$
. The exponential mapping
$\exp \colon \mathfrak {g} \to G$
is a diffeomorphism, and we identify
$\mathfrak {g}$
and G. The dilations extend to automorphic dilations of G, also denoted by
$\delta _t$
, by conjugation with
$\exp $
. The Haar measure on G, which is bi-invariant, is the Lebesgue measure on
$\mathfrak {g}$
lifted to G using
$\exp $
. Many more general facts about stratified Lie groups may be found in [Reference Folland and Stein12].
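As an illustration of the dilation structure, here is a minimal sketch on the Heisenberg group $\mathbb {H}^1$, which is a stratified group of step $2$ with $\dim \mathfrak {v}_1 = 2$ and $\dim \mathfrak {v}_2 = 1$. This example is our own choice and is not taken from the text above; the exponential coordinates $(x, y, u)$, the form of the group law and the function names are assumptions made for the sketch.

```python
def heisenberg_mul(g, h):
    """Group law on the Heisenberg group H^1 in exponential coordinates."""
    x, y, u = g
    xp, yp, up = h
    return (x + xp, y + yp, u + up + 0.5 * (x * yp - y * xp))

def dilate(t, g):
    """Dilation: weight 1 on the first layer, weight 2 on the second layer."""
    x, y, u = g
    return (t * x, t * y, t * t * u)

def homogeneous_dimension(layer_dims):
    """nu = sum_j j * dim(v_j), where layer_dims[j - 1] = dim(v_j)."""
    return sum(j * d for j, d in enumerate(layer_dims, start=1))

g, h, t = (1.0, 2.0, 3.0), (-0.5, 4.0, 1.5), 3.0
lhs = dilate(t, heisenberg_mul(g, h))
rhs = heisenberg_mul(dilate(t, g), dilate(t, h))
print(lhs, rhs)                       # the two triples agree: dilations are automorphisms
print(homogeneous_dimension([2, 1]))  # 4 for H^1
```

In these coordinates the Haar measure is Lebesgue measure on $\mathbb {R}^3$, and the Jacobian of the dilation is $t \cdot t \cdot t^2 = t^{4}$, in agreement with $\nu = 4$.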
By [Reference Hebisch and Sikora15], the group G may be equipped with a smooth subadditive homogeneous norm
$\rho $
, a continuous function from G to
$[0,\infty )$
that is smooth on
$G\setminus \{o\}$
and satisfies
(1) $\rho (g^{-1}) =\rho (g)$;
(2) $\rho (g^{-1}h) \leq \rho (g) + \rho (h)$;
(3) $\rho (\delta _t(g)) =t\rho (g)$ for all $g\in G$ and all $t \in \mathbb {R}_{+}$;
(4) $\rho (g) =0$ if and only if $g=o$.
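The four properties can be tested numerically in a concrete case. The sketch below uses the Heisenberg group $\mathbb {H}^1$ with the Korányi-type gauge $((x^2+y^2)^2 + 16u^2)^{1/4}$; both the group and the gauge are our illustrative assumptions, and the norm supplied by [Reference Hebisch and Sikora15] need not coincide with this one. The random sampling only looks for counterexamples and proves nothing.

```python
import random

def mul(g, h):
    """Heisenberg group law in exponential coordinates (x, y, u)."""
    return (g[0] + h[0], g[1] + h[1],
            g[2] + h[2] + 0.5 * (g[0] * h[1] - g[1] * h[0]))

def inv(g):
    return (-g[0], -g[1], -g[2])

def dilate(t, g):
    return (t * g[0], t * g[1], t * t * g[2])

def koranyi(g):
    """Koranyi-type gauge ((x^2 + y^2)^2 + 16 u^2)^(1/4)."""
    return ((g[0] ** 2 + g[1] ** 2) ** 2 + 16.0 * g[2] ** 2) ** 0.25

def worst_violations(samples=5000, seed=0):
    """Largest observed failures of symmetry, subadditivity and homogeneity."""
    rng = random.Random(seed)
    sym = tri = hom = 0.0
    for _ in range(samples):
        g = tuple(rng.uniform(-5, 5) for _ in range(3))
        h = tuple(rng.uniform(-5, 5) for _ in range(3))
        t = rng.uniform(0.1, 10.0)
        sym = max(sym, abs(koranyi(inv(g)) - koranyi(g)))
        tri = max(tri, koranyi(mul(inv(g), h)) - koranyi(g) - koranyi(h))
        hom = max(hom, abs(koranyi(dilate(t, g)) - t * koranyi(g)))
    return sym, tri, hom

print(worst_violations())  # all three entries should be zero up to rounding
```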
Abusing notation, we set
$\rho (g, g') := \rho (g^{-1} g')$
for all
$g, g' \in G$
; this defines a metric on G. We write
$B(g, r)$
for the open ball with centre g and radius r with respect to
$\rho $
:
$$ B(g, r) := \{ g' \in G \colon \rho (g, g') < r \}. $$
Then
$$ |B(g, r)| = r^{\nu } \, |B(o, 1)|. $$
The metric space
$(G,\rho )$
is geometrically doubling; that is, there exists
$A \in \mathbb {N}$
such that every metric ball
$B(x,2r)$
may be covered by at most A balls of radius r.
We remind the reader that a stratified Lie group is a space of homogeneous type in the sense of Coifman and Weiss [Reference Coifman and Weiss3, Reference Coifman and Weiss4], and analysis on stratified Lie groups uses much from the theory of such spaces. In particular, we frequently deal with molecules, that is, functions
$\zeta $
that satisfy standard decay and smoothness conditions, by which we mean that there is a parameter
$\epsilon \in (0,1]$
, which we fix once and for all, such that

for all
$g, g' \in G$
. We often impose an additional cancellation condition, namely
$$ \int _{G} \zeta (g) \,\mathrm {d} g = 0. \tag{2.2} $$
The normalized dilate
$f_t$
of a function f on G by
$t \in \mathbb {R}_{+}$
is given by
$f_{t} := t^{-\nu }f\circ \delta _{1/t}$
, and the convolution
$f\ast f'$
of suitable functions f and
$f'$
on G is defined by

Take left-invariant vector fields
$\mathcal {X}_1$
, …,
$\mathcal {X}_{n}$
on G that form a basis of
$\mathfrak {v}_1$
, and define the sub-Laplacian
$\mathcal {L} := -\sum _{j=1 }^{n} (\mathcal {X}_{j})^2 $
. Observe that each
$\mathcal {X}_{j}$
is homogeneous of degree
$1$
and
$\mathcal {L}$
is homogeneous of degree
$2$
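, in the sense that
$$ \mathcal {X}_{j} (f \circ \delta _t) = t \, (\mathcal {X}_{j} f) \circ \delta _t \qquad \text{and}\qquad \mathcal {L} (f \circ \delta _t) = t^{2} \, (\mathcal {L} f) \circ \delta _t $$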
for all
$t \in \mathbb {R}_{+}$
and all
$f \in \mathsf {C}^{2}(G)$
.
Associated to the sub-Laplacian, Folland and Stein [Reference Folland and Stein12] defined the Riesz potential operators
$\mathcal {L}^{-\alpha }$
, where
$\alpha \in \mathbb {R}_{+}$
; these are convolution operators with homogeneous kernels. The Riesz transformation
$\mathcal {R}_{j} := \mathcal {X}_{j} \mathcal {L}^{-1/2}$
is a singular integral operator, and is bounded on
$\mathsf {L}^{p}(G)$
when
$1 < p < \infty $
as well as from the Folland–Stein Hardy space
$\mathsf {H}^1(G)$
[Reference Folland and Stein12] to
$\mathsf {L}^{1}(G)$
.
The Hardy–Littlewood maximal operator
$\mathcal {M}$
on G is defined using the metric balls:

where the “average integral” is defined by

For future use, we note that the layer cake formula implies that, if
$\mu $
is a radial decreasing function on G (that is,
$\mu (g)$
depends only on
$\rho (g)$
and decreases as
$\rho (g)$
increases), then

2.2 Functional calculus for the sub-Laplacian
The sub-Laplacian
$\mathcal {L}$
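has a spectral resolution:
$$ \mathcal {L} = \int _{0}^{\infty } \lambda \,\mathrm {d} \mathcal {E}_{\mathcal {L}}(\lambda ), $$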
where
$\mathcal {E}_{\mathcal {L}}(\lambda )$
is a projection-valued measure on
$[0,\infty )$
, the spectrum of
$\mathcal {L}$
. For a bounded Borel function
$m\colon [0,\infty )\to \mathbb {C}$
, we define the operator
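$m(\mathcal {L})$
spectrally:
$$ m(\mathcal {L}) := \int _{0}^{\infty } m(\lambda ) \,\mathrm {d} \mathcal {E}_{\mathcal {L}}(\lambda ). $$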
This operator is a convolution with a Schwartz distribution on G.
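The mechanism can be seen in a finite toy model. The sketch below replaces $G$ by the cyclic group $\mathbb {Z}_n$ and $\mathcal {L}$ by the discrete Laplacian $(\mathcal {L}f)(j) = 2f(j) - f(j-1) - f(j+1)$, which is diagonalized by the discrete Fourier transform. This model and all names in the code are our assumptions; it is not a stratified group and only illustrates why an operator defined spectrally from a translation-invariant operator is a convolution.

```python
import cmath
import math

def dft(f):
    n = len(f)
    return [sum(f[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

def idft(F):
    n = len(F)
    return [sum(F[k] * cmath.exp(2j * math.pi * j * k / n) for k in range(n)) / n
            for j in range(n)]

def eigenvalues(n):
    """Spectrum of (L f)(j) = 2 f(j) - f(j-1) - f(j+1) on Z_n."""
    return [2.0 - 2.0 * math.cos(2.0 * math.pi * k / n) for k in range(n)]

def apply_multiplier(m, f):
    """Compute m(L) f by applying m to the eigenvalues in the Fourier basis."""
    n = len(f)
    lam = eigenvalues(n)
    F = dft(f)
    return [z.real for z in idft([m(lam[k]) * F[k] for k in range(n)])]

def convolve(f, k):
    """Cyclic convolution (f * k)(j) = sum_i f(i) k(j - i)."""
    n = len(f)
    return [sum(f[i] * k[(j - i) % n] for i in range(n)) for j in range(n)]

n = 8
f = [float((3 * j * j + 1) % 7) for j in range(n)]
delta = [1.0] + [0.0] * (n - 1)
m = lambda lam: math.exp(-0.7 * lam)
kernel = apply_multiplier(m, delta)   # convolution kernel of m(L)
print(apply_multiplier(m, f))
print(convolve(f, kernel))            # the same vector up to rounding
```

Here the kernel is obtained by applying $m(\mathcal {L})$ to the point mass at the identity, which is the finite counterpart of the distribution kernel mentioned above.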
2.3 The heat and Poisson kernels
Let
$h_t$
and
$p_{t}$
, where
$t \in \mathbb {R}_{+}$
, be the heat and Poisson kernels associated with the sub-Laplacian operator
$\mathcal {L}$
, that is, the convolution kernels of the operators
$e^{-t\mathcal {L}}$
and
$e^{-t \sqrt {\mathcal {L}}}$
on G. We write
${{q}}_{t}$
for
$t\partial _t p_{t}$
. We warn the reader that
$p_{t}$
and
${{q}}_{t}$
are the normalized dilates of
$p_1$
and
${{q}}_{1}$
by the factor t, but
$h_t$
is the normalized dilate of
$h_1$
by a factor of
$t^{1/2}$
. Let
$\nabla $
denote the horizontal subgradient on G and
denote the gradient
$(\nabla , \partial _{t})$
on
$G\times \mathbb {R}_{+}$
.
Lemma 2.1 The kernels
$h_t$
and
$p_{t}$
are
$\mathbb {R}_{+}$
-valued. Further,
$h_t$
and
$p_{t}$
have integral
$1$
, while
${{q}}_{t}$
has integral
$0$
for all
$t \in \mathbb {R}_{+}$
. Finally, there exists a constant c such that

for all
$g\in G$
and
$t \in \mathbb {R}_{+}$
.
Proof For the heat kernel estimates, see [Reference Varopoulos, Saloff-Coste and Coulhon30, Theorem IV.4.2]. Note that the first estimate has a version with the opposite inequality, with a different constant c.
The estimates for
$p_{t}$
and
${{q}}_{t}$
follow from the subordination formula

We leave the details to the reader.
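For orientation, one standard form of the subordination formula at the level of the spectral parameter is $\mathrm {e}^{-t\sqrt {\lambda }} = \pi ^{-1/2} \int _0^\infty \mathrm {e}^{-s} s^{-1/2} \mathrm {e}^{-t^2\lambda /(4s)} \,\mathrm {d} s$; the normalization used in the display above may differ, so this form is an assumption. The sketch below checks the scalar identity numerically after the substitution $s = u^2$, which removes the singularity at $s = 0$. The function name, step count and truncation point are illustrative.

```python
import math

def poisson_symbol_via_subordination(t, lam, steps=40_000, upper=8.0):
    """Midpoint rule for (2/sqrt(pi)) * int_0^upper exp(-u^2 - t^2*lam/(4u^2)) du.

    This is the subordination integral after the substitution s = u^2;
    the tail beyond `upper` is negligible because of the factor exp(-u^2).
    """
    h = upper / steps
    a2 = t * t * lam / 4.0
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.exp(-u * u - a2 / (u * u))
    return 2.0 / math.sqrt(math.pi) * total * h

for t, lam in ((1.0, 0.0), (1.0, 1.0), (0.5, 9.0), (2.0, 4.0)):
    approx = poisson_symbol_via_subordination(t, lam)
    print(t, lam, approx, math.exp(-t * math.sqrt(lam)))
```

Replacing $\lambda $ by $\mathcal {L}$ expresses the Poisson semigroup as an average of heat operators $\mathrm {e}^{-(t^2/4s)\mathcal {L}}$, which is how positivity and decay pass from $h_t$ to $p_t$.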
This lemma shows that the heat kernel
$h_1$
and the Poisson kernel
$p_1$
and their derivatives satisfy the standard decay and smoothness conditions (2.1); their derivatives also satisfy the cancellation condition (2.2).
2.4 Systems of pseudodyadic cubes
We use the Hytönen–Kairema [Reference Hytönen and Kairema17] families of “dyadic cubes” in geometrically doubling metric spaces. We state a version of [Reference Hytönen and Kairema17, Theorem 2.2] that is simpler, in that we work on metric spaces rather than pseudometric spaces. The Hytönen–Kairema construction builds on seminal work of Christ [Reference Christ7] and of Sawyer and Wheeden [Reference Sawyer and Wheeden29].
Theorem 2.2 ([Reference Hytönen and Kairema17])
Let c, C and
$\kappa $
be constants such that
$0 < c \leq C < \infty $
and
$12 C\kappa \leq c$
, and let
$(G,\rho )$
be a metric stratified group. Then, for all
$k \in \mathbb {Z}$
, there exist families
$\mathscr {Q}_k(G)$
of pseudodyadic cubes Q with centres
$z(Q)$
, such that:
(1) G is the disjoint union of all $Q \in \mathscr {Q}_k(G)$, for each $k\in \mathbb {Z}$;
(2) $B(z(Q),c\kappa ^k/3)\subseteq Q \subseteq B(z(Q),2C\kappa ^k)$ for all $Q \in \mathscr {Q}_k(G)$;
(3) if $Q \in \mathscr {Q}_k(G)$ and $Q' \in \mathscr {Q}_{k'}(G)$ where $k\geq k'$, then either $Q \cap Q'=\emptyset $ or $Q \subseteq Q'$; in the second case, $B(z(Q), 2C\kappa ^k) \subseteq B(z(Q'),2C\kappa ^{k'})$.
We write
$\mathscr {Q}(G)$
for the union of all
$\mathscr {Q}_k(G)$
, and call this a system of pseudodyadic cubes. Given a cube
$Q \in \mathscr {Q}_k(G)$
, we denote the quantity
$\kappa ^k$
by
$\operatorname {\ell }(Q)$
, by analogy with the side-length of a Euclidean cube.
A finite collection
$\{\mathscr {Q}^{\tau }\colon {\tau }=1,2,\dots ,\mathrm {T}\}$
of systems of pseudodyadic cubes is called a collection of adjacent systems of pseudodyadic cubes with parameters
$C'$
, c, C and
$\kappa $
, if it has the following properties: individually, each
$\mathscr {Q}^{\tau }$
is a system of pseudodyadic cubes with parameters c, C and
$\kappa $
as in Theorem 2.2; collectively, for each ball
$B(x,r)\subseteq G$
such that
$\kappa ^{k+3}<r\leq \kappa ^{k+2}$
, where
$k\in \mathbb {Z}$
, there exist
${\tau } \in \{1, 2, \dots , \mathrm {T}\}$
and
$Q\in \mathscr {Q}^{\tau }_k$
with centre
$z(Q)$
such that
$\rho (x, z(Q)) < 2\kappa ^{k}$

The following construction is due to [Reference Hytönen and Kairema17].
Theorem 2.3 Suppose that
$(G,\rho )$
is a metric stratified group. Then there exists a finite collection
$\{\mathscr {Q}^{\tau }\colon {\tau } = 1,2,\dots ,\mathrm {T}\}$
of adjacent systems of pseudodyadic cubes with parameters
$C'$
, c, C and
$\kappa $
, where
$\kappa := 1/100$
,
$c := 12^{-1}$
,
$C := 4$
and
$C' := 8 \times 10^{6}$
. For each
${\tau }\in \{1,2,\dots ,\mathrm {T}\}$
, the centres
$z(Q)$
of the cubes
$Q \in \mathscr {Q}^{\tau }_k$
have the properties that

and

From [Reference Kairema, Li, Pereyra and Ward21, Remark 2.8], the number
$\mathrm {T}$
of adjacent systems of pseudodyadic cubes in Theorem 2.3 may be taken to be at most
$A^6 \kappa ^{-\log _2(A)}$
, where A is the geometric doubling constant of G. The constants c, C,
$C'$
and
$\kappa $
do not depend on the choice of the metric stratified group
$(G,\rho )$
.
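The role of adjacent systems is easiest to see on the real line, where the classical "one-third trick" uses two dyadic grids. The sketch below is this classical model with $\kappa = 1/2$, not the Hytönen–Kairema construction itself; the function names, the shift $(-1)^k/3$ and the constant $6$ are assumptions of the sketch. Every interval $[a, a+r)$ turns out to lie in an interval of length less than $6r$ from the standard grid or from the shifted grid, because endpoints of the two grids at level $k$ are at least $2^{-k}/3$ apart.

```python
import math
import random

def grid_interval(x, k, shifted):
    """Interval of length 2**-k containing x, from the standard or shifted grid.

    The shifted grid at level k is translated by (-1)**k / 3 side lengths,
    which keeps the shifted intervals nested across levels.
    """
    side = 2.0 ** (-k)
    offset = ((-1) ** k) * side / 3.0 if shifted else 0.0
    left = math.floor((x - offset) / side) * side + offset
    return left, left + side

def covering_interval(a, r):
    """Find a grid interval J containing [a, a + r) with |J| < 6 r."""
    k = math.floor(-math.log2(3.0 * r))   # chosen so that 3r <= 2**-k < 6r
    for shifted in (False, True):
        left, right = grid_interval(a, k, shifted)
        if left <= a and a + r <= right:
            return shifted, k, (left, right)
    return None  # not expected, apart from rounding effects at grid endpoints

rng = random.Random(1)
for _ in range(5):
    a, r = rng.uniform(-100, 100), rng.uniform(1e-3, 10)
    print((a, r), covering_interval(a, r))
```

A single dyadic grid cannot do this: a short interval straddling the origin is contained in no interval of the standard grid, however large.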
2.5 Products of stratified groups
We equip the product of two stratified groups
$G_1$
and
$G_2$
with a product structure. We carry forward the notation from Section 2.1, adding a subscript i or superscript
$[i]$
to clarify that we are dealing with
$G_i$
; the parameter i is always
$1$
or
$2$
. To shorten the formulae, we often use bold face type to indicate a product object: thus we write
$\mathbf {G}$
,
$\mathbf {T}$
,
$\mathbf {g}$
,
$\mathbf {r}$
and
$\mathbf {t}$
in place of
$G_1 \times G_2$
,
$\mathbb {R}_{+} \times \mathbb {R}_{+}$
,
$(g_1,g_2)$
,
$(r_1,r_2)$
and
$(t_1,t_2)$
. For example,
$B_i (g_i, r_i)$
denotes the open ball in
$G_i$
with centre
$g_i$
and radius
$r_i$
, with respect to the homogeneous norm
$\rho _i $
, and we write
$\mathbf {P}(\mathbf {g},\mathbf {r})$
for the product
$B_1(g_1, r_1) \times B_2(g_2, r_2)$
. We write
$\pi _i$
for the projection of
$\mathbf {G}$
onto
$G_i$
.
Products of balls are basic geometric objects and we write
$\mathscr {P}(\mathbf {G})$
for the family of all such products. In addition we deal with rectangles, by which we mean products of pseudodyadic cubes. We write
$\mathscr {R}(\mathbf {G})$
for the family of all rectangles, and for adjacent pseudodyadic systems as constructed in Theorem 2.3, we let
$\mathscr {R}^{{\tau }_1,{\tau }_2}(\mathbf {G})$
be the family of rectangles
$\{Q_1\times Q_2\colon Q_i\in \mathscr {Q}^{{\tau }_i},\ {\tau }_i=1,2,\dots ,\mathrm {T}_i\}$
. We let
$\operatorname {\boldsymbol {\ell }}\colon \mathscr {R}(\mathbf {G}) \to \mathbf {T}$
be the function such that
$\operatorname {\ell }_i(Q_1 \times Q_2) = \operatorname {\ell }(Q_i)$
, the “side-length” of
$Q_i$
. We also let
$\mathbf {z}(R)$
be the centre of R, that is,
$\mathbf {z}(R)=(z_1(R), z_2(R))$
, where, when
$R=Q_1\times Q_2$
, we set
$z_j(R)=z(Q_j)$
, with
$z(Q_j)$
as defined in Theorem 2.2.
The element of Haar measure on
$\mathbf {G}$
is denoted
$\,\mathrm {d}\mathbf {g}$
, but is often written
$\,\mathrm {d} g_1\,\mathrm {d} g_2$
for calculations. The convolution
$f\ast f'$
of suitable functions f and
$f'$
on
$\mathbf {G}$
is defined by

The strong maximal operator
$\mathcal {M}_{s}$
is defined by

for all
$f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G})$
. It is evident that
$\mathcal {M}_{s}$
is dominated by the composition of the Hardy–Littlewood maximal operators in the factors:

When
$1 < p \leq \infty $
, the operators
$\mathcal {M}_1$
and
$ \mathcal {M}_2$
in the factors are
$\mathsf {L}^{p}$
-bounded, so the iterated maximal operators and hence the strong maximal operator are also
$\mathsf {L}^{p}$
-bounded.
Given an open subset U of
$\mathbf {G}$
with finite measure
$\left | U \right | $
, we define the enlargement
$U^{*}$
of U using the strong maximal operator
$\mathcal {M}_{s}$
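:
$$ U^{*} := \{ \mathbf {g} \in \mathbf {G} \colon \mathcal {M}_{s}(\chi _{U})(\mathbf {g}) > \alpha \} $$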
for some
$\alpha \in (0,1)$
that varies from instance to instance. It is easy to see that
$U \subset U^*$
, while the
$\mathsf {L}^{2}(\mathbf {G})$
boundedness of the strong maximal operator together with Chebyshev’s inequality shows that
$| U^*| \lesssim |U|$
. We write
$\mathscr {M}(U)$
for the family of maximal rectangles contained in U.
If
$\varphi ^{[1]}$
on
$G_1$
and
$\varphi ^{[2]}$
on
$G_2$
both satisfy the decay and smoothness conditions (2.1) and have integral
$1$
, then

much as argued to prove (2.3), but with “biradial” in place of “radial”.
Recall from (1.6) that
${{q}}_{\mathbf {t}}$
denotes
$t_1\partial _{t_1} p_{t_1}^{[1]} \otimes t_2\partial _{t_2} p^{[2]}_{t_2}$
and

The properties of the Poisson kernel are important for the next lemma.
Lemma 2.4 Suppose that
$-\partial _{t_1}^2 - \partial _{t_2}^2 + \mathcal {L}^{[1]} + \mathcal {L}^{[2]}$
is analytic hypoelliptic on
$\mathbf {T} \times \mathbf {G}$
. If
$f \in \mathsf {L}^{2}(\mathbf {G})$
and
$\mathcal {S}_{q,\eta }(f)(\mathbf {g}) = 0$
for some
$\mathbf {g} \in \mathbf {G}$
and some
$\eta \in \mathbb {R}_{+}$
, then
$f = 0$
.
Proof The function
$(\mathbf {t}, \mathbf {h}) \mapsto (f \ast {{q}}_{\mathbf {t}} )(\mathbf {h})$
is smooth in
$ \mathbf {T} \times \mathbf {G}$
, and so the hypothesis
$\mathcal {S}_{q,\eta }(f)(\mathbf {g}) = 0$
implies that
$(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) = 0$
for all
$(\mathbf {t}, \mathbf {h}) \in \Gamma ^\eta (\mathbf {g})$
. Hence for all
$\mathbf {h} \in \mathbf {G}$
, the function
$\mathbf {t} \mapsto (f \ast p_{\mathbf {t}})(\mathbf {h})$
is constant once
$t_1$
and
$t_2$
are both large enough. However,
$$ (f \ast p_{\mathbf {t}})(\mathbf {h}) \to 0 $$
as
$t_1$
or
$t_2$
tends to infinity, so the only possible value for
$(f \ast p_{\mathbf {t}})(\mathbf {h})$
when
$t_1$
and
$t_2$
are both large is
$0$
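. Since
$$ \bigl ( -\partial _{t_1}^2 - \partial _{t_2}^2 + \mathcal {L}^{[1]} + \mathcal {L}^{[2]} \bigr ) (f \ast p_{\mathbf {t}}) = 0, $$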
analytic hypoellipticity implies that
$f \ast p_{\mathbf {t}} = 0$
in
$\mathbf {T} \times \mathbf {G}$
and hence
$f = 0$
.
The hypothesis of analytic hypoellipticity certainly holds when
$\mathbf {G} = \mathbb {R}^{m} \times \mathbb {R}^{n}$
, but in general it is not satisfied; see [Reference Helffer16] for more information.
2.6 Journé’s covering lemma
We recall a covering lemma for product spaces. Journé [Reference Journé20] first established this result on
$\mathbb {R}\times \mathbb {R}$
. The fourth author [Reference Pipher28] extended it to higher dimensional Euclidean spaces and an arbitrary number of factors
$\mathbb {R}^{n_1}\times \mathbb {R}^{n_2}\times \ldots \times \mathbb {R}^{n_m}$
. This was extended to products of spaces of homogeneous type in [Reference Han, Li and Lin14, Lemma 2.1].
Take
$\alpha \in (0,1/2)$
. Let U be an open subset of
$\mathbf {G}$
of finite measure and let
$\mathscr {M}(U)$
be the set of all maximal subrectangles of U. Given
$R=Q_1\times Q_2\in \mathscr {M}(U)$
, let
$\tilde {Q}_2$
be the biggest pseudodyadic cube containing
$Q_2$
such that

Similarly, given
$R=Q_1\times Q_2\in \mathscr {M}(U)$
, let
$\tilde {Q}_1$
be the biggest pseudodyadic cube containing
$Q_1$
such that

We then set

Here is our Journé-type covering lemma on
$\mathbf {G}$
.
Lemma 2.5 Suppose that U is an open subset in
$\mathbf {G}$
of finite measure and
$\delta \in \mathbb {R}_{+}$
. Then

Proof See [Reference Han, Li and Lin14, Lemma 2.1].
2.7 The Orlicz space L log L
We recall the definition of an Orlicz space on
$\mathbf {G}$
. A Young function is a continuous, convex, increasing bijective function
$\Phi \colon [0,\infty )\to [0,\infty )$
. To a Young function
$\Phi $
, we associate the nonlinear functional
$F_\Phi $
, given by
$$ F_\Phi (f) := \int _{\mathbf {G}} \Phi (|f(\mathbf {g})|) \,\mathrm {d}\mathbf {g}. $$
The convexity of
$\Phi $
implies that
$\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon F_\Phi (f) \leq 1 \}$
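is a closed convex symmetric set, and so the Luxemburg norm, given by
$$ \|f\|_{\mathsf {L}^{\Phi }(\mathbf {G})} := \inf \{ \lambda \in \mathbb {R}_{+} \colon F_\Phi (f/\lambda ) \leq 1 \}, $$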
is indeed a norm, and
$\mathsf {L}^{\Phi }(\mathbf {G})$
, the set of functions f for which
$\|f\|_{\mathsf {L}^{\Phi }(\mathbf {G})}$
is finite, is a Banach space. The sets
$\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon F_\Phi (f) \leq 1 \}$
and
$\{ f \in \mathsf {L}^{1}_{\mathrm {loc}}(\mathbf {G}) \colon \left \Vert f \right \Vert {}_{\mathsf {L}^{\Phi }(\mathbf {G})} \leq 1 \}$
coincide.
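For a simple function the Luxemburg norm can be computed by bisection, because $\lambda \mapsto F_\Phi (f/\lambda )$ is decreasing. The sketch below assumes the usual definition of the Luxemburg norm as the infimum of those $\lambda > 0$ for which $F_\Phi (f/\lambda ) \leq 1$, and takes $\Phi (t) = t\log (\mathrm {e}+t)$ as the Young function; the representation of $f$ by lists of values and measures, and all function names, are illustrative.

```python
import math

def phi(t):
    """Young function t * log(e + t) (assumed choice for this sketch)."""
    return t * math.log(math.e + t)

def modular(values, weights, scale=1.0):
    """F_Phi(f / scale) when f takes values[i] on a set of measure weights[i]."""
    return sum(w * phi(abs(v) / scale) for v, w in zip(values, weights))

def luxemburg_norm(values, weights, tol=1e-12):
    """inf{ s > 0 : F_Phi(f / s) <= 1 }, located by bisection."""
    if all(v == 0 for v in values):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(values, weights, hi) > 1.0:
        hi *= 2.0                      # enlarge until the modular drops to 1
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if modular(values, weights, mid) > 1.0:
            lo = mid
        else:
            hi = mid
    return hi

values, weights = [3.0, 1.0, 0.5], [0.2, 1.0, 4.0]
norm = luxemburg_norm(values, weights)
print(norm, modular(values, weights, norm))   # the modular at the norm is close to 1
```

The second printed number illustrates the coincidence of the two unit balls: scaling $f$ by its norm brings the modular to the threshold value $1$.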
Suppose that
$\Phi $
and
$\Psi $
are Young functions such that
$$ st \leq \Phi (s) + \Psi (t) \qquad \text{for all } s, t \in [0,\infty ). \tag{2.6} $$
Then

and it follows that the corresponding Luxemburg norms are related by the generalized Hölder inequalities, namely,

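We are particularly interested in a special pair of Young functions. Henceforth,
$$ \Phi (t) := t \log (\mathrm {e} + t) \qquad \text{and}\qquad \Psi (t) := \mathrm {e}^{t} - 1. $$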
By maximizing in s, it is straightforward to show that
$1 + st - s \log (\mathrm {e}+s) \leq \mathrm {e}^t$
, so these functions satisfy (2.6), whence (2.7) and (2.8) hold. For this choice of
$\Phi $
and
$\Psi $
, the corresponding spaces are denoted by
$\mathsf {L\,log\,L}(\mathbf {G})$
and
$\mathrm {e}^{\mathsf {L}}(\mathbf {G})$
. Clearly
$\mathsf {L\,log\,L}(\mathbf {G}) \subseteq \mathsf {L}^{1}(\mathbf {G})$
. Further,
$\Phi (\lambda t) \leq \Phi (\lambda ) \Phi (t)$
, whence

This implies that
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
if and only if
$F_\Phi (f)$
is finite, and that

which we used in the introduction to prove (1.4) from (1.3).
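Both elementary inequalities used in this paragraph are easy to probe numerically. The sketch below takes $\Phi (s) = s\log (\mathrm {e}+s)$ and $\Psi (t) = \mathrm {e}^t - 1$, which is the reading suggested by the inequality $1 + st - s\log (\mathrm {e}+s) \leq \mathrm {e}^t$ quoted above, and evaluates the Young-type gap $\Phi (s) + \Psi (t) - st$ and the ratio $\Phi (\lambda t)/(\Phi (\lambda )\Phi (t))$ on grids. The grids and names are illustrative, and a grid search is evidence rather than proof.

```python
import math

def phi(t):
    return t * math.log(math.e + t)

def psi(t):
    return math.exp(t) - 1.0

def young_gap(s, t):
    """Phi(s) + Psi(t) - s*t, expected to be nonnegative."""
    return phi(s) + psi(t) - s * t

def submult_ratio(lam, t):
    """Phi(lam*t) / (Phi(lam) * Phi(t)) for lam, t > 0, expected to be at most 1."""
    return phi(lam * t) / (phi(lam) * phi(t))

s_grid = [0.0] + [10.0 ** (k / 4.0) for k in range(-16, 17)]   # 0 and 1e-4 .. 1e4
t_grid = [0.25 * k for k in range(0, 121)]                     # 0 .. 30, avoids overflow in exp
pos_grid = s_grid[1:]

print(min(young_gap(s, t) for s in s_grid for t in t_grid))
print(max(submult_ratio(a, b) for a in pos_grid for b in pos_grid))
```

The smallest gap occurs at $s = t = 0$, where both sides vanish, and the largest ratio stays below $1$ on this grid.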
Finally, since
$\Psi $
is a Young function,

For all this and much more about Orlicz spaces, see [Reference Luxemburg24].
We need a density result.
Proposition 2.6
$\mathsf {L}^{2} \cap \mathsf {L\,log\,L} (\mathbf {G})$
is dense in
$\mathsf {L\,log\,L} (\mathbf {G})$
.
Proof For
$f\in \mathsf {L\,log\,L} (\mathbf {G})$
, we define, for all
$N \in \mathbb {N}$
,

Then
$f_N\in \mathsf {L\,log\,L} (\mathbf {G})$
,
$|f_N| \leq |f|$
, and
$f_N \to f$
almost everywhere as
$N \to \infty $
. Next,

which shows that
$f_N \in \mathsf {L}^{2}(\mathbf {G})$
for all
$N \in \mathbb {N}$
. Finally, if
$\lambda < 1$
, then

as
$N \to \infty $
. Consequently,
$\left \Vert {f-f_N}\right \Vert {}_{\mathsf {L\,log\,L}(\mathbf {G})}$
tends to zero as N tends to infinity.
We also have the following auxiliary result.
Lemma 2.7 For every
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
,

Proof By definition, if
$k\leq 0$
, then

Summing over all such k yields the required estimate.
3 The strong maximal function and Proof of Theorem 1.1
The results in this section may be easily generalized to spaces of homogeneous type.
Before we tackle the main topic, we state and prove a covering lemma. We begin with a geometric observation.
Lemma 3.1 There is a geometric constant
$C(\mathbf {G})$
with the property that

for all measurable subsets U of
$\mathbf {G}$
, all
$\lambda \in (0,1)$
and all rectangles
$R \in \mathscr {R}(\mathbf {G})$
such that

Proof By definition, R is a product of pseudodyadic cubes
$Q_1 \times Q_2$
, and by Theorem 2.2,

Define
$P := B(z(Q_1),2C\ell _1(Q_1)) \times B(z(Q_2),2C\ell _2(Q_2))$
. Then
$R \subseteq P$
and

in the first equality we use the homogeneity of the measure of the balls on
$G_1$
and
$G_2$
. We could also prove a similar inequality by using the doubling property of the measure.
Hence R is a subset of the set

as required.
Theorem 3.2 [Reference Córdoba and Fefferman5]
Let
$\{R_j\}_{j\in J}$
be a family of rectangles in
$ \mathbf {G}$
such that
$\big |\bigcup _{j\in J} R_j\big |$
is finite. Then there is a sequence of rectangles
$\{\tilde {R}_k\}\subset \{R_j\}_{j\in J}$
such that

Proof The proof is a generalization of that in [Reference Córdoba and Fefferman5]. However, the pseudodyadic case is somewhat trickier than the dyadic case in
$\mathbb {R}^2$
, and the argument in [Reference Córdoba and Fefferman5] is rather brief and not always precise, so it seems worthwhile to provide a complete proof.
Since there are countably many rectangles, we may assume that
$J = \mathbb {N}$
and
$j \in \mathbb {N}$
. Let

Choose a subsequence
$\{R_{\sigma (j)} \colon j \in \mathbb {N}\}$
of
$\{R_j \colon j \in \mathbb {N}\}$
, using the rules that
$\sigma (1) = 1$
and
$\sigma (k+1)$
is the least
$j> \sigma (k)$
such that

the construction terminates if this is not possible. Let
$E_K = \bigcup _{1 \leq k \leq K} R_{\sigma (k)} \subseteq E$
and let
$E_\infty = \bigcup _{1 \leq k < \infty } R_{\sigma (k)}$
. We note that
$E_\infty $
is a subset of E, and hence
$|E_\infty |\leq |E|<\infty $
by assumption. If the construction does not terminate, then by monotone convergence, we may choose K such that
$|E_K| \geq |E_\infty |/2$
; if the construction terminates, we take K to be the last index of the finite sequence. The set E consists of rectangles
$R_j$
that are not one of the chosen rectangles
$R_{\sigma (k)}$
together with rectangles that are subsets of
$E_\infty $
if the construction did not terminate, or of
$E_K$
otherwise. Suppose that the construction did not terminate. Then by Lemma 3.1,

Hence
$E \subseteq E_\infty ^* \cup E_\infty $
, and the boundedness of
$\mathcal {M}_{s}$
on
$\mathsf {L}^{2}(\mathbf {G})$
shows that

When the construction terminates, a slightly easier argument shows that
$|E| \lesssim |E_K|$
.
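The first selection pass can be mimicked in a discrete model. The sketch below assumes a rule of Córdoba–Fefferman type, in which a rectangle is kept when less than a fixed fraction of it is already covered by the rectangles kept so far; the fraction $1/2$, the integer-grid rectangles and the function names are all assumptions of the sketch, since the displayed selection condition is not reproduced above.

```python
def cells(rect):
    """Unit cells covered by the rectangle [x0, x1) x [y0, y1) with integer corners."""
    x0, x1, y0, y1 = rect
    return {(x, y) for x in range(x0, x1) for y in range(y0, y1)}

def greedy_select(rects, fraction=0.5):
    """Keep a rectangle when less than `fraction` of it is already covered."""
    selected, covered = [], set()
    for rect in rects:
        c = cells(rect)
        if len(c & covered) < fraction * len(c):
            selected.append(rect)
            covered |= c
    return selected, covered

rects = [(0, 4, 0, 1), (0, 1, 0, 4), (0, 2, 0, 2), (0, 4, 0, 1)]
selected, covered = greedy_select(rects)
print(selected, len(covered))   # [(0, 4, 0, 1), (0, 1, 0, 4)] 7
```

Each rejected rectangle has at least half of its area inside the union of the selected ones, so it lies in a superlevel set of the strong maximal function of that union; this is the mechanism by which the measure of the whole union is controlled by the measure of the selected part.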
We now relabel the finite family of rectangles
$\{R_{\sigma (1)}, \dots , R_{\sigma (K)}\}$
as
$R_K, \dots , R_1$
and repeat the construction, setting
$\tau (1) := 1$
and choosing
$\tau (l+1)$
to be the least
$k> \tau (l)$
such that

we end up with L terms, say. Since

by construction, it follows that

Write
$\tilde {R}_l$
for
$R_{\tau (l)}$
and
$\tilde {E}$
for
$\bigcup _{l} R_{\tau (l)}$
. Much as before,
$|E| \lesssim \big |\tilde {E}\big |$
. Note that

We now claim that

From this claim, it follows that

as long as
$1/\lambda < \log (2)/2$
, and so
$\sum _{l=1}^{L} \chi _{\tilde {R}_l} \in \mathrm {e}^{\mathsf {L}}(\mathbf {G})$
. It remains to prove (3.3).
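The threshold on $\lambda$ can be checked by a short computation. The sketch below assumes that the claim (3.3) has the form $|\tilde {E}_n| \leq C_1 2^{-n/2} |\tilde {E}|$, where $\tilde {E}_n$ is the set on which exactly n of the rectangles $\tilde {R}_l$ overlap; this is how Theorem 3.2 is quoted in the proof of Theorem 3.3, but the constant $C_1$ and the normalisation of the exponential class are assumptions made here for illustration.

```latex
% Sketch under the assumption |\tilde E_n| \le C_1 2^{-n/2} |\tilde E|,
% with N := \sum_l \chi_{\tilde R_l} and \tilde E_n := \{ N = n \}.
% N vanishes outside \tilde E, so only n \ge 1 contributes.
\begin{align*}
\int_{\tilde E} \bigl( \mathrm{e}^{N/\lambda} - 1 \bigr)
  &= \sum_{n \ge 1} \bigl( \mathrm{e}^{n/\lambda} - 1 \bigr) \, |\tilde E_n| \\
  &\le C_1 \, |\tilde E| \sum_{n \ge 1} \bigl( \mathrm{e}^{1/\lambda} \, 2^{-1/2} \bigr)^{n}
   = C_1 \, |\tilde E| \, \frac{r}{1-r},
  \qquad r := \mathrm{e}^{1/\lambda} \, 2^{-1/2} .
\end{align*}
% The geometric series converges exactly when r < 1, that is,
% when 1/\lambda < \log(2)/2.
```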
Consider
$\mathbf {g} \in \mathbf {G}$
that lies in two distinct rectangles from the family
$\{ \tilde {R}_l\}$
, R and S say. If
$\operatorname {\ell }_1(S) = \operatorname {\ell }_1(R)$
, then
$\pi _1(R) = \pi _1(S)$
, since both
$\pi _1(R)$
and
$\pi _1(S)$
are pseudodyadic cubes containing
$\pi _1(\mathbf {g})$
; now
$\pi _2(R) \subseteq \pi _2(S)$
or
$\pi _2(R) \supseteq \pi _2(S)$
, so
$R\cap S$
is either R or S, and this contradicts (3.1). Hence we may assume, without loss of generality, that
$\operatorname {\ell }_1(S) < \operatorname {\ell }_1(R)$
. A similar argument then shows that
$\operatorname {\ell }_2(S)> \operatorname {\ell }_2(R)$
. Thus, if there are n distinct rectangles in
$\{ \tilde {R}_l\}$
that contain
$\mathbf {g}$
, then we may label them
$R_1(\mathbf {g})$
, …,
$R_n(\mathbf {g})$
, in such a way that
$\operatorname {\ell }_1(R_j(\mathbf {g}))$
decreases with j and
$\operatorname {\ell }_2(R_j(\mathbf {g}))$
increases with j; then

and

We say that a rectangle
$T \in \{ \tilde {R}_l \}$
is a descendant of a rectangle
$R \in \{ \tilde {R}_l\}$
, and we write
$T \succ R$
, if
$R\cap T \neq \emptyset $
and both
$\operatorname {\ell }_1(T)> \operatorname {\ell }_1(R)$
and
$\operatorname {\ell }_2(T) < \operatorname {\ell }_2(R)$
, and we say that T is a child of R if
$T \succeq R$
and if
$S \in \{ \tilde {R}_l\}$
and
$T \succeq S \succeq R$
then
$S = T$
or
$S= R$
. We may define ancestors and parents similarly, with the relations reversed.
Fix a rectangle
$R \in \{ \tilde {R}_l\}$
, and consider
$\mathbf {g} \in R$
that lies in at least n distinct rectangles of
$\{ \tilde {R}_l\}$
. Then
$\mathbf {g}$
lies in at least
$\lceil n/2\rceil $
distinct rectangles S such that
$S \succeq R$
, or
$\mathbf {g}$
lies in at least
$\lceil n/2\rceil $
distinct rectangles S such that
$S \preceq R$
. Thus

where
$\sum ^\succeq _{l}$
indicates that we sum over l such that
$\tilde {R}_l \succeq R$
, and
$\sum ^\preceq _{l}$
is defined analogously. We estimate the measure of one of these two sets: the other may be estimated similarly.
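The counting step behind the two cases can be written out as follows. The sets A and B are names introduced only for this sketch; it uses that R itself contains $\mathbf {g}$ and that, by the preceding paragraph, every other rectangle of the family containing $\mathbf {g}$ is strictly comparable with R.

```latex
% Sketch: A and B are hypothetical names used only here.
% N \ge n denotes the number of rectangles of \{\tilde R_l\} containing \mathbf g \in R.
\begin{align*}
A &:= \{ S \ni \mathbf{g} : S \succeq R \}, \qquad
B := \{ S \ni \mathbf{g} : S \preceq R \}, \\
A \cup B &= \{ S \in \{\tilde R_l\} : \mathbf{g} \in S \}, \qquad
A \cap B = \{ R \}, \\
\# A + \# B &= N + 1 \ge n + 1
\quad\Longrightarrow\quad
\max\{ \# A, \# B \} \ge \lceil (n+1)/2 \rceil \ge \lceil n/2 \rceil .
\end{align*}
```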
Consider all
$S \in \{ \tilde {R}_l\}$
such that
$S \succeq R$
. Inside this collection of rectangles, we may identify the children
$S_1$
, …,
$S_p$
of R, which are pairwise disjoint, the children of the children of R, which are again pairwise disjoint, and so on. Observe that

and the near disjointness condition (3.1) implies that

Then the measure of the set of all
$\mathbf {g} \in R$
that also belong to another rectangle
$S \succ R$
is at most
$ \left | R \right |/2$
.
Likewise, similar inequalities hold for the children of the children of R, which we may label as
$T_1$
, …,
$T_q$
, and so

and the measure of the set of all
$\mathbf {g} \in R$
that also belong to two more rectangles
$S, T \succ R$
is at most
$2^{-2} \left | R \right |$
. Continuing inductively, the measure of the set of all
$\mathbf {g} \in R$
that lie in at least m distinct rectangles
$S\succ R$
is at most
$2^{-m} \left | R\right |$
.
Combining this estimate with an almost identical estimate for the measures of sets defined using ancestors and parents, we conclude that the measure of the set of all
$\mathbf {g} \in R$
that belong to at least n rectangles of
$\{ \tilde {R}_l \}$
is at most

Then the measure of all
$\mathbf {g} \in \mathbf {G}$
that belong to at least n rectangles of
$\{ \tilde {R}_l \}$
is at most a multiple of

which is what we needed to prove.
Now we can prove an endpoint estimate for the strong maximal function on
$\mathbf {G}$
.
Theorem 3.3 The strong maximal function satisfies the following:

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
.
Proof We recall a fundamental estimate on the strong maximal function and pseudodyadic strong maximal functions associated with adjacent pseudodyadic systems, the prototype of which (in the setting of the product
$\mathbb {R}\times \mathbb {R}$
) was proved in [Reference Li, Pipher and Ward23, Theorem 6.1]. More explicitly, we define the pseudodyadic strong maximal function as follows:

where
${\tau }_i=1,2,\dots ,\mathrm {T}_i$
. Then

This inequality means that instead of considering general products of balls, it suffices to consider rectangles, and Theorem 3.2 may be applied.
In light of (3.5), sublinearity and Proposition 2.6, it suffices to show that

when
${\tau }_i=1,2,\dots ,\mathrm {T}_i$
; we may assume that f takes nonnegative real values.
Let
$E=\big \{\mathbf {g} \in \mathbf {G}\colon |\mathcal {M}_{s}^{{\tau }_1,{\tau }_2} f(\mathbf {g})|>1\big \}$
. Since
$f\in \mathsf {L}^{2}(\mathbf {G})$
and
$\mathcal {M}_{s}^{{\tau }_1,{\tau }_2}$
is
$\mathsf {L}^{2}$
bounded,
$|E|$
is finite. For every
$\mathbf {g}\in E$
there exists
$R_{\mathbf {g}}\in \mathscr {R}^{{\tau }_1,{\tau }_2}(\mathbf {G})$
satisfying

Then
$E=\bigcup _{\mathbf {g}\in E} R_{\mathbf {g}}$
by definition.
By Theorem 3.2, there is a sequence of rectangles
$\{\tilde {R}_l\} \subseteq \{R_{\mathbf {g}}\}_{\mathbf {g}\in E}$
such that

Write
$\tilde {E}$
for
$\bigcup _l \tilde {R}_l$
and
$\tilde {E}_n$
for
$\left \{ \mathbf {g} \in \mathbf {G} : \sum _{l} \chi _{\tilde {R}_l}(\mathbf {g}) = n \right \}$
. From (3.7), (2.7), (2.10), and (2.12), we deduce that
$|E| \lesssim |\tilde {E}|$
and, for all
$\lambda \in [1,\infty )$
,

Theorem 3.2 shows that
$|\tilde {E}_n| \leq C_1 2^{-n/2} |\tilde {E}|$
, where
$C_1$
is a geometric constant. See also the quantitative argument in (3.4). Hence

We fix
$\lambda $
that is large enough that the series converges and the right-hand term is less than
$\big | \tilde {E} \big |/2$
; then
$\big | \tilde {E} \big | \leq 2 \Phi (\lambda ^2) F_\Phi (f)$
, as required.
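The last step is an absorption argument. In the sketch below, A stands for the main term and $B(\lambda )$ for the series term of the preceding inequality; both names, and the assumed shape $|\tilde {E}| \leq A + B(\lambda )$ with $A = \Phi (\lambda ^2) F_\Phi (f)$, are read off from the stated conclusion rather than from the displayed formula.

```latex
% Sketch: A and B(\lambda) are hypothetical names for the two terms on the right.
% Finiteness of |\tilde E| is what allows the subtraction; it holds because
% \tilde E \subseteq E and |E| < \infty.
\[
|\tilde E| \le A + B(\lambda), \quad
B(\lambda) \le \tfrac12 |\tilde E|, \quad
|\tilde E| < \infty
\quad\Longrightarrow\quad
\tfrac12 |\tilde E| \le A
\quad\Longrightarrow\quad
|\tilde E| \le 2A = 2 \, \Phi(\lambda^2) \, F_\Phi(f) .
\]
```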
Corollary 3.4 For
$\eta \in [0,\infty )$
, the maximal operator
$\mathcal {M}_{\varphi ,\eta }$
defined in (1.5) satisfies the endpoint estimate:

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
.
Proof It follows from (2.5) with a little care that
$\mathcal {M}_{\varphi ,\eta }(f)\lesssim \mathcal {M}_{s} f$
pointwise.
The following lemma is another well-known consequence of the boundedness of the strong maximal operator. Recall that
$\mathscr {M}(U)$
denotes the collection of all maximal rectangles contained in U.
Lemma 3.5 Let U be an open subset of
$\mathbf {G}$
of finite measure and
$\beta \in \mathbb {R}_{+}$
. Given R in
$\mathscr {M}(U)$
, let
$\beta R$
be the set
$\mathbf {z}(R)\delta _\beta (\mathbf {z}(R)^{-1}R)$
, where
$\mathbf {z}(R)$
is the center of R defined in Section 2.5. Then

Proof Recall that
$B(z(Q),c\operatorname {\ell }(Q) /3)\subseteq Q \subseteq B(z(Q),2C\operatorname {\ell }(Q))$
for all
$Q \in \mathscr {Q}(G)$
and that
$\mathbf {P}(\mathbf {z},\operatorname {\boldsymbol {\ell }})$
is the product
$B_1(z_1,\operatorname {\ell }_1) \times B_2(z_2,\operatorname {\ell }_2)$
. It follows that, if
$R \in \mathscr {R}$
and
$\mathbf {g} \in \beta R$
, then

and we deduce, much as in the proof of Lemma 3.1, that

Hence

and the hyperweak boundedness of
$\mathcal {M}_{s}$
implies that

as claimed.
4 The atomic decomposition
In this section, we first prove a hyperweak boundedness estimate for the Lusin area function, and then use this estimate to decompose
$\mathsf {L\,log\,L}(\mathbf {G})$
functions.
Our results in this and later sections need the more specific context of stratified Lie groups.
Recall that
$p^{[i]}_{t_i}$
and
$q^{[i]}_{t_i}$
denote the convolution kernels of the operators
$e^{-t_i \sqrt {\mathcal {L}_i}}$
and
$t_i \partial _{t_i} e^{-t_i \sqrt {\mathcal {L}_i}}$
; then
$q^{[i]}_{t_i} = t_i \partial _{t_i} p^{[i]}_{t_i}$
. We write
$p_{\mathbf {t}}:=p^{[1]}_{t_1} \otimes p^{[2]}_{t_2}$
and
${{q}}_{\mathbf {t}}:=q^{[1]}_{t_1} \otimes q^{[2]}_{t_2}$
.
4.1 Hyperweak boundedness and the area function
The first step is to apply Corollary 3.4.
Proposition 4.1 The nontangential maximal operator
$\mathcal {M}_{p,\eta }$
is hyperweakly bounded. That is,

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
.
This weak type estimate for the Poisson maximal operator implies a similar estimate for the Lusin area function, which is the key to establishing the atomic decomposition for
$\mathsf {L\,log\,L} (\mathbf {G})$
functions. More explicitly, we denote the tensor
by
and define

where

Theorem 4.2 The following estimate holds:

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
.
Proof A recent result of Fan, Yan and the first and third authors [Reference Cowling, Fan, Li and Yan6] shows that for f such that
$\left \Vert \mathcal {M}_{p,\eta } f \right \Vert {}_{\mathsf {L}^{1}(\mathbf {G}) }<\infty $
,

More precisely, define the sublevel set
$L_{\eta }(\lambda ) :=\left \{ \mathbf {g}\in {\mathbf {G}}\colon \mathcal {M}_{p,\eta }(f)(\mathbf {g}) \leq \lambda \right \}$
. Then it is shown that, when
$\eta $
is sufficiently large,

for all
$\lambda \in \mathbb {R}_{+}$
. Now
$\big | L_{\eta }(\lambda )^c \big | \lesssim F_\Phi (f/\lambda )$
by Proposition 4.1, so the layer-cake formula and (2.10) show that

This implies that the right-hand side of (4.2) is finite. By repeating the argument of [Reference Cowling, Fan, Li and Yan6], we see that (4.2) also holds for
$f\in \mathsf {L\,log\,L} (\mathbf {G})$
, and we conclude that

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
, as required.
Recall that
${{q}}_{\mathbf {t}}$
denotes
$t_1\partial _{t_1} p_{t_1}^{[1]} \otimes t_2\partial _{t_2} p^{[2]}_{t_2}$
and

Corollary 4.3 The area function
$\mathcal {S}_{q,1} (f)$
satisfies the estimate:

for all
$f \in \mathsf {L\,log\,L}(\mathbf {G})$
.
Proof This holds as
$f * {{q}}_{\mathbf {t}}$
is one of the components of the tensor
.
4.2 The atomic decomposition for L log L
Recall that the
$q^{[i]}_1$
satisfy the standard smoothness, decay and cancellation conditions (2.1) and (2.2), and by [Reference Geller and Mayeli13] there exist compactly supported smooth functions
$\varphi ^{[i]}$
on
$G_i$
with integral
$0$
such that

for all
$f\in \mathsf {L}^{2}(\mathbf {G})$
, where
$\varphi _{\mathbf {t}}=\varphi ^{[1]}_{t_1}\otimes \varphi ^{[2]}_{t_2}$
. This is the Calderón reproducing formula. We suppose that
$\operatorname {supp}(\varphi ^{[i]}) \subseteq B_i(o,1)$
, by rescaling if necessary.
Theorem 4.4 (Atomic decomposition for
$\mathsf {L\,log\,L} $
)
Let
$f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
. Then we may write

where the sum converges unconditionally in
$\mathsf {L}^{2}(\mathbf {G})$
and the atoms
$a_k$
satisfy:
(1) $a_k$ vanishes outside a subset $U^{\dagger }_k$ of $\mathbf {G}$ such that $|U^{\dagger }_k|\lesssim F_\Phi (2^{-k}{f})$;
(2) $\|a_k\|_{\mathsf {L}^{2}(\mathbf {G})}^2 \lesssim 2^{2k}F_\Phi (2^{-k}f)$;
(3) each $a_k$ can be further decomposed: $a_k=\sum _{R\in \mathscr {M}(U^{*}_k)}a_{k,R}$, where the sum converges unconditionally in $\mathsf {L}^{2}(\mathbf {G})$, and
    (a) $\operatorname {supp} a_{k,R} \subseteq \beta R$, for a suitable $\beta \in \mathbb {R}_{+}$;
    (b) $\int _{G_1}a_{k,R}(g_1,g_2) \,\mathrm {d} g_1=0$ for all $g_2 \in G_2$;
    (c) $\int _{G_2}a_{k,R}(g_1,g_2) \,\mathrm {d} g_2=0$ for all $g_1 \in G_1$;
    (d) $\sum _{R\in \mathscr {M}(U^*_k)} \left \Vert a_{k,R} \right \Vert {}^2_{\mathsf {L}^{2}(\mathbf {G})}\lesssim 2^{2k}F_\Phi (2^{-k} f)$.
The sets
$U^{\dagger }_k$
and
$U^*_k$
are defined in the proof.
Proof From Corollary 4.3,
$\mathcal {S}_{q,1}$
is hyperweakly bounded. Assume that
$0 \neq f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
. Then
$\mathcal {S}_{q,1} (f)\in \mathsf {L}^{2}(\mathbf {G})$
. Now for all
$k\in \mathbb {Z}$
, we set




here
$\alpha $
and
$\beta $
are positive numbers that will be specified during the proof. For each rectangle
$S \in \mathscr {R}(\mathbf {G})$
, we define the tent
$S_+$
over S to be the set

The constant C is as in Theorem 2.2. This definition implies that, if
$(\mathbf {h},\mathbf {t}) \in S_+$
, then

and so
$S_+ \subseteq \Gamma (\mathbf {g})$
for all
$\mathbf {g} \in S$
.
Further, the “upper half space”
$\mathbf {T} \times \mathbf {G}$
is the disjoint union of all tents
$S_+$
as S runs over
$\mathscr {R}(\mathbf {G})$
.
By definition
$U_k \supseteq U_{k+1}$
, so
$\left | S\cap U_k\right | / \left | S \right | \geq \left | S\cap U_{k+1}\right | / \left | S \right |$
for all
$k \in \mathbb {Z}$
. This implies that the sets
$\mathscr {B}_k$
are pairwise disjoint. Indeed, if
$S \in \mathscr {B}_k$
, then
$\left | S\cap U_k\right |/ \left | S \right |> 1/2$
, which implies that
$\left | S\cap U_j\right |/ \left | S \right |> 1/2$
whenever
$j \leq k$
, so
$S \notin \mathscr {B}_j$
if
$j < k$
. Likewise,
$\left | S\cap U_{k+1} \right |/ \left | S \right | \leq 1/2$
, which implies that
$\left | S\cap U_j\right |/ \left | S \right | \leq 1/2$
whenever
$j \geq k+1$
, so
$S \notin \mathscr {B}_j$
if
$j> k$
.
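The disjointness can be summarised in one chain of inequalities. The sketch assumes the definition of $\mathscr {B}_k$ that is used implicitly in this paragraph, namely that S belongs to $\mathscr {B}_k$ when more than half of S lies in $U_k$ but at most half lies in $U_{k+1}$; the density $d_k(S)$ is notation introduced only here.

```latex
% Sketch. Assumed definition, read off from the proof:
%   S \in \mathscr B_k  iff  d_k(S) > 1/2 \ge d_{k+1}(S),
% where d_k(S) := |S \cap U_k| / |S| is hypothetical notation.
\begin{align*}
&U_k \supseteq U_{k+1}
 \ \Longrightarrow\ d_k(S) \ge d_{k+1}(S) \quad \text{for all } k \in \mathbb{Z}; \\
&S \in \mathscr{B}_j \cap \mathscr{B}_k, \ j < k
 \ \Longrightarrow\ \tfrac12 \ge d_{j+1}(S) \ge d_k(S) > \tfrac12 .
\end{align*}
% The last line is contradictory, so no S lies in two different \mathscr B_k.
```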
By Chebyshev’s inequality,
$\left | U_k \right | \lesssim 2^{-2k}$
, so
$\lim _{k \to \infty } \left | S\cap U_k\right |/ \left | S \right | =0$
for all
$S \in \mathscr {R}(\mathbf {G})$
. If
$\mathbf {G} \setminus \bigcup _{k\in \mathbb {Z}} U_k$
were a null set, then it would follow that
$\lim _{k \to -\infty } \left | S\cap U_k\right |/ \left | S \right | =1$
for all
$S \in \mathscr {R}(\mathbf {G})$
, and
$\mathscr {R}(\mathbf {G})$
would be the disjoint union of the sets
$\mathscr {B}_k$
as k runs over
$\mathbb {Z}$
. However, except in special cases, we do not know that
$\mathbf {G} \setminus \bigcup _{k \in \mathbb {Z}} U_k$
is null, and we must take into account the possibility that some
$S \in \mathscr {R}(\mathbf {G})$
do not belong to any
$\mathscr {B}_k$
. When
$S \notin \bigcup _{k \in \mathbb {Z}} \mathscr {B}_k$
, it is evident that
$\left | S\cap U_{k} \right |/ \left | S \right | \leq 1/2$
for all
$k \in \mathbb {Z}$
, or equivalently,

We claim that, if
$S \notin \bigcup _{k \in \mathbb {Z}} \mathscr {B}_k$
, then
$(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) =0$
for almost all
$(\mathbf {h},\mathbf {t})\in S_+$
. To see this, we define the set
$Z =\{ \mathbf {g}\in \mathbf {G}: \mathcal {S}_{q,1}(f)(\mathbf {g})=0\}$
, and note that
$\bigcap _{k\in \mathbb Z} U_k^c=Z$
. Since
$\mathcal {S}_{q,1}(f)$
is lower semicontinuous, Z is closed. Then

By definition, for every
$\mathbf {g}\in Z$
,

which implies that
$(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h})=0$
for almost all
$(\mathbf {h},\mathbf {t})\in \Gamma (\mathbf {g})$
. Hence

so that
$(f \ast {{q}}_{\mathbf {t}} )(\mathbf {h}) =0$
for almost all
$(\mathbf {h},\mathbf {t})\in S_+$
, as claimed.
Since
$f\in \mathsf {L}^{2}(\mathbf {G})$
, the reproducing formula implies that

The third equality holds since the upper half-space
$\mathbf {T} \times \mathbf {G}$
is the disjoint union of the tents
$S_+$
. The fourth equality holds because the
$\mathscr {B}_k$
are disjoint, and if
$S \in \mathscr {R}(\mathbf {G}) \setminus \big (\bigcup _{k\in \mathbb {Z}} \mathscr {B}_k \big )$
, then
$(f \ast q_{\mathbf {t}})(\mathbf {h}) = 0$
for almost all
$(\mathbf {h}, \mathbf {t}) \in S_+$
.
If
$\mathbf {g} \in S \in \mathscr {B}_k$
, then
$|S \cap U_k | / \left | S \right |> 1/2$
so
$\mathbf {g} \in U^{*}_k$
by Lemma 3.1 provided that
$\alpha < C(\mathbf {G})/2$
; coupled with (4.6), this shows that

For future use, we note that, for any measurable set V in
$\mathbf {G}$
,

Next, for all
$S\in \mathscr {B}_k$
, choose
$\tilde {S}\in \mathscr {M}(U^{*}_k)$
such that
$S\subseteq \tilde {S}$
. Then for all
$R\in \mathscr {M}(U^{*}_k)$
, set
$a_{k,R}=\sum _{S \in \mathscr {B}_k\colon \tilde {S}=R } b_{k,S}$
, where

By construction,

Now

and hence

where
$\beta = 30 C/c$
. Thus
$a_k$
vanishes on
$(U^{\dagger }_k)^c$
(recall that
$U^{\dagger }_k$
is defined in (4.5)). By Lemmas 3.5 and 3.1 and Corollary 4.3,

Next, we claim that for all
$k\in \mathbb {Z}$
,

To see this, take
$h \in \mathsf {L}^{2}(\mathbf {G})$
such that
$\|h\|_{\mathsf {L}^{2}(\mathbf {G})}=1$
; then, writing
$\check \varphi $
for the reflected version of
$\varphi $
, that is,
$\check \varphi (\mathbf {g}) = \varphi (\mathbf {g}^{-1})$
, we see that

By definition, (4.10), (4.7), (4.8), a convolution identity and (4.12),

This shows that
$|\langle a_k,h\rangle |^2 \lesssim 2^{2k}F_\Phi (2^{-k}f)$
, whence
$\|a_k\|_{\mathsf {L}^{2}(\mathbf {G})}^2\lesssim 2^{2k}F_\Phi (2^{-k}f)$
, and (4.13) holds.
We now claim that

Arguing as above, we see that, if
$h\in \mathsf {L}^{2}(\mathbf {G})$
and
$\|h\|_{\mathsf {L}^{2}(\mathbf {G})}=1$
, then

where
$\mathscr {B}_k(R) := \{ S \in \mathscr {B}_k\colon \tilde {S}=R \}$
. Hence

which implies that

Then, much as argued to prove (4.13), we deduce (4.14). The unconditional convergence of the sum follows as in [Reference Chen, Cowling, Lee, Li and Ottazzi2].
5 Proof of Theorem 1.1
In this section, we use the atomic decomposition to prove Theorem 1.1. Recall that, if
$f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
, then we may write

where
$U^*_k$
is as in Theorem 4.4, and
$a_k$
and
$a_{k,R}$
satisfy support and cancellation conditions. In Theorem 4.4, we defined

and

where
$\alpha < C(\mathbf {G})/2$
; now we also define

and

By definition,
$U_k \subset U^{*}_k\subset U^{**}_k\subset U^{***}_k$
, and so
$|U_k| \leq |U^{*}_k|\leq |U^{**}_k|\leq |U^{***}_k|$
. However,

where the implicit constants are geometric, from the
$\mathsf {L}^{2}(\mathbf {G})$
-boundedness of the strong maximal operator
$\mathcal {M}_{s}$
.
5.1 Proof of Theorem 1.1 for area functions
$\mathcal {S}_{\psi ,\eta }(f)$
Recall from Section 4.2 that
$\mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
is dense in
$ \mathsf {L\,log\,L} (\mathbf {G})$
. Now we take a general
$\psi $
satisfying the decay, smoothness and cancellation conditions (2.1) and (2.2), and prove that

for all
$f\in \mathsf {L\,log\,L} (\mathbf {G})$
. By sublinearity and density, it suffices to prove that

for all
$f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
. We may and shall suppose that
$\eta> 1$
.
From Theorem 4.4, we may write

where the
$a_k$
satisfy the conditions of that theorem.
Observe that

where the last inequality follows from Lemma 2.7.
Thus, by the
$\mathsf {L}^{2}(\mathbf {G})$
boundedness of
$\mathcal {S}_{\psi ,\eta }$
,

To handle
$\mathcal {S}_{\psi ,\eta }(\sum _{k \geq 1} a_k)$
, we use the fine structure of the atoms
$a_k$
: each
$a_k$
is supported in
$U_k^\dagger $
, where
$|U_k^\dagger |\lesssim 2^{-k}F_\Phi (f)$
, and
$a_k=\sum _{R\in \mathscr {M}(U^*_k)} a_{k,R}$
.
For all
$R = {Q_1} \times {Q_2} \in \mathscr {M}(U^*_k)$
, let
${\tilde {Q}_1}$
be the biggest pseudodyadic cube containing
${Q_1}$
such that
${\tilde {Q}_1}\times {Q_2}\subset U^{**}_k$
, where
$U^{**}_k$
is defined in (5.1). Next, let
${\tilde {Q}_2}$
be the biggest pseudodyadic cube containing
${Q_2}$
such that
${\tilde {Q}_1}\times {\tilde {Q}_2} \subseteq U^{***}_k$
, where
$U^{***}_k$
is defined in (5.2). Finally, let
$R^{\dagger }$
be
$100\beta _k({\tilde {Q}_1} \times {\tilde {Q}_2})$
, where
$\beta _k = 2^{k/ (2\nu _1+2\nu _2)} \eta $
. By Lemmas 3.5 and 3.1 and Theorem 4.4,

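The exponent in $\beta _k$ appears to be chosen so that the enlargement costs only a factor $2^{k/2}$ in measure, which the decay of $|U^{\dagger }_k|$ can absorb. The sketch below assumes that $\nu _i$ is the homogeneous dimension of $G_i$, so that dilation by $\beta $ multiplies measure on $\mathbf {G}$ by $\beta ^{\nu _1+\nu _2}$, and that $F_\Phi (2^{-k}f) \lesssim 2^{-k} F_\Phi (f)$ for $k \geq 1$, as in the bound on $|U^{\dagger }_k|$ quoted in this proof. It tracks only the factor $2^{k/(2\nu _1+2\nu _2)}$ of $\beta _k$; how these bounds enter (5.6) is an assumption.

```latex
% Sketch under the stated assumptions on \nu_1, \nu_2 and F_\Phi.
\begin{align*}
\bigl( 2^{k/(2\nu_1+2\nu_2)} \bigr)^{\nu_1+\nu_2} &= 2^{k/2}, \\
\sum_{k \ge 1} 2^{k/2} \, F_\Phi(2^{-k} f)
  &\lesssim \sum_{k \ge 1} 2^{k/2} \, 2^{-k} \, F_\Phi(f)
   = \frac{2^{-1/2}}{1 - 2^{-1/2}} \, F_\Phi(f) .
\end{align*}
% The series \sum_{k \ge 1} 2^{-k/2} is geometric with ratio 2^{-1/2} < 1.
```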
We claim that there exists
$\delta \in \mathbb {R}_{+}$
such that

Assume (5.7) for the moment, and let
$E^\dagger :=\bigcup _{k \geq 1}\bigcup _{R \in \mathscr {M}(U^*_k)} R^{\dagger }$
. Then on the one hand,

by (5.6). On the other hand,

where the last inequality follows by summing (5.7) over positive k.
By Chebyshev’s inequality,

Together with (5.5), this implies (5.3).
It remains to prove (5.7). Now

say, where
$\beta _k=2^{k/(2\nu _1+2\nu _2)} \eta $
. It suffices to control
$\mathrm {I}_1(R)$
and
$\mathrm {I}_2(R)$
, as the other two terms are similar.
By Hölder’s inequality and the
$ \mathsf {L}^{2}(G_2)$
-boundedness of
$\mathcal {S}_{\psi ^{[2]},\eta }$
,

We use the cancellation and support restrictions on
$a_{k,R}$
in the first variable, (2.1), homogeneity, the geometry (namely,
$h_1, z({Q_1}) \in {Q_1}$
and
$g_1 \in (100\beta _k {\tilde {Q}_1})^c$
) and Hölder’s inequality to deduce that

where
$z({Q_1})$
is the center of
${Q_1}$
. Hence

(recall that $\fint$ denotes the “average integral”). This estimate implies that

To estimate
$\mathrm {I}_2(R)$
, we argue as follows. For
$g_1\notin 100\beta _k {\tilde {Q}_1}$
and
$g_2\notin 100 {Q_2}$
,

and because of the cancellation of
$a_{k,R}$
and properties of
$\psi ^{[1]}$
and
$\psi ^{[2]}$
,

by the same geometrical arguments as used to treat
$\mathrm {I}_1(R)$
, applied in both variables. Hence
$\big (\mathcal {S}_{\psi ,\eta }(a_{k,R})(\mathbf {g})\big )^2$
is dominated by a multiple of the product of
$\left \Vert a_{k,R} \right \Vert {}_{\mathsf {L}^{1}(\mathbf {G})} $
and

Our estimation of
$\mathrm {I}_2(R)$
concludes with the observation that

Then, by combining the estimates of
$\mathrm {I}_1(R)$
and
$\mathrm {I}_2(R)$
, and then using Hölder’s inequality and Lemma 2.5, we see that for all
$k \geq 1$
,

Similar estimates hold for
$\mathrm {I}_3(R)$
and
$\mathrm {I}_4(R)$
, but with
$\gamma _2$
in place of
$\gamma _1$
, and hence (5.7) holds, and the proof is complete.
5.2 Proof of Theorem 1.1 for square functions
$\mathcal {S}_{\psi ,0}(f)$
Arguing much as in Section 5.1, we may also check that

for all
$f\in \mathsf {L\,log\,L} (\mathbf {G})$
. We leave the details to the reader.
5.3 Hyperweak boundedness of Riesz transformations
We write
$\mathcal {R}$
for a double Riesz transformation
$\mathcal {R}^{[1]}_{j_1} \otimes \mathcal {R}^{[2]}_{j_2}$
.
Theorem 5.1 The double Riesz transformations satisfy the endpoint estimate:

for all
$f\in \mathsf {L\,log\,L} (\mathbf {G})$
.
Proof By density and sublinearity, we need only prove that, for all
$f\in \mathsf {L}^{2}(\mathbf {G}) \cap \mathsf {L\,log\,L} (\mathbf {G})$
,

From Theorem 4.4, we may write

where the
$a_k$
satisfy the conditions specified there. By the
$\mathsf {L}^{2}(\mathbf {G})$
boundedness of
$\mathcal {R}$
,

To handle
$\mathcal {R}(\sum _{k \geq 1} a_k)$
, we consider the
$a_k$
in more detail. The atom
$a_k$
is supported in
$U^\dagger _k$
, where
$|U^\dagger _k|\lesssim F_\Phi (2^{-k} f)$
, and we may write
$a_k=\sum _{R\in \mathscr {M}(U^*_k)} a_{k,R}$
.
Again, for all
$R = {Q_1} \times {Q_2} \in \mathscr {M}(U^*_k)$
, let
${\tilde {Q}_1}$
be the biggest pseudodyadic cube containing
${Q_1}$
such that
${\tilde {Q}_1}\times {Q_2}\subset U^{**}_k$
, where
$U^{**}_k$
is defined in (5.1). Next, let
${\tilde {Q}_2}$
be the biggest pseudodyadic cube containing
${Q_2}$
such that
${\tilde {Q}_1}\times {\tilde {Q}_2} \subseteq U^{***}_k$
, where
$U^{***}_k$
is defined in (5.2). Now let
$R^{\dagger }$
be
$100\beta _k({\tilde {Q}_1} \times {\tilde {Q}_2})$
, where
$\beta _k = 2^{k/ (2\nu _1+2\nu _2)}$
. By Lemmas 3.5 and 3.1 and Theorem 4.4,

As in the proof of Theorem 1.1, to prove that

it suffices to show that, for some
$\delta \in \mathbb {R}_{+}$
,

for every
$k \geq 1$
. If (5.13) holds, then (5.11) and (5.12) imply (5.10). We now prove (5.13).

say, where
$\beta _k=2^{{k/(2\nu _1+2\nu _2)}}$
. Since the estimates of
$\mathrm {I}_1(R)$
and
$\mathrm {I}_2(R)$
are symmetric, we only estimate
$\mathrm {I}_1(R)$
. Note that

say. By Hölder’s inequality and the
$\mathsf {L}^{2}(G_2)$
-boundedness of
$\mathcal {R}^{[2]}_{j_2}$
,

By the cancellation condition on
$a_{k,R}(\cdot ,g_2)$
and the smoothness of
$\mathcal {R}^{[1]}_{j_1}$
,

where
$z({Q_1})$
is the center of
${Q_1}$
. Hence

To estimate
$\mathrm {I}_{12}(R)$
, we use the cancellation of
$a_{k,R}$
to write
$\mathcal {R}(a_{k,R})(\mathbf {g})$
as

where
$z({Q_2})$
is the center of
${Q_2}$
. Using the smoothness of
$\mathcal {R}^{[1]}_{j_1}$
and
$\mathcal {R}^{[2]}_{j_2}$
, we deduce that

Then, by continuing as in Section 5.1, we establish (5.13) and complete the proof.
Acknowledgements
The authors thank the referees for their careful reading and helpful comments.