1. Introduction
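A real-valued sequence $\left(x_{n}\right)_{n=1}^{\infty}$ is said to be uniformly distributed (or equidistributed) modulo one if, for every interval $I\subseteq\left[0,1\right)$, we have

$$\lim_{N\to\infty}\frac{1}{N}\#\left\{ 1\le n\le N:\,\left\{ x_{n}\right\} \in I\right\} =\left|I\right|,$$

where $\left\{ x\right\} $ denotes the fractional part of x, and $\left|I\right|$ denotes the length of the interval I. There are many examples of sequences which satisfy this property, e.g., the Kronecker sequence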
$x_{n}=\alpha n$ where $\alpha$ is irrational, and more generally (as was shown by Weyl in his pioneering 1916 paper [18]) the sequence $x_{n}=P\left(n\right)$ (for a polynomial $P\left(x\right)=\alpha_{d}x^{d}+\dots+\alpha_{1}x+\alpha_{0}$), where at least one of the coefficients $\alpha_{1},\dots,\alpha_{d}$ is irrational. In the metric sense, more can be said: Weyl proved [18] that for any sequence $\left(a_{n}\right)_{n=1}^{\infty}$ of distinct integers, the sequence $x_{n}=\alpha a_{n}$ is uniformly distributed modulo one for (Lebesgue) almost all $\alpha\in\mathbb{R}$
. This is also true for real-valued sequences whose elements are sufficiently separated from each other (see, e.g., [6, chapter 1, corollary 4·1]): if $\left(a_{n}\right)_{n=1}^{\infty}$ is a real-valued sequence, and there exists a positive constant $\delta$ such that $\left|a_{n}-a_{m}\right|\ge\delta$ for each $n\ne m$, then the sequence $x_{n}=\alpha a_{n}$ is uniformly distributed modulo one for almost all $\alpha\in\mathbb{R}$. This condition clearly holds for real-valued, positive, lacunary sequences, i.e., sequences such that $a_{1}>0$, and there exists a constant $C>1$ such that for all $n\ge1$ we have

$$a_{n+1}\ge Ca_{n}.$$
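As a numerical illustration of the definition of uniform distribution modulo one (not part of the argument), the following Python sketch counts the proportion of the first N terms of a Kronecker sequence whose fractional parts fall in a test interval. The choice $\alpha=\sqrt{2}$, the interval $[0.2,0.55)$ and the helper name `fraction_in_interval` are assumptions made for illustration only.

```python
import math

def fraction_in_interval(seq, a, b):
    """Proportion of terms whose fractional part lies in [a, b)."""
    hits = sum(1 for x in seq if a <= x % 1.0 < b)
    return hits / len(seq)

alpha = math.sqrt(2)            # an irrational alpha (illustrative choice)
N = 10_000
kronecker = [alpha * n for n in range(1, N + 1)]

a, b = 0.2, 0.55                # test interval I = [a, b), of length 0.35
proportion = fraction_in_interval(kronecker, a, b)
print(proportion, b - a)        # the two numbers should be close for large N
```

Floating-point arithmetic is adequate here because the terms grow only linearly; for a genuinely lacunary sequence such as $a_{n}=2^{n}$ one would need exact or high-precision arithmetic to compute $\alpha a_{n}$ modulo one reliably.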
While the classical theory deals with the distribution of sequences modulo one at “large” scales, there has been a growing interest in recent years in the fluctuations of sequences at smaller scales. For many sequences, it is conjectured (backed up by numerical evidence) that the small-scale statistics (at the scale $1/N$, the mean gap of the first N elements of the sequence modulo one) are in agreement with the statistics of uniform i.i.d. random points in the unit interval (Poissonian statistics), thus demonstrating pseudo-random behaviour for such sequences. The rigorous study of such questions was initiated by Rudnick and Sarnak [13], who considered the pair correlation (see (1·3), with L fixed) of the sequence $ x_n = \alpha n^d$ modulo one ($ d\ge 2 $), and proved Poissonian limiting behaviour for almost all $ \alpha \in \mathbb{R} $; while the same limiting behaviour is conjectured to hold for any specific irrational $ \alpha $ which is badly approximable by rationals, the question remains open, as are most questions about small-scale statistics of specific sequences (i.e., not in the metric sense).
A popular small-scale statistic is the (normalised) gap distribution of the re-ordered first N elements of the sequence modulo one, which for many sequences is expected to converge to the exponential distribution (“Poissonian gap statistics”), the almost sure limiting distribution of the gaps in the random model. Lacunary sequences are among the rare examples where such behaviour has been rigorously proved to hold (in the metric sense): Rudnick and Zaharescu proved [16] Poissonian gap statistics for almost all $\alpha\in\mathbb{R}$ for the sequence $x_{n}=\alpha a_{n}$, where $\left(a_{n}\right)_{n=1}^{\infty}$ is an integer-valued lacunary sequence; this was recently extended to real-valued lacunary sequences by Chaubey and the author [2]. At the other extreme, Lutsko and Technau recently proved [10] Poissonian gap statistics for the slowly growing sequence $ x_n = \alpha (\log n)^A$ (where $ A>1 $); remarkably, this holds for any $ \alpha>0 $, and not only in the metric sense (see also the closely related results [8, 9] about Poissonian correlations for the sequence $ x_n = \alpha n^\theta $ where $ \theta $ is small).
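To make the random model concrete, here is a small Python sketch of the normalised gap distribution for uniform i.i.d. points on the circle. It is an illustration under stated assumptions only: the sample size, the seed, and the helper name `normalised_gaps` are hypothetical choices, and the sketch simulates the random model rather than any of the sequences discussed above.

```python
import math
import random

def normalised_gaps(points):
    """Gaps between consecutive points on the circle R/Z, rescaled by N."""
    pts = sorted(p % 1.0 for p in points)
    n = len(pts)
    gaps = [pts[i + 1] - pts[i] for i in range(n - 1)]
    gaps.append(pts[0] + 1.0 - pts[-1])    # wrap-around gap
    return [n * g for g in gaps]

rng = random.Random(0)                      # fixed seed for reproducibility
N = 100_000
gaps = normalised_gaps([rng.random() for _ in range(N)])

mean_gap = sum(gaps) / N                    # equals 1 by construction
tail = sum(1 for g in gaps if g > 1.0) / N  # empirical P(gap > 1)
print(mean_gap, tail, math.exp(-1.0))       # exponential law predicts e^{-1}
```

For Poissonian gap statistics the proportion of normalised gaps exceeding s should approach $e^{-s}$; the sketch checks this at $s=1$ only.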
Statistics in the “mesoscopic” regime, i.e., at the scale $L/N$, where $L=L\left(N\right)\to\infty$ and $L=o\left(N\right)$, provide further information which may capture some interesting features of sequences. An example of such a statistic is the number variance (the variance of the number of elements in random intervals, see the definition below in our setting), famously studied for the zeros of the Riemann zeta function, for which at small scales the number variance is consistent with that of the eigenvalues of random matrices drawn from the Gaussian unitary ensemble (GUE), whereas “saturation” occurs at larger scales (see [1]). In the context of sequences modulo one, only a few results have been established so far in the mesoscopic regime, mainly concerning the leading order asymptotics of the long-range correlations of the sequence
$x_{n}=\alpha n^{2}$
(see [4, 5, 7, 12, 17]); nevertheless, important intermediate-scale statistics such as the number variance have largely remained unexplored. The aim of this paper is to study such statistics for real-valued lacunary sequences.
Let $\left(a_{n}\right)_{n=1}^{\infty}$ be a positive, real-valued lacunary sequence; we are interested in the distribution of the number of elements modulo one of the sequence $\left(x_{n}\right)_{n=1}^{\infty}=\left(\alpha a_{n}\right)_{n=1}^{\infty}$ in intervals of length $L/N$ around points $x$, which we denote by

is the characteristic function of the interval
The first statistic that we will study is the number variance

i.e., the variance of
, where we randomise w.r.t. the centre of the interval x. We would like to show that for generic values of
, we have

which is in agreement with the random model. In our first main result we show that (1·1) holds with high probability in (essentially) the full mesoscopic regime (namely, all the way up to $L=O\left(N^{1-\epsilon}\right)$, where $\epsilon>0$ is arbitrarily small).
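The comparison with the random model can be explored numerically. The sketch below is a hedged illustration, not the paper's method: it assumes that the counting function is the number of points modulo one in the half-open window $[x-L/(2N),\,x+L/(2N))$ on the circle, and it approximates the average over the centre x by a midpoint grid. The function names, the grid size, the seed and the values of N and L are all hypothetical choices. It contrasts uniform i.i.d. points, whose number variance should be close to L, with equally spaced points, which are rigid.

```python
import bisect
import random

def count_in_window(sorted_pts, x, half_width):
    """Number of points of R/Z in [x - half_width, x + half_width)."""
    n = len(sorted_pts)
    lo, hi = x - half_width, x + half_width
    if lo < 0.0:    # window wraps around 0
        return (bisect.bisect_left(sorted_pts, hi)
                + n - bisect.bisect_left(sorted_pts, lo + 1.0))
    if hi > 1.0:    # window wraps around 1
        return (n - bisect.bisect_left(sorted_pts, lo)
                + bisect.bisect_left(sorted_pts, hi - 1.0))
    return bisect.bisect_left(sorted_pts, hi) - bisect.bisect_left(sorted_pts, lo)

def number_variance(points, L, grid=100_000):
    """Grid approximation of the average of (count - L)^2 over centres x in [0, 1)."""
    pts = sorted(p % 1.0 for p in points)
    half = L / (2.0 * len(pts))             # window length is L/N (assumed < 1)
    total = 0.0
    for i in range(grid):
        x = (i + 0.5) / grid                # midpoint rule in x
        total += (count_in_window(pts, x, half) - L) ** 2
    return total / grid

N, L = 5_000, 5
rng = random.Random(1)                      # fixed seed for reproducibility
var_random = number_variance([rng.random() for _ in range(N)], L)
var_lattice = number_variance([n / N for n in range(1, N + 1)], L)
print(var_random, var_lattice)              # roughly L for random points, 0 for the lattice
```

The grid spacing is chosen much smaller than the window length $L/N$, so that the piecewise constant integrand is sampled adequately.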
Theorem 1·1.
, and let I be a bounded interval. Assume that
. Then (1·1) holds with high probability w.r.t.
: for any
, we have

It is desirable to extend this to an almost sure statement, which we are able to establish in a narrower regime
(along with a technical condition on the oscillations of L, which clearly holds for natural choices of L, e.g., when
Theorem 1·2.
, and assume that
and that
. Then for almost all
, we have

For slowly growing L (and under an even milder condition on its oscillations), we will be able to establish a central limit theorem for
. This would hold for example when
$L=\left(\log N\right)^{t}$
Theorem 1·3.
such that for all
we have
, and assume that there exists
such that
. Then for almost all
, for any
, we have

We remark that while the condition $L\to\infty$ in Theorem 1·3 is essential, Theorems 1·1 and 1·2 also hold for fixed L, thus extending the results of [14].
We would like to stress the difference between (1·1) and some weaker notions of long-range Poissonian correlations, as studied, e.g., in [5, 7, 17]. Note that the number variance can be expressed in terms of the pair correlation function. Indeed, a direct calculation shows (see, e.g., [11]) that


is the (scaled) pair correlation function of
$\left(\alpha a_{n}\right)_{n=1}^{\infty}$
with respect to the test function

Hence, (1·1) is equivalent to

We thus see that (1·4), and therefore (1·1), is a significantly stronger statement than long-range Poissonian pair correlation in the sense of
, where the error term is insufficient for determining the asymptotics of the number variance. Similarly, for
, consider the k-level correlation function

Proposition 4·1, which is the main ingredient in the proof of Theorem 1·3, is notably stronger than long-range Poissonian higher correlations in the sense of
, which would be insufficient for concluding Theorem 1·3.
2. The number variance
By the Poisson summation formula, we have the following identity for the pair correlation function (1·3)



(we have used the standard notation $e\!\left(z\right)=e^{2\pi iz}$).
We fix a smooth, compactly supported, non-negative weight function $\rho\in C_{c}^{\infty}\left(\mathbb{R}\right)\!,\,\rho\ge0$, and denote the weighted $L^{2}$-norm of


the aim of the rest of this section is to give an upper bound for
We first observe the following identity which will be useful for estimating sums involving
Lemma 2·1. Let
$1\le L<N$
. We have

Proof. Let
. By the Poisson summation formula we have

In the next lemma, we will see that up to an error term of order
the ranges of the summations defining
can be significantly restricted.
Lemma 2·2. Let $1\le L<N$ and let $\epsilon > 0$. We have


Proof. We have

where we bounded the first summation using the bound
and the second summation using
$\widehat{\rho}\left(x\right)\ll x^{-k}$
for all
. Thus,

where in the last equality we used Lemma 2·1. Finally, by bounding
trivially and applying the bound
$\widehat{\Delta}\!\left(x\right)\ll x^{-2}$
, we have

which concludes the proof.
We will now analyse when the summation defining
does not vanish.
Proposition 2·3.
such that
$0<\left|n_{1}\right|\le N^{4}$
, and
such that
$1\le y_{1}<x_{1}\le N$
. Then there exist at most
$O\!\left(N^{\epsilon}\log N\right)$
values of
such that
$0<\left|n_{2}\right|\le N^{4}$
$x_{2}\le x_{1},$
$1\le y_{2}<x_{2}\le N$
, and

Proof. We follow the ideas of [14, 15]. We have

on the other hand,

Applying the reverse triangle inequality to (2·2), we have

substituting the estimates (2·3) and (2·4), we obtain

, we have
$a_{x_{1}}\ge a_{1}C^{x_{1}-1}>a_{1}C^{N^{1/4}-1}$
, and hence

and therefore for sufficiently large N we have

so that $x_{1}-x_{2}\ll\log N$. Thus, there are at most $O\!\left(\log N\right)$ possible values for $x_{2}$, and moreover we conclude that $x_{2}\gg N^{1/4}$, and hence $a_{x_{2}}\gg C^{N^{1/4}}.$
We now fix
. Since

we see that for sufficiently large N, the integer
is uniquely determined by the values of
. It is therefore sufficient to bound the number of possible values of
. There are
$O\!\left(\log N\right)$
values of
such that
. We will therefore count the number of possible values of
such that
. For such
we have

and therefore

Hence, given
such that
, we have

and since
$\left|n_{2}\right|\le N^{4}$
we conclude that

so in fact
. We therefore see that the value of
is identical for each
such that
. But, for such
, (2·2) gives

so that
lies in an interval of length
, and since

there could be at most
values of
in this interval.
As an immediate corollary of Lemma 2·2 and Proposition 2·3, we obtain an upper bound for
Corollary 2·4. Let $1\le L<N$ and let $\epsilon > 0$. We have

Proof. We use the bound
and Lemma 2·2 to conclude that

By symmetry we can assume that
$x_{2}\le x_{1}$
, so that by the bound
and by Proposition 2·3, the inner summation is
$O\!\left(N^{\epsilon/2}\log N\right)$
. Hence, Lemma 2·1 gives the required bound (2·5).
3. Proofs of Theorems 1·1 and 1·2
We are now ready to prove Theorem 1·1.
Proof of Theorem 1·1. By (1·2) and (2·1) we have

Hence, for sufficiently large N we have

Denote by $ \chi_I $ the characteristic function of the interval I, and fix a smooth, compactly supported weight function $\rho\in C_{c}^{\infty}\left(\mathbb{R}\right)$ such that $\rho\left(x\right)\ge\chi_{I}\left(x\right)$ for all $ x\in \mathbb{R} $. By Chebyshev’s inequality we conclude that for sufficiently large N we have

where we used (2·5) with
We now turn to the proof of Theorem 1·2, that is, we will show that (1·1) (or equivalently (1·4)) holds for almost all $\alpha\in\mathbb{R}$. It is sufficient to prove this for $\alpha\in I$ where I is a bounded interval. We first show that almost sure convergence of the pair correlation holds along a subsequence.
Lemma 3·1.
Let I be a bounded interval, and let
. Assume that
. Let
, and denote
. Then for almost all
$\alpha\in I$
, we have

Proof. Applying (2·5) as in (3·1), for every
and N sufficiently large we have

so that

the asymptotic (3·2) thus holds for almost all
$\alpha\in I$
by the Borel–Cantelli lemma.
We now have all that is needed to prove Theorem 1·2.
Proof of Theorem 1·2. It is sufficient to show that

for almost all
$\alpha\in I$
. Let
. For any N there exists m such that
$N_{m-1}\le N<N_{m}$
. Moreover,
, and by the assumption

we have

thus, there exists a constant
such that for any
, for sufficiently large N we have

by applying Lemma 3·1 with
instead of L (and therefore with
instead of
), as
we have

for all
$\alpha\in I_{\delta}$
, where $I_{\delta}$ is a full measure set in I;
note that
by the assumption
. Hence, for sufficiently large N we have

for all
$\alpha\in I_{\delta}$
. Symmetrically,

in a full measure set in I (depending on $\delta$); since $\delta$ can be taken arbitrarily small along a countable sequence of values, and a countable intersection of full measure sets is still of full measure, the bounds (3·5) and (3·6) imply (3·4) for almost all $\alpha\in I$.
Remark. The faster L grows, the sparser the subsequence $N_{m}$
one has to take in order to apply the Borel–Cantelli lemma in the proof of Lemma 3·1. On the other hand, since we require the condition
the subsequence $N_{m}$ cannot be too sparse. For example, if
$N_{m}=\lfloor m^{t}\rfloor$
, one needs
for (3·3) to hold, but also
, so that
. This explains why the above argument only works for L growing slower than $N^{1/2}$.
4. Higher order correlations – proof of Theorem 1·3
Taking expectations w.r.t. x, for all
, we have


we have (see [3, lemma 13])

For $0\le j\le k,$
denote by
$\begin{Bmatrix}k\\[5pt] j\end{Bmatrix}$
the Stirling number of the second kind, i.e., the number of ways to partition a set of k elements into j non-empty subsets. We partition the sum over
on the right-hand side of (4·1) into sums with j distinct indices. The term corresponding to $j=1$ is clearly equal to L. Recalling the definition (1·5) of the j-level correlation functions, we then have

In view of Lemma A·1, Theorem 1·3 will be a direct consequence of the following proposition.
Proposition 4·1.
such that for all
we have
, and assume that there exists
such that
. Then for almost all
, we have

for all
and all
We apply the following strategy for proving Proposition 4·1. We first prove an analogous result with a smooth test function along a subsequence. We then unsmooth along the subsequence, and finally deduce the result along the full sequence. We would like to use the results of [2], and for that it would be more convenient to work with a “transformed” correlation function: for
and for a smooth, compactly supported function
(which may depend on N), we denote the smoothed k-level correlation function

and the transformed smoothed k-level correlation function



For the transformed correlation function we have the following $L^{2}$-norm estimate: let I be a bounded interval, and let


Lemma 4·2.
. For each
there exists
such that

$\left\Vert \psi\right\Vert _{r,1}=\sum\limits _{\left|\alpha\right|\le r}\left\Vert \partial^{\alpha}\psi\right\Vert _{1}$
Proof. For a smooth, compactly supported, non-negative weight function
$\rho\in C_{c}^{\infty}\left(\mathbb{R}\right)\!,\,\rho\ge0$
, denote

Then proposition 7 in [2] implies that for each
there exists
such that

while the term $\left\Vert \psi\right\Vert _{r,1}^{2}$ is not explicitly stated there, it follows from the proof, which we now sketch (for the full details we refer the reader to [2]): for

by the Poisson summation formula, we have

and hence

where the range of the summation
$\sum\limits ^{*}$
is over
$1\le x_{1},\dots,x_{k}\le N$
are distinct, and
$1\le y_{1},\dots,y_{k}\le N$
are distinct. Fix
; by splitting the summation over n, m into different ranges and using the bounds
$\left|\widehat{\psi}\right|\le\left\Vert \psi\right\Vert _{1}\le\left\Vert \psi\right\Vert _{r,1}$ and $\left|\widehat{\psi}\right|\ll\left\Vert \psi\right\Vert _{r,1}\left\Vert x\right\Vert _{\infty}^{-r}$ (for arbitrarily large r), we obtain

The contribution of the first two terms is negligible by a trivial estimate, and so is the contribution of the third term restricted to the range
(choosing r sufficiently large depending on
). The rest of the contribution from the third term is then bounded by [2, proposition 2], which states that there are at most
values of n, m, x, y in the above ranges such that
$\left|n\cdot\Delta_{\left(a_{n}\right)}\left(x\right)-m\cdot\Delta_{\left(a_{n}\right)}\left(y\right)\right|\le N^{\epsilon}$
, which gives (4·6).
Finally, if we choose
such that
$\rho\ge \chi_{I}$
, then

and (4·5) follows.
, and let
be as in Lemma 4·2; let
, and assume that
$\psi\in C_{c}^{\infty}\left(\mathbb{R}^{k-1}\right)$
is a smooth approximation to
, such that
$\left\Vert \Delta-\psi\right\Vert _{\infty}\ll\delta$
and such that
$\left\Vert \psi\right\Vert _{r,1}\ll\delta^{-r}$
. By (4·2), if L grows slower than any power of N, then

Moreover, we have

We deduce almost sure convergence along a subsequence.
Lemma 4·3.
be such that for all
we have
and let
. Let
$N_{m}=\lfloor m^{1+\epsilon}\rfloor$
, and denote
. Then for almost all
$\alpha\in I$
, we have

for all
Proof. It is sufficient to show that for any fixed
, (4·7) holds for almost all
$\alpha\in I$
. By identity (4·4), Lemma 4·2, and the upper bound on L, for each
there exists
such that

Hence, by the Borel–Cantelli lemma, for
sufficiently small we have

for almost all
$\alpha\in I$
, and in particular

where we used again the upper bound on L.
be approximations to
satisfying the above assumptions such that
; a simple way to construct such approximations is to convolve the functions

, where
, and
$\varphi\in C_{c}^{\infty}\left(\mathbb{R}^{k-1}\right)$
is the standard mollifier. We then have

substituting the asymptotics (4·8), we conclude that (4·7) holds for almost all
$\alpha\in I$
We are now ready to prove Proposition 4·1.
Proof of Proposition 4·1. The argument is similar to that of the proof of Theorem 1·2. Let
; it is enough to show that for almost all
$\alpha\in I$
, we have

for all
. Let
$N_{m}=\lfloor m^{1+\epsilon/2}\rfloor$
, so that for any N there exists m such that
$N_{m-1}\le N<N_{m}$
. Moreover,
, and by the assumption

we have

hence, there exists a constant
such that for sufficiently large N we have

by the upper bound on L and Lemma 4·3 with
instead of L, for almost all
$\alpha\in I$
we have

for all
. Thus, for sufficiently large N we have (using again the upper bound on L), for almost all
$\alpha\in I$
we have

for all
. Similarly, for almost all
$\alpha\in I$
we have

Acknowledgements. We thank Jens Marklof and Zeév Rudnick for stimulating discussions and for their comments.
Appendix A. Normal approximation to the Poisson distribution
We require a normal approximation to a random variable whose moments are close to the Poisson moments. Denote

the kth moment of a Poisson-distributed random variable with parameter L, and

the kth moment of a standard Gaussian random variable.
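The two families of moments can be computed explicitly, which may help when checking small cases by hand. The Python sketch below assumes the classical closed forms: the kth moment of a Poisson variable with parameter L is $\sum_{j}\begin{Bmatrix}k\\ j\end{Bmatrix}L^{j}$, and the kth moment of a standard Gaussian is $(k-1)!!$ for even k and zero for odd k. The function names are hypothetical and are not taken from the paper.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def stirling2(k, j):
    """Number of partitions of a k-element set into j non-empty subsets."""
    if k == j:
        return 1                # includes the empty partition of the empty set
    if j == 0 or j > k:
        return 0
    # Either the last element joins one of j existing blocks, or forms a new block.
    return j * stirling2(k - 1, j) + stirling2(k - 1, j - 1)

def poisson_moment(k, L):
    """k-th raw moment of a Poisson(L) variable: sum over j of S(k, j) * L^j."""
    return sum(stirling2(k, j) * L**j for j in range(k + 1))

def gaussian_moment(k):
    """k-th moment of a standard Gaussian: (k-1)!! for even k, 0 for odd k."""
    if k % 2 == 1:
        return 0
    result = 1
    for odd in range(1, k, 2):
        result *= odd
    return result

print([stirling2(4, j) for j in range(5)])      # Stirling numbers for k = 4
print(poisson_moment(3, 2))                     # L + 3L^2 + L^3 at L = 2
print([gaussian_moment(k) for k in range(7)])   # moments of N(0, 1)
```

With integer L the Poisson moments are computed exactly in integer arithmetic; the same function accepts floats or `fractions.Fraction` values for L.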
Lemma A·1.
, and let
be a sequence of random variables such that for all
and for all
we have

. Then

, where
is the standard Gaussian distribution.
Proof. Let $\widehat{X_{N}}=\frac{X_{N}-L}{\sqrt{L}}$. It is sufficient to prove that for all $k\ge1$ we have $\lim\limits _{N\to\infty}\mathbb{E}\left[\widehat{X_{N}}^{k}\right]=\mu_{k}^{normal}$. By (A·1), we have

so we have to show that for all
we have

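Let $Y_{L}$ be a Poisson-distributed random variable with parameter L and $\widehat{Y_{L}}=\frac{Y_{L}-L}{\sqrt{L}}$; we have to show that for all $k\ge1$ we have $\lim\limits _{N\to\infty}\mathbb{E}\left[\widehat{Y_{L}}^{k}\right]=\mu_{k}^{normal}$. Let $M_{L}$ be the moment-generating function of $\widehat{Y_{L}}$. Then for any t we have

$$M_{L}\left(t\right)=\mathbb{E}\left[e^{t\widehat{Y_{L}}}\right]=\exp\left(L\left(e^{t/\sqrt{L}}-1\right)-t\sqrt{L}\right)\xrightarrow[L\to\infty]{}e^{t^{2}/2},\qquad\text{(A·3)}$$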
so that the limit is the moment-generating function of a standard Gaussian random variable. Since the convergence in (A·3) is uniform in a complex neighbourhood of $t=0$ and all the functions involved are (complex) analytic, convergence of the moments (which can be expressed as the derivatives of the moment-generating function evaluated at zero) easily follows from Cauchy’s integral formula.
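The convergence used in this proof can be observed numerically. The following Python sketch is an illustration only, under the assumption that the standardisation is $(Y-L)/\sqrt{L}$ for a Poisson variable Y with parameter L; the function names and the truncation width of the summation are hypothetical choices. It evaluates standardised Poisson moments by direct summation and the closed-form moment-generating function $\exp\left(L\left(e^{t/\sqrt{L}}-1\right)-t\sqrt{L}\right)$.

```python
import math

def standardised_poisson_moment(k, L, width=60.0):
    """E[((Y - L)/sqrt(L))^k] for Y ~ Poisson(L), by truncated direct summation."""
    s = math.sqrt(L)
    lo = max(0, int(L - width * s))
    hi = int(L + width * s) + 1
    total = 0.0
    for j in range(lo, hi + 1):
        # log of the Poisson probability mass, computed stably for large L
        log_pmf = j * math.log(L) - L - math.lgamma(j + 1)
        total += math.exp(log_pmf) * ((j - L) / s) ** k
    return total

def standardised_poisson_mgf(t, L):
    """Moment-generating function of (Y - L)/sqrt(L) for Y ~ Poisson(L)."""
    s = math.sqrt(L)
    # expm1 avoids cancellation in e^{t/s} - 1 when t/s is small
    return math.exp(L * math.expm1(t / s) - t * s)

for L in (1, 100, 10_000):
    moments = [standardised_poisson_moment(k, L) for k in (2, 3, 4)]
    print(L, moments)           # approaches the Gaussian values 1, 0, 3 as L grows

print(standardised_poisson_mgf(1.0, 1e6), math.exp(0.5))
```

For reference, the exact standardised moments of orders 2, 3, 4 are $1$, $L^{-1/2}$ and $3+L^{-1}$, so the deviation from the Gaussian values decays like a power of L.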