Inference on the intraday spot volatility from high-frequency order prices with irregular microstructure noise

Markus Bibinger

doi:10.1017/jpr.2023.96

Part of: Markov processes; Inference from stochastic processes; Limit theorems

Published online by Cambridge University Press: 14 February 2024

Affiliation: Julius-Maximilians-Universität Würzburg
Postal address: Chair of Applied Stochastics, Faculty of Mathematics and Computer Science, Institute of Mathematics, Julius-Maximilians-Universität Würzburg, Emil-Fischer-Straße 30, 97074 Würzburg, Germany. Email: [email protected]

Abstract

We consider estimation of the spot volatility in a stochastic boundary model with one-sided microstructure noise for high-frequency limit order prices. Based on discrete, noisy observations of an Itô semimartingale with jumps and general stochastic volatility, we present a simple and explicit estimator using local order statistics. We establish consistency and stable central limit theorems as asymptotic properties. The asymptotic analysis builds upon an expansion of tail probabilities for the order statistics based on a generalized arcsine law. In order to use the involved distribution of local order statistics for a bias correction, an efficient numerical algorithm is developed. We demonstrate the finite-sample performance of the estimation in a Monte Carlo simulation.

Keywords

Arcsine law; limit order book; market microstructure; nonparametric boundary model; volatility estimation

MSC classification

Primary: 62M09: Non-Markovian processes

Secondary: 60J65: Brownian motion; 60F05: Central limit and other weak theorems

Type: Original Article
Information: Journal of Applied Probability, Volume 61, Issue 3, September 2024, pp. 858-885

DOI: https://doi.org/10.1017/jpr.2023.96
Copyright: © The Author(s), 2024. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Time series of intraday prices are typically described as a discretized path of a continuous-time stochastic process. To have arbitrage-free markets the log-price process should be a semimartingale. Risk estimation based on high-frequency data at the highest available observation frequencies has to take microstructure frictions into account. Disentangling these market microstructure effects from the dynamics of the long-run price evolution has led to observation models with additive noise; see, for instance, [Reference Aït-Sahalia and Jacod2, Reference Hansen and Lunde13, Reference Li and Linton19]. The market microstructure noise, modelling among other effects the oscillation of traded prices between bid and ask order levels in an electronic market, is classically a centred (white) noise process with expectation equal to zero. These models can explain many stylized facts of high-frequency data. Having available full limit order books including data of submissions, cancellations, and executions of bid and ask limit orders, however, it is not clear which time series to consider at all. While challenging the concept of one price process it raises the question of whether the information can be exploited more efficiently, in particular to improve risk quantification. The stochastic boundary model considered for limit order prices of an order book has been discussed by [Reference Bibinger, Jirak and Reiß5], [Reference Liu, Liu, Liu and Ding20], and [Reference Bishwal8, Chapter 1.8]. It preserves the concept of an underlying efficient, semimartingale log-price which determines the long-run price dynamics and an additive, exogenous noise which models market-specific microstructure frictions. Its key idea is that ask order prices should (in most cases) lie above the unobservable efficient price and bid prices below the efficient price. 
This leads to observation errors which are irregular in the sense of having non-zero expectation and a distribution with a lower- or upper-bounded support. Considering without loss of generality a model for (best) ask order prices, we obtain lower-bounded observation errors and use local minima for the estimation. Modelling (best) bid prices instead would yield a model with upper-bounded observation errors and local maxima could be used for an analogous estimation. Both can be combined in practice.

It is known that the statistical and probabilistic properties of models with irregular noise are very different than for regular noise and require other methods; see, for instance, [Reference Jirak, Meister and Reiß17, Reference Meister and Reiß23, Reference Reiß and Wahl24]. Therefore, our estimation methods and asymptotic theory are quite different compared to the market microstructure literature, while we can still profit from some of the techniques used there. In [Reference Bibinger, Jirak and Reiß5] an estimator for the quadratic variation of a continuous semimartingale, that is, the integrated volatility, was proposed with convergence rate $n^{-1/3}$ , based on n discrete observations with one-sided noise. Optimality of the rate was proved in the standard asymptotic minimax sense. The main insight was that this convergence rate is better than the optimal rate, $n^{-1/4}$ , under regular market microstructure noise.

A recent strand of literature proposes structural, parametric market microstructure noise models incorporating information based on observed order book quantities such as volume or trade types; see [Reference Chaker9–Reference Clinet and Potiron11, Reference Li, Xie and Zheng18]. Splitting the noise into a parametric function of such quantities and residual noise, a plug-in estimation of integrated volatility can also yield faster convergence rates than in the classical model with uninformative noise. While this effect of improved volatility estimation appears to be a similarity to our work, our viewpoint on market microstructure is quite distinct. We focus on a model with one-sided instead of centred noise, but we neither impose a parametric assumption on the noise, nor do we include additional trading information. Such refinements of a one-sided noise model, as discussed in the mentioned works for the centred noise model, might be of interest for future research when microstructure effects of bid and ask quotes are better understood. This could potentially further improve volatility estimation.

Inference on the spot volatility is one of the most important topics in the financial literature; see, for instance, [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Mancini, Mattiussi and Renò22] and the references therein. In this work, we address spot volatility estimation for the model from [Reference Bibinger, Jirak and Reiß5]. Using local minima over blocks of shrinking lengths $h_n\propto n^{-2/3}\propto (nh_n)^{-2}$ , the resulting distribution of local minima in [Reference Bibinger, Jirak and Reiß5] became involved and infeasible, such that a central limit theorem for the integrated volatility estimator could not be obtained. Our spot volatility estimator is related to a localized version of the estimator from [Reference Bibinger, Jirak and Reiß5], combined with truncation methods to eliminate jumps of the semimartingale. For the asymptotic theory, however, we follow a different approach choosing blocks of lengths $h_n$ , where $h_n n^{2/3}\to\infty$ slowly. This allows us to establish stable central limit theorems with the best achievable rate, arbitrarily close to $n^{-1/6}$ , in the important special case of a semimartingale volatility. We exploit this to construct pointwise asymptotic confidence intervals.

Although the asymptotic theory relies on block lengths that are slightly unbalanced by smoothing out the impact of the noise distribution on the distribution of local minima asymptotically, our numerical study demonstrates that the confidence intervals work well in realistic scenarios with block lengths which optimize the estimation performance. Robustness to different noise specifications is an advantage that is naturally implied by our approach. Our estimator is surprisingly simple: it is a local average of squared differences of block-wise minima times a constant factor which comes from moments of the half-normal distribution of the minimum of a Brownian motion over the unit time interval. This estimator is consistent. However, the stable central limit theorem at a fast convergence rate requires a subtle bias correction which incorporates a more precise approximation of the asymptotic distribution of local minima. For that purpose, our analysis is based on a generalization of the arcsine law which gives the distribution of the proportion of time over some interval that a Brownian motion is positive. In order to compute the bias-correction function numerically, we introduce an efficient algorithm. Reducing local minima over many random variables to iterated minima of two random variables in each step combined with a convolution step can be interpreted as a kind of dynamic programming approach. It turns out to be much more efficient compared to the natural approximation by a Monte Carlo simulation and is a crucial ingredient of our numerical application. Our convergence rate is much faster than the optimal rate, $n^{-1/8}$, for spot volatility estimation under regular noise [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Hoffmann, Munk and Schmidt-Hieber14]. The main contribution of this work is to develop the probabilistic foundation for the asymptotic analysis of the estimator and to establish the stable central limit theorems, asymptotic confidence intervals, and a numerically practicable method.

The methods and proof techniques to deal with jumps are inspired by the truncation methods pioneered in [Reference Mancini21] and summarized in [Reference Jacod and Protter15, Chapter 13]. Overall, the strategy and restrictions on jump processes are to some extent similar, while several details under irregular noise using order statistics are rather different compared to settings without noise or with regular centred noise as in [Reference Bibinger and Winkelmann7].

We introduce and further discuss our model in Section 2. Section 3 presents estimation methods and Section 4 asymptotic results. The numerical application is considered in Section 5 and a Monte Carlo simulation study illustrates the appealing finite-sample performance of the method. All proofs are given in Section 6.

2. Model with lower-bounded, one-sided noise and assumptions

Consider an Itô semimartingale

(1)

\begin{align} X_t & = X_0 + \int_0^t a_s\,\textrm{d} s + \int_0^t\sigma_s\,\textrm{d} W_s + \int_0^t\int_{\mathbb{R}}\delta(s,z)\mathbf{1}_{\{|\delta(s,z)|\leq 1\}}\,(\mu-\nu)(\textrm{d} s,\textrm{d} z) \nonumber \\[5pt] & \quad + \int_0^t\int_{\mathbb{R}}\delta(s,z)\mathbf{1}_{\{|\delta(s,z)|> 1\}}\,\mu(\textrm{d} s,\textrm{d} z), \qquad t\ge 0, \end{align}

with a one-dimensional standard Brownian motion $(W_t)$ , defined on some filtered probability space $(\Omega^X,\mathcal{F}^X,(\mathcal{F}^X_t),\mathbb{P}^X)$ . For the drift process $(a_t)$ and the volatility process $(\sigma_t)$ we impose the following quite general assumptions.

Assumption 1. The processes $(a_t)_{t\ge 0}$ and $(\sigma_t)_{t\ge 0}$ are locally bounded. The volatility process is strictly positive, $\inf_{t\in[0,1]}\sigma_t>0$ , $\mathbb{P}^X$ -almost surely. For all $0\leq t+s\leq1$ , $t\ge 0$ , $s\ge 0$ , with some constants $C_{\sigma}>0$ , and $\alpha>0$ ,

(2)

\begin{equation} \mathbb{E}[(\sigma_{t+s}-\sigma_{t})^2] \le C_{\sigma}s^{2\alpha}. \end{equation}

Condition (2) introduces a regularity parameter $\alpha$ , governing the smoothness of the volatility process. The parameter $\alpha$ is crucial, since it will naturally influence the convergence rates of spot volatility estimation. Inequality (2) is less restrictive than $\alpha$ -Hölder continuity, since it does not rule out volatility jumps. For instance, any compound Poisson jump process with a jump size distribution having finite second moments satisfies (2) with $\alpha=\frac12$ . Since second moments in (2) of such a process are bounded by a constant times $(s^2+s)$ , i.e. the second moment of a Poisson distribution with parameter s, this readily follows. Similar bounds for more general jump processes are given, for instance, in [Reference Jacod and Protter15, Section 13]. This is important as empirical evidence for volatility jumps, in particular simultaneous price and volatility jumps, has been reported for intraday high-frequency financial data [Reference Bibinger, Neely and Winkelmann6, Reference Tauchen and Todorov28]. The presented theory is, moreover, for general stochastic volatilities, also allowing for rough volatility. Rough fractional stochastic volatility models recently became popular and are used, for instance, in the macroscopic model of [Reference El Euch, Fukasawa and Rosenbaum12, Reference Rosenbaum and Tomas25].
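The bound in the compound Poisson example can be spelled out as follows. As an illustration only, write the increment of such a process over $(t,t+s]$ as $\sum_{j=1}^{N}\xi_j$, where $N$ is Poisson distributed with parameter $\lambda s$ and the jump sizes $\xi_j$ are i.i.d. and independent of $N$; the symbols $N$, $\lambda$, and $\xi_j$ are notation assumed here and are not part of the model above. The Cauchy–Schwarz inequality then gives

\begin{align*} \mathbb{E}\Bigg[\Bigg(\sum_{j=1}^{N}\xi_j\Bigg)^2\Bigg] \le \mathbb{E}\Bigg[N\sum_{j=1}^{N}\xi_j^2\Bigg] = \mathbb{E}\big[\xi_1^2\big]\,\mathbb{E}\big[N^2\big] = \mathbb{E}\big[\xi_1^2\big]\,(\lambda s+\lambda^2s^2) \le \mathbb{E}\big[\xi_1^2\big]\,(\lambda+\lambda^2)\,s, \qquad 0\le s\le 1, \end{align*}

which is (2) with $2\alpha=1$.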

The jump component of (1) is illustrated as in [Reference Jacod and Protter15] and related literature, where the predictable function $\delta$ is defined on $\Omega\times \mathbb{R}_+\times \mathbb{R}$ , and the Poisson random measure $\mu$ is compensated by $\nu(\textrm{d} s,\textrm{d} z)=\lambda(\textrm{d} z)\otimes \textrm{d} s$ , with a $\sigma$ -finite measure $\lambda$ . We impose the following standard condition with a generalized Blumenthal–Getoor or jump activity index r, $0\le r\le 2$ .

Assumption 2. Assume that $\sup_{\omega,x}|\delta(t,x)|/\gamma(x)$ is locally bounded with a non-negative, deterministic function $\gamma$ which satisfies $\int_{\mathbb{R}}(\gamma^r(x)\wedge 1)\,\lambda(\textrm{d} x)<\infty$ .

We use the notation $a\wedge b=\min\!(a,b)$ , and $a\vee b=\max\!(a,b)$ , throughout this paper. Assumption 2 is most restrictive in the case $r=0$ , when jumps are of finite activity. The larger r is, the more general jump components are allowed. We will develop results under mild restrictions on r.

The process $(X_t)$ , which can be decomposed into

(3)

\begin{equation} X_t=C_t+J_t,\end{equation}

with a continuous component $(C_t)$ and a càdlàg jump component $(J_t)$ , provides a model for the latent efficient log-price process in continuous time.

High-frequency (best) ask order prices from a limit order book at times $t_i^n$, $0\le i\le n$, on the fixed time interval [0, 1] cannot be adequately modelled by discrete recordings of $(X_t)$. Instead, we propose the additive model with lower-bounded, one-sided microstructure noise:

(4)

\begin{equation} Y_i = X_{t_i^n} + \varepsilon_i, \qquad i=0,\ldots,n,\ \varepsilon_i\stackrel{\textrm{iid}}{\sim}F_{\eta},\ \varepsilon_i\ge 0.\end{equation}

The crucial property of the model is that the support of the noise is lower bounded. It is not that important that this boundary is zero—it could be a different constant, or even a regularly varying function over time. The methods and results presented are robust with respect to such model generalizations. We set the bound equal to zero, which appears to be the most natural choice for limit orders.
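To make the observation model (4) concrete, the following Python sketch simulates noisy ask quotes under simplifying assumptions that are not part of the general model: constant volatility, no drift, no jumps, equidistant times $t_i^n=i/n$, and exponentially distributed noise with rate $\eta$. The function name and its parameters are hypothetical and chosen for illustration only.

```python
import math
import random


def simulate_ask_prices(n, sigma, eta, seed=0):
    """Simulate Y_i = X_{i/n} + eps_i for i = 0, ..., n.

    Assumptions (illustration only): X = sigma * W with constant sigma,
    no drift and no jumps; eps_i i.i.d. exponential with rate eta, so eps_i >= 0.
    """
    rng = random.Random(seed)
    x = 0.0
    efficient, observed = [], []
    for i in range(n + 1):
        if i > 0:
            # Brownian increment over a time step of length 1/n
            x += sigma * rng.gauss(0.0, 1.0) / math.sqrt(n)
        efficient.append(x)
        observed.append(x + rng.expovariate(eta))
    return efficient, observed
```

By construction, every simulated quote lies on or above the efficient price, which is the one-sided property that the estimation exploits.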

Assumption 3. The independent and identically distributed (i.i.d.) noise $(\varepsilon_i)_{0\le i\le n}$ has a cumulative distribution function (CDF) $F_{\eta}$ satisfying

(5)

\begin{equation} F_{\eta}(x) = \eta x(1+{\scriptstyle{\mathcal{O}}}(1))\quad\textit{as}\ x\downarrow 0. \end{equation}

This is a nonparametric model in that the extreme value index is $-1$ for the minimum domain of attraction close to the boundary. This standard assumption on one-sided noise has already been used in [Reference Jirak, Meister and Reiß17, Reference Reiß and Wahl24] within different frameworks. We do not require assumptions about the maximum domain of attraction, moments, and the tails of the noise distribution. Parametric examples which satisfy (5) are, for instance, the uniform distribution on some interval, the exponential distribution, and the standard Pareto distribution with heavy tails.
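As a small numerical illustration of (5), the following Python sketch evaluates the ratio $F_{\eta}(x)/(\eta x)$ near zero for two of the parametric examples, the exponential distribution with rate $\eta$ and the uniform distribution on $[0,b]$ (for which $\eta=1/b$). The function names and the chosen parameter values are assumptions made for this example.

```python
import math


def exponential_cdf(x, eta):
    """CDF of the exponential distribution with rate eta, computed stably near 0."""
    return -math.expm1(-eta * x)


def uniform_cdf(x, b):
    """CDF of the uniform distribution on [0, b]."""
    return min(max(x / b, 0.0), 1.0)


eta = 2.0
for x in (1e-1, 1e-2, 1e-3):
    # The ratio F(x) / (eta * x) should approach 1 as x decreases to 0.
    print(x, exponential_cdf(x, eta) / (eta * x))
```

For the uniform distribution the ratio equals one exactly for all $x\in(0,b]$, while for the exponential distribution it approaches one only as $x\downarrow 0$.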

The i.i.d. assumption on the noise is crucial, and generalizations to weakly dependent noise will require considerable work and new proof concepts. Heterogeneity, that is, a time-dependent noise level $\eta(t)$ , could be included in our asymptotic analysis under mild assumptions.

3. Construction of spot volatility estimators

We partition the observation interval [0, 1] into $h_n^{-1}$ equispaced blocks, $h_n^{-1}\in\mathbb{N}$ , and take local minima on each block. We hence obtain, for $k=0,\ldots,h_n^{-1}-1$ , the local, block-wise minima

\begin{equation*} m_{k,n} = \min_{i\in\mathcal{I}_k^n}Y_i, \qquad \mathcal{I}_k^n = \{i\in\{0,\ldots,n\}\colon t_i^n\in[kh_n,(k+1)h_n)\}.\end{equation*}
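The block-wise minima can be computed in a single pass over the data. The following Python sketch assumes equidistant observation times $t_i^n=i/n$; the function name is hypothetical. In line with the half-open blocks in the definition of $\mathcal{I}_k^n$, the last observation at time 1 is not assigned to any block.

```python
import math


def block_minima(y, n_blocks):
    """Block-wise minima m_{k,n}, k = 0, ..., n_blocks - 1, of y[0], ..., y[n].

    Assumes observation times t_i = i / n and n_blocks = h_n^{-1} equispaced blocks.
    Blocks without observations keep the value +inf.
    """
    n = len(y) - 1
    minima = [math.inf] * n_blocks
    for i in range(n):  # i = n corresponds to t = 1, which lies in no block
        k = i * n_blocks // n  # t_i in [k h_n, (k + 1) h_n)
        minima[k] = min(minima[k], y[i])
    return minima
```

Integer arithmetic is used for the block index to avoid rounding issues at block boundaries.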

While $h_n^{-1}$ is an integer, $nh_n$ is in general not integer valued. For a simple interpretation, however, we can think of $nh_n$ as an integer-valued sequence which gives the number of noisy observations per block in the case of equidistant observations. A spot volatility estimator could be obtained as a localized version of the estimator from [Reference Bibinger, Jirak and Reiß5, (2.9)] for the integrated volatility in the analogous model. The idea is that differences $m_{k,n}-m_{k-1,n}$ of local minima estimate differences of efficient prices, and a sum of squared differences can be used to estimate the volatility. However, things are not that simple. To determine the expectation of squared differences of local minima we introduce the function

(6)

\begin{equation} \Psi_n(\sigma^2) = \frac{\pi}{2(\pi-2)}h_n^{-1}\mathbb{E} \Big[\Big(\min_{i\in\{0,\ldots,nh_n-1\}}(\sigma B_{{i}/{n}}+\varepsilon_i) - \min_{i\in\{1,\ldots,nh_n\}}(\sigma \tilde B_{{i}/{n}}+\varepsilon_i)\Big)^2\Big],\end{equation}

where $(B_t)$ and $(\tilde B_t)$ denote two independent standard Brownian motions. In [Reference Bibinger, Jirak and Reiß5], the block length balanced the order of block-wise minimal errors $(nh_n)^{-1}$ under (5) and the order $h_n^{1/2}$ of the movement of the stochastic semimartingale boundary over a block. For $h_n n^{2/3}\to \infty$ , $\Psi_n$ tends to the identity function, so we have that

(7)

\begin{equation} \Psi_n(\sigma^2) = \sigma^2+{\scriptstyle{\mathcal{O}}}(1) \quad\mbox{as}\ n\to \infty.\end{equation}

In this asymptotic regime local minima are mainly determined by local minima of the boundary process, such that the first-order approximation equals (6) when neglecting the noise $(\varepsilon_i)$ on the right-hand side. The half-normal distribution of the minimum of a Brownian motion over an interval and its moments then readily yield (7). A formal proof of (7) is contained in Step 3 of the proof of Theorem 1 in Section 6.2. Note that we defined $\Psi_n$ differently than in [Reference Bibinger, Jirak and Reiß5], e.g. in their (A.35), with the additional factor $\pi/(\pi-2)$ . By the simple asymptotic approximation in (7), we do not require $\Psi_n^{-1}$ for a consistent estimator.
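The first-order computation behind (7) can be sketched as follows, as a heuristic that neglects the noise and replaces minima over the grid by minima over the whole block. By the reflection principle, the minimum of $\sigma B$ over $[0,h_n]$ is distributed as $-\sigma\sqrt{h_n}\,|G|$ with $G\sim\mathcal{N}(0,1)$. For independent standard normal $G_1,G_2$ (notation introduced only for this illustration),

\begin{align*} \mathbb{E}\big[\big(\sigma\sqrt{h_n}\,(|G_1|-|G_2|)\big)^2\big] = 2\sigma^2h_n\,\textrm{Var}(|G_1|) = 2\sigma^2h_n\Big(1-\frac{2}{\pi}\Big) = \frac{2(\pi-2)}{\pi}\,\sigma^2h_n. \end{align*}

Multiplying by the factor $\frac{\pi}{2(\pi-2)}h_n^{-1}$ from (6) gives $\sigma^2$, the leading term in (7).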

When there are no price jumps, a simple consistent estimator for the spot squared volatility $\sigma_{\tau}^2$ is given by

(8)

\begin{equation} \hat\sigma^2_{\tau-} = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}(m_{k,n}-m_{k-1,n})^2 \end{equation}

for suitable sequences $h_n\to 0$ and $K_n\to\infty$ . Using only observations before time $\tau$ , the estimator is available online at time $\tau\in(0,1]$ during a trading day. For $\tau$ close to 0, when $\lfloor h_n^{-1}\tau\rfloor\le K_n$ , the factor $K_n^{-1}$ can be adjusted to get an average. Since this is unimportant for asymptotic theory, we keep $K_n^{-1}$ for simple notation. Working with ex post data over the whole interval, instead of using only observations before time $\tau$ , we may also use

(9)

\begin{equation} \hat\sigma^2_{\tau+} = \frac{\pi}{2(\pi-2)K_n} \sum_{k=\lfloor h_n^{-1}\tau\rfloor+1}^{(\lfloor h_n^{-1}\tau\rfloor+K_n)\wedge (h_n^{-1}-1)}h_n^{-1}(m_{k,n}-m_{k-1,n})^2,\end{equation}

or an estimator with an average centred around time $\tau\in(0,1)$ . The difference between the two estimators (9) and (8) can be used to infer a possible jump in the volatility process at time $\tau\in(0,1)$ , as in [Reference Bibinger, Neely and Winkelmann6].
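A minimal Python sketch of the estimators (8) and (9) is given below. It takes the list of block-wise minima $m_{0,n},\ldots,m_{h_n^{-1}-1,n}$ as input, so that $h_n^{-1}$ is the length of the list. The function names are hypothetical, and the sketch keeps the factor $K_n^{-1}$ also when fewer than $K_n$ blocks are available, as in the displayed formulas.

```python
import math

C2 = math.pi / (2.0 * (math.pi - 2.0))  # constant pi / (2 (pi - 2)) in (8) and (9)


def spot_vol_left(minima, tau, K):
    """Estimator (8): squared differences of block minima before time tau."""
    n_blocks = len(minima)            # h_n^{-1}
    j = math.floor(tau * n_blocks)    # floor(h_n^{-1} tau)
    ks = range(max(j - K, 1), j)      # k = (j - K) v 1, ..., j - 1
    total = sum(n_blocks * (minima[k] - minima[k - 1]) ** 2 for k in ks)
    return C2 * total / K


def spot_vol_right(minima, tau, K):
    """Estimator (9): squared differences of block minima after time tau."""
    n_blocks = len(minima)
    j = math.floor(tau * n_blocks)
    ks = range(j + 1, min(j + K, n_blocks - 1) + 1)  # k = j + 1, ..., (j + K) ^ (h_n^{-1} - 1)
    total = sum(n_blocks * (minima[k] - minima[k - 1]) ** 2 for k in ks)
    return C2 * total / K
```

Comparing `spot_vol_right(minima, tau, K)` with `spot_vol_left(minima, tau, K)` corresponds to the difference of (9) and (8) mentioned above.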

To construct confidence intervals for the spot volatility, it is useful to also establish a spot quarticity estimator:

(10)

\begin{equation} \widehat{{\sigma^4_{\tau}}}_- = \frac{\pi}{4(3\pi-8)K_n} \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-2}(m_{k,n}-m_{k-1,n})^4.\end{equation}

A spot volatility estimator that is robust with respect to jumps in $(X_t)$ is obtained with threshold versions of these estimators. We truncate differences of local minima whose absolute values exceed a threshold $u_n= \beta\cdot h_n^{\kappa}$ , $\kappa\in\big(0,\frac12\big)$ , with some positive constant $\beta$ , which leads to

(11)

\begin{equation} \hat\sigma^{2,\textrm{(tr)}}_{\tau-} = \frac{\pi}{2(\pi-2)K_n} \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1}(m_{k,n}-m_{k-1,n})^2 \mathbf{1}_{\{|m_{k,n}-m_{k-1,n}|\le u_n\}},\end{equation}

and analogous versions of the estimators (9) and (10).
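The quarticity estimator (10) and the truncated estimator (11) can be sketched in Python along the same lines. The input is again the list of block-wise minima, whose length plays the role of $h_n^{-1}$; the function names are hypothetical, and the tuning parameters `beta` and `kappa` have to be chosen by the user.

```python
import math


def truncated_spot_vol_left(minima, tau, K, beta, kappa):
    """Estimator (11): as (8), but differences exceeding u_n = beta * h_n^kappa are dropped."""
    n_blocks = len(minima)                      # h_n^{-1}
    u_n = beta * (1.0 / n_blocks) ** kappa      # truncation threshold
    j = math.floor(tau * n_blocks)
    c2 = math.pi / (2.0 * (math.pi - 2.0))
    total = 0.0
    for k in range(max(j - K, 1), j):
        d = minima[k] - minima[k - 1]
        if abs(d) <= u_n:                       # indicator in (11)
            total += n_blocks * d * d
    return c2 * total / K


def spot_quarticity_left(minima, tau, K):
    """Estimator (10): fourth powers of differences of block minima before time tau."""
    n_blocks = len(minima)
    j = math.floor(tau * n_blocks)
    c4 = math.pi / (4.0 * (3.0 * math.pi - 8.0))
    total = sum(n_blocks ** 2 * (minima[k] - minima[k - 1]) ** 4
                for k in range(max(j - K, 1), j))
    return c4 * total / K
```

With a very large `beta` no difference is truncated, and the first function reduces to the estimator (8).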

4. Asymptotic results

We establish asymptotic results for equidistant observations, $t_i^n=i/n$ . We begin with the asymptotic theory in a setup without jumps in $(X_t)$ .

Theorem 1. (Stable central limit theorem for continuous $(X_t)$ .) Set $h_n$ such that $h_n n^{2/3}\to \infty$ and $K_n=C_K h_n^{\delta -2\alpha/(1+2\alpha)}$ for arbitrary $\delta$ , $0<\delta<2\alpha/(1+2\alpha)$ , and some constant $C_K>0$ . If $(X_t)$ is continuous, i.e. $J_t=0$ in (3), under Assumptions 1 and 3, the spot volatility estimator (8) is consistent, $\hat\sigma^2_{\tau-}\stackrel{\mathbb{P}}{\rightarrow} \sigma_{\tau-}^2$ , and satisfies the stable central limit theorem

(12)

\begin{equation} K_n^{1/2}\big(\hat\sigma^2_{\tau-}-\Psi_n\big(\sigma_{\tau-}^2\big)\big) \stackrel{\textrm{st}}{\longrightarrow} \mathcal{N}\bigg(0,\frac{7\pi^2/4-2\pi/3-12}{(\pi-2)^2}\sigma^4_{\tau-}\bigg). \end{equation}

There is only a difference between $\sigma_{\tau}^2$ and its left limit $\sigma_{\tau-}^2$ in the case of a volatility jump at time $\tau$. In particular, the estimator is also consistent for $\sigma_{\tau}^2$ for any fixed $\tau\in(0,1)$. The convergence rate $K_n^{-1/2}$ gets arbitrarily close to $n^{-2\alpha/(3+6\alpha)}$, which is optimal in our model. The optimal rate is attained, according to [Reference Bibinger, Jirak and Reiß5], for $h_n\propto n^{-2/3}$ and $K_n\propto h_n^{-2\alpha/(1+2\alpha)}$, i.e. $\delta\downarrow 0$. In the important special case when $\alpha=\frac12$, for a semimartingale volatility, the rate is arbitrarily close to $n^{-1/6}$. This is much faster than the optimal rate of convergence in the model with additive centred microstructure noise, which is known to be $n^{-1/8}$ [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Hoffmann, Munk and Schmidt-Hieber14]. The constant in the asymptotic variance is obtained from several variance and covariance terms including (squared) local minima, and is approximately 2.44. The function $\Psi_n$ was shown to be monotone and invertible in [Reference Bibinger, Jirak and Reiß5], and $\Psi_n$ and its inverse $\Psi_n^{-1}$ can be approximated using Monte Carlo simulations, see Section 5.1. The asymptotic distribution of the estimator does not hinge on the noise level $\eta$, which is different to methods for centred noise. Hence, we do not require any pre-estimation of noise parameters and the theory directly extends to a time-varying noise level $\eta(t)$ in (5) under the mild assumption that $0<\eta(t)<\infty$ for all t. The stable convergence in (12) is stronger than weak convergence and is important, since the limit distribution is mixed normal depending on the stochastic volatility. We refer to [Reference Jacod and Protter15, Section 2.2.1] for an introduction to stable convergence. For a normalized central limit theorem, we can use the spot quarticity estimator (10).
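The numerical value of the variance constant in (12) and the rate exponent $2\alpha/(3+6\alpha)$ are simple arithmetic and can be checked with the following short Python sketch; the function names are hypothetical.

```python
import math


def asymptotic_variance_constant():
    """Constant (7 pi^2 / 4 - 2 pi / 3 - 12) / (pi - 2)^2 multiplying sigma^4 in (12)."""
    return (7.0 * math.pi ** 2 / 4.0 - 2.0 * math.pi / 3.0 - 12.0) / (math.pi - 2.0) ** 2


def rate_exponent(alpha):
    """Exponent c such that the rate K_n^{-1/2} is arbitrarily close to n^{-c}."""
    return 2.0 * alpha / (3.0 + 6.0 * alpha)


print(round(asymptotic_variance_constant(), 2))  # about 2.44
print(rate_exponent(0.5))                        # 1/6 for a semimartingale volatility
```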

Proposition 1. (Feasible central limit theorem.) Under the conditions of Theorem 1, the spot quarticity estimator (10) is consistent, such that we get for the spot volatility estimation the normalized central limit theorem

\begin{equation*} K_n^{1/2}\frac{\pi-2}{\sqrt{\widehat{{\sigma^4_{\tau}}}_-(7\pi^2/4-2\pi/3-12)}\,} \big(\hat\sigma^2_{\tau-}-\Psi_n\big(\sigma_{\tau-}^2\big)\big) \stackrel{\textrm{d}}{\longrightarrow} \mathcal{N}(0,1). \end{equation*}

Proposition 1 yields asymptotic confidence intervals for spot volatility estimation. For $q\in (0,1)$ , we have

\begin{multline*} \mathbb{P}\bigg(\sigma_{\tau-}^2\in\bigg[\Psi_n^{-1}\bigg(\hat\sigma^2_{\tau-}-\frac{\sqrt{\widehat{{\sigma^4_{\tau}}}_-(7\pi^2/4-2\pi/3-12)}}{\pi-2}K_n^{-1/2}\Phi^{-1}(1-q/2)\bigg), \\[5pt] \Psi_n^{-1}\bigg(\hat\sigma^2_{\tau-}+\frac{\sqrt{\widehat{{\sigma^4_{\tau}}}_-(7\pi^2/4-2\pi/3-12)}}{\pi-2}K_n^{-1/2}\Phi^{-1}(1-q/2)\bigg)\bigg]\bigg)\to 1-q \end{multline*}

by the monotonicity of $\Psi_n^{-1}$ , with $\Phi$ the CDF of the standard normal distribution. Since $\Psi_n^{-1}$ is differentiable by [Reference Bibinger, Jirak and Reiß5, (A.35)] and the derivative is $\big(\Psi_n^{-1}\big)'=1+{\scriptstyle{\mathcal{O}}}(1)$ by (7), the delta method (for stable convergence) also yields asymptotic confidence intervals and the central limit theorem

(13)

\begin{equation} K_n^{1/2}\big(\Psi_n^{-1}\big(\hat\sigma^2_{\tau-}\big)-\sigma_{\tau-}^2\big) \stackrel{\textrm{st}}{\longrightarrow} \mathcal{N}\bigg(0,\frac{7\pi^2/4-2\pi/3-12}{(\pi-2)^2}\sigma^4_{\tau-}\bigg).\end{equation}
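The confidence interval implied by Proposition 1 can be sketched in Python as follows. The function name is hypothetical, and the inverse $\Psi_n^{-1}$ is passed as an argument `psi_inv` that the user has to supply, for instance from a numerical inversion of an approximation of $\Psi_n$; the sketch does not compute it.

```python
import math
from statistics import NormalDist

# Numerator of the variance constant in (12): 7 pi^2 / 4 - 2 pi / 3 - 12
AVAR_NUM = 7.0 * math.pi ** 2 / 4.0 - 2.0 * math.pi / 3.0 - 12.0


def spot_vol_confidence_interval(sigma2_hat, quarticity_hat, K, q, psi_inv):
    """Asymptotic (1 - q) confidence interval for the spot squared volatility.

    sigma2_hat: estimate from (8); quarticity_hat: estimate from (10);
    psi_inv: user-supplied, monotone approximation of the inverse of Psi_n.
    """
    z = NormalDist().inv_cdf(1.0 - q / 2.0)  # standard normal quantile
    half_width = math.sqrt(quarticity_hat * AVAR_NUM) / (math.pi - 2.0) * z / math.sqrt(K)
    return psi_inv(sigma2_hat - half_width), psi_inv(sigma2_hat + half_width)
```

Calling the function with `psi_inv=lambda v: v` only illustrates the interface; it ignores the bias correction through $\Psi_n^{-1}$.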

We cannot simply replace $\Psi_n\big(\sigma_{\tau-}^2\big)$ in (12) by its first-order approximation $\sigma_{\tau-}^2$ , or $\Psi_n^{-1}\big(\hat\sigma^2_{\tau-}\big)$ in (13) by $\hat\sigma^2_{\tau-}$ , since the biases do not converge to zero sufficiently fast. That is, $(\hat\sigma^2_{\tau-}-\sigma_{\tau-}^2) =\mathcal{O}_{\mathbb{P}}\big(K_n^{-1/2}\big)$ does not hold true in general. Furthermore, if the condition $h_n n^{2/3}\to\infty$ is violated, the central limit theorems do not apply.

Theorem 2. (Stable central limit theorem with jumps in $(X_t)$ .) Set $h_n$ such that $h_n n^{2/3}\to \infty$ and $K_n=C_K h_n^{\delta -2\alpha/(1+2\alpha)}$ for arbitrary $\delta$ , $0<\delta<2\alpha/(1+2\alpha)$ , and some constant $C_K>0$ . Under Assumptions 1, 2, and 3, with

\begin{align*}r< \frac{2+2\alpha}{1+2\alpha},\end{align*}

the truncated spot volatility estimator (11) with

\begin{align*}\kappa\in \bigg(\frac{1}{2-r}\frac{\alpha}{2\alpha+1},\frac12\bigg)\end{align*}

is consistent, $\hat\sigma^{2,\textrm{(tr)}}_{\tau-}\stackrel{\mathbb{P}}{\rightarrow} \sigma_{\tau-}^2$ , and satisfies the stable central limit theorem

\begin{equation*} K_n^{1/2}\big(\hat\sigma^{2,\textrm{(tr)}}_{\tau-}-\Psi_n\big(\sigma_{\tau-}^2\big)\big) \stackrel{\textrm{st}}{\longrightarrow} \mathcal{N}\bigg(0,\frac{7\pi^2/4-2\pi/3-12}{(\pi-2)^2}\sigma^4_{\tau-}\bigg). \end{equation*}

In order to obtain a central limit theorem at (almost) optimal rate, we thus have to impose mild restrictions on the jump activity. For the standard model with a semimartingale volatility, i.e. $\alpha=\frac12$ , we need $r<\frac32$ , and for $\alpha=1$ we have the stronger condition that $r<\frac43$ . These conditions are equivalent to those of [Reference Bibinger and Winkelmann7, Theorem 1], which gives a central limit theorem for spot volatility estimation under similar assumptions on $(X_t)$ , but with a slower rate of convergence for centred microstructure noise. Using a truncated quarticity estimator with the same thresholding again yields a feasible central limit theorem and asymptotic confidence intervals.
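The conditions of Theorem 2 on the jump activity $r$ and the truncation exponent $\kappa$ can be evaluated with a few lines of Python. The function names are hypothetical. A short calculation shows that the lower bound for $\kappa$ equals $\frac12$ exactly when $r=(2+2\alpha)/(1+2\alpha)$, so the admissible interval is non-empty precisely under the stated restriction on $r$.

```python
def max_jump_activity(alpha):
    """Upper bound (2 + 2 alpha) / (1 + 2 alpha) for the jump activity index r."""
    return (2.0 + 2.0 * alpha) / (1.0 + 2.0 * alpha)


def kappa_range(alpha, r):
    """Open interval of admissible truncation exponents kappa in Theorem 2."""
    if not 0.0 <= r < max_jump_activity(alpha):
        raise ValueError("r must satisfy 0 <= r < (2 + 2 alpha) / (1 + 2 alpha)")
    lower = alpha / ((2.0 - r) * (2.0 * alpha + 1.0))
    return lower, 0.5


print(max_jump_activity(0.5))   # 1.5
print(kappa_range(0.5, 1.0))    # (0.25, 0.5)
```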

Remark 1. From a theoretical point of view we might ponder why we do not work out an asymptotic theory for $h_n\propto n^{-2/3}$ when noise and efficient price both influence the asymptotic distribution of the local minima. However, in this balanced case, the asymptotic distribution is infeasible. For this reason, [Reference Bibinger, Jirak and Reiß5] could not establish a central limit theorem for their integrated volatility estimator. Moreover, their estimator was only implicitly defined depending on the unknown function $\Psi_n^{-1}$ . Even imposing a parametric assumption on the noise as an exponential distribution would not render a feasible limit theory for $h_n\propto n^{-2/3}$ —see the discussion in [Reference Bibinger, Jirak and Reiß5]. Choosing $h_n$ such that $h_n n^{2/3}\to\infty$ slowly instead yields a simple, explicit, and consistent estimator and a feasible central limit theorem for spot volatility estimation. In particular, we use $\Psi_n$ only for the bias correction of the simple estimator, while the estimator itself and the (estimated) asymptotic variance do not hinge on $\Psi_n$ . Central limit theorems for spot volatility estimators are in general only available at almost optimal rates, when the variance dominates the squared bias in the mean squared error; see, for instance, Theorem 13.3.3 and the remarks below it in [Reference Jacod and Protter15]. Therefore, (12) is the best achievable central limit theorem. Moreover, our choice of $h_n$ avoids strong assumptions on the noise that would be inevitable for smaller blocks. Our numerical work will demonstrate that the asymptotic results presented are useful in practice and facilitate efficient inference on the spot volatility. In particular, Section 5.2 revolves around the question of how to choose block lengths in practice.

5. Implementation and simulations

5.1. Monte Carlo approximation of $\Psi_n$

Although the function $\Psi_n$ from (6) tends to the identity asymptotically, it has a crucial role as a bias correction of our estimator in (12). We can compute the function numerically based on a Monte Carlo simulation. Hence, we have to compute $\Psi_n(\sigma^2)$ as a Monte Carlo mean over many iterations and over a fine grid of values for the squared volatility. Then, we can also numerically invert the function and use $\Psi_n^{-1}(\!\cdot\!)$ . To make this procedure feasible without too high a computational expense we require an algorithm to efficiently sample from the law of the local minima for some given n and block length $h_n$ .

Consider, for $nh_n\in\mathbb{N}$ with $Z_i\stackrel{\textrm{iid}}{\sim}\mathcal{N}(0,1)$ and the observation errors $(\varepsilon_k)_{k\ge 0}$ , the minimum

\begin{align*}M_1^{nh_n}\;:\!=\;\min_{k=1,\ldots,nh_n}\Bigg(\frac{\sigma}{\sqrt{n}\,}\sum_{i=1}^k Z_i+\varepsilon_k\Bigg)\end{align*}

for some fixed $\sigma>0$ , and, for $l\in\{0,\ldots,nh_n\}$ ,

\begin{align*}M_l^{nh_n}\;:\!=\;\min_{k=l,\ldots,nh_n}\Bigg(\frac{\sigma}{\sqrt{n}\,}\sum_{i=l}^k Z_i+\varepsilon_k\Bigg),\end{align*}

where we set $Z_0\;:\!=\;0$ . Since

\begin{align*}\Psi_n(\sigma^2)=\frac12\frac{\pi}{\pi-2}h_n^{-1}\mathbb{E}\big[\big(M_0^{nh_n-1}-M_1^{nh_n}\big)^2\big],\end{align*}

with $M_0^{nh_n-1}$ generated independently from $M_1^{nh_n}$ , we want to simulate samples distributed as $M_0^{nh_n-1}$ and $M_1^{nh_n}$ , respectively. Note that for finite $nh_n$ there is no exact equality between the moments of $M_0^{nh_n-1}$ and $M_1^{nh_n}$ , which can be relevant in particular for moderate values of $nh_n$ . As in the simulation of Section 5.2, we implement exponentially distributed observation errors $(\varepsilon_k)$ , with some given noise level $\eta$ . In data applications, we can do the same with an estimated noise level

\begin{equation*} \hat\eta = \Bigg(\frac{1}{2n}\sum_{i=1}^n(Y_i-Y_{i-1})^2\Bigg)^{-1/2} = \eta + \mathcal{O}_{\mathbb{P}}(n^{-1/2}).\end{equation*}

This estimator works for all noise distributions with finite fourth moments. In view of the discussion of the model in [Reference Bibinger, Jirak and Reiß5], exponentially distributed noise is the most natural example satisfying (5). Simulations with other noise distributions lead to similar results. This is expected, since the estimator only hinges on local minima and their distribution is asymptotically more determined by the Brownian motion than by the noise distribution. To simulate the local minima for given n, $h_n$ , $\eta$ , and squared volatility $\sigma^2$ in an efficient way we use a specific dynamic programming principle. Observe that

\begin{align*} M_1^{nh_n} & = \frac{\sigma}{\sqrt{n}\,}Z_1 + \min\!\big(\varepsilon_1,M_2^{nh_n}\big) \\[5pt] & = \frac{\sigma}{\sqrt{n}\,}Z_1 + \min\!\bigg(\varepsilon_1,\frac{\sigma}{\sqrt{n}\,}Z_2 + \min\!\big(\varepsilon_2,M_3^{nh_n}\big)\bigg) \\[5pt] & = \frac{\sigma}{\sqrt{n}\,}Z_1 + \min\!\bigg(\!\cdots\min\!\bigg(\varepsilon_{nh_n-2},\frac{\sigma}{\sqrt{n}\,}Z_{nh_n-1} + \min\!\bigg(\varepsilon_{nh_n-1},\frac{\sigma}{\sqrt{n}\,}Z_{nh_n} + \varepsilon_{nh_n}\bigg)\!\bigg)\cdots\bigg).\end{align*}

In the baseline noise model $\varepsilon_k\stackrel{\textrm{iid}}{\sim}\text{Exp}(\eta)$ , the random variable $({\sigma}/{\sqrt{n}})Z_{nh_n}+\varepsilon_{nh_n}$ has an exponentially modified Gaussian (EMG) distribution. With any fixed noise distribution, we can easily generate realizations from this convolution. A pseudorandom variable distributed as $M_1^{nh_n}$ is now generated following the last transformation in the reverse direction. Algorithmically, this reads

1. Generate $U_{nh_n} \sim \textrm{EMG}(\sigma^2/n,\eta)\sim \textrm{Exp}(\eta) + (\sigma/\sqrt{n})\textrm{Norm}(1)$
2. $U_{nh_n-1}=\min\!(U_{nh_n},\textrm{Exp}(\eta))+(\sigma/\sqrt{n})\textrm{Norm}(1)$
3. Iterate until $U_1$

where the end point $U_1$ has the target distribution of $M_1^{nh_n}$ . In each iteration step, we thus take the minimum of the current state of the process with one independent exponentially distributed random variable and the convolution with one independent normally distributed random variable. To sample from the distribution of $M_0^{nh_n-1}$ instead, we use the same algorithm and just drop the convolution with the normal distribution in the last step.

This algorithm samples from the distribution of local minima, and hence approximates $\Psi_n$ numerically, many times faster than a standard Monte Carlo simulation, run for each value, in which local minima are computed over blocks of length $h_n$ .
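The backward recursion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the reported results: the function names `sample_local_minima` and `psi_n`, the argument `m` for $nh_n$, and the vectorization over Monte Carlo draws are our own choices, and $\eta$ is treated as the rate of the exponential noise (mean $1/\eta$).

```python
import numpy as np


def sample_local_minima(n, m, sigma, eta, size, rng, drop_last_gaussian=False):
    """Draw `size` samples of a local minimum by the backward recursion.

    n: number of observations on [0, 1]; m: observations per block (n * h_n);
    eta: rate of the exponential noise (assumption: mean 1 / eta).
    drop_last_gaussian=False targets M_1^{n h_n}; True targets M_0^{n h_n - 1}.
    """
    scale = sigma / np.sqrt(n)
    u = rng.exponential(1.0 / eta, size)  # innermost noise term
    for step in range(m):
        last = step == m - 1
        # convolution with one independent normal variable
        if not (last and drop_last_gaussian):
            u = u + scale * rng.standard_normal(size)
        # minimum with one independent exponential variable
        if not last:
            u = np.minimum(u, rng.exponential(1.0 / eta, size))
    return u


def psi_n(sigma2, n, m, eta, n_mc, rng):
    """Monte Carlo mean approximating Psi_n(sigma^2) from the displayed formula."""
    sigma = np.sqrt(sigma2)
    m0 = sample_local_minima(n, m, sigma, eta, n_mc, rng, drop_last_gaussian=True)
    m1 = sample_local_minima(n, m, sigma, eta, n_mc, rng)  # independent of m0
    h = m / n
    return 0.5 * np.pi / (np.pi - 2.0) * np.mean((m0 - m1) ** 2) / h
```

A call such as `psi_n(1e-4, 23400, 15, 10000.0, 100000, np.random.default_rng(0))` would then give one grid point of the approximation. Each recursion step costs one vectorized minimum and one addition, so all Monte Carlo draws for a grid point are generated together.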

Figure 1 plots the result of the Monte Carlo approximation of $\Psi_n(\sigma^2)$ for $n=23\,400$ and $n\cdot h_n=15$ on a grid of 1500 values of $\sigma^2$ . In this case, $h_n$ is quite small, but this configuration turns out to be useful in Section 5.2. We know that $\Psi_n(\sigma^2)$ is monotone, such that the oscillation of the function in Figure 1 is due to the inaccuracy of the Monte Carlo means, although we use $N=100\,000$ iterations for each grid point. Nevertheless, we can see that the function is rather close to a linear function with slope $1{.}046$ based on a least squares estimate. The left panel of Figure 1 draws a comparison to the identity function which is illustrated by the dotted line, while the right panel draws a comparison to the linear function with slope $1{.}046$ . We see that it is crucial to correct for the bias in (12) when using such small values of $h_n$ . Although the function $\Psi_n(\sigma^2)$ is not exactly linear, a simple bias correction dividing estimates by $1{.}046$ is almost as good as using the more precise numerical inversion based on the Monte Carlo approximation. Since the Monte Carlo approximations of $\Psi_n(\sigma^2)$ look close to linear functions in all the cases considered, we report the estimated slopes based on least squares and $N=100\,000$ Monte Carlo iterations for different values of $h_n$ in Table 1 to summarize concisely the distance between the function $\Psi_n(\sigma^2)$ and the identity. Simulating all iterations for all grid points with our algorithm takes only a few hours with a standard computer.
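The two bias corrections discussed here, division by a least squares slope and numerical inversion of the Monte Carlo approximation, can be sketched as follows. This is an illustration under assumptions: the text does not state whether the least squares fit includes an intercept, so a fit through the origin is assumed, and the running maximum used to restore monotonicity before inversion is our own device. The names `slope_through_origin` and `invert_psi` are hypothetical.

```python
import numpy as np


def slope_through_origin(grid, psi_values):
    """Least squares slope of psi_values on grid without intercept (assumption)."""
    grid = np.asarray(grid, dtype=float)
    psi_values = np.asarray(psi_values, dtype=float)
    return float(grid @ psi_values / (grid @ grid))


def invert_psi(grid, psi_values, y):
    """Approximate Psi_n^{-1}(y) by linear interpolation on the grid.

    Monte Carlo noise can make psi_values oscillate, so a running maximum
    is applied first to obtain a non-decreasing sequence (our assumption).
    """
    monotone = np.maximum.accumulate(np.asarray(psi_values, dtype=float))
    return np.interp(y, monotone, np.asarray(grid, dtype=float))
```

Given a raw estimate `est`, the simple correction is `est / slope_through_origin(grid, psi_values)`, and the more precise one is `invert_psi(grid, psi_values, est)`.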

Figure 1. Monte Carlo means to estimate $\Psi_n(\sigma^2)$ over a fine grid (interpolated line) for $n=23\,400$ and $n\cdot h_n=15$ . Left: the dotted line shows the identity function. Right: the dotted line is a linear function with slope $1.046$ .

5.2. Simulation study of estimators

We simulate $n=23\,400$ observations corresponding to one observation per second over a (NASDAQ) trading day of 6.5 hours. The efficient price process is simulated from the model

\begin{align*} \textrm{d} X_t & = \nu_t\sigma_t\,\textrm{d} W_t, \\[5pt] \textrm{d}\sigma_t^2 & = 0{.}0162\cdot(0{.}8465 - \sigma_t^2)\,\textrm{d} t + 0{.}117 \cdot \sigma_t\,\textrm{d} B_t, \\[5pt] \nu_t & = (6 - \sin\!(3\pi t/4)) \cdot 0{.}002, \qquad t\in[0,1].\end{align*}

The factor $(\nu_t)$ generates a typical U-shaped intraday volatility pattern. $(W_t,B_t)$ is a two-dimensional Brownian motion with leverage $\textrm{d}[W,B]_t=-0{.}2\,\textrm{d} t$ . The stochastic volatility component has several realistic features and the simulated model is in line with recent literature; see [Reference Bibinger, Neely and Winkelmann6] and references therein. We do not include a drift in $X_t$ to avoid introducing another process or more parameters. Any drift evolving within a reasonable range of values will not affect the numerical results presented. Observations with lower-bounded, one-sided microstructure noise are generated by $Y_i=X_{{i}/{n}}+\varepsilon_i$ , $0\le i\le n$ , with exponentially distributed noise $\varepsilon_i\stackrel{\textrm{iid}}{\sim}\text{Exp}(\eta)$ , with $\eta=10\,000$ . The noise variance is then rather small, but this is in line with stylized facts of real NASDAQ data such as, for instance, those analyzed in [Reference Bibinger, Hautsch, Malec and Reiß4, Reference Bibinger, Neely and Winkelmann6]. Note that the noise level estimate is analogous to the one used for regular market microstructure noise. Typical noise levels obtained e.g. for Apple are approximately 15 000, and approximately 4000 for 3M; see the supplement of [Reference Bibinger, Hautsch, Malec and Reiß4].
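A minimal sketch of this data-generating process, assuming an Euler discretization on the observation grid and a start of $\sigma_t^2$ at its mean-reversion level $0{.}8465$ (neither choice is specified in the text), could read as follows. The function names are hypothetical, and $\eta$ is again treated as the rate of the exponential noise.

```python
import numpy as np


def simulate_observations(n=23400, eta=10000.0, rho=-0.2, sigma2_0=0.8465, seed=0):
    """Euler scheme for (X_t, sigma_t^2) and noisy observations Y_i = X_{i/n} + eps_i."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    t = np.arange(n + 1) * dt
    nu = (6.0 - np.sin(3.0 * np.pi * t / 4.0)) * 0.002  # intraday factor
    dw = np.sqrt(dt) * rng.standard_normal(n)
    # increments of B with d[W, B]_t = rho dt
    db = rho * dw + np.sqrt(1.0 - rho**2) * np.sqrt(dt) * rng.standard_normal(n)
    sigma2 = np.empty(n + 1)
    sigma2[0] = sigma2_0  # assumption: start at the mean-reversion level
    x = np.zeros(n + 1)
    for i in range(n):
        s = np.sqrt(max(sigma2[i], 0.0))  # guard against negative values
        x[i + 1] = x[i] + nu[i] * s * dw[i]
        sigma2[i + 1] = sigma2[i] + 0.0162 * (0.8465 - sigma2[i]) * dt + 0.117 * s * db[i]
    y = x + rng.exponential(1.0 / eta, n + 1)  # one-sided noise, mean 1 / eta
    return t, x, nu**2 * sigma2, y


def noise_level_estimate(y):
    """Noise level estimate from squared first differences, as in Section 5.1."""
    return float((np.mean(np.diff(y) ** 2) / 2.0) ** -0.5)
```

The third return value is the spot squared volatility $\nu_t^2\sigma_t^2$ of $X$. In finite samples the increments of the efficient price also enter the squared differences, so `noise_level_estimate(y)` on simulated data need not coincide closely with $\eta$.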

Table 1. Regression slopes to measure the bias of estimator (8) and deviation $\Psi_n(\sigma^2)-\sigma^2$ .

Figure 2. True and estimated spot volatility with pointwise confidence sets.

Figure 2 shows a fixed path of the squared volatility. We fix this path for the following Monte Carlo simulation and generate new observations of $(X_t)$ and $(Y_i)$ in each iteration according to our model. The dashed line in Figure 2 gives the estimated volatility by the Monte Carlo means over $N=50\,000$ iterations based on $n\cdot h_n=15$ observations per block using the non-adjusted estimator (8), but with windows which are centred around the block on which we estimate the spot volatility, i.e. windows centred around the time $\tau$ , and with $K_n=180$ . We plot estimates on each block, where the estimates close to the boundaries rely on fewer observations. The solid line gives the bias-corrected volatility estimates using the numerically evaluated function $\Psi_n$ , based on the algorithm from Section 5.1 with $n\cdot h_n=15$ and $n=23\,400$ . We determined the values $n\cdot h_n=15$ and $K_n=180$ as suitable values to obtain a small mean squared error. In fact, the choice of $K_n=180$ is rather large in favour of a smaller variance that yields a rather smooth estimated spot volatility in Figure 2. The estimated volatility hence appears smoother compared to the true semimartingale volatility, but the intraday pattern is captured well by our estimation. We expect that this is typically an appealing implementation in practice as smaller $K_n$ results in a larger variance. Choosing $K_n=180$ rather large, we have to use quite small block sizes $h_n$ to control the overall bias of the estimation. Since $h_n\cdot n^{2/3}\approx 0{.}52$ is small, the bias correction becomes crucial here. Still, our asymptotic results work well for this implementation. This can be seen by the comparison of pointwise empirical 10% and 90% quantiles from the Monte Carlo iterations illustrated by the grey area and the 10% and 90% quantiles of the limit normal distribution with the asymptotic variance from (12). 
The latter are drawn as dotted lines for the blocks with distance larger than $K_n/2$ from the boundaries, where the variances are of order $K_n^{-1}$ . Close to the boundaries the empirical variances increase due to the smaller number of blocks used for the estimates. Moreover, the bias correction, which is almost identical to dividing each estimate by $1{.}046$ , correctly scales the simple estimates which have a significant positive bias for the chosen tuning parameters. Overall, our asymptotic results provide a good finite-sample fit even though we have $h_n\cdot n^{2/3}<1$ here. Note, however, that $\sigma_t \cdot \eta\approx 100$ , and our asymptotic expansion in fact requires that $h_n^{3/2} n\sigma_t \eta$ is large when taking constants into account. Since the simulated scenario uses realistic values, we recommend similar block lengths for applications to real high-frequency financial data. According to the summary statistics in the supplement of [Reference Bibinger, Hautsch, Malec and Reiß4], some assets exhibit higher noise-to-signal ratios, and for those larger blocks are preferable.
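For illustration, the non-adjusted estimator with centred windows might be implemented along the following lines. The sketch uses the form of the estimator restated in Step 1 of the proof of Theorem 1, that is, scaled squared differences of consecutive block minima averaged over a window of $K_n$ blocks. The function name, the convention that block $k$ collects observations $Y_{km+1},\ldots,Y_{(k+1)m}$ with $m=nh_n$, and the averaging over fewer terms near the boundaries are assumptions of this sketch.

```python
import numpy as np


def spot_variance_estimates(y, m, K):
    """Block-wise estimates of the spot squared volatility from local minima.

    y: observations Y_0, ..., Y_n on [0, 1]; m: observations per block (n * h_n);
    K: number of blocks in the (centred) smoothing window, K_n in the text.
    """
    y = np.asarray(y, dtype=float)
    n = len(y) - 1
    h = m / n
    n_blocks = n // m
    # local minima m_{k,n} over consecutive blocks of m observations
    minima = y[1 : n_blocks * m + 1].reshape(n_blocks, m).min(axis=1)
    scaled_sq = np.diff(minima) ** 2 / h  # h_n^{-1} (m_{k,n} - m_{k-1,n})^2
    const = np.pi / (2.0 * (np.pi - 2.0))
    half = K // 2
    est = np.empty(scaled_sq.size)
    for j in range(scaled_sq.size):
        # centred window; truncated near the boundaries (fewer terms)
        window = scaled_sq[max(0, j - half) : j + (K - half)]
        est[j] = const * window.mean()
    return est
```

With the values from the text one would call `spot_variance_estimates(y, 15, 180)`. A bias-corrected version follows by dividing the output by the slope from Table 1 or by applying a numerical inverse of $\Psi_n$.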

Table 2. Summary statistics of estimation for different values of $h_n$ and $K_n$ . MSD = mean standard deviation, MAB = mean absolute bias, MABC = MAB of bias-corrected estimator.

All values multiplied by a factor of $10^6$ .

Table 2 summarizes the performance of the estimation along different choices of $nh_n$ and $K_n$ using the following quantities:

MSD: the mean standard deviation of N iterations averaged over all grid points;
MAB: the mean absolute bias of N iterations averaged over all grid points and for the estimator (8) without any bias correction;
MABC: the mean absolute bias of N iterations averaged over all grid points and for the estimator (8) with a simple bias correction dividing estimates by the factors given in Table 1.

All the results are based on $N=50\,000$ Monte Carlo iterations. First of all, the values used for Figure 2 are not unique minimizers of the mean squared error. Several other combinations given in Table 2 render equally good results. Overall, the performance is comparable within a broad range of block lengths and window sizes. The variances decrease for larger $K_n$ , while the bias increases with larger $K_n$ for fixed $h_n$ . Important for the bias is the total window size, $K_n\cdot h_n$ , over which the volatility is approximated by a constant for the estimation. The variance only depends on $K_n$ : changing the block length for fixed $K_n$ does not significantly affect the variance. While the MSD is hence almost constant within the columns of Table 2, the bias after correction, MABC, increases from the top down due to the increasing window size. Without the bias correction two effects interfere for MAB. Larger blocks reduce the systematic bias due to $\Psi_n(\sigma_t^2)-\sigma_t^2$ , but the increasing bias due to the increasing window size prevails for $n\cdot h_n=78$ , and the two larger values of $K_n$ .
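The three summary quantities can be computed from an array of Monte Carlo estimates as sketched below. The function name, the array layout (iterations in rows, grid points in columns), and the use of the sample standard deviation with `ddof=1` are assumptions of this illustration.

```python
import numpy as np


def summary_statistics(estimates, truth, slope):
    """Return (MSD, MAB, MABC) for estimates of shape (N iterations, grid points).

    truth: true spot squared volatility on the grid; slope: bias-correction
    factor by which estimates are divided (as in Table 1).
    """
    estimates = np.asarray(estimates, dtype=float)
    truth = np.asarray(truth, dtype=float)
    mean_est = estimates.mean(axis=0)
    msd = estimates.std(axis=0, ddof=1).mean()  # mean standard deviation
    mab = np.abs(mean_est - truth).mean()  # mean absolute bias
    mabc = np.abs(mean_est / slope - truth).mean()  # MAB after simple correction
    return msd, mab, mabc
```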

6. Proofs

6.1. Law of the integrated negative part of a Brownian motion

A crucial ingredient of our theory is an upper bound for the CDF of the integrated negative part of a Brownian motion. We prove this lemma based on a generalization of Lévy’s arcsine law by [Reference Takács27]. The result is in line with the conjecture in [Reference Janson16, (261)], where one finds an expansion of the density with a precise constant for the leading term. Denote by $f_+$ the positive part and by $f_-$ the negative part of some real-valued function f.

Lemma 1. For a standard Brownian motion $(W_t)_{t\ge 0}$ ,

\begin{align*}\mathbb{P}\bigg(\int_0^1(W_t)_-\,\textrm{d} t\le x\bigg) = \mathcal{O}(x^{1/3}), \quad x\to 0.\end{align*}

Proof. Observe the equality in distribution $\int_0^1(W_t)_-\,\textrm{d} t\stackrel{\textrm{d}}{=}\int_0^1(W_t)_+\,\textrm{d} t$ , such that

\begin{align*}\mathbb{P}\bigg(\int_0^1(W_t)_-\,\textrm{d} t\le x\bigg) = \mathbb{P}\bigg(\int_0^1(W_t)_+\,\textrm{d} t\le x\bigg), \quad x>0.\end{align*}

For any $\varepsilon>0$ , the inequality

\begin{align*}\int_0^1(W_t)_+\,\textrm{d} t \ge \int_0^1W_t\cdot\mathbf{1}(W_t>\varepsilon)\,\textrm{d} t \ge \varepsilon\int_0^1\mathbf{1}(W_t>\varepsilon)\,\textrm{d} t\end{align*}

leads us to

\begin{align*} \mathbb{P}\bigg(\int_0^1(W_t)_+\,\textrm{d} t \le x\bigg) \le \mathbb{P}\bigg(\varepsilon\int_0^1\mathbf{1}(W_t>\varepsilon)\,\textrm{d} t \le x\bigg) & = \mathbb{P}\bigg(1-\int_0^1\mathbf{1}(W_t\le \varepsilon)\,\textrm{d} t \le x/\varepsilon\bigg) \\[5pt] & = \mathbb{P}\bigg(\int_0^1\mathbf{1}(W_t\le \varepsilon)\,\textrm{d} t \ge 1-x/\varepsilon\bigg). \end{align*}

Using [Reference Takács27, (15) and (16)], we obtain

\begin{equation*} \mathbb{P}\bigg(\int_0^1\mathbf{1}(W_t\le \varepsilon)\,\textrm{d} t \ge 1-x/\varepsilon\bigg) = \frac{1}{\pi}\int_{1-x/\varepsilon}^{1}\frac{\exp\!(\!-\!\varepsilon^2/(2u))}{\sqrt{u(1-u)}}\,\textrm{d} u+2\Phi(\varepsilon)-1, \end{equation*}

with $\Phi$ the CDF of the standard normal distribution. Thereby, we obtain

\begin{equation*} \mathbb{P}\bigg(\int_0^1(W_t)_+\,\textrm{d} t \le x\bigg) \le \frac{1}{\pi}\int_{1-x/\varepsilon}^{1}\frac{\exp\!(\!-\!\varepsilon^2/(2u))}{\sqrt{u(1-u)}}\,\textrm{d} u + 2\int_0^{\varepsilon}\frac{\exp\!(\!-\!u^2/2)}{\sqrt{2\pi}}\,\textrm{d} u, \end{equation*}

and elementary bounds give the upper bound

\begin{equation*} \mathbb{P}\bigg(\int_0^1(W_t)_+\,\textrm{d} t \le x\bigg) \le \frac{2}{\pi}\sqrt{\frac{x}{\varepsilon}}\frac{1}{\sqrt{1-x/\varepsilon}\,} + \frac{2\varepsilon}{\sqrt{2\pi}\,}. \end{equation*}
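In more detail, and as a sketch under the assumption $0<x<\varepsilon$ , the elementary bounds are $\exp\!(\!-\!\varepsilon^2/(2u))\le 1$ , $u^{-1/2}\le (1-x/\varepsilon)^{-1/2}$ for $u\in[1-x/\varepsilon,1]$ , and $\exp\!(\!-\!u^2/2)\le 1$ , which give

\begin{align*} \frac{1}{\pi}\int_{1-x/\varepsilon}^{1}\frac{\exp\!(\!-\!\varepsilon^2/(2u))}{\sqrt{u(1-u)}}\,\textrm{d} u & \le \frac{1}{\pi\sqrt{1-x/\varepsilon}\,}\int_{1-x/\varepsilon}^{1}\frac{\textrm{d} u}{\sqrt{1-u}\,} = \frac{2}{\pi}\sqrt{\frac{x}{\varepsilon}}\frac{1}{\sqrt{1-x/\varepsilon}\,}, \\[5pt] 2\int_0^{\varepsilon}\frac{\exp\!(\!-\!u^2/2)}{\sqrt{2\pi}}\,\textrm{d} u & \le \frac{2\varepsilon}{\sqrt{2\pi}\,}. \end{align*}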

Choosing $\varepsilon=x^{1/3}$ , we obtain the upper bound

\begin{equation*} \mathbb{P}\bigg(\int_0^1(W_t)_+\,\textrm{d} t \le x\bigg) \le \frac{2}{\pi}x^{1/3}\frac{1}{\sqrt{1-x^{2/3}}\,} + \frac{2x^{1/3}}{\sqrt{2\pi}}. \end{equation*}
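As a plausibility check that is not part of the proof, the final bound can be compared with a Monte Carlo approximation of the left-hand side. The sketch below approximates $\int_0^1(W_t)_-\,\textrm{d} t$ by a Riemann sum along a discretized Brownian path. The function names, the number of paths, and the grid size are arbitrary choices, and the discretization introduces a small error of its own.

```python
import numpy as np


def lemma1_bound(x):
    """Right-hand side of the final display in the proof, for 0 < x < 1."""
    c = x ** (1.0 / 3.0)
    return 2.0 / np.pi * c / np.sqrt(1.0 - x ** (2.0 / 3.0)) + 2.0 * c / np.sqrt(2.0 * np.pi)


def negative_area_samples(n_paths, n_steps, rng):
    """Riemann-sum approximation of the integrated negative part of W on [0, 1]."""
    dt = 1.0 / n_steps
    increments = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    w = np.cumsum(increments, axis=1)  # W at the grid points dt, 2 dt, ..., 1
    return np.sum(np.maximum(-w, 0.0), axis=1) * dt


def empirical_cdf_vs_bound(xs, n_paths=5000, n_steps=500, seed=0):
    """Pairs (empirical probability, bound) for each x in xs."""
    area = negative_area_samples(n_paths, n_steps, np.random.default_rng(seed))
    return [(float(np.mean(area <= x)), float(lemma1_bound(x))) for x in xs]
```

Calling `empirical_cdf_vs_bound([0.01, 0.05, 0.1])` returns the empirical probabilities next to the corresponding bounds.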

6.2. Asymptotics of the spot volatility estimation in the continuous case

Proof of Theorem 1. In the following, we write $A_n\lesssim B_n$ for two real sequences if there exist some $n_0\in\mathbb{N}$ and a constant K such that $A_n\le K B_n$ for all $n\ge n_0$ .

Step 1. In the first step, we prove the approximation

\begin{align*} \hat\sigma^2_{\tau-} & = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}(m_{k,n}-m_{k-1,n})^2 \\[5pt] & = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}(\tilde m_{k,n}-\tilde m^*_{k-1,n})^2 + \mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big) \end{align*}

with

\begin{equation*} \tilde m_{k,n} = \min_{i\in\mathcal{I}_k^{n}}(\varepsilon_i+\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})), \qquad \tilde m^*_{k-1,n} = \min_{i\in\mathcal{I}_{k-1}^{n}}(\varepsilon_i-\sigma_{(k-1)h_n}(W_{kh_n}-W_{t_i^n})). \end{equation*}

We show that, for $k\in\{1,\ldots,h_n^{-1}-1\}$ ,

(14)

\begin{equation} m_{k,n}-m_{k-1,n} = \tilde m_{k,n} - \tilde m^*_{k-1,n} + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(h_n^{1/2}\big). \end{equation}

We subtract $X_{kh_n}$ from $m_{k,n}$ and $m_{k-1,n}$ , and use that, for all i,

\begin{align*}(Y_{i}-X_{kh_n})-(X_{t_i^n}-(X_{kh_n}+\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n}))) = (\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})+\varepsilon_{i}).\end{align*}

This implies that

\begin{equation*} \min_{i\in\mathcal{I}_{k}^n}\!(Y_{i}-X_{kh_n}) - \max_{i\in\mathcal{I}_{k}^n}\!(X_{t_i^n}-(X_{kh_n}+\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n}))) \le \min_{i\in\mathcal{I}_{k}^n}\!(\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})+\varepsilon_{i}). \end{equation*}

Changing the roles of $(Y_{i}-X_{kh_n})$ and $(\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})+\varepsilon_{i})$ , we obtain by the analogous inequalities and the triangle inequality, with $M_t\;:\!=\;X_{kh_n} + \int_{kh_n}^{t}\sigma_{(k-1)h_n}\,\textrm{d} W_s$ , that

\begin{align*} |m_{k,n}-X_{kh_n}-\tilde m_{k,n}| & \le \max_{i\in\mathcal{I}_{k}^n}|X_{t_i^n}-M_{t_i^n}| \\[5pt] & \le \sup_{t\in[kh_n,(k+1)h_n]}|X_t-M_t| \\[5pt] & \le \sup_{t\in[kh_n,(k+1)h_n]}\bigg|C_t-C_{kh_n}-\int_{kh_n}^t\sigma_{(k-1)h_n}\,\textrm{d} W_s\bigg|. \end{align*}

We write $(C_t)$ for $(X_t)$ to emphasize continuity, see (3). Then (14) follows from

(15)

\begin{equation} \sup_{t\in[kh_n,(k+1)h_n]}\bigg|C_t-C_{kh_n}-\int_{kh_n}^t\sigma_{(k-1)h_n}\,\textrm{d} W_s\bigg| = {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(h_n^{1/2}), \end{equation}

and the analogous estimate for $m_{k-1,n}$ and $\tilde m^*_{k-1,n}$ . We decompose

\begin{align*} \sup_{t\in[kh_n,(k+1)h_n]}\bigg|C_t-C_{kh_n}-\int_{kh_n}^t\sigma_{(k-1)h_n}\,\textrm{d} W_s\bigg| & \le \sup_{t\in[kh_n,(k+1)h_n]}\bigg|\int_{kh_n}^t(\sigma_s-\sigma_{(k-1)h_n})\,\textrm{d} W_s\bigg| \\[5pt] & \quad + \sup_{t\in[kh_n,(k+1)h_n]}\int_{kh_n}^t |a_s|\,\textrm{d} s. \end{align*}

Under Assumption 1, we can assume that $(\sigma_t)$ and $(a_t)$ are bounded on [0, 1] by the localization from [Reference Jacod and Protter15, Section 4.4.1]. Using Itô’s isometry and Fubini’s theorem, we obtain that

\begin{equation*} \mathbb{E}\bigg[\bigg(\int_{kh_n}^t(\sigma_s-\sigma_{(k-1)h_n})\,\textrm{d} W_s\bigg)^2\bigg] = \mathbb{E}\bigg[\int_{kh_n}^t(\sigma_s-\sigma_{(k-1)h_n})^2\,\textrm{d} s\bigg] = \int_{kh_n}^t\mathbb{E}[(\sigma_s-\sigma_{(k-1)h_n})^2]\,\textrm{d} s, \end{equation*}

such that Assumption 1 yields, for any $t\in[kh_n,(k+1)h_n]$ ,

\begin{align*} \mathbb{E}\bigg[\bigg(\int_{kh_n}^t(\sigma_s-\sigma_{(k-1)h_n})\,\textrm{d} W_s\bigg)^2\bigg] & \le C_{\sigma}\int_{kh_n}^t(s-(k-1)h_n)^{2\alpha}\,\textrm{d} s \\[5pt] & \le C_{\sigma}(2\alpha+1)^{-1}(t-(k-1)h_n)^{2\alpha+1} = \mathcal{O}\big(h_n^{2\alpha+1}\big). \end{align*}

By Doob’s martingale maximal inequality and since $\sup_{t\in[kh_n,(k+1)h_n]}\int_{kh_n}^t |a_s|\,\textrm{d} s = \mathcal{O}_{\mathbb{P}}(h_n)$ ,

\begin{align*}\sup_{t\in[kh_n,(k+1)h_n]}\bigg|C_t-C_{kh_n}-\int_{kh_n}^t\sigma_{(k-1)h_n}\,\textrm{d} W_s\bigg| = \mathcal{O}_{\mathbb{P}}\big(h_n^{(1/2+\alpha)\wedge 1}\big).\end{align*}

We conclude that (15) holds, since $\alpha>0$ . Since

\begin{align*}h_n^{-1}(m_{k,n}-m_{k-1,n})(m_{k,n}-\tilde m_{k,n}) = \mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big),\end{align*}

and analogously for $(m_{k-1,n}-\tilde m^*_{k-1,n})$ , we conclude Step 1 by writing

\begin{align*} (m_{k,n}-m_{k-1,n})^2 - (\tilde m_{k,n}-\tilde m^*_{k-1,n})^2 & = (m_{k,n}-m_{k-1,n}+\tilde m_{k,n}-\tilde m^*_{k-1,n}) \\[5pt] & \quad \times (m_{k,n}-\tilde m_{k,n}+\tilde m^*_{k-1,n}-m_{k-1,n}). \end{align*}

Step 2. We bound the bias of the spot volatility estimation using Step 1. For $\lfloor h_n^{-1}\tau\rfloor>K_n$ , we obtain from the definition of the function $\Psi_n$ in (6) that

\begin{align*} & \mathbb{E}[\hat\sigma^2_{\tau-}-\Psi_n(\sigma_{\tau-}^2)] = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}\mathbb{E}[(m_{k,n}-m_{k-1,n})^2] - \mathbb{E}[\Psi_n(\sigma_{\tau-}^2)] \\[5pt] & \quad = \frac{1}{K_n}\frac{\pi}{2(\pi-2)}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}\mathbb{E}\big[\big(\tilde m_{k,n}-\tilde m^*_{k-1,n}\big)^2\big] - \mathbb{E}\big[\Psi_n\big(\sigma_{\tau-}^2\big)\big] + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & \quad = \frac{1}{K_n}\frac{\pi}{2(\pi-2)}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \frac{2(\pi-2)}{\pi}\mathbb{E}\big[\Psi_n\big(\sigma_{(k-1)h_n}^2\big)\big] - \mathbb{E}\big[\Psi_n\big(\sigma_{\tau-}^2\big)\big] + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & \quad \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \mathbb{E}\big[\sigma_{(k-1)h_n}^2-\sigma_{\tau-}^2\big] + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & \quad \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \mathbb{E}\big[\sigma_{(k-1)h_n}-\sigma_{\tau-}\big] + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & \quad \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \big(\mathbb{E}\big[\big(\sigma_{(k-1)h_n}-\sigma_{\tau-}\big)^2\big]\big)^{1/2} + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & \quad = \mathcal{O}((K_nh_n)^{\alpha}) = {\scriptstyle{\mathcal{O}}}\big(h_n^{\alpha/(1+2\alpha)}\big) = {\scriptstyle{\mathcal{O}}}\big(K_n^{-1/2}\big). \end{align*}

The first $\lesssim$ estimate is in fact an equality up to an additional factor $(1+{\scriptstyle{\mathcal{O}}}(1))$ , since $\Psi'_{\!\!n}(x)=1+{\scriptstyle{\mathcal{O}}}(1)$ for all $x\ge 0$ , exploiting the above-mentioned differentiability based on [Reference Bibinger, Jirak and Reiß5, (A.35)]. For the asymptotic upper bounds we used the difference of squares

\begin{align*}\sigma_{(k-1)h_n}^2-\sigma_{\tau-}^2 = (\sigma_{(k-1)h_n}-\sigma_{\tau-})(\sigma_{(k-1)h_n}+\sigma_{\tau-}) \le 2C(\sigma_{(k-1)h_n}-\sigma_{\tau-}),\end{align*}

exploiting as in Step 1 that $(\sigma_t)$ is bounded with some upper bound C, and Hölder’s inequality to conclude with (2) from Assumption 1. Finally, we used that $\big(\alpha\wedge\frac12\big) > \alpha/(2\alpha+1)$ for all $\alpha>0$ .

Step 3. For the consistency of $\hat\sigma^2_{\tau-}$ , we prove that

(16)

\begin{equation} \mathbb{E}\big[\hat\sigma^2_{\tau-}-\sigma_{\tau-}^2\big] = {\scriptstyle{\mathcal{O}}}(1). \end{equation}

This includes a proof of (7). Denote by $\mathbb{P}_{\sigma_{(k-1)h_n}}$ the regular conditional probabilities conditioned on $\sigma_{(k-1)h_n}$ , and by $\mathbb{E}_{\sigma_{(k-1)h_n}}$ the expectations with respect to the conditional measures. We obtain by the tower rule that

(17)

\begin{align} \mathbb{E}\big[h_n^{-1}\big(\tilde m_{k,n}-\tilde m^*_{k-1,n})^2\big] & = \mathbb{E}\big[h_n^{-1}\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\big(\tilde m_{k,n}-\tilde m^*_{k-1,n})^2\big]\big] \nonumber \\[5pt] & = \mathbb{E}\Big[\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\big(h_n^{-1/2}\tilde m_{k,n})^2\big] + \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\big(h_n^{-1/2}\tilde m^*_{k-1,n})^2\big] \nonumber \\[5pt] & \qquad\quad -2\mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1/2}\tilde m_{k,n}\big] \mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1/2}\tilde m^*_{k-1,n}\big]\Big] \end{align}

by the conditional independence of $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$ .

We establish and use an approximation of the tail probabilities of $(\tilde m_{k,n})$ and $(\tilde m^*_{k-1,n})$ , respectively. For $x\in\mathbb{R}$ , we have

\begin{align*} & \mathbb{P}_{\sigma_{(k-1)h_n}}\Big(h_n^{-1/2} \min_{i\in\mathcal{I}_{k}^n}\big(\varepsilon_{i}+\sigma_{(k-1)h_n}(W_{t_i^n}-W_{kh_n})\big)>x\sigma_{(k-1)h_n}\Big) \\[5pt] & \quad = \mathbb{P}_{\sigma_{(k-1)h_n}}\Big(\min_{i\in\mathcal{I}_{k}^n} \big(h_n^{-1/2}\big(W_{t_i^n}-W_{kh_n}\big)+h_n^{-1/2}\sigma_{(k-1)h_n}^{-1}\varepsilon_{i}\big)>x\Big) \\[5pt] & \quad = \mathbb{E}_{\sigma_{(k-1)h_n}}\Bigg[\prod_{i=\lfloor k n h_n\rfloor+1}^{\lfloor (k+1)nh_n\rfloor} \mathbb{P}\big(\varepsilon_{i}>h_n^{1/2}\sigma_{(k-1)h_n}\big(x-h_n^{-1/2}(W_{t_i^n}-W_{kh_n})\big) \mid \mathcal{F}^X\big)\Bigg] \\[5pt] & \quad = \mathbb{E}_{\sigma_{(k-1)h_n}}\Bigg[\exp\!\Bigg(\sum_{i=\lfloor k n h_n\rfloor+1}^{\lfloor (k+1)nh_n\rfloor} \log\big(1-F_{\eta}\big(h_n^{1/2}\sigma_{(k-1)h_n}\big(x-h_n^{-1/2}(W_{t_i^n}-W_{kh_n})\big)\big)\big)\Bigg)\Bigg] \end{align*}

by the tower rule for conditional expectations, and since $\varepsilon_{i}\stackrel{\textrm{iid}}{\sim}F_{\eta}$ . We have

\begin{alignat*}{2} W_{t_i^n}-W_{kh_n} & = \sum_{j=1}^{i-\lfloor k n h_n\rfloor}\tilde U_j, \quad & & \tilde U_j\stackrel{\textrm{iid}}{\sim}\mathcal{N}(0,n^{-1}),\ j\ge 2,\ \tilde U_1\sim\mathcal{N}\big(0,t^n_{\lfloor k n h_n \rfloor +1}-kh_n\big), \\[5pt] U_j & = h_n^{-1/2}\tilde U_j, & & U_j\stackrel{\textrm{iid}}{\sim}\mathcal{N}\big(0,(nh_n)^{-1}\big),\ j\ge 2,\ U_1\sim\mathcal{N}\big(0,h_n^{-1}\big(t^n_{\lfloor k n h_n \rfloor +1}-kh_n\big)\big). \end{alignat*}

From (5), and with a first-order Taylor expansion of $z\mapsto \log\!(1-z)$ , we have

\begin{equation*} \log\!(1-F_{\eta}(y))\stackrel{(5)}{=}\log\!(1-\eta y(1+{\scriptstyle{\mathcal{O}}}(1))) = -\eta y+{\scriptstyle{\mathcal{O}}}(\eta y) = -\eta y_++{\scriptstyle{\mathcal{O}}}(\eta y) \end{equation*}

as $y\to 0$ , where we add the positive part in the last equality since $F_{\eta}(y)=0$ for any $y\le 0$ . We obtain

\begin{align*} &\mathbb{P}_{\sigma_{(k-1)h_n}}\Big(h_n^{-1/2}\min_{i\in\mathcal{I}_{k}^n} \big(\varepsilon_{i}+\sigma_{(k-1)h_n}\big(W_{t_i^n}-W_{kh_n}\big)\big)>x\sigma_{(k-1)h_n}\Big) \\[5pt] & \quad = \mathbb{E}_{\sigma_{(k-1)h_n}}\Bigg[\exp\!\Bigg({-}h_n^{1/2}\sigma_{(k-1)h_n}\eta \sum_{i=\lfloor k n h_n \rfloor+1}^{\lfloor (k+1)nh_n\rfloor}\Bigg(x-\sum_{j=1}^{i-\lfloor k n h_n\rfloor}U_j\Bigg)_{+} (1+{\scriptstyle{\mathcal{O}}}(1))\Bigg)\Bigg] \\[5pt] & \quad = \mathbb{E}_{\sigma_{(k-1)h_n}}\bigg[\exp\!\bigg({-}h_n^{1/2}nh_n\sigma_{(k-1)h_n}\eta \int_0^1(B_t-x)_{-}\,\textrm{d} t(1+{\scriptstyle{\mathcal{O}}}(1))\bigg)\bigg]. \end{align*}

In the last equality we used that the Riemann sums tend almost surely to the integral with a standard Brownian motion $(B_t)_{t\ge 0}$ in the integrand. Since the expression in the expectation is bounded, as a product of conditional probabilities, by 1, we conclude with dominated convergence. If $nh_n^{3/2}\to \infty$ , we deduce that

(18)

\begin{align} & \mathbb{P}_{\sigma_{(k-1)h_n}}\Big(h_n^{-1/2}\min_{i\in\mathcal{I}_{k}^n} \big(\varepsilon_{i}+\sigma_{(k-1)h_n}\big(W_{t_i^n}-W_{kh_n}\big)\big)>x\sigma_{(k-1)h_n}\Big) \nonumber \\[5pt] & = \mathbb{P}\Big(\inf_{0\le t\le 1} B_t\ge x\Big) \nonumber \\[5pt] & \quad +\mathbb{E}_{\sigma_{(k-1)h_n}}\bigg[\mathbf{1}\Big(\inf_{0\le t\le 1} B_t< x\Big) \exp\!\bigg({-}h_n^{3/2}n\sigma_{(k-1)h_n}\eta \int_0^1(B_t-x)_{-}\,\textrm{d} t (1+{\scriptstyle{\mathcal{O}}}(1))\bigg)\bigg] \nonumber \\[5pt] & = \mathbb{P}\Big(\inf_{0\le t\le 1} B_t\ge x\Big) + \mathbb{P}\Big(\inf_{0\le t\le 1} B_t< x\Big)\cdot {\scriptstyle{\mathcal{O}}}(1). \end{align}

We do not have a lower bound for $\int_0^1(B_t-x)_{-}\,\textrm{d} t$ . However, using that the first hitting time $T_x$ of the level x by $(B_t)$ , conditional on $\{\inf_{0\le t\le 1} B_t< x \}$ , has a continuous conditional density $f(t\mid T_x<1)$ , we obtain by Lemma 1 and properties of Brownian motion, for any $\delta>0$ ,

\begin{align*} & \mathbb{E}_{\sigma_{(k-1)h_n}}\bigg[\mathbf{1}\Big(\inf_{0\le t\le 1} B_t< x\Big) \exp\!\bigg({-}h_n^{3/2}n\sigma_{(k-1)h_n}\eta\int_0^1(B_t-x)_{-}\,\textrm{d} t\bigg)\bigg] \\[5pt] & \le \exp\!\big({-}\big(h_n^{3/2}n\big)^{\delta}\sigma_{(k-1)h_n}\eta\big) \mathbb{P}\Big(\inf_{0\le t\le 1} B_t< x\Big) \\[5pt] & \quad + \mathbb{P}\bigg(\inf_{0\le t\le 1} B_t< x, \int_0^1(B_t-x)_{-}\,\textrm{d} t \le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg) \\[5pt] & \le \bigg(\exp\!\big({-}\big(h_n^{3/2}n\big)^{\delta}\sigma_{(k-1)h_n}\eta\big) + \int_0^1\mathbb{P}\bigg(\int_s^1(B_t)_{-}\,\textrm{d} t\le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg) f(s\mid T_x<1)\,\textrm{d} s\bigg) \\[5pt] & \quad \times \mathbb{P}\Big(\inf_{0\le t\le 1} B_t< x\Big) \\[5pt] & \le \bigg(\exp\!\big({-}\big(h_n^{3/2}n\big)^{\delta}\sigma_{(k-1)h_n}\eta\big) \\[5pt] & \qquad + \int_0^1\mathbb{P}\bigg((1-s)\int_0^1(B_t)_{-}\,\textrm{d} t\le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg) f(s\mid T_x<1)\,\textrm{d} s\bigg)\mathbb{P}\Big(\inf_{0\le t\le 1} B_t< x\Big). \end{align*}

We focus on the second addend of the first factor, since the exponential term decays faster. It is bounded by a constant times

\begin{align*} & \int_0^1\mathbb{P}\bigg((1-s)\int_0^1(B_t)_{-}\,\textrm{d} t\le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg)\,\textrm{d} s \\[5pt] & \qquad \le \int_0^{1-b_n}\mathbb{P}\bigg( (1-s)\int_0^1(B_t)_{-}\,\textrm{d} t\le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg)\,\textrm{d} s + \int_{1-b_n}^1 \textrm{d} s \\[5pt] & \qquad \le \mathbb{P}\bigg(b_n\int_0^1(B_t)_{-}\,\textrm{d} t\le \big(h_n^{3/2}n\big)^{-1+\delta}\bigg) + b_n = \mathcal{O}\big(\big(\big(h_n^{3/2}n\big)^{1-\delta}b_n\big)^{-1/3}+b_n\big) \end{align*}

for any sequence $(b_n)$ , $b_n\in(0,1)$ , where we used Lemma 1. Choosing $b_n\propto\big(h_n^{3/2}n\big)^{-(1-\delta)/4}$ , which minimizes the order, yields

\begin{equation*} \mathbb{E}_{\sigma_{(k-1)h_n}}\bigg[\mathbf{1}\Big(\inf_{0\le t\le 1} B_t< x\Big) \exp\!\bigg({-}h_n^{3/2}n\sigma_{(k-1)h_n}\eta\int_0^1(B_t-x)_{-}\,\textrm{d} t\bigg)\bigg] = \mathbb{P}\Big(\inf_{0\le t\le 1} B_t< x\Big)\cdot R_n \end{equation*}

almost surely, with a remainder that satisfies $R_n = \mathcal{O}\big(\big(h_n^{3/2}n\big)^{-({1-\delta})/{4}}\big)$ . From the unconditional Lévy distribution of $T_x$ , $f(s\mid T_x<1)$ is explicit, but we omit its precise form which does not influence the asymptotic order. Under the condition $nh_n^{3/2}\to \infty$ , the minimum of the Brownian motion over the interval hence dominates the noise in the distribution of local minima, in contrast to the choice $h_n\propto n^{-2/3}$ . By the reflection principle,

(19)

\begin{equation} \mathbb{P}\Big({-}\inf_{0\le t\le 1} B_t\ge x\Big) = \mathbb{P}\Big(\sup_{0\le t\le 1} B_t\ge x\Big) = 2\mathbb{P}(B_1\ge x) = \mathbb{P}(|B_1|\ge x) \end{equation}

for $x\ge 0$ .

Using the representation of moments as integrals over tail probabilities, we exploit this, and a completely analogous estimate for $\tilde m^*_{k-1,n}$ , to approximate conditional expectations. This yields, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,

\begin{align*} \mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1/2}\tilde m_{k,n}\big] & = \int_0^{\infty}\mathbb{P}_{\sigma_{(k-1)h_n}}\big(h_n^{-1/2}\tilde m_{k,n}>x\big)\,\textrm{d} x \\[5pt] & \quad - \int_0^{\infty}\mathbb{P}_{\sigma_{(k-1)h_n}}\big({-}h_n^{-1/2}\tilde m_{k,n}>x\big)\,\textrm{d} x \\[5pt] & = -\int_0^{\infty}\mathbb{P}_{\sigma_{(k-1)h_n}}\Big(\sigma_{(k-1)h_n}\sup_{0\le t\le 1} B_t>x\Big)\,\textrm{d} x + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1) \\[5pt] & = -\int_0^{\infty}\mathbb{P}_{\sigma_{(k-1)h_n}}(\sigma_{(k-1)h_n}|B_1|>x)\,\textrm{d} x + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1) \\[5pt] & = -\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\sigma_{(k-1)h_n}| B_1|\big] + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1) \\[5pt] & = -\sqrt{\frac{2}{\pi}}\sigma_{(k-1)h_n}+{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1). \end{align*}

We used (19). An analogous computation yields the same result for $\tilde m^*_{k-1,n}$ :

\begin{equation*} \mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1/2}\tilde m^*_{k-1,n}\big] = -\sqrt{\frac{2}{\pi}}\sigma_{(k-1)h_n}+{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1). \end{equation*}

For the second conditional moments, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,

\begin{align*} \mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1}\big(\tilde m_{k,n}\big)^2\big] & = 2\int_0^{\infty}x\mathbb{P}_{\sigma_{(k-1)h_n}}\big(\big|h_n^{-1/2}\tilde m_{k,n}\big|>x\big)\,\textrm{d} x \\[5pt] & = 2\int_0^{\infty}x\mathbb{P}_{\sigma_{(k-1)h_n}}\Big(\sigma_{(k-1)h_n}\sup_{0\le t\le 1} B_t>x\Big)\,\textrm{d} x + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1) \\[5pt] & = 2\int_0^{\infty}x\mathbb{P}_{\sigma_{(k-1)h_n}}(\sigma_{(k-1)h_n}|B_1|>x)\,\textrm{d} x + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1) \\[5pt] & = \sigma_{(k-1)h_n}^2+{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1). \end{align*}

The last identity uses the representation of the second moment of the normal distribution as an integral over tail probabilities. An analogous computation yields

\begin{equation*} \mathbb{E}_{\sigma_{(k-1)h_n}}\big[h_n^{-1}\big(\tilde m^*_{k-1,n}\big)^2\big] = \sigma_{(k-1)h_n}^2 + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1). \end{equation*}

Inserting the identities for the conditional moments in (17) yields

\begin{equation*} \mathbb{E}\big[h_n^{-1}\big(\tilde m_{k,n}-\tilde m^*_{k-1,n})^2\big] = 2\bigg(1-\frac{2}{\pi}\bigg)\mathbb{E}\big[\sigma_{(k-1)h_n}^2\big] + {\scriptstyle{\mathcal{O}}}(1) \end{equation*}

such that

\begin{align*} \mathbb{E}\big[\hat\sigma^2_{\tau-}-\sigma_{\tau-}^2\big] & = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} h_n^{-1}\mathbb{E}\big[\big(\tilde m_{k,n}-\tilde m^*_{k-1,n})^2\big] - \mathbb{E}[\sigma_{\tau-}^2] + {\scriptstyle{\mathcal{O}}}(1) \\[5pt] & = \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \mathbb{E}\big[\sigma_{(k-1)h_n}^2-\sigma_{\tau-}^2\big] + {\scriptstyle{\mathcal{O}}}(1) = {\scriptstyle{\mathcal{O}}}(1). \end{align*}

This proves (16). Since the next step shows that the variance of the estimator tends to zero, consistency holds true.
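The tail-integral representation behind the first two conditional moments can be checked numerically. The following Python sketch is an illustration and not part of the original proof: it assumes $\sigma=1$, uses $\mathbb{P}(|B_1|>x)=\operatorname{erfc}(x/\sqrt{2})$, and the helper names `tail_prob` and `tail_moment` are hypothetical. It integrates the tail probabilities with a midpoint rule and combines the moments of two independent copies to recover the constant $2(1-2/\pi)$ and the normalization $\pi/(2(\pi-2))$.

```python
import math

def tail_prob(x):
    """P(|B_1| > x) for a standard normal B_1."""
    return math.erfc(x / math.sqrt(2.0))

def tail_moment(p, upper=12.0, steps=120_000):
    """E|B_1|^p = p * int_0^inf x^(p-1) P(|B_1| > x) dx, by a midpoint rule."""
    dx = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        total += x ** (p - 1) * tail_prob(x)
    return p * total * dx

first = tail_moment(1)    # compare with sqrt(2 / pi)
second = tail_moment(2)   # compare with 1

# Two independent copies U, V of |B_1|: E[(U - V)^2] = 2 E[U^2] - 2 E[U]^2.
diff_sq = 2.0 * second - 2.0 * first ** 2   # compare with 2 (1 - 2 / pi)
normalization = 1.0 / diff_sq               # compare with pi / (2 (pi - 2))

print(first, second, diff_sq, normalization)
```

The truncation point `upper` and the number of steps are arbitrary choices; the Gaussian tail beyond 12 is negligible at double precision.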

Step 4 We determine the asymptotic variance of the estimator. Representing moments as integrals over tail probabilities, with the same approximation as above, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$,

\begin{align*} \textrm{var}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n}^2\big) & = \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}^4\big] - \big(\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}^2\big]\big)^2 \\[6pt] & = 2\sigma_{(k-1)h_n}^4h_n^2 + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(h_n^2), \\[6pt] \textrm{cov}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n}^2,\tilde m_{k,n}\tilde m^*_{k-1,n}\big) & = \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}^3\big]\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m^*_{k-1,n}\big] \\[6pt] & \quad - \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}^2\big]\mathbb{E}_{\sigma_{(k-1)h_n}}[\tilde m_{k,n}] \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m^*_{k-1,n}\big] \\[6pt] & = \frac{2}{\pi}\sigma_{(k-1)h_n}^4h_n^2 + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(h_n^2), \\[6pt] \textrm{var}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n}\tilde m^*_{k-1,n}\big) & = \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}^2\big] \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\big(\tilde m^*_{k-1,n}\big)^2\big] \\[6pt] & \quad - \big(\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}\big] \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m^*_{k-1,n}\big]\big)^2 \\[6pt] & = \sigma_{(k-1)h_n}^4\bigg(1-\frac{4}{\pi^2}\bigg)h_n^2 + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(h_n^2). \end{align*}

We have used the first four moments of the half-normal distribution and their representation via integrals over tail probabilities. The dependence structure between $\tilde m_{k,n}$ and $\tilde m^*_{k,n}$ also affects the variance of $\hat\sigma^2_{\tau-}$. We perform approximation steps for covariances similar to those for the moments of local minima above, using

\begin{align*} &h_n^{-1}\textrm{cov}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n},\tilde m^*_{k,n}\big) \\[5pt] & = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Big(\mathbb{P}_{\sigma_{(k-1)h_n}} \big(h_n^{-1/2}\tilde m_{k,n}>x, h_n^{-1/2}\tilde m^*_{k,n}>y\big) \\[5pt] & \qquad\qquad\qquad - \mathbb{P}_{\sigma_{(k-1)h_n}}\big(h_n^{-1/2}\tilde m_{k,n}>x\big) \mathbb{P}_{\sigma_{(k-1)h_n}}\big(h_n^{-1/2}\tilde m^*_{k,n}>y\big)\Big)\,\textrm{d} x\,\textrm{d} y \\[5pt] & = \int_{0}^{\infty}\int_{0}^{\infty}\bigg(\mathbb{P}_{\sigma_{(k-1)h_n}} \Big(\sigma_{(k-1)h_n}\sup_{0\le t\le 1}B_t>x, \sigma_{(k-1)h_n}\Big(\sup_{0\le t\le 1}B_t-B_1\Big)>y\Big) \\[5pt] & \qquad\qquad\quad - \mathbb{P}_{\sigma_{(k-1)h_n}}\Big(\sigma_{(k-1)h_n}\sup_{0\le t\le 1}B_t>x\Big) \mathbb{P}_{\sigma_{(k-1)h_n}}\Big(\sigma_{(k-1)h_n}\Big(\sup_{0\le t\le 1}B_t-B_1\Big)>y\Big)\!\bigg)\textrm{d} x\,\textrm{d} y \\[5pt] & \quad + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(1). \end{align*}

This shows that the joint distribution of $(\tilde m_{k,n},\tilde m^*_{k,n})$ relates to the distribution of the minimum and the difference between the minimum and the endpoint of Brownian motion over an interval, or equivalently the distribution of the maximum and the difference between the maximum and the endpoint. The latter is readily obtained from the joint density of the maximum and the endpoint, which is a well-known result on stochastic processes; see, e.g., [Reference Shepp26]. Utilizing this, we obtain, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,

\begin{align*}\textrm{cov}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n},\tilde m^*_{k,n}\big) = \bigg(\frac{1}{2}-\frac{2}{\pi}\bigg)h_n\sigma_{(k-1)h_n}^2 \big(1+\mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha}\big)\big) + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}(h_n).\end{align*}

The additional remainder of order $h_n^{\alpha}$ in probability is due to the different approximations of $(\sigma_t)$ in $\tilde m_{k,n}$ and $\tilde m^*_{k,n}$ . This implies that, for all $k\in\{1,\ldots,h_n^{-1}-1\}$ ,

\begin{align*} &\textrm{cov}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n}\tilde m^*_{k-1,n},\tilde m_{k+1,n}\tilde m^*_{k,n}\big) \\[5pt] & = \big(\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}\tilde m^*_{k,n}\big] - \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m_{k,n}\big]\mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m^*_{k,n}\big]\big) \mathbb{E}_{\sigma_{(k-1)h_n}}\big[\tilde m^*_{k-1,n}\big]\mathbb{E}\big[\tilde m_{k+1,n}\big] \\[5pt] & = \sigma_{(k-1)h_n}^4\bigg(\frac{1}{\pi}-\frac{4}{\pi^2}\bigg)h_n^2 + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(h_n^2\big). \end{align*}

With analogous steps, we deduce two more covariances which contribute to the asymptotic variance:

\begin{align*} \textrm{cov}_{\sigma_{(k-1)h_n}}\big(\tilde m_{k,n}^2,\big(\tilde m^*_{k,n}\big)^2\big) & = -h_n^2\frac{\sigma_{(k-1)h_n}^4}{2}+{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(h_n^2\big), \\[5pt] \textrm{cov}_{\sigma_{(k-1)h_n}}\big(\big(\tilde m^*_{k,n}\big)^2,\tilde m_{k,n}\tilde m^*_{k-1,n}\big) & = -h_n^2\frac{2}{3\pi}\sigma_{(k-1)h_n}^4+{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(h_n^2\big). \end{align*}
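The leading constants of the covariances above can be reproduced from the joint law of $M=\sup_{0\le t\le 1}B_t$ and $M-B_1$. The Python sketch below is a numerical illustration under stated assumptions, not the author's algorithm: it takes $\sigma=1$ and $h_n=1$, uses the reflection-principle density $2(u+v)\varphi(u+v)$ on $u,v\ge 0$ for $(M,M-B_1)$, with $\varphi$ the standard normal density, and passes from maxima to minima by a sign change. The function names are hypothetical.

```python
import math

def joint_density(u, v):
    """Density of (M, M - B_1) at (u, v) for u, v >= 0 (reflection principle)."""
    s = u + v
    return 2.0 * s * math.exp(-0.5 * s * s) / math.sqrt(2.0 * math.pi)

def mixed_moment(a, b, upper=9.0, steps=450):
    """E[M^a (M - B_1)^b] by a two-dimensional midpoint rule on [0, upper]^2."""
    h = upper / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    total = 0.0
    for u in grid:
        ua = u ** a
        for v in grid:
            total += ua * v ** b * joint_density(u, v)
    return total * h * h

mean = mixed_moment(1, 0)                                # E[M], compare sqrt(2/pi)
cov_11 = mixed_moment(1, 1) - mean ** 2                  # compare 1/2 - 2/pi
cov_22 = mixed_moment(2, 2) - mixed_moment(2, 0) ** 2    # compare -1/2
cov_12 = mixed_moment(1, 2) - mean * mixed_moment(0, 2)  # cov(M, (M - B_1)^2)

# Products over neighbouring blocks: multiply by the means of the independent
# factors. For minima, the mean and cov(m, (m*)^2) change sign relative to maxima.
cov_lag = cov_11 * mean ** 2         # compare 1/pi - 4/pi^2
cov_sq_prod = (-cov_12) * (-mean)    # compare -2/(3 pi)

print(cov_11, cov_lag, cov_22, cov_sq_prod)
```

The grid resolution is a trade-off between run time and accuracy; a coarser grid still reproduces the constants to two or three digits.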

All covariance terms which enter the asymptotic variance are of one of these forms. For the conditional variance given $\sigma^2_{\tau-}$ , we obtain

\begin{align*} & \textrm{var}_{\sigma^2_{\tau-}}\big(\hat\sigma^2_{\tau-}\big) \\[5pt] & = \frac{1}{K_n^2}\frac{\pi^2}{4(\pi-2)^2}\Bigg( \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-2} \textrm{var}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}^2+(\tilde m^*_{k,n})^2-2\tilde m_{k,n}\tilde m^*_{k-1,n}\big) \\[5pt] & \quad -\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 2}^{\lfloor h_n^{-1}\tau\rfloor-1}4h_n^{-2} \textrm{cov}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}\tilde m^*_{k-1,n},\tilde m_{k-1,n}^2 +(\tilde m^*_{k-1,n})^2 - 2\tilde m_{k-1,n}\tilde m^*_{k-2,n}\big)\Bigg) \\[5pt] & \quad + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(K_n^{-1}\big) \\[5pt] & = \frac{1}{K_n^2}\frac{\pi^2}{4(\pi-2)^2}\Bigg( \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-2} \Big(2\textrm{var}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}^2\big) + 4\textrm{var}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}\tilde m^*_{k-1,n}\big) \\[5pt] & \quad + 2\textrm{cov}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}^2,(\tilde m^*_{k,n})^2\big) - 4\textrm{cov}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}^2,\tilde m_{k,n}\tilde m^*_{k-1,n}\big) - 4\textrm{cov}_{\sigma^2_{\tau-}}\big((\tilde m^*_{k,n})^2, \tilde m_{k,n}\tilde m^*_{k-1,n}\big)\Big) \\[5pt] & \quad + \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 2}^{\lfloor h_n^{-1}\tau\rfloor-1}4h_n^{-2} \Big(2\textrm{cov}_{\sigma^2_{\tau-}}\!\big(\tilde m_{k,n}\tilde m^*_{k-1,n}, \tilde m_{k-1,n}\tilde m^*_{k-2,n}\big) - \textrm{cov}_{\sigma^2_{\tau-}}\!\big(\tilde m_{k,n}\tilde m^*_{k-1,n},\tilde m_{k-1,n}^2\big) \\[5pt] & \quad - \textrm{cov}_{\sigma^2_{\tau-}}\big(\tilde m_{k,n}\tilde m^*_{k-1,n},(\tilde m^*_{k-1,n})^2\big)\Big)\Bigg) + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(K_n^{-1}\big) \\[5pt] & = \frac{1}{K_n}\frac{\pi^2}{4(\pi-2)^2}\sigma_{\tau-}^4 \bigg(8-\frac{16}{\pi^2}-1-\frac{8}{\pi}+\frac{8}{3\pi}+2\bigg(\frac{4}{3\pi}-\frac{16}{\pi^2}\bigg)\bigg) + 
{\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(K_n^{-1}\big) \\[5pt] & = \frac{1}{K_n}\frac{1}{(\pi-2)^2}\bigg(\frac{7\pi^2}{4}-\frac{2\pi}{3}-12\bigg)\sigma^4_{\tau-} + {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(K_n^{-1}\big). \end{align*}
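The arithmetic in the last two lines of this display can be verified by assembling the individual variance and covariance constants. The Python sketch below is a plausibility check under the normalization $\sigma_{\tau-}=1$, $h_n=1$; the variable names are hypothetical, the same-block constants are computed from half-normal moments, and the cross-block constants are copied from the displays above. The value used for $\textrm{cov}(\tilde m_{k,n}\tilde m^*_{k-1,n},\tilde m_{k-1,n}^2)$ is an assumption based on the symmetry of the joint law of the maximum and the maximum minus the endpoint.

```python
import math

pi = math.pi

def abs_normal_moment(p):
    """E|Z|^p for standard normal Z: 2^(p/2) Gamma((p+1)/2) / sqrt(pi)."""
    return 2.0 ** (p / 2.0) * math.gamma((p + 1) / 2.0) / math.sqrt(pi)

m1, m2, m3, m4 = (abs_normal_moment(p) for p in (1, 2, 3, 4))

# Same-block terms; the negative sign of odd moments of minima cancels in
# every product below, so half-normal moments can be used directly.
var_sq = m4 - m2 ** 2                  # var(m_k^2) = 2
cov_sq_prod = m3 * m1 - m2 * m1 * m1   # cov(m_k^2, m_k m*_{k-1}) = 2/pi
var_prod = m2 * m2 - (m1 * m1) ** 2    # var(m_k m*_{k-1}) = 1 - 4/pi^2

# Cross-block constants copied from the displays in the text.
cov_sq_sq = -0.5                       # cov(m_k^2, (m*_k)^2)
cov_starsq_prod = -2.0 / (3.0 * pi)    # cov((m*_k)^2, m_k m*_{k-1})
cov_lag = 1.0 / pi - 4.0 / pi ** 2     # cov(m_k m*_{k-1}, m_{k+1} m*_k)

first_sum = (2 * var_sq + 4 * var_prod + 2 * cov_sq_sq
             - 4 * cov_sq_prod - 4 * cov_starsq_prod)
# Assumption: cov(m_k m*_{k-1}, m_{k-1}^2) equals cov_starsq_prod by symmetry.
second_sum = 4 * (2 * cov_lag - cov_starsq_prod - cov_sq_prod)

scale = pi ** 2 / (4 * (pi - 2) ** 2)
avar = scale * (first_sum + second_sum)
closed_form = (7 * pi ** 2 / 4 - 2 * pi / 3 - 12) / (pi - 2) ** 2

print(avar, closed_form)
```

Both numbers agree up to rounding, at a value of roughly 2.44 times $\sigma^4_{\tau-}$.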

Step 5 For a central limit theorem, the squared bias needs to be asymptotically negligible compared to the variance, which is satisfied for $K_n={\scriptstyle{\mathcal{O}}}(h_n^{-2\alpha/(1+2\alpha)})$ . By the existence of higher moments of $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$ , a Lyapunov-type condition is straightforward, such that asymptotic normality conditional on $\sigma^2_{\tau-}$ is implied by a classical central limit theorem for m-dependent triangular arrays such as the one in [Reference Berk3]. A feasible central limit theorem is implied by this conditional asymptotic normality in combination with $\mathcal{F}^X$ -stable convergence. For the stability, we show that the $\alpha_n=K_n^{1/2}\big(\hat\sigma_{\tau-}^2-\sigma_{\tau-}^2\big)$ satisfy

(20)

\begin{equation} \mathbb{E}[Z g(\alpha_n)] \rightarrow \mathbb{E}[Z g(\alpha)] = \mathbb{E}[Z]\mathbb{E}[g(\alpha)] \end{equation}

for any $\mathcal{F}^X$ -measurable bounded random variable Z and continuous bounded function g, where

\begin{equation*} \alpha = \sigma_{\tau-}^2\frac{1}{(\pi-2)}\sqrt{\frac{7\pi^2}{4}-\frac{2\pi}{3}-12}\, U, \end{equation*}

with U a standard normally distributed random variable which is independent of $\mathcal{F}^X$ . By the above approximations it suffices to prove this for the statistics based on $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$ from (14), and Z measurable with respect to $\sigma\big(\int_0^t\sigma_s\,\textrm{d} W_s,0\le t\le 1\big)$ . Set

\begin{equation*} A_n = [\tau-(K_n+1)h_n,\tau], \quad \tilde X(n)_t = \int_0^t\mathbf{1}_{A_n}(s)\sigma_{\lfloor sh_n^{-1}\rfloor h_n}\,\textrm{d} W_s, \quad \bar X(n)_t = X_t-\tilde X(n)_t. \end{equation*}

Denote by $\mathcal{H}_n$ the $\sigma$-field generated by $\bar X(n)_t$ and $\mathcal{F}^X_0$. The sequence $(\mathcal{H}_n)_{n\in\mathbb{N}}$ is isotonic with limit $\bigvee_n \mathcal{H}_n=\sigma(\int_0^t\sigma_s\,\textrm{d} W_s,0\le t\le 1)$. Since $\mathbb{E}[Z\mid\mathcal{H}_n]\rightarrow Z$ in $L^1(\mathbb{P})$ as $n\rightarrow\infty$, it is enough to show that $\mathbb{E}[Zg(\alpha_n)]\rightarrow \mathbb{E}[Z]\mathbb{E}[g(\alpha)]$ for any Z that is $\mathcal{H}_{n_0}$-measurable for some $n_0\in\mathbb{N}$. Observe that $\alpha_n$ includes only increments of local minima based on $\tilde X(n)_t$, which are uncorrelated with those of $\bar X(n)_t$. For all $n\ge n_0$, we hence obtain $\mathbb{E}[Zg(\alpha_n)]=\mathbb{E}[Z]\mathbb{E}[g(\alpha_n)] \rightarrow \mathbb{E}[Z]\mathbb{E}[g(\alpha)]$ by a standard central limit theorem. This shows (20), and completes the proof of (12).

Proof of Proposition 1. For the quarticity estimator (10), when $\lfloor h_n^{-1}\tau\rfloor>K_n$ we have

\begin{align*} \mathbb{E}\big[\widehat{{\sigma^4_{\tau}}}_--\sigma^4_{\tau-}\big] & = \frac{\pi}{4(3\pi-8)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-2} \mathbb{E}\Big[\tilde m_{k,n}^4 + (\tilde m^*_{k-1,n})^4 - 4\tilde m_{k,n}^3\tilde m^*_{k-1,n} \\[5pt] & \qquad\qquad - 4\tilde m_{k,n}(\tilde m^*_{k-1,n})^3 + 6\tilde m_{k,n}^2(\tilde m^*_{k-1,n})^2\Big] - \mathbb{E}[\sigma^4_{\tau-}] + \mathcal{O}\big(h_n^{\alpha\wedge 1/2}\big) \\[5pt] & = \bigg(\frac{\pi}{4(3\pi-8)}\bigg(6-\frac{16}{\pi}-\frac{16}{\pi}+6\bigg)-1\bigg) \mathbb{E}[\sigma^4_{\tau-}] + {\scriptstyle{\mathcal{O}}}(1) \\[5pt] & = {\scriptstyle{\mathcal{O}}}(1) \end{align*}

by using the same moments as in the computation of the asymptotic variance. We can bound its variance by

\begin{align*} \textrm{var}\big(\widehat{{\sigma^4_{\tau}}}_-\big) & \le \frac{\pi^2}{16(3\pi-8)^2K_n^2}2K_nh_n^{-4} \textrm{var}\big(\big(\tilde m_{k,n}-\tilde m^*_{k-1,n}\big)^4\big) + {\scriptstyle{\mathcal{O}}}\big(K_n^{-1}\big) \\[5pt] & \le \frac{1}{K_n}\frac{\pi^2}{8(3\pi-8)^2}h_n^{-4} \mathbb{E}\big[\big(\tilde m_{k,n}-\tilde m^*_{k-1,n}\big)^8\big] + {\scriptstyle{\mathcal{O}}}\big(K_n^{-1}\big) \\[5pt] & \le \frac{1}{K_n}\frac{\pi^2}{8(3\pi-8)^2}h_n^{-4}256\mathbb{E}\big[\tilde m_{k,n}^8\big] + {\scriptstyle{\mathcal{O}}}\big(K_n^{-1}\big) = \mathcal{O}(K_n^{-1}), \end{align*}

which readily implies Proposition 1.
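For orientation, the statistics analysed in this section can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the author's implementation: the normalizations $\pi/(2(\pi-2))$ and $\pi/(4(3\pi-8))$ are read off the bias computations above, the observable block minima $m_{k,n}$ replace the unobservable $\tilde m_{k,n}$ and $\tilde m^*_{k-1,n}$, time runs over $[0,1]$ with $n$ equidistant observations, and the function and argument names are hypothetical.

```python
import math

def local_minima(prices, block_size):
    """Minima over consecutive, non-overlapping blocks of block_size observations."""
    n_blocks = len(prices) // block_size
    return [min(prices[k * block_size:(k + 1) * block_size]) for k in range(n_blocks)]

def spot_estimators(prices, n, block_size, tau, K_n):
    """Left-sided spot volatility and quarticity estimates at time tau in (0, 1]."""
    h_n = block_size / n                             # block length in time units
    m = local_minima(prices, block_size)
    blocks_before_tau = math.floor(tau * n / block_size)
    last = min(blocks_before_tau, len(m)) - 1        # upper summation index
    first = max(last - K_n + 1, 1)                   # lower index, at least 1
    diffs = [m[k] - m[k - 1] for k in range(first, last + 1)]
    # Normalized by K_n even if fewer than K_n differences are available.
    vol = math.pi / (2 * (math.pi - 2) * K_n) * sum(d ** 2 for d in diffs) / h_n
    quart = (math.pi / (4 * (3 * math.pi - 8) * K_n)
             * sum(d ** 4 for d in diffs) / h_n ** 2)
    return vol, quart
```

A call such as `spot_estimators(prices, n=23400, block_size=15, tau=0.5, K_n=120)` would average over the 120 blocks preceding the middle of the observation interval; the value of `K_n` here is an arbitrary illustrative choice.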

6.3. Asymptotics of the truncated spot volatility estimation with jumps

Proof of Theorem 2. Denote by $D^X_k\;:\!=\;m_{k,n}-m_{k-1,n}$ , $k=1,\ldots,h_n^{-1}-1$ , the differences of local minima based on the observations (4), with the general semimartingale (3) with jumps. Denote by $D^C_k\;:\!=\;\tilde m_{k,n}-\tilde m_{k-1,n}^*$ , $k=1,\ldots,h_n^{-1}-1$ , the differences of the unobservable local minima considered in Section 6.2. In particular, the statistics $D^C_k$ are based only on the continuous part $(C_t)$ in (3) such that the jumps are eliminated. Theorem 2 is implied by Proposition 1 if we can show that

\begin{equation*} \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \big(\big(D^X_k\big)^2\mathbf{1}_{\{|D^X_k|\le u_n\}}-\big(D^C_k\big)^2\big) = \mathcal{O}_{\mathbb{P}}\big(h_n^{{\alpha}/({2\alpha+1})}\big) = {\scriptstyle{\mathcal{O}}}_{\mathbb{P}}\big(K_n^{-1/2}\big). \end{equation*}

We decompose this difference of the truncated estimator, which is based on the available observations with jumps, and the non-truncated estimator, which uses non-available observations without jumps, in the following way:

\begin{align*} & \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \big(\big(D^X_k\big)^2\mathbf{1}_{\{|D^X_k|\le u_n\}}-\big(D^C_k\big)^2\big) \\[5pt] & = \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \Big(\mathbf{1}_{\{|D^C_k|> c u_n\}}\big(\big(D^X_k\big)^2\mathbf{1}_{\{|D^X_k|\le u_n\}}-\big(D^C_k\big)^2\big) \\[5pt] & \qquad\qquad + \mathbf{1}_{\{|D^C_k|\le c u_n\}}\mathbf{1}_{\{|D^X_k|\le u_n\}} \big(\big(D^X_k\big)^2-\big(D^C_k\big)^2\big) - \mathbf{1}_{\{|D^C_k|\le c u_n\}} \mathbf{1}_{\{|D^X_k|> u_n\}}\big(D^C_k\big)^2\Big) \end{align*}

with some arbitrary constant $c\in(0,1)$. Without loss of generality we can set $\beta=1$ in this proof, i.e. $u_n=h_n^{\kappa}$. The three summands are different error terms, which we consider separately. They are due to

• large absolute statistics based on the continuous part $(C_t)$;
• non-truncated statistics which contain (small) jumps;
• the truncation of statistics $(D_k^X)$ which exceed the threshold, which also removes their continuous parts.

The probability $\mathbb{P}(|D^C_k|> c u_n)$ can be bounded using the estimate from (18) and Gaussian tail bounds. Observe that the remainder in (18) is non-negative. This yields that, for some $y>0$ , we have

\begin{align*}\mathbb{P}\big(h_n^{-1/2}\big|\tilde m_{k,n}\big|>y\big)\le\mathbb{P}\Big(\sup_{0\le t\le 1} B_t >y\Big),\end{align*}

which is intuitive, since the errors $(\varepsilon_i)$ are non-negative. We apply the triangle inequality and then Hölder’s inequality to the expectation of the absolute first error term and obtain, for any $p\in \mathbb{N}$,

\begin{align*} & \frac{\pi}{2(\pi-2)K_n}\mathbb{E}\Bigg[\Bigg| \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbf{1}_{\{|D^C_k|> c u_n\}}\big(\big(D^X_k\big)^2\mathbf{1}_{\{|D^X_k|\le u_n\}} - \big(D^C_k\big)^2\big)\Bigg|\Bigg] \\[5pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\big[\mathbf{1}_{\{|D^C_k|> c u_n\}}\big|\big(D^X_k\big)^2\mathbf{1}_{\{|D^X_k|\le u_n\}} - \big(D^C_k\big)^2\big|\big] \\[5pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \big(\mathbb{P}\big(|D^C_k|> c u_n\big)2\,\big(u_n^4+\mathbb{E}\big[\big(D^C_k\big)^4\big]\big)\big)^{1/2} \\[5pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \big(\mathbb{P}\big(h_n^{-1/2}|D^C_k|> c h_n^{\kappa-1/2}\big)\big)^{1/2}\sqrt{2} u_n^2 \\[7pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \bigg(2\mathbb{P}\bigg(|B_1|> \frac{c}{2} h_n^{\kappa-1/2}\bigg)\bigg)^{1/2}\sqrt{2} u_n^2 \\[7pt] &\le \frac{\sqrt{2}\pi}{(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{2\kappa-1} \exp\!\bigg({-}\frac{c^2}{4} h_n^{2\kappa-1}\bigg) \\[7pt] & = \mathcal{O}\big(h_n^{(-p+1)(2\kappa-1)}\big) = {\scriptstyle{\mathcal{O}}}\big(h_n^{{\alpha}/({2\alpha+1})}\big). \end{align*}

Since $2\kappa-1<0$ and p is arbitrarily large, we conclude that the first error term is asymptotically negligible. We will use the elementary inequalities

\begin{align*} D^X_k & = \min_{i\in\mathcal{I}_k^n}(C_{{i}/{n}}+J_{{i}/{n}}+\varepsilon_i) - \min_{i\in\mathcal{I}_{k-1}^n}(C_{{i}/{n}}+J_{{i}/{n}}+\varepsilon_i) \\[5pt] & \le \min_{i\in\mathcal{I}_k^n}(C_{{i}/{n}}+\varepsilon_i) + \max_{i\in\mathcal{I}_k^n}J_{{i}/{n}} - \min_{i\in\mathcal{I}_{k-1}^n}(C_{{i}/{n}}+\varepsilon_i) - \min_{i\in\mathcal{I}_{k-1}^n}J_{{i}/{n}} \\[5pt] & = D^C_k + \max_{i\in\mathcal{I}_k^n}J_{{i}/{n}} - \min_{i\in\mathcal{I}_{k-1}^n}J_{{i}/{n}} + \mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big), \\[5pt] D^X_k & = \min_{i\in\mathcal{I}_k^n}(C_{{i}/{n}}+J_{{i}/{n}}+\varepsilon_i) - \min_{i\in\mathcal{I}_{k-1}^n}(C_{{i}/{n}}+J_{{i}/{n}}+\varepsilon_i) \\[5pt] & \ge \min_{i\in\mathcal{I}_k^n}(C_{{i}/{n}}+\varepsilon_i) + \min_{i\in\mathcal{I}_k^n}J_{{i}/{n}} - \min_{i\in\mathcal{I}_{k-1}^n}(C_{{i}/{n}}+\varepsilon_i) - \max_{i\in\mathcal{I}_{k-1}^n}J_{{i}/{n}} \\[5pt] & = D^C_k + \min_{i\in\mathcal{I}_k^n}J_{{i}/{n}} - \max_{i\in\mathcal{I}_{k-1}^n}J_{{i}/{n}} + \mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big). \end{align*}

Therefore, we can bound $|D^X_k-D^C_k|$ by

\begin{align*} \sup_{\substack{i\in\mathcal{I}_{k}^n, j\in\mathcal{I}_{k-1}^n}}|J_{\frac{i}{n}}-J_{\frac{j}{n}}| & \le \sup_{\substack{s\in[kh_n,(k+1)h_n], t\in[(k-1)h_n,kh_n]}}|J_{s}-J_{t}| \\[5pt] & \le \sup_{\substack{s\in[kh_n,(k+1)h_n]}}|J_{s}-J_{kh_n}| + \sup_{\substack{t\in[(k-1)h_n,kh_n]}}|J_{kh_n}-J_{t}|, \end{align*}

and the remainder term of the approximation for the continuous part, which is $\mathcal{O}_{\mathbb{P}}\big(h_n^{\alpha\wedge 1/2}\big)$ . Since the compensated small jumps of a semimartingale admit a martingale structure, Doob’s inequality for càdlàg $L_2$ -martingales can be used to bound these suprema. Based on these preliminaries, we obtain, for the expected absolute value of the second error term,

\begin{align*} & \frac{\pi}{2(\pi-2)K_n}\mathbb{E}\Bigg[\Bigg| \sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1}\mathbf{1}_{\{|D^C_k|\le c u_n\}} \mathbf{1}_{\{|D^X_k|\le u_n\}}\big(\big(D^X_k\big)^2-\big(D^C_k\big)^2\big)\Bigg|\Bigg] \\[5pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\big[\mathbf{1}_{\{|D^C_k|\le c u_n\}}\mathbf{1}_{\{|D^X_k|\le u_n\}} \big|\big(D^X_k\big)^2-\big(D^C_k\big)^2\big|\big] \\[5pt] & \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\Big[\sup_{\substack{i\in\mathcal{I}_{k}^n, j\in\mathcal{I}_{k-1}^n}} |J_{{i}/{n}}-J_{{j}/{n}}|^2\wedge (1+c)^2u_n^2\Big] \\[5pt] & \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\Big[\sup_{\substack{t\in [kh_n,(k+1)h_n]}}|J_{t}-J_{kh_n}|^2\wedge u_n^2\Big] \\[5pt] & \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\big[|J_{(k+1)h_n}-J_{kh_n}|^2\wedge u_n^2\big] = \mathcal{O}\big(u_n^{2-r}\big). \end{align*}

When applying the elementary inequalities from above, a cross term in the upper bound for $\big(D^X_k\big)^2-\big(D^C_k\big)^2$ is of smaller order and has been neglected directly; it can be handled using the Cauchy–Schwarz inequality. In the last step, we adopt a bound on the expected absolute thresholded jump increments from [Reference Aït-Sahalia and Jacod1, (54)]. For the negligibility of the second error term, we thus get the condition that

(21)

\begin{equation} \kappa(2-r)\ge \frac{\alpha}{1+2\alpha}. \end{equation}

Doob’s inequality also yields

\begin{equation*} \mathbb{P}\Big(\sup_{\substack{t\in [kh_n,(k+1)h_n]}}|J_{t}-J_{kh_n}|\ge (1-c) u_n\Big) \le \frac{\mathbb{E}[|J_{(k+1)h_n}-J_{kh_n}|^{r\wedge 1}]}{((1-c)u_n)^{r\wedge 1}} + \mathcal{O}(h_n) = \mathcal{O}\big(h_n u_n^{-r}\big). \end{equation*}

For this upper bound, we decomposed the jumps into the sum of large jumps and the martingale of compensated small jumps, to which we applied Doob’s inequality. We derive the following estimate for the expectation of the third (absolute) error term:

\begin{align*} & \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\big[\mathbf{1}_{\{|D^C_k|\le c u_n\}}\mathbf{1}_{\{|D^X_k|> u_n\}}\big(D^C_k\big)^2\big] \\[5pt] & \le \frac{\pi}{2(\pi-2)K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{E}\big[\mathbf{1}_{\{2\sup_{\substack{s\in[(k-1)h_n,(k+1)h_n]}}|J_{s}-J_{kh_n}|\ge (1-c) u_n\}}\big(D^C_k\big)^2\big] \\[5pt] & \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1}h_n^{-1} \mathbb{P}\Big(\sup_{\substack{t\in [kh_n,(k+1)h_n]}}|J_{t}-J_{kh_n}|\ge (1-c) u_n\Big) \mathbb{E}\big[\big(D^C_k\big)^2\big] \\[5pt] & \lesssim \frac{1}{K_n}\sum_{k=(\lfloor h_n^{-1}\tau\rfloor-K_n)\vee 1}^{\lfloor h_n^{-1}\tau\rfloor-1} \bigg(\frac{\mathbb{E}[|J_{(k+1)h_n}-J_{kh_n}|^{r\wedge 1}]}{((1-c)u_n)^{r\wedge 1}} + \mathcal{O}(h_n)\bigg) = \mathcal{O}\big(h_n u_n^{-r}\big). \end{align*}

For the negligibility of the third error term, we thus get the condition that

(22)

\begin{equation} 1-\kappa r\ge \frac{\alpha}{1+2\alpha}. \end{equation}

Since, under the conditions of Theorem 2, (21) and (22) are satisfied, the proof is finished by the negligibility of all addends in the decomposition above.
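Conditions (21) and (22), together with the requirement $2\kappa-1<0$ used for the first error term, restrict the threshold exponent $\kappa$ to an interval. The following Python helper is a hypothetical illustration of that interval; it assumes $\alpha>0$ and a jump activity index $0\le r<2$, and it does not encode any further conditions of Theorem 2, which are stated outside this section.

```python
def kappa_range(alpha, r):
    """Interval of threshold exponents kappa allowed by (21), (22) and kappa < 1/2.

    Returns (lower, upper), where lower <= kappa is required by (21) and
    kappa <= upper by (22); upper is capped at 1/2, to be read as a strict
    bound. Returns None if no kappa satisfies all requirements.
    """
    if not (alpha > 0 and 0 <= r < 2):
        raise ValueError("expected alpha > 0 and 0 <= r < 2")
    rate = alpha / (1 + 2 * alpha)            # right-hand side of (21) and (22)
    lower = rate / (2 - r)                    # (21): kappa (2 - r) >= rate
    upper = 0.5                               # from 2 kappa - 1 < 0
    if r > 0:
        upper = min(upper, (1 - rate) / r)    # (22): 1 - kappa r >= rate
    return (lower, upper) if lower < upper else None

print(kappa_range(0.5, 1.0))   # alpha = 1/2, r = 1
print(kappa_range(0.5, 1.5))   # a larger r leaves no admissible kappa here
```

For $\alpha=1/2$ the right-hand side of both conditions is $1/4$, so $r=1$ allows $\kappa\in[1/4,1/2)$, while for $r=3/2$ condition (21) would force $\kappa\ge 1/2$.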

Acknowledgement

The author is grateful to an anonymous reviewer for helpful comments.

Funding information

Financial support from the Deutsche Forschungsgemeinschaft (DFG) under grant 403176476 is gratefully acknowledged.

Competing interests

There were no competing interests to declare which arose during the preparation or publication process of this article.

References

[1] Aït-Sahalia, Y. and Jacod, J. (2010). Is Brownian motion necessary to model high-frequency data? Ann. Statist. 38, 3093–3128. doi:10.1214/09-AOS749

[2] Aït-Sahalia, Y. and Jacod, J. (2014). High-Frequency Financial Econometrics. Princeton University Press.

[3] Berk, K. N. (1973). A central limit theorem for m-dependent random variables with unbounded m. Ann. Prob. 1, 352–354. doi:10.1214/aop/1176996992

[4] Bibinger, M., Hautsch, N., Malec, P. and Reiß, M. (2019). Estimating the spot covariation of asset prices – Statistical theory and empirical evidence. J. Business Econom. Statist. 37, 419–435. doi:10.1080/07350015.2017.1356728

[5] Bibinger, M., Jirak, M. and Reiß, M. (2016). Volatility estimation under one-sided errors with applications to limit order books. Ann. Appl. Prob. 26, 2754–2790. doi:10.1214/15-AAP1161

[6] Bibinger, M., Neely, C. and Winkelmann, L. (2019). Estimation of the discontinuous leverage effect: Evidence from the Nasdaq order book. J. Econometrics 209, 158–184. doi:10.1016/j.jeconom.2019.01.001

[7] Bibinger, M. and Winkelmann, L. (2018). Common price and volatility jumps in noisy high-frequency data. Electron. J. Statist. 12, 2018–2073. doi:10.1214/18-EJS1444

[8] Bishwal, J. P. N. (2022). Parameter Estimation in Stochastic Volatility Models. Springer, Cham. doi:10.1007/978-3-031-03861-7

[9] Chaker, S. (2017). On high frequency estimation of the frictionless price: The use of observed liquidity variables. J. Econometrics 201, 127–143. doi:10.1016/j.jeconom.2017.06.018

[10] Clinet, S. and Potiron, Y. (2019). Testing if the market microstructure noise is fully explained by the informational content of some variables from the limit order book. J. Econometrics 209, 289–337. doi:10.1016/j.jeconom.2019.01.004

[11] Clinet, S. and Potiron, Y. (2021). Estimation for high-frequency data under parametric market microstructure noise. Ann. Inst. Statist. Math. 73, 649–669. doi:10.1007/s10463-020-00762-3

[12] El Euch, O., Fukasawa, M. and Rosenbaum, M. (2018). The microstructural foundations of leverage effect and rough volatility. Finance Stoch. 22, 241–280. doi:10.1007/s00780-018-0360-z

[13] Hansen, P. R. and Lunde, A. (2006). Realized variance and market microstructure noise. J. Business Econom. Statist. 24, 127–161. doi:10.1198/073500106000000071

[14] Hoffmann, M., Munk, A. and Schmidt-Hieber, J. (2012). Adaptive wavelet estimation of the diffusion coefficient under additive error measurements. Ann. Inst. H. Poincaré Prob. Statist. 48, 1186–1216. doi:10.1214/11-AIHP472

[15] Jacod, J. and Protter, P. (2012). Discretization of Processes. Springer, Berlin. doi:10.1007/978-3-642-24127-7

[16] Janson, S. (2007). Brownian excursion area, Wright’s constants in graph enumeration, and other Brownian areas. Prob. Surv. 4, 80–145. doi:10.1214/07-PS104

[17] Jirak, M., Meister, A. and Reiß, M. (2014). Adaptive function estimation in nonparametric regression with one-sided errors. Ann. Statist. 42, 1970–2002. doi:10.1214/14-AOS1248

[18] Li, Y., Xie, S. and Zheng, X. (2016). Efficient estimation of integrated volatility incorporating trading information. J. Econometrics 195, 33–50. doi:10.1016/j.jeconom.2016.05.017

[19] Li, Z. M. and Linton, O. (2022). A ReMeDI for microstructure noise. Econometrica 90, 367–389. doi:10.3982/ECTA17505

[20] Liu, Y., Liu, Q., Liu, Z. and Ding, D. (2017). Determining the integrated volatility via limit order books with multiple records. Quant. Finance 17, 1697–1714. doi:10.1080/14697688.2017.1307510

[21] Mancini, C. (2009). Non-parametric threshold estimation for models with stochastic diffusion coefficient and jumps. Scand. J. Statist. 36, 270–296. doi:10.1111/j.1467-9469.2008.00622.x

[22] Mancini, C., Mattiussi, V. and Renò, R. (2015). Spot volatility estimation using delta sequences. Finance Stoch. 19, 261–293. doi:10.1007/s00780-015-0255-1

[23] Meister, A. and Reiß, M. (2013). Asymptotic equivalence for nonparametric regression with non-regular errors. Prob. Theory Relat. Fields 155, 201–229. doi:10.1007/s00440-011-0396-x

[24] Reiß, M. and Wahl, M. (2019). Functional estimation and hypothesis testing in nonparametric boundary models. Bernoulli 25, 2597–2619. doi:10.3150/18-BEJ1064

[25] Rosenbaum, M. and Tomas, M. (2021). From microscopic price dynamics to multidimensional rough volatility models. Adv. Appl. Prob. 53, 425–462. doi:10.1017/apr.2020.60

[26] Shepp, L. A. (1979). The joint density of the maximum and its location for a Wiener process with drift. J. Appl. Prob. 16, 423–427. doi:10.2307/3212910

[27] Takács, L. (1996). On a generalization of the arc-sine law. Ann. Appl. Prob. 6, 1035–1040. doi:10.1214/aoap/1034968240

[28] Tauchen, G. and Todorov, V. (2011). Volatility jumps. J. Business Econom. Statist. 29, 356–371.

Figure 1. Monte Carlo means to estimate $\Psi_n(\sigma^2)$ over a fine grid (interpolated line) for $n=23\,400$ and $n\cdot h_n=15$. Left: the dotted line shows the identity function. Right: the dotted line is a linear function with slope $1.046$.

Table 1. Regression slopes to measure the bias of estimator (8) and deviation $\Psi_n(\sigma^2)-\sigma^2$.

Figure 2. True and estimated spot volatility with pointwise confidence sets.

Table 2. Summary statistics of estimation for different values of $h_n$ and $K_n$. MSD = mean standard deviation, MAB = mean absolute bias, MABC = MAB of bias-corrected estimator.
