1. Introduction
Single-server queues are fundamental models in applied probability and queueing theory, and heavy-traffic limits have been well studied; see the recent surveys in [Reference Chen and Yao7], [Reference Whitt31], and [Reference Whitt32], and the recent work on the model with abandonment in [Reference Ward and Glynn28] and [Reference Ward and Glynn30]. In these studies, the arrival processes are usually assumed to be Poisson or renewal processes. Recently, to account for the excessive burstiness, clustering effects, and path-dependence, other point processes such as Hawkes and Pólya processes have been used to model arrivals in queueing models. They can be used to model, for example, internet/social media traffic flows, patient flows during a pandemic, neuron interaction processes, and high-frequency transaction processes in limit order books. Queues with Hawkes input are studied in [Reference Chen8], [Reference Daw and Pender11], [Reference Gao and Zhu16], [Reference Koops, Saxena, Boxma and Mandjes22], and [Reference Selvamuthu and Tardelli26], and single-server queues with Pólya arrival processes are considered in [Reference Fendick and Whitt14] and [Reference Fendick and Whitt15]. On the other hand, many systems have parallel servers, such as data centers and internet processors, and various randomized routeing schemes have been developed in the literature (see e.g. the recent survey [Reference Der Boor, Borst, Van Leeuwaarden and Mukherjee12]). A simple scheme is to send incoming jobs/customers randomly to any of the servers upon arrival.
In this paper we consider parallel single-server queues in heavy traffic, where the arrival process for each queue is randomly split from one Hawkes process. That is, upon each arrival, a customer or job is randomly assigned to one of the queues. The service times in the different queues are mutually independent and may have different distributions, and within a queue the service times are assumed to be independent and identically distributed (i.i.d.). The service discipline in each queue is first-come first-served (FCFS). We assume that each queue is critically loaded in heavy traffic, that is, the traffic intensity gets close to one. We also consider the queueing model with abandonment, where customers joining each queue may abandon the system while waiting in the queue and before receiving service. We focus on the joint queue length and workload processes, and prove the functional central limit theorem (FCLT) for the associated diffusion-scaled processes.
Randomly split Hawkes processes have recently been studied by the present authors [Reference Li and Pang23]. Unlike Poisson processes, the split Hawkes processes become a multivariate Hawkes process with a particular dependence structure (see the covariance function of scaling limits in Proposition 2.1). As a consequence, the vector of queueing (workload) processes of the parallel server queues will be correlated. In contrast, in the case of randomly split Poisson processes, because of the independence property of Poisson thinning/sampling, the queues at the parallel servers are mutually independent and each queue can be analyzed as an M/G/1 queue. However, our model becomes more challenging to analyze directly. This is in a similar flavor to the open problem recently posed by Mandjes [Reference Mandjes24] about multivariate M/G/1 queues with coupled input and parallel service. Our model presents another example of multivariate single-server queues with correlated input, and our results provide heavy-traffic diffusion approximations for such a system.
For the queueing models without abandonment, the queueing limit process is a multidimensional reflected Brownian motion (RBM) in the non-negative orthant with orthonormal reflections. In many queueing networks, RBMs arise as the heavy-traffic limits of the queueing and workload processes; see e.g. the recent surveys [Reference Dai and Harrison10] and [Reference Williams34]. In the very original article by Harrison [Reference Harrison17], as a diffusion approximation for a pair of single-server queues in series, a two-dimensional RBM in the non-negative quadrant was introduced, which has a normal reflection at one axis and a tangential reflection at the other axis. On the other hand, our model with two queues provides an example of RBMs in the non-negative quadrant with normal reflections at both axes. A multidimensional RBM limit with orthonormal reflections was derived for a network of parallel single-server queues with Markov-modulated service speeds in [Reference Dorsman, Vlasiou and Zwart13]. This work contributes to the literature of queueing network scaling limits by providing a concrete example of RBMs in a non-negative orthant with orthonormal reflections.
Although the correlation structure of the split Hawkes processes can be explicitly characterized (as can the Brownian motion without reflection), it is challenging to compute the covariance functions for the limiting RBM in the parallel server queueing model. Moreover, one can check that the conditions of Propositions 8 and 9 in [Reference Dai and Harrison9] (see also [Reference Williams33]) are not satisfied so that the limit process does not possess a product-form stationary distribution. Fortunately, the numerical approach developed in [Reference Dai and Harrison9] to compute the stationary distribution can be used for the limiting RBM in our model. We implement that algorithm in a two-queue example to illustrate how the parameters in the Hawkes process and the splitting probabilities as well as the heavy-traffic scalings affect the correlation between the steady state of the two queues in the limit (see Section 3.1).
For the queueing models with abandonment, the queueing limit process is a multidimensional reflected Ornstein–Uhlenbeck (OU) diffusion in the non-negative orthant with orthonormal reflections. This extends the one-dimensional results in [Reference Ward and Glynn30]. This result is of interest in two respects. First, there is a very limited result in the queueing network literature that gives a multidimensional reflected OU limit. Huang and Zhang [Reference Huang and Zhang20] study open Jackson networks with reneging, and obtain a multidimensional reflected OU diffusion limit with a reflection being characterized via the routeing probability matrix (extending the RBM approximations for open Jackson networks without abandonment in [Reference Reiman25]). The reflection is very complex in that model, but the reflections in the limits of our model are orthonormal, which is of interest in its own right. It is worth noting that in the studies of dynamic scheduling in multiclass GI/GI/1+GI queues [Reference Ata and Tongarlak1, Reference Kim and Ward21], although the queueing processes are multidimensional, the Brownian control problem is solved via the so-called workload process as a one-dimensional controlled OU process with reflection. Second, there have been studies of the stationary distributions and ergodic properties of reflected OU processes in one dimension; see e.g. [Reference Ward and Glynn29] and [Reference Xing, Zhang and Wang35]. However, such results beyond one dimension are wide open. Our limit process provides a concrete example for which an explicit characterization of the stationary distribution could potentially be obtained. We leave this as an open problem for future work. (We also refer the readers to a relevant article [Reference Atar, Budhiraja and Dupuis2] for the recurrence properties of multidimensional reflected diffusions in convex polyhedral cones.) It would also be interesting to explore whether the numerical scheme in [Reference Dai and Harrison9] could be further developed for the multidimensional reflected OU processes such as those arising from our model.
The proof for the model without abandonment uses the continuous mapping approach for the multidimensional reflection maps (see [Reference Whitt31]). This is standard, but the particular scaling involved in the Hawkes process and its splitting scheme must be taken into account. For the queueing models with abandonment, we adapt the approach in [Reference Ward and Glynn30] by comparing with the model without abandonment. It is therefore also important to first study the model without abandonment. In the meantime, the comparison approach is non-trivial, since this requires us to use the martingale representations for the Hawkes process and the randomly split Hawkes processes [Reference Bacry, Delattre, Hoffmann and Muzy3, Reference Li and Pang23], as well as some martingale inequalities and properties.
We also discuss two related models in which the arrival processes for the parallel servers come from a multivariate Hawkes process in Section 5. The limits for the models with and without abandonment are of the same type with orthonormal reflections but the driving Brownian motions have a different covariance function. Although the split Hawkes process is also a multivariate Hawkes process, the splitting scheme introduces a particular structure together with only one self-exciting function. On the other hand, a general multivariate Hawkes process has a matrix of self-exciting functions. This is yet another example of the open problems posed in [Reference Mandjes24], and also complements the parallel server queueing model with correlated services due to Markov modulation in [Reference Dorsman, Vlasiou and Zwart13]. The limits for these models will also be of interest in their own right.
1.1. Organization of the paper
In Section 2 we discuss random splitting of Hawkes processes and review the FCLT results. In Section 3 we present the parallel single-server queueing model with split Hawkes arrival processes and the FCLT on the joint queueing and workload processes. A numerical example is provided in Section 3.1 to illustrate the stationary distribution of the limiting queueing process in a model with two queues. In Section 4 we discuss the model with abandonment and state the corresponding FCLT. In Section 5 we consider the parallel server queueing model with a general multivariate Hawkes process and state the corresponding scaling limits. The proofs for these models are given in Section 6 and the Appendix.
1.2. Notation
All random variables and processes are defined in a common complete probability space $(\Omega,\mathcal{F},\{\mathcal{F}_{t}\}_{t\geq 0},\mathbb{P})$ . Throughout the paper, ${\mathbb N}$ denotes the set of natural numbers. ${\mathbb R}^d ({\mathbb R}^d_{+})$ denotes the space of d-dimensional real (non-negative) vectors; we write ${\mathbb R} ({\mathbb R}_{+})$ when $d=1$ . Let ${\mathbb D}^{d}={\mathbb D}^{d}({\mathbb R}_{+},{\mathbb R}^{d})$ denote the space of ${\mathbb R}^{d}$ -valued càdlàg functions on ${\mathbb R}_{+}$ . $({\mathbb D},J_{1})$ denotes the space ${\mathbb D}$ equipped with Skorokhod $J_{1}$ topology (see [Reference Billingsley4]), which is complete and separable. ${\mathbb D}^{d}$ denotes the space of ${\mathbb R}^{d}$ -valued càdlàg functions endowed with the weak Skorokhod $J_{1}$ topology [Reference Whitt32], for which we write $({\mathbb D}^{d},J_{1})$ . For an integrable function $f\;:\; {\mathbb R}\to{\mathbb R}$ , its $L^{1}$ norm is denoted by $\|f\|_{1}$ . Notations $\to$ and $\Rightarrow$ mean convergence of real numbers and convergence in distribution, respectively. For a vector a, $\text{diag} (a)$ denotes the diagonal matrix with the elements of vector a on the main diagonal. For two vectors $a=(a_{k})_{k}$ and $b=(b_{k})_{k}$ , $a b=(a_{k}b_{k})_{k}=\text{diag} (a) b$ denotes their elementwise product. We use $\mathfrak{e}$ to denote the identity function $\mathfrak{e}(t)=t$ for $t \in {\mathbb R}_+$ . Additional notation is introduced in the paper whenever necessary.
2. Random splitting of Hawkes processes
A one-dimensional Hawkes process, $N=\{N(t),t\ge0\}$ , is a simple counting process with conditional intensity
where $\lambda_{0}>0$ is a constant, called the baseline intensity, $H\;:\; {\mathbb R}_{+}\to{\mathbb R}_{+}$ is the self-exciting function, and $\tau_{j}$ denotes the jth event time of N.
We consider the randomly split Hawkes processes $N_k=\{N_k(t),t\ge0\}$ , $k=1,\dots, d$ , defined as follows:
where $\{\xi_{j},j\ge1\}$ is a sequence of i.i.d. random variables, independent of N, with the distribution
It is shown in [Reference Li and Pang23] that the splitting Hawkes process $(N_{k})_{k}$ is a multivariate Hawkes process with conditional intensity
Notice that the split processes $N_{k}$ are not independent.
We consider a sequence of Hawkes processes indexed by n, that is, $N^{(n)}$ is a Hawkes process for the nth system with intensity process $\lambda^{(n)}(\!\cdot\!)$ whose baseline intensity in (2.1) is $\lambda_{0}^{(n)}$ while the self-exciting function H stays the same. The splitting variables are denoted by $\{\xi_{j}^{(n)},j\ge1\}$ with distribution $(p^{(n)}_{k})_{k}$ .
Define the LLN and CLT-scaled processes for the splitting $\bigl(N^{(n)}_{k}\bigr)_{k}$ by
respectively. It is easy to see (see e.g. [Reference Bacry, Delattre, Hoffmann and Muzy3]) that
where $\varphi=\sum_{j\ge1} H^{*j}$ is the renewal function of H,
denotes the convolution of f and g on ${\mathbb R}_{+}$ , and $\mathfrak{e}$ is the identity function.
The following FCLT for the splitting processes $\bigl(N^{(n)}_{k}\bigr)_{k}$ is proved in [Reference Li and Pang23].
Proposition 2.1. Suppose that
We have
where $(\hat{N}_{k})_{k}$ is a d-dimensional Brownian motion with mean zero and covariance matrix
where $\delta_{kk'}=1$ if $k=k'$ and $\delta_{kk'}=0$ if $k\neq k'$ .
The process $\hat{N}_k$ admits the representations
where $(W_{k})_k$ is a standard d-dimensional Brownian motion, $W=\sum_{k=1}^{d}\sqrt{p_{k}}W_{k}$ , and $\hat{S}_{k}=\sqrt{p_{k}}W_{k}- p_{k}W$ .
The condition $\|H\|_{1}\in(0,1)$ is referred to as the stability condition in the literature of Hawkes processes, under which a stationary version of Hawkes process exists and $\varphi*\mathfrak{e}(t)\to {{\|H\|_{1}}/{(1-\|H\|_{1})}}$ as $t\to\infty$ . The first representation in (2.6) is a direct consequence of [Reference Bacry, Delattre, Hoffmann and Muzy3, Theorem 2] for the vector-valued Hawkes process $\bigl(N^{(n)}_{k}\bigr)_{k}$ characterized by (2.2), while the second representation follows from [Reference Whitt32, Chapter 9.5], where $\hat{S}$ is also a d-dimensional Brownian motion, independent of W, with covariance function
It is necessary that $\sum_{k}\hat{S}_{k}=0$ .
In our study of parallel server queues with a split Hawkes process as arrivals, we will also need the following alternative diffusion-scaled process by replacing the centering term $\mathbb{E}\bigl[\bar{N}^{(n)}_{k}(t)\bigr]$ in (2.3) with a linear function:
The proof of the proposition is given in the Appendix.
Proposition 2.2. Under the conditions in Proposition 2.1, if, in addition,
then
where $(\hat{N}_{k})_{k}$ is the same limit as given in Proposition 2.1.
3. Parallel single-server queues with split Hawkes arrival processes
In this model, there are d parallel servers and each server has its own queue. Arrivals in each queue come from the randomly split Hawkes process and are served in the FCFS discipline. Let $\{N(t)\;:\; t \ge 0 \}$ be the Hawkes process, arriving in the system, and let $(N_{k})_{k}$ be the splitting Hawkes arrival processes with the splitting mechanism as described in Section 2. Recall the baseline rate $\lambda_0$ , splitting probability $(p_k)_{k}$ , and self-exciting function H. For every $k=1,2,\ldots,d$ , let $\{\eta_{j,k}\}_j$ represent the service times of customers in the kth queue. We assume that $\{\eta_{j,k}\}_j$ are i.i.d. with finite mean $ m_{k}$ and variance $\sigma_{k}^2$ , and are independent of the Hawkes arrival process and the random splitting process. Denote the service rate of the kth queue by $\mu_{k}\;:\!=\; 1/m_{k}$ , and write $\mu = (\mu_k)_k$ .
We now give a definition of the model. For every $k=1,\ldots,d$ , let
be the partial sum with the i.i.d. service times, and let $U_{k}(t)\;:\!=\; \sup\{m\ge 0\;:\; V_{k}(m)\le t\}$ be its right-continuous inverse. Then $U_{k}(t)$ is a renewal process representing the number of jobs that the server k can potentially complete by time t. Let $Z=(Z_{k})_{k}$ be the workload processes, and let $Q=(Q_{k})_{k}$ be the queue length processes, with $Q(0)=(Q_{k}(0))_{k}$ being the number of initial jobs in each queue. Under our assumptions, the processes N and (V, U) are independent, and we further assume that Q(0) is independent of the new arrivals $N=(N_{k})_{k}$ and the service processes. Then we have the following flow-balance equations for the dynamics at each queue: for $k=1,\ldots,d$ ,
where
is the busy-time process, and its paired idle-time process is given by
The processes $(Q_{k},Z_{k})_{k}$ take values in the non-negative orthant ${\mathbb R}_{+}^{2d}$ , and can be characterized by the following version of reflection maps (see [Reference Whitt32, Chapter 14.2]).
Definition 3.1. For any $x\in{\mathbb D}^{d}$ and any reflection matrix, i.e. a matrix with $\texttt{Q}_{ij}\ge0$ and $\sum_{i}\texttt{Q}_{ij}\le 1$ such that $(\texttt{Q})^{n}\to0$ as $n\to\infty$ , let the feasible regulator set be
where $\texttt{I}$ is the identity matrix and ${\mathbb D}_{\uparrow}^{d}$ is the subset of functions in ${\mathbb D}^{d}$ that are non-decreasing and non-negative in each coordinate. Then the reflection mapping is defined as
where y is called the regulator component, given by
that is, for all i and t,
and where z is called the content component, given by
In the definition above, the matrix $\texttt{I}-\texttt{Q}$ is the direction of reflection (see [Reference Dai and Harrison9, Reference Harrison and Reiman18]), that is, whenever the boundary face $\{z\in{\mathbb R}^{d}_{+}\;:\; z_{j}=0\}$ is hit for some j, the process $w_{j}$ increases and causes an instantaneous displacement of z in the direction given by $\text{col}_{j}(\texttt{I}-\texttt{Q})$ , the jth column of $(\texttt{I}-\texttt{Q})$ ; $y=\psi_{\texttt{Q}}$ is the minimal element in $\Psi_{\texttt{Q}}$ such that $z\ge0$ . Moreover, the complementarity property holds ([Reference Whitt32, Theorem 14.2.3]), that is,
which also characterizes the regulator uniquely. It is proved in [Reference Whitt32, Theorems 14.2.5 and 14.2.7] that $\psi_{\texttt{Q}}$ is well-defined and Lipschitz under both the uniform norm and the Skorokhod $J_1$ topology.
In particular, if the reflection matrix $\texttt{Q}=0$ , then the angle of reflection is 0 (see [Reference Harrison, Landau and Shepp19]), and the reflection direction is orthogonal to the boundary, which is the case in our paper. Thus the reflection is also referred to as orthonormal. Letting $(\phi,\psi)\equiv(\phi_{0},\psi_{0})$ denote the operator in this case, and recalling that $\mathfrak{e}(t)\equiv t$ is the identity function on ${\mathbb R}_+$ , we can rewrite the processes in (3.1) in terms of Definition 3.1 as follows:
and the queue length process can be rewritten similarly.
We consider a sequence of such queueing systems in heavy traffic and indexed by the parameter n with $n\to \infty$ . The traffic intensity for the kth queue is given by
Define the following scaled processes:
and
We assume the following conditions on the service processes.
Assumption 3.1. Assume that the service times $\{\eta_{j,k}\}_j$ for $k=1,2,\ldots,d$ , satisfy
and letting $F^{(n)}_{k}$ be the cumulative distribution function (CDF) of $ \eta_{1,k}^{(n)}$ , we have for every $\varepsilon>0$
The traffic intensity at each queue satisfies the following: for $k=1,\dots,d$ ,
Observe that by the definition in (3.3), under (3.7), we have
Remark 3.1. Condition (3.6) is known as Lindeberg’s condition for the triangular array $\{\eta^{(n)}_{j,k}\}_{j,k}$ (see [Reference Billingsley5, Theorem 27.2]), under which one can show that as $n\to\infty$ ,
and the processes in (3.4) satisfies
where u.o.c. is short for ‘uniformly on every compact set on ${\mathbb R}_{+}$ ’, $\hat{V}$ is a d-dimensional Brownian motion with mean zero and covariance matrix
and $\hat{U}_{k}(t)=- \mu_{k}\hat{V}_{k}(\mu_{k}t)$ for $t\ge 0$ .
Remark 3.2. In addition to (3.5), if we assume
for some $\hat{\lambda}_{0}, \hat{\mu}_{k},\hat{p}_{k}\in{\mathbb R}$ , as $n\to\infty$ , where $p_k$ is given in (2.5), then (3.7) holds with
The process limits remain the same.
We have the following FCLT for the diffusion-scaled processes $\bigl(\hat{Q}^{(n)},\hat{Z}^{(n)}\bigr)$ .
Theorem 3.1. Suppose that Assumption 3.1 and the conditions in (2.5) and (2.8) hold, and that $\hat{Q}^{(n)}(0)\Rightarrow\hat{Q}(0)\geq0$ . Then
with the limit
where
with $\hat{\rho}_{k}$ in (3.7), and $\hat{W}$ is a d-dimensional Brownian motion with covariance function
3.1. The stationary distribution of $\;\hat{\!Q}$
Observe that $\hat{Q}$ in Theorem 3.1 is a reflected Brownian motion on the orthant ${\mathbb R}_{+}^{d}$ with drift vector $\hat{\theta}$ , covariance matrix $\hat{R}$ , and reflection matrix $\texttt{I}$ . Since $\hat{\theta}_{k}>0$ for every $ k=1,2,\ldots,d$ , there exists a unique stationary distribution of $\hat{Q}$ .
Recall that a probability measure $\pi$ on ${\mathbb R}_{+}^{d}$ is called a stationary distribution for the reflected Brownian motion $\hat{Q}$ if, for every bounded Borel function f on ${\mathbb R}_{+}^{d}$ and every $t>0$ ,
It is shown in [Reference Harrison and Reiman18] and [Reference Williams33] that the stationary distribution for a reflected Brownian motion in the orthant is equivalent to Lebesgue measure on ${\mathbb R}_{+}^{d}$ , and the occupation measures on faces are absolutely continuous with respect to the $(d-1)$ -dimensional Lebesgue measure $\nu_{k}(\!\cdot\!)$ on the face $F_{k}\;:\!=\; \{z \in {\mathbb R}_{+}^{d}\;:\; z_{k}=0\}$ , that is,
Furthermore, $(q_{0};\, q_{1},\ldots, q_{d})$ in this paper jointly satisfy the basic adjoint relationship (BAR):
for all $f\in C_{b}^{2}({\mathbb R}_{+}^{d})$ . It can be checked that the coefficients for $\hat{Q}$ do not satisfy the conditions of Propositions 8 and 9 in [Reference Dai and Harrison9] (see also [Reference Williams33]) which means the stationary density cannot be a product function. The numerical approach developed in [Reference Dai and Harrison9] to compute the stationary distribution is applicable to our model.
Noticing that the marginal distribution $\hat{Q}_{k}(\infty)$ for each k is exponentially distributed from the one-dimensional reflected Brownian motion results, we readily obtain
We apply the numerical algorithm in [Reference Dai and Harrison9] to a model with two queues and compute some characteristics of the stationary distribution. Similar to [Reference Dai and Harrison9], we compute the mean of the queueing limit and the correlation coefficients of $(\hat{W}_{1},\hat{W}_{2})$ and $(\hat{Q}_{1}(\infty),\hat{Q}_{2}(\infty))$ , and examine the effect of the splitting probability parameter $p_{k}$ . We fix $\lambda_{0}=1,\sigma_{1}=\sigma_{2}=1,\|H\|_{1}=0.3$ , and change the values of $p_{1}$ . Observe that by equation (3.8), the value of $\mu_{k}$ changes accordingly as $p_{k}$ changes. We also consider two scenarios of $(\hat{\rho}_{1},\hat{\rho}_{2})$ , taking values $(0.3,0.6)$ and $(1,1.5)$ . Notice that the correlation of $(\hat{W}_{1},\hat{W}_{2})$ from (3.11) is independent of the choice of $(\hat{\rho}_{1},\hat{\rho}_{2})$ by definition. Table 1 shows how the splitting probability $p_{1}$ and the parameters $(\hat{\rho}_{1},\hat{\rho}_{2})$ affect the correlation of $(\hat{Q}_{1}(\infty),\hat{Q}_{2}(\infty))$ .
4. Parallel single-server queues with split Hawkes arrival processes and abandonment
In this section we consider the parallel single-server queues as described in the previous section but with an additional feature of abandonment, that is, each customer joining the queue has a patience time and will leave the queue if the patience time runs out before entering service. The arrival and service processes are modeled in the same way as the previous section. Patience times are assumed to be independent of the arrival and service processes as well as the splitting process. Let $\{\vartheta_{j,k,p}\}_{j\in{\mathbb Z}}$ represent the patience times for every customers with CDF $G_{k}$ , $k=1,\dots,d$ . The dependence on k may be interpreted as the impact of each queue upon patience, since their services may have different distributions. Of course, one may also consider the homogeneous scenario with patience times having the same distribution in all the queues. In this model, variables and processes will be indexed with an additional subscript p to distinguish from the model in the previous section whenever necessary.
To give a rigorous definition of the model, we adapt the definition from [Reference Ward and Glynn30] and introduce the offered workload process
for $t>0$ , where $\{\tau_{j,k}\;:\; j \ge 1\}$ are the arrival times of the split Hawkes process $N_k(t)$ ,
with $\Theta_{k,p}(0)=0$ representing the waiting time for the $(j+1)$ th customers initially in the kth queue. Note that $\tilde{Z}_{k,p}(0)=\Theta_{k,p}(Q_{k}(0))$ . The cumulative busy-time process is defined by
Different from the offered workload process, the observed workload process is defined by
which retrospectively counts both the workload from customers that eventually receive service, $\tilde{Z}_{k,p}$ , and from those that are currently in a queue but eventually renege, $\textbf{1}(\cdot\ge\vartheta_{j,k,p}>\cdot)$ , in contrast to the prospective definition for $Z_{k}$ in (3.1).
Similarly, we define the observed queue length process as follows:
which counts the number of customers currently in the queue. Observe that the busy-time process in (4.2) also satisfies
and the idle-time process is given by
We also need the following definition generalizing the multidimensional reflection mapping in Definition 3.1, which is adapted from Appendix A.1 in [Reference Ward and Glynn30].
Definition 4.1. Let $\Gamma$ be a $d\times d$ matrix of real entries and let $\texttt{Q}$ be a reflection matrix in Definition 3.1. Then, for each $x\in{\mathbb D}^{d}$ with $x(0)\ge0$ , let $(z,y)\in{\mathbb D}^{2d}$ be a unique pair satisfying
-
(i) $z(t)=x(t)- \int_{0}^{t}\Gamma z(s)\,{\textrm{d}} s+ (\texttt{I}-\texttt{Q})y(t) \ge 0$ for all $t\ge0$ ,
-
(ii) $y(0)=0$ , $y\in{\mathbb D}^{d}_{\uparrow}$ and $\int_{0}^{\infty}z_{k}(t) \,{\textrm{d}} y_{k}(t)=0$ for every k.
Furthermore, let u be the unique solution to the integral equation
Define the mapping $\mathcal{M}_{\Gamma}\;:\; {\mathbb D}^{d}\to{\mathbb D}^{d}$ by $\mathcal{M}_{\Gamma}(x)=u$ . Then
where $\phi_{\texttt{Q}}$ and $\psi_{\texttt{Q}}$ are the content and reflection operators, respectively, in Definition 3.1.
By [Reference Ward and Glynn30, Lemma 3], $\mathcal{M}_{\Gamma}$ is Lipschitz-continuous with respect to uniform topology on every compact set. One can find from the integral equation (4.4) that
In our model, $\Gamma=\text{diag} (\gamma)$ is a diagonal matrix with $\gamma=(\gamma_{k})_{k}\in{\mathbb R}^{d}$ on the diagonal, and we have
for each coordinate process. In this case, we simply write $u=\mathcal{M}_{\gamma}(x)$ . If $x(t)=x_{0}+ a t+\sigma B(t)$ in Definition 4.1, where B is a d-dimensional standard Brownian motion, $a\in{\mathbb R}^{d}$ and $\sigma\in{\mathbb R}^{d\times d}$ , then $u=\mathcal{M}_{\gamma}(x)$ is the unique strong solution to the stochastic differential equation
This is well-defined and called an Ornstein–Uhlenbeck (OU) process. Moreover, $z=\phi(\mathcal{M}_{\gamma}(x))$ is the regulated (reflected) OU process. Recall that $\phi=\phi_0$ with the matrix $\texttt{Q}\equiv 0$ .
As in the previous section, let $\bigl(Q^{(n)}_{p},Z^{(n)}_{p}\bigr)=\bigl(Q^{(n)}_{k,p}, Z^{(n)}_{k,p}\bigr)_{k}$ be the associated observed queue length and workload processes for the nth queueing system. We are interested in the diffusion-scaled observed processes in the heavy-traffic regime, and define
We make the following assumption on $G^{(n)}_{k}$ for the variables $\vartheta^{(n)}_{j,k,p}$ .
Assumption 4.1. Assume that $G^{(n)}_{k}$ is continuous, and for some $\gamma_{k}>0$ ,
where the convergence holds uniformly on compacts (u.o.c.) over ${\mathbb R}_+$ .
Note that in the case $\vartheta^{(n)}_{j,k,p}\;:\!=\; n\times\vartheta_{j,k,p}$ , that is, $G^{(n)}_{k}(t)=G_{k}(t/n)$ for some common CDF $G_{k}$ , the assumption above is equivalent to $G_{k}$ being differentiable at 0 with $\gamma_{k}=G'_{k}(0)$ . This is the so-called critical case studied in [Reference Ward and Glynn30].
Theorem 4.1. Suppose that Assumptions 3.1–4.1 and the conditions in (2.5) and (2.8) hold, and that $\hat{Q}^{(n)}(0)\Rightarrow\hat{Q}(0)\geq0$ . Then
with the limit
where $\hat{\theta}$ and $\hat{W}$ are as given in Theorem 3.1.
Here $\hat{Q}_p$ is a d-dimensional reflected OU process with initial value $\hat{Q}(0)$ . Although the limit in Theorem 4.1 formally reduces to that in Theorem 3.1 if $\gamma=0$ in Assumption 4.1, it is worth mentioning that the proof of Theorem 4.1 relies on Theorem 3.1, and in the case $\gamma>0$ ,
in contrast to (Z, I) in (3.2).
5. A parallel server model with multivariate Hawkes arrivals
As noted in Section 2, a splitting Hawkes process is a multivariate Hawkes process with a particular conditional intensity process given by (2.2). In this section we present the limits for a parallel server model with a multivariate Hawkes arrival process, defined as a simple point process with conditional intensity process given by
where $\lambda_{0}=(\lambda_{i,0})_{i}$ is the baseline intensity vector, and $H=(H_{ij})_{ij}$ is the kernel matrix function with $H_{ij}\;:\; {\mathbb R}_+\to {\mathbb R}_+$ . The non-explosion criterion in [Reference Bacry, Delattre, Hoffmann and Muzy3] is given by $ \int_{0}^{t}H_{ij}(s)\,{\textrm{d}} s<\infty$ , for all $ t>0$ , under which the point process is well-defined. Given the multivariate Hawkes process as arrivals and the service process as defined in Section 3 in the parallel single-server queues, the queue length process and workload process can be defined in the same way.
We consider a sequence of such queueing systems in heavy traffic with index n, where the baseline intensity is $\lambda^{(n)}$ and the kernel function is H.
Assumption 5.1. Suppose the following conditions hold:
-
(i) $\lambda^{(n)}_{0}\to\lambda_{0}$ as $n\to\infty$ ,
-
(ii) $ \int_{0}^{\infty}H_{ij}(t)\,{\textrm{d}} t<\infty$ and the spectral radius of the matrix $\|H\|_{1}=\bigl(\int_{0}^{\infty}H_{ij}(t)\,{\textrm{d}} t\bigr)_{ij}$ is less than 1,
-
(iii) $\lim_{t\to\infty}\sqrt{t}\int_{t}^{\infty}H_{ij}(s)\,{\textrm{d}} s=0$ for every i and j.
We note in particular that the notation differs from Section 3: $\lambda_0$ is a vector and H is a matrix.
Under Assumption 5.1, it is shown (see [Reference Bacry, Delattre, Hoffmann and Muzy3, Theorem 2] and also Proposition 2.2) that
in $({\mathbb D}^{d},J_{1})$ , where $\hat{N}$ is a d-dimensional Brownian motion with covariance function
for $t, s \ge 0$ , where the superscript $\top$ denotes the transpose of a matrix, comparing it with Proposition 2.1 Note that in this model setup, the kernel matrix H is assumed to be independent of n. The traffic intensity for the kth queue in the nth system is given by (compare with (3.3))
The heavy-traffic condition will then imply that
Under Assumptions 3.1 and 5.1 together with the new traffic intensity above, and $\hat{Q}^{(n)}(0)\Rightarrow\hat{Q}(0)\ge0$ , we obtain
with the limit
where
and $\hat{W}$ is a d-dimensional Brownian motion with the following covariance function: for $t,s \ge 0$ ,
For the parallel server queueing model with abandonment as described in Section 4, suppose the arrival process is also a multivariate Hawkes process as described above. Let $Q^{(n)}_{p}$ and $Z^{(n)}_{p}$ be the queue length process and the workload process for the model, respectively, and suppose that Assumptions 3.1, 4.1, and 5.1 hold, and that $\hat{Q}^{(n)}(0)\Rightarrow\hat{Q}(0)\geq0$ ; then
with the limit
where $\hat{W}$ is as given in (5.2).
We remark that the proofs for both models follow similar arguments by adapting those with a split Hawkes process, which we omit for brevity. However, we note that the split Hawkes processes as arrivals to the parallel server queues have a particular structure due to the splitting mechanism, while the multivariate Hawkes process has a matrix kernel $H=(H_{ij})$ (independent of n). It is important to highlight that the limit processes in both parallel server queueing models with a split Hawkes arrival process and a general multivariate Hawkes process are essentially of the same nature (with orthonormal reflections), except that the driving Brownian motions have different covariance functions (comparing (3.11) and (5.2)).
6. Proofs
6.1. Proof of Theorem 3.1
The proof follows from some modification of the arguments for the heavy-traffic convergence in single-server queues; see e.g. [Reference Chen and Yao7, Chapter 6]. We highlight the differences here for the parallel single-server queueing model.
Recall the diffusion-scaled processes $\hat{Q}^{(n)}_k$ and $\hat{Z}^{(n)}_k$ for the nth system from (3.1):
where
and $\bar{N}^{(n)}_{k}$ is the process in (2.3). We will apply the FCLT established for the split Hawkes processes and renewal process, Proposition 2.2 and Remark 3.1 respectively, and the continuous mapping approach to the multidimensional reflection mapping; see e.g. [Reference Chen and Yao7, Theorems 6.1 and 7.2] and [Reference Whitt32, Chapter 14.2]. The following lemma is a modification of the so-called random change of time result from [Reference Billingsley4, page 151], which is also called the continuity of composition in [Reference Whitt32, Theorem 13.2.2]. In this version, each sub-counting process has a different time scaling, and all the limit processes are assumed to have continuous sample paths. The conditions are slightly different from those in [Reference Whitt32, Chapter 14.2]. Recall that ${\mathbb C}_{\uparrow}^{d}={\mathbb C}^{d}\cap{\mathbb D}_{\uparrow}^{d}$ , where ${\mathbb C}^{d}$ denotes the space of ${\mathbb R}^{d}$ -valued continuous functions.
Lemma 6.1. Let $\bigl(x^{(n)},\chi^{(n)}\bigr)=\bigl(x^{(n)}_{k},\chi^{(n)}_{k}\bigr)_{k}\in{\mathbb D}^{d}\times{\mathbb D}_{\uparrow}^{d}$ and $(x,\chi)=(x_{k},\chi_{k})_{k}\in{\mathbb C}^{d}\times{\mathbb C}_{\uparrow}^{d}$ . If
then
Proof of Theorem 3.1. We have from (6.1), for every $ k=1,2,\ldots,d$ ,
where
$\rho^{(n)}_k$ is defined in (3.3), and $\check{N}^{(n)}_k(t)$ is defined in (2.7). Observe that what differs from a single-server queue is that the $\check{N}^{(n)}_k(t)$ are correlated Hawkes processes, which introduces dependence among the different queueing processes.
By the definition of $\hat{Z}^{(n)}$ in (6.1) and $\bigl(\hat{V}^{(n)},\hat{U}^{(n)}\bigr)$ in (3.4), we also obtain
We now consider the convergence of various components in these representations.
Under Assumption 3.1 and the conditions in Proposition 2.2 for Hawkes processes, from (2.9) for $\check{N}^{(n)}$ , (3.9) for $(\hat{V}^{(n)},\hat{U}^{(n)})$ and their independences, we have the joint weak convergence
where $\hat{N}$ is the Brownian motion in Proposition 2.1, $(\hat{V},\hat{U})$ is the Brownian motion in (3.9) and independent of $\hat{N}$ .
Moreover, it is easy to see that
in probability as $n\to\infty$ , where the identity (3.8) is used in the equality, and abusing notation, $\mathfrak{e}$ is a vector of identity functions.
Thus, applying the continuous mapping theorem and the composition mapping (Lemma 6.1), we obtain the joint convergence
in $({\mathbb D}^{3d},J_{1})$ as $n\to\infty$ .
Now, applying the continuous mapping theorem to the multi-dimensional reflection mapping in Definition 3.1 and using the convergence results in (6.3) and (6.4), together with the fact $ \hat{U}_{k}(t)=-\mu_{k}\hat{V}_{k}(\mu_{k}t)$ from (3.9), we obtain the joint convergence of $\bigl(\hat{Q}^{(n)},\hat{Z}^{(n)}\bigr)$ in (3.10).
Finally, since $\hat{N}$ and $\hat{U}$ are independent, it is easy to check that $\hat{W}=\hat{N}-\hat{U}$ is a Brownian motion with the covariance function given in (3.11). This completes the proof.
6.2. Proof of Theorem 4.1
We adapt the proof idea in [Reference Ward and Glynn30] and highlight the main differences in the proof. For the system indexed by n, recall that the diffusion-scaled processes are defined by
and
Theorem 4.1 is proved following the procedure from [Reference Ward and Glynn30], where we start from the analysis of the offered workload process in (4.1). Specifically, the proof proceeds in the following steps.
-
Step 1. The weak convergence of the diffusion-scaled process $\bigl(\hat{\tilde{Z}}^{(n)}_{p},\hat{I}^{(n)}_{p}\bigr)$ in Theorem 6.1, by relating to $\bigl(\hat{Z}^{(n)},\hat{I}^{(n)}\bigr)$ in the corresponding model without abandonment.
-
Step 2. The following asymptotic equivalence properties: for every $T>0$ and k, as $n\to \infty$ , we have
\begin{align*}\sup_{t\le T}\bigl|\hat{Z}_{k,p}^{(n)}(t)-\hat{\tilde{Z}}_{k,p}^{(n)}(t)\bigr|\Rightarrow 0\quad\text{and}\quad\sup_{t\le T}\bigl|\hat{\tilde{Z}}_{k,p}^{(n)}(t)- m^{(n)}_{k}\hat{Q}_{k,p}^{(n)}(t)\bigr|\Rightarrow 0.\end{align*}They follow the same procedure as [Reference Ward and Glynn30], so their proofs are given in the Appendix for completeness. -
Step 3. Completing the proof: given the convergence results in the two steps, the joint convergence in Theorem 4.1 follows immediately.
Therefore we focus only on the proof of the following theorem.
Theorem 6.1. Under the assumptions of Theorem 4.1,
with the limit
where $\hat{W}$ and $\hat{\theta}$ are as given in Theorem 3.1.
The proof of Theorem 6.1 uses the following two lemmas. In the proof we need Theorem 3.1, and it helps to rewrite $Z^{(n)}_{k}(t)$ in (3.1) in the corresponding queueing model without abandonment as
with
The diffusion-scaled process $\hat{Z}^{(n)}$ is defined in the same way. The corresponding convergence results for $\bigl(\hat{Q}^{(n)},\hat{Z}^{(n)}\bigr)$ in Theorem 3.1 will be used.
Define
with
Lemma 6.2. For every $T>0$ and every k,
Proof. It is easy to see that $\bigl\{M_{k,p,1}^{(n)}(m)\bigr\}_{m\ge1}$ is an $\bigl\{\mathscr{F}_{k,p,m}^{(n)}\bigr\}_{m\ge1}$ -adapted square-integrable martingale, where for $m\ge1$
Applying Doob’s maximal inequality [Reference Shiryaev27, Theorem VII.3.3], we have
for every T and $K_{0}$ . Together with Proposition 2.2 for $\bar{N}^{(n)}_{k}$ , the fact that
and Theorem 3.1 is established for $\hat{Z}^{(n)}_{k}$ , we can establish the weak convergence result: $\sup_{t\le T}\bigl|\hat{M}_{k,p,1}^{(n)}(t)\bigr|\Rightarrow 0$ as $n \to \infty$ . A similar procedure can be used to prove the convergence for $\hat{M}_{k,p,2}^{(n)}$ .
Recall the following martingale representations associated with Hawkes processes [Reference Bacry, Delattre, Hoffmann and Muzy3, Reference Li and Pang23]:
are martingales adapted to the filtrations generated by $N^{(n)}$ and $N^{(n)}_{k}$ , respectively. Denote
Lemma 6.3. For every $T>0$ and every k,
Proof. Using the martingales in (6.8), the integral in the lemma can be split into two martingale integrals and a third component with bounded variation, and we show that each term converges to 0 uniformly on [0, T]. Specifically, we have
where we make use of (2.2) and the following identity from [Reference Bacry, Delattre, Hoffmann and Muzy3]:
and recall that $\varphi=\sum_{j\ge1} H^{*j}$ is the renewal function of H.
Since $X^{(n)}_{k}, X^{(n)}$ are martingales with quadratic variation $N^{(n)}_{k}$ and $N^{(n)}$ , respectively, we have from Doob’s maximal inequality that
for every $K_{0}>0$ , where $\hat{G}^{(n)}_{k}(z)\le\sqrt{n}$ by definition. Given (6.7), we have
One can prove the uniform convergence of the second term in (6.9) similarly.
For the last integral in (6.9), notice that $n\varphi(n(s-r))\,{\textrm{d}} s$ degenerates to a Dirac measure at r as $n\to\infty$ and $\bar{X}^{(n)}$ has bounded variation on [0, T]. For arbitrary $\delta_{n}>0$ , if $t-s>\delta_{n}$ , we have
On the other hand, if $t-s<\delta_{n}$ , we have
Plugging these into the last integral in (6.9), we obtain the following bound:
We next need the following result. For every $T>0$ and $\delta_{n}\to0$ with $\sqrt{n}\delta_{n}\to0$ , we have for every k
It follows from their definitions that
for every $t>s>0$ with $t-s\le \delta_{n}$ . Therefore the convergence of $\hat{Z}^{(n)}_{k}$ in Theorem 3.1 and the fact that $\hat{Z}^{(n)}_{k}\in {\mathbb C}$ can be applied to show the convergence property in (6.10).
Now we continue with the last integral in (6.9), taking $\delta_{n}=n^{-2/3}$ and then $\sqrt{n}\delta_{n}\to0$ , $n\delta_{n}\to\infty$ , by (6.10); we prove the uniform convergence of the last piece and finish the proof.
Now we are ready to prove Theorem 6.1 by making use of Lemmas 6.2 and 6.3.
Proof of Theorem 6.1. We claim that as $n\to\infty$ ,
for every k. The claim can also be rephrased as
where $\hat{I}_{p}^{(n)}$ is the regulator of $\hat{\tilde{Z}}_{p}^{(n)}$ satisfying the conditions of Definition 4.1 with the reflection matrix $\texttt{Q}\equiv 0$ , and $\hat{\varepsilon}_{p}^{(n)}$ is the error term which converges weakly to 0 u.o.c. In other words,
We have shown in Section 6.1 that $\hat{Z}^{(n)}-\hat{I}^{(n)}\Rightarrow m(\hat{Q}(0)-\hat{\theta}\mathfrak{e}+W)$ . By Theorem 3.1 we obtain joint convergence in (6.5).
To obtain (6.11), denote
A simple calculation gives
where $\bar{\lambda}^{(n)}_{k}(t)=\lambda^{(n)}_{k}(nt)$ and we use the expression of $\mathbb{E}\bigl[\bar{\lambda}^{(n)}_{k}(t)\bigr]$ in (2.4). It is thus sufficient to show that all terms on the right-hand side of (6.13) converge weakly to 0 u.o.c.
For the term $\hat{M}_{k,p,0}^{(n)}$ , since $\Theta_{k,p}^{(n)}(j)\le Z^{(n)}_{k}(0)$ for all $j\le Q^{(n)}_{k}(0)$ , we have
for every $K_{0}>0$ . The Markov inequality can be applied to show that $ \hat{M}_{k,p,0}^{(n)}\Rightarrow 0$ as $n\to \infty$ .
Now, given Theorem 3.1 for $\hat{Z}^{(n)}_{k}$ and (6.7), the desired convergences of the last two terms in (6.13) can be checked directly. Further, applying Lemma 6.2, Lemma 6.3, and Proposition 2.1, we can prove the remaining terms and finish the proof.
Appendix A. Additional proofs
A.1 Proofs for the asymptotic equivalence properties
This section is dedicated to the proofs of the asymptotic equivalence properties in Step 2 of the proof of Theorem 4.1. The arguments adapt those in [Reference Ward and Glynn30] with slight modifications to illustrate the role of the split Hawkes processes. We provide the details for completeness.
Lemma A.1. For every $T>0$ , we have for every k
Proof. Considering the difference between $\hat{Z}_{k,p}^{(n)}$ and $\hat{\tilde{Z}}_{k,p}^{(n)}$ , we show that the associated additional terms in (4.3) converge weakly to 0 u.o.c. For every $t>0$ , observe that
Thus, using (6.14), we can show that the first term on the initial quantities converges to zero. On the other hand, for every $T,K_{0}>0$ , on the set $\bigl\{\sup_{t\le T}\hat{Z}^{(n)}_{k}(t)\le K_{0}\bigr\}$ we have for every $t<T$ ,
Lemma 6.2 together with Proposition 2.1 proves the convergence.
Lemma A.2. For every $T>0$ and every k,
Proof. Let $\varsigma^{(n)}_{k,p}(t)$ be the right-continuous increasing version of the arrival time of the customer in service at time t if the server is busy and $\varsigma^{(n)}_{k,p}(t)=t$ if the server is idle. By the same argument as in [Reference Ward and Glynn30], it can be shown that $ \bigl(t-\bar{\varsigma}^{(n)}_{k}(t)\bigr)\Rightarrow 0$ for the nth system, where $\bar{\varsigma}_{k,p}^{(n)}(t)=n^{-1}\varsigma_{k,p}^{(n)}(nt)$ .
On the set $\bigl\{\varsigma^{(n)}_{k,p}(t)>0\bigr\}$ ,
where the last term is the number of customers currently in a queue and eventually abandoning the system without receiving service, which is less than
We thus have for the nth system, for $t\le T$ ,
where $\hat{V}^{(n)}_{k}$ is the process in (3.4). Given Lemma 6.2 and the fact that $\hat{V}^{(n)}_{k}$ and $\bar{N}^{(n)}_{k}$ both have continuous limits, the lemma is proved.
A.2 Proof of Proposition 2.2
Proof. For every $k\ge1$ , given (2.4) and the fact $\|\varphi\|_{1}+1={{1}/{(1-\|H\|_{1})}}$ , it is sufficient to show that
For every $\varepsilon\in(0,{{(1-\|H\|_{1})}/{2}})$ , denote
and let $\varphi_{\varepsilon}$ be the associated renewal function, so $\|H_{\varepsilon}\|_{1}=\|H\|_{1}+2\varepsilon\in(0,1)$ . Moreover, given (2.8),
Thus we have from Karamata’s Tauberian theorem (see [Reference Bingham, Goldie and Teugels6, Theorem 1.7.1]) that
where
denotes the Laplace transform of $H_{\varepsilon}$ . It also follows that
as $z\to0+$ . Then monotone density theory (see [Reference Bingham, Goldie and Teugels6, Theorem 1.7.2]) can be applied to show that
Therefore, for every $\delta>0$ , applying the uniform convergence theorem (see [Reference Bingham, Goldie and Teugels6, Theorem 1.5.1]),
which gives
Passing $\delta\to0$ and making use of the fact $\varphi\le \varphi_{\varepsilon}$ proves the limit in (A.1).
Acknowledgements
We wish to thank the associate editor and reviewers for helpful comments that have improved the exposition of the paper.
Funding information
G. Pang is partly supported by US National Science Foundation grant DMS-2216765.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.