
With or without replacement? Sampling uncertainty in Shepp’s urn scheme

Published online by Cambridge University Press:  22 December 2022

Kristoffer Glover*
Affiliation:
University of Technology Sydney
*Postal address: UTS Business School, 14–28 Ultimo Road, NSW 2007, Australia. Email address: [email protected]

Abstract

We introduce a variant of Shepp’s classical urn problem in which the optimal stopper does not know whether sampling from the urn is done with or without replacement. By considering the problem’s continuous-time analog, we provide bounds on the value function and, in the case of a balanced urn (with an equal number of each ball type), an explicit solution is found. Surprisingly, the optimal strategy for the balanced urn is the same as in the classical urn problem. However, the expected value upon stopping is lower due to the additional uncertainty present.

Type
Original Article
Copyright
© The Author(s), 2022. Published by Cambridge University Press on behalf of Applied Probability Trust

1. Introduction

Consider the following discrete optimal stopping problem, as first described by Shepp [Reference Shepp32]. An urn initially contains m balls worth $-\$1$ each and p balls worth $+\$1$ each, where m and p are positive integers known a priori. Balls are randomly sampled (one at a time and without replacement) and their value is added to a running total. Before any draw, the optimal stopper can choose to stop sampling and receive the cumulative sum up to that point. The goal is to find the stopping rule which maximizes the expected payout from a given (m, p)-urn.

The urn scheme described above was originally formulated in relation to the classical optimal stopping problem of maximizing the average value of a sequence of independent and identically distributed random variables (see [Reference Breiman4, Reference Chow and Robbins8, Reference Dvoretzky14], among others). The scheme has also been considered in relation to numerous other problems in the subsequent literature. For example, in [Reference Chen, Grigorescu and Kang6], the authors considered an extension in which the stopper exhibits risk aversion (modeled as the limited ability to endure negative fluctuations in the running total). An extension in which the stopper is able to draw more than one ball at a time was considered in [Reference Dehghanian and Kharoufeh11]. Related to the current note, [Reference Chen, Zame, Lin and Wu7] (and subsequently [Reference Hung25]) considered the urn problem where the composition of balls in the urn is not known with certainty (i.e. where $p+m$ is known but p is not).

The aim of the present note is to introduce a variant of Shepp’s urn problem in which the sampling procedure used is not known with certainty. Specifically, while the result of each draw is observable, we assume that the optimal stopper is uncertain about whether or not the balls are removed from the urn after sampling. In other words, whether sampling is done with or without replacement. Since the probability of sampling a given ball type is different under the two different sampling procedures, sequentially observing the random draws will reveal statistical information about the true procedure being used. Hence, we adopt a Bayesian approach and assume the optimal stopper has a prior belief of $\pi$ that the samples are not being replaced. They then, sequentially, update this belief (via Bayes) after each random draw. Since the goal is to maximize the expected payout upon stopping, any stopping rule must account for the expected learning that will occur over time.

Shepp demonstrated that the optimal rule for the original problem is of a threshold type. In particular, denoting by $\mathcal{C}$ the set of all urns with a positive expected value (upon stopping optimally), then $\mathcal{C}=\{(m,p) \mid m\leq\beta(p)\}$ , where $\beta(p)$ is a sequence of unique constants dependent on p (which must be computed via recursive methods, cf. [Reference Boyce3]). It is thus optimal to draw a ball if there are sufficiently many p balls relative to m balls (or sufficiently few m balls relative to p balls). Intuitively, $\beta(p)>p$ , and hence a ball should not be sampled when the current state of the urn satisfies $p-m\leq p-\beta(p)<0$ . Put differently, the optimal stopper should stop sampling when the running total exceeds some critical (positive) threshold, dependent on the current state of the urn.
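
For readers who wish to reproduce these thresholds, the sketch below computes the value of an (m, p)-urn (under sampling without replacement) by the standard dynamic-programming recursion $V(m,p)=\max\{0,\,[m(V(m-1,p)-1)+p(V(m,p-1)+1)]/(m+p)\}$ and reads off $\beta(p)$ as the largest m for which the urn retains strictly positive value. It is a minimal illustration only; the function names are ours and the code is not taken from [Reference Shepp32] or [Reference Boyce3].

```python
from functools import lru_cache

# Expected optimal payout of an (m, p)-urn under sampling without replacement.
# Recursion: V(m, p) = max(0, [m*(V(m-1, p) - 1) + p*(V(m, p-1) + 1)] / (m + p)),
# i.e. either stop (no further gain) or draw one more ball and continue optimally.
@lru_cache(maxsize=None)
def urn_value(m: int, p: int) -> float:
    if p == 0:
        return 0.0          # only -$1 balls remain: stop immediately
    if m == 0:
        return float(p)     # only +$1 balls remain: draw them all
    draw = (m * (urn_value(m - 1, p) - 1.0) + p * (urn_value(m, p - 1) + 1.0)) / (m + p)
    return max(0.0, draw)

def beta(p: int, m_max: int = 10_000) -> int:
    """Largest m such that the (m, p)-urn still has strictly positive value."""
    m = 0
    while m + 1 <= m_max and urn_value(m + 1, p) > 0.0:
        m += 1
    return m

if __name__ == "__main__":
    # A few thresholds; note beta(p) > p, as discussed in the text.
    for p in (1, 5, 10, 20):
        print(p, beta(p), round(urn_value(beta(p), p), 4))
```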

Of particular importance to the current note, Shepp [Reference Shepp32, p. 1001] also connected the urn problem (when the sampling method was known) with the continuous-time problem of optimally stopping a Brownian bridge. Specifically, via an appropriate scaling, the running total (cumulative sum) process was shown to converge to a Brownian bridge which starts at zero (at $t=0$ ) and pins to some location a (at $t=1$ ). Importantly, the known constant a depends on the initial values of m and p, with $a = ({m-p})/{\sqrt{m+p}}$ . Hence, the sign of the pinning location depends on the relative abundance of m- and p-balls in the urn. The continuous-time problem was shown by Shepp [Reference Shepp32] to admit a closed-form solution, and the optimal stopping strategy was found, once more, to be of threshold type – being the first time that the Brownian bridge exceeds some time-dependent boundary, given by $a+\alpha\sqrt{1-t}$ with $\alpha\approx 0.839\,92$ .

Given the success and closed-form nature of such continuous-time approximations, we choose not to tackle the discrete version of our problem directly, instead formulating and solving the continuous-time analog. In such a setting, uncertainty about the true sampling procedure manifests itself in uncertainty about the drift of the underlying (cumulative sum) process. In particular, the process is believed to be either a Brownian bridge pinning to a (if sampling is done without replacement) or a Brownian motion with drift a (if sampling is done with replacement), and the optimal stopper must learn about which it is over time. Despite this additional uncertainty, we find that the problem has a closed-form solution when $a=0$ and, remarkably, the optimal strategy is found to coincide with the optimal strategy of the classical problem (where the sampling procedure/drift is known with certainty). The expected payout, however, is lower due to the additional uncertainty present. When $a\neq 0$ , the problem is more complicated and a richer solution structure emerges (with multiple optimal stopping boundaries possible).

This note therefore contributes to the literature on both optimally stopping a Brownian bridge (e.g. [Reference Baurdoux, Chen, Surya and Yamazaki2, Reference De Angelis and Milazzo10, Reference Ekström and Vaicenavicius17, Reference Ekström and Wanntorp19–Reference Föllmer21, Reference Shepp32]) and optimal stopping in the presence of incomplete information (e.g. [Reference Ekström and Lu15, Reference Ekström and Vaicenavicius16, Reference Ekström and Vannestål18, Reference Gapeev22, Reference Glover23, Reference Johnson and Peskir26, Reference Johnson and Peskir27]). We also note that Brownian bridges have found many applications in the field of finance. For example, they have been used to model the so-called stock pinning effect [Reference Avellandeda and Lipkin1], and the dynamics of certain arbitrage opportunities [Reference Brennan and Schwartz5, Reference Liu and Longstaff29]. In both settings, the existence of the underlying economic force (creating the pinning) is more often than not uncertain. Hence, the additional uncertainty considered in this note may find application in more realistic modeling of these market dynamics.

The rest of this note is structured as follows. We start in Section 2 by commenting further on the connection between the discrete urn problem and the continuous-time analog. In Section 3, we formulate the continuous-time problem, making clear our informational assumptions. Upper and lower bounds on the value function are presented in Section 4, along with the explicit solution to the problem in the case where $a=0$ . We conclude in Section 5 with a brief discussion of the case where a is nonzero.

2. Connecting the urn problem to Brownian bridges/motion

Let $\varepsilon_i$ , for $i=1,\ldots, m+p$ , denote the results of sampling from a given (m, p)-urn, with $\varepsilon_i=-1$ for an m-ball and $\varepsilon_i=1$ for a p-ball. The partial sum after n draws is thus $X_n=\sum_{i=1}^{n}\varepsilon_i$ , with $X_0=0$ . It is well known that the discrete process $\{X_n\}_{n=0}^{m+p}$ can be approximated as a continuous-time diffusion process if we let m and p tend to infinity in an appropriate way. The resulting diffusion, however, will depend on whether sampling is done with or without replacement. Fixing m and p, we define, for $0\leq n\leq m+p$ and $n<(m+p)t\leq n+1$ ,

(2.1) \begin{equation} X_{m,p}(t)=\frac{X_n}{\sqrt{m+p}\,},\qquad 0\leq t\leq 1.\end{equation}

If sampling is done without replacement, then for $n=m+p$ (after all balls have been sampled) we have

\begin{equation*} X_{m,p}(1)=\frac{p-m}{\sqrt{m+p}\,}\;=\!:\;a.\end{equation*}

Hence, the final value (at $t=1$ ) is known with certainty to be the constant a. In this case, it is also clear that the samples $\varepsilon_i$ are not independent and identically distributed (i.i.d.). However, Shepp demonstrated that, if a is fixed, the process $X_{m,p}(t)$ converges in distribution as $p\to\infty$ to a Brownian bridge process pinning to the point a at $t=1$ (see [Reference Shepp32, p. 1001]).

On the other hand, if sampling is done with replacement, then the samples $\varepsilon_i$ are i.i.d., and the process $X_{m,p}(t)$ in (2.1) can be seen to converge in distribution to a Brownian motion (with drift), via Donsker’s theorem. We also note that with replacement, more than $m+p$ balls can be sampled. Indeed, sampling could continue indefinitely if each sampled ball were replaced. However, after $m+p$ balls have been sampled the true nature of the sampling procedure will be revealed, since either there will be no balls left or another sample will be produced. In our modified urn problem we therefore make the natural assumption that stopping must occur before more than $m+p$ balls have been sampled.

Remark 2.1. If the optimal stopper were allowed to continue beyond $m+p$ samples, then the stopper would never stop for $p>m$ ( $a>0$ ) in the sampling-with-replacement scenario (since the cumulative sum is expected to increase indefinitely). It would then also be optimal never to stop before $m+p$ balls are sampled (for $\pi<1$ at least) due to the unbounded payoff expected after $m+p$ balls. On the other hand, if $p\leq m$ ( $a\leq 0$ ), the stopper would stop at $m+p$ balls in all scenarios since the cumulative sum process after $m+p$ balls is a supermartingale regardless of whether sampling with or without replacement had been revealed. Thus, the solution to the stopping problems with restricted and unrestricted stopping times would coincide for the case where $a\leq 0$ . To avoid the degeneracy of the problem for $a>0$ , i.e. to guarantee a finite value, we chose to restrict the set of admissible stopping times to $n\leq m+p$ .

To apply Donsker’s theorem, we note that the probability of drawing a given ball type (with replacement) is constant and given by $p/(m+p)$ for a positive ball and $m/(m+p)$ for a negative ball. Therefore, $\mathbb{E}[\varepsilon_i]=(p-m)/(m+p)=a/\sqrt{m+p}$ and $\textrm{Var}(\varepsilon_i)=1-a^2/(m+p)$ . This allows us to rewrite (2.1) as

(2.2) \begin{equation} X_{m,p}(t)=\frac{an}{m+p}+\bigg(1-\frac{a^2}{m+p}\bigg)^{1/2}\bigg(\frac{\sum_{i=1}^{n}\widehat{\varepsilon}_{i}}{\sqrt{m+p}\,}\bigg),\end{equation}

where $\widehat{\varepsilon}_i$ are now standardized random variables (with zero mean and unit variance). Since we are restricting our attention to $n\leq m+p$ , we can once more fix m and p and define $n<(m+p)t\leq n+1$ . Letting $p\to\infty$ , the process $X_{m,p}(t)$ in (2.2) thus converges to $X_t=at+B_t$ , $0\leq t\leq 1$ , where $(B_t)_{0\leq t\leq 1}$ is a standard Brownian motion (cf. [Reference Donsker12]). Note that the drift in this equation coincides with the pinning point of the Brownian bridge in the case without replacement.
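
Purely as an illustration of this scaling (not part of the original argument), the following sketch simulates the scaled partial-sum process (2.1) under the two sampling procedures; the function name, seed and parameter choices are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def scaled_path(n_minus: int, n_plus: int, with_replacement: bool) -> np.ndarray:
    """One realization of X_{m,p}(t) on the grid t = n/(m+p), n = 0,...,m+p."""
    n_total = n_minus + n_plus
    if with_replacement:
        # i.i.d. draws: +1 w.p. p/(m+p), -1 w.p. m/(m+p)  -> Brownian motion with drift a
        steps = rng.choice([1, -1], size=n_total,
                           p=[n_plus / n_total, n_minus / n_total])
    else:
        # random permutation of the urn's contents -> Brownian bridge pinning at a
        steps = rng.permutation(np.r_[np.ones(n_plus), -np.ones(n_minus)])
    return np.r_[0.0, np.cumsum(steps)] / np.sqrt(n_total)

m = p = 5000                      # balanced urn, so a = (p - m)/sqrt(m + p) = 0
bridge_like = scaled_path(m, p, with_replacement=False)
bm_like = scaled_path(m, p, with_replacement=True)
print("terminal value without replacement:", bridge_like[-1])   # exactly a = 0
print("terminal value with replacement:   ", bm_like[-1])       # approximately N(a, 1)
```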

With this necessary connection in place, we now proceed to formulate the continuous-time stopping problem corresponding to our variant of Shepp’s urn scheme.

3. Problem formulation and learning assumptions

Let $X=(X_t)_{t\geq 0}$ denote an observable stochastic process that is believed by an optimal stopper to be either a Brownian motion with known drift a, or a Brownian bridge that pins to a at $t=1$ . Adopting a Bayesian approach, we also assume that the optimal stopper has an initial belief of $\pi$ that the true process is a Brownian bridge (and hence a belief of $1-\pi$ that it is a Brownian motion).

This information structure can be realised on a probability space $(\Omega,\mathcal{F},\mathbb{P}_{\pi})$ where the probability measure $\mathbb{P}_{\pi}$ has the following structure:

(3.1) \begin{equation} \mathbb{P}_{\pi}=(1-\pi)\mathbb{P}_0+\pi\mathbb{P}_1 \quad \textrm{for } \pi\in[0,1],\end{equation}

where $\mathbb{P}_0$ is the probability measure under which the process X is the Brownian motion and $\mathbb{P}_1$ is the probability measure under which the process X is the Brownian bridge (cf. [Reference Peskir and Shiryaev31, Chapter VI, Section 21]). More formally, we can introduce an unobservable random variable $\theta$ taking values 0 or 1 with probability $1-\pi$ and $\pi$ under $\mathbb{P}_\pi$ , respectively. Thus, the process X solves the stochastic differential equation

(3.2) \begin{equation} \mathrm{d} X_t=\bigg[(1-\theta)a+\theta\bigg(\frac{a-X_t}{1-t}\bigg)\bigg]\mathrm{d} t+\mathrm{d} B_t, \qquad X_0=0,\end{equation}

where $B=(B_t)_{t\geq 0}$ is a standard Brownian motion, independent of $\theta$ under $\mathbb{P}_{\pi}$ .

The problem under investigation is to find the optimal stopping strategy that maximizes the expected value of X upon stopping, i.e.

(3.3) \begin{equation} V(\pi)=\sup_{0\leq\tau\leq 1}\mathbb{E}_{\pi}[X_{\tau}] \quad \textrm{for } \pi\in[0,1].\end{equation}

Recall that the time horizon of the optimal stopping problem in (3.3) is set to one, since the uncertainty about the nature of the process is fully revealed at $t=1$ (it either pins to a or it does not).

If the process was known to be a Brownian bridge then it would be evident from (3.3) that $V\geq a$ , since simply waiting until $t=1$ would yield a value of a with certainty. However, uncertainty about $\theta$ introduces additional uncertainty in the terminal payoff, since the value received at $t=1$ could be less than a if the true process was actually a Brownian motion.

Remark 3.1. The problem described above is related to the problem studied in [Reference Ekström and Vaicenavicius17], in which the underlying process is known to be a Brownian bridge, but for which the location of the pinning point is unknown. Specifically, if the process defined in (3.2) was a Brownian motion with drift a, then the distribution of its location at $t=1$ would be normal, i.e. $X_1\sim\mathcal{N}(a,1)$ . On the other hand, if the process was a Brownian bridge pinning to a at $t=1$ , then the distribution of its location at $t=1$ would be a point mass, i.e. $X_1\sim\delta_a$ (where $\delta_a$ denotes the Dirac delta). Hence, setting a prior on the location of the pinning point in [Reference Ekström and Vaicenavicius17] to $\mu=\pi\delta_a+(1-\pi)\mathcal{N}(a,1)$ is equivalent to the problem formulated in this note.

To account for the uncertainty about $\theta$ in (3.2), we define the posterior probability process $\Pi_t \;:\!=\; \mathbb{P}_{\pi}\big(\theta=1\mid\mathcal{F}_t^X\big)$ for $t\geq 0$ , which represents the belief that the process will pin at $t=1$ , and importantly how it is continually updated over time through observations of the process X. To determine the dynamics of the process $\Pi=(\Pi_t)_{t\geq 0}$ , we appeal to well-known results from stochastic filtering theory (see [Reference Liptser and Shiryaev28, Theorem 9.1] or [Reference Johnson and Peskir27, Section 2]), namely that, for $t\geq 0$ ,

(3.4) \begin{align} \mathrm{d} X_t = \bigg[(1-\Pi_t)a+\Pi_t\bigg(\frac{a-X_t}{1-t}\bigg)\bigg]\mathrm{d} t+\mathrm{d} \bar{B}_t, \quad X_0 = 0, \end{align}
(3.5) \begin{align} \mathrm{d} \Pi_t = \rho(t,X_t)\Pi_t(1-\Pi_t)\mathrm{d} \bar{B}_t, \quad\qquad\qquad\qquad \Pi_0 = \pi, \end{align}

where $\bar{B}=(\bar{B}_t)_{t\geq 0}$ is a $\mathbb{P}_{\pi}$ -Brownian motion called the innovation process, and $\rho$ denotes the signal-to-noise ratio, defined as $\rho(t,X_t) \;:\!=\; ({a-X_t})/({1-t})-a$ . While the payoff in (3.3) is only dependent on X (not $\Pi$ ), the drift of X in (3.4) contains $\Pi$ . Therefore, at first blush, it would appear that the optimal stopping problem is two-dimensional (in X and $\Pi$ ). However, since both X and $\Pi$ are driven by the same Brownian motion ( $\bar{B}$ ), the problem can, in fact, be reduced to only one spatial variable (either X or $\Pi$ ) by identifying a (time-dependent) mapping between $X_t$ and $\Pi_t$ . In what follows, we formulate the problem in terms of the original process X, since this facilitates a more transparent comparison with the case where the process is known to pin with certainty.

To establish the mapping between $X_t$ and $\Pi_t$ we have the following result.

Proposition 3.1. Given the processes $X=(X_t)_{t\geq 0}$ and $\Pi=(\Pi_t)_{t\geq 0}$ defined by (3.4) and (3.5), respectively, the following identity holds for $t\in[0,1)$ :

(3.6) \begin{equation} \frac{\Pi_t}{1-\Pi_t}=\frac{\pi}{1-\pi}L^a(t,X_t), \qquad L^a(t,x)\;:\!=\;\frac{1}{\sqrt{1-t}\,}\exp\bigg({-}\frac{(x-at)^{2}}{2(1-t)}\bigg). \end{equation}

Proof. To establish the mapping we take advantage of the fact that both processes are driven by the same Brownian motion and define the process (cf. [Reference Johnson and Peskir26, Proposition 4])

(3.7) \begin{equation} U_t = \ln\!\bigg(\frac{\Pi_t}{1-\Pi_t}\bigg)+aX_t-\frac{aX_t}{1-t}+\frac{X_t^2}{2(1-t)}, \end{equation}

which, after applying Itô’s formula, is seen to be of bounded variation with dynamics

\begin{equation*} \mathrm{d} U_t = \frac{1}{2}\bigg[a^2-\frac{a^2}{(1-t)^2}+\frac{1}{1-t}\bigg]\mathrm{d} t, \qquad U_0=\ln\bigg(\frac{\pi}{1-\pi}\bigg). \end{equation*}

Thus, $U_t$ can be solved explicitly as

(3.8) \begin{equation} U_t = \ln\bigg(\frac{\pi}{1-\pi}\bigg)+\frac{a^2}{2}(1+t)-\frac{a^2}{2(1-t)}-\ln\sqrt{1-t}, \end{equation}

and, after combining (3.7) and (3.8), we obtain the desired result.
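
As a quick numerical sanity check of Proposition 3.1 (an illustrative sketch only; the step size, random seed and parameter values are arbitrary choices of ours), one can simulate (3.4)–(3.5) with a simple Euler scheme and compare the two sides of (3.6) along the path.

```python
import numpy as np

rng = np.random.default_rng(1)

a, pi0 = 1.0, 0.5                 # pinning level / drift and prior belief (illustrative values)
N = 20_000
dt = 0.9 / N                      # simulate on [0, 0.9] to stay away from the singularity at t = 1

def L(t, x):
    """Likelihood ratio L^a(t, x) from (3.6)."""
    return np.exp(-(x - a * t) ** 2 / (2.0 * (1.0 - t))) / np.sqrt(1.0 - t)

x, belief, t = 0.0, pi0, 0.0
for _ in range(N):
    rho = (a - x) / (1.0 - t) - a                       # signal-to-noise ratio
    dB = np.sqrt(dt) * rng.standard_normal()            # innovation increment
    x += (a + belief * rho) * dt + dB                   # Euler step for (3.4)
    belief += rho * belief * (1.0 - belief) * dB        # Euler step for (3.5)
    belief = min(max(belief, 1e-12), 1.0 - 1e-12)       # keep the belief in (0, 1) numerically
    t += dt

lhs = belief / (1.0 - belief)
rhs = pi0 / (1.0 - pi0) * L(t, x)
print(lhs, rhs)   # the two sides of (3.6) should agree up to discretization error
```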

To solve the optimal stopping problem in (3.3), we will exploit various changes of measure. In particular, from $\mathbb{P}_{\pi}$ to $\mathbb{P}_0$ (under which the process X is a standard Brownian motion with drift a) and then from $\mathbb{P}_0$ to $\mathbb{P}_1$ (under which X is a Brownian bridge pinning to a). In order to perform these measure changes, we have the following result that establishes the necessary Radon–Nikodym derivatives (cf. [Reference Johnson and Peskir27, Lemma 1]).

Proposition 3.2. Let $\mathbb{P}_{\pi,\tau}$ be the restriction of the measure $\mathbb{P}_\pi$ to $\mathcal{F}_{\tau}^X$ for $\pi\in[0,1]$ . We thus have the following:

(3.9) \begin{align} \begin{split} \textit{(i)} \quad & \frac{\mathrm{d} \mathbb{P}_{\pi,\tau}}{\mathrm{d} \mathbb{P}_{1,\tau}} = \frac{\pi}{\Pi_{\tau}}, \\[5pt] \textit{(ii)} \quad & \frac{\mathrm{d} \mathbb{P}_{\pi,\tau}}{\mathrm{d} \mathbb{P}_{0,\tau}} = \frac{1-\pi}{1-\Pi_{\tau}}, \end{split} \qquad\qquad\qquad \end{align}
(3.10) \begin{align} \quad \textit{(iii)} \quad & \frac{\mathrm{d} \mathbb{P}_{1,\tau}}{\mathrm{d} \mathbb{P}_{0,\tau}} = \frac{1-\pi}{\pi}\frac{\Pi_{\tau}}{1-\Pi_{\tau}}=L^a(\tau,X_{\tau}), \end{align}

for all stopping times $\tau$ of X, where $L^a$ is given in (3.6). The process in (3.10) is often referred to as the likelihood ratio process.

Proof. A standard rule for Radon–Nikodym derivatives under (3.1) gives

(3.11) \begin{align} \Pi_\tau & = \mathbb{P}_\pi\big(\theta=1\mid\mathcal{F}_\tau^X\big) \nonumber \\[5pt] & = (1-\pi)\mathbb{P}_0\big(\theta=1\mid\mathcal{F}_\tau^X\big)\frac{\mathrm{d} \mathbb{P}_{0,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}} + \pi\mathbb{P}_1\big(\theta=1\mid\mathcal{F}_\tau^X\big)\frac{\mathrm{d} \mathbb{P}_{1,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}} = \pi\frac{\mathrm{d} \mathbb{P}_{1,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}} \end{align}

for any $\tau$ and $\pi$ , yielding identity (i). Similar arguments show that

(3.12) \begin{align} 1-\Pi_\tau & = \mathbb{P}_\pi\big(\theta=0\mid\mathcal{F}_\tau^X\big) \nonumber \\[5pt] & = (1-\pi)\mathbb{P}_0\big(\theta=0\mid\mathcal{F}_\tau^X\big)\frac{\mathrm{d} \mathbb{P}_{0,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}} + \pi\mathbb{P}_1\big(\theta=0\mid\mathcal{F}_\tau^X\big)\frac{\mathrm{d} \mathbb{P}_{1,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}} \nonumber \\[5pt] & = (1-\pi)\frac{\mathrm{d} \mathbb{P}_{0,\tau}}{\mathrm{d} \mathbb{P}_{\pi,\tau}}, \end{align}

yielding (ii). Using (3.11) and (3.12) together, and noting (3.6), we get (iii).

Next, we embed (3.3) into a Markovian framework where the process X starts at time t with value x. However, in doing so, we cannot forget that the optimal stopper’s learning about the true nature of the underlying process started at time 0 with an initial belief of $\pi$ and with $X_0=0$ . To incorporate this information, we exploit the mapping in (3.6) to calculate the stopper’s updated belief should the process reach x at time t. In other words, in our Markovian embedding, we must assume that the ‘initial’ belief at time t is not $\pi$ but $\Pi_t$ (which depends on t and x). More formally, the embedded optimal stopping problem becomes

(3.13) \begin{equation} V(t,x,\pi)=\sup_{0\leq\tau\leq 1-t}\mathbb{E}_{\pi}\bigl[X_{t+\tau}^{t,x}\bigr],\end{equation}

where the processes $X=X^{t,x}$ and $\Pi$ are defined by

(3.14) \begin{align} & \left\{ \begin{array}{l@{\quad}l} \mathrm{d} X_{t+s}= (a+\Pi_{t+s}\rho(t+s,X_{t+s}))\mathrm{d} s+\mathrm{d} \bar{B}_{t+s}, & 0\leq s<1-t,\\[5pt] X_t=x, & x\in\mathbb{R} ; \\[5pt] \end{array}\right. \end{align}
(3.15) \begin{align} & \left\{ \begin{array}{l@{\quad}l} \mathrm{d} \Pi_{t+s}=\rho(t+s,X_{t+s})\Pi_{t+s}(1-\Pi_{t+s})\mathrm{d} \bar{B}_{t+s}, & 0\leq s<1-t,\\[5pt] \Pi_t=\frac{\frac{\pi}{1-\pi}L^a(t,x)}{\big(1+\frac{\pi}{1-\pi}L^a(t,x)\big)} \;=\!:\; \Pi(t,x,\pi). & \\[5pt] \end{array}\right. \end{align}

Note that the function $L^a$ is defined as in (3.6) and, with a slight abuse of notation, we have defined the function $\Pi(t,x,\pi)$ to be the ‘initial’ value of $\Pi$ in the embedding (dependent on t, x, and $\pi$ ). Note further that, since we are able to replace any dependence on $\Pi_{t+s}$ (for $s>0$ ) via the mapping in (3.6), we no longer need to consider the dynamics for $\Pi$ in what follows (only the initial point $\Pi_t$ ).
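
For concreteness, a small helper (our own naming) that evaluates the ‘initial’ belief $\Pi(t,x,\pi)$ of (3.15) directly from the likelihood ratio in (3.6):

```python
import numpy as np

def likelihood_ratio(t: float, x: float, a: float) -> float:
    """L^a(t, x) from (3.6)."""
    return np.exp(-(x - a * t) ** 2 / (2.0 * (1.0 - t))) / np.sqrt(1.0 - t)

def pin_belief(t: float, x: float, pi0: float, a: float) -> float:
    """Pi(t, x, pi) from (3.15): posterior probability of pinning given X_t = x."""
    odds = pi0 / (1.0 - pi0) * likelihood_ratio(t, x, a)
    return odds / (1.0 + odds)

# At (t, x) = (0, 0) the posterior equals the prior; a path drifting away from a*t
# makes pinning look less likely.
print(pin_belief(0.0, 0.0, 0.5, a=1.0))    # 0.5
print(pin_belief(0.5, -1.0, 0.5, a=1.0))   # noticeably smaller than 0.5
```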

Since its value will be used in our subsequent analysis, we conclude this section by reviewing the solution to the classical problem in which the Brownian bridge is known to pin to a (at $t=1$ ) with certainty (i.e. when $\pi=1$ ). In this case, the stopping problem in (3.13) has an explicit solution (cf. [Reference Ekström and Wanntorp19, p. 175]) given by

(3.16) \begin{equation} V^{a}_1(t,x)\;:\!=\;\left\{ \begin{array}{l@{\quad}l} a+\sqrt{2\pi(1-t)}(1-\alpha^2)\exp\Bigl(\frac{(x-a)^2}{2(1-t)}\Bigr)\Phi\Bigl(\frac{x-a}{\sqrt{1-t}\,}\Bigr), & x<b(t),\\[5pt] x, & x\geq b(t) \end{array}\right.\end{equation}

for $t<1$ and $V^a_1(1,a)=a$ . The function $\Phi(y)$ denotes the standard cumulative normal distribution function, and $b(t)\;:\!=\;a+\alpha\sqrt{1-t}$ with $\alpha$ being the unique positive solution to

(3.17) \begin{equation} \sqrt{2\pi}\!\left(1-\alpha^2\right)\mathrm{e}^{\frac{1}{2}\alpha^2}\Phi(\alpha)=\alpha,\end{equation}

which is approximately 0.839 924. (Note that $\pi$ in (3.16) and (3.17) denotes the universal constant and not the initial belief.) Further, the optimal stopping strategy in this case is given by

(3.18) \begin{equation} \tau_b=\inf\{s\geq 0\mid X_{t+s}\geq b(t+s)\} \quad \textrm{for all } t<1.\end{equation}
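
For later numerical use, a short sketch (assuming NumPy and SciPy are available; names are ours) that recovers $\alpha$ from (3.17) and evaluates $V_1^a$ in (3.16):

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

# alpha is the unique positive root of sqrt(2*pi)*(1 - y^2)*exp(y^2/2)*Phi(y) = y.
alpha = brentq(
    lambda y: np.sqrt(2 * np.pi) * (1 - y**2) * np.exp(y**2 / 2) * norm.cdf(y) - y,
    0.5, 1.0,
)
print(alpha)   # approximately 0.839924

def v1(t: float, x: float, a: float) -> float:
    """Value function V_1^a(t, x) of (3.16) for the known-pinning (Brownian bridge) case."""
    b = a + alpha * np.sqrt(1.0 - t)                    # optimal boundary b(t)
    if x >= b:
        return x
    z = (x - a) / np.sqrt(1.0 - t)
    return a + np.sqrt(2 * np.pi * (1.0 - t)) * (1 - alpha**2) * np.exp(z**2 / 2) * norm.cdf(z)

print(v1(0.0, 0.0, 0.0))   # value of the a = 0 bridge started at the origin
```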

4. Bounds on the value function and solution when $\boldsymbol{a}=0$

As may be expected, the solution to (3.13) depends crucially on the value of a. In fact, we find below that the problem is completely solvable in closed form when $a=0$ (corresponding to $m=p$ ). For a nonzero value of a, the problem is more complicated and a richer solution structure emerges. However, we are able to provide the following useful bounds on the value function in (3.13) for an arbitrary a. Moreover, these bounds can be seen to coincide when $a=0$ , yielding the explicit solution in this case.

Proposition 4.1. (Upper bound.) The value function defined in (3.13) satisfies

(4.1) \begin{equation} V(t,x,\pi)\leq (1-\Pi(t,x,\pi))(x+\max\!(a,0))+\Pi(t,x,\pi)V_{1}^{a}(t,x), \end{equation}

where $V_1^a$ is as given in (3.16), and the function $\Pi$ is the updated belief conditional on the process reaching x at time t, defined in (3.15).

Proof. To establish the upper bound, we consider a situation in which the true nature of the process (i.e. $\theta$ ) was revealed to the optimal stopper immediately after starting, i.e. at time $t+$ . In this situation, the optimal stopper would subsequently be able to employ the optimal stopping strategy for the problem given full knowledge of the nature of the underlying process. Specifically, if the process was revealed as a Brownian bridge, then using $\tau_b$ , as defined in (3.18), would be optimal, generating an expected value (at $t=t+$ ) of $V_1^a(t,x)$ . On the other hand, if the process was revealed as a Brownian motion with drift a, then the optimal strategy would be different. In the case where $a<0$ , it would be optimal to stop immediately and receive the value x, and in the case where $a>0$ it would be optimal to wait until $t=1$ and receive the expected value $\mathbb{E}_0[X_1]=x+a(1-t)\leq x+a$ . When $a=0$ , however, any stopping rule would yield an expected value of x, due to the martingality of the process X in this case.

Consider now the value function at $t=t-$ . Acknowledging that the true nature of the process will be imminently revealed, the expected payout is at most $(1-\Pi_t)(x+\max\!(a,0))+\Pi_t V_1^a(t,x)$ , upon noting that $\Pi_t=\Pi(t,x,\pi)$ represents the current belief about the true value of $\theta$ . Finally, recognizing that the set of stopping times in (3.13) is a subset of the stopping times used in the situation described above (where $\theta$ is revealed at $t+$ ), the stated inequality is clear.

Proposition 4.2. (Lower bound.) The value function defined in (3.13) satisfies

(4.2) \begin{equation} V(t,x,\pi)\geq (1-\Pi(t,x,\pi))\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]+\Pi(t,x,\pi)V^{a}_{1}(t,x), \end{equation}

where $V_1^a$ is as given in (3.16), and $\tau_b$ denotes the optimal strategy for the known pinning case described in (3.18). Moreover, the function $\Pi$ is the updated belief conditional on the process reaching x at time t, defined in (3.15).

Proof. The desired bound can be established by employing the optimal strategy for the known pinning case, defined in (3.18), in the stopping problem in (3.13), for $\pi<1$ . In detail, letting $X=X^{t,x}$ for ease of notation, we have

\begin{align} V(t,x,\pi) & = \sup_{0\leq\tau\leq 1-t}\mathbb{E}_{\pi}[X_{t+\tau}] =\sup_{0\leq\tau\leq 1-t}\bigg\{(1-\Pi_t)\mathbb{E}_0\bigg[\frac{X_{t+\tau}}{1-\Pi_{t+\tau}}\bigg]\bigg\} \nonumber \\[5pt] & = \sup_{0\leq\tau\leq 1-t}\bigg\{(1-\Pi_t)\mathbb{E}_0\bigg[X_{t+\tau}\bigg(1+\frac{\Pi_{t+\tau}}{1-\Pi_{t+\tau}}\bigg)\bigg]\bigg\} \nonumber \\[5pt] & = \sup_{0\leq\tau\leq 1-t}\{(1-\Pi_t)\mathbb{E}_0[X_{t+\tau}]+\Pi_t\mathbb{E}_1[X_{t+\tau}]\}, \nonumber \end{align}

where we have applied the measure change from $\mathbb{P}_{\pi}$ to $\mathbb{P}_{0}$ , via (3.9), in the second equality, and the measure change from $\mathbb{P}_0$ to $\mathbb{P}_1$ , via (3.10), in the last equality. Furthermore, employing the stopping rule $\tau_b$ from (3.18) (which may or may not be optimal) yields

\begin{equation*} V(t,x,\pi) \geq (1-\Pi_t)\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]+\Pi_t\mathbb{E}_1[X_{t+\tau_b}] = (1-\Pi_t)\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]+\Pi_t V_1^a(t,x) \end{equation*}

upon noting the definition of $V_1^a$ , and where we have ensured that stopping under $\mathbb{P}_0$ happens at or before $t=1$ (since the boundary b is not guaranteed to be hit by a Brownian motion with drift, unlike the Brownian bridge).

Computation of $\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]$ is difficult in general, being the expected hitting level of a Brownian motion with drift to a square-root boundary. Alternatively, we have $\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]=x+a\mathbb{E}_0[\tau_b\wedge(1-t)]+\mathbb{E}_0[B_{(t+\tau_b)\wedge 1}]=x+a\mathbb{E}_0[\tau_b\wedge(1-t)]$ , with the first-passage time $\tau_b=\inf\{s\geq 0\mid B_s\geq c(s)\}$ , where $c(s)\;:\!=\;a(1-s)-x+\alpha\sqrt{1-t-s}$ . Hence, the computation reduces to the problem of finding the mean first-passage time of a driftless Brownian motion (started at zero) to a time-dependent boundary (which is a mixture of a linear and square-root function). While no explicit expression for $\mathbb{E}_0[\tau_b\wedge(1-t)]$ exists, there are numerous numerical approximations available; see, for example, [Reference Durbin and Williams13], or more recently [Reference Herrmann and Tanré24]. When $a=0$ , it is clear that $\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]=x$ , a result which we will exploit below.
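
As a crude alternative to those algorithms, a plain Monte Carlo estimate of $\mathbb{E}_0[\tau_b\wedge(1-t)]$ can be obtained as in the following sketch (illustrative only; the time discretization slightly biases the hitting times, and $\alpha$ is hard-coded to its approximate value from (3.17)).

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.839924                      # approximate root of (3.17)

def mean_capped_hitting_time(t: float, x: float, a: float,
                             n_paths: int = 20_000, n_steps: int = 2_000) -> float:
    """Monte Carlo estimate of E_0[tau_b ^ (1 - t)], where tau_b is the first time
    the drift-a Brownian motion started at (t, x) reaches b(s) = a + alpha*sqrt(1 - s)."""
    horizon = 1.0 - t
    dt = horizon / n_steps
    s = np.linspace(dt, horizon, n_steps)                     # time since t
    boundary = a + alpha * np.sqrt(1.0 - t - s)               # b(t + s) on the grid
    increments = a * dt + np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
    paths = x + np.cumsum(increments, axis=1)
    hit = paths >= boundary                                   # grid points at or above the boundary
    first = np.where(hit.any(axis=1), hit.argmax(axis=1), n_steps - 1)
    return float(np.mean((first + 1) * dt))

t, x, a = 0.0, 0.0, 0.5
est = mean_capped_hitting_time(t, x, a)
print("E_0[X at stopping] approx:", x + a * est)
```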

Given Propositions 4.1 and 4.2, the following result is evident, and constitutes the main result of this note.

Theorem 4.1. When $a=0$ , the value function in (3.13) is given by

(4.3) \begin{equation} V(t,x,\pi)=(1-\Pi(t,x,\pi))x+\Pi(t,x,\pi)V^0_1(t,x) \quad \textit{for }\pi\in[0,1], \end{equation}

where $\Pi$ is defined in (3.15) and $V^0_1$ is defined in (3.16) (upon setting $a=0$ ). Further, the optimal stopping strategy in (3.13) is given by $\tau^*=\tau_b\wedge(1-t)$ . This stopping strategy is the same for all $\pi\in[0,1]$ .

Proof. The result is evident given the fact that the upper bound defined in (4.1) and the lower bound defined in (4.2) coincide when $a=0$ . Specifically, we observe that $\mathbb{E}_0[X_{(t+\tau_b)\wedge 1}]=x$ in (4.2) since X is a $\mathbb{P}_0$ -martingale when $a=0$ . Moreover, since the process is not guaranteed to pin at $t=1$ , we specify explicitly that the stopper must stop at $t=1$ should the boundary b not be hit.

Note that the optimality of the solution presented in (4.3) does not need to be verified since it follows directly from the proven identity in (4.3) and the existing verification arguments establishing the optimality of $V_1^0$ (provided in [Reference Ekström and Wanntorp19], for example).
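
To illustrate Theorem 4.1 numerically, the sketch below (illustrative only, and repeating the small helpers so that it is self-contained; $\alpha$ is hard-coded to its approximate value from (3.17)) combines the posterior weight of (3.15) with $V_1^0$ from (3.16).

```python
import numpy as np
from scipy.stats import norm

alpha = 0.839924   # approximate root of (3.17)

def v1_zero(t: float, x: float) -> float:
    """V_1^0(t, x): known pinning to a = 0, from (3.16)."""
    if x >= alpha * np.sqrt(1.0 - t):
        return x
    z = x / np.sqrt(1.0 - t)
    return np.sqrt(2 * np.pi * (1.0 - t)) * (1 - alpha**2) * np.exp(z**2 / 2) * norm.cdf(z)

def value_uncertain(t: float, x: float, pi0: float) -> float:
    """V(t, x, pi) from (4.3) when a = 0."""
    odds = pi0 / (1.0 - pi0) * np.exp(-x**2 / (2.0 * (1.0 - t))) / np.sqrt(1.0 - t)
    belief = odds / (1.0 + odds)                       # Pi(t, x, pi) from (3.15)
    return (1.0 - belief) * x + belief * v1_zero(t, x)

# x <= V(t, x, pi) <= V_1^0(t, x); V changes sign near x = -0.286 when pi = 0.5
for x in (-0.5, -0.286, 0.0, 0.5):
    print(x, round(value_uncertain(0.0, x, 0.5), 4), round(v1_zero(0.0, x), 4))
```

Evaluating this for $\pi=0.5$ shows the value changing sign close to $x\approx-0.286$ , consistent with the discussion following Figure 1 below.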

The equality found in (4.3) also demonstrates that (when $a=0$ ) there is no loss in value due to the optimal stopper using a sub-optimal stopping strategy for the ‘true’ drift. The optimal stopping strategy for a Brownian bridge also achieves the maximum possible value for the Brownian motion (due to martingality), hence using $\tau^*$ will achieve the maximum possible value regardless of the true nature of the underlying process. For $a\neq 0$ , it would not be possible to achieve the maximum value in both drift scenarios simultaneously through a single optimal stopping rule. Hence there would be a loss in value due to this, as indicated by the inequality in Proposition 4.1.

Remark 4.1. It is also worth noting that the arguments in the proof of Theorem 4.1 would carry over to a more general setting in which the process is believed to be either a martingale M or a diffusion X (with an initial probability $\pi$ of being X). In this case, arguments similar to those in Proposition 4.1 will show that $V(t,x,\pi)\leq(1-\Pi_t)x+\Pi_tV_1(t,x)$ , where $V_1$ denotes the solution to the associated stopping problem for the diffusion X. Under $\mathbb{P}_0$ , all stopping rules generate the expected value of x, due to M being a $\mathbb{P}_0$ -martingale. Moreover, arguments similar to those in Proposition 4.2 will show that $V(t,x,\pi)\geq(1-\Pi_t)x+\Pi_t V_1(t,x)$ , upon using the optimal strategy for the optimal stopping problem under $\mathbb{P}_1$ , and noting again that $\mathbb{E}_0[X_{t+\tau}]=x$ , for any stopping rule. Finally, we must note that the function $\Pi_t$ would need to be found on a case-by-case basis via a mapping similar to (3.6). In general, however, this mapping could also include path-dependent functionals of the process over [0, t], in addition to the values of t and x (cf. [Reference Johnson and Peskir27, Proposition 4]).

Next, Theorem 4.1 also implies the following result.

Corollary 4.1. When $a=0$ , we have $x\leq V(t,x,\pi)\leq V_1^0(t,x)$ and $\pi\mapsto V(t,x,\pi)$ is increasing, with $V(t,x,0)=x$ and $V(t,x,1)=V_1^0(t,x)$ .

Proof. From (4.3) we have that $V-V_1^0=(1-\Pi)(x-V_1^0)\leq 0$ , where the inequality is due to the fact that $\Pi\leq 1$ and $V_1^0\geq x$ , from (3.16). Direct differentiation of (4.3), upon noting (3.15), also shows that

\begin{equation*}\frac{\partial V}{\partial\pi} = \frac{L^0(V_1^0-x)}{\big[(1-\pi)\big(1+\tfrac{\pi}{1-\pi}L^0\big)\big]^2}\geq 0,\end{equation*}

proving the second claim.

Corollary 4.1 reveals that, while the optimal stopping strategy is the same with pinning certainty or uncertainty when $a=0$ , the value function with uncertainty is lower than if the pinning was certain/known. In other words, when sampling from a balanced urn with uncertainty about replacement, the optimal stopping strategy is the same as when sampling is known to be without replacement, but the expected payout is lower. To illustrate this, Figure 1 plots the value function V in (4.3) in comparison with $V_1^0$ as defined in (3.16). We confirm that a larger $\pi$ (hence a stronger belief that the process is indeed a Brownian bridge) corresponds to a larger value of V.

Figure 1. The solution to the problem in (3.13) when the process is believed to be a Brownian bridge (pinning to $a=0$ ) with probability $\pi$ or a (driftless) Brownian motion with probability $1-\pi$ . Solid lines = $V(0,x,\pi)$ from (4.3) for $\pi=\{0.1, 0.5, 0.9\}$ (higher lines correspond to larger $\pi$ ); dashed line = $V_1^0(0,x)$ from (3.16); dotted line = x.

Figure 1 also highlights the fact that the value function in (3.13) can be negative, since pinning to zero is not guaranteed (and hence stopping at $t=1$ does not guarantee a minimum payoff of zero). For example, if $\pi=0.5$ (i.e. sampling with or without replacement were both initially thought to be equally likely), then the value function in (4.3) would be negative for all $x<-0.286$ . This does not mean, however, that it would be optimal to stop once the running payoff drops below this value: stopping yields the immediate (negative) payoff x, which is strictly smaller than the expected payoff $V(t,x,\pi)$ obtained by continuing and stopping according to $\tau^*$ , even though the latter may itself be negative.

5. The case where a is nonzero

If the urn is not balanced, meaning that $m\neq p$ , then a nonzero drift and a nonzero pinning point are introduced into the process X. This asymmetry complicates the problem considerably and, while the bounds in (4.1) and (4.2) are still valid, a closed-form solution to (3.13) is no longer available. Attempting to provide a detailed analytical investigation of this case is beyond the scope of this note. However, numerical investigation of the variational inequality associated with (3.13) suggests that a rich solution structure emerges, particularly in the case where $a>0$ , when multiple stopping boundaries can arise. We therefore conclude this note by exposing some of this structure to pique the reader’s interest.

Remark 5.1. It should be noted that if the drift of the Brownian motion was zero, but the Brownian bridge had a nonzero pinning level, then the results of Theorem 4.1 would still hold (due to the martingality of X under $\mathbb{P}_0$ ). However, this situation does not correspond to the urn problem described in Section 2, in which both the drift and the pinning point must be the same.

To shed some light on the optimal stopping strategy for a nonzero a, it is useful to reformulate the problem in (3.13) under the measure $\mathbb{P}_0$ as follows:

\begin{align} V(t,x,\pi) & = \sup_{0\leq\tau\leq 1-t}\mathbb{E}_{\pi}[X_{t+\tau}] = (1-\Pi_t)\sup_{0\leq\tau\leq 1-t}\mathbb{E}_0\bigg[\frac{X_{t+\tau}}{1-\Pi_{t+\tau}}\bigg] \nonumber \\[5pt] & = (1-\Pi(t,x,\pi))\sup_{0\leq\tau\leq 1-t}\mathbb{E}_0\bigg[X_{t+\tau}\bigg(1+\frac{\pi}{1-\pi}L^a(t+\tau,X_{t+\tau})\bigg)\bigg] \nonumber \\[5pt] & = (1-\Pi(t,x,\pi))\sup_{0\leq\tau\leq 1-t}\mathbb{E}_0[G^{\pi,a}(t+\tau,X_{t+\tau})] \nonumber \\[5pt] & = (1-\Pi(t,x,\pi))\widetilde{V}^{\pi,a}(t,x), \nonumber\end{align}

where we have used (3.9) in the second equality (to change measure) and the mapping from (3.6) in the third equality (to eliminate $\Pi_{t+\tau}$ ). We have also defined the auxiliary optimal stopping problem

(5.1) \begin{equation} \widetilde{V}^{\pi,a}(t,x)\;:\!=\;\sup_{0\leq\tau\leq 1-t}\mathbb{E}_0[G^{\pi,a}(t+\tau,X_{t+\tau})],\end{equation}

and the payoff function

\begin{equation*} G^{\pi,a}(t,x)\;:\!=\;x\bigg(1+\frac{\pi}{1-\pi}L^a(t,x)\bigg),\end{equation*}

where $L^a$ is given in (3.6), which importantly is dependent on the parameter a.

Next, defining the infinitesimal generator associated with X as

\begin{equation*}\mathbb{L}_X\;:\!=\;\frac{1}{2}\frac{\partial^2}{\partial x^2}+a\frac{\partial}{\partial x},\end{equation*}

then Itô’s formula and an application of the optional sampling theorem for any given $\tau$ yield

(5.2) \begin{equation} \mathbb{E}_0[G^{\pi,a}(t+\tau,X_{t+\tau})]=G^{\pi,a}(t,x)+\mathbb{E}_0\int_0^{\tau}H(t+s,X_{t+s}) \, \mathrm{d} s,\end{equation}

where

\begin{equation*} H(t,x)\;:\!=\;\bigg(\frac{\partial}{\partial t}+\mathbb{L}_X\bigg)G^{\pi,a}(t,x) = a-\frac{\pi(x-a)}{(1-\pi)(1-t)}L^a(t,x).\end{equation*}

Hence, from (5.2) it is clear that it would never be optimal to stop at a point (t, x) for which $H(t,x)>0$ . For $a=0$ , this region corresponds to $x<0$ . However, the shape of this region is qualitatively different for nonzero a. To illustrate this, Figure 2 plots the behavior of H for both positive and negative values of a.

Figure 2. The behavior of the function H (for $\pi=0.5$ and at various times) for $a=-1$ (on the left) and $a=1$ (on the right). The solid line represents the value at $t=0$ , and the dashed lines represent it at $t=\{0.2, 0.4, 0.6, 0.8\}$ .

Considering the case where $a<0$ , Figure 2 reveals that H is strictly negative for all x before some critical time (calculated to be 0.536 for the $a=-1$ example). Furthermore, when the function does become positive, it only does so in a rather narrow interval (below a). This suggests that the incentive to stop is rather strong when $a<0$ , as one might expect. However, little more can be gleaned from the function H in this case. For $a>0$ , however, the function H is more informative about the optimal stopping strategy. Here, we find that H is strictly positive for all x before some critical time (again found to be 0.536 for $a=1$ ). This indicates that when $a>0$ , it would never be optimal to stop before this critical time. Moreover, since $\lim_{x\to\infty}H(t,x)=a$ , we also observe that any stopping region must be contained in a finite interval (above a). This suggests the existence of a disjoint continuation region and the presence of two separate optimal stopping boundaries. Indeed, these predictions are confirmed numerically below. This richer structure is also consistent with the results in [Reference Ekström and Vaicenavicius17], whose authors found similar disjoint continuation regions in a situation where the location of the pinning point of a Brownian bridge was uncertain.
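
To make the sign analysis of H concrete, the sketch below (illustrative parameter values matching the $\pi=0.5$ case of Figure 2, and using the likelihood ratio $L^a$ of (3.6)) evaluates H on a grid at an early and a late time.

```python
import numpy as np

a, pi0 = 1.0, 0.5
k = pi0 / (1.0 - pi0)

def L(t, x):
    """Likelihood ratio L^a(t, x) from (3.6)."""
    return np.exp(-(x - a * t) ** 2 / (2.0 * (1.0 - t))) / np.sqrt(1.0 - t)

def H(t, x):
    """Drift H(t, x) of the payoff G^{pi,a} under P_0, as displayed above."""
    return a - k * (x - a) / (1.0 - t) * L(t, x)

xs = np.linspace(-4.0, 6.0, 2001)
for t in (0.3, 0.8):
    h = H(t, xs)
    print(t, h.min(), xs[h.argmin()])
# Early on (t = 0.3) H is positive for every x, so stopping cannot be optimal yet;
# later (t = 0.8) H dips below zero only on a bounded interval of x above a.
```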

To investigate the solution to (5.1), and hence (3.13), numerically, we employ finite difference techniques applied to an associated variational inequality. The connection between optimal stopping problems and variational inequalities has long been established (see, for example, [Reference Øksendal30, Section 10.4]). Specifically, it can be seen that a candidate solution to (5.1) can be obtained by solving the following variational inequality, expressed as a linear complementarity problem (see [Reference Zhu, Wu, Chern and Sun33, Section 2.5.5] for a general formulation):

(5.3) \begin{equation} \left\{ \begin{array}{l} \left[\dfrac{\partial\widetilde{V}^{\pi,a}}{\partial t}(t,x)+\mathbb{L}_X\widetilde{V}^{\pi,a}(t,x)\right]\left[\widetilde{V}^{\pi,a}(t,x)-G^{\pi,a}(t,x)\right]=0 , \\[13pt] \dfrac{\partial\widetilde{V}^{\pi,a}}{\partial t}(t,x)+\mathbb{L}_X\widetilde{V}^{\pi,a}(t,x) \leq 0 , \\[12pt] \widetilde{V}^{\pi,a}(t,x)-G^{\pi,a}(t,x)\geq 0 , \\[5pt] \widetilde{V}^{\pi,a}(1,x)=G^{\pi,a}(1,x). \end{array}\right.\end{equation}

We have stated the problem as a linear complementarity problem (as opposed to a free-boundary problem with smooth-pasting applied at the unknown boundaries), since the structure of the continuation and stopping regions in (t, x) is not known a priori for (5.1). As can be seen from (5.3), the locations of the optimal stopping boundaries do not appear explicitly in the problem formulation; instead, they are determined implicitly by where the constraint $\widetilde{V}^{\pi,a}\geq G^{\pi,a}$ binds. Once problem (5.3) has been solved (or numerically approximated), the optimal stopping boundaries can simply be read off from the solution by identifying where the function $\widetilde{V}^{\pi,a}-G^{\pi,a}$ switches from being positive to zero. The implicit treatment of the optimal stopping boundaries in (5.3) means that any complex structure of the continuation and stopping regions will be revealed as part of the solution. Indeed, this is exactly what we see below when $a>0$ , where multiple stopping boundaries are identified.

To approximate the solution to (5.3) numerically, we discretize the problem using the Crank–Nicolson differencing scheme and then solve the resulting system of equations using the projected successive over-relaxation (PSOR) algorithm. A more detailed description of the PSOR method, along with proofs of convergence, can be found in [Reference Cryer9]. More details about the discretization and implementation of the algorithm are also available from the author upon request. Figure 3 shows the optimal stopping boundaries obtained from our numerical procedure for various values of a (both negative and positive).
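
As a concrete illustration of this type of procedure, the following sketch implements a Crank–Nicolson discretization of (5.3) with a basic PSOR iteration. It is a minimal illustration only, not the implementation used for Figure 3: the truncated x-domain, the time grid ending shortly before $t=1$ (where $L^a$ degenerates near $x=a$ ), the boundary values and all tolerances are ad hoc choices of ours.

```python
import numpy as np

a, pi0 = 1.0, 0.5
k = pi0 / (1.0 - pi0)

def payoff(t, x):
    """Obstacle G^{pi,a}(t, x) = x * (1 + k * L^a(t, x))."""
    La = np.exp(-(x - a * t) ** 2 / (2.0 * (1.0 - t))) / np.sqrt(1.0 - t)
    return x * (1.0 + k * La)

x_min, x_max, nx = -4.0, 6.0, 200
t_end, nt = 1.0 - 1e-2, 200                     # stop the time grid shortly before t = 1
x = np.linspace(x_min, x_max, nx + 1)
dx, dt = x[1] - x[0], t_end / nt

# Weights of (dt/2) * L_X (with L_X = 0.5 d^2/dx^2 + a d/dx) on the interior nodes.
lo = 0.25 * dt * (1.0 / dx**2 - a / dx)         # multiplies V[i-1]
di = -0.5 * dt / dx**2                          # multiplies V[i]
up = 0.25 * dt * (1.0 / dx**2 + a / dx)         # multiplies V[i+1]

V = payoff(t_end, x)                            # terminal condition at the cut-off time
for n in range(nt, 0, -1):
    t_new = (n - 1) * dt
    G = payoff(t_new, x)
    rhs = V.copy()
    rhs[1:-1] += lo * V[:-2] + di * V[1:-1] + up * V[2:]   # explicit half of Crank-Nicolson
    Vn = np.maximum(V, G)
    Vn[0], Vn[-1] = G[0], G[-1]                 # crude Dirichlet values at the truncation
    omega = 1.4
    for _ in range(200):                        # projected SOR sweeps for the implicit half
        err = 0.0
        for i in range(1, nx):
            gs = (rhs[i] + lo * Vn[i - 1] + up * Vn[i + 1]) / (1.0 - di)
            new = max(G[i], Vn[i] + omega * (gs - Vn[i]))
            err = max(err, abs(new - Vn[i]))
            Vn[i] = new
        if err < 1e-8:
            break
    V = Vn

active = np.abs(V - payoff(0.0, x))[1:-1] < 1e-6
print("interior points where the obstacle binds at t = 0:", x[1:-1][active])
```

For these parameter values the printed set should be (essentially) empty, up to artifacts of the crude truncation, which is in line with the observation above that stopping is never optimal at early times when $a>0$ ; recording the active set at every time level instead traces out the boundaries of the kind shown in Figure 3.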

Figure 3. The optimal stopping boundaries (found numerically) for various a. On the left: $a=\{-2.0, -1.0, -0.5\}$ (solid lines) and $\alpha\sqrt{1-t}$ (dashed line). On the right: $a=\{0.2, 0.5, 1.0\}$ (solid lines) and $a+\alpha\sqrt{1-t}$ (dashed lines).

Let us first discuss the case where $a>0$ . As predicted, Figure 3 reveals that it would never be optimal to stop before some critical time (for large enough a or small enough $\pi$ at least). Recalling that the optimal strategy for a Brownian motion with positive drift is to wait until $t=1$ , it would appear that waiting to learn more about the true nature of the process is optimal (at least initially). In addition, beyond some critical time, we observe two disjoint continuation regions, indicating that, depending on the sample path experienced, it can be optimal to stop either after an increase in X (i.e. after a p-ball has been drawn) or after a decrease in X (i.e. after an m-ball has been drawn). Based on the terminology introduced in [Reference Ekström and Vaicenavicius17], we can interpret the former boundary as a too-good-to-persist boundary and the latter as a stop-loss boundary. The emergence of an endogenous stop-loss boundary in the optimal stopping strategy is a unique feature of the problem with uncertain pinning. Finally, we also observe that both stopping boundaries lie above the corresponding boundary if pinning was certain (given by $a+\alpha\sqrt{1-t}$ ), indicating that when $a>0$ , stopping will happen later in the presence of pinning uncertainty.

For the case where $a<0$ , we have the following remarks. Firstly, numerical investigations suggest that it is never optimal to stop when $x<0$ , despite the negative drift. Secondly, the optimal stopping strategy appears to be of the form $\tau=\inf\{s\geq 0\mid X_{t+s}\geq \widehat{b}(t+s)\}\wedge(1-t)$ for some time-dependent boundary $\widehat{b}$ . Further, $\widehat{b}(t)$ appears to converge to zero at $t=1$ , although it does not do so monotonically for all parameters. Moreover, the boundary itself is not monotone in the parameter a, i.e. $a\mapsto\widehat{b}(t)$ is not monotone. This behavior is most likely due to the differing effects of a on the linear drift of the Brownian motion and the pinning behavior of the Brownian bridge.

Due to the existence of multiple stopping boundaries, and their observed non-monotonic behavior, further analytical investigation of the problem for $a\neq 0$ would be challenging and is left for the subject of future research.

Acknowledgements

I wish to thank Erik Ekström for many fruitful conversations about this work, and two anonymous referees for their detailed comments, all of which helped improve both the results and the presentation of the paper.

Funding information

I am grateful for funding received from the UTS Business School Research Grant Scheme.

Competing interests

There were no competing interests to declare during the preparation or publication of this article.

References

Avellandeda, M. and Lipkin, M. (2003). A market-induced mechanism for stock pinning. Quant. Finance 3, 417–425.
Baurdoux, E. J., Chen, N., Surya, B. A. and Yamazaki, K. (2015). Optimal double stopping of a Brownian bridge. Adv. Appl. Prob. 47, 1212–1234.
Boyce, W. (1973). On a simple optimal stopping problem. Discrete Math. 5, 297–312.
Breiman, L. (1964). Stopping rule problems. In Applied Combinatorial Mathematics, ed. E. F. Beckenbach. Wiley, New York, pp. 284–319.
Brennan, M. J. and Schwartz, E. S. (1990). Arbitrage in stock index futures. J. Business 63, 7–31.
Chen, R. W., Grigorescu, I. and Kang, W. (2015). Optimal stopping for Shepp’s urn with risk aversion. Stochastics 84, 702–722.
Chen, R. W., Zame, A., Lin, C. T. and Wu, H. (2005). A random version of Shepp’s urn scheme. SIAM J. Discrete Math. 19, 149–164.
Chow, Y. S. and Robbins, H. (1965). On optimal stopping rules for $S_n/n$. Illinois J. Math. 9, 444–454.
Cryer, C. (1971). The solution of a quadratic programming problem using systematic overrelaxation. SIAM J. Control 9, 385–392.
De Angelis, T. and Milazzo, A. (2020). Optimal stopping for the exponential of a Brownian bridge. J. Appl. Prob. 57, 361–384.
Dehghanian, A. and Kharoufeh, J. P. (2019). Optimal stopping with a capacity constraint: Generalizing Shepp’s urn scheme. Operat. Res. Lett. 47, 311–316.
Donsker, M. D. (1951). An invariance principle for certain probability limit theorems. Mem. Amer. Math. Soc. 5, 43–55.
Durbin, J. and Williams, D. (1992). The first-passage density of the Brownian motion process to a curved boundary. J. Appl. Prob. 29, 291–304.
Dvoretzky, A. (1967). Existence and properties of certain optimal stopping rules. In Proc. Fifth Berkeley Symp. Math. Statist. Prob., Vol. 1. University of California Press, Berkeley, CA, pp. 441–452.
Ekström, E. and Lu, B. (2011). Optimal selling of an asset under incomplete information. Int. J. Stoch. Anal., 543590.
Ekström, E. and Vaicenavicius, J. (2016). Optimal liquidation of an asset under drift uncertainty. SIAM J. Financial Math. 7, 357–381.
Ekström, E. and Vaicenavicius, J. (2020). Optimal stopping of a Brownian bridge with unknown pinning point. Stoch. Process. Appl. 130, 806–823.
Ekström, E. and Vannestål, M. (2019). American options with incomplete information. Internat. J. Theor. Appl. Finance 22, 1950035.
Ekström, E. and Wanntorp, H. (2009). Optimal stopping of a Brownian bridge. J. Appl. Prob. 46, 170–180.
Ernst, P. A. and Shepp, L. A. (2015). Revisiting a theorem of L. A. Shepp on optimal stopping. Commun. Stoch. Anal. 9, 419–423.
Föllmer, H. (1972). Optimal stopping of constrained Brownian motion. J. Appl. Prob. 9, 557–571.
Gapeev, P. (2012). Pricing of perpetual American options in a model with partial information. Internat. J. Theor. Appl. Finance 15, 1–22.
Glover, K. (2022). Optimally stopping a Brownian bridge with an unknown pinning time: A Bayesian approach. Stoch. Process. Appl. 150, 919–937.
Herrmann, S. and Tanré, E. (2016). The first-passage time of the Brownian motion to a curved boundary: An algorithmic approach. SIAM J. Sci. Comput. 38, A196–A215.
Hung, Y.-C. (2009). A note on randomized Shepp’s urn scheme. Discrete Math. 309, 1749–1759.
Johnson, P. and Peskir, G. (2017). Quickest detection problems for Bessel processes. Ann. Appl. Prob. 27, 1003–1056.
Johnson, P. and Peskir, G. (2018). Sequential testing problems for Bessel processes. Trans. Amer. Math. Soc. 370, 2085–2113.
Liptser, R. S. and Shiryaev, A. N. (2001). Statistics of Random Processes I: General Theory, 2nd edn (Appl. Math. 5). Springer, New York.
Liu, J. and Longstaff, F. A. (2004). Losing money on arbitrage: Optimal dynamic portfolio choice in markets with arbitrage opportunities. Rev. Financial Studies 17, 611–641.
Øksendal, B. (2004). Stochastic Differential Equations: An Introduction with Applications, 6th edn. Springer, New York.
Peskir, G. and Shiryaev, A. N. (2006). Optimal Stopping and Free-Boundary Problems (Lect. Math. ETH Zürich). Birkhäuser, Basel.
Shepp, L. A. (1969). Explicit solutions to some problems of optimal stopping. Ann. Math. Statist. 40, 993–1010.
Zhu, Y.-L., Wu, X., Chern, I.-L. and Sun, Z.-Z. (2004). Derivative Securities and Difference Methods. Springer, New York.