1. Introduction
Research on applying stochastic control theory to analyze insurance problems has long attracted a great deal of interest among actuaries because optimal control provides both theoretical and practical solutions to optimization problems in insurance. There are many research papers studying optimal insurance problems involving reinsurance and investment, which help the insurer to increase profits and to reduce the claim risk. For example, [Reference Browne8] considered two optimal investment problems for an insurer under a diffusion risk model, namely, maximizing expected exponential utility of terminal wealth and minimizing the probability of ruin. The probability of ruin for a diffusion risk model was minimized via reinsurance and investment in [Reference Promislow and Young26]; [Reference Schmidli28] studied a similar problem for a compound Poisson risk model; [Reference Irgens and Paulsen17] incorporated proportional and excess-of-loss reinsurance into a jump-diffusion model with investment to maximize the expected utility of terminal wealth; [Reference Liang, Yuen and Guo22] explored an optimal proportional reinsurance and investment model in a stock market driven by an Ornstein–Uhlenbeck process; and [Reference Li, Li and Young20] studied an optimal investment and reinsurance problem under the mean-variance criterion. See also [Reference Bi, Liang and Yuen3, Reference Hipp and Taksar15, Reference Sun, Zhang and Yuen29, Reference Zhang and Siu32], to name just a few.
In most of the above-referenced research, the insurer has complete knowledge of the model and of the values of the processes in that model, i.e. the problems are considered under full information. However, in reality, the insurance company generally has only partial information, which typically means knowledge of the model but not of the values of all the processes in that model. Portfolio optimization problems with unobservable information have been an active topic in mathematical finance. However, results related to insurance models are relatively few. Research on portfolio optimization generally assumes that the drift of the traded stock is unobservable; see, for example, [Reference Brendle6, Reference Brendle7]. Also, [Reference Lakner18, Reference Lakner19] investigated a similar problem by using the martingale-duality approach; [Reference Bäuerle and Rieder2] assumed that the stock price follows a geometric Brownian motion with Poisson jumps, in which the jump intensity is unobservable; and [Reference Liang and Bayraktar21] considered an optimal reinsurance and investment problem by maximizing the expected exponential utility of the insurer’s terminal wealth in a Black–Scholes financial market, with a compound Poisson claim process in which the claim intensity and the jump-size distribution depend on the state of a non-observable Markov chain. See also [Reference Bäuerle and Leimcke1, Reference Björk, Davis and Landén4, Reference Brachetta and Ceci5, Reference Gennotte14, Reference Honda16, Reference Peng and Hu25, Reference Xiong, Xu and Zheng30].
In this paper we find the optimal reinsurance strategy for an insurer to maximize the expected exponential utility of its terminal wealth. We use a diffusion model to describe the dynamics of the claim, in which the drift of the claims follows a mean-reverting Ornstein–Uhlenbeck process. We consider two cases: full information and partial information. Full information occurs when the insurer directly observes the drift; partial information occurs when the insurer observes only its claims. We use the filtering technique to transform the unobservable problem into an observable one; then, by applying the dynamic programming approach, we derive explicit expressions for the value function and the corresponding optimal reinsurance strategy. We also determine a relationship between the value function and reinsurance strategy under full information and those under partial information. Finally, we present numerical examples to illustrate possible outcomes of the model.
The rest of this paper is organized as follows. In Section 2, we describe the insurance model in both full and partial information frameworks. In Section 3, we use a dynamic Hamilton–Jacobi–Bellman (HJB) equation approach to find the value function and optimal reinsurance strategy under complete information. In Section 4, we consider the problem with unobservable information. Moreover, we discuss how to handle a constraint on the proportion reinsured, and we provide numerical simulation results to compute the probability that the reinsurance proportion lies outside of the interval [0, 1] for some values of the parameters. Finally, in Section 5 we compare the value function and the optimal reinsurance strategy under partial observation with the ones under full observation. We also provide numerical examples to show the difference between the two optimal reinsurance proportions at the end of this section.
2. Problem formulation
In this section we describe the claim process of the insurer, and we formulate the problem of maximizing the insurer’s expected exponential utility of terminal wealth. Let $(\Omega, \mathcal F, \mathbb P)$ be a probability space that supports two correlated standard Brownian motions $W_1$ and $W_2$ , with constant coefficient of correlation $\rho \in [{-}1, 1]$ . As in [Reference Promislow and Young26], assume that the claim process $S = \{S_t\}_{0 \le t \le T}$ follows a Brownian motion with drift, in which $S_t$ equals the cumulative claims paid by the insurer during the time interval [0, t]. However, unlike [Reference Promislow and Young26], the drift follows a random process. Specifically, $\text{d} S_t = \mu_t \text{d} t - \sigma \text{d} W_{1,t}$ , in which $S_0 = 0$ and $\sigma$ is a positive constant, and
$$\text{d} \mu_t = a (\bar{\mu} - \mu_t) \, \text{d} t + b \, \text{d} W_{2,t}, \tag{2.1}$$
in which a, b, and $\bar{\mu}$ are positive constants, i.e. $\{\mu_t\}_{0 \le t \le T}$ follows a mean-reverting Ornstein–Uhlenbeck process. We loosely interpret $\bar{\mu}$ as a long-run value of $\mu_t$ , and a measures the ‘speed’ at which $\mu_t$ moves towards $\bar{\mu}$ . This model was explored in [Reference Dassios and Jang12] as the diffusion approximation of a Cox process with a shot-noise process as the claim intensity.
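To make these dynamics concrete, the following minimal sketch simulates one path of the drift and of the cumulative claims with an Euler–Maruyama scheme. It assumes the standard mean-reverting form $\text{d}\mu_t = a(\bar{\mu} - \mu_t)\text{d}t + b\,\text{d}W_{2,t}$ for the drift, and it builds the correlated pair $(W_1, W_2)$ from two independent Gaussian draws. The function name `simulate_claims`, the starting value `mu0`, the step count, and the seed are illustrative choices, not part of the model.

```python
import math
import random


def simulate_claims(a, b, mu_bar, sigma, rho, mu0, T, n_steps, seed=0):
    """Euler-Maruyama path of the drift mu and cumulative claims S on [0, T].

    Assumed dynamics (standard OU form, an assumption of this sketch):
        d mu_t = a (mu_bar - mu_t) dt + b dW_2
        d S_t  = mu_t dt - sigma dW_1,   corr(W_1, W_2) = rho
    """
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    mu, s = [mu0], [0.0]
    for _ in range(n_steps):
        z2 = rng.gauss(0.0, 1.0)
        z_indep = rng.gauss(0.0, 1.0)
        # Correlate the two Brownian increments with coefficient rho.
        dw2 = sqrt_dt * z2
        dw1 = sqrt_dt * (rho * z2 + math.sqrt(1.0 - rho * rho) * z_indep)
        # Update S first so that it uses the drift at the left endpoint.
        s.append(s[-1] + mu[-1] * dt - sigma * dw1)
        mu.append(mu[-1] + a * (mu_bar - mu[-1]) * dt + b * dw2)
    return mu, s


mu_path, s_path = simulate_claims(
    a=1.0, b=2.5, mu_bar=2.0, sigma=3.0, rho=0.4, mu0=0.0, T=1.0, n_steps=2000
)
```

Setting `b = 0` and `sigma = 0` removes the noise, in which case the simulated drift relaxes deterministically towards `mu_bar`; this is a convenient sanity check on the discretization.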
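The insurer collects premium at the constant rate $c = (1 + \theta)\bar{\mu}$ , in which $\theta \ge 0$ is the constant proportional risk loading. The insurer is able to purchase proportional reinsurance with constant proportional risk loading $\eta \ge \theta$ . Let $q_t$ denote the proportional amount of business retained at time $t \in [0, T]$ ; thus, the insurer pays reinsurance premium at the rate $(1 + \eta)(1 - q_t)\bar{\mu}$ and retains the claims $q_t \, \text{d} S_t$ , and the controlled surplus $X = \{X_t\}_{0 \le t \le T}$ follows the stochastic differential equation (SDE)
$$\text{d} X_t = \big[ (\theta - \eta)\bar{\mu} + \big( (1 + \eta)\bar{\mu} - \mu_t \big) q_t \big] \text{d} t + \sigma q_t \, \text{d} W_{1,t}, \tag{2.2}$$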
with $X_0 = x$ . If $0 \le q_t \le 1$ , then $q_t$ is the usual proportional reinsurance. In Examples 4.1 and 4.2 we calculate the probability that the optimal $q_t$ lies outside the interval [0, 1].
The insurer chooses a retention strategy $q = \{ q_t \}_{0 \le t \le T}$ based on the available information, and we consider two cases in this paper:
- Full information: In this case, the insurer observes both $\{S_t\}$ and $\{ \mu_t \}$ . A retention strategy q is admissible in this case if (i) q is adapted to the filtration $\mathbb F = \{\mathcal F_t\}_{0 \le t \le T}$ , in which $\mathcal F_t = \sigma (S_s, \mu_s\colon 0 \le s \le t)$ for all $t \in [0, T]$ ; (ii) q is conditional ${L}^2$ -integrable, i.e. $\mathbb E \big[\int^T_t q^2_u \, \text{d} u \mid \mathcal{F}_t \big] < \infty$ for any $0 \le t \le T$ ; and (iii) the SDE (2.2) has a pathwise unique solution $\{X^q_t\}_{t\in [0,T]}$ . Let $\mathcal A^{f}$ denote the set of admissible strategies in the full information case.
- Partial information: In this case, the insurer observes only $\{S_t\}$ and does not know the drift of its claim process, although the insurer knows the conditional expectation and variance of $\mu_0$ . A retention strategy q is admissible in this case if (i) q is adapted to the filtration $\mathbb G = \{ \mathcal G_t \}_{0 \le t \le T}$ , in which $\mathcal G_t = \sigma (S_s\colon 0 \le s \le t)$ for all $t \in [0, T]$ ; (ii) q is conditional ${L}^2$ -integrable with respect to $\mathbb G$ , i.e. $\mathbb E \big[\int^T_t q^2_u \, \text{d} u \mid \mathcal{G}_t \big] < \infty$ for any $0 \le t \le T$ ; and (iii) the SDE of the controlled surplus under $\mathcal G_t$ has a pathwise unique solution. Let $\mathcal A^{p}$ denote the set of admissible strategies in the partial information case. Note that $\mathcal G_t \subset \mathcal F_t$ for all $t \in [0, T]$ , i.e. $\mathbb G$ is a subfiltration of $\mathbb F$ ; we also assume that $\mathbb F$ and $\mathbb G$ are augmented to satisfy the usual conditions of completeness and right continuity.
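In both cases, the insurer chooses q to maximize the expectation of exponential utility of wealth at time T. Let ${V^f}$ denote the maximum expected exponential utility of terminal wealth under full information, i.e.
$${V^f}(t, x, \mu) = \sup_{q \in \mathcal A^{f}} \mathbb E \big[ {-}\text{e}^{-\gamma X_T} \mid X_t = x, \mu_t = \mu \big], \tag{2.3}$$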
in which $\gamma > 0$ is the (constant) coefficient of absolute risk aversion.
For the partial information case, as in [Reference Björk, Davis and Landén4, Reference Brendle6, Reference Brendle7], we first project the drift process $\mu_t$ onto the observable filtration $\mathbb G$ , in order to reduce the partially observable problem to an equivalent problem with full information. Define $m_t = \mathbb E (\mu_t \mid \mathcal G_t)$ , $0 \le t \le T$ ; then, [Reference Liptser and Shiryaev23, Theorem 10.3] shows us that $\{m_t\}_{0 \le t \le T}$ follows the SDE
$$\text{d} m_t = a (\bar{\mu} - m_t) \, \text{d} t + \frac{\sigma \rho b - v(t)}{\sigma} \, \text{d} \bar W_{1,t}, \tag{2.4}$$
in which $v(t) = {\text{Var}}(\mu_t \mid \mathcal G_t)$ and $\bar W_1 = \{\bar W_{1,t}\}_{0 \le t \le T}$ is the so-called innovations process given by
$$\bar W_{1,t} = W_{1,t} - \frac{1}{\sigma} \int_0^t (\mu_s - m_s) \, \text{d} s, \tag{2.5}$$
and $\bar W_1$ is a $(\mathbb P, \mathbb G)$ -standard Brownian motion. In other words, $\{m_t\}_{0 \le t \le T}$ follows a $\mathbb G$ -Ornstein–Uhlenbeck process with non-constant volatility.
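As a rough illustration of how the filtered drift could be computed from observed claims, the sketch below applies an Euler discretization to the filter. It assumes the Kalman–Bucy form $\text{d} m_t = a(\bar{\mu} - m_t)\text{d}t + ((\sigma\rho b - v(t))/\sigma)\,\text{d}\bar W_{1,t}$ with $\text{d}\bar W_{1,t} = (m_t \text{d}t - \text{d}S_t)/\sigma$ , together with the Riccati equation $v'(t) = -2av(t) + b^2 - (\sigma\rho b - v(t))^2/\sigma^2$ for the conditional variance $v(t)$ of $\mu_t$ given $\mathcal G_t$ , which the text introduces immediately afterwards. The function name `filter_drift` and the demonstration inputs are hypothetical.

```python
def filter_drift(s_path, dt, a, b, mu_bar, sigma, rho, m0, v0):
    """Euler discretization of the filter for m_t = E(mu_t | G_t).

    Assumed equations (this sketch's reading of the filter, see lead-in):
        dm    = a (mu_bar - m) dt + ((v - sigma rho b) / sigma^2) (dS - m dt)
        dv/dt = -2 a v + b^2 - (sigma rho b - v)^2 / sigma^2
    """
    m, v = [m0], [v0]
    for k in range(len(s_path) - 1):
        ds = s_path[k + 1] - s_path[k]
        # Gain applied to the claim surprise dS - m dt.
        gain = (v[-1] - sigma * rho * b) / sigma ** 2
        m_next = m[-1] + a * (mu_bar - m[-1]) * dt + gain * (ds - m[-1] * dt)
        v_next = v[-1] + (-2.0 * a * v[-1] + b ** 2
                          - (sigma * rho * b - v[-1]) ** 2 / sigma ** 2) * dt
        m.append(m_next)
        v.append(v_next)
    return m, v


# Demonstration: claims that grow at exactly the rate mu_bar carry no surprise,
# so an estimate started at mu_bar should stay there.
dt = 0.01
s_demo = [2.0 * k * dt for k in range(2001)]
m_path, v_path = filter_drift(
    s_demo, dt, a=1.0, b=2.5, mu_bar=2.0, sigma=3.0, rho=0.4, m0=2.0, v0=0.5
)
```

Only the claim increments enter the update, which reflects the requirement that the estimate be adapted to $\mathbb G$ ; the variance recursion is deterministic and could be precomputed.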
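If we define $v(t) = {\text{Var}}(\mu_t \mid \mathcal G_t)$ for $t \in [0, T]$ , then $v = v(t)$ satisfies the Riccati equation
$$v^{\prime}(t) = -2 a v(t) + b^2 - \frac{\big( \sigma \rho b - v(t) \big)^2}{\sigma^2}, \tag{2.6}$$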
with initial value $v(0) = {\text{Var}}(\mu_0 \mid \mathcal G_0)$ . See Appendix A for a derivation of (2.6) and the following solution:
in which
and
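Moreover, by substituting for $W_1$ in terms of $\bar W_1$ in (2.2), we obtain that X follows the dynamics
$$\text{d} X_t = \big[ (\theta - \eta)\bar{\mu} + \big( (1 + \eta)\bar{\mu} - m_t \big) q_t \big] \text{d} t + \sigma q_t \, \text{d} \bar W_{1,t},$$
which is $\mathbb G$ -adapted in the partial information case because q is $\mathbb G$ -adapted in that case. Let ${V^p}$ denote the maximum expected exponential utility of terminal wealth under partial information, i.e.
$${V^p}(t, x, m) = \sup_{q \in \mathcal A^{p}} \mathbb E \big[ {-}\text{e}^{-\gamma X_T} \mid X_t = x, m_t = m \big]. \tag{2.9}$$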
3. Full information case
We begin by stating a relevant verification theorem without proof because the proof is standard in the actuarial and financial mathematics literature; see, for example, [Reference Promislow and Young26, Theorem 2.1].
Theorem 3.1. Suppose ${v^f} \in \mathcal C^{1, 2, 2}([0, T] \times \mathbb R \times \mathbb R)$ takes values in $\mathbb R^{-}$ , is non-decreasing and concave in x, and satisfies the HJB equation
with terminal condition ${v^f}(T, x, \mu) = - \text{e}^{-\gamma x}$ . The maximizer of (3.1) is
If the retention strategy ${q^f}$ given in feedback form by $q^f_t = q^*(t, X^*_t, \mu_t)$ for all $0 \le t \le T$ is admissible, then the value function ${V^f}$ defined by (2.3) equals ${v^f}$ . Here, $X^*_t$ is the optimally controlled surplus at time t.
In the following theorem, we solve the HJB equation in (3.1) with boundary condition ${v^f}(T, x, \mu) = - \text{e}^{-\gamma x}$ . Theorem 3.1 then allows us to deduce that the solution equals the value function ${V^f}$ in (2.3).
Theorem 3.2. The maximum expected exponential utility of terminal wealth under full information is ${V^f}(t, x, \mu) = - \text{e}^{-\gamma x} \exp \{{A}(t)\mu^2 + {B} (t)\mu + {C} (t)\}$ , in which
with $\Delta$ given in (2.7), and we define $\alpha_1 > 0$ and $\alpha_2 > 0$ as $\alpha_1 = 2 \sigma \big( \sigma \Delta + (\sigma a - \rho b) \big)$ , $\alpha_2 = 2 \sigma \big( \sigma \Delta - (\sigma a - \rho b) \big)$ . Moreover, the optimal retention strategy ${q^f}$ is given in feedback form by
$$q^f_t = \frac{1}{\sigma^2 \gamma} \big[ \big( 2 \sigma \rho b A(t) - 1 \big) \mu_t + \sigma \rho b B(t) + (1 + \eta)\bar{\mu} \big]. \tag{3.5}$$
Proof. See Appendix B for a proof of this theorem.
Remark 3.1. We derive explicit expressions for A and B in (3.2) and (3.3), respectively, because ${q^f}$ in (3.5) relies on A and B. However, for the sake of space, we do not present an explicit expression for C.
Remark 3.2. Because $A(t) \le 0$ for all $0 \le t \le T$ , as $\mu_t$ increases the proportion retained, namely $q^f_t$ , decreases. This monotonicity makes sense because, as the drift of claims increases, we expect the insurance company to retain less of its risk, especially given a fixed premium rate $(1 + \eta)\bar{\mu}$ . Also, note that, when $\mu_t = 0$ , $q^f_t > 0$ because $B(t) \ge 0$ for all $0 \le t \le T$ .
In the following corollary we show how A and B change with time. We omit the calculations because they are straightforward from (3.2) and (3.3).
Corollary 3.1. The derivative of A(t) is
which is positive for $0 \le t \le T$ , and the derivative of B(t) is
which is negative for $0 \le t \le T$ . Thus, the slope of $q^f_t$ as a linear function of $\mu_t$ , namely $({1}/({\sigma^2 \gamma})) (2 \sigma \rho b A(t) - 1)$ , becomes less negative over time (specifically, it increases to $-1/(\sigma^2 \gamma)$ as t increases to T), and the intercept, namely $({1}/({\sigma^2 \gamma})) (\sigma \rho b B(t) + (1+\eta)\bar{\mu})$ , becomes less positive over time (specifically, it decreases to $(1 + \eta)\bar{\mu}/(\sigma^2 \gamma)$ as t increases to T).
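The slope and intercept just described translate directly into a small helper. The sketch below takes the values $A(t)$ and $B(t)$ as inputs rather than computing them, since their explicit expressions are given in (3.2) and (3.3); the function name `retention_full` and the sample drift value are illustrative assumptions.

```python
def retention_full(mu, A_t, B_t, sigma, rho, b, gamma, eta, mu_bar):
    """Linear feedback q = slope * mu + intercept, as stated in Corollary 3.1.

    A_t and B_t are the values of A(t) and B(t) at the time of interest;
    they are supplied by the caller in this sketch.
    """
    scale = 1.0 / (sigma ** 2 * gamma)
    slope = scale * (2.0 * sigma * rho * b * A_t - 1.0)
    intercept = scale * (sigma * rho * b * B_t + (1.0 + eta) * mu_bar)
    return slope * mu + intercept


# At the horizon, A(T) = B(T) = 0, so the rule reduces to
# ((1 + eta) * mu_bar - mu) / (sigma^2 * gamma).
q_T = retention_full(
    mu=2.0, A_t=0.0, B_t=0.0,
    sigma=3.0, rho=0.4, b=2.5, gamma=1.2, eta=0.8, mu_bar=2.0,
)
print(round(q_T, 4))  # 0.1481
```

With the parameter values used in the numerical examples, the terminal rule at $\mu_T = \bar{\mu} = 2$ retains about 15% of the business, since $(3.6 - 2)/10.8 \approx 0.148$ .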
Remark 3.3. Corollary 3.1 shows us that, over time, the proportion of retained risk, as a linear function of $\mu_t$ , flattens. This flattening is consistent with the risk aversion of the insurer. As time approaches the horizon T, the insurer will not wish to change its retention as much (as a function of $\mu_t$ ) as when further from the horizon. Intuitively, the closer time is to the horizon, the less time the insurer has to maximize its expected utility and, therefore, the insurer reacts less strongly to changes in the drift of the surplus. We see a similar phenomenon in the optimal investment strategy of [Reference Brendle6]; namely, the closer time is to T, the less the investor changes their investment in reaction to changes in the drift of the risky asset.
4. Partial information case
In this section we analyze the problem under partial information. The corresponding verification theorem and its solution parallel the results in Section 3.
Theorem 4.1. Suppose ${v^p} \in \mathcal C^{1, 2, 2}([0, T] \times \mathbb R \times \mathbb R)$ takes values in $\mathbb R^{-}$ , is non-decreasing and concave in x, and satisfies the HJB equation
with terminal condition ${v^p}(T, x, \mu) = - \text{e}^{-\gamma x}$ . The maximizer of (4.1) is
If the retention strategy ${q^p}$ given in feedback form by $q^p_t = q^*(t, X^*_t, m_t)$ for all $0 \le t \le T$ is admissible, then the value function ${V^p}$ defined by (2.9) equals ${v^p}$ . Here, $X^*_t$ is the optimally controlled surplus at time t.
In the following theorem we solve the HJB equation in (4.1) with boundary condition ${v^p}(T, x, \mu) = - \text{e}^{-\gamma x}$ . Theorem 4.1 then allows us to deduce that the solution equals the value function ${V^p}$ in (2.9).
Theorem 4.2. The maximum expected exponential utility of terminal wealth under partial information is ${V^p}(t,x,m) = - \text{e}^{-\gamma x} \exp\{\widehat A(t) m^2 + \widehat B(t) m + \widehat C(t)\}$ , in which
in which R is given in (2.8).
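Moreover, the optimal retention strategy ${q^p}$ is given in feedback form by
$$q^p_t = \frac{1}{\sigma^2 \gamma} \big[ \big( 2 (\sigma \rho b - v(t)) \widehat A(t) - 1 \big) m_t + (\sigma \rho b - v(t)) \widehat B(t) + (1 + \eta)\bar{\mu} \big]. \tag{4.5}$$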
Proof. See Appendix C for a proof of this theorem.
Remark 4.1. As in Section 3, we derive explicit expressions for $\widehat A$ and $\widehat B$ in (4.2) and (4.3), respectively, because ${q^p}$ in (4.5) relies on $\widehat A$ and $\widehat B$ . However, for the sake of space, we do not present an explicit expression for $\widehat C$ .
Because $\sigma \rho b - v(t)$ can be negative, it is not clear whether the slope of $q^p_t$ as a function of $m_t$ is negative, as is the case for $q^f_t$ as a function of $\mu_t$ . The following corollary tells us that the slope of $q^p_t$ is, indeed, negative.
Corollary 4.1. The slope of $q^p_t$ as a linear function of $m_t$ is negative, i.e.
$$2 \big( \sigma \rho b - v(t) \big) \widehat A(t) - 1 < 0 \tag{4.6}$$
for all $0 \le t \le T$ .
Proof. By substituting for $\widehat A$ and v from (4.2) and (2.6), respectively, and by simplifying the result, we can show that inequality (4.6) is equivalent to
or
By substituting for R from (2.8), we find that inequality (4.7) is equivalent to
which is true because $\sigma \Delta > \pm (\sigma a - \rho b)$ .
As in Corollary 3.1, we show how $\widehat A$ and $\widehat B$ change with time in the following corollary.
Corollary 4.2. The derivative of $\widehat A(t)$ is
for $0 \le t \le T$ , and the derivative of $\widehat B(t)$ is
for $0 \le t \le T$ .
Remark 4.2. Unlike Corollary 3.1, we cannot assert that $\widehat A^{\prime}(t) \ge 0$ and $\widehat B^{\prime}(t) \le 0$ because these inequalities might not hold if R is negative enough, which occurs, for example, when v(0) is relatively large. On the other hand, if $R > 0$ , then it is clear that $\widehat A^{\prime}(t) \ge 0$ and $\widehat B^{\prime}(t) \le 0$ .
In the following, we present two numerical examples to further explore the reinsurance strategies in both full and partial information cases. For each example, we set $a=1$ , $b=2.5$ , $\theta=0.4$ , $\eta=0.8$ , $\rho=0.4$ , $\sigma=3$ , $\gamma=1.2$ , $\bar{\mu}=2$ , and $\mu_0\sim N(0,1)$ .
Example 4.1. In Figure 1 we set $T=1$ and used a Monte Carlo approach to simulate the sample paths of $\mu_t$ and $q^f_t$ under full information. We present three sample paths in these two figures.
To compute the probability that $q^f_t < 0$ , $q^f_t > 1$ , or $\mu_t < 0$ , we worked with 3000 sample paths, each discretized into 2000 time intervals. Let i denote a sample path, and let j denote a time instance. For $i =1, 2, \dots, 3000$ and $j = 1, 2, \dots, 2000$ , we counted the number of points for which ${q^f}(i, j) < 0$ , and computed the proportion of that number divided by the total number of observations, namely, $3000 \times 2000$ . That proportion is our estimate of the probability of $q^f_t < 0$ . Similarly, we estimated the probabilities of $q^f_t > 1$ and $\mu_t < 0$ . For our parameter values, we computed that the probability of $q^f_t < 0$ equals $0.0682$ , the probability of $q^f_t > 1$ equals 0, and the probability of $\mu_t<0$ equals $0.1081$ .
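The counting procedure described above can be sketched as follows for the event $\mu_t < 0$ . The sketch assumes the drift follows $\text{d}\mu_t = a(\bar{\mu} - \mu_t)\text{d}t + b\,\text{d}W_{2,t}$ and uses the exact Gaussian transition of an Ornstein–Uhlenbeck process on the time grid instead of an Euler step, which is a choice made here and not necessarily the scheme behind the reported figures. The function name `prob_negative_drift`, the reduced default sample sizes, and the seed are illustrative, so the output is not expected to match the values quoted in the text.

```python
import math
import random


def prob_negative_drift(a, b, mu_bar, T, n_paths, n_steps, seed=0):
    """Share of grid points (i, j) with mu < 0 over all paths i and times j."""
    rng = random.Random(seed)
    h = T / n_steps
    decay = math.exp(-a * h)
    # Standard deviation of the exact OU transition over one step of length h.
    step_sd = b * math.sqrt((1.0 - decay * decay) / (2.0 * a))
    negatives = 0
    for _ in range(n_paths):
        mu = rng.gauss(0.0, 1.0)  # mu_0 ~ N(0, 1), as in the examples
        for _ in range(n_steps):
            mu = mu_bar + (mu - mu_bar) * decay + step_sd * rng.gauss(0.0, 1.0)
            if mu < 0.0:
                negatives += 1
    return negatives / (n_paths * n_steps)


estimate = prob_negative_drift(
    a=1.0, b=2.5, mu_bar=2.0, T=1.0, n_paths=300, n_steps=200
)
```

The same counting loop applies to $q^f_t < 0$ and $q^f_t > 1$ once $q^f_t$ is evaluated at each grid point; raising `n_paths` and `n_steps` to 3000 and 2000 mirrors the sample sizes used in the example at the cost of a longer run time.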
In Figure 2 we set $T = 50$ , and we present one sample path of the mean-reverting process $\mu_t$ and the reinsurance proportion $q^f_t$ . In this case, we estimated, via 3000 discretized sample paths, that the probability of $q^f_t < 0$ equals $0.1461$ , and the probability of $\mu_t < 0$ equals $0.1323$ . Thus, as the terminal time T increases, the probabilities of $q^f_t < 0$ and $\mu_t < 0$ also increase.
As an aside, if b is relatively small and if T is large, say, 50, then the value of $\mu_t$ is close to $\bar{\mu}$ for a good portion of the interval [0, T]; thus, the probability of $\mu_t < 0$ is small. If $T = 1$ , then because $\mu_0$ might be very different from $\bar{\mu}$ , we cannot say that the probability of $\mu_t < 0$ is small.
We also observed (in work not shown here) that, as $\bar{\mu}$ decreases, the probability of $q^f_t < 0$ increases, and as $\bar{\mu}$ increases, the probability of $q^f_t > 1$ increases. Moreover, as $\bar{\mu}$ increases, the probability of $\mu_t < 0$ decreases.
Example 4.2. In Figure 3 we simulate the sample paths of $m_t$ and $q^p_t$ under partial information. We set $T=1$ , $m_0=1.6$ , and $v(0) = 0.5$ . As in Example 4.1, we plot three sample paths of $q^p_t$ and $m_t$ . In this example, among 3000 discretized sample paths, we find that the optimal reinsurance proportion $q^p_t$ always lies in [0, 1], and the filtered drift process $m_t$ is always positive.
In Figure 4 we set $T=50$ and plot one sample path. As in the case for $T = 1$ , among 3000 discretized sample paths, the optimal reinsurance proportion $q^p_t$ always lies in [0, 1], and the filtered drift process $m_t$ is always positive.
For the optimization problem with the reinsurance constraint, that is, $0 \le q_t \le 1$ for all $0 \le t \le T$ , as in the discussion before, the value function under full information (or partial information) satisfies the HJB equation in (3.1) (or (4.1)) for ${V^f}$ (or ${V^p}$ ), except that q is constrained to lie in [0, 1]. Indeed, from [Reference Crandall, Ishii and Lions10, Reference Yong and Zhou31], we know that a value function for a constrained problem is the unique viscosity solution of its HJB equation subject to the constraint. However, due to the reinsurance constraint, we cannot derive further explicit results by following the classical HJB equation approach. Motivated by the convex-duality approach, as in [Reference Cvitanić and Karatzas11, Reference Putschögl and Sass27], we need to introduce an auxiliary unconstrained optimization problem by modifying the original problem with an auxiliary stochastic parameter, then find the relationship between the value function of the auxiliary problem and that of the original problem. Due to the randomness of the auxiliary parameter process, the corresponding value function satisfies a stochastic HJB equation, which is a special backward stochastic partial differential equation or infinite-dimensional BSDE. The solvability of the stochastic HJB equation was studied in [Reference Peng24], which provided an existence and uniqueness theorem for the case in which the volatility coefficient of the state process does not contain the control variable. In the finance and insurance literature, the BSDE approach is becoming an efficient technique to solve utility maximization problems with stochastic coefficients. A stochastic Stackelberg differential game between an insurer and a reinsurer was considered in [Reference Chen and Shen9] by applying the BSDE approach. See also [Reference Delong13] for more details about the applications in actuarial and financial models with the BSDE approach. 
Because the convex-duality-plus-BSDE approach differs greatly from the HJB equation approach in this paper, we will leave that work for future research.
5. The relationship between ${V^f}$ and ${V^p}$
In this section we adapt the technique in [Reference Brendle7] to show the link between the value function ${V^f}$ under full observation and the value function ${V^p}$ under partial observation.
Theorem 5.1. The following relationships hold among A, B, C and $\widehat A, \widehat B, \widehat C$ :
in which N and Q satisfy the following differential equations:
with boundary conditions $N(T) = Q(T) = 0$ . Moreover, the value functions ${V^f}$ and ${V^p}$ satisfy
in which
Finally, the optimal retention strategies ${q^f}$ and ${q^p}$ satisfy
Proof. The validity of (5.1)–(5.3) can be checked directly by differentiating terms on the right-hand sides and comparing them with the existing differential equations satisfied by the left-hand sides.
We next prove (5.4). Because the distribution of $\mu_t$ conditional on $\mathcal G_t$ is Gaussian with mean $m_t$ and variance v(t), we have
Hence, we obtain
Finally, from the expressions of $q^f_t$ and $q^p_t$ in (3.5) and (4.5), respectively, we derive
which completes our proof.
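The Gaussian expectation used in this proof rests on a standard identity for exponentials of quadratic forms, recorded here as a sketch. The symbols $Z$ , $\alpha$ , and $\beta$ are generic and introduced only for illustration.

```latex
% For Z ~ N(m, v) and constants alpha, beta with 1 - 2*alpha*v > 0:
\[
\mathbb E\big[\exp\{\alpha Z^2 + \beta Z\}\big]
  = \frac{1}{\sqrt{1 - 2\alpha v}}
    \exp\Big\{\frac{\alpha m^2 + \beta m + \tfrac12 \beta^2 v}{1 - 2\alpha v}\Big\}.
\]
% Sanity checks: alpha = 0 recovers the lognormal mean exp{beta m + beta^2 v / 2};
% beta = 0 recovers the moment generating function of a noncentral chi-square.
```

With $\alpha = A(t) \le 0$ and $v = v(t) \ge 0$ , the condition $1 - 2\alpha v > 0$ holds automatically, so the conditional expectation of an exponential-quadratic function of $\mu_t$ given $\mathcal G_t$ is again exponential-quadratic in $m_t$ .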
Note that the slopes of both $q^f_t$ and $q^p_t$ as functions of $\mu_t$ and $m_t$ , respectively, are negative for $0 \le t \le T$ . Also, the vertical intercept of $q^f_t$ thought of as a function of $\mu_t$ is positive, although the corresponding statement for the vertical intercept of $q^p_t$ is not necessarily true; see, for example, the right panel of Figure 5. A natural question is how these slopes and intercepts compare to each other, and the following corollary answers this query.
Corollary 5.1. For all $0 \le t \le T$ ,
and
with strict inequalities when $0 \le t < T$ .
Proof. If $t = T$ , then the first inequality in (5.5) is an equality because $A(T) = \widehat A(T) = 0$ . For $0 \le t < T$ , the first inequality in (5.5) holds strictly if and only if the following string of implications holds:
which is true because $A(t) < 0$ and $v(t) > 0$ for all $0 \le t < T$ . The proof of (5.6) is similar (using $B(t) > 0$ when $0 \le t < T$ ), so we omit it.
It is intuitively pleasing that $q^p_t$ reacts less strongly to changes in $m_t$ than $q^f_t$ reacts to $\mu_t$ . Indeed, in the partial information case, the risk-averse insurer has less information and, therefore, is more cautious in changing the proportion of retained risk. Similarly, inequality (5.6) implies that the insurer retains less risk when $m_t = 0$ in the partial information case than when $\mu_t = 0$ in the full information case. See Figure 5 for an illustration of Corollary 5.1.
In the following, we present three numerical examples to further explore the difference between $q^f_t$ and $q^p_t$ . For each example, we choose $b=2.5$ , $\theta=0.4$ , $\eta=0.8$ , $\rho=0.4$ , $\sigma=3$ , $\gamma=1.2$ , $\bar{\mu}=2$ , and $T=2$ .
Example 5.1. In the left panel of Figure 5, we set $a=1$ and $v(0) = 0.5$ . We plot the graphs of $q^f_t$ and $q^p_t$ at time $t=0.5$ as functions of $\mu_t$ and $m_t$ by assuming that $m_t = \mu_t$ at this time. We observe that both $q^f_t$ and $q^p_t$ are linear functions of $\mu_t = m_t$ , as we expect from Theorems 3.2 and 4.2, but the slope of $q^f_t$ is steeper than that of $q^p_t$ , as we expect from Corollary 5.1. From our algebraic work, we also note that $\lim_{t \to T^-} q^f_t = q^p_t$ when $\mu_t = m_t$ , which our numerical work (not shown here) confirms.
Next, in the right panel of Figure 5 we enlarge the value of v(0) by setting $v(0) = 100$ , and we plot the graphs of $q^f_t$ and $q^p_t$ at time $t=0$ . We observe that, unlike the full information case, for which the vertical intercept of $q^f_t$ is positive (see Remark 3.2), the intercept of $q^p_t$ as a function of $m_t$ can be negative.
Example 5.2. From Corollary 3.1, we know that the slope of $q^f_t$ increases with t, and the intercept of $q^f_t$ decreases with t. Motivated by this corollary, in this example, we investigate the changes of the slope and the intercept of $q^p_t$ with t. In the left panel of Figure 6 we set $a=1$ and $v(0)=0.5$ . We plot the graph of the slope of $q^p_t$ divided by the ratio $1/(\sigma^2 \gamma)$ as a function of $t \in [0, T]$ . In the right panel of Figure 6 we plot the graph of the intercept of $q^p_t$ divided by the ratio $1/(\sigma^2 \gamma)$ as a function of $t \in [0, T]$ . In these graphs, when v(0) is relatively small, we see that the slope and intercept of $q^p_t$ increase and decrease with t, respectively, as is true for $q^f_t$ .
In Figure 7 we plot the slope and intercept of $q^p_t$ divided by the ratio $1/(\sigma^2 \gamma)$ when v(0) is relatively large, that is, $v(0) = 100$ . We find that both the slopes and intercepts of $q^p_t$ are monotonic with time t, but the monotonicity is the opposite of that when v(0) is relatively small, that is, $v(0) = 0.5$ .
Example 5.3. In Figure 8 we plot the graph of the optimal proportional retention $q^f_t$ and $q^p_t$ when the parameter a is large. We set $a=3$ and $v(0) = 0.5$ . This graph shows that the values of $q^f_t$ and $q^p_t$ are close to each other when the parameter a is large.
In Figure 9 we plot the slope and intercept of $q^p_t$ divided by the ratio $1/(\sigma^2 \gamma)$ in this case; we see that they are monotonic with respect to t, with the same monotonicity that the slope and intercept of $q^f_t$ possess. Furthermore, when we take a larger value of v(0), such as $v(0) = 5$ , Figure 10 shows that they both lose monotonicity with t.
Appendix A. Derivation and solution of the Riccati equation (2.6)
First, [Reference Liptser and Shiryaev23, Section 10.2.1] proved that $v(t) = \mathbb E ( (\mu_t - m_t)^2 )$ , i.e. the conditional variance of $\mu_t$ given $\mathcal G_t$ is deterministic and equals the unconditional mean squared error of the estimate $m_t$ .
Define the process $\{\delta_t\}_{0 \le t \le T}$ by $\delta_t = \mu_t - m_t$ . Then, by (2.1) and (2.4), we have
Use (2.5) to replace $\text{d}\bar W_{1,t}$ with $\text{d} W_{1,t} - ({\delta_t}/{\sigma}) \text{d} t$ ,
and Itô’s formula gives us
Because $W_1$ and $W_2$ are $\mathbb F$ -standard Brownian motions, and because v(t) and $\mathbb E(\delta^2_t)$ are continuous with respect to t and, thus, are bounded on [0, T], we have
By taking (unconditional) expectation of both sides of (A.1), we obtain
which gives us the Riccati equation (2.6).
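The steps above can be retraced as follows. This is a reconstruction under the assumptions that the drift satisfies $\text{d}\mu_t = a(\bar{\mu} - \mu_t)\text{d}t + b\,\text{d}W_{2,t}$ and that the filter takes the form $\text{d}m_t = a(\bar{\mu} - m_t)\text{d}t + ((\sigma\rho b - v(t))/\sigma)\,\text{d}\bar W_{1,t}$ ; it is a sketch of the computation rather than a quotation of (A.1).

```latex
\begin{align*}
\text{d}\delta_t
  &= -\Big(a - \frac{\sigma\rho b - v(t)}{\sigma^2}\Big)\delta_t\,\text{d}t
     + b\,\text{d}W_{2,t}
     - \frac{\sigma\rho b - v(t)}{\sigma}\,\text{d}W_{1,t}, \\
\text{d}(\delta_t^2)
  &= 2\delta_t\,\text{d}\delta_t
     + \Big(b^2 - \frac{2\rho b\,(\sigma\rho b - v(t))}{\sigma}
     + \frac{(\sigma\rho b - v(t))^2}{\sigma^2}\Big)\text{d}t, \\
v'(t)
  &= -2\Big(a - \frac{\sigma\rho b - v(t)}{\sigma^2}\Big)v(t)
     + b^2 - \frac{2\rho b\,(\sigma\rho b - v(t))}{\sigma}
     + \frac{(\sigma\rho b - v(t))^2}{\sigma^2} \\
  &= -2a\,v(t) + b^2 - \frac{(\sigma\rho b - v(t))^2}{\sigma^2}.
\end{align*}
```

The third line takes expectations with $v(t) = \mathbb E(\delta_t^2)$ , using that the stochastic integrals have zero mean; the last line follows by factoring $(\sigma\rho b - v(t))/\sigma^2$ out of the three terms that contain it.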
To solve this equation (see [Reference Lakner19, Remark 4.2], which also provides an explicit solution for the constant-coefficient, one-dimensional Riccati equation), first define the function y by
(2.6) gives us the following Riccati equation for y:
Next, define the function u by
or equivalently,
then, (A.3) gives us the following second-order ordinary differential equation (ODE) for u:
The ODE in (A.5) has the general solution $u(t) = A_1 \text{e}^{r_1 t} + A_2 \text{e}^{r_2 t}$ , in which $A_1$ and $A_2$ are constants to be determined, $r_1 = - a + \Delta$ , and $r_2 = - a - \Delta$ , with $\Delta$ given in (2.7). By reversing (A.2) and (A.4), we obtain the following general expression for v:
or equivalently,
in which $R = A_1/A_2$ . By using the given initial condition v(0), we determine R:
which gives us R as in (2.8).
Appendix B. Proof of Theorem 3.2
From related work with exponential utility (e.g. [Reference Brendle6]), we hypothesize that the value function is of the form
for some functions of time A, B, and C, with $A(T) = B(T) = C(T) = 0$ . The terminal conditions follow from ${V^f}(T, x, \mu) = - \text{e}^{-\gamma x}$ .
Because ${V^f}$ in (B.1) is concave with respect to x, the first-order necessary condition in (3.1) is sufficient, and we obtain the optimal retention in feedback form as
in which we abuse notation slightly by using ${q^f}$ to refer both to the optimal retention strategy (as in ${q^f} = \{q^f_t\}_{0 \le t \le T}$ ) and to the deterministic function ${q^f}$ in (B.2). Note that ${q^f}$ in (B.2) is independent of the surplus.
By substituting (B.1) and (B.2) into (3.1) and rearranging terms, we obtain
Thus, we have the following three differential equations for A, B, and C:
with terminal conditions $A(T) = 0$ , $B(T) = 0$ , and $C(T) = 0$ .
Equation (B.3) is a constant-coefficient Riccati equation, which we can solve explicitly by using the same method as in Appendix A, although we are given $A(T) = 0$ instead of v(0); by doing so, we obtain the expression for A in (3.2).
Equation (B.4) is a linear differential equation, which, by substituting for A from (3.2), we can rewrite as
By integrating this from t to T and using $B(T) = 0$ , we obtain the expression for B in (3.3).
Finally, by integrating (B.5) from t to T and by using $C(T) = 0$ , we obtain the integral representation for C in (3.4).
It remains to show that ${V^f}$ and ${q^f}$ satisfy the conditions of Theorem 3.1. By construction, ${V^f}$ satisfies the HJB equation (3.1) with boundary condition ${V^f}(T, x, \mu) = -\text{e}^{-\gamma x}$ . To show that the retention strategy ${q^f}$ in (3.5) is admissible, we check the three conditions in the definition of admissibility. First, note that ${q^f}$ is adapted to the filtration $\mathbb{F}$ by its definition. Second, ${q^f}$ is conditional $L^2$ -integrable because A(t) and B(t) are bounded functions on [0, T], and $\mu = \{\mu_t \}$ is conditional $L^2$ -integrable. Third, the SDE (2.2) has a pathwise unique solution, which is easy to see because ${q^f}$ is independent of the surplus X. Thus, ${q^f}$ is admissible.
From Theorem 3.1, we deduce that ${V^f}$ and ${q^f}$ as stated in Theorem 3.2 equal the value function and optimal retention strategy, respectively.
Appendix C. Proof of Theorem 4.2
As in Appendix B, we hypothesize that the value function is of the form
for some functions of time $\widehat A$ , $\widehat B$ , and $\widehat C$ , with $\widehat A(T) = \widehat B(T) = \widehat C(T) = 0$ . The terminal conditions follow from ${V^p}(T, x, \mu) = - \text{e}^{-\gamma x}$ .
Because ${V^p}$ in (C.1) is concave with respect to x, the first-order necessary condition in (4.1) is sufficient, and we obtain the optimal retention in feedback form as
in which we abuse notation slightly by using ${q^p}$ to refer both to the optimal retention strategy and to the deterministic function ${q^p}$ in (C.2).
By substituting (C.1) and (C.2) into (4.1) and rearranging terms, we obtain
Thus, we have the following three differential equations for $\widehat A(t)$ , $\widehat B(t)$ , and $\widehat C(t)$ :
with terminal conditions $\widehat A(T) = 0$ , $\widehat B(T) = 0$ , and $\widehat C(T) = 0$ .
We can rewrite (C.3) as
and by integrating this from t to T and using $\widehat A(T) = 0$ , we obtain the expression for $\widehat A$ in (4.2).
Equation (C.4) is a linear differential equation, which, by substituting for $\widehat A$ from (4.2), we rewrite as
and by integrating this from t to T and using $\widehat B(T) = 0$ , we obtain the expression for $\widehat B$ in (4.3).
Finally, by integrating (C.5) from t to T and using $\widehat C(T) = 0$ , we obtain the integral representation of $\widehat C$ in (4.4).
It remains to show that ${V^p}$ and ${q^p}$ satisfy the conditions of Theorem 4.1. The argument is similar to the one in the proof of Theorem 3.2, so we omit it. Thus, ${V^p}$ and ${q^p}$ as stated in Theorem 4.2 equal the value function and optimal retention strategy, respectively.
Funding information
X. Liang thanks the Research Foundation for Returned Scholars of Hebei Province (C20200102), Natural Science Foundation of Tianjin (19JCYBJC30400), and the Natural Science Foundation of Hebei Province (A2020202033) for financial support, and Virginia R. Young thanks the Cecil J. and Ethel M. Nesbitt Professorship of Actuarial Mathematics for financial support.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.