1. Introduction
Tail risk denotes the chance of losses exceeding a fixed threshold. Since extreme losses can compromise the stability of a portfolio of risks, the quantification of tail risk is a widely studied topic in operational research and in actuarial science [Reference Asmussen, Blanchet, Juneja and Rojas-Nandayapa4, Reference Fallahgoul and Loeper15, Reference Risk and Ludkovski33]. In the risk management of financial and insurance portfolios, modelling the tail risk provides important insights. Regulators require banks and insurance companies to calculate tail risk measures in order to assess the companies’ solvency [Reference Embrechts, Liu and Wang14, Reference Liu and Wang26, Reference McNeil, Frey and Embrechts27].
One fundamental problem in the study of tail risk concerns its allocation into individual risk sources [Reference Kim and Kim22]. For instance, consider a portfolio of insurance contracts. Estimating the contribution of individual risks (i.e. policyholders) to the total portfolio loss is essential for undertaking proper actuarial and risk management actions. Different tail risk allocation methods can be found in the literature: [Reference Furman and Landsman16] allocates the tail variance loss into individual risk contributions and calls this allocation method the tail covariance premium principle; [Reference Wang39] slightly modifies this premium principle and introduces the adjusted tail covariance premiums; [Reference Embrechts, Liu and Wang14] studies the properties of optimal risk sharing methods for quantile-based risk measures; [Reference Chen and Xie6] allocates the worst-case value-at-risk among homogeneous agents.
The Shapley value [Reference Shapley34] from cooperative game theory is attracting increasing attention as an allocation method satisfying desirable properties. The main rationale of the Shapley value is to fairly attribute the total value created by a team of players to individual team members. If players are identified with risks and the grand total value with a specific risk metric, the Shapley value becomes a very flexible and general fair attribution method. It has been applied to quantify variable importance in statistical linear regression models [Reference Grömping18, Reference Lipovetsky and Conklin25], computer experiments [Reference Iooss and Prieur21, Reference Owen29–Reference Plischke, Rabitti and Borgonovo31], health risk analysis [Reference Cox9], and actuarial science [Reference Lemaire24, Reference Rabitti and Borgonovo32, Reference Tsanakas and Barnett36]. The Shapley value was proposed in [Reference Abbasi and Hosseinifard1] as a solution to allocating the capital according to the tail expectation risk measure.
A recent work [Reference Colini-Baldeschi, Scarsini and Vaccari8] considered the problem of allocating the variance and the standard deviation of a portfolio into individual risk components using the Shapley value. It provided a closed-form expression for the Shapley value of the variance game and conjectured a relationship between the Shapley values of the variance and standard deviation games (namely, that the normalized variance Shapley values majorize the standard deviation ones). The conjecture was proved to hold for independent risks in [Reference Chen, Hu and Tang5, Reference Galeotti and Rabitti17], while [Reference Galeotti and Rabitti17] provided some counterexamples in the dependent case.
In this work, we extend the results of [Reference Colini-Baldeschi, Scarsini and Vaccari8] to allocating the tail risk measured by the tail variance and tail standard deviation. To do this, we first introduce the Shapley values for the tail variance and the tail standard deviation games, and we provide a closed formula for the Shapley value for the former game. We then investigate the relationship between the two Shapley values. To illustrate, suppose we have a set of n independent risks (i.e. non-negative random variables) and consider the tail-conditioned random variables generated by a tail-risk scenario. This way, the total risk is given by the sum of losses exceeding the threshold. As a consequence, a dependence is generated among the tail-conditioned random variables. Does the conjecture of [Reference Colini-Baldeschi, Scarsini and Vaccari8] (which holds true for the independent, unconditioned risks) continue to hold for the tail-conditioned random variables, regardless of the tail threshold? We present two illustrative examples. In the first one the original independent risks are light-tail distributed and the majorization continues to hold in the tail-conditioned case. In fact, comparing two differently distributed tail-conditioned variables, our computations show that the difference of their variances increases as the tail threshold increases, and so does the difference of the covariances between the two risks with a third one whose distribution is equal to that of the second one.
In the case of a heavy-tail example, instead, the original random variables have suitable heavy-tail distributions and the majorization is inverted for a sufficiently high threshold of the tail. Then, we conjecture that such inversion cannot occur for light-tail distributions, while it occurs for suitable heavy-tail distributions. We observe, in particular, that the inverted majorization may imply an inverted ranking between the Shapley values relative to the variance and standard deviation games. Since heavy-tail distributions are used to model catastrophic risks, our conjecture, if verified, can have important theoretical and practical implications. For instance, one consequence is that the allocation of the standard deviation and the variance might produce two different risk rankings, since a policyholder could be considered as the riskiest one in one metric but not in the other.
Furthermore, the analytical derivation of the Shapley values for the tail variance game allows the characterization of the covariance premium principles of [Reference Furman and Landsman16, Reference Wang39] as Shapley values. The fairness of the Shapley value as an allocation method provides a theoretically sound justification for the adoption of these premium principles. Lastly, we discuss the possible application of the above Shapley value allocations to the pricing of a peer-to-peer reinsurance contract [Reference Denuit11]. Since the majorization of the Shapley values for the tail variance and the tail standard deviation games in [Reference Colini-Baldeschi, Scarsini and Vaccari8] does not hold in general, the choice of a tail risk premium might invert the risk ranking contribution in peer-to-peer insurance. Finally, we recall that [Reference Il Idrissi, Chabridon and Iooss20] recently proposed the Shapley value for tail sensitivity analysis. However, our contribution is quite different, since they consider another type of game, do not find a closed form expression for the Shapley value, nor study the majorization problem.
This paper is structured as follows. Section 2 introduces the Shapley value. Section 3 presents the variance and the standard deviation games of [Reference Colini-Baldeschi, Scarsini and Vaccari8] and recalls a result on the inverted majorization in the dependent case. In Section 4 we introduce the Shapley values for the tail variance and standard deviation games. Section 5 discusses the majorization problem and the dependence of the risk ranking on the conditioning threshold and on the tail fatness. Section 6 presents the implications for the tail risk management of portfolios of insurance risks.
2. Shapley value
Consider the situation in which a team of players generates a value. The Shapley value originates from game theory [Reference Shapley34] as a method to attribute to every player of the team the fair part of the value they contributed. Let v(J) be the value generated by the coalition of players $J \subseteq N =\{1 ,2 ,\ldots ,n\}$ , with the assumption $v(\varnothing ) =0$ . The total value of the game produced by the team is then v(N).
Consider the following desirable properties for an attribution method:
- Efficiency: $\sum _{i =1}^{n}\phi _{i} (v) =v (N)$ .
- Symmetry: If $v (J \cup \{i\}) = v (J \cup \{j\})$ for all $J \subseteq N \setminus \{i ,j\}$ , then $\phi_{i}(v) = \phi_{j}(v)$ .
- Dummy player: If $v (J \cup \{i\}) =v (J)$ for all $J \subseteq N$ , then $\phi _{i} (v) =0$ .
- Linearity: If two value functions v and w have Shapley values, respectively, $\phi_{i}(v)$ and $\phi_{i}(w)$ , then the game with value $\alpha v + \beta w$ has Shapley values $\alpha\phi_{i}(v) + \beta\phi_{i}(w)$ for all $\alpha ,\beta \in \mathbb{R}$ .
The efficiency property states that the sum of the Shapley values of all players must equal the total value of the game to be shared. The symmetry property requires that players making the same contributions to any coalition must be paid equal shares. If the marginal contribution $v (J \cup\{i\}) - v (J)$ is null whatever coalition the ith player joins, this player receives a null share and is called a dummy player. The linearity property states that if two games are combined, then the received share of each player must be the combination of their shares from the two games.
Shapley [Reference Shapley34] proved that the value given by
$$\phi_{i}(v) = \sum_{J \subseteq N \setminus \{i\}} \frac{\vert J\vert !\,(n - \vert J\vert - 1)!}{n!}\,\bigl(v(J \cup \{i\}) - v(J)\bigr)$$
is the unique attribution method satisfying these four properties, where $\vert J\vert $ denotes the cardinality of J. This is called the Shapley value. We note that the Shapley value for the ith player is based on the marginal increase in the value $v (J \cup \{i\}) - v(J)$ when they join coalition J, and this marginal increase is then averaged over all possible coalitions J.
Equivalently, the Shapley value can be expressed in terms of permutations as [Reference Colini-Baldeschi, Scarsini and Vaccari8]
$$\phi_{i}(v) = \frac{1}{n!}\sum_{\psi \in \mathcal{P}(N)} \bigl(v(P^{\psi}(i) \cup \{i\}) - v(P^{\psi}(i))\bigr),$$
where $\mathcal{P}(N)$ is the set of all permutations of N, and $P^{\psi}(i)$ is the set of all players who precede i in the order induced by the permutation $\psi $ .
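The two representations of the Shapley value can be compared numerically. The following is a minimal sketch, not taken from the original paper: the function names and the toy game (a coalition is worth the square of its size) are assumptions chosen only for illustration.

```python
from itertools import combinations, permutations
from math import factorial


def shapley_subsets(n, v):
    """Shapley values as a weighted sum over coalitions J that exclude player i."""
    phi = []
    for i in range(n):
        others = [p for p in range(n) if p != i]
        total = 0.0
        for size in range(n):
            # Weight |J|! (n - |J| - 1)! / n! for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for J in combinations(others, size):
                J = frozenset(J)
                total += weight * (v(J | {i}) - v(J))
        phi.append(total)
    return phi


def shapley_permutations(n, v):
    """Shapley values as the average marginal contribution over all orderings."""
    phi = [0.0] * n
    for order in permutations(range(n)):
        preceding = frozenset()
        for i in order:
            phi[i] += v(preceding | {i}) - v(preceding)
            preceding = preceding | {i}
    return [p / factorial(n) for p in phi]


def v(J):
    """Hypothetical toy game: a coalition is worth the square of its size."""
    return float(len(J) ** 2)


phi_a = shapley_subsets(3, v)
phi_b = shapley_permutations(3, v)
print(phi_a, phi_b)
```

In this symmetric toy game every player receives the same share, and the shares add up to v(N), as the symmetry and efficiency properties require. Both functions enumerate exponentially or factorially many terms, so they are practical only for small n.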
One important feature of the Shapley value is that it is considered a fair attribution method. To formalize the concept of fairness, consider the following property of balanced contributions.
- Balanced contributions: Denote by $v_{ -k}$ the game obtained by restricting the set of players to $N \setminus \{ k\}$ . Then $\phi$ satisfies the axiom of balanced contributions if $\phi_{i}(v) - \phi_{i}(v_{-j}) = \phi_{j}(v) - \phi_{j}(v_{-i})$ for all $i ,j \in N$ .
The Shapley value was characterized in [Reference Hart and Mas-Colell19] using the efficiency and the balanced contributions properties. The latter can be interpreted in terms of fairness: for any two players, it states that what player i gains or loses from the participation of player j equals what player j gains or loses from the participation of player i. It was stated in [Reference Algaba, Fragnelli and Sanchez-Soriano3] that the Shapley value is the unique efficient attribution method satisfying the balanced contributions property, making it ‘a benchmark for fairness’ (p. 23). This fairness feature is a relevant justification for adopting the Shapley value as an allocation method.
3. Variance and standard deviation games of [Reference Colini-Baldeschi, Scarsini and Vaccari8]
Given a set of risks $(X_{1}, X_{2}, \ldots, X_{n})$ , set $S = \sum _{i =1}^{n}X_{i}$ and $S_{J} = \sum _{i \in J}X_{i}$ for every $J\subseteq N =\{1 ,2 ,\ldots ,n\}$ . The value function for the variance game was defined in [Reference Colini-Baldeschi, Scarsini and Vaccari8] as $\nu(J) = \mathrm{Var}[S_{J}]$ . The intuition is to regard risks as players producing a total value of the game $\nu (N) = \mathrm{Var}[S]$ , i.e. the variance of the portfolio. Thus, the Shapley value for the variance game quantifies the contribution of every risk to the portfolio variance. It was proved that the Shapley value for the variance game is
$$\phi_{i}(\nu) = \mathrm{Cov}[X_{i}, S], \tag{1}$$
where $\mathrm{Cov}[\cdot ,\cdot ]$ denotes the covariance. The representation in (1) is very convenient, since in general it is not possible to find an explicit representation of the Shapley values and expensive numerical techniques must be adopted [Reference Plischke, Rabitti and Borgonovo31].
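The closed form for the variance game, read here as $\phi_{i}(\nu) = \mathrm{Cov}[X_{i}, S]$ (the ith row sum of the covariance matrix), can be checked against a brute-force evaluation over all coalitions. This is a minimal sketch under stated assumptions: the covariance matrix `Sigma` is a hypothetical example and the helper names are not from the paper.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley(n, v):
    """Brute-force Shapley values from the coalition-weighted formula."""
    phi = np.zeros(n)
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for J in combinations(others, size):
                phi[i] += weight * (v(J + (i,)) - v(J))
    return phi


# Hypothetical covariance matrix of three dependent risks (assumption).
Sigma = np.array([[1.0, 0.3, -0.2],
                  [0.3, 2.0, 0.5],
                  [-0.2, 0.5, 1.5]])
n = Sigma.shape[0]


def variance_game(J):
    """nu(J) = Var[S_J], computed as w' Sigma w with w the indicator of J."""
    w = np.array([1.0 if k in J else 0.0 for k in range(n)])
    return float(w @ Sigma @ w)


phi_brute = shapley(n, variance_game)
phi_closed = Sigma.sum(axis=1)  # Cov[X_i, S] is the i-th row sum of Sigma
print(phi_brute, phi_closed)
```

The brute-force route needs all $2^{n}$ coalition values, whereas the closed form needs only the covariance matrix, which is the practical advantage the text refers to.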
We note that this expression of the Shapley values for the variance game in (1) is well studied in the actuarial literature. In particular, the normalized Shapley value for the variance game,
$$\widetilde{\phi}_{i}(\nu) = \frac{\phi_{i}(\nu)}{\nu(N)} = \frac{\mathrm{Cov}[X_{i}, S]}{\mathrm{Var}[S]}, \tag{2}$$
is known as the covariance allocation method [Reference Dhaene, Tsanakas, Valdez and Vanduffel13, Reference Overbeck28]:
[the covariance allocation method] explicitly takes into account the dependence structure of the random losses $(X_{1},X_{2},\ldots,X_{n})$ . Business units with a loss that is more correlated with the aggregate portfolio loss S are penalized by requiring them to hold a larger amount of capital than those that are less correlated [Reference Dhaene, Tsanakas, Valdez and Vanduffel13, p. 8].
A second game was introduced in [Reference Colini-Baldeschi, Scarsini and Vaccari8]. It defined the standard deviation game considering the value function $\mu(J) = \sqrt{\mathrm{Var}[S_{J}]}$ for all $J\subseteq N$ . However, in such a case it is not possible to express the resulting Shapley value $\phi_{i}(\mu)$ in closed form. Then, [Reference Colini-Baldeschi, Scarsini and Vaccari8] compared the resulting Shapley values for the variance game with those for the standard deviation game in terms of vector majorization. In particular, a conjecture was formulated on the majorization of the Shapley values. Given two vectors $\textbf{x},\textbf{y}\in \mathbb{R}^{n}$ , $\textbf{x}$ is said to be majorized by $\textbf{y}$ ( $\textbf{x}\leq \textbf{y}$ ) if
$$\sum_{i=1}^{k}x_{(i)} \geq \sum_{i=1}^{k}y_{(i)} \quad \text{for all } k \in \{1,\ldots,n-1\}, \qquad \sum_{i=1}^{n}x_{(i)} = \sum_{i=1}^{n}y_{(i)},$$
where $x_{(1)}\leq x_{(2)}\leq \cdots \leq x_{(n)}$ is the increasing rearrangement of $\textbf{x}$ . Using this notation, the conjecture can be formulated as follows.
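A majorization check is easy to implement. The sketch below assumes the standard definition: after sorting both vectors increasingly, every partial sum of $\textbf{x}$ is at least the corresponding partial sum of $\textbf{y}$, and the totals coincide. The function name and tolerance are illustrative choices.

```python
import numpy as np


def is_majorized(x, y, tol=1e-9):
    """Return True if x is majorized by y (standard definition, assumed here)."""
    xs = np.sort(np.asarray(x, dtype=float))  # increasing rearrangement of x
    ys = np.sort(np.asarray(y, dtype=float))  # increasing rearrangement of y
    if xs.shape != ys.shape:
        raise ValueError("x and y must have the same length")
    cx, cy = np.cumsum(xs), np.cumsum(ys)
    same_total = abs(cx[-1] - cy[-1]) <= tol
    # Partial sums of the smallest entries of x must dominate those of y.
    return bool(same_total and np.all(cx[:-1] >= cy[:-1] - tol))


uniform = [1 / 3, 1 / 3, 1 / 3]
spread = [0.2, 0.3, 0.5]
print(is_majorized(uniform, spread), is_majorized(spread, uniform))
```

Intuitively, the majorized vector is the more evenly spread one: the uniform vector is majorized by every vector with the same total.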
Conjecture 1. (Colini-Baldeschi et al. [Reference Colini-Baldeschi, Scarsini and Vaccari8].) For any $n\times n$ covariance matrix $\Sigma$ , if $\nu$ is the corresponding variance game and $\mu$ the corresponding standard deviation game, then
$$\frac{\Phi(\mu)}{\mu(N)} \leq \frac{\Phi(\nu)}{\nu(N)}, \tag{3}$$
where $\Phi$ denotes the vector of the Shapley values.
This conjecture was proved to hold true in the independent case in [Reference Chen, Hu and Tang5, Reference Galeotti and Rabitti17]. However, [Reference Galeotti and Rabitti17] provided two counterexamples to the conjecture considering three dependent random variables. In general, such counterexamples can be seen to derive from Theorem 1.
3.1. Inverted majorization
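Denote by $\textbf{y}$ and $\textbf{x}$ the normalized vectors of Shapley values relative, respectively, to the variance and the standard deviation game, i.e.
$$\textbf{y} = \frac{\Phi(\nu)}{\nu(N)}, \qquad \textbf{x} = \frac{\Phi(\mu)}{\mu(N)}.$$
Then the following theorem holds.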
Theorem 1. Consider three non-negative valued risks $(X_{1}, X_{2}, X_{3})$ such that $\mathrm{Var}[X_{1}] < \mathrm{Var}[X_{2}] < \mathrm{Var}[X_{3}]$ , whereas the Shapley values relative to the variance game are equal, i.e. $\mathrm{Cov}[X_{i}, X_{1} + X_{2} + X_{3}] = \frac{1}{3}\mathrm{Var}[X_{1} + X_{2} + X_{3}]$ , $i = 1, 2, 3$ . Then, denoting by $\textbf{y}$ and $\textbf{x}$ the vectors of normalized Shapley values relative, respectively, to the variance game $\nu$ and the standard deviation game $\mu$ , $\textbf{y}$ is majorized by $\textbf{x}$ .
Remark 1. It is easily checked that the theorem also holds in the cases $\mathrm{Var}[X_{1}] = \mathrm{Var}[X_{2}] < \mathrm{Var}[X_{3}]$ and $\mathrm{Var}[X_{1}] < \mathrm{Var}[X_{2}] = \mathrm{Var}[X_{3}]$ .
The proof can be found in [Reference Galeotti and Rabitti17]. Theorem 1 provides counterexamples to the conjecture in [Reference Colini-Baldeschi, Scarsini and Vaccari8], as under its conditions the majorization in (3) is inverted. Conversely, if the three risks were independent, the conjectured majorization would hold, as proved in [Reference Chen, Hu and Tang5, Reference Galeotti and Rabitti17]. In such a case the Shapley values relative to the variance game are equal to the variances of the risks.
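Both situations can be reproduced numerically from a covariance matrix alone, since the two games depend on the risks only through it. The sketch below is illustrative: the diagonal matrix stands for three independent risks, while `Sigma_dep` is a hypothetical positive definite matrix with unequal variances and equal row sums, chosen to mimic the assumptions of Theorem 1. Neither matrix comes from the paper, and the majorization test assumes the standard definition.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley(n, v):
    """Brute-force Shapley values from the coalition-weighted formula."""
    phi = np.zeros(n)
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for J in combinations(others, size):
                phi[i] += weight * (v(J + (i,)) - v(J))
    return phi


def normalized_shapley_pair(Sigma):
    """Return (x, y): normalized Shapley vectors of the sd and variance games."""
    n = Sigma.shape[0]

    def var_game(J):
        w = np.array([1.0 if k in J else 0.0 for k in range(n)])
        return max(float(w @ Sigma @ w), 0.0)  # guard against rounding below 0

    def sd_game(J):
        return float(np.sqrt(var_game(J)))

    full = tuple(range(n))
    x = shapley(n, sd_game) / sd_game(full)
    y = shapley(n, var_game) / var_game(full)
    return x, y


def is_majorized(x, y, tol=1e-9):
    """True if x is majorized by y (partial sums of increasing rearrangements)."""
    cx, cy = np.cumsum(np.sort(x)), np.cumsum(np.sort(y))
    return bool(abs(cx[-1] - cy[-1]) <= tol and np.all(cx[:-1] >= cy[:-1] - tol))


# Independent risks: diagonal covariance matrix (assumption for illustration).
x_ind, y_ind = normalized_shapley_pair(np.diag([1.0, 2.0, 3.0]))

# Dependent risks: unequal variances, equal row sums (hypothetical matrix).
Sigma_dep = np.array([[1.0, 0.5, -0.5],
                      [0.5, 2.0, -1.5],
                      [-0.5, -1.5, 3.0]])
x_dep, y_dep = normalized_shapley_pair(Sigma_dep)

print("independent:", x_ind, y_ind, is_majorized(x_ind, y_ind))
print("dependent:  ", x_dep, y_dep, is_majorized(y_dep, x_dep))
```

In the dependent case the variance-game vector is uniform, so any non-uniform standard deviation vector majorizes it, which is the inversion described by the theorem.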
As we saw in Section 2 the Shapley value is a fair method to share the total value of the game among individual risks. The Shapley values of the two games are fair allocations of the two total game values, namely the portfolio variance and standard deviation. However, coming to the relationship between the two Shapley values, Theorem 1 shows that, in the dependent case, the Shapley value for the variance allocation may produce a risk ranking which is different from the one produced by the Shapley value for the standard deviation game. In fact, under the assumptions of the theorem, considering the variance and standard deviation Shapley values as risk metrics, it occurs that, by a small modification of the random variables’ distributions, the riskiest random variable relative to the variance game is not the riskiest one in the standard deviation game. This implies that, depending on the choice of the value of the game, one random variable $X_{i}$ can be considered proportionally less or more risky for the portfolio. Then, simply changing the game (i.e. the way we measure risk in the portfolio), an effect of risk transfer takes place among the random variables, since the risk ranking is changed.
4. Tail variance and standard deviation games
In this paper we extend the results of [Reference Colini-Baldeschi, Scarsini and Vaccari8] to the tail variance and the tail standard deviation. This allows the risk analyst to identify and quantify the most important risks driving the tail variability. It can be done by introducing a tail threshold s and conditioning the loss as follows.
Definition 1. We define the value function for the tail variance game as $\nu_{s}(J) = \mathrm{Var}[S_{J} \mid S > s]$ .
In parallel to the results of [Reference Colini-Baldeschi, Scarsini and Vaccari8], we can characterize the tail variance game.
Theorem 2. For the value function defined in Definition 1, $\phi_{i}(\nu_{s}) = \mathrm{Cov}[X_{i}, S \mid S > s]$ .
Proof. First of all, we define new variables $(\widetilde{X}_{1},\widetilde{X}_{2},\ldots,\widetilde{X}_{n})$ as follows:
Then, it is easily checked that, for any subset J of $\{1,2,\ldots,n\}$ ,
It follows that the tail variance game $\nu_{s}$ , relative to the variables $(X_{1}, X_{2}, \ldots,X_{n})$ , is equivalent to the variance game relative to $(\widetilde{X}_{1}, \widetilde{X}_{2}, \ldots, \widetilde{X}_{n})$ . Hence, we can apply the result of [Reference Colini-Baldeschi, Scarsini and Vaccari8], implying $\phi_{i}(\nu_{s}) = \mathrm{Cov}\big[\widetilde{X}_{i},\sum_{j=1}^{n}\widetilde{X}_{j}\big]$ . On the other hand, we get $\mathrm{Cov}\big[\widetilde{X}_{i},\sum_{j=1}^{n}\widetilde{X}_{j}\big] = \mathrm{Cov}[X_{i},S \mid S > s]$ , which proves the theorem.
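Theorem 2 can be illustrated on simulated data. The sketch below is an assumption-laden example rather than the authors' procedure: it draws three independent exponential risks with hypothetical means, keeps the scenarios with $S > s$, and compares the brute-force Shapley values of the empirical tail variance game with the empirical tail covariances $\mathrm{Cov}[X_{i}, S \mid S > s]$.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley(n, v):
    """Brute-force Shapley values from the coalition-weighted formula."""
    phi = np.zeros(n)
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for J in combinations(others, size):
                phi[i] += weight * (v(J + (i,)) - v(J))
    return phi


rng = np.random.default_rng(0)
# Hypothetical portfolio: independent exponential losses with these means.
X = rng.exponential(scale=[0.8, 1.0, 1.0], size=(100_000, 3))
s = 3.0  # tail threshold (illustrative choice)

tail = X[X.sum(axis=1) > s]            # scenarios with S > s
Sigma_s = np.cov(tail, rowvar=False)   # empirical covariance given S > s


def tail_variance_game(J):
    """Empirical nu_s(J) = Var[S_J | S > s]."""
    w = np.array([1.0 if k in J else 0.0 for k in range(3)])
    return float(w @ Sigma_s @ w)


phi_brute = shapley(3, tail_variance_game)
phi_closed = Sigma_s.sum(axis=1)       # empirical Cov[X_i, S | S > s]
print(phi_brute, phi_closed)
```

On a fixed sample the two vectors agree up to rounding, because the empirical tail variance game is itself a variance game for the empirical conditional covariance matrix. Sampling error affects only how close both are to the population values.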
This closed-form expression generates the Shapley value for the tail variance game, avoiding the problem of estimating the value for all the $2^{n}$ possible coalitions, which becomes computationally demanding as n increases. The Shapley value may be negative: if $X_{i}$ contributes to hedging the total tail risk, it is rewarded with a negative Shapley value. In the actuarial literature we can find different works concerning this allocation principle. For instance, [Reference Valdez37] derived an analytical expression for $\mathrm{Cov}[X_{i},S \mid S > s]$ in the case of sums of multivariate normal random variables. This result was extended in [Reference Valdez38] to the case of elliptical distributions. Nonetheless, in our contribution for the first time this allocation principle is interpreted as a Shapley value. If we normalize the Shapley values for the tail variance game, we obtain
$$\widetilde{\phi}_{i}(\nu_{s}) = \frac{\phi_{i}(\nu_{s})}{\nu_{s}(N)} = \frac{\mathrm{Cov}[X_{i}, S \mid S > s]}{\mathrm{Var}[S \mid S > s]},$$
which are again Shapley values by linearity. In analogy to the covariance allocation principle in (2), we call the Shapley value $\widetilde{\phi}_{i}(\nu_{s})$ the tail covariance allocation principle. Note that this allocation is a pure number independent of the currency unit. It represents the fractional contribution of the random variable $X_{i}$ to the tail covariance.
In parallel to [Reference Colini-Baldeschi, Scarsini and Vaccari8], we can introduce the tail standard deviation game.
Definition 2. We define the value function for the tail standard deviation game as $\mu_{s}(J) = \sqrt{\mathrm{Var}[S_{J} \mid S > s]}$ .
The related Shapley value allocates the tail standard deviation as the total value of the game. However, as in the unconditioned case, no closed-form expression is available. In order to compare the rankings of the two Shapley values arising from Definitions 1 and 2, we consider the majorization problem in the next section.
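Although the tail standard deviation game has no closed form, for a small portfolio its Shapley values can be obtained by enumerating all coalitions. The following sketch estimates both normalized allocations from simulated data. The gamma-distributed risks, the threshold, and the function names are assumptions made for illustration only.

```python
from itertools import combinations
from math import factorial

import numpy as np


def shapley(n, v):
    """Brute-force Shapley values from the coalition-weighted formula."""
    phi = np.zeros(n)
    for i in range(n):
        others = [p for p in range(n) if p != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for J in combinations(others, size):
                phi[i] += weight * (v(J + (i,)) - v(J))
    return phi


def tail_allocations(X, s):
    """Normalized Shapley vectors (x, y) of the tail sd and tail variance games."""
    n = X.shape[1]
    tail = X[X.sum(axis=1) > s]
    Sigma_s = np.cov(tail, rowvar=False)

    def var_game(J):
        w = np.array([1.0 if k in J else 0.0 for k in range(n)])
        return max(float(w @ Sigma_s @ w), 0.0)

    def sd_game(J):
        return float(np.sqrt(var_game(J)))

    full = tuple(range(n))
    x = shapley(n, sd_game) / sd_game(full)      # tail standard deviation game
    y = Sigma_s.sum(axis=1) / var_game(full)     # closed form for tail variance
    return x, y


rng = np.random.default_rng(0)
# Hypothetical portfolio of three independent gamma-distributed losses.
X = rng.gamma(shape=[1.0, 2.0, 3.0], scale=1.0, size=(100_000, 3))
x_s, y_s = tail_allocations(X, s=8.0)
print("tail sd allocation:      ", x_s, "ranking:", np.argsort(-x_s))
print("tail variance allocation:", y_s, "ranking:", np.argsort(-y_s))
```

Comparing the two printed rankings for different thresholds gives a quick empirical view of whether the choice of game changes which risk is considered the riskiest.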
5. The majorization problem and the dependence on the threshold
It is interesting to investigate whether the analogue of Conjecture 1 could hold true in the tail conditional case. The answer is negative for suitable heavy-tail variables, as our following example will show.
To illustrate our scenario, we consider the situation where non-negative random variables represent the losses of an insurance portfolio. The insurance premiums are paid only if the total loss of the portfolio exceeds a given threshold $s\geq 0$ . When $s>0$ , we have observed that this is equivalent to considering a new portfolio $(\widetilde{X}_{1},\widetilde{X}_{2},\ldots,\widetilde{X}_{n})$ defined by (4). Consequently, even if $X_{1},X_{2},\ldots,X_{n}$ are independent, $\widetilde{X}_{1},\widetilde{X}_{2},\ldots,\widetilde{X}_{n}$ are no longer so. Hence, we ask whether the majorization $\widetilde{\textbf{x}}(s)\leq \widetilde{\textbf{y}}(s)$ , which holds when $s=0$ , may fail or even be inverted for higher values of s.
To this end we consider two examples, where three independent non-negative variables $X_{1} ,X_{2} ,X_{3}$ have, respectively, light-tail (exponential) and heavy-tail (slightly modified Pareto ones) distributions. By applying Theorem 1 we prove that in the latter case the inverted majorization occurs for suitable high values of the tail threshold, while the majorization is preserved in the light-tail case. In the following, we provide outlines of the proofs of Propositions 1 and 2, omitting the details that appear in the full proofs in the Appendix.
5.1. A light-tail example without inversion
Consider three independent non-negative variables $X_{1} ,X_{2} ,X_{3}$ with distributions given by $F_{1}(x) = 1 - {\mathrm{e}}^{-(1 + \varepsilon)x}$ , $F_{2}(x) = F_{3}(x) = 1 - {\mathrm{e}}^{-x}$ . We have $\mathrm{Var}[X_{1}] = (1 + \varepsilon)^{-2} < \mathrm{Var}[X_{2}] = \mathrm{Var}[X_{3}] = 1$ , while $\mathbb{E}[X_{1}] = (1 + \varepsilon)^{-1} < \mathbb{E}[X_{2}] = \mathbb{E}[X_{3}] = 1$ .
Proposition 1. Consider the tail-conditioned variables $\widetilde{X}_{i}$ . Then, no matter how high the tail threshold s is and how small the parameter $\varepsilon $ is, $\phi_{1}(\nu_{s}) < \phi_{2}(\nu_{s}) = \phi_{3}(\nu_{s})$ , so that the inverted majorization caused by Theorem 1 does not occur.
Proof (outline). Taking $s>0$ sufficiently high and $\varepsilon >0$ sufficiently small, we want to estimate the sign of the difference
Explicitly, recalling that $\widetilde{X}_{2}$ and $\widetilde{X}_{3}$ are identically distributed,
First of all, we observe that, for $i\neq j\neq k$ ,
where $\overline{F_{i}}(z) = 1 - F_{i}(z)$ , $f_{i}(z) = F_{i}^{\prime}(z)$ , and $f_{j}\ast f_{k}$ denotes the convolution product, so that $\int_{s-x}^{+\infty}(f_{j} \ast f_{k})(z) \,{\mathrm{d}} z = (\overline{F_{j} \ast F_{k}})(s - x)$ . Moreover,
where
We can now proceed to compute the terms in (5) by expanding them as formal power series of $\varepsilon$ . Observing that $G(0,s) = 0$ for all $s > 0$ and $G(\varepsilon,0) < 0$ for all $\varepsilon > 0$ , we are led to the formal power series $G(\varepsilon, s) = {\mathrm{e}}^{-s}\sum\varepsilon^{m}p_{m}(s)$ . Since the functions $p_{m}(s)$ grow at most polynomially in s, taking $\varepsilon \leq {\mathrm{e}}^{-s}$ means the series $\sum\varepsilon^{m}p_{m}(s)$ converges for any high value of s. In this case, the sign of $G(\varepsilon, s)$ when s is sufficiently high is given by the sign of $p_{1}(s)$ . Therefore, if it happens that, for sufficiently high values of s, $p_{1}(s) > 0$ , there exists some pair $(\overline{\varepsilon},\overline{s})$ satisfying $G(\overline{\varepsilon},\overline{s}) = 0$ and therefore meeting the conditions of Theorem 1. The first observation concerns the computation of $\mathbb{E}[\widetilde{X}_{1}]$ and $\mathbb{E}[\widetilde{X}_{2}] = \mathbb{E}[\widetilde{X}_{3}]$ . It is checked that, in any case, $\mathbb{E}[\widetilde{X}_{i}] = {\mathrm{e}}^{-s}a_{i}(s) + {\mathrm{e}}^{-s}(\varepsilon b_{i}(s) + \mathrm{h.o.t.}(\varepsilon))$ , where $a_{i}(s)$ and $b_{i}(s)$ grow polynomially in s, and $\mathrm{h.o.t.}(\varepsilon)$ means higher-order terms in $\varepsilon$ . Therefore, the coefficients of $\varepsilon$ in $\mathbb{E}^{2}[\widetilde{X}_{i}]$ and $\mathbb{E}[\widetilde{X}_{i}]\mathbb{E}[\widetilde{X}_{j}]$ all tend to zero when $s \rightarrow +\infty$ , as ${\mathrm{e}}^{-2s}q(s)$ , where q(s) is a polynomial in s.
Consider now $\mathbb{E}[\widetilde{X}_{1}^{2}]$ . We have
Integrating by parts and letting L(x) denote the primitive of $2x\overline{F}_{1}(x)$ which satisfies $L(+\infty) = 0$ , it remains to calculate
Then, after a certain number of steps, it is found that the contribution to $\varepsilon p_{1}(s)$ is given by $\varepsilon(\!-\!{s^{4}}/{6} + \mathrm{l.o.t.}(s))$ , where l.o.t. stands for lower-order terms.
In the case of $\mathbb{E}[\widetilde{X}_{2}^{2}]$ the contribution to $\varepsilon p_{1}(s)$ is only given by
and standard calculations lead to $\varepsilon(\!-\!{s^{4}}/{12} + \mathrm{l.o.t.}(s))$ , which, in $G(\varepsilon, s)$ , becomes $\varepsilon({s^{4}}/{12} + \mathrm{l.o.t.}(s))$ .
Consider now $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}]$ and $\mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ . Then,
since $K(x,y) = \mathrm{Prob}(\widetilde{X}_{3} \leq x,\,\widetilde{X}_{1}\leq y)$ . Therefore, by (6), $({\partial}/{\partial x\partial y})K(x,y) = ({\partial}/{\partial x\partial y})H(x,y)$ , where
with $x,y>0$ and $\overline{F}_{2}(s-x-y)<1$ when $x+y<s$ . Hence, we have to calculate $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}]$ on the triangle $T = \{0<x<s,0<y<s-x\}$ , i.e.
By the same arguments, in estimating $\mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ , we have to compute
where $R(x,z) = \overline{F}_{3}(x)\overline{F}_{2}(z)\overline{F}_{1}(s-x-z)$ . By a suitable change of variables we obtain
so that the difference $\iint\limits_{T}xy\frac{\partial}{\partial x\partial y}H(x,y) \,{\mathrm{d}} x\,{\mathrm{d}} y - \iint\limits_{T}xz\frac{\partial}{\partial x\partial z}R(x,z) \,{\mathrm{d}} x\,{\mathrm{d}} z$ is zero. Hence, the contribution to $\varepsilon p_{1}(s)$ of higher order in s coming from $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}] - \mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ is given by $\int_{0}^{s}xf_{3}(x) \,{\mathrm{d}} x \int_{s-x}^{+\infty}yf_{1}(y) \,{\mathrm{d}} y$ , which yields, after simple steps, $\varepsilon(\!-\!{s^{4}}/{12} + \mathrm{l.o.t.}(s))$ . In conclusion, $\varepsilon p_{1}(s) = \varepsilon(\!-\!{s^{4}}/{6} + \mathrm{l.o.t.}(s))$ . Hence, no matter how high s is and how small $\varepsilon$ is accordingly, $G(\varepsilon, s) < 0$ and Theorem 1 cannot be applied.
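A rough Monte Carlo experiment can complement the asymptotic argument for the exponential example. The sketch below uses the closed form of Theorem 2 to estimate the difference $\phi_{1}(\nu_{s}) - \phi_{2}(\nu_{s})$ for a few thresholds. It is only indicative: the value $\varepsilon = 0.5$ is deliberately large (an assumption, so that sampling noise does not swamp the effect), the thresholds are moderate, and simulation cannot replace the proof for small $\varepsilon$ and large s.

```python
import numpy as np


def tail_covariance_shapley(X, s):
    """Empirical Cov[X_i, S | S > s] for each risk i (closed form of Theorem 2)."""
    tail = X[X.sum(axis=1) > s]
    Sigma_s = np.cov(tail, rowvar=False)
    return Sigma_s.sum(axis=1)


rng = np.random.default_rng(1)
eps = 0.5                      # illustrative value, not the small-eps regime
n_samples = 200_000
rates = np.array([1.0 + eps, 1.0, 1.0])
# X_1 ~ Exp(rate 1 + eps), X_2 and X_3 ~ Exp(rate 1), all independent.
X = rng.exponential(scale=1.0 / rates, size=(n_samples, 3))

thresholds = [0.0, 1.0, 2.0, 3.0]
diff = {}
for s in thresholds:
    phi = tail_covariance_shapley(X, s)
    diff[s] = float(phi[0] - phi[1])   # estimate of phi_1(nu_s) - phi_2(nu_s)
    print(f"s = {s:.1f}: phi = {np.round(phi, 3)}, phi_1 - phi_2 = {diff[s]:.3f}")
```

At $s = 0$ the estimate should be close to $\mathrm{Var}[X_{1}] - \mathrm{Var}[X_{2}] = (1+\varepsilon)^{-2} - 1$, which is negative. For larger thresholds the tail sample shrinks, so the estimates become noisier and should be read with care.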
5.2. A heavy-tail example with inversion
Consider three independent non-negative variables (risks) $X_{1}, X_{2}, X_{3}$ , whose distributions are
where $h = 2 + \varepsilon$ , with $k > 0$ high enough and $\varepsilon > 0$ as small as we will need.
Proposition 2. Using the same notation as Section 5.1, there exist pairs $(\overline{\varepsilon},\overline{s})$ satisfying the conditions of Theorem 1 and therefore inverting the majorization of the Shapley values for the tail variance and standard deviation games.
Proof (outline). First of all, $\varepsilon$ can be chosen so small as to ensure $f_{1}(x) = F_{1}^{\prime}(x) > 0$ for $x \geq 0$ . Then, straightforward computations show that, for k sufficiently large and $\varepsilon$ sufficiently small, $\mathrm{Var}[X_{1}] - \mathrm{Var}[X_{2}] > {\varepsilon}/{257}$ , implying, in terms of Shapley values, $\phi_{1}(\nu_{0}^\varepsilon) > \phi_{2}(\nu_{0}^\varepsilon) = \phi_{3}(\nu_{0}^\varepsilon)$ , where $\nu_0^\varepsilon$ denotes the value function for the tail variance game depending on the parameter $\varepsilon$ and the threshold $s=0$ . On the other hand, we observe that
Then, adopting the same notation as the previous proposition, we have to estimate the sign of $G(\varepsilon, s)$ when $s > 0$ is sufficiently high and $\varepsilon > 0$ is accordingly sufficiently small, knowing that $G(0, s) = 0$ for all $s \geq 0$ and $G(\varepsilon, 0) > 0$ for sufficiently small values of $\varepsilon > 0$ .
By steps analogous to those followed in the previous proposition, we are first led to calculate the contribution to $\varepsilon p_{1}(s)$ given by $\mathbb{E}[\widetilde{X}_{1}^{2}]$ . Precisely, we need to compute
where L(x) is the primitive of $2x\overline{F}_{1}(x)$ satisfying $L(\!+\!\infty) = 0$ . In particular, applying the mean value theorem, we have to estimate
where $\overline{z} \in (0,s)$ . This implies that, calling $\varepsilon a(s)$ the contribution of $\mathbb{E}[\widetilde{X}_{1}^{2}]$ to $\varepsilon p_{1}(s)$ , a(s) tends to zero whenever $s \rightarrow +\infty$ not faster than $s^{-1}$ . In order to evaluate its sign, let us apply the mean value theorem the other way around,
so that, when s is sufficiently high, $a(s) < 0$ . It can be checked that the other contributions to $\varepsilon p_{1}(s)$ tend to zero when $s \rightarrow +\infty$ at least as $s^{-2}$ , except possibly the one given by $\int_{0}^{s}xf_{3}(x) \,{\mathrm{d}} x\int_{s-x}^{+\infty}yf_{1}(y) \,{\mathrm{d}} y$ , yielding
Hence, the integrand function tends to zero not faster than $({1-{5}/{4h}})/{(1+hs)^{2}}$ , which can be written as
Then, when s is sufficiently high, the above integral is smaller than
where b and c are suitable positive numbers.
It follows that, for s sufficiently high and $\varepsilon$ sufficiently small, $G(\varepsilon, s) < 0$ , so that, since $G(\varepsilon,0) > 0$ , there exist pairs $(\overline{\varepsilon},\overline{s})$ satisfying $G(\overline{\varepsilon},\overline{s}) = 0$ . Therefore, Theorem 1 can be applied and the inverted majorization occurs.
5.3. Conjecture for tail-conditioned risks
First of all, we observe that the previous examples can be generalized as follows. Consider a set of n risks (non-negative random variables) $(X_{1}, \ldots, X_{n-1}, X_{n})$ that are independent and such that $X_{1}, \ldots, X_{n-1}$ are identically distributed, while the distribution of $X_{n}$ is different; in particular, it has a higher or lower variance (using, as we do, a parameter $\varepsilon$ , the variance depends on the sign of $\varepsilon$ ). The first example we analyzed refers to light-tail distributions (approximated by exponential distributions). Then, our computations show, in the case $n=3$ , and suggest, for any n, that, comparing the tail-conditioned variables $\widetilde{X}_{n}$ and $\widetilde{X}_{i}$ , $1 \leq i \leq n-1$ , the difference between the two variances, in absolute value, increases as the tail threshold increases, and so does the absolute value of the difference between the covariances of $\widetilde{X}_{n}$ and $\widetilde{X}_{i}$ with a third random variable $\widetilde{X}_{j}$ , $j \neq i,n$ . The opposite, instead, occurs in our second example, where suitable heavy-tail distributions are considered. In fact, in such an example, computations show that the inversion, for a suitable choice of the parameters, is allowed precisely by the fact that the tails of the distributions are rational functions, rather than negative exponential ones. As a consequence, in spite of the limits of our examples, we argue there may be sufficient reasons to formulate the following conjecture.
Conjecture 2. Consider a portfolio of independent risks $S = \sum_{i=1}^{n}X_{i}$ . The inverted majorization of the tail Shapley values does not occur if the risks $X_{i}$ are light-tail distributed for all $i = 1, 2, \ldots, n$ .
We observe that the unconditional risks are independent. In the tail-conditional case, the risks become dependent. In our conjecture the majorization of [Reference Colini-Baldeschi, Scarsini and Vaccari8] is maintained if the original distributions are light-tail.
6. Implications for tail risk management
The implications of the above results concern the actuarial allocation of the tail loss of a portfolio of risks into individual risk contributions. Allocating the tail variance into risk contributions is essential for the individual fair pricing of insurance/reinsurance contracts. In this regard the Shapley value provides a useful attribution method for sharing the insurance premium among the holders of the risks in the portfolio.
Consider a portfolio of non-negative risks $(X_{1},X_{2},\ldots,X_{n})$ . Let us assume for the moment that risks are independent and consider the limit case $s=0$ . Then, adopting the Shapley values for the standard deviation game, the highest risks are proportionally allocated a smaller fraction than they would receive when allocating the variance. In other words, these Shapley values induce a ‘solidarity’ effect among the risks in the portfolio, since the conjecture of [Reference Colini-Baldeschi, Scarsini and Vaccari8] holds true. Thus, asking whether the majorization holds for any $s>0$ is equivalent to asking whether the Shapley values for the tail standard deviation game always induce a stronger solidarity effect than those for the variance game. However, our example shows that in general the solidarity of the two Shapley values depends on the threshold s and on the tail fatness. The implication is that the ranking of the individual risk contributions can be changed by allocating either the tail variance or the tail standard deviation. In the next section we show that well-known premium principles are Shapley values for tail games and, thus, the ranking of the premiums to be individually paid might be changed simply depending on the chosen tail game.
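The comparison described above can be explored numerically. The following is a minimal Monte Carlo sketch, not the authors' procedure: it assumes independent exponential risks, uses empirical tail moments, and introduces the hypothetical helpers `tail_rows`, `tail_variance`, `shapley_shares`, and `majorizes` purely for illustration. It computes normalized Shapley values for the tail variance game and the tail standard deviation game and checks whether the former majorize the latter.

```python
import itertools
import math
import random

def tail_rows(means, s, n_draws=20000, seed=0):
    """Independent exponential risks with the given means; keep rows with S > s."""
    rng = random.Random(seed)
    rows = ([rng.expovariate(1.0 / m) for m in means] for _ in range(n_draws))
    return [r for r in rows if sum(r) > s]

def tail_variance(rows, J):
    """Empirical Var[S_J | S > s] for the coalition J (a tuple of indices)."""
    if not J:
        return 0.0
    s_J = [sum(r[i] for i in J) for r in rows]
    m = sum(s_J) / len(s_J)
    return sum((v - m) ** 2 for v in s_J) / len(s_J)

def shapley_shares(rows, transform):
    """Normalized Shapley values of the game J -> transform(Var[S_J | S > s])."""
    n = len(rows[0])
    phi = [0.0] * n
    orders = list(itertools.permutations(range(n)))
    for order in orders:
        prev, J = 0.0, ()
        for i in order:
            J += (i,)
            cur = transform(tail_variance(rows, J))
            phi[i] += (cur - prev) / len(orders)  # marginal contribution of i
            prev = cur
    total = sum(phi)
    return [p / total for p in phi]

def majorizes(p, q, tol=1e-12):
    """True if p majorizes q (both assumed to sum to the same total)."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    cp = cq = 0.0
    for a, b in zip(ps, qs):
        cp += a
        cq += b
        if cp < cq - tol:
            return False
    return True

# Illustrative parameters only (assumed, not taken from the paper).
rows = tail_rows(means=[1.0, 2.0, 3.0], s=5.0)
var_shares = shapley_shares(rows, lambda v: v)   # tail variance game
sd_shares = shapley_shares(rows, math.sqrt)      # tail standard deviation game
print(var_shares, sd_shares, majorizes(var_shares, sd_shares))
```

Varying `s` and replacing the exponential sampler with a heavier-tailed one is a simple way to probe how the threshold and the tail fatness affect the comparison, subject to Monte Carlo error.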
6.1. Tail mean-variance games and premium principles
We define the game $\varepsilon_{s}(J) = \mathbb{E}[S_{J} \mid S > s]$ as the tail mean game. By the linearity of the conditional expectation, $\varepsilon_{s}(I\cup J) = \varepsilon_{s}(I) + \varepsilon_{s}(J)$ for all $I,J\subset N$ such that $I\cap J = \varnothing$ . A game satisfying this condition is called an additive game [Reference Colini-Baldeschi, Scarsini and Vaccari8]. It follows that $\varepsilon_{s}$ is an additive game and hence the Shapley value is equal to $\phi_{i}(\varepsilon_{s}) = \mathbb{E}[X_{i} \mid S > s]$ . In the actuarial literature, the Shapley value for the tail mean $\phi_{i}(\varepsilon_{s})$ is well known as the conditional tail expectation (CTE) allocation rule [Reference Denault10, Reference Dhaene, Henrard, Landsman, Vandendorpe and Vanduffel12, Reference Dhaene, Tsanakas, Valdez and Vanduffel13, Reference Kim and Kim22, Reference Tasche and Dev35].
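The additivity argument can be written out in one line. The derivation below is a sketch assuming the standard subset form of the Shapley value with $n = |N|$; the weight notation is introduced here for illustration and may differ from the notation used elsewhere in the paper. It uses only the fact that the Shapley weights sum to one.

```latex
\begin{align*}
\phi_{i}(\varepsilon_{s})
  &= \sum_{J \subseteq N \setminus \{i\}} \frac{|J|!\,(n-|J|-1)!}{n!}
     \bigl[\varepsilon_{s}(J \cup \{i\}) - \varepsilon_{s}(J)\bigr] \\
  &= \sum_{J \subseteq N \setminus \{i\}} \frac{|J|!\,(n-|J|-1)!}{n!}\,
     \mathbb{E}[X_{i} \mid S > s] \\
  &= \mathbb{E}[X_{i} \mid S > s].
\end{align*}
```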
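Consider now the value function
$$\gamma_{s}(J) = \mathbb{E}[S_{J} \mid S > s] + k\,\mathrm{Var}[S_{J} \mid S > s], \qquad J \subseteq N, \tag{7}$$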
where k is a positive constant. Using the tail mean and the tail variance games, we construct the game $\gamma_{s}$ in (7) as a linear combination of the two games. We call the game $\gamma_{s}$ the tail mean-variance game. This choice is motivated by analogy with the tail mean-variance risk model introduced in [Reference Landsman23]. We can characterize the Shapley values for the game $\gamma_{s}$ .
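Proposition 3. For the value function defined in (7),
$$\phi_{i}(\gamma_{s}) = \mathbb{E}[X_{i} \mid S > s] + k\,\mathrm{Cov}[X_{i},S \mid S > s], \qquad i = 1,2,\ldots,n. \tag{8}$$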
Proof. Consider the tail mean game $\varepsilon_{s}$ and the tail variance game $\nu_{s}$ . Their related Shapley values are $\phi_{i}(\varepsilon_{s}) = \mathbb{E}[X_{i} \mid S > s]$ and $\phi_{i}(\nu_{s}) = \mathrm{Cov}[X_{i},S \mid S > s]$ respectively. By the definition of $\gamma _{s}$ in (7) and the linearity of the Shapley value, the proof is concluded.
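The linearity argument in the proof can be checked numerically. The sketch below is an illustration under stated assumptions (independent unit-mean exponential risks, empirical tail moments with divisor equal to the sample size, and hypothetical helper names such as `tail_sample`, `gamma`, and `closed_form`). It computes the Shapley values of the game $\mathbb{E}[S_{J} \mid S > s] + k\,\mathrm{Var}[S_{J} \mid S > s]$ by brute-force enumeration of coalitions and compares them with $\mathbb{E}[X_{i} \mid S > s] + k\,\mathrm{Cov}[X_{i},S \mid S > s]$.

```python
import itertools
import math
import random

def tail_sample(n_risks=3, n_draws=20000, s=4.0, seed=0):
    """Draw independent Exp(1) risks and keep the draws with S > s."""
    rng = random.Random(seed)
    draws = ([rng.expovariate(1.0) for _ in range(n_risks)] for _ in range(n_draws))
    return [x for x in draws if sum(x) > s]

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    """Empirical covariance (divisor: sample size)."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def gamma(sample, J, k):
    """Tail mean-variance game evaluated on the coalition J."""
    if not J:
        return 0.0
    s_J = [sum(x[i] for i in J) for x in sample]
    return mean(s_J) + k * cov(s_J, s_J)

def shapley(sample, k):
    """Shapley values by enumerating all coalitions not containing i."""
    n = len(sample[0])
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(n):
            w = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
            for J in itertools.combinations(others, r):
                phi[i] += w * (gamma(sample, J + (i,), k) - gamma(sample, J, k))
    return phi

def closed_form(sample, k):
    """E[X_i | S > s] + k Cov[X_i, S | S > s], estimated on the same sample."""
    n = len(sample[0])
    total = [sum(x) for x in sample]
    return [mean([x[i] for x in sample]) + k * cov([x[i] for x in sample], total)
            for i in range(n)]
```

Because the empirical covariance is bilinear, the two computations agree up to floating-point error on any fixed sample, not merely in the large-sample limit.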
We can connect the result of Proposition 3 to premium principles presented in the actuarial literature. For the random variable $X_{i}$ with $i=1,2,\ldots,n$ , [Reference Furman and Landsman16] defined the tail covariance premium $\mathrm{TCP}_{s}(X_{i} \mid S)$ as
$$\mathrm{TCP}_{s}(X_{i} \mid S) = \mathbb{E}[X_{i} \mid S > s] + a\,\mathrm{Cov}[X_{i},S \mid S > s],$$
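where a is a non-negative constant. Since the covariance is expressed in a different unit (squared currency) from the expectation, [Reference Wang39] introduced the adjusted tail covariance premium $\mathrm{TCPA}_{s}(X_{i} \mid S)$ as
$$\mathrm{TCPA}_{s}(X_{i} \mid S) = \mathbb{E}[X_{i} \mid S > s] + a\,\frac{\mathrm{Cov}[X_{i},S \mid S > s]}{\sqrt{\mathrm{Var}[S \mid S > s]}}.$$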
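From (8) it is clear that the premium principles $\mathrm{TCP}_{s}(X_{i} \mid S)$ and $\mathrm{TCPA}_{s}(X_{i} \mid S)$ are Shapley values with the choice $k=a$ and $k = a\cdot(\mathrm{Var}[S \mid S > s])^{-1/2}$ , respectively. More precisely, by the linearity property of the Shapley value we can write
$$\mathrm{TCP}_{s}(X_{i} \mid S) = \phi_{i}(\varepsilon_{s}) + a\,\phi_{i}(\nu_{s}), \qquad \mathrm{TCPA}_{s}(X_{i} \mid S) = \phi_{i}(\varepsilon_{s}) + \frac{a}{\sqrt{\mathrm{Var}[S \mid S > s]}}\,\phi_{i}(\nu_{s}).$$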
This connection constitutes an important theoretical justification for the use of these two premium principles. In particular, the fairness of the two allocation methods is an appealing property derived from the Shapley value axiomatization.
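As an illustration of the two premium principles, the sketch below estimates them from simulated data, using the forms implied by the choices $k=a$ and $k = a\cdot(\mathrm{Var}[S \mid S > s])^{-1/2}$ , namely $\mathbb{E}[X_{i} \mid S > s] + a\,\mathrm{Cov}[X_{i},S \mid S > s]$ and its version with the covariance divided by $\sqrt{\mathrm{Var}[S \mid S > s]}$ . The exponential risks, the parameter values, and the helper names `simulate_tail` and `tail_premiums` are assumptions made for the example only.

```python
import math
import random

def simulate_tail(means, s, n_draws=20000, seed=1):
    """Independent exponential risks with the given means; keep rows with S > s."""
    rng = random.Random(seed)
    rows = ([rng.expovariate(1.0 / m) for m in means] for _ in range(n_draws))
    return [r for r in rows if sum(r) > s]

def tail_premiums(rows, a):
    """Return (TCP list, TCPA list, E[S | S > s], Var[S | S > s]) from tail rows."""
    m, n = len(rows), len(rows[0])
    total = [sum(r) for r in rows]
    mean_S = sum(total) / m
    var_S = sum((t - mean_S) ** 2 for t in total) / m
    tcp, tcpa = [], []
    for i in range(n):
        xi = [r[i] for r in rows]
        mean_i = sum(xi) / m
        cov_iS = sum((x - mean_i) * (t - mean_S) for x, t in zip(xi, total)) / m
        tcp.append(mean_i + a * cov_iS)                        # covariance loading
        tcpa.append(mean_i + a * cov_iS / math.sqrt(var_S))    # rescaled loading
    return tcp, tcpa, mean_S, var_S
```

Summing the individual premiums recovers the tail mean plus the loading on the tail variance (for TCP) or on the tail standard deviation (for TCPA), which is the efficiency property inherited from the Shapley value.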
6.2. Peer-to-peer reinsurance pricing
The peer-to-peer insurance scheme is a recent form of participating insurance in which a community of policyholders agree to share the first layer of losses that hit participants [Reference Clemente and Marano7, Reference Denuit11]. While the monetary transfers (i.e. loss coverage and/or partial premium refunds) among the policyholders in the community take place ex post, there is the need to protect ex ante the insurance community from large losses which cannot be distributed among participants. Thus, a reinsurance contract must be purchased in advance to protect the community. The problem is then how to calculate a fair reinsurance premium for every participant.
We propose the Shapley value for the tail variance and standard deviation games as a reliable answer to this question. Specifically, consider a reinsurance company which is asked to indemnify the losses S given the fact that they exceed a threshold s. The total reinsurance premium paid to the reinsurance company to be shared by policyholders can be quantified according to the variance principle,
$$P_\mathrm{V}(S,s) = \mathbb{E}[S-s \mid S>s] + \alpha_\mathrm{V}\,\mathrm{Var}[S-s \mid S>s], \tag{9}$$
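or the standard deviation principle,
$$P_\mathrm{SD}(S,s) = \mathbb{E}[S-s \mid S>s] + \alpha_\mathrm{SD}\sqrt{\mathrm{Var}[S-s \mid S>s]}. \tag{10}$$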
In the standard deviation principle $P_\mathrm{SD}$ the two terms $\mathbb{E}[S-s \mid S>s]$ and $\sqrt{\mathrm{Var}[S-s \mid S>s]}$ are expressed in the same monetary unit ( $\$ + \$ $ ), while the variance principle includes terms of different units ( $\$ + \$^{2}$ ). The Shapley values for the tail variance and standard deviation games represent an ideal attribution method to allocate the reinsurance premiums (9) and (10) among the peer-to-peer community members. Precisely, for the ith policyholder, where $i=1,2,\ldots,n$ , we find the Shapley values $\phi_{i}(P_\mathrm{V}(S,s))$ and $\phi_{i}(P_\mathrm{SD}(S,s))$ respectively, with the former expressed in closed form as established in Proposition 3. Note that, if the coefficients $\alpha_\mathrm{V}$ and $\alpha_\mathrm{SD}$ in (9) and (10) were set to zero, as well as s, then we would find $\phi_{i}(P_\mathrm{V}(S,0)) = \phi_{i}(P_\mathrm{SD}(S,0)) = \mathbb{E}[X_{i} \mid S>0]$ , which is exactly the CTE allocation method proposed in [Reference Denuit11] to price the peer-to-peer insurance premiums without a reinsurance purchase.
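The allocation of the two premiums can be sketched numerically as follows. This is an illustration under assumptions that the text does not spell out: the coalition games are taken to be $\mathbb{E}[S_{J} \mid S>s] + \alpha\,\mathrm{Var}[S_{J} \mid S>s]$ and $\mathbb{E}[S_{J} \mid S>s] + \alpha\sqrt{\mathrm{Var}[S_{J} \mid S>s]}$ , so that the deterministic threshold s is not allocated; risks are simulated as independent exponentials; and the names `premium_game`, `shapley`, and `ranking` are hypothetical.

```python
import itertools
import math
import random

def tail_rows(means, s, n_draws=20000, seed=2):
    """Independent exponential risks with the given means; keep rows with S > s."""
    rng = random.Random(seed)
    rows = ([rng.expovariate(1.0 / m) for m in means] for _ in range(n_draws))
    return [r for r in rows if sum(r) > s]

def moments(rows, J):
    """Empirical tail mean and tail variance of S_J."""
    s_J = [sum(r[i] for i in J) for r in rows]
    m = sum(s_J) / len(s_J)
    return m, sum((v - m) ** 2 for v in s_J) / len(s_J)

def premium_game(rows, alpha, principle):
    """Assumed coalition game: tail mean plus a variance or std-dev loading."""
    def value(J):
        if not J:
            return 0.0
        m, var = moments(rows, J)
        load = var if principle == "variance" else math.sqrt(var)
        return m + alpha * load
    return value

def shapley(value, n):
    """Shapley values as average marginal contributions over all orderings."""
    phi = [0.0] * n
    orders = list(itertools.permutations(range(n)))
    for order in orders:
        prev, J = 0.0, ()
        for i in order:
            J += (i,)
            cur = value(J)
            phi[i] += (cur - prev) / len(orders)
            prev = cur
    return phi

def ranking(phi):
    """Indices of the policyholders sorted by decreasing allocated premium."""
    return sorted(range(len(phi)), key=lambda i: phi[i], reverse=True)

# Illustrative parameters only (assumed, not taken from the paper).
rows = tail_rows([1.0, 2.0, 3.0], s=5.0)
phi_v = shapley(premium_game(rows, 0.2, "variance"), 3)
phi_sd = shapley(premium_game(rows, 0.2, "std"), 3)
print(ranking(phi_v), ranking(phi_sd))
```

Setting `alpha` to zero makes both allocations collapse to the empirical conditional means, matching the CTE limit noted above; comparing the two rankings for different thresholds and samplers gives a quick, simulation-based view of when they differ.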
Finally, [Reference Albrecher, Beirlant and Teugels2, p. 220] stated that ‘measuring variability by variance or standard deviation gives just different weight to “additional” risk and in that sense is a matter of taste and choice’. The results of our work show that, if the reinsurance company allocates either the total variance premium (9) or the total standard deviation premium (10) among policyholders using the Shapley value, the two allocations might rank the premiums paid by the individual community members differently. Investigating the mechanism and the conditions by which the threshold level and the tail fatness make the two rankings change is an open question for future research.
Appendix
Full proof of Proposition 1. Taking $s>0$ sufficiently high and $\varepsilon >0$ sufficiently small, we want to estimate the sign of the difference
Explicitly, recalling that $\widetilde{X}_{2}$ and $\widetilde{X}_{3}$ are identically distributed,
First of all, we observe that, for $i\neq j\neq k$ ,
where $\overline{F}_{i}(z) = 1 - F_{i}(z)$ , $f_{i}(z) = F_{i}^{\prime}(z)$ , and $f_{j} \ast f_{k}$ denotes the convolution product, so that we can write $\int_{s-x}^{+\infty}(f_{j} \ast f_{k})(z) \,{\mathrm{d}} z = (\overline{F_{j} \ast F_{k}})(s-x)$ . Moreover,
where
Next, it is easily seen that $G(0,s) = 0$ for all $s>0$ and $G(\varepsilon,0) < 0$ for all $\varepsilon > 0$ . Therefore, by Taylor expansions, we can write the formal power series
Since the functions $p_{m}(s)$ are easily seen to grow at most polynomially in s, taking, say, $\varepsilon \leq {\mathrm{e}}^{-s}$ , the series $\sum\varepsilon^{m}p_{m}(s)$ converges for any high value of s and the sign of $G(\varepsilon,s)$ , if s is sufficiently high, is given by the sign of $p_{1}(s)$ . Therefore, if it happens that, for sufficiently high values of s, $p_{1}(s)>0$ , there exists some pair $(\overline{\varepsilon},\overline{s})$ satisfying $G(\overline{\varepsilon},\overline{s}) = 0$ and therefore meeting the conditions of Theorem 1.
In order to proceed with the computations, we first calculate
The first observation concerns the computation of $\mathbb{E}[\widetilde{X}_{1}]$ and $\mathbb{E}[\widetilde{X}_{2}] = \mathbb{E}[\widetilde{X}_{3}]$ . It is easily checked that, in any case,
where $a_{i}(s)$ and $b_{i}(s)$ grow polynomially in s. Therefore, the coefficients of s in $\mathbb{E}^{2}[\widetilde{X}_{i}]$ and $\mathbb{E}[\widetilde{X}_{i}]\mathbb{E}[\widetilde{X}_{j}]$ all tend to zero when $s \rightarrow +\infty$ as ${\mathrm{e}}^{-2s}q(s)$ , where q(s) is a polynomial in s.
Consider now $\mathbb{E}[\widetilde{X}_{1}^{2}]$ . We have
Then, integrating by parts, and calling L(x) the primitive of $2x\overline{F}_{1}(x)$ satisfying $L(\!+\!\infty) = 0$ , it remains to calculate
Moreover,
We can consider the Taylor expansions of $(1+\varepsilon)^{-1}$ , $(1+\varepsilon)^{-2}$ , and ${\mathrm{e}}^{-\varepsilon x}$ . Hence, it follows from straightforward computations that the contribution to $\varepsilon p_{1}(s)$ is, eventually, given by $\varepsilon(\!-\!{s^{4}}/{6} + \mathrm{l.o.t.}(s))$ .
In the case of $\mathbb{E}[\widetilde{X}_{2}^{2}]$ the contribution to $\varepsilon p_{1}(s)$ is only given by
and calculations analogous to the previous ones lead to $\varepsilon(\!-\!{s^{4}}/{12} + \mathrm{l.o.t.}(s))$ , which, in $G(\varepsilon,s)$ , becomes $\varepsilon({s^{4}}/{12} + \mathrm{l.o.t.}(s))$ .
Next, we consider $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}]$ and $\mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ . Then,
where $K(x,y) = \mathrm{Prob}(\widetilde{X}_{3} \leq x,\,\widetilde{X}_{1} \leq y) = 1 - \mathrm{Prob}(\widetilde{X}_{3} > x) - \mathrm{Prob}(\widetilde{X}_{1} > y) + \mathrm{Prob}(\widetilde{X}_{3} > x,\,\widetilde{X}_{1} > y)$ . Therefore, $({\partial^{2}}/{\partial x\partial y})K(x,y) = ({\partial^{2}}/{\partial x\partial y})H(x,y)$ with
$x,y>0$ , and $\overline{F}_{2}(s-x-y)<1$ when $x+y<s$ . Hence, we have to calculate $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}]$ on the triangle $T=\{0<x<s,0<y<s-x\}$ , i.e.
Then, after simple calculations we find
By the same arguments, in estimating $\mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ we have to compute
where $R(x,z) = \overline{F}_{3}(x)\overline{F}_{2}(z)\overline{F}_{1}(s-x-z)$ . Setting $z=s-x-y$ and $y=s-x-z$ , we obtain
and the difference
Hence, it follows that the contribution to $\varepsilon p_{1}(s)$ of higher order in s coming from $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}] - \mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ is given by $\int_{0}^{s}xf_{3}(x) \,{\mathrm{d}} x\int_{s-x}^{+\infty}yf_{1}(y) \,{\mathrm{d}} y$ , which becomes, after some simple steps, $\varepsilon(\!-\!{s^{4}}/{12} + \mathrm{l.o.t.}(s))$ . In conclusion, $\varepsilon p_{1}(s) = \varepsilon(\!-\!{s^{4}}/{6} + \mathrm{l.o.t.}(s))$ . Hence, no matter how high s is and how accordingly small $\varepsilon$ is, $G(\varepsilon,s) < 0$ and Theorem 1 cannot be applied.
Full proof of Proposition 2. First of all, $\varepsilon$ can be chosen so small as to ensure $f_{1}(x) = F_{1}^{\prime}(x) > 0$ for $x\geq 0$ . Then, straightforward computations show that, for k sufficiently large and $\varepsilon$ sufficiently small, $\mathrm{Var}[X_{1}] - \mathrm{Var}[X_{2}] > \varepsilon/{257}$ , implying that, in terms of Shapley values, $\phi_{1}(\nu^{\varepsilon}_0) > \phi_{2}(\nu^{\varepsilon}_0) = \phi_{3}(\nu^{\varepsilon}_0)$ . On the other hand,
Adopting the same notation as the previous proposition, we have to estimate the sign of $G(\varepsilon,s)$ when $s>0$ is sufficiently high and $\varepsilon > 0$ is, accordingly, sufficiently small, knowing that $G(0,s) = 0$ for all $s\geq 0$ and $G(\varepsilon,0) > 0$ for sufficiently small values of $\varepsilon > 0$ . By steps analogous to those in Proposition 1, we calculate, first, the contribution to $\varepsilon p_{1}(s)$ given by $\mathbb{E}[\widetilde{X}_{1}^{2}]$ . Hence, as we have seen, we are led to compute
where L(x) is the primitive of $2x\overline{F}_{1}(x)$ satisfying $L(\!+\!\infty) = 0$ . In particular, applying the mean value theorem, we have to estimate
where $\overline{z}\in(0,s)$ . This implies that, denoting by $\varepsilon a(s)$ the contribution of $\mathbb{E}[\widetilde{X}_{1}^{2}]$ to $\varepsilon p_{1}(s)$ , a(s) tends to zero when $s\rightarrow +\infty$ not faster than $s^{-1}$ . In order to evaluate its sign, let us apply the mean value theorem the other way around,
so that, when s is sufficiently high, $a(s) < 0$ . Then, we can calculate $\mathbb{E}[\widetilde{X}_{1}]$ . By the same arguments, we find that in the expression $\mathbb{E}[\widetilde{X}_{1}]$ , $\varepsilon$ is multiplied by a quantity b(s) tending to zero when $s\rightarrow +\infty$ faster than a(s) (in fact, as $s^{-2}$ ). Now, let us estimate the contribution to $\varepsilon p_{1}(s)$ given by $\mathbb{E}[\widetilde{X}_{2}]$ and $\mathbb{E}[\widetilde{X}_{2}^{2}]$ . Consider, for example, the latter. Following the above steps, we are led to calculate
where L(x) is the primitive of $2x\overline{F}_{2}(x)$ satisfying $L(\!+\!\infty) = 0$ . By applying the mean value theorem and changing the variable x to $s-x$ , we find
However, $(F_{1} \ast F_{3})(s) = \int_{0}^{s}F_{3}(s-z)f_{1}(z)\,{\mathrm{d}} z = F_{3}(\psi(s))\int_{0}^{s}f_{1}(z)\,{\mathrm{d}} z = F_{3}(\psi(s))F_{1}(s)$ . It follows that the contribution to $\varepsilon p_{1}(s)$ is given by a quantity $\varepsilon c(s)$ , where c(s) tends to zero when $s\rightarrow +\infty $ as $s^{-3}$ . Finally, as far as $\mathbb{E}[\widetilde{X}_{1}\widetilde{X}_{3}]$ and $\mathbb{E}[\widetilde{X}_{2}\widetilde{X}_{3}]$ are concerned, we can repeat the arguments of Proposition 1, so that the only contribution to $p_{1}(s)$ tending to zero as $s^{-1}$ is given by $\int_{0}^{s}xf_{3}(x)\,{\mathrm{d}} x\int_{s-x}^{+\infty}yf_{1}(y)\,{\mathrm{d}} y$ , yielding, in particular,
Hence, the integrand function tends to zero not faster than $({1-{5}/{4h}})/{(1+hs)^{2}}$ , which can be written as
Then, when s is sufficiently high, the above integral is smaller than
where b and c are suitable positive numbers.
It follows that, for s sufficiently high and $\varepsilon$ , accordingly, sufficiently small, $G(\varepsilon,s) < 0$ , so that, since $G(\varepsilon,0) > 0$ , there exist pairs $(\overline{\varepsilon},\overline{s})$ satisfying $G(\overline{\varepsilon},\overline{s}) = 0$ . Therefore, Theorem 1 can be applied and the inverted majorization occurs.
Acknowledgements
The authors would like to thank the Editor and the two Referees for their careful reading and their suggestions which we think have improved the quality of the paper. Marcello Galeotti is a member of GNAMPA (Group of Analysis and Probability), INDAM (Italian Institute of High Mathematics).
Funding information
There are no funding bodies to thank relating to the creation of this article.
Competing interests
There were no competing interests to declare which arose during the preparation or publication process of this article.