1. Introduction
The insurance industry is characterized by an inverted production cycle in which the premium for a new contract has to be determined before observing the associated loss. Pricing actuaries estimate the technical price of a cover by modeling historical loss data. In non-life insurance, the total loss L on a new contract is often estimated via a frequency-severity decomposition (Denuit et al., 2007; Frees and Valdez, 2008), which models the expected loss as
$$E(L) = E(N) \cdot E(Y),$$
assuming independence between the number of occurred claims N and their severity Y. Risk-based premiums then follow by taking risk characteristics into account when building predictive models for the historical claim frequency and severity data. Pricing requires a data set with claim counts registered at the level of individual contracts and ultimate claim sizes at the level of individual claims.
Figure 1 visualizes the development process of a single claim. This process starts with the occurrence of an insured event, which is reported to the insurer after some delay. If the claim is eligible for compensation under the insurance policy, a number of payments follow. Finally, the claim settles and we observe its total cost. Depending on the insurer and line of business other relevant events (e.g., the involvement of a lawyer) will be registered during the lifetime of a claim. For claims that settled before the moment of evaluation, the reserving actuary observes the full development process and thus the total claim size. However, the development process is only partially observed for reported, but not yet settled claims. For claims that occurred in the past but are not yet reported, the entire development process is missing in the insurer’s database. Such reporting and settlement delays are particularly relevant in long-tailed business lines (e.g., workers’ compensation insurance or reinsurance contracts) where claim settlement can take several years.
Due to the delays present in the claim development process, the pricing actuary only observes the number of reported claims instead of the total number of claims that occurred in past exposure periods. Similarly, the amounts already paid for open claims underestimate actual, ultimate losses, since future payments are missing. As a result of the incomplete claim history, pricing requires a two-step approach. First, claim counts and sizes are estimated per policy and per claim, respectively, based on the available claim history, that is
$$\hat{N} = E(N \mid \mathcal{F}_{\tau}) \quad \text{and} \quad \hat{Y} = E(Y \mid \mathcal{F}_{\tau}),$$
where $\mathcal{F}_{\tau}$ denotes the information available at the evaluation or observation date $\tau$. In a second step, these estimates, so-called best estimates, are treated as actual observations when the pricing actuary constructs predictive models for claim frequency and severity as a function of risk characteristics.
In practice, pricing actuaries may ignore the first step of this pricing procedure and only consider reported and settled claims. This approach is feasible when reporting and settlement delays are small and limited bias is introduced by ignoring the censoring present in the data. Alternatively, the best estimates in the first step of the pricing procedure can be constructed in several ways. Claim handlers may estimate the number of unreported claims per policy and the future claim costs based on their expert opinion. Combined with the amount already paid, the estimate of the future cost on a claim then constitutes the expert’s best estimate of the ultimate claim size, also called the incurred claim amount. As a data-driven alternative, methods from non-life reserving can be adapted to estimate the total, ultimate cost of individual claims as well as the number of occurred but not reported claims. The literature on non-life reserving can be divided into two categories: aggregate and individual reserving models. Aggregate reserving models (e.g., the chain ladder method; Mack, 1993, 1999) ignore individual claim characteristics and model a single claim development process for all claims that occur within an accident year. Best estimates for pricing are then obtained by applying this (aggregate) development pattern to individual, reported claims. Constructing these best estimates from an aggregate reserving model has two important disadvantages. First, most aggregate reserving models do not distinguish between open and settled claims. Consequently, the development pattern is estimated from a mix of both open and closed claims and is then applied to both types of claims. While the best estimate of a settled claim will differ from its true, observed cost, it should replace the true, observed value to be consistent with the aggregate reserving model.
Second, ignoring risk characteristics when constructing best estimates diminishes the heterogeneity that is present in the claim severity data analyzed by pricing actuaries. Following Norberg (1993, 1999), a literature on individual reserving models has emerged, where best estimates are constructed at the level of individual claims. We see most potential in a stream of individual reserving models in discrete time adopting techniques from the insurance pricing literature. Larsen (2007), Wüthrich (2018), Crevecoeur et al. (2022) and Delong et al. (2022) focus on generalized linear models (GLMs), regression trees, gradient boosting models and neural networks for claims reserving, respectively. In these approaches, the inclusion of claim-specific covariates tailors the best estimates to the characteristics of the individual claims. Consequently, the completed data sets will more accurately reflect the heterogeneity in the claim data. To the best of our knowledge, no data-driven methods have been published for estimating the number of unreported claims at the level of individual policies.
The insurance pricing literature focuses mainly on the second step of the pricing procedure, where a statistical model is fitted to the best estimates. Although actual observations and best estimates follow different statistical distributions, the frequency-severity decomposition still holds, that is
$$E(L) = E(N) \cdot E(Y) = E\big(E(N \mid \mathcal{F}_{\tau})\big) \cdot E\big(E(Y \mid \mathcal{F}_{\tau})\big)$$
as a result of the tower rule. This property is essential, since it enables an unbiased estimate of the loss from predictive models calibrated on the best estimate claims data. However, many other properties of the loss (e.g., the variance) are not preserved when treating best estimates as actual observations. In particular, severity is underestimated for policies covering losses above a (known) deductible D. This is a consequence of Jensen’s inequality (Jensen, 1906), which states that $E[\varphi(Y) \mid \mathcal{F}_{\tau}] \geq \varphi(E[Y \mid \mathcal{F}_{\tau}])$ for any random variable Y and convex function $\varphi(\cdot)$. Indeed, applied to an insurance contract with deductible D, we obtain
$$E\big[(Y - D)_+ \mid \mathcal{F}_{\tau}\big] \geq \big(E[Y \mid \mathcal{F}_{\tau}] - D\big)_+$$
for the convex function $\varphi\,{:}\, Y \to (Y - D)_+$ . This is especially relevant in excess-of-loss reinsurance pricing, where deductibles are high and long settlement delays result in many open claims. Moreover, the risk characteristics selected when modeling frequency and severity data and their calibrated effects should rather be interpreted as effects on best estimates instead of effects on actual observations. These effects are likely to be affected by the method used for constructing these best estimates.
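The size of this effect can be illustrated numerically. The sketch below is a toy illustration and not part of the case studies: it assumes a lognormal ultimate claim size for a single open claim, with the parameters `mu`, `sigma` and the deductible `D` invented for the example, and compares the expected excess loss with the excess of the best estimate.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# Hypothetical open claim: the ultimate size Y is lognormal given the current information.
mu, sigma = 11.0, 1.0   # assumed parameters, chosen for illustration only
D = 100_000.0           # assumed deductible

y = rng.lognormal(mean=mu, sigma=sigma, size=200_000)

best_estimate = y.mean()                               # approximates E(Y | F_tau)
excess_of_best_estimate = max(best_estimate - D, 0.0)  # (E(Y | F_tau) - D)_+
expected_excess = np.maximum(y - D, 0.0).mean()        # approximates E[(Y - D)_+ | F_tau]

print(f"excess of best estimate: {excess_of_best_estimate:,.0f}")
print(f"expected excess loss:    {expected_excess:,.0f}")
```

With these assumed parameters the best estimate lies close to the deductible, so pricing on the best estimate would assign almost no cost to the layer above D, although the claim can still exceed the deductible with sizeable probability.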
Our paper contributes by proposing a novel approach for non-life insurance pricing that resolves the inconsistencies between actual observations and best estimates in traditional pricing. Moreover, by modeling the occurrence and development process of claims, our proposed model is also readily available for reserving. Hence, we bridge two key tasks of the non-life actuary that are typically studied in silos. We demonstrate our methodology with a case study on pricing and reserving for both a traditional insurance and a reinsurance portfolio. This is one of the first papers applying techniques from individual reserving to a reinsurance data set. The reinsurance industry is characterized by low claim frequencies and large claim severities (Albrecher et al., 2017), which demands special attention when building predictive models for the development of individual claims.
This paper is organized as follows. Section 2 introduces our proposed model for the occurrence and development of claims at the level of an individual non-life insurance contract. Section 3 illustrates how this model can be used for pricing and reserving with non-life insurance policies. Section 4 demonstrates this methodology on two case studies. Section 5 concludes the paper.
2. An occurrence and development model for non-life insurance claims
We present a discrete time occurrence and development model (ODM). This ODM captures the occurrence and the reporting of claims at the level of an individual insurance contract, as explained in Section 2.1. Section 2.2 details how the ODM structures the development of individual claims after reporting. Together these two building blocks drive the complete development of all claims occurring on a portfolio of insurance contracts. In the remainder of this paper, we implicitly assume a yearly, discrete time grid. However, our approach extends directly to quarterly, monthly or daily time grids.
2.1. Modeling the occurrence and reporting of non-life claims
We consider a portfolio with historical claims data registered on n policies. Each of these policies covers the claims occurring during a single year of exposure. Let $N_i$ denote the claim frequency on policy i, that is, the total number of claims that occur in the occurrence year, $\texttt{occ(i)}$, covered by this policy. Due to a possible delay in reporting (see Figure 1), these counts $N_i$ are not directly observable. Instead we observe counts $N_{ij}$, which register the number of claims from policy i that are reported in the $(j-1)$-th year after occurrence, that is, in year $\texttt{occ(i)} + j - 1$. At the observation date $\tau$, the set of observed claims consists of $\{N_{ij} \mid i = 1, \ldots, n, j = 1, \ldots, \tau_i \}$ where $\tau_i \,{:\!=}\,\min(d, \tau - \texttt{occ}(i) + 1)$ is the number of observed reporting years for policy i with d denoting the maximal reporting delay. The set of not (yet) reported claims consists of $\{N_{ij} \mid i = 1, \ldots, n, j = \tau_i + 1, \ldots, d \}$. Following Jewell (1990) and Norberg (1993), we propose a model to predict these unreported claim counts, based on the following assumptions:
- (F1) Claims are reported with a maximal delay of d years. This maximal delay d is at most the length of the observation window $\tau$ of the portfolio, that is, $d \leq \tau$.
- (F2) Conditional on the observed policy covariates $\boldsymbol{x}_i$, claim counts $N_i$ (with $i=1,\ldots,n$) are independent and follow a Poisson distribution with intensity $\lambda(\boldsymbol{x}_i)$.
- (F3) Conditional on the total number of claims $N_i$ on policy i and its covariates $\boldsymbol{x}_i$, the reported claim counts $N_{ij}$ are multinomially distributed with reporting probabilities $p_{j}(\boldsymbol{x}_i)$, where $j=1,\ldots,d$.
Assumption (F1) limits the reporting delay and allows us to retrieve the total claim frequency on policy i as
$$N_i = \sum_{j=1}^{d} N_{ij}.$$
The independence assumptions in (F2–F3) are similar to those in classical insurance pricing but might be violated in case of high impact events, for example, extreme weather with claims occurring in clusters, requiring more advanced modeling techniques to capture dependencies. As a result of the thinning property for Poisson distributions, assumptions (F2–F3) imply
$$N_{ij} \sim \text{Poisson}\big(\lambda(\boldsymbol{x}_i) \cdot p_{j}(\boldsymbol{x}_i)\big), \quad \text{independently for } j = 1, \ldots, d.$$
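The log-likelihood of the observed claim counts then follows as
$$\ell(\boldsymbol{\lambda}, \boldsymbol{p}) = \sum_{i=1}^{n} \sum_{j=1}^{\tau_i} \Big[ N_{ij} \log\big(\lambda(\boldsymbol{x}_i) \cdot p_{j}(\boldsymbol{x}_i)\big) - \lambda(\boldsymbol{x}_i) \cdot p_{j}(\boldsymbol{x}_i) - \log(N_{ij}!) \Big], \tag{2.1}$$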
where $\boldsymbol{\lambda}$ and $\boldsymbol{p}$ are shorthand notation for the parameters used in the Poisson intensities and reporting probabilities. Extending the work of Verbelen et al. (2022) designed for claims reserving, we now specify the above likelihood at the level of individual policies, with a tailored specification for the reporting of claims via the reporting probabilities in $\boldsymbol{p}$. The joint estimation of $\boldsymbol{\lambda}$ and $\boldsymbol{p}$ in (2.1) is complicated by the presence of the interaction term $\lambda(\boldsymbol{x}_i) \cdot p_{j}(\boldsymbol{x}_i)$. Using an EM algorithm (Dempster et al., 1977), the occurrence and reporting parameters in (2.1) can be decoupled and estimated iteratively. The k-th expectation (E) step then imputes the hidden observations $\{N_{ij} \mid i \leq n, \tau_i < j \leq d\}$ as follows:
$$N^{(k-1)}_{ij} = \lambda^{(k-1)}(\boldsymbol{x}_i) \cdot p^{(k-1)}_{j}(\boldsymbol{x}_i) \quad \text{for } \tau_i < j \leq d,$$
where the superscript $(k-1)$ refers to the parameter estimates obtained in the previous iteration of the EM algorithm. The k-th maximization (M) step then maximizes the completed log-likelihood
where $N^{(k-1)}_i = \sum_{j=1}^d N^{(k-1)}_{ij}$ . The likelihood now splits into an occurrence and reporting contribution. For the occurrence process, we maximize
This likelihood is proportional to the Poisson likelihood that is typically optimized in the claim frequency models used in insurance pricing. The partially observed claim counts $N_i$ are replaced by counts $N^{(k-1)}_i$ , adjusted for unreported claims. For the reporting process, we maximize
The estimation of the reporting probabilities $p_{ij}:=p^{(k)}_{j}(\boldsymbol{x}_i)$ in this multinomial likelihood is complicated by the sum-to-one restriction on the reporting probabilities that must hold for each policy i. Following Kalbfleisch and Lawless (1991), we overcome the sum-to-one restriction by projecting the d probabilities $(p_{ij})_{j=1,\ldots,d}$ into $d-1$ probabilities $(q_{ij})_{j=1,\ldots,d-1}$ as follows:
The q probabilities take the form of inverted, discrete time hazard rates from which the vector of probabilities $(p_{ij})_{j=1,\ldots,d}$ can be retrieved as
Combining (2.3) with (2.2) and changing the order of summation, the likelihood for the reporting process becomes
This likelihood is a sum of binomial likelihood contributions and can be optimized with standard statistical modeling techniques.
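To make the alternation between the E-step and the M-step concrete, the sketch below implements a deliberately simplified version of the algorithm in which neither $\lambda$ nor $p_j$ depends on covariates. This simplification is an assumption made for illustration and is not the specification used in the case studies. Up to additive constants and using $\sum_{j=1}^{d} p_j = 1$, the completed log-likelihood then separates into $\sum_{i} [N^{(k-1)}_i \log \lambda - \lambda]$ and $\sum_{i}\sum_{j} N^{(k-1)}_{ij} \log p_j$, so both M-step updates are available in closed form. The function name `em_occurrence_reporting` and the simulated data are hypothetical.

```python
import numpy as np

def em_occurrence_reporting(counts, tau_i, n_iter=200):
    """EM for a covariate-free version of the occurrence and reporting model.

    counts : (n, d) array; counts[i, j] is the number of claims of policy i reported
             with a delay of j years (0-based). Cells with j >= tau_i[i] are treated
             as unobserved and their content is ignored.
    tau_i  : (n,) array with the number of observed reporting years per policy.
    """
    n, d = counts.shape
    observed = np.arange(d)[None, :] < tau_i[:, None]
    lam = max(np.where(observed, counts, 0).sum() / n, 1e-8)  # crude starting value
    p = np.full(d, 1.0 / d)
    for _ in range(n_iter):
        # E-step: impute each hidden cell by its expected value lam * p_j.
        completed = np.where(observed, counts, lam * p[None, :])
        # M-step: closed-form updates, valid because lam and p carry no covariates here.
        lam = completed.sum() / n
        p = completed.sum(axis=0) / completed.sum()
    return lam, p

# Toy data (simulated, not taken from the case studies).
rng = np.random.default_rng(0)
n, d, tau = 20_000, 3, 5
lam_true, p_true = 0.3, np.array([0.7, 0.2, 0.1])
occ = rng.integers(1, tau + 1, size=n)                # occurrence years 1, ..., tau
tau_i = np.minimum(d, tau - occ + 1)                  # observed reporting years per policy
counts = rng.poisson(lam_true * p_true, size=(n, d))  # thinning: independent Poisson cells

lam_hat, p_hat = em_occurrence_reporting(counts, tau_i)
print(lam_hat, p_hat)
```

With covariates, the two M-step updates would be replaced by a Poisson regression for the occurrence part and binomial regressions for the reporting part, while the E-step keeps the same form.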
When applied for pricing, the proposed occurrence and reporting model estimates the expected number of claims $\hat{\lambda}(\boldsymbol{x}_i)$ per policy while correcting for the existence of unreported claims that occurred in the exposure year covered by the policy. When used for reserving, the model estimates the number of claims that will be reported in future years on policy i, that is, $(N_{i, \tau_i+1}, \dots, N_{i, d})$ , as well as their associated reporting delays. Estimating unreported claims at policy level has the advantage that policy specific reserves can be booked for these claims.
2.2. A hierarchical model for the development of reported non-life claims
Insurers track many dynamic claim characteristics (e.g., the amount paid, settlement status, involvement of a lawyer) over the lifetime of a claim. We predict the joint evolution of these dynamic claim characteristics using a hierarchical individual claims reserving model originally proposed in Crevecoeur et al. (2022). This section restates the key features of this model and proposes some extensions. For a more in-depth analysis of the hierarchical reserving model, we refer to our original paper.
When modeling the development of a claim, we will differentiate between the initial state of the claim characteristics as observed in the reporting year of the claim and the updates in later years. This distinction reflects the difference in information that becomes available in the reporting year compared to later years. For example, in the reporting year the claim expert sets the initial incurred amount, which can be quite large, while in later development years small adjustments to this initial incurred are observed. We let the vector $\boldsymbol{I}_k$ structure the initial claim characteristics for claim k as registered at the end of its reporting year, denoted $\texttt{rep(k)}$ . In later years, updated vectors $\boldsymbol{U}_k^j$ (with $j \geq 2$ ) structure the evolution of claim k in the $(j-1)$ -th year since reporting, that is, in year $\texttt{rep(k)} + j - 1$ . The information captured by the vectors $\boldsymbol{I}_k$ and $\boldsymbol{U}_k^j$ is tailored to the portfolio at hand. The case studies in Section 4 illustrate a possible setup in which the joint evolution of the settlement status, the amount paid and the incurred are tracked over the lifetime of a claim. We refer to these chosen characteristics as the layers of the hierarchical reserving model. Let the vector $\mathcal{X}_k$ store the observed development of claim k, that is
$$\mathcal{X}_k = \big(\boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{\tau_k}\big),$$
with $\tau_k = \tau - \texttt{rep(k)} + 1$ the number of observed years since reporting for claim k. Our approach models the development of claim k as recorded in $\mathcal{X}_k$ based on a single assumption:
- (S1) Conditional on static claim covariates $\boldsymbol{x}_k$ available at the reporting of claim k, the development of the claim is independent of the development of the other claims in the portfolio.
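This independence assumption is essential for modeling the development at the level of individual claims. As a result of (S1), we can write the likelihood for a portfolio with m reported claims as
$$\mathcal{L} = \prod_{k=1}^{m} f(\boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{\tau_k} \mid \boldsymbol{x}_k),$$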
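where $f(\boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{\tau_k} \mid \boldsymbol{x}_k)$ is the joint likelihood of the development process observed for claim k. Our hierarchical approach decomposes this joint likelihood over time as well as over the layers (i.e., the respective dimensions) of the vectors $\boldsymbol{I}_k$ and $\boldsymbol{U}_k^j$ by applying the law of conditional probability twice. First, the likelihood is split into chronological order
$$f(\boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{\tau_k} \mid \boldsymbol{x}_k) = f(\boldsymbol{I}_k \mid \boldsymbol{x}_k) \cdot \prod_{j=2}^{\tau_k} f(\boldsymbol{U}_k^j \mid \boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{j-1}, \boldsymbol{x}_k).$$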
By conditioning on past events, we allow the model to use the historical development of a claim (e.g., total amount paid, reserve, settlement status in previous years) when modeling the development in future years. Second, we decompose the likelihood over the layers of $\boldsymbol{I}_k$ and $\boldsymbol{U}_k^j$
where v and w denote the length (i.e., the number of layers) of the initial vector $\boldsymbol{I}_k$ and update vector $\boldsymbol{U}_k^j$ , respectively. Through conditioning on the layered structure, we allow for dependencies in the development of the claim characteristics within a time period. We model this decomposed likelihood by specifying a statistical model per layer, leading to a total of $v+w$ statistical models.
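As a sketch of this decomposition over the layers, write $I_{k,l}$ and $U^j_{k,l}$ for the l-th layer of the initial vector and of the update vector in year j. This component notation is assumed here for illustration. Applying the law of conditional probability across the layers then gives

$$f(\boldsymbol{I}_k \mid \boldsymbol{x}_k) = \prod_{l=1}^{v} f(I_{k,l} \mid I_{k,1}, \ldots, I_{k,l-1}, \boldsymbol{x}_k),$$

$$f(\boldsymbol{U}_k^j \mid \boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{j-1}, \boldsymbol{x}_k) = \prod_{l=1}^{w} f(U_{k,l}^j \mid U_{k,1}^j, \ldots, U_{k,l-1}^j, \boldsymbol{I}_k, \boldsymbol{U}_k^2, \ldots, \boldsymbol{U}_k^{j-1}, \boldsymbol{x}_k).$$

Each factor on the right-hand side corresponds to one layer-specific model, conditioned on the layers that precede it within the same year and on the full claim history from earlier years.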
When applied to pricing, we use the proposed hierarchical development model to estimate the total severity of claims. When used for reserving purposes, the model allows us to estimate the future cost of reported as well as not yet reported claims, while accounting for their static characteristics registered at reporting as well as their observed development so far.
3. Pricing and reserving with the ODM
3.1. Non-life pricing
Following the frequency-severity decomposition discussed in Section 1, we estimate the pure premium $\pi_i$ for policy i as the product of its expected claim frequency, $E(N_i)$, and expected claim severity, $E(Y_i)$, that is
$$\pi_i = E(N_i) \cdot E(Y_i).$$
Risk-based claim frequency estimates follow immediately from the occurrence and reporting model proposed in Section 2.1. In contrast with traditional claim frequency models, our approach adjusts the estimated claim frequencies for the presence of unreported claims. We consider two strategies for modeling the distribution of the claim severity given a set of policy covariates $\boldsymbol{x}_i$ with our ODM. The first approach simulates new claims for a given policy from ground up, whereas the second approach simulates the future development of already reported open claims.
Simulating new claims. We use the ODM calibrated on historical claims data to simulate the ultimate cost of a large number of new claims occurring on a given policy. Algorithm 1 outlines the procedure to simulate the occurrence, reporting and development of a new claim on a policy with characteristics $\boldsymbol{x}$.
It is essential that the paid amount or the incurred is tracked within $\boldsymbol{I}$ and $\boldsymbol{U}^{j}$ , such that the claim’s ultimate cost at settlement can be computed as a function of the simulated development process. Using these simulated paths, we obtain an empirical distribution of a claim’s ultimate cost from which the expected severity follows.
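The simulation of new claims can be sketched in a few lines. The code below is a hedged toy version and not Algorithm 1 itself: the function `simulate_new_claim`, its constant probabilities and the gamma payment sizes are hypothetical stand-ins for fitted layer models, they ignore covariates and the claim history, and the cap `max_dev_years` is an extra assumption that forces the loop to stop.

```python
import numpy as np

def simulate_new_claim(rng, report_probs, p_settle, p_payment,
                       pay_shape, pay_scale, max_dev_years=20):
    """Simulate one claim from reporting to settlement; return (reporting delay, ultimate cost)."""
    # Reporting delay in years (0-based), drawn from the reporting probabilities.
    delay = rng.choice(len(report_probs), p=report_probs)
    paid, settled, dev_year = 0.0, False, 0
    while not settled and dev_year < max_dev_years:
        dev_year += 1
        settled = rng.random() < p_settle            # layer: settlement indicator
        if rng.random() < p_payment:                 # layer: payment indicator
            paid += rng.gamma(pay_shape, pay_scale)  # layer: payment size
    return delay, paid

rng = np.random.default_rng(42)
sims = [simulate_new_claim(rng, [0.9, 0.1], 0.5, 0.6, 2.0, 1_000.0)
        for _ in range(10_000)]
ultimate_costs = np.array([cost for _, cost in sims])

# The empirical distribution of the simulated ultimate costs yields the expected severity.
expected_severity = ultimate_costs.mean()
print(round(expected_severity, 2))
```

In the full model, each of the three draws inside the loop would come from the layer-specific predictive model, conditioned on the policy covariates and on the simulated development so far.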
Simulating future paths for open claims. In this alternative modeling strategy, we first simulate for each open claim a large number of future paths, say $n_{\texttt{path}}$. Each simulated path p of an open claim k corresponds to a scenario for the ultimate claim size $Y_{k, p}$. Combining these simulated paths, we obtain a distribution of the ultimate size per claim. In a second step, we fit a severity distribution by assigning a weight of one to actual observations from closed claims and a weight of $\frac{1}{n_{\texttt{path}}}$ to the ultimate claim sizes corresponding to the simulated paths for the open claims, that is, we maximize the following log-likelihood:
$$\sum_{k} \Big[ \texttt{settled}_k \cdot \log f_Y(Y_k) + (1 - \texttt{settled}_k) \cdot \frac{1}{n_{\texttt{path}}} \sum_{p=1}^{n_{\texttt{path}}} \log f_Y(Y_{k,p}) \Big], \tag{3.1}$$
where $f_Y(.)$ is the proposed parametric severity distribution and $\texttt{settled}_k$ is one when claim k settles before the evaluation date $\tau$ and zero otherwise. Contract-specific covariates can be included in the severity distribution $f_Y(.)$. This likelihood includes all possible paths for open claims, whereas traditional severity models average these paths to obtain a best estimate of the ultimate cost of an open claim. Consequently, these traditional methods maximize
$$\sum_{k} \Big[ \texttt{settled}_k \cdot \log f_Y(Y_k) + (1 - \texttt{settled}_k) \cdot \log f_Y\Big( \frac{1}{n_{\texttt{path}}} \sum_{p=1}^{n_{\texttt{path}}} Y_{k,p} \Big) \Big].$$
Our proposed approach for severity modeling (see (3.1)) stays close to traditional pricing practice but resolves the contradiction between best estimates and actual observations that is present in traditional pricing. We refer to Albrecher and Bladt (2022) for the development of a more general framework to incorporate datapoint uncertainty (e.g., severity for open claims) into parametric estimation procedures.
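The difference between the two fitting strategies can be illustrated with a small experiment. The sketch below assumes a lognormal severity distribution without covariates, for which the weighted maximum likelihood estimates are available in closed form. The data are simulated and the helper `fit_lognormal_weighted` is hypothetical.

```python
import numpy as np

def fit_lognormal_weighted(y, w):
    """Weighted maximum likelihood for a lognormal severity: closed-form mu and sigma."""
    log_y = np.log(y)
    mu = np.average(log_y, weights=w)
    sigma = np.sqrt(np.average((log_y - mu) ** 2, weights=w))
    return mu, sigma

rng = np.random.default_rng(7)
n_closed, n_open, n_path = 500, 500, 200

# Toy data: observed sizes for closed claims, simulated ultimate sizes for open claims.
y_closed = rng.lognormal(8.0, 1.0, size=n_closed)
claim_level = rng.normal(8.0, 0.5, size=(n_open, 1))  # assumed claim-specific log-level
y_paths = np.exp(claim_level + rng.normal(0.0, 1.0, size=(n_open, n_path)))

# Path-weighted fit: weight 1 for closed claims, 1 / n_path for every simulated path.
y_all = np.concatenate([y_closed, y_paths.ravel()])
w_all = np.concatenate([np.ones(n_closed), np.full(n_open * n_path, 1.0 / n_path)])
mu_paths, sigma_paths = fit_lognormal_weighted(y_all, w_all)

# Traditional fit: each open claim is replaced by the average of its paths (best estimate).
y_best = np.concatenate([y_closed, y_paths.mean(axis=1)])
mu_best, sigma_best = fit_lognormal_weighted(y_best, np.ones(n_closed + n_open))

print(f"path-weighted fit: mu={mu_paths:.3f}, sigma={sigma_paths:.3f}")
print(f"best-estimate fit: mu={mu_best:.3f}, sigma={sigma_best:.3f}")
```

In this toy setup the path-weighted fit retains the dispersion within open claims, whereas averaging the paths first removes it and leads to a smaller fitted `sigma`.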
3.2. Non-life reserving
Reserving models estimate the aggregated future cost of unsettled, open claims that occurred in past exposure periods. We split the total claims reserve into a reserve for incurred but not (yet) reported claims, that is, the IBNR reserve, and a reserve for reported but not (yet) settled claims, that is, the RBNS reserve. The total reserve, denoted $\mathcal{R}$, is the sum of these two reserve contributions, that is
$$\mathcal{R} = \mathcal{R}^{\texttt{IBNR}} + \mathcal{R}^{\texttt{RBNS}}.$$
We compute the IBNR reserve by aggregating (over all policies i) the expected severity for occurred, yet unreported claims, that is
Similar to the frequency-severity decomposition in pricing, this formula assumes independence between the number of claims and the claim severity. Estimates for the number of reported claims per year, $N_{ij}$ , follow immediately from the occurrence and reporting model proposed in Section 2.1. Expected claim severity is estimated with the techniques outlined in Section 3.1 for pricing.
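As a minimal sketch of this computation, the code below assumes that the IBNR reserve takes the form $\sum_{i} \sum_{j > \tau_i} E(N_{ij}) \cdot E(Y_i)$ with $E(N_{ij}) = \lambda(\boldsymbol{x}_i) \cdot p_j(\boldsymbol{x}_i)$. The exact form, the function name `ibnr_reserve` and the two-policy inputs are assumptions made for illustration.

```python
import numpy as np

def ibnr_reserve(lam, p, tau_i, expected_severity):
    """IBNR reserve as expected unreported claim counts times expected severity.

    lam               : (n,) expected claim frequency per policy
    p                 : (n, d) reporting probabilities per policy (rows sum to one)
    tau_i             : (n,) number of observed reporting years per policy
    expected_severity : (n,) expected ultimate claim size per policy
    """
    n, d = p.shape
    # Mask of reporting years that are not yet observed (0-based index j >= tau_i).
    unreported = np.arange(d)[None, :] >= tau_i[:, None]
    expected_unreported = (lam[:, None] * p * unreported).sum(axis=1)
    return float((expected_unreported * expected_severity).sum())

lam = np.array([0.10, 0.20])
p = np.array([[0.8, 0.2], [0.8, 0.2]])
tau_i = np.array([2, 1])  # policy 1 is fully observed, policy 2 misses its last reporting year
severity = np.array([1_000.0, 1_500.0])

reserve = ibnr_reserve(lam, p, tau_i, severity)
print(reserve)
```

Only the second policy contributes here: its expected number of unreported claims is 0.2 times 0.2, which is multiplied by its expected severity of 1,500.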
For the RBNS reserve, we compute the future cost of all reported but not yet settled claims. Hereto, we use the hierarchical reserving model outlined in Section 2.2 and simulate the joint evolution of all open claims. As a result of independence assumption (S1), simulating this joint evolution reduces to independently simulating a single path for each open claim. We aggregate the simulated future costs across all claims to obtain an estimate of the total RBNS reserve. A distribution and the expected value of the RBNS reserve are then obtained by repeating these steps.
4. Case studies on pricing and reserving with the ODM
4.1. An insurance portfolio
We first illustrate the ODM on a European motor third party liability (MTPL) insurance data set. The portfolio consists of 1024805 policies active between January 1, 2007 and December 31, 2016, resulting in 78627 reported claims. Policies are restricted to a single calendar year. When policyholders were insured in multiple calendar years, the insured period is broken down by calendar year into multiple records. Table 1 lists the available policy and claim covariates.
4.1.1. Occurrence and reporting of claims
In our data set, $96.2\%$ of the observed claims were reported in the year of occurrence and $3.6\%$ in the next year. Only $0.2\%$ of the claims have a reporting delay of more than one calendar year. In this analysis, we remove claims with a delay of more than 1 year to put focus on the bulk of claims reported shortly after occurrence.
We follow the EM algorithm outlined in Section 2.1 and model in the M-step the occurrence of claims via
and the probability of reporting the claim in its year of occurrence is specified as
Figure 2(a), (b), (c) and (d) shows the fitted parameters for the occurrence model specifications in (4.1). Expected claim frequency is higher for young drivers and drivers occupying a higher level in the bonus malus scale. Internal changes at the insurer cause an administrative increase in the number of registered claims per policy after 2010. This change is captured by the calendar year effect. For fuel, we see that drivers using gasoline have fewer claims. The missing level here essentially corresponds to other motorized vehicles, such as mopeds, being included in the portfolio. Their estimated effect indicates a lower claim risk compared to cars. Figure 2(e), (f) and (g) displays the parameters fitted for (4.2), the probability of reporting the claim in its year of occurrence. Only bonus malus shows a significant effect with longer reporting delays for policyholders occupying higher bonus malus levels. As a consequence, when directly modeling the claim frequency from the observed claim counts in this MTPL data set, the pricing actuary will underestimate the claim frequency of drivers occupying high bonus malus levels.
4.1.2. Hierarchical claim development model
The layers. For each reported claim, the data set tracks the evolution of its settlement status, the amount paid and the amount incurred per observation year since reporting. We use the hierarchical claim development model discussed in Section 2.2 to structure the joint evolution of these claim characteristics. Figure 3 sketches its layered structure, tailored to the MTPL insurance data set. A four-layered specification for $\boldsymbol{I}_k$ keeps track of the claim characteristics in the year of reporting. Layer 1 tracks the settlement status of the claim, which is then used as input when modeling whether a payment takes place (layer 2) and (if so) the size of that payment (layer 3). When a claim does not settle in the year of reporting, layer 4 registers the initial reserve set by the claim expert. Beyond the year of reporting, a 7-dimensional $\boldsymbol{U}_k^j$ structures the development of claim k in observation year $j-1$ (with $j\geq 2$) since reporting. Hereby, the meaning of layers 1–3 does not change. Following a payment, the claim-specific reserve is automatically reduced with the paid amount and upon settlement the incurred is put equal to the paid amount. These are deterministic, automated operations that do not require any stochastic modeling. However, layer 4 tracks if a (non-automatic) change in the claim-specific reserve takes place (yes or no). Layer 5 then verifies whether that change is positive (yes or no), layer 6 tracks the nominal increase in the reserve (if any) and layer 7 the percentage decrease (if any). A more detailed, technical description of these layers is provided in Appendix A.
Predictive model and distributional assumption per layer. We model each of the layers with a tree-based gradient boosting machine (GBM) (Friedman, 2001), which additively combines shallow decision trees into one predictor. Three properties make GBMs interesting for automation. First, automatic binning of continuous covariates allows for capturing non-linear effects. Second, interaction effects are automatically detected when using shallow trees with multiple splits. Third, covariate selection is integrated in the calibration process. For each GBM, we tune five parameters using five-fold cross validation on our training data set. Table 2 specifies the distributional assumption for each of the layers in the hierarchical claim development model. We distinguish three types of outcome variables: binary outcomes, percentage changes and numeric outcomes not bounded to the interval (0, 1). We model binary outcomes (e.g., settlement) with a binomial GBM with logit link function, that is, we minimize the loss
where the sum runs over the available observations for the target layer, $y_i$ is the observed 0/1 outcome, $\boldsymbol{z}_i$ denotes the available covariates for observation i and $f^{\texttt{binary}}(\boldsymbol{z}_i)$ is the prediction delivered by the GBM such that $\text{logit}(P(Y_i=1)) = f^{\texttt{binary}}(\boldsymbol{z}_i)$ . Percentage outcomes (e.g., pct_decrease_reserve) are first transformed to the domain $(-\infty, \infty)$ using a logit transform and then modeled using a Gaussian GBM, that is, we minimize the loss
The variance $\sigma^2$ of the Gaussian distribution is estimated as the mean squared error of the residuals, that is
where n is the number of observations. Other numeric outcomes (e.g. increase_paid) are modeled with a gamma distribution by minimizing the loss
The shape parameter k of the gamma distribution is estimated by maximizing the profile likelihood
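The three loss types can be written down compactly. The functions below are a hedged sketch of standard forms and are not taken from the paper: the binomial loss is the negative Bernoulli log-likelihood on the logit scale, the Gaussian loss is the squared error after a logit transform, and the gamma loss is the part of the negative gamma log-likelihood that depends on the prediction, under an assumed log link. All function names are hypothetical.

```python
import numpy as np

def binomial_loss(y, f):
    """Negative Bernoulli log-likelihood, with f the prediction on the logit scale."""
    return float(np.sum(np.logaddexp(0.0, f) - y * f))

def logit(x):
    """Map a percentage in (0, 1) to the real line."""
    return np.log(x / (1.0 - x))

def gaussian_loss(y_pct, f):
    """Squared error between the logit-transformed percentage outcome and the prediction f."""
    return float(np.sum((logit(y_pct) - f) ** 2))

def sigma2_hat(y_pct, f):
    """Variance estimate: mean squared residual on the logit scale."""
    return gaussian_loss(y_pct, f) / len(y_pct)

def gamma_loss(y, f):
    """Prediction-dependent part of the negative gamma log-likelihood, assuming mean = exp(f)."""
    return float(np.sum(y * np.exp(-f) + f))

# Small checks on invented values.
y_bin = np.array([0.0, 1.0])
print(binomial_loss(y_bin, np.zeros(2)))            # two observations at probability 0.5
print(gaussian_loss(np.array([0.5]), np.array([0.0])))
print(gamma_loss(np.array([2.0]), np.array([np.log(2.0)])))
```

For a single observation, the gamma loss above is minimized when the predicted mean `exp(f)` equals the observed value, which is the behavior expected from a likelihood-based loss.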
Covariates in the layer-specific predictive model. We train the layers of the hierarchical reserving model on a data set where each record corresponds to an observation year (since reporting) of a reported claim. Records consist of target variables, static and dynamic covariates. Target variables register the outcome variables of the layers of the hierarchical development model. Static covariates relate to policy characteristics (e.g., the fuel type of a car) or claim characteristics (e.g., the reporting delay of a claim) and remain constant over the claim development process. Dynamic covariates become available during the claim development process and can be expressed as a function of the target variables. We distinguish three classes of dynamic covariates, namely absolute, relative and aggregated dynamic covariates. Absolute dynamic covariates describe claim characteristics in a fixed, predefined development year (e.g., the payment size in development year two since reporting). Once these covariates become available, they remain constant for the remainder of the development process. Relative dynamic covariates describe claim characteristics in the current or previous development year (e.g., payment size in the previous development year). Aggregated dynamic covariates combine the past claim history in a single aggregated outcome (e.g., total amount paid or the current development year). The evolution of these covariates between development years can often be written as a recursive relation. Since we construct one predictive model per layer using data from all development years, we only use relative and aggregated dynamic covariates in our models because these covariates are available independent of the length of the available historical claim information. Figure 4 summarizes the target variables and the covariates included when modeling these targets in the insurance case study. 
For each covariate, the figure indicates the covariate type and the layers in which the covariate is updated.
4.1.3. Reserving
We illustrate the use of the calibrated ODM for reserving. We focus in this illustration on the future development of the RBNS claims. Figure 5 shows 95% confidence intervals for the evolution of the total amount paid and the total incurred amount for all claims that are open at the end of 2011. Plotted dots indicate the actual observed amounts, as registered in the data set. These realized values fall within the confidence intervals, constructed via 200 simulated paths. The insurer’s reserving policy asks claim experts to provide a conservative estimate of the future claim cost. This explains the decrease over time in the amount incurred (i.e., the paid amount + the outstanding reserve) in Figure 5: the estimate of the claim experts is higher than the actual cost of a claim. The grid of plots in Figure 6 extends this evaluation across multiple evaluation dates. For each of the considered evaluation dates, we estimate the total paid and incurred amounts for claims that are reported before the evaluation date. We then compare the actual realizations with the estimates obtained with our ODM. Our model closely follows the actual portfolio evolution, while claim experts overestimate the total claim cost.
4.2. A reinsurance portfolio
Next, we illustrate our method on a Belgian MTPL reinsurance data set registering the detailed development of ${4277}$ large motor insurance claims that occurred between 2000 and 2017. These claims originate from 21 underlying MTPL insurance portfolios, which act as the insured clients or policyholders from the reinsurer’s perspective. We label these portfolios as A, B,...,U. Using our proposed ODM, we develop a pricing as well as a reserving strategy for a portfolio of excess-of-loss reinsurance contracts. In such an excess-of-loss contract, the reinsurer reimburses the cost of an individual claim exceeding a deductible D, up to a limit L (Albrecher et al., Reference Albrecher, Beirlant and Teugels2017).
When it comes to large claims, insurers carefully monitor the evolution of the incurred amount, that is, the expected total cost as set by claim handling experts. For the purpose of pricing excess-of-loss reinsurance contracts, insurers are obliged to report a claim to the reinsurer once its incurred amount exceeds a predefined threshold, the so-called reporting priority. The reporting priority is determined upfront and is specific to both the underlying portfolio and the occurrence year of the claim. Figure 7 visualizes the thresholds (priority, deductible and limit) for the excess-of-loss contract under consideration. In this example, claim 1 (in black) is reported to the reinsurer in year two, when its incurred amount first exceeds the reporting priority. Even when the incurred amount of claim 2 (in red) falls below the priority in year four, the reinsurer keeps receiving yearly updates on this claim. At settlement, the amount incurred and the paid amount are equal and the reinsurer covers the amount of the loss between the deductible and the limit (region III), while the insurer covers the remaining loss amount (regions I, II and IV).
To evaluate model performance, we split the data and train our model on the years 2000–2014. The remaining years 2015–2017 constitute the out-of-time test data set. Before fitting our ODM, we apply three preprocessing steps to the data. First, we remove negative payments. Since the data set contains only a small number of negative payments (accounting for less than 2% of the total amount paid), we believe that the potential gain in model accuracy by incorporating negative payments does not outweigh the implied increase in model complexity. Second, we remove small payments and changes in the incurred of less than 100 euro by combining them with the next large payment or change in the incurred, respectively. Such small changes are frequent, but irrelevant given the large claim sizes in our data. Removing these small changes allows the model to put focus on the important changes in the amount paid and the incurred. Finally, we deflate the payments to the level of 2014 using the inflation curve provided by the reinsurer. After modeling the deflated data, we reinflate the simulated yearly payments to the corresponding payment years when calculating prices and reserves.
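The three preprocessing steps can be sketched as follows for the yearly payments of a single claim. This is a minimal illustration under stated assumptions: the function name `preprocess_payments`, the price index values, and the handling of edge cases (a small payment is deflated with the factor of the year in which it is finally booked, and small payments without a later large payment are dropped) are hypothetical choices, not the reinsurer's procedure.

```python
def preprocess_payments(payments, years, price_index, base_year=2014, threshold=100.0):
    """Clean the yearly payments of one claim and express them at the base-year price level.

    payments    : raw payment per calendar year
    years       : calendar year of each payment
    price_index : dict mapping calendar year -> price index (hypothetical inflation curve)
    """
    cleaned, carry = [], 0.0
    for amount, year in zip(payments, years):
        if amount < 0:
            # Step 1: negative payments are removed.
            cleaned.append((year, 0.0))
        elif amount < threshold:
            # Step 2: small payments are postponed to the next large payment.
            carry += amount
            cleaned.append((year, 0.0))
        else:
            combined = amount + carry
            carry = 0.0
            # Step 3: deflate to the price level of the base year.
            deflated = combined * price_index[base_year] / price_index[year]
            cleaned.append((year, deflated))
    return cleaned
```

Reinflating a simulated payment for calendar year `t` reverses the last step, that is, it multiplies the deflated amount by `price_index[t] / price_index[base_year]`.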
4.2.1. Occurrence and reporting of large claims
We slightly adapt Section 2.1 to our reinsurance setting. A policy, indexed with i, now refers to a reinsurance contract on an insurance portfolio covering a single underwriting year. In our data set, a claim from policy i is reported when its incurred amount exceeds the reporting priority, denoted $\texttt{priority(i)}$ . These priorities are policy-specific, which complicates the comparison of occurrence intensities and reporting delays across policies. Therefore, we choose a new, common priority P shared by all policies. $N_{ij}^{P}$ then denotes the number of claims from policy i for which the incurred first exceeds the priority P in the $(j-1)$ -th year since occurrence, that is, year $\texttt{occ(i)} + j - 1$ . The total number of claims from policy i that exceed the priority P at least once during their development is
Since long reporting delays are common in reinsurance, we set the maximal delay d equal to 15, the length of the observation window in our data set. The specification of a common reporting priority P naturally restricts the available reinsurance data set to the MTPL insurance portfolios for which $\texttt{priority(i)} \leq P$ . Only for these policies do we observe the reported claim counts $N_{ij}^{P}$ . To investigate the effect of priority P on the estimated price of the excess-of-loss contract, we model the occurrence intensity and reporting delay above three priorities: ${750,000}$ , ${1,000,000}$ and ${1,250,000}$ . With these priorities, we observe claims from 9, 15 and 15 portfolios (from the original 21), respectively.
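To make the definition of the counts $N_{ij}^{P}$ concrete, the sketch below derives them from yearly incurred amounts. The record layout (dictionaries with keys `policy`, `occ` and `incurred`) and the function names are hypothetical. The sketch assumes that the incurred history of each claim is fully observed above the common priority P, which is only the case for policies with $\texttt{priority(i)} \leq P$.

```python
from collections import Counter


def first_exceedance_delay(occurrence_year, incurred_by_year, priority):
    """Return j such that the incurred first exceeds the priority in year occ + j - 1.

    Returns None when the incurred never exceeds the priority.
    """
    for year in sorted(incurred_by_year):
        if incurred_by_year[year] > priority:
            return year - occurrence_year + 1
    return None


def count_reported(claims, priority, max_delay=15):
    """Count claims per (policy, j), i.e. an empirical version of N_ij^P."""
    counts = Counter()
    for claim in claims:
        j = first_exceedance_delay(claim["occ"], claim["incurred"], priority)
        if j is not None and 1 <= j <= max_delay:
            counts[(claim["policy"], j)] += 1
    return counts
```

Raising the priority can only delay or remove the first exceedance of a claim, which is why the reporting delay distribution has to be re-estimated for each choice of P.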
Following Section 2.1, we model the claim occurrence process with a Poisson distribution with intensity
where $e_{i}$ is the exposure expressed as the number of vehicles insured by policy i and $\lambda_{\texttt{portfolio(i)}}$ denotes the portfolio-specific claim intensity. We model the reporting probabilities $p_{i, j}$ via their one-to-one connection to the probabilities $q_{i, j}$ introduced in (2.3). The q probabilities are estimated by maximizing the likelihood in (2.4) of a binomial GLM with logit link function and
where $\gamma_j$ is the effect of the reporting year and the $\gamma_{\texttt{portfolio(i)}}$ parameters capture reporting delay variations across portfolios.
Figure 8 visualizes the estimated occurrence intensity and the reporting delay distribution when $P = {750,000}$ , using the data from the 9 portfolios available at this priority. Figure 8(a) shows the claim occurrence intensity per ${100,000}$ insured vehicles for each of these portfolios. We clearly distinguish two regimes in the occurrence intensity: low occurrence intensities ( $2.17{-}2.51$ large claims per ${100,000}$ vehicles) in insurance portfolios $\texttt{A, H, K}$ and $\texttt{O}$ and high occurrence intensities ( $3.18{-}3.54$ large claims per ${100,000}$ vehicles) in insurance portfolios $\texttt{B, I, J, M}$ and $\texttt{S}$ . This split in two regimes could indicate a different share of more exposed vehicles (e.g., buses and trucks) insured in these portfolios.
Figure 8(b) shows the estimated reporting delay distribution per insurance portfolio. The incurred amount is volatile in the first years after the occurrence of a claim, when there is significant uncertainty regarding the final claim amount, for example because the compensation for physical damage has not yet been decided in court or the victim has not yet reached the age of majority. As a result, a portfolio of reinsurance contracts is characterized by long reporting delays between the occurrence of a claim and the moment its incurred amount first exceeds the reporting priority. Moreover, since each insurer follows its own reserving policy, we find considerable differences in reporting delay across portfolios.
The insights revealed in Figure 8 are important for reinsurers when pricing reinsurance contracts issued to these insurance portfolios. Reinsurers can share these insights with their policyholders, that is, the insurers. This enables insurers to benchmark the observed reporting delay for their portfolio to the market and provides incentives to insurers with long reporting delays to put more focus on accurately reserving large claims.
4.2.2. A hierarchical model for the development of large claims after reporting
For each reported claim, our data set tracks the evolution of the settlement status, the amount paid and the amount incurred per year. Since these events in a claim’s development process are clearly dependent (e.g., no payments after settlement, low settlement probability when the outstanding reserve is large), we use the hierarchical model of Section 2.2 to model the joint evolution of these claim characteristics.
The layers. We choose a reporting priority, $P = {750,000}$ , and interpret $\boldsymbol{I}_k$ as the dynamic claim characteristics registered for claim k when its incurred amount first exceeds ${750,000}$ . The top row in Figure 9 visualizes the 3-layer hierarchical structure for $\boldsymbol{I}_k$ , a three-dimensional vector. At reporting, the incurred exceeds the reporting priority of ${750,000}$ . Layer 1 (the first entry in the vector $\boldsymbol{I}_k$ ) captures the excess amount of the incurred above this reporting priority, that is, the difference between the initial incurred amount and this reporting priority. As a result of the data preprocessing step, we only record differences of at least 100 euro. The outcome of this first layer is an input when modeling the amount paid in layer 2 and 3, the second and third entries in $\boldsymbol{I}_k$ . Layer 2 tracks whether a part of the incurred is paid at reporting (yes or no). In case of a payment, layer 3 stores the amount paid at reporting as a percentage of the total incurred. We do not model the settlement status in the year of reporting, because claims never settle immediately at reporting in this reinsurance data set. The bottom row in Figure 9 visualizes the 8-layer hierarchical specification for the update vector $\boldsymbol{U}_k^j$ in the $(j-1)$ -th year since reporting (with $j\geq 2$ ). First, layer 1 registers the settlement status of a claim. Settlement status is used as an input when modeling payments and changes in the incurred. Layer 2 tracks the presence of a payment and layer 3 captures the size of a payment conditional on the presence of a payment. Note that we only take payments above 100 euro into account. Following a payment, we deterministically decrease the claim-specific reserve by the payment size. When the claim settles, the incurred is set equal to the total amount paid. This is a deterministic operation and no modeling is required. 
However, when a claim does not settle, layers 4-8 express the reserve changes. These five layers let our model capture a drop of the reserve to zero, a nominal increase in the reserve or a decrease expressed as a percentage of the outstanding reserve. A more detailed, technical description of the 11 (3+8) layers is available in Appendix A.
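The order in which the eight update layers are evaluated can be sketched as a single simulation step. This is a simplified illustration only: the constant probabilities in `probs` and the callable `draw` are hypothetical stand-ins for the layer-specific predictive models, which in the case study depend on the claim's covariates. Clipping the reserve at zero after a payment is also an assumption of this sketch.

```python
import random


def simulate_update_year(paid, reserve, rng, probs, draw):
    """Simulate one development year of an open claim; returns (settled, paid, reserve)."""
    settled = rng.random() < probs["settlement"]                 # layer 1
    if rng.random() < probs["payment"]:                          # layer 2
        payment = draw("increase_paid")                          # layer 3
        paid += payment
        # Deterministic step: the reserve decreases by the payment size
        # (clipped at zero, an assumption of this sketch).
        reserve = max(reserve - payment, 0.0)
    if settled:
        # Deterministic step: at settlement the incurred equals the total paid.
        return True, paid, 0.0
    if rng.random() < probs["change_reserve"]:                   # layer 4
        if reserve > 0.0 and rng.random() < probs["reserve_is_zero"]:        # layer 5
            reserve = 0.0
        elif reserve == 0.0 or rng.random() < probs["change_reserve_pos"]:   # layer 6
            reserve += draw("increase_reserve")                  # layer 7
        else:
            reserve *= 1.0 - draw("pct_decrease_reserve")        # layer 8
    return False, paid, reserve
```

The incurred amount after the step follows as `paid + reserve`, so it never has to be simulated separately.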
Predictive model and distributional assumption per layer. Similar to the insurance case study, we model each of the layers with a tree-based GBM. Table 3 specifies the distributional assumption per layer. We distinguish three types of outcome variables: binary outcomes, percentage changes and numeric outcomes not bounded to the interval (0, 1). For the binary outcomes and the percentage changes, we follow the distributional assumptions discussed in Section 4.1.2. However, other numeric outcomes (e.g., increase_paid) are in this example left-truncated at 100 because of the removal of small payments and changes in the incurred in the data preprocessing step. Moreover, these outcomes are heavily right skewed given our reinsurance context. Therefore, we first normalize these observations by applying a power transform, that is, we replace the random variable X by $X^p$ for some power p, and then estimate a truncated Gaussian GBM for the normalized outcomes. We minimize the following loss function:
where p is the exponent in the power transform and $\Phi(\cdot \mid \mu, \sigma)$ is the cdf of the Gaussian distribution with mean $\mu$ and standard deviation $\sigma$ . We opt for a two-step calibration approach. First, we minimize (4.3) with respect to $\sigma$ , p and a constant $f^{\texttt{numeric}}(\cdot)$ . Second, we re-estimate $f^{\texttt{numeric}}(\cdot)$ and $\sigma$ using a truncated Gaussian GBM, while keeping the power p fixed.
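For intuition, the contribution of a single observation to a loss of this type can be written as a negative log-density. The sketch below is one standard formulation and not necessarily identical to (4.3): it assumes $p > 0$, a truncation point of 100 on the original scale, and it includes the Jacobian of the power transform, which is required for the likelihood to be comparable across values of p. The function names are hypothetical.

```python
import math


def normal_logpdf(z, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sigma) - 0.5 * ((z - mu) / sigma) ** 2


def normal_sf(z, mu, sigma):
    """Survival function 1 - Phi(z | mu, sigma)."""
    return 0.5 * math.erfc((z - mu) / (sigma * math.sqrt(2.0)))


def truncated_power_gaussian_nll(x, mu, sigma, p, lower=100.0):
    """Negative log-density of X >= lower when X**p is Gaussian, left-truncated at lower**p."""
    if x < lower:
        raise ValueError("observation below the truncation point")
    z = x ** p
    # Change of variables: d(x**p)/dx = p * x**(p - 1).
    log_jacobian = math.log(p) + (p - 1.0) * math.log(x)
    # Probability mass that remains after truncation on the transformed scale.
    log_tail = math.log(normal_sf(lower ** p, mu, sigma))
    return -(normal_logpdf(z, mu, sigma) + log_jacobian - log_tail)
```

In the first calibration step, this quantity would be summed over all observations and minimized over a constant mean, $\sigma$ and p; in the second step, p is held fixed and the mean is replaced by the prediction of the GBM.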
Feature effects. Figure 10 shows for each fitted GBM the relative importance of the included covariates. The variable importance of a covariate is the decrease in the GBM’s loss function over all tree splits using the covariate under consideration, when optimal values are used for the tuning parameters. The covariate portfolio is important in almost all layers, indicating clear differences in the handling of large claims between the insurers in the data set. Most noteworthy is the effect of the portfolio on the layer change_reserve. In some portfolios, experts re-evaluate their large claims almost every year, whereas other insurers rarely update their large claims. The covariates reserve (i.e., incurred – paid) and ratio paid incurred (i.e., $\frac{\text{paid}}{\text{incurred}}$ ) both describe a relationship between the incurred amount and the paid amount. Together, these covariates are for many layers the most important predictors for a claim’s future development. In traditional, aggregated reserving models, claim development depends only on the number of years elapsed since reporting, that is, the development year. Surprisingly, this covariate becomes irrelevant when more informative claim characteristics are available.
Figure 11 uses partial dependence plots to visualize the marginal effect of selected covariates on the outcome layers in the hierarchical model. The excess incurred is smaller for claims reported to the reinsurer after a long delay (Figure 11(a)) and for these claims a larger fraction of the incurred has already been paid at reporting by the insurer (Figure 11(b)). This is intuitive taking into account that the reporting delay is different from the insurer’s and the reinsurer’s perspective and that large claims are often quickly reported to the insurer. Late reporting of a claim to the reinsurer thus gives the insurer more time to make claim payments. As expected, Figure 11(c) shows that claims are likely to settle when the outstanding reserve is near zero. The incurred is more likely to increase when either little has been paid yet for the claim or when the paid amount is close to the incurred amount (Figure 11(d)).
4.2.3. Pricing an excess-of-loss reinsurance contract
We price an excess-of-loss reinsurance contract covering losses from individual claims exceeding a deductible $D = {2,500,000}$ up to a limit $L = {5,000,000}$ . Following the frequency-severity decomposition, the pure premium $\pi^{P}$ is $\pi^{P} = E[N^P] \cdot E[((Y^P \wedge L) - D)_+]$ .
Here $N^P$ and $Y^P$ are the frequency and severity, respectively, of claims reported above a priority P, $(Y^P \wedge L)$ denotes the minimum of $Y^P$ and L, and $(Y-D)_+$ equals $Y-D$ if $Y \geq D$ and zero otherwise. The pure reinsurance premium scales directly with the number of insured vehicles in the underlying insurance portfolio, that is, the exposure $e_i$ . In this section, we set the exposure to one and hence compute the premium for a single insured vehicle.
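The payoff of the excess-of-loss layer and the resulting Monte Carlo estimate of the pure premium can be sketched as follows. The function names are hypothetical, and the estimator assumes that the expected frequency per insured vehicle and a sample of simulated ultimate claim severities above the priority are already available.

```python
def layer_loss(y, deductible, limit):
    """Reinsurer's share of a ground-up loss y: (min(y, L) - D)_+ ."""
    return max(min(y, limit) - deductible, 0.0)


def pure_premium(expected_frequency, simulated_severities, deductible, limit):
    """Expected frequency times the average simulated loss in the layer."""
    losses = [layer_loss(y, deductible, limit) for y in simulated_severities]
    return expected_frequency * sum(losses) / len(losses)
```

For example, with $D = {2,500,000}$ and $L = {5,000,000}$ , a claim with an ultimate cost of ${3,000,000}$ contributes ${500,000}$ to the layer, whereas a claim of ${6,000,000}$ contributes the maximum of ${2,500,000}$ .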
In Section 4.2.1, we calibrated the occurrence of claims from policy i as
where $\lambda^P_{\texttt{portfolio(i)}}$ is the expected number of claims per insured vehicle from $\texttt{portfolio}(i)$ exceeding the reporting priority P, that is, our frequency estimate. Figure 8(a) pictures the fitted parameters $\lambda^P_{\texttt{portfolio(i)}}$ for the various portfolios in our data set when using a reporting priority of ${750,000}$ .
Section 3.1 outlined two strategies for simulating the claim severity distribution. The first strategy simulates a large number of paths for a new claim from ground up, whereas the second strategy simulates the future development of open claims. We illustrate the use of both simulation strategies to model the severity distribution of a new claim from portfolio A when the reporting priority P is equal to ${750,000}$ . Appendix B outlines the details of both simulation strategies.
Comparing simulated severity distributions. Figure 12 shows the empirical claim severity distributions based on simulated paths from ground up (blue) and simulated paths for the future development of observed claims (red). Since we price an excess-of-loss contract with a limit L of ${5,000,000}$ , we only show the distribution of the ultimate claim severity below ${5,000,000}$ . For portfolio $\texttt{A}$ both simulation strategies result in nearly identical severity distributions. Repeating the same approach for portfolio $\texttt{B}$ , we retrieve a more heavy-tailed severity distribution when simulating paths for the future development of observed claims. Figure 12 also compares the claim severity distributions proposed in our paper with the empirical cdf based on best estimates (green), where for each open claim the best estimate is calculated by averaging the ultimate claim severity over the 200 simulated paths. This distribution has the same mean, but a lower variance than the distribution based on the simulated paths for observed claims. As argued in Section 1, an underestimation of the variance can have severe implications when pricing complex (re)insurance products.
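The loss of variance caused by averaging can be checked numerically. The sketch below compares the pooled simulated ultimates with the per-claim best estimates; the function name and the input layout are hypothetical, and the comparison assumes that every open claim has the same number of simulated paths.

```python
from statistics import mean, pvariance


def compare_variances(paths_per_claim):
    """paths_per_claim: one list of simulated ultimate severities per open claim."""
    pooled = [y for paths in paths_per_claim for y in paths]
    # Best estimate per claim: average ultimate severity over its simulated paths.
    best_estimates = [mean(paths) for paths in paths_per_claim]
    return {
        "mean_pooled": mean(pooled),
        "mean_best_estimate": mean(best_estimates),
        "var_pooled": pvariance(pooled),
        "var_best_estimate": pvariance(best_estimates),
    }
```

By the law of total variance, the pooled variance equals the variance of the best estimates plus the average within-claim variance, so the best-estimate distribution can never be more dispersed than the pooled one.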
Pricing an excess-of-loss contract. Across the available portfolios and for three chosen reporting priorities, Figure 13 shows the pure premium per insured vehicle for an excess-of-loss policy. Here, we use the severity distribution obtained via (lhs) simulating ${20,000}$ new claims from ground up and (rhs) observed claims complemented with 200 simulated paths for the future development of each open claim. In theory, the choice of the reporting priority should not influence the price of the reinsurance contract. In practice, however, some differences in the estimated pure premium arise since the priority determines the available historical claims when calibrating the ODM. We investigate the sensitivity of the pure premium with respect to the priority by modeling the frequency and severity above a reporting priority of ${750,000}$ , ${1,000,000}$ and ${1,250,000}$ . For most portfolios, the price remains relatively constant when changing the priority, but larger variations are observed for some small portfolios (e.g., portfolio S). These variations mainly result from the claim frequency model, for which the priority determines the available claims when training the model. Since we detected two regimes in the occurrence intensity in Figure 8(a), our frequency model could be made more robust by estimating a single occurrence intensity parameter per regime. Estimated prices are comparable when (lhs) simulating new claims from ground up and (rhs) simulating paths for open claims. Price differences are often the result of realized extreme claims, which more heavily influence the estimated cost based on the paths generated for the observed claims.
4.2.4. Reserving for reinsurance contracts
Reserving actuaries estimate the aggregated, future cost for claims from past exposure years. In reinsurance, these costs depend on the structure of the contract sold. We estimate the reserve that should be held by the reinsurer under two contract types. The first type of contract covers all losses booked on claims for which the incurred exceeds the reporting priority of ${750,000}$ at least once during the claim’s development. Although this contract is not sold in practice, it is relevant to consider because of its similarities with the reserving problem in the classical insurance setting. To reserve accurately for this contract, the ODM should capture the average development pattern of claims over time sufficiently well. The second type of reinsurance contract covers the loss of an individual claim in our reinsurance data set between the deductible of ${2,500,000}$ and the policy limit of ${5,000,000}$ , that is, the reinsurance contract that we priced in Section 4.2.3. This contract puts focus on the performance of our ODM for large claims. For convenience, we assume that the contracts under consideration cover claims from occurrence years 2000–2014 on the available nine MTPL insurance portfolios with a reporting priority below ${750,000}$ .
Reserving follows the outline explained in Section 3.2 and relies on the proposed calibration strategies for pricing as discussed in Section 4.2.3. For the IBNR reserve, we predict the number of occurred, yet unreported claims and their expected reporting date from the occurrence and reporting model. We estimate the severity of these unreported claims by simulating new claims from ground up. In these simulations, we account for the effect of long reporting delays on the reinsurance claim development process (Figure 11(a) and (b)). For the RBNS reserve, we use the ODM to simulate the future development of the reported, open claims.
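The prediction of unreported claim counts can be sketched under the usual Poisson thinning argument: if claims occur with intensity $e_i \cdot \lambda$ and are reported with delay j with probability $p_j$, the expected number of claims reported with delay j is $e_i \cdot \lambda \cdot p_j$. The function below is a hypothetical illustration under that assumption, with reporting probabilities that do not depend on the policy; it is not the calibrated occurrence and reporting model.

```python
def expected_unreported(exposure, intensity, reporting_probs, years_observed):
    """Expected number of occurred, yet unreported claims per future reporting delay j.

    reporting_probs[j - 1] is the probability that a claim is reported with delay j;
    delays 1, ..., years_observed have already been observed at the evaluation date.
    """
    expected_total = exposure * intensity
    return {
        j: expected_total * p
        for j, p in enumerate(reporting_probs, start=1)
        if j > years_observed
    }
```

The keys of the returned dictionary give the expected reporting dates (as delays since occurrence), which determine the reporting delay covariate used when simulating the severity of these claims from ground up.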
Reserving for a portfolio of reinsurance contracts that cover ground-up losses. Figure 14 shows the evolution of the total incurred and paid amounts for claims that occurred between 2000 and 2014 and exceed the reporting priority of ${750,000}$ during their lifetime. For calendar years 2015–2017, we compare our estimates with the actual observations from the out-of-time data set. Figure 14(b) and (c) split the total reserve into the IBNR and RBNS reserve. For the RBNS reserve, the total amount incurred decreases slightly over time. This indicates that claim experts overestimate the expected cost of large claims when setting incurred amounts. For the total reserve (shown in Figure 14(a)), we estimate a sharp increase of the incurred in the first calendar years following 2014 as new claims get reported. Figure 14(b) shows that our model overestimates the increase in the incurred, which is due to an overestimation of the number of unreported claims (not shown). In Belgium, judges use indicative tables based on mortality and discount rates to determine the compensation for bodily injury claims. In 2012, discount rates for these tables dropped from 2 to 1%, which led claim experts to sharply increase the incurred amounts in 2013 and 2014. This initially led to an increase in the number of reported claims, as suddenly more claims exceeded the reporting priority, followed by a decrease in reported claim counts in later years. Since adjustments for these exogenous effects cannot be predicted by data-driven models, expert judgment will always remain important in reserving for reinsurance.
Long delays in our reinsurance data set compel us to use most of the observed calendar years (2000–2014) for training our model, leaving only 3 years (2015–2017) for an out-of-time evaluation. We therefore also examine the performance of the proposed reserving model with a moving evaluation date $\tau$ . We use the fitted ODM and the observed claim history at time $\tau$ to predict the future evolution of the incurred and paid amounts for claims that occurred before $\tau$ . This is, however, not a true out-of-time evaluation, since we still train our ODM on the years 2000–2014. Figure 15 shows these evaluations of the total reserve (IBNR + RBNS) for $\tau$ ranging from 2003 to 2014. Overall, the estimated evolution of the incurred and paid amounts roughly follows the evolution recorded in our data set. The discount rate in the indicative table changed in 2002 (from 4 to 3%), 2008 (from 3 to 2%) and 2012 (from 2 to 1%). These changes cause systematic, sudden shocks in the amount incurred which can be seen on panels 2000–2007, 2000–2008 and 2000–2012 and result in an underestimation of the incurred in the short term.
Reserving for a portfolio of reinsurance contracts that cover the excess-of-loss. We now focus on reserving for a portfolio of excess-of-loss contracts as considered in the pricing example of Section 4.2.3. Figure 16 shows the estimated evolution of the incurred and paid amounts within the layer of our excess-of-loss contract. Although only few payments have yet been recorded within this layer in the available reinsurance data, we can rather accurately infer the payment pattern from the general dynamics estimated with the hierarchical model of Section 4.2.2. This illustrates the importance of calibrating models above a lower reporting priority in reinsurance, safeguarding a sufficient amount of data regarding the development of large claims. Where the incurred for reported claims, that is, the RBNS reserve, remained more or less constant when reserving from ground up (Figure 14(c)), we now observe an initial increase followed by a decrease of the total incurred within the layer of our excess-of-loss contract (Figure 16(c)).
5. Conclusion
We propose an ODM for analyzing the detailed claim information registered in non-life insurance portfolios. Our ODM brings valuable insights for non-life pricing as well as reserving, hereby bridging these two key actuarial tasks. We resolve the contradictions in traditional pricing literature where both actual observations and best estimates are used when calibrating severity models. From a reserving perspective, we model the cost of IBNR claims at the level of individual policies and the future payments on RBNS claims at the level of individual claims. We illustrate our proposed methodology with two case studies: pricing and reserving for a motor insurance as well as reinsurance portfolio. Constructing best estimates for open claims is particularly relevant and complicated in reinsurance, where reporting and settlement delays are long and claim development is uncertain. In the reinsurance setting, our ODM outshines traditional methodology along two directions. First, using Jensen’s inequality we demonstrate that the empirical distribution constructed from best estimates underestimates the variance of the claim severity distribution. This is then confirmed in our simulations where the claim severity distribution modeled by the ODM has a significantly larger variance than the empirical claim severity distribution based on best estimates. Second, our proposed individual reserving model captures the evolution in both paid and incurred amounts. Despite large uncertainties governing the development of reinsurance claims, our model is able to accurately predict the joint evolution of the paid and incurred amounts.
Acknowledgements
This work was supported by KU Leuven’s research council (project COMPACT C24/15/001) and Research Foundation Flanders (FWO) (grant number 11G4619N).
Disclaimer
This paper should not be reported as representing the views of QBE Re. The views expressed in this paper are those of the authors and do not necessarily represent those of QBE Re.
Competing risks
The authors declare none.
A. Layers in the hierarchical claim development models
This appendix provides a technical description of the layers used in the hierarchical development models for the MTPL insurance data set discussed in Section 4.1 and the reinsurance data set covered in Section 4.2.
A.1. Layers of the initial claim characteristics vector $I_k$
The initial claim characteristics $\boldsymbol{I}_k$ capture the claim’s information that is available when it is first reported. The setup of $\boldsymbol{I}_k$ is tailored to each of the case studies discussed in Section 4.
A.1.1. MTPL insurance data set
1. Settlement Indicator (yes/no) that registers whether a claim settles in the year of reporting or not. We model this indicator with a Bernoulli distribution with logit link function.
2. Payment Indicator (yes/no) that registers whether a payment is done in the year of reporting or not. We model this indicator with a Bernoulli distribution with logit link function.
3. Increase paid Amount paid in the year of reporting. When there is no payment, increase paid is set to zero. We model increase paid with a gamma distribution with log link.
4. Initial reserve Reserve estimate set by the claim expert in the year of reporting. We model initial reserve with a gamma distribution with log link. After simulating the initial reserve, we initialize the paid and incurred amounts as follows:
A.1.2. MTPL reinsurance data set
1. Excess incurred This layer registers the difference between the claim’s incurred amount as set by the insurer and the reporting priority in the year that the claim is reported. This excess incurred is positive since claims are only reported by the insurer when their incurred amount exceeds the reporting priority. Furthermore, the excess incurred will be at least 100 euro as a result of data preprocessing. When modeling the excess incurred, we first apply a power transform and then model the transformed outcome with a truncated Gaussian distribution, that is
For the data set analyzed in Section 4.2, the power p was calibrated as $0.117$ . After simulating excess incurred, we compute the incurred as $\texttt{incurred} = P + \texttt{excess incurred}$ , where P is the reporting priority.
2. Payment Indicator (yes/no) registering whether the insurer made any claim payments before or in the year of reporting the claim to the reinsurer. We model this indicator with a Bernoulli distribution with logit link function.
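3. Pct paid When there is a payment, we model the amount paid at reporting as a percentage of the incurred at reporting. After simulating this layer, the paid amount and the reserve are computed as $\texttt{paid} = \texttt{pct paid} \cdot \texttt{incurred}$ and $\texttt{reserve} = \texttt{incurred} - \texttt{paid}$ .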
Modeling the percentage instead of the paid amount has the advantage that the condition $\texttt{paid} \leq \texttt{incurred}$ is automatically satisfied. We model pct paid by first applying a logit transform and then assuming a Gaussian distribution for the transformed variable, that is
A.2. Layers of the update vectors $U_k^{j}$
The setup of $\boldsymbol{U}_k^{j}$ is almost identical for both case studies in Section 4.
1. Settlement Indicator (yes/no) that registers whether a claim settles in the current development year or not. We model this indicator with a Bernoulli distribution with logit link function.
2. Payment Indicator (yes/no) that registers whether the insurer made a payment in the current development year or not. In the reinsurance case study, this indicator is yes when the payment size exceeds 100 euro. We model this indicator with a Bernoulli distribution with logit link function.
3. Increase paid Amount paid in the current development year. When there is no payment, increase paid is set to zero. If there is a payment, we first apply a power transform on this variable in the reinsurance case study and then assume a truncated Gaussian distribution for the transformed variable. In the insurance case study, we use a gamma distribution because claim sizes are smaller. After simulating increase paid, we increase the amount paid by the size of the new payment and subtract this payment from the outstanding reserve, that is, $\texttt{paid} \leftarrow \texttt{paid} + \texttt{increase paid}$ and $\texttt{reserve} \leftarrow \texttt{reserve} - \texttt{increase paid}$ .
4. Change reserve Indicator (yes/no) registering whether the reserve changes in the current development year. We only model reserve changes when the claim does not settle in the current year. In the year of settlement, the reserve is deterministically set to zero. In the reinsurance case study, this indicator is only triggered by changes of at least 100 euro as a result of the data preprocessing step. This layer is modeled with a Bernoulli distribution with logit link function.
5. Reserve is zero Indicator (yes/no) registering whether the reserve drops to zero. This layer is modeled conditionally on $\texttt{change reserve} = \text{yes}$ and $\texttt{reserve} \neq 0$ . This layer is modeled with a Bernoulli distribution with logit link function. This layer is not included in the insurance case study, since there the reserve is always larger than zero when the claim has not yet settled.
6. Change reserve pos Indicator (yes/no) registering whether the reserve increases in the current year. This layer is modeled conditionally on $\texttt{change reserve} = \text{yes}$ , $\texttt{reserve is zero} = \text{no}$ and $\texttt{reserve} \neq 0$ . When $\texttt{change reserve} = \text{yes}$ and $\texttt{reserve} = 0$ , this layer is always set to yes as the reserve cannot decrease.
7. Increase reserve Nominal increase in the reserve conditional on an increase in the reserve. In the reinsurance case study, these increases are modeled with a truncated Gaussian distribution after applying a power transformation. A gamma GLM with log link is used in the insurance case study.
8. Pct decrease reserve Percentage decrease in the reserve conditional on a decrease in the reserve. As a result of pre-processing, this percentage is lower bounded by $\frac{100}{\texttt{reserve}}$ in the reinsurance case study. In modeling, we first apply a logit transform to this percentage and then model the transformed outcome with a truncated Gaussian distribution. After simulating this layer, we update the reserve and incurred as
B. Simulation strategies for claim severities
We illustrate both simulation strategies discussed in Section 3.1 on the reinsurance data set. Our goal is to simulate paths from the severity distribution of a claim resulting from portfolio A, with occurrence year 2015.
Simulating paths for a new claim We simulate ${20,000}$ paths for the development of a new claim from policy $\texttt{A}$ that occurred in 2015. To this end, we follow the steps outlined in Algorithm 1. The occurrence year 2015 is not used in our claim development model but is required for deflating the simulated paths. Figure B.1 visualizes the evolution of the paid and incurred amounts over these ${20,000}$ paths. Solid lines indicate the average paid and incurred amounts, whereas the dashed lines bound the 95% confidence interval. At reporting, the incurred exceeds ${750,000}$ for all simulated paths. However, soon after reporting, the lower bound for the incurred drops to zero, as some of these paths settle without payment. These paths represent the case where the claim is not eligible for compensation within the portfolio. This is a common scenario for large motor insurance claims, where many parties and hence many insurers are often involved in an accident and it may initially be unclear which insurer should reimburse the claim. After 15 years have elapsed, that is, the observation window of our training data set, many simulated claims are still open. This is visible in Figure B.1 as the large difference between the paid and incurred amounts after 15 years. Supported by the low importance of the covariate number of elapsed years since reporting in all layers of the hierarchical development model (Figure 10), we extrapolate our model and simulate the development up to 60 years after the reporting of the claim. After 60 years, almost all paths have settled and the paid amount has converged toward the incurred amount. A settlement delay of 60 years may seem long but occurs in practice when victims are compensated via lifelong periodic payments. The empirical distribution of the total amount paid after 60 years is our simulated severity distribution for a new claim from policy $\texttt{A}$ that occurs in 2015 and is reported with a priority of ${750,000}$ .
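As a minimal sketch of how the summary curves of the kind shown in Figure B.1 could be computed, the Python snippet below takes simulated paths stored as an array with one row per path and one column per development year, and returns the pointwise mean together with the bounds of a central interval. The function name `summarize_paths` and the array layout are illustrative assumptions and do not come from the original implementation.

```python
import numpy as np


def summarize_paths(paths, level=0.95):
    """Pointwise mean and central interval over simulated paths.

    paths : array-like of shape (n_paths, n_years); hypothetical layout with
            one simulated development path per row.
    level : coverage of the central interval (0.95 gives the 2.5% and
            97.5% quantiles per development year).
    """
    paths = np.asarray(paths, dtype=float)
    alpha = (1.0 - level) / 2.0
    mean = paths.mean(axis=0)
    # np.quantile with two probabilities returns an array of shape (2, n_years).
    lower, upper = np.quantile(paths, [alpha, 1.0 - alpha], axis=0)
    return mean, lower, upper
```

Applied separately to the paid and incurred arrays, this would give the solid and dashed lines. Under the same assumed layout, the last column of the paid array would hold the simulated severities.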
Simulating future paths for open claims Alternatively, we focus on the observed claim data from insurance portfolio $\texttt{A}$ . By the end of 2014, we observe 401 claims from this portfolio, of which 33 are closed and 368 are open. We simulate 200 future development paths for each open claim. Figure B.2 shows the total amount paid for settled claims and an 80% confidence interval for the amount paid at settlement, based on the 200 simulated paths per open claim. In contrast to Figure B.1, we show 80% instead of 95% confidence intervals, since the scale on the vertical axis is heavily impacted by extreme outcomes registered for individual claims. Claims are sorted by median severity, indicated with a solid black line. The distribution of the simulated ultimate claim amount is heavily right-skewed, with the median near the lower end of the confidence interval. By maximizing the likelihood in (3.1), a severity distribution can be estimated from the observed ultimate claim sizes of settled claims and the simulated ultimate claim amounts of open claims. In the reinsurance case study, we use the empirical cumulative distribution function (ecdf) as a non-parametric estimator for the claim severity distribution. In the construction of this ecdf, the observed outcomes (settled claims) each get a weight of 1 and the simulated paths (open claims) each get a weight of $\frac{1}{\texttt{number of simulations}}$ .
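The weighting scheme described above can be sketched in Python as follows. The snippet is an illustration under stated assumptions, not the implementation used in the case study: the function name `weighted_ecdf` is hypothetical, settled claims are assumed to be passed as a one-dimensional array, and simulations for open claims as a two-dimensional array with one row per open claim and one column per simulated path.

```python
import numpy as np


def weighted_ecdf(settled, simulated):
    """Weighted ecdf of ultimate claim sizes.

    settled   : 1-D array of observed ultimate sizes (weight 1 each).
    simulated : 2-D array of shape (n_open, n_sims) with simulated ultimate
                sizes per open claim (weight 1 / n_sims each), so that every
                claim, settled or open, contributes a total weight of 1.
    Returns a function x -> estimated P(Y <= x).
    """
    settled = np.asarray(settled, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    n_sims = simulated.shape[1]

    values = np.concatenate([settled, simulated.ravel()])
    weights = np.concatenate(
        [np.ones(settled.size), np.full(simulated.size, 1.0 / n_sims)]
    )

    # Sort the pooled outcomes and accumulate the normalized weights.
    order = np.argsort(values, kind="stable")
    values = values[order]
    cum = np.cumsum(weights[order]) / weights.sum()

    def cdf(x):
        # Number of pooled outcomes that are <= x.
        idx = np.searchsorted(values, x, side="right")
        return np.where(idx > 0, cum[np.maximum(idx - 1, 0)], 0.0)

    return cdf
```

With 33 settled claims and 368 open claims with 200 simulations each, the total weight in this sketch would equal 401, the number of claims in the portfolio.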