Published online by Cambridge University Press: 04 November 2013
This paper explores how modern data mining techniques can be applied to better understand Australian Income Protection Insurance (IPI). We provide a fast and objective method of scoring claims into different portfolios using the available rating factors. Several prediction models are fitted and compared using not only the conventional prediction error function but also a modified loss function. A misclassification plot shows that all the data mining methods under consideration have clear predictive power, whereas this predictability can be masked when only the conventional prediction error function is examined. We then suggest using stepwise regression to reduce the number of variables used in the data mining methods. Alongside this variable selection method, we apply principal components analysis to better understand the rating factors that drive the claim durations of insured lives. We also discuss and compare how different variable combining techniques can be used to weight the available predictor variables. One interesting outcome is that principal components analysis and the weighted combination prediction model give very consistent results in identifying the most significant variables for explaining claim durations.
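As a rough illustration of how principal components analysis can indicate which rating factors move together, the following minimal sketch runs PCA on synthetic data. It is not the paper's method or data: the factor names, the distributions, and the helper `pca` are all hypothetical, chosen only so that two factors are deliberately correlated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for rating factors (hypothetical names and values,
# NOT the IPI data analysed in the paper).
factor_names = ["age", "waiting_period", "benefit_amount", "occupation_class"]
n = 500
age = rng.normal(45, 10, n)
waiting_period = rng.normal(30, 15, n)
benefit_amount = 50 * age + rng.normal(0, 200, n)  # deliberately correlated with age
occupation_class = rng.integers(1, 5, n).astype(float)
X = np.column_stack([age, waiting_period, benefit_amount, occupation_class])


def pca(X):
    """Return (explained variance ratios, component loadings) for X.

    Columns are standardised first, so this is PCA on the correlation matrix.
    Row i of the returned loadings is the i-th principal component.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    return explained, Vt


explained_ratio, components = pca(X)

# Show how strongly each factor loads on the first principal component.
for name, loading in zip(factor_names, components[0]):
    print(f"{name:>18s}: {loading:+.3f}")
print("explained variance ratios:", np.round(explained_ratio, 3))
```

In this toy setup the first component is dominated by the two correlated factors, which is the kind of pattern one would inspect when asking which variables carry most of the shared variation.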