Abstract
This paper aims to better predict highly skewed auto insurance claims by combining candidate predictions. We analyze a version of the Kangaroo Auto Insurance company data and study the effects of combining different methods using five measures of prediction accuracy. The results show the following. First, when there is an outstanding (in terms of Gini index) prediction among the candidates, the “forecast combination puzzle” phenomenon disappears. The simple average method performs much worse than the more sophisticated model combination methods, indicating that combining different methods could help us avoid performance degradation. Second, the choice of the prediction accuracy measure is crucial in defining the best candidate prediction for “low frequency and high severity” (LFHS) data. For example, mean squared error (MSE) does not distinguish well between model combination methods, because their MSE values are close. Third, the performances of different model combination methods can differ drastically. We propose a new model combination method, named ARM-Tweedie, for such LFHS data; it benefits from an optimal rate of convergence and performs well on several measures for the Kangaroo data. Fourth, overall, model combination methods improve the prediction accuracy for auto insurance claim costs. In particular, Adaptive Regression by Mixing (ARM), ARM-Tweedie, and constrained Linear Regression can improve forecast performance when there are only weak learners or when no dominant learner exists.
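To make the Gini-based comparison concrete, the following is a minimal sketch of a normalized Gini index and a simple-average combination of candidate predictions. It rests on assumptions: the abstract does not state the exact Gini formula, so the rank-based definition below (cumulative share of actual losses, ordered by decreasing prediction, measured against the diagonal) is one common choice and may differ from the paper's. The function names `gini`, `normalized_gini`, and `simple_average` are hypothetical, and the toy numbers are illustrative only, not Kangaroo data.

```python
def gini(actual, pred):
    """Unnormalized Gini: average gap between the cumulative share of actual
    losses (policies ranked by decreasing prediction) and the diagonal.
    Assumes sum(actual) > 0."""
    n = len(actual)
    total = sum(actual)
    # Rank policies from highest to lowest predicted cost (sorted is stable for ties).
    order = sorted(range(n), key=lambda i: -pred[i])
    running, area = 0.0, 0.0
    for rank, i in enumerate(order, start=1):
        running += actual[i]
        area += running / total - rank / n
    return area / n


def normalized_gini(actual, pred):
    """Gini of the prediction divided by the Gini of a perfect ranking."""
    return gini(actual, pred) / gini(actual, actual)


def simple_average(candidates):
    """Equal-weight combination: element-wise mean of the candidate predictions."""
    return [sum(col) / len(col) for col in zip(*candidates)]


# Toy LFHS-style data (illustrative only): most claims are zero, a few are large.
actual = [0.0, 0.0, 10.0, 0.0, 50.0]
strong = [1.0, 2.0, 3.0, 1.5, 9.0]   # hypothetical candidate that ranks claims well
weak = [9.0, 8.0, 1.0, 7.0, 0.0]     # hypothetical candidate that ranks claims badly
combined = simple_average([strong, weak])

for name, p in [("strong", strong), ("weak", weak), ("average", combined)]:
    print(name, round(normalized_gini(actual, p), 3))
```

In this toy setup, equal weighting lets the weak candidate dilute the ranking of the strong one, which illustrates (but does not reproduce) the abstract's point that a simple average can degrade performance when one candidate is clearly better on the Gini index.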
Original language | English |
---|---|
Article number | 19 |
Journal | Econometrics |
Volume | 10 |
Issue number | 2 |
State | Published - Jun 2022 |
Bibliographical note
Publisher Copyright: © 2022 by the authors. Licensee MDPI, Basel, Switzerland.
Keywords
- Tweedie distribution
- auto insurance
- claim cost prediction
- model averaging
- normalized Gini index
ASJC Scopus subject areas
- Economics and Econometrics