INFLUENCE OF THE EVENT RATE ON DISCRIMINATION ABILITIES OF BANKRUPTCY PREDICTION MODELS
Lili Zhang1, Jennifer Priestley2, and Xuelei Ni3
1Program in Analytics and Data Science, Kennesaw State University, Georgia, USA
2Analytics and Data Science Institute, Kennesaw State University, Georgia, USA
3Department of Statistics, Kennesaw State University, Georgia, USA
ABSTRACT
In bankruptcy prediction, the proportion of events is very low, and the event class is often oversampled to counter the resulting bias. In this paper, we study the influence of the event rate on the discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public records and firmographics indicators with bankruptcy were explored. Then the event rate was oversampled from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. Under each event rate, the models were comprehensively evaluated and compared on the hold-out dataset, at their best probability cut-offs, based on the Kolmogorov-Smirnov statistic, accuracy, F1 score, Type I error, Type II error, and ROC curve. Results show that Bayesian Network is the least sensitive to the event rate, while Support Vector Machine is the most sensitive.
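To make the oversampling step and the Kolmogorov-Smirnov statistic concrete, the following is a minimal sketch in Python and not the authors' implementation. It assumes binary labels (1 = bankruptcy event, 0 = non-event), simple random oversampling with replacement, and a KS statistic taken as the largest gap between the true positive rate and the false positive rate over all score cut-offs. The function names `oversample_events` and `ks_statistic` are hypothetical, and the abstract does not say how the paper's best cut-offs were chosen.

```python
import random


def oversample_events(labels, target_rate, seed=0):
    """Return row indices in which events (label 1) are resampled with
    replacement until they make up roughly `target_rate` of the sample.
    Every non-event (label 0) is kept exactly once.
    Assumption: plain random oversampling; the paper may use another scheme.
    """
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must be strictly between 0 and 1")
    events = [i for i, y in enumerate(labels) if y == 1]
    non_events = [i for i, y in enumerate(labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    # Solve n_events / (n_events + n_non_events) = target_rate for n_events.
    n_events = round(target_rate / (1 - target_rate) * len(non_events))
    n_events = max(n_events, len(events))  # never drop an original event
    rng = random.Random(seed)
    extra = rng.choices(events, k=n_events - len(events))
    return non_events + events + extra


def ks_statistic(y_true, scores):
    """Return (ks, cutoff): the largest value of TPR - FPR over all observed
    score cut-offs, where a row is predicted as an event if score >= cutoff.
    Assumes y_true contains both classes.
    """
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    best_ks, best_cut = 0.0, None
    for cut in sorted(set(scores)):
        tpr = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= cut) / n_pos
        fpr = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= cut) / n_neg
        if tpr - fpr > best_ks:
            best_ks, best_cut = tpr - fpr, cut
    return best_ks, best_cut
```

For example, calling `oversample_events(labels, 0.10)` on a training sample with a 0.12% event rate returns indices whose event share is close to 10%. In this sketch only the training rows would be resampled, while `ks_statistic` would be applied to model scores on the untouched hold-out data.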
KEYWORDS
Bankruptcy Prediction, Public Records, Firmographics, Event Rate, Discrimination Ability