INFLUENCE OF THE EVENT RATE ON DISCRIMINATION ABILITIES OF BANKRUPTCY PREDICTION MODELS
Lili Zhang1, Jennifer Priestley2, and Xuelei Ni3
1Program in Analytics and Data Science, Kennesaw State University, Georgia, USA
2Analytics and Data Science Institute, Kennesaw State University, Georgia, USA
3Department of Statistics, Kennesaw State University, Georgia, USA
ABSTRACT
In bankruptcy prediction, the proportion of events is very low, so the events are often oversampled to eliminate this bias. In this paper, we study the influence of the event rate on the discrimination abilities of bankruptcy prediction models. First, the statistical association and significance of public records and firmographics indicators with bankruptcy were explored. Then the event rate was oversampled from 0.12% to 10%, 20%, 30%, 40%, and 50%, respectively. Seven models were developed: Logistic Regression, Decision Tree, Random Forest, Gradient Boosting, Support Vector Machine, Bayesian Network, and Neural Network. Under each event rate, the models were comprehensively evaluated and compared on the hold-out dataset, at their best probability cut-offs, based on the Kolmogorov-Smirnov statistic, accuracy, F1 score, Type I error, Type II error, and ROC curve. Results show that Bayesian Network is the least sensitive to the event rate, while Support Vector Machine is the most sensitive.
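As a rough illustration of two ideas mentioned in the abstract, the sketch below raises the event rate to a chosen target and computes a Kolmogorov-Smirnov statistic from model scores. The abstract does not say how the oversampling was carried out, so resampling events with replacement is an assumption here, and the function names `oversample_events` and `ks_statistic` are hypothetical, not taken from the paper.

```python
import random


def oversample_events(labels, target_rate, seed=0):
    """Return row indices in which events (label 1) make up about target_rate.

    Assumption: events are resampled with replacement and all non-events
    are kept. The paper's abstract does not specify the exact procedure.
    """
    if not 0 < target_rate < 1:
        raise ValueError("target_rate must be strictly between 0 and 1")
    events = [i for i, y in enumerate(labels) if y == 1]
    non_events = [i for i, y in enumerate(labels) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    # Solve k / (k + n_non) = target_rate for the number of event rows k.
    k = round(target_rate * len(non_events) / (1 - target_rate))
    rng = random.Random(seed)
    resampled = [rng.choice(events) for _ in range(k)]
    return non_events + resampled


def ks_statistic(labels, scores):
    """Largest gap between the cumulative score distributions of the two classes."""
    pairs = sorted(zip(scores, labels))
    n_event = sum(1 for y in labels if y == 1)
    n_non = len(labels) - n_event
    cum_event = cum_non = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume all rows sharing the same score before measuring the gap.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_event += 1
            else:
                cum_non += 1
            j += 1
        best = max(best, abs(cum_event / n_event - cum_non / n_non))
        i = j
    return best
```

In a study like the one described, oversampling of this kind would normally be applied only to the training data, with the Kolmogorov-Smirnov statistic computed on an untouched hold-out set, so that the reported discrimination reflects the original event rate.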
KEYWORDS
Bankruptcy Prediction, Public Records, Firmographics, Event Rate, Discrimination Ability
Original Source URL: http://aircconline.com/ijdms/V10N1/10118ijdms01.pdf