The following screenshot shows the 16 models that were built: 3 logistic regressions, 2 neural nets, 2 random forests, 1 memory-based reasoning (k-nearest neighbors) model, 1 decision tree, 2 stochastic gradient boosting models, 1 LARS regression, and 4 SVM models.
Below are the comparison details for the 16 models.
The two random forest models stand above the rest on both misclassification rate and KS. Notice:
- These models are built without much EDA (exploratory data analysis) work.
- A traditional decision tree is not far behind.
- None of the MBR, boosting, SVM, and neural network models does very well, likely because there are only a dozen input variables. Even so, random forest still outshines them with that few variables.
- The logistic regression models (the two HPREG models) perform poorly, probably due in part to the default cutoff selection.
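To make the two comparison metrics concrete, here is a minimal sketch (not code from the tool above; the function names and the synthetic data are my own) of how misclassification rate and the KS statistic can be computed from a model's predicted probabilities, and how moving the cutoff shifts the misclassification rate:

```python
import random

random.seed(0)
# Synthetic (label, score) pairs for illustration only:
# positives tend to receive higher predicted probabilities.
data = []
for _ in range(1000):
    y = random.randint(0, 1)
    p = min(1.0, max(0.0, random.gauss(0.35 + 0.3 * y, 0.2)))
    data.append((y, p))

def misclassification_rate(data, cutoff=0.5):
    """Fraction of cases on the wrong side of the probability cutoff."""
    wrong = sum(1 for y, p in data if (p >= cutoff) != (y == 1))
    return wrong / len(data)

def ks_statistic(data):
    """Largest gap between the cumulative score distributions of the
    two classes; a bigger gap means better separation."""
    pos = [p for y, p in data if y == 1]
    neg = [p for y, p in data if y == 0]
    best = 0.0
    for t in sorted(p for _, p in data):
        cdf_pos = sum(1 for p in pos if p <= t) / len(pos)
        cdf_neg = sum(1 for p in neg if p <= t) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

print(round(misclassification_rate(data), 3))       # default 0.5 cutoff
print(round(misclassification_rate(data, 0.4), 3))  # another cutoff shifts it
print(round(ks_statistic(data), 3))
```

The last two prints illustrate the cutoff point: KS is cutoff-free, so a model with strong separation can still show a mediocre misclassification rate if it is scored at an ill-suited default cutoff.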