This post shows how one can leverage SAS Enterprise Miner 12.1 ("EM"), released August 2012, to build a large number of leading machine learning models in a short amount of time, by point-and-click. The comparison shown is mainly meant to organize the built models, not to support any conclusion about the strength of the methods. The selected data set has ~40K observations with 12 predictor variables. The binary target variable ATTRITE equals 1 for ~16% of the observations. (The data set is from a published data mining book; I have forgotten which one.)
The following screenshot shows the 16 models built: 3 logistic regressions, 2 neural nets, 2 random forests, 1 memory-based reasoning (MBR, k-nearest neighbor) model, 1 decision tree, 2 stochastic gradient boosting models, 1 LARS regression and 4 SVM models.
Below are the comparison details of the 16 models.
The two random forest models stand above the rest in misclassification rate and KS (the Kolmogorov-Smirnov statistic). Notice:
- These models were built without much EDA (exploratory data analysis) work.
- A traditional decision tree is not far behind.
- None of MBR, boosting, SVM and NN does very well, which I attribute to there being only a dozen input variables. However, random forest still outshines them using the same few variables.
- The logistic regression models (the two HPREG models) score low, probably in part because of the default cutoff selection.
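To make the two comparison metrics and the cutoff remark concrete, here is a minimal Python sketch. It is not EM output and does not use the ATTRITE data: the scores are synthetic, and the function names (`misclassification_rate`, `ks_statistic`, `demo`) and the Beta distributions are assumptions chosen only for illustration. It shows that misclassification rate depends on the cutoff, while KS is computed from the score ranking alone. With an event rate near 16%, a model whose scores rarely reach 0.5 predicts almost all 0s, so its misclassification rate sits close to the event rate no matter how well it ranks.

```python
import random


def misclassification_rate(y_true, scores, cutoff=0.5):
    """Share of observations whose thresholded score disagrees with the label."""
    wrong = sum((s >= cutoff) != bool(y) for y, s in zip(y_true, scores))
    return wrong / len(y_true)


def ks_statistic(y_true, scores):
    """Largest gap between the cumulative score distributions of events and non-events."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    pairs = sorted(zip(scores, y_true))
    cum_pos = cum_neg = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every observation tied at this score before measuring the gap.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1]:
                cum_pos += 1
            else:
                cum_neg += 1
            j += 1
        best = max(best, abs(cum_pos / n_pos - cum_neg / n_neg))
        i = j
    return best


def demo(n=40_000, event_rate=0.16, seed=0):
    """Synthetic illustration only; the distributions below are assumptions."""
    rng = random.Random(seed)
    y = [1 if rng.random() < event_rate else 0 for _ in range(n)]
    # Events tend to score higher, but most scores stay below 0.5.
    scores = [rng.betavariate(2, 5) if t else rng.betavariate(1, 8) for t in y]
    print("KS (cutoff-free):", round(ks_statistic(y, scores), 3))
    for cutoff in (0.5, 0.3, 0.16):
        rate = misclassification_rate(y, scores, cutoff)
        print(f"cutoff={cutoff}: misclassification={rate:.3f}")


if __name__ == "__main__":
    demo()
```

Running `demo()` prints one KS value and a misclassification rate per cutoff, which is why comparing models on misclassification at a single default cutoff can understate a model that ranks well.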
I like Enterprise Miner because I can load and set up a large number of models (sometimes >100) quickly, easily tweak and manage their subtle differences, and pick the one that best fits my business domain. Model lineage and knowledge sharing are two other reasons.