Binary Classification Model for Credit Card Default Using R Take 1

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

Dataset Used: Default of Credit Card Clients Data Set

Dataset ML Model: Binary classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients

One potential source of performance benchmark: https://www.kaggle.com/uciml/default-of-credit-card-clients-dataset

INTRODUCTION: This dataset contains information on default payments, demographic factors, credit data, history of payment, and bill statements of credit card clients in Taiwan from April 2005 to September 2005.

CONCLUSION: The baseline performance of the ten algorithms achieved an average accuracy of 81.05%. Three algorithms (Support Vector Machine, AdaBoost, and Stochastic Gradient Boosting) achieved the top accuracy scores after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top result using the training data. It achieved an average accuracy of 82.18%. Using the optimized tuning parameter available, the Stochastic Gradient Boosting algorithm processed the validation dataset with an accuracy of 81.94%, which was just slightly lower than the accuracy of the training data. For this round of modeling, the Stochastic Gradient Boosting ensemble algorithm yielded consistently top-notch training and validation results, which warrant the additional processing required by the algorithm.

The HTML formatted report can be found here on GitHub.