Feature Selection in a Credit Scoring Model

Juan Laborda; Seyong Ryoo

doi:10.3390/math9070746

Mathematics ◽

10.3390/math9070746 ◽

2021 ◽

Vol 9 (7) ◽

pp. 746

Author(s):

Juan Laborda ◽

Seyong Ryoo

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Superior Performance ◽

Filter Method ◽

Support Vector ◽

Classification Algorithms ◽

Scoring Model ◽

Stepwise Selection ◽

Forward Stepwise ◽

Credit Scoring Model

This paper applies four classification algorithms (logistic regression, support vector machine, K-nearest neighbors, and random forest) to identify which candidates are likely to default in a credit scoring model. Three feature selection methods are used to mitigate the overfitting caused by the curse of dimensionality in these classification algorithms: one filter method (Chi-squared test and correlation coefficients) and two wrapper methods (forward stepwise selection and backward stepwise selection). The performance of the three methods is compared using two measures: the mean absolute error and the number of selected features. The methodology is applied to a database from Taiwan. The results suggest that forward stepwise selection yields superior performance for each of the classification algorithms used. The conclusions are related to those in the literature, and their managerial implications are analyzed.
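As a rough illustration of the wrapper idea named in this abstract, the sketch below runs forward stepwise selection scored by mean absolute error on a validation split. It is not the authors' implementation: the nearest-centroid classifier stands in for the four classifiers listed above, the synthetic data stands in for the Taiwan database, and every function name is hypothetical.

```python
import random

def fit_centroids(X, y, feats):
    """Per-class mean vector over the chosen feature indices."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[f] for r in rows) / len(rows) for f in feats]
    return cents

def predict(cents, x, feats):
    """Assign x to the class with the closest centroid (squared Euclidean)."""
    def dist(c):
        return sum((x[f] - m) ** 2 for f, m in zip(feats, c))
    return min(cents, key=lambda label: dist(cents[label]))

def mae(X_tr, y_tr, X_va, y_va, feats):
    """Mean absolute error of 0/1 predictions on the validation split."""
    cents = fit_centroids(X_tr, y_tr, feats)
    preds = [predict(cents, x, feats) for x in X_va]
    return sum(abs(p - t) for p, t in zip(preds, y_va)) / len(y_va)

def forward_stepwise(X_tr, y_tr, X_va, y_va):
    """Greedily add the feature that lowers validation MAE the most;
    stop as soon as no remaining feature improves it."""
    n_features = len(X_tr[0])
    selected, best = [], float("inf")
    while len(selected) < n_features:
        candidates = [f for f in range(n_features) if f not in selected]
        scores = {f: mae(X_tr, y_tr, X_va, y_va, selected + [f])
                  for f in candidates}
        f_best = min(scores, key=scores.get)
        if scores[f_best] >= best:
            break
        selected.append(f_best)
        best = scores[f_best]
    return selected, best

def make_rows(n):
    """Synthetic data (an assumption): feature 0 separates the two classes,
    features 1 to 3 are pure noise."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        informative = 4.0 * label + random.uniform(-1.0, 1.0)
        noise = [random.uniform(-1.0, 1.0) for _ in range(3)]
        X.append([informative] + noise)
        y.append(label)
    return X, y

random.seed(0)
X_tr, y_tr = make_rows(40)
X_va, y_va = make_rows(20)
selected, best = forward_stepwise(X_tr, y_tr, X_va, y_va)
print(selected, best)  # [0] 0.0
```

The run reports both measures mentioned in the abstract: the validation MAE and the number of selected features (`len(selected)`). Backward stepwise selection would follow the same pattern, starting from all features and removing one at a time.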

Credit Scoring Model based on Kernel Density Estimation and Support Vector Machine for Group Feature Selection

2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2018.8554524 ◽

2018 ◽

Author(s):

Xingzhi Zhang ◽

Zhurong Zhou

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Density Estimation ◽

Kernel Density Estimation ◽

Credit Scoring ◽

Kernel Density ◽

Support Vector ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model

Two‐Stage Credit Scoring Model Based on Evolutionary Feature Selection and Ensemble Neural Networks

Machine Learning Algorithms and Applications ◽

10.1002/9781119769262.ch6 ◽

2021 ◽

pp. 99-115

Author(s):

Diwakar Tripathi ◽

Damodar Reddy Edla ◽

Annushree Bablani ◽

Venkatanareshbabu Kuppili

Keyword(s):

Neural Networks ◽

Feature Selection ◽

Credit Scoring ◽

Two Stage ◽

Scoring Model ◽

Model Based ◽

Ensemble Neural Networks ◽

Credit Scoring Model

A novel hybrid credit scoring model based on ensemble feature selection and multilayer ensemble classification

Computational Intelligence ◽

10.1111/coin.12200 ◽

2019 ◽

Vol 35 (2) ◽

pp. 371-394 ◽

Cited By ~ 12

Author(s):

Diwakar Tripathi ◽

Damodar Reddy Edla ◽

Ramalingaswamy Cheruku ◽

Venkatanareshbabu Kuppili

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Ensemble Classification ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model

A credit scoring model using support vector machine

Fifth World Congress on Intelligent Control and Automation (IEEE Cat. No.04EX788) ◽

10.1109/wcica.2004.1341919 ◽

2004 ◽

Author(s):

Xiang Tian ◽

Feiqi Deng

Keyword(s):

Support Vector Machine ◽

Credit Scoring ◽

Support Vector ◽

Scoring Model ◽

Credit Scoring Model

Feature Selection in Credit Scoring Model for Credit Card Applicants in XYZ Bank: A Comparative Study

International Journal of Multimedia and Ubiquitous Engineering ◽

10.14257/ijmue.2015.10.5.03 ◽

2015 ◽

Vol 10 (5) ◽

pp. 17-24 ◽

Cited By ~ 3

Author(s):

Mediana Aryuni ◽

Evaristus Didik Madyatmadja

Keyword(s):

Feature Selection ◽

Comparative Study ◽

Credit Card ◽

Credit Scoring ◽

Scoring Model ◽

Credit Scoring Model

A Hybrid Credit Scoring Model Based on Genetic Programming and Support Vector Machines

2008 Fourth International Conference on Natural Computation ◽

10.1109/icnc.2008.205 ◽

2008 ◽

Cited By ~ 13

Author(s):

Defu Zhang ◽

Mhand Hifi ◽

Qingshan Chen ◽

Weiguo Ye

Keyword(s):

Support Vector Machines ◽

Genetic Programming ◽

Credit Scoring ◽

Support Vector ◽

Scoring Model ◽

Model Based ◽

Vector Machines ◽

Credit Scoring Model

Multi-Class Support Vector Machine for Credit Scoring

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.235.419 ◽

2012 ◽

Vol 235 ◽

pp. 419-422 ◽

Cited By ~ 1

Author(s):

Bo Tang ◽

Sai Bing Qiu

Keyword(s):

Support Vector Machine ◽

Credit Scoring ◽

Real Life ◽

Assessment Model ◽

Behavior Assessment ◽

Support Vector ◽

Classification Problems ◽

Scoring Model ◽

Multiple Classification ◽

Credit Scoring Model

A general credit scoring model solves a two-class classification problem, but in real life we often encounter multi-class classification problems. This paper proposes a multi-class support vector machine that can solve multi-class classification problems in the behavior assessment model.
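The abstract does not say how the binary classifier is extended to several classes. One common construction is one-vs-rest decomposition, sketched below under loud assumptions: a plain perceptron stands in for the support vector machine, the three clusters are toy data, and all names are hypothetical.

```python
def train_perceptron(X, y, epochs=1000):
    """Binary perceptron (stand-in for an SVM); y holds +1/-1.
    Returns the weight vector with the bias as its last entry."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(X, y):
            xa = list(x) + [1.0]  # append constant 1 for the bias term
            if t * sum(wi * xi for wi, xi in zip(w, xa)) <= 0:
                w = [wi + t * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w

def train_one_vs_rest(X, y):
    """Train one binary model per class: that class (+1) against the rest (-1)."""
    return {c: train_perceptron(X, [1 if t == c else -1 for t in y])
            for c in sorted(set(y))}

def predict(models, x):
    """Pick the class whose binary model gives the highest score."""
    xa = list(x) + [1.0]
    return max(models,
               key=lambda c: sum(wi * xi for wi, xi in zip(models[c], xa)))

# Toy data (an assumption): three well-separated clusters, classes 0, 1, 2.
X = [(0, 0), (1, 0), (0, 1), (1, 1),
     (5, 0), (6, 0), (5, 1), (6, 1),
     (0, 5), (1, 5), (0, 6), (1, 6)]
y = [0] * 4 + [1] * 4 + [2] * 4

models = train_one_vs_rest(X, y)
train_preds = [predict(models, x) for x in X]
```

Each class here is linearly separable from the other two, so every binary perceptron converges and the training points are recovered exactly. Other constructions, such as one-vs-one, would be equally consistent with the abstract.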

Improved credit scoring model using XGBoost with Bayesian hyper-parameter optimization

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i6.pp5477-5487 ◽

2021 ◽

Vol 11 (6) ◽

pp. 5477

Author(s):

Wirot Yotsawat ◽

Pakaket Wattuya ◽

Anongnart Srivihok

Keyword(s):

Parameter Optimization ◽

Missing Values ◽

Credit Scoring ◽

Gradient Boosting ◽

Support Vector ◽

Scoring Model ◽

Ensemble Models ◽

Proposed Model ◽

Extreme Gradient Boosting ◽

Credit Scoring Model

Several credit-scoring models have been developed using ensemble classifiers in order to improve the accuracy of assessment. However, among the ensemble models, little attention has been paid to tuning the hyper-parameters of the base learners, although these are crucial to constructing ensemble models. This study proposes an improved credit scoring model based on the extreme gradient boosting (XGB) classifier with Bayesian hyper-parameter optimization (XGB-BO). The model comprises two steps. First, data pre-processing handles missing values and scales the data. Second, Bayesian hyper-parameter optimization is applied to tune the hyper-parameters of the XGB classifier, which is then trained. The model is evaluated on four widely used public datasets: the German, Australian, Lending Club, and Polish datasets. Several state-of-the-art classification algorithms are implemented for predictive comparison with the proposed method. The proposed model showed promising results, with improvements in accuracy of 4.10%, 3.03%, and 2.76% on the German, Lending Club, and Australian datasets, respectively. According to the evaluation results, the proposed model outperformed commonly used techniques such as decision tree, support vector machine, neural network, logistic regression, random forest, and bagging. The experimental results confirmed that the XGB-BO model is suitable for assessing the creditworthiness of applicants.
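The first step described in this abstract (handling missing values and scaling) can be sketched as follows. The abstract does not state which imputation or scaling method is used, so mean imputation and min-max scaling are assumptions here, as are the function names and the toy rows. The Bayesian optimization step is not reproduced.

```python
def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in column]

def min_max_scale(column):
    """Rescale a column linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def preprocess(rows):
    """Impute, then scale, each column of a row-major table."""
    columns = [list(col) for col in zip(*rows)]
    cleaned = [min_max_scale(impute_mean(col)) for col in columns]
    return [list(r) for r in zip(*cleaned)]

# Toy applicants (an assumption): two numeric columns, None marks a gap.
rows = [[1000.0, 25.0],
        [None, 35.0],
        [3000.0, None],
        [2000.0, 45.0]]

print(preprocess(rows))
# [[0.0, 0.0], [0.5, 0.5], [1.0, 0.5], [0.5, 1.0]]
```

In a real pipeline the imputation means and the scaling bounds would be computed on the training split only and then reused on validation and test data, so that no information leaks from the held-out rows.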

A Novel Ensemble Credit Scoring Model Based on Extreme Learning Machine and Generalized Fuzzy Soft Sets

Mathematical Problems in Engineering ◽

10.1155/2020/7504764 ◽

2020 ◽

Vol 2020 ◽

pp. 1-12

Author(s):

Dayu Xu ◽

Xuyao Zhang ◽

Junguo Hu ◽

Jiahao Chen

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Training Data ◽

Soft Sets ◽

Scoring Model ◽

Fuzzy Soft Sets ◽

Credit Data ◽

Learning Machine ◽

Elm Classifier ◽

Credit Scoring Model

This paper discusses the combined application of ensemble learning, classification, and feature selection (FS) algorithms, together with training data balancing, to make the proposed credit scoring model more effective. The model comprises three major stages. First, the collected credit data are preprocessed. Second, an efficient feature selection algorithm based on the adaptive elastic net removes weakly related or uncorrelated variables to obtain high-quality training data. Third, a novel ensemble strategy balances the imbalanced training data set for each extreme learning machine (ELM) classifier. Finally, a new method based on generalized fuzzy soft sets (GFSS) theory weights the individual ELM classifiers in the ensemble model according to their classification accuracy. A novel cosine-based distance measurement algorithm for GFSS is also proposed to calculate the weight of each ELM classifier. To confirm the efficiency of the proposed ensemble credit scoring model, experiments were run on real-world credit data sets for comparison. The analysis, outcomes, and mathematical tests showed that the proposed model improves classification effectiveness in terms of average accuracy, area under the curve (AUC), H-measure, and Brier score compared with all other single classifiers and ensemble approaches.
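To make two of the ideas in this abstract concrete, the sketch below combines classifier outputs with accuracy-based weights and evaluates the result with accuracy and the Brier score. The GFSS weighting and the ELM classifiers themselves are not reproduced: plain accuracy-proportional weights are a stand-in, the probabilities are made-up toy values, and all names are hypothetical.

```python
def accuracy(probs, labels, threshold=0.5):
    """Share of cases where the thresholded probability matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(p == t for p, t in zip(preds, labels)) / len(labels)

def brier_score(probs, labels):
    """Mean squared difference between predicted probability and 0/1 label."""
    return sum((p - t) ** 2 for p, t in zip(probs, labels)) / len(labels)

def weighted_ensemble(member_probs, weights):
    """Weighted average of the members' predicted probabilities."""
    total = sum(weights)
    n_cases = len(member_probs[0])
    return [sum(w * probs[i] for w, probs in zip(weights, member_probs)) / total
            for i in range(n_cases)]

# Toy predicted default probabilities from two ensemble members (assumed).
labels = [1, 0, 1, 0]
member_a = [0.9, 0.2, 0.6, 0.4]
member_b = [0.6, 0.6, 0.7, 0.3]

# Stand-in weighting: each member is weighted by its own accuracy.
weights = [accuracy(member_a, labels), accuracy(member_b, labels)]
ensemble = weighted_ensemble([member_a, member_b], weights)

print(weights)                        # [1.0, 0.75]
print(accuracy(ensemble, labels))     # 1.0
print(brier_score(ensemble, labels))
```

For brevity the weights are computed on the same cases that are then scored. A fair comparison would estimate the weights on a separate validation split. Lower Brier scores are better, whereas higher accuracy and AUC are better.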

Credit scoring model based on a novel group feature selection method: The case of Chinese small-sized manufacturing enterprises

Journal of the Operational Research Society ◽

10.1080/01605682.2021.1880295 ◽

2021 ◽

pp. 1-17

Author(s):

Zhipeng Zhang ◽

Guotai Chi ◽

Sisira Colombage ◽

Ying Zhou

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Feature Selection Method ◽

Selection Method ◽

Scoring Model ◽

Manufacturing Enterprises ◽

Model Based ◽

Credit Scoring Model
