Hybrid Harmony Search–Artificial Intelligence Models in Credit Scoring

Rui Ying Goh; Lai Soon Lee; Hsin-Vonn Seow; Kathiresan Gopal

doi:10.3390/e22090989

Entropy ◽

10.3390/e22090989 ◽

2020 ◽

Vol 22 (9) ◽

pp. 989

Author(s):

Rui Ying Goh ◽

Lai Soon Lee ◽

Hsin-Vonn Seow ◽

Kathiresan Gopal

Keyword(s):

Artificial Intelligence ◽

Feature Selection ◽

Credit Scoring ◽

Harmony Search ◽

Model Performance ◽

Hybrid Models ◽

Performance Model ◽

Computational Time ◽

Support Vector ◽

Artificial Intelligence Models

Credit scoring is an important tool that financial institutions use to distinguish defaulters from non-defaulters. Support Vector Machines (SVM) and Random Forest (RF) are Artificial Intelligence techniques that have attracted interest because they are flexible enough to account for various data patterns. Both are black-box models that are sensitive to hyperparameter settings. Feature selection can be performed on SVM to enable explanation with the reduced features, whereas feature importance computed by RF can be used for model explanation. The benefits of accuracy and interpretation allow for significant improvement in the area of credit risk and credit scoring. This paper proposes the use of Harmony Search (HS) to form a hybrid HS-SVM that performs feature selection and hyperparameter tuning simultaneously, and a hybrid HS-RF that tunes the hyperparameters. A Modified HS (MHS) is also proposed, with the main objective of achieving results comparable to those of the standard HS in a shorter computational time. MHS makes four main modifications to the standard HS: (i) elitism selection during memory consideration instead of random selection, (ii) dynamic exploration and exploitation operators in place of the original static operators, (iii) a self-adjusted bandwidth operator, and (iv) additional termination criteria to reach faster convergence. Along with parallel computing, MHS effectively reduces the computational time of the proposed hybrid models. The proposed hybrid models are compared with standard statistical models across three different datasets commonly used in credit scoring studies. The computational results show that MHS-RF is the most robust in terms of model performance, model explainability and computational time.
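To make the terms memory consideration, pitch adjustment and bandwidth concrete, the following is a minimal sketch of a standard Harmony Search, not the authors' MHS or their hybrid models. The function name, the parameter names (`hms`, `hmcr`, `par`, `bw_frac`), their default values and the toy loss are all assumptions chosen for illustration; in a real tuning run the objective would be a cross-validated error of the SVM or RF.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3,
                   bw_frac=0.05, iterations=2000, seed=0):
    """Minimise `objective` over the box `bounds` with a standard Harmony Search."""
    rng = random.Random(seed)
    # Harmony memory: hms random candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a randomly chosen harmony.
                value = memory[rng.randrange(hms)][d]
                if rng.random() < par:
                    # Pitch adjustment: perturb within a fixed (static) bandwidth.
                    value += rng.uniform(-1.0, 1.0) * bw_frac * (hi - lo)
            else:
                # Random consideration: sample anywhere in the allowed range.
                value = rng.uniform(lo, hi)
            new.append(min(max(value, lo), hi))
        new_score = objective(new)
        # Replace the worst stored harmony if the new one is better.
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def toy_loss(x):
    # Hypothetical stand-in for a cross-validated error surface over two hyperparameters.
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2


best_x, best_loss = harmony_search(toy_loss, [(-5.0, 5.0), (-5.0, 5.0)])
```

In this sketch the memory member is picked uniformly at random and `hmcr`, `par` and the bandwidth stay fixed, which are the static choices that the abstract says MHS replaces with elitism selection, dynamic operators and a self-adjusted bandwidth.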

Hybrid Model for Credit Risk Prediction: An Application of Neural Network Approaches

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213019500179 ◽

2019 ◽

Vol 28 (05) ◽

pp. 1950017 ◽

Cited By ~ 1

Author(s):

Guotai Chi ◽

Mohammad Shamsu Uddin ◽

Mohammad Zoynul Abedin ◽

Kunpeng Yuan

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Logistic Regression ◽

Credit Risk ◽

Risk Prediction ◽

Hybrid Model ◽

Credit Scoring ◽

Hybrid Models ◽

Multilayer Perceptrons ◽

Inference Systems

Credit risk prediction is essential for banks and financial institutions because it helps them avoid inappropriate assessments that can lead to missed opportunities or monetary losses. Recently, hybrid prediction models have been introduced; these combine traditional and modern artificial intelligence (AI) methods and provide better predictive capacity than single techniques. In this vein, researchers have recommended hybrid models that combine logistic regression (LR) with multilayer perceptrons (MLPs). To investigate the efficiency and viability of the proposed hybrid models, we compared 16 hybrid models created by combining logistic regression (LR), discriminant analysis (DA), and decision trees (DT) with four types of neural network (NN): adaptive neurofuzzy inference systems (ANFISs), deep neural networks (DNNs), radial basis function networks (RBFs) and multilayer perceptrons (MLPs). The experimental results, analysis, and statistical tests demonstrate the capacity of the proposed hybrid model to yield a credit risk prediction technique different from all other approaches, as indicated by ten different performance measures. The classifier was validated on five real-world credit scoring data sets.

A clean energy forecasting model based on artificial intelligence and fractional derivative grey Bernoulli models

Grey Systems Theory and Application ◽

10.1108/gs-08-2020-0101 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Yonghong Zhang ◽

Shuhua Mao ◽

Yuxiao Kang

Keyword(s):

Artificial Intelligence ◽

Fractional Derivative ◽

Clean Energy ◽

Hybrid Models ◽

Support Vector ◽

Forecasting Model ◽

Content Type ◽

Original Series ◽

Production And Consumption ◽

Memory Characteristics

Purpose: With the massive use of fossil energy polluting the natural environment, clean energy has gradually become the focus of future energy development. The purpose of this article is to propose a new hybrid forecasting model to forecast the production and consumption of clean energy.

Design/methodology/approach: First, the memory characteristics of the production and consumption of clean energy were analyzed by the rescaled range analysis (R/S) method. Second, the original series was decomposed into several components and residuals with different characteristics by the ensemble empirical mode decomposition (EEMD) algorithm, and the residuals were predicted by the fractional derivative grey Bernoulli model [FDGBM (p, 1)]. The other components were predicted using artificial intelligence (AI) models (least squares support vector regression [LSSVR] and artificial neural network [ANN]). Finally, the fitted values of each part were added to obtain the predicted value of the original series.

Findings: This study found that clean energy had memory characteristics. The hybrid models EEMD–FDGBM (p, 1)–LSSVR and EEMD–FDGBM (p, 1)–ANN performed significantly better than other models in the prediction of clean energy production and consumption.

Originality/value: Clean energy has complex nonlinear and memory characteristics. In this paper, the EEMD method was combined with the FDGBM (p, 1) and AI models to establish hybrid models that predict the consumption and output of clean energy.
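The rescaled range (R/S) analysis named in the methodology can be illustrated with a short sketch. It shows the classical R/S statistic for one window and a Hurst exponent estimated as the slope of log(R/S) against log(window size); the function names and the choice of window sizes are assumptions, and the abstract does not say which R/S variant the authors used.

```python
import math
import random


def rescaled_range(series):
    """R/S of one window: range of cumulative mean deviations over the std deviation."""
    n = len(series)
    mean = sum(series) / n
    cumulative, running = [], 0.0
    for x in series:
        running += x - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return spread / std if std > 0 else 0.0


def hurst_exponent(series, window_sizes):
    """Least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for w in window_sizes:
        # Split the series into non-overlapping windows of length w.
        chunks = [series[i:i + w] for i in range(0, len(series) - w + 1, w)]
        values = [v for v in (rescaled_range(c) for c in chunks) if v > 0]
        if values:
            xs.append(math.log(w))
            ys.append(math.log(sum(values) / len(values)))
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


# Illustrative data only: white noise and its running sum (a random walk).
rng = random.Random(1)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, total = [], 0.0
for step in noise:
    total += step
    walk.append(total)

h_noise = hurst_exponent(noise, [16, 32, 64, 128, 256])
h_walk = hurst_exponent(walk, [16, 32, 64, 128, 256])
```

An exponent clearly above 0.5 is usually read as persistence (long memory), which is the sense in which a series is said to have memory characteristics; the random walk here is included only as a strongly persistent contrast to the noise.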

Cost-based feature selection for Support Vector Machines: An application in credit scoring

European Journal of Operational Research ◽

10.1016/j.ejor.2017.02.037 ◽

2017 ◽

Vol 261 (2) ◽

pp. 656-665 ◽

Cited By ~ 46

Author(s):

Sebastián Maldonado ◽

Juan Pérez ◽

Cristián Bravo

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

Credit Scoring ◽

Support Vector ◽

Vector Machines ◽

Selection For

Credit Scoring Using Support Vector Machine: A Comparative Analysis

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.433-440.6527 ◽

2012 ◽

Vol 433-440 ◽

pp. 6527-6533 ◽

Cited By ~ 2

Author(s):

S. Harikrishna ◽

M.A.H. Farquad ◽

Shabana

Keyword(s):

Feature Selection ◽

Credit Scoring ◽

Decision Makers ◽

Recursive Feature Elimination ◽

Second Step ◽

Support Vector ◽

Ensemble Of Classifiers ◽

Well Efficiency ◽

Credit Data ◽

Intelligent Models

Credit scoring is the use of statistical or intelligent models to transform relevant data into numerical measures that guide management and decision makers in decisions such as accept/reject, pricing, pay/no pay and collections. This study focuses on predicting whether a credit applicant can be categorized as good or bad from the supplied data. Many researchers have recently worked on ensembles of classifiers for such problems. The literature shows that feature selection reduces the complexity of the system and also improves its accuracy. This paper analyzes the efficiency of SVM used in tandem for feature selection and for classification, and its application to credit scoring. In the first step, SVM-RFE (Recursive Feature Elimination) is employed for feature selection; in the second step, various SVM architectures, viz. Standard SVM, PSO-SVM and EVO-SVM, are employed for classification. The effectiveness of the approaches tested is evaluated using UK credit data and German credit data. Feature selection using SVM-RFE is observed not only to simplify the process of credit scoring but also to improve the accuracy of the system.

Credit Scoring Models with AUC Maximization Based on Weighted SVM

International Journal of Information Technology & Decision Making ◽

10.1142/s0219622009003582 ◽

2009 ◽

Vol 08 (04) ◽

pp. 677-696 ◽

Cited By ~ 31

Author(s):

Ligang Zhou ◽

Kin Keung Lai ◽

Jerome Yen

Keyword(s):

Support Vector Machines ◽

Quantitative Methods ◽

Credit Scoring ◽

Feature Weighting ◽

Computational Time ◽

Support Vector ◽

Operating Characteristics ◽

Auc Maximization ◽

Vector Machines ◽

Rbf Kernel

Credit scoring models are very important tools for financial institutions making credit-granting decisions. In the last few decades, many quantitative methods have been used to develop credit scoring models, with a focus on maximizing classification accuracy. This paper proposes credit scoring models that maximize the area under the receiver operating characteristic curve (AUC), based on newly emerged support vector machine (SVM) techniques. Three main SVM models with different feature-weighting strategies are discussed. The weighted SVM credit scoring models are tested using 10-fold cross validation on two real-world data sets, and the experimental results are compared with six other traditional methods, including linear regression, logistic regression, k-nearest neighbor, decision tree, and neural network. Results demonstrate that the weighted 2-norm SVM with a radial basis function (RBF) kernel and a t-test feature-weighting strategy has the best overall performance among the SVM models, although by a very narrow margin. However, it also consumes more computational time. Considering the balance of performance and time, least squares support vector machines (LSSVM) with an RBF kernel may be a better choice for large-scale credit scoring applications.
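For readers unfamiliar with the AUC, a small sketch may help. It computes the AUC through its pairwise interpretation: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as one half. The function name and the sample labels and scores are illustrative assumptions; this only evaluates a scorer and is not the paper's AUC-maximizing SVM training procedure.

```python
def auc_score(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a correctly ordered pair
    return wins / (len(positives) * len(negatives))


# Illustrative data: 1 marks one class, 0 the other; scores are model outputs.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc_score(labels, scores))  # 3 of the 4 pairs are ordered correctly: 0.75
```

Unlike accuracy, this measure depends only on how the scores rank the two classes, not on any particular cut-off. The double loop is quadratic in the sample size, which is acceptable for illustration; a rank-based formulation would be preferable on large data sets.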

Comparison of Profit-Based Multi-Objective Approaches for Feature Selection in Credit Scoring

Algorithms ◽

10.3390/a14090260 ◽

2021 ◽

Vol 14 (9) ◽

pp. 260

Author(s):

Naomi Simumba ◽

Suguru Okami ◽

Akira Kodaka ◽

Naohiko Kohtake

Keyword(s):

Feature Selection ◽

Selection Process ◽

Credit Scoring ◽

Computational Time ◽

Multi Objective Optimization ◽

Multi Objective ◽

Base Classifier ◽

Grasshopper Optimization Algorithm ◽

Specific Implementation ◽

The Impact

Feature selection is crucial to the credit-scoring process, allowing for the removal of irrelevant variables with low predictive power. Conventional credit-scoring techniques treat this as a separate process wherein features are selected based on improving a single statistical measure, such as accuracy; however, recent research has focused on meaningful business parameters such as profit. More than one factor may be important to the selection process, making multi-objective optimization methods a necessity. However, the comparative performance of multi-objective methods has been known to vary depending on the test problem and specific implementation. This research employed a recent hybrid non-dominated sorting binary Grasshopper Optimization Algorithm (NSBGOA) and compared its performance on multi-objective feature selection for credit scoring to that of two popular benchmark algorithms in this space. A further comparison is made to determine the impact of changing the profit-maximizing base classifiers on algorithm performance. Experiments demonstrate that, of the base classifiers used, the neural network classifier improved the profit-based measure and minimized the mean number of features in the population the most. Additionally, the NSBGOA algorithm gave relatively smaller hypervolumes and increased computational time across all base classifiers, while giving the highest mean objective values for the solutions. It is clear that the base classifier has a significant impact on the results of multi-objective optimization. Therefore, the base classifier should be chosen carefully for each scenario.
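The notion of non-dominated solutions that underlies multi-objective feature selection can be sketched briefly. The example assumes two objectives, a profit measure to maximize and a feature count to minimize, and filters a list of candidate subsets down to its Pareto front. The function names and the candidate values are hypothetical and are not taken from the paper; the grasshopper optimizer itself is not reproduced here.

```python
def dominates(a, b):
    """True if candidate a is no worse than b on both objectives and better on one.

    Each candidate is a (profit, n_features) pair: profit is maximised,
    n_features is minimised.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (profit, n_features) values for five feature subsets.
candidates = [(0.30, 12), (0.28, 8), (0.25, 10), (0.30, 15), (0.20, 5)]
front = pareto_front(candidates)
print(front)  # [(0.3, 12), (0.28, 8), (0.2, 5)]
```

Here (0.25, 10) is dropped because (0.28, 8) earns more profit with fewer features, and (0.30, 15) is dropped because (0.30, 12) matches its profit with fewer features. Measures such as hypervolume then summarize how much of the objective space a front like this covers.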

Credit Scoring Model based on Kernel Density Estimation and Support Vector Machine for Group Feature Selection

2018 International Conference on Advances in Computing, Communications and Informatics (ICACCI) ◽

10.1109/icacci.2018.8554524 ◽

2018 ◽

Author(s):

Xingzhi Zhang ◽

Zhurong Zhou

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Density Estimation ◽

Kernel Density Estimation ◽

Credit Scoring ◽

Kernel Density ◽

Support Vector ◽

Scoring Model ◽

Model Based ◽

Credit Scoring Model

Developing Machine Learning Models to Predict Roadway Traffic Noise: An Opportunity to Escape Conventional Techniques

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/0361198119838514 ◽

2019 ◽

Vol 2673 (4) ◽

pp. 158-172 ◽

Cited By ~ 1

Author(s):

Mohamad Ali Khalil ◽

Khaled Hamad ◽

Abdallah Shanableh

Keyword(s):

Machine Learning ◽

Feature Selection ◽

United Arab Emirates ◽

Traffic Noise ◽

Free Field ◽

Model Performance ◽

Noise Model ◽

Support Vector ◽

Feature Selection Technique ◽

Performance Results

Accurate prediction of roadway traffic noise remains challenging. Many researchers continue to improve the performance of their models by either adding more variables or improving their modeling algorithms. In this research, machine learning (ML) modeling techniques were developed to predict roadway traffic noise accurately. The ML techniques applied were regression decision trees, support vector machines, ensembles, and artificial neural networks. The parameters of each of these models were fine-tuned to achieve the best performance results. In addition, a state-of-the-art hybrid feature-selection technique was employed to select a minimum set of input features (variables) while maintaining the accuracy of the developed models. Optimizing the number of features used in the model reduces the resources needed to develop and use a roadway noise prediction model, and hence the development cost. The proposed approach was applied to develop a free-field roadway traffic noise model for Sharjah City in the United Arab Emirates. The best ML model was compared with a conventional regression model that had been developed earlier under the same conditions. The cross-validated results clearly indicate that the best ML model outperformed the regression model. The performance of the ML model was also assessed after reducing the number of its input features based on the outcome of the feature-selection algorithm; the model performance was only slightly affected. This result emphasizes the importance of considering only features that greatly influence roadway traffic noise.

Computational time reduction for credit scoring: An integrated approach based on support vector machine and stratified sampling method

Expert Systems with Applications ◽

10.1016/j.eswa.2011.12.057 ◽

2012 ◽

Vol 39 (8) ◽

pp. 6774-6781 ◽

Cited By ~ 53

Author(s):

Akhil Bandhu Hens ◽

Manoj Kumar Tiwari

Keyword(s):

Support Vector Machine ◽

Sampling Method ◽

Credit Scoring ◽

Integrated Approach ◽

Stratified Sampling ◽

Computational Time ◽

Support Vector ◽

Time Reduction

Feature Selection Using Harmony Search for Script Identification from Handwritten Document Images

Journal of Intelligent Systems ◽

10.1515/jisys-2016-0070 ◽

2018 ◽

Vol 27 (3) ◽

pp. 465-488 ◽

Cited By ~ 5

Author(s):

Pawan Kumar Singh ◽

Supratim Das ◽

Ram Sarkar ◽

Mita Nasipuri

Keyword(s):

Feature Selection ◽

Harmony Search ◽

Distribution Networks ◽

Feature Subset Selection ◽

Support Vector ◽

Feature Subset ◽

Document Images ◽

Global Features ◽

Script Identification ◽

Handwritten Document

The feature selection process can be considered a problem of global combinatorial optimization in machine learning, which removes irrelevant, noisy, and non-contributing features while retaining acceptable classification accuracy. The harmony search algorithm (HSA) is an evolutionary algorithm that has been applied to various optimization problems such as scheduling, text summarization, water distribution networks, and vehicle routing. This paper presents a hybrid approach based on support vector machine and HSA for wrapper feature subset selection. This approach is used to select an optimized set of features from an initial set of features obtained by applying Modified log-Gabor filters on prepartitioned rectangular blocks of handwritten document images written in any of 12 official Indic scripts. The assessment justifies the need for feature selection in handwritten script identification, where local and global features are computed without knowing the exact importance of each feature. The proposed approach is also compared with four well-known evolutionary algorithms, namely genetic algorithm, particle swarm optimization, tabu search, and ant colony optimization, and with two statistical feature dimensionality reduction techniques, namely greedy attribute search and principal component analysis. The results show that the optimal set of features selected using HSA gives better accuracy in handwritten script recognition.
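A wrapper of this kind searches over 0/1 feature masks and scores each mask by training a classifier on the selected features. The sketch below shows a binary Harmony Search under stated assumptions: pitch adjustment is modelled as a bit flip, and the fitness is a toy function standing in for the cross-validated accuracy of an SVM. The names `binary_harmony_search`, `toy_fitness` and `RELEVANT`, and all parameter values, are hypothetical and do not come from the paper.

```python
import random


def binary_harmony_search(fitness, n_features, hms=8, hmcr=0.9, par=0.1,
                          iterations=3000, seed=0):
    """Maximise `fitness` over 0/1 feature masks with a binary Harmony Search."""
    rng = random.Random(seed)
    memory = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(hms)]
    scores = [fitness(mask) for mask in memory]
    for _ in range(iterations):
        new = []
        for d in range(n_features):
            if rng.random() < hmcr:
                bit = memory[rng.randrange(hms)][d]  # memory consideration
                if rng.random() < par:
                    bit = 1 - bit                    # pitch adjustment as a bit flip
            else:
                bit = rng.randint(0, 1)              # random consideration
            new.append(bit)
        new_score = fitness(new)
        # Replace the worst stored mask if the new one scores higher.
        worst = min(range(hms), key=scores.__getitem__)
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


RELEVANT = {0, 3, 7}  # hypothetical informative feature indices


def toy_fitness(mask):
    # Stand-in for classifier accuracy on the selected subset:
    # reward informative features, charge a small cost per selected feature.
    chosen = {i for i, bit in enumerate(mask) if bit}
    return len(chosen & RELEVANT) - 0.1 * len(chosen)


best_mask, best_score = binary_harmony_search(toy_fitness, n_features=12)
selected = [i for i, bit in enumerate(best_mask) if bit]
```

In a real wrapper, `toy_fitness` would be replaced by a function that trains and validates the classifier on the masked feature set, which is where nearly all of the computational cost lies.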
