Hybrid Harmony Search–Artificial Intelligence Models in Credit Scoring

Entropy ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. 989
Author(s):  
Rui Ying Goh ◽  
Lai Soon Lee ◽  
Hsin-Vonn Seow ◽  
Kathiresan Gopal

Credit scoring is an important tool used by financial institutions to correctly identify defaulters and non-defaulters. Support Vector Machines (SVM) and Random Forest (RF) are Artificial Intelligence techniques that have been attracting interest due to their flexibility in accounting for various data patterns. Both are black-box models that are sensitive to hyperparameter settings. Feature selection can be performed on SVM to enable explanation with the reduced features, whereas feature importance computed by RF can be used for model explanation. The benefits of accuracy and interpretation allow for significant improvement in the area of credit risk and credit scoring. This paper proposes the use of Harmony Search (HS) to form a hybrid HS-SVM that performs feature selection and hyperparameter tuning simultaneously, and a hybrid HS-RF that tunes the hyperparameters. A Modified HS (MHS) is also proposed, with the main objective of achieving results comparable to the standard HS in a shorter computational time. MHS makes four main modifications to the standard HS: (i) elitism selection during memory consideration instead of random selection, (ii) dynamic exploration and exploitation operators in place of the original static operators, (iii) a self-adjusted bandwidth operator, and (iv) inclusion of additional termination criteria to reach faster convergence. Along with parallel computing, MHS effectively reduces the computational time of the proposed hybrid models. The proposed hybrid models are compared with standard statistical models across three different datasets commonly used in credit scoring studies. The computational results show that MHS-RF is the most robust in terms of model performance, model explainability and computational time.
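
As a rough illustration of the standard HS loop that the abstract builds on (not the authors' MHS, and not their HS-SVM or HS-RF code), the sketch below minimises a toy function. The names `harmony_search` and `sphere`, the parameter defaults, and the toy objective are all assumptions made for this example; in a credit scoring setting the objective would presumably be a cross-validated error over hyperparameters and a feature mask.

```python
import random


def harmony_search(objective, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.1,
                   iterations=2000, seed=0):
    """Minimise `objective` over box `bounds` with a basic Harmony Search."""
    rng = random.Random(seed)
    # Harmony memory: hms random candidate solutions and their scores.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [objective(h) for h in memory]
    for _ in range(iterations):
        new = []
        for j, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a value from a randomly chosen harmony.
                value = memory[rng.randrange(hms)][j]
                if rng.random() < par:
                    # Pitch adjustment: nudge the value within a fixed bandwidth.
                    value += rng.uniform(-bw, bw)
            else:
                # Random consideration: sample anywhere in the allowed range.
                value = rng.uniform(lo, hi)
            new.append(min(hi, max(lo, value)))  # clip to the bounds
        new_score = objective(new)
        worst = max(range(hms), key=scores.__getitem__)
        if new_score < scores[worst]:
            # Replace the worst stored harmony when the new one is better.
            memory[worst], scores[worst] = new, new_score
    best = min(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]


def sphere(x):
    """Toy objective (hypothetical stand-in for a model's validation error)."""
    return sum(v * v for v in x)


best_x, best_score = harmony_search(sphere, [(-5.0, 5.0)] * 3)
```

Here `hmcr`, `par` and `bw` are static, which is the behaviour the MHS modifications (ii) and (iii) replace with dynamic and self-adjusted operators, and memory consideration picks a stored harmony at random rather than by elitism as in modification (i).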

2019 ◽  
Vol 28 (05) ◽  
pp. 1950017 ◽  
Author(s):  
Guotai Chi ◽  
Mohammad Shamsu Uddin ◽  
Mohammad Zoynul Abedin ◽  
Kunpeng Yuan

Credit risk prediction is essential for banks and financial institutions, as it helps them avoid inappropriate assessments that can lead to wasted opportunities or monetary losses. Recently, hybrid prediction models, which combine traditional and modern artificial intelligence (AI) methods and provide better prediction capacity than single techniques, have been introduced. Similarly, using conventional and current AI technologies, researchers have recommended hybrid models that combine logistic regression (LR) with multilayer perceptron (MLP). To investigate the efficiency and viability of the proposed hybrid models, we compared 16 hybrid models created by combining logistic regression (LR), discriminant analysis (DA), and decision trees (DT) with four types of neural network (NN): adaptive neurofuzzy inference systems (ANFISs), deep neural networks (DNNs), radial basis function networks (RBFs) and multilayer perceptrons (MLPs). The experimental results, analysis, and statistical tests demonstrate the capacity of the proposed hybrid model to deliver a credit risk prediction technique that differs from all other approaches, as indicated by ten different performance measures. The classifier was validated on five real-world credit scoring data sets.


2020 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Yonghong Zhang ◽  
Shuhua Mao ◽  
Yuxiao Kang

Purpose: With the massive use of fossil energy polluting the natural environment, clean energy has gradually become the focus of future energy development. The purpose of this article is to propose a new hybrid forecasting model to forecast the production and consumption of clean energy.
Design/methodology/approach: Firstly, the memory characteristics of the production and consumption of clean energy were analyzed by the rescaled range analysis (R/S) method. Secondly, the original series was decomposed into several components and residuals with different characteristics by the ensemble empirical mode decomposition (EEMD) algorithm, and the residuals were predicted by the fractional derivative grey Bernoulli model [FDGBM (p, 1)]. The other components were predicted using artificial intelligence (AI) models (least square support vector regression [LSSVR] and artificial neural network [ANN]). Finally, the fitted values of each part were added to get the predicted value of the original series.
Findings: This study found that clean energy had memory characteristics. The hybrid models EEMD–FDGBM (p, 1)–LSSVR and EEMD–FDGBM (p, 1)–ANN performed significantly better than other models in the prediction of clean energy production and consumption.
Originality/value: Considering that clean energy has complex nonlinear and memory characteristics, this paper combines the EEMD method with the FDGBM (p, 1) and AI models to establish hybrid models that predict the consumption and output of clean energy.
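
To make the first step more concrete, the sketch below computes the rescaled range statistic R/S for a single window of a series. It is a minimal illustration under stated assumptions: the function name `rescaled_range` is hypothetical, the population standard deviation is used, and a full R/S analysis would repeat this over many window lengths and fit log(R/S) against log(n) to estimate the Hurst exponent, which is not shown here.

```python
import math


def rescaled_range(series):
    """Return R/S for one window: range of cumulative deviations over std dev."""
    n = len(series)
    mean = sum(series) / n
    # Cumulative sum of deviations from the window mean.
    cumulative, running = [], 0.0
    for value in series:
        running += value - mean
        cumulative.append(running)
    r = max(cumulative) - min(cumulative)
    # Population standard deviation; a constant series would give s == 0.
    s = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    return r / s


rs = rescaled_range([1, 2, 3, 4, 5])  # R = 3, S = sqrt(2), so R/S is about 2.1213
```

For the toy series the deviations are -2, -1, 0, 1, 2, the cumulative sums are -2, -3, -3, -2, 0, so R = 0 - (-3) = 3 and S = sqrt(10 / 5) = sqrt(2).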


2012 ◽  
Vol 433-440 ◽  
pp. 6527-6533 ◽  
Author(s):  
S. Harikrishna ◽  
M.A.H. Farquad ◽  
Shabana

Credit scoring is the use of statistical/intelligent models to transform relevant data into numerical measures that guide management and decision makers in decisions such as accept/reject, pricing, pay/no pay and collections. This study focuses on predicting whether a credit applicant can be categorized as good or bad from the supplied data. Many researchers have recently worked on ensembles of classifiers for such problems. It is observed from the literature that feature selection reduces the complexity of the system and also improves its accuracy. This paper analyzes the efficiency of SVM used in tandem for feature selection and as a classifier, and its application to credit scoring. In the first step, SVM-RFE (Recursive Feature Elimination) is employed for feature selection, and in the second step various architectures of SVM, viz., Standard SVM, PSO-SVM and EVO-SVM, are employed for classification. The effectiveness of the approaches tested is evaluated using UK credit data and German credit data. It is observed that feature selection using SVM-RFE not only simplifies the process of credit scoring but also improves the accuracy of the system.
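
The first step described above can be sketched as follows. This is an illustrative toy, not the paper's implementation: the linear SVM is fitted by plain batch subgradient descent on the regularised hinge loss with no bias term, the helper names `train_linear_svm` and `svm_rfe` are hypothetical, and the data are synthetic (feature 2 carries the label, the rest is small noise). The PSO-SVM and EVO-SVM classifiers of the second step are not shown.

```python
import random


def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit weights by batch subgradient descent on the regularised hinge loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(epochs):
        grad = [lam * wj for wj in w]  # gradient of the L2 penalty
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad[j] -= yi * xi[j] / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def svm_rfe(X, y, n_keep):
    """Drop the feature with the smallest squared weight until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        weakest = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[weakest]  # eliminate one feature, then retrain
    return active


# Synthetic data: labels are +1/-1, feature 2 equals the label, others are noise.
rng = random.Random(1)
y = [1 if i % 2 == 0 else -1 for i in range(40)]
X = [[rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1), float(label),
      rng.uniform(-0.1, 0.1)] for label in y]
selected = svm_rfe(X, y, n_keep=1)  # the informative feature survives: [2]
```

Retraining after every elimination is what distinguishes RFE from a one-shot ranking: the weights of the remaining features are re-estimated each round before the next feature is dropped.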


Author(s):  
LIGANG ZHOU ◽  
KIN KEUNG LAI ◽  
JEROME YEN

Credit scoring models are very important tools for financial institutions making credit granting decisions. In the last few decades, many quantitative methods have been used to develop credit scoring models, with a focus on maximizing classification accuracy. This paper proposes credit scoring models that maximize the area under the receiver operating characteristic curve (AUC), based on newly emerged support vector machine (SVM) techniques. Three main SVM models with different feature weighting strategies are discussed. The weighted SVM credit scoring models are tested using 10-fold cross validation on two real-world data sets, and the experimental results are compared with six other traditional methods, including linear regression, logistic regression, k nearest neighbor, decision tree, and neural network. Results demonstrate that the weighted 2-norm SVM with a radial basis function (RBF) kernel and a t-test feature weighting strategy has the best overall performance, by a very narrow margin over the other SVM models. However, it also consumes more computational time. Considering the balance of performance and time, least squares support vector machines (LSSVM) with an RBF kernel may be a better choice for large-scale credit scoring applications.
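
For readers unfamiliar with the evaluation protocol, the sketch below shows one way to compute AUC as the fraction of correctly ordered (defaulter, non-defaulter) pairs, plus a simple 10-fold index splitter. It only evaluates AUC; it does not reproduce the paper's AUC-maximizing SVM training. The function names `auc` and `k_fold_indices` and the interleaved fold assignment are assumptions made for this example.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; fold i holds indices i, i+k, i+2k, ..."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


example_auc = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])  # 3 of 4 pairs ordered: 0.75
```

In a cross-validated study, a model would be fitted on each `train` split, scored on the matching `test` split, and the per-fold AUC values averaged.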


Algorithms ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 260
Author(s):  
Naomi Simumba ◽  
Suguru Okami ◽  
Akira Kodaka ◽  
Naohiko Kohtake

Feature selection is crucial to the credit-scoring process, allowing for the removal of irrelevant variables with low predictive power. Conventional credit-scoring techniques treat this as a separate process wherein features are selected based on improving a single statistical measure, such as accuracy; however, recent research has focused on meaningful business parameters such as profit. More than one factor may be important to the selection process, making multi-objective optimization methods a necessity. However, the comparative performance of multi-objective methods has been known to vary depending on the test problem and specific implementation. This research employed a recent hybrid non-dominated sorting binary Grasshopper Optimization Algorithm (NSBGOA) and compared its performance on multi-objective feature selection for credit scoring to that of two popular benchmark algorithms in this space. Further comparison is made to determine the impact of changing the profit-maximizing base classifiers on algorithm performance. Experiments demonstrate that, of the base classifiers used, the neural network classifier improved the profit-based measure and minimized the mean number of features in the population the most. Additionally, the NSBGOA algorithm gave relatively smaller hypervolumes and increased computational time across all base classifiers, while giving the highest mean objective values for the solutions. It is clear that the base classifier has a significant impact on the results of multi-objective optimization. Therefore, the base classifier should be chosen carefully for each scenario.
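
The non-dominated sorting idea behind such multi-objective methods can be illustrated with a small Pareto-front filter over the two objectives the abstract mentions: a profit-based measure to maximize and the number of features to minimize. This is only a sketch of the dominance test, not NSBGOA; the function names and the candidate values are made up for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on both objectives and better on one.

    Each candidate is (profit, n_features): higher profit and fewer features win.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical feature subsets scored as (profit measure, number of features).
candidates = [(0.80, 5), (0.75, 3), (0.70, 4), (0.80, 6), (0.60, 1)]
front = pareto_front(candidates)  # [(0.80, 5), (0.75, 3), (0.60, 1)]
```

Here (0.70, 4) is dominated by (0.75, 3) and (0.80, 6) by (0.80, 5), so neither is on the front; the three survivors represent different trade-offs between profit and subset size.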


Author(s):  
Mohamad Ali Khalil ◽  
Khaled Hamad ◽  
Abdallah Shanableh

Accurate prediction of roadway traffic noise remains challenging. Many researchers continue to improve the performance of their models by either adding more variables or improving their modeling algorithms. In this research, machine learning (ML) modeling techniques were developed to predict roadway traffic noise accurately. The ML techniques applied were regression decision trees, support vector machines, ensembles, and artificial neural networks. The parameters of each of these models were fine-tuned to achieve the best performance. In addition, a state-of-the-art hybrid feature-selection technique was employed to select a minimum set of input features (variables) while maintaining the accuracy of the developed models. Optimizing the number of features used in the model reduces the resources needed to develop and use a roadway noise prediction model, and hence the development cost. The proposed approach was applied to develop a free-field roadway traffic noise model for Sharjah City in the United Arab Emirates. The best ML model developed was compared with a conventional regression model that had been developed earlier under the same conditions. The cross-validated results clearly indicate that the best ML model outperformed the regression model. The performance of the ML model was also assessed after reducing the number of its input features based on the outcome of the feature-selection algorithm; the model performance was only slightly affected. This result emphasizes the importance of considering only features that greatly influence roadway traffic noise.


2018 ◽  
Vol 27 (3) ◽  
pp. 465-488 ◽  
Author(s):  
Pawan Kumar Singh ◽  
Supratim Das ◽  
Ram Sarkar ◽  
Mita Nasipuri

The feature selection process can be considered a problem of global combinatorial optimization in machine learning, which removes irrelevant, noisy, and non-contributing features while retaining acceptable classification accuracy. Harmony search algorithm (HSA) is an evolutionary algorithm that is applied to various optimization problems such as scheduling, text summarization, water distribution networks, vehicle routing, etc. This paper presents a hybrid approach based on support vector machine and HSA for wrapper feature subset selection. This approach is used to select an optimized set of features from an initial set of features obtained by applying Modified log-Gabor filters on prepartitioned rectangular blocks of handwritten document images written in any of 12 official Indic scripts. The assessment justifies the need for feature selection in handwritten script identification, where local and global features are computed without knowing the exact importance of features. The proposed approach is also compared with four well-known evolutionary algorithms, namely genetic algorithm, particle swarm optimization, tabu search, and ant colony optimization, and two statistical feature dimensionality reduction techniques, namely greedy attribute search and principal component analysis. The acquired results show that the optimal set of features selected using HSA gives better accuracy in handwritten script recognition.

