A Machine Learning Approach to Evaluate the Performance of Rural Bank

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jun Wei ◽  
Tao Ye ◽  
Zhe Zhang

In current work on evaluating the performance of commercial banks, most studies focus only on the relationship between a single characteristic and performance and lack a comprehensive analysis of characteristics. They also focus mainly on causal inference and lack systematic quantitative conclusions from the perspective of prediction. This paper is the first to comprehensively investigate how well multidimensional features predict commercial bank performance, using boosting regression trees. Data in finance-related fields are relatively high-dimensional: alongside observable price data and financial fundamentals data, there are many unobservable, undisclosed data and events, and many sources of income cannot be explained by existing models. To suit the characteristics of commercial bank data, this paper proposes a gradient boosting regression tree algorithm with an adaptively reduced step size for bank performance evaluation. In this method, a random subsample is drawn before each regression tree is trained. The adaptively reduced step size replaces the step size setting of the original algorithm, which overcomes the low accuracy and poor generalization ability of the existing regression decision tree model. Compared with the BIRCH algorithm for classifying existing data, the proposed gradient boosting regression tree algorithm with adaptively reduced step size obtains better classification results. Empirically, this paper uses data from rural banks in 30 provinces in China to classify the different characteristics of rural banks’ performance in order to evaluate that performance better.
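The two ingredients named in the abstract, a random subsample before each tree and a step size that shrinks as boosting proceeds, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the abstract does not give the adaptation rule, so a geometric decay (`eta0 * decay ** m`) is assumed here, one-split stumps on a single feature stand in for full regression trees, and all function and parameter names are hypothetical.

```python
import random


def fit_stump(xs, residuals):
    """Fit a one-split regression stump on a single feature by least squares."""
    best = None
    # Candidate thresholds exclude the largest x so the right side is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All x values identical: fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return (xs[0], mean, mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbrt(xs, ys, n_rounds=50, eta0=0.5, decay=0.98, subsample=0.7, seed=0):
    """Boost stumps on squared-error residuals.

    ASSUMPTION: the step size decays geometrically per round; the paper's
    actual adaptive rule is not stated in the abstract.
    """
    rng = random.Random(seed)
    base = sum(ys) / len(ys)          # initial constant model
    preds = [base] * len(ys)
    model = []
    n_sub = max(2, int(subsample * len(xs)))
    for m in range(n_rounds):
        eta = eta0 * decay ** m       # reduced step size for round m
        idx = rng.sample(range(len(xs)), n_sub)  # random subsample, no replacement
        stump = fit_stump([xs[i] for i in idx],
                          [ys[i] - preds[i] for i in idx])
        model.append((eta, stump))
        preds = [p + eta * predict_stump(stump, x) for p, x in zip(preds, xs)]
    return base, model


def predict_gbrt(fitted, x):
    base, model = fitted
    return base + sum(eta * predict_stump(stump, x) for eta, stump in model)


def mse(fitted, xs, ys):
    return sum((predict_gbrt(fitted, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)
```

Calling `fit_gbrt(xs, ys)` on paired lists of feature values and targets returns the base prediction plus a list of `(step size, stump)` pairs, so the step size used in each round can be inspected directly.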

Author(s):  
Guangtong Gu ◽  
Bing Xu

Based on purchase price data from the new real estate markets of three cities in China (Beijing, Shanghai, and Guangzhou), including architectural features, neighborhood property features, and location features, this study builds a boosting regression tree model to examine the factors that influence housing prices, and their influence paths, from a micro-level perspective. First, a classical hedonic price model is constructed to analyze and compare the factors that significantly affect housing prices in the market segments of the three cities. Second, the gradient boosting regression tree method proposed in this paper is applied to the three markets combined to analyze the influence paths, the factors, and the importance of each type of hedonic housing price factor. The influence paths of hedonic housing prices and the decision tree rules are visualized, and the significant housing features are effectively extracted. Finally, we present three main conclusions and several suggestions for policy makers to improve urban functions while stabilizing real estate prices.


2020 ◽  
Vol 12 (4) ◽  
pp. 1481 ◽  
Author(s):  
Xiaobo Xue Romeiko ◽  
Zhijian Guo ◽  
Yulei Pang ◽  
Eun Kyung Lee ◽  
Xuesong Zhang

Agriculture ranks as one of the top contributors to global warming and nutrient pollution. Quantifying life cycle environmental impacts from agricultural production provides a scientific foundation for forming effective remediation strategies. However, methods capable of accurately and efficiently calculating spatially explicit life cycle global warming (GW) and eutrophication (EU) impacts at the county scale over a geographic region are lacking. The objective of this study was to determine the most efficient and accurate model for estimating spatially explicit life cycle GW and EU impacts at the county scale, with corn production in the U.S. Midwest as a case study. This study compared the predictive accuracies and efficiencies of five distinct supervised machine learning (ML) algorithms, testing various sample sizes and feature selections. The results indicated that the gradient boosting regression tree model built with approximately 4000 records of monthly weather features yielded the highest predictive accuracy, with cross-validation (CV) values of 0.8 for the life cycle GW impacts. Across all modeling scenarios, the gradient boosting regression tree model built with nearly 6000 records of monthly weather features showed the highest predictive accuracy, with CV values of 0.87 for the life cycle EU impacts. However, predictive accuracy was improved at the cost of simulation time: the gradient boosting regression tree model required the longest training time. The ML algorithms proved to be one million times faster than the traditional process-based model while retaining high predictive accuracy. This indicates that ML can serve as a surrogate for process-based models to estimate life cycle environmental impacts across large geographic areas and timeframes.
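A cross-validation score of the kind reported above can be computed as in the sketch below. It rests on assumptions: the abstract does not say which metric the CV values measure, so an R²-style score is assumed, and a one-feature least-squares line stands in for the gradient boosting model purely to keep the example short. The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_r2(xs, ys, k=5):
    """Average held-out R^2 over k folds (record i goes to fold i % k)."""
    scores = []
    for fold in range(k):
        test_idx = [i for i in range(len(xs)) if i % k == fold]
        train_idx = [i for i in range(len(xs)) if i % k != fold]
        # Fit only on the training folds, then score on the held-out fold.
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        preds = [slope * xs[i] + intercept for i in test_idx]
        scores.append(r2_score([ys[i] for i in test_idx], preds))
    return sum(scores) / k
```

Because each fold is scored on records the model never saw during fitting, the averaged value estimates out-of-sample accuracy rather than goodness of fit on the training data.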


2019 ◽  
Vol 21 (9) ◽  
pp. 662-669 ◽  
Author(s):  
Junnan Zhao ◽  
Lu Zhu ◽  
Weineng Zhou ◽  
Lingfeng Yin ◽  
Yuchen Wang ◽  
...  

Background: Thrombin is the central protease of the vertebrate blood coagulation cascade and is closely related to cardiovascular diseases. The inhibitory constant Ki is the most significant property of thrombin inhibitors. Method: This study was carried out to predict Ki values of thrombin inhibitors from a large data set by using machine learning methods. Because machine learning can find non-intuitive regularities in high-dimensional datasets, it can be used to build effective predictive models. A total of 6554 descriptors were collected for each compound, and an efficient descriptor selection method was used to select appropriate descriptors. Four different methods, namely multiple linear regression (MLR), K Nearest Neighbors (KNN), Gradient Boosting Regression Tree (GBRT), and Support Vector Machine (SVM), were implemented to build prediction models with the selected descriptors. Results: The SVM model was the best among these methods, with R2 = 0.84 and MSE = 0.55 for the training set and R2 = 0.83 and MSE = 0.56 for the test set. Several validation methods, such as the y-randomization test and applicability domain evaluation, were adopted to assess the robustness and generalization ability of the model. The final model shows excellent stability and predictive ability and can be employed for rapid estimation of the inhibitory constant, which is helpful for designing novel thrombin inhibitors.
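The y-randomization test mentioned in the abstract checks that a model's fit does not arise by chance: the target values are shuffled, the model is refit, and the resulting scores should collapse relative to the score on the true targets. The sketch below illustrates the idea under assumptions: a one-descriptor least-squares line stands in for the paper's SVM model, and the function names, trial count, and seed are hypothetical.

```python
import random


def r_squared(xs, ys):
    """Fit a one-feature least-squares line and return its training R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def y_randomization(xs, ys, n_trials=20, seed=0):
    """Return (R^2 on true targets, list of R^2 values on shuffled targets)."""
    rng = random.Random(seed)
    true_r2 = r_squared(xs, ys)
    shuffled_r2 = []
    for _ in range(n_trials):
        permuted = list(ys)
        rng.shuffle(permuted)  # break the link between descriptor and target
        shuffled_r2.append(r_squared(xs, permuted))
    return true_r2, shuffled_r2
```

A model passes the check when the score on the true targets sits well above every score obtained on shuffled targets; comparable scores would suggest the original fit reflects chance correlation.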


2021 ◽  
Vol 7 ◽  
pp. 1246-1255
Author(s):  
Peng Nie ◽  
Michele Roccotelli ◽  
Maria Pia Fanti ◽  
Zhengfeng Ming ◽  
Zhiwu Li

2020 ◽  
Vol 2020 ◽  
pp. 1-19
Author(s):  
De-Cheng Feng ◽  
Bo Fu

In this paper, an intelligent modeling approach is presented to predict the shear strength of internal reinforced concrete (RC) beam-column joints and to analyze the sensitivity of the shear strength to its influencing factors. The proposed approach is based on a well-known boosting-family ensemble machine learning (ML) algorithm, the gradient boosting regression tree (GBRT), which generates a strong predictive model by integrating several weak predictors obtained from well-known individual ML algorithms, e.g., DT, ANN, and SVM. The strong model is boosted in that each weak predictor receives its own weight in the final combination according to its performance. Compared with conventional mechanics-driven shear strength models, e.g., the well-known modified compression field theory (MCFT), the proposed model avoids the complicated derivation of the shear mechanism and the calibration of the empirical parameters involved; it thus provides a more convenient, fast, and robust alternative for predicting the shear strength of internal RC joints. To train and test the GBRT model, a total of 86 internal RC joint specimens are collected from the literature, and four traditional ML models and the MCFT model are employed for comparison. The results indicate that the GBRT model is superior to both the traditional ML models and the MCFT model, as its degree of fit is the highest and its prediction dispersion is the lowest. Finally, the model is used to investigate the influence of different parameters on the shear strength of internal RC joints, and the sensitivity and importance of the corresponding parameters are obtained.

