Predicting Hard Rock Pillar Stability Using GBDT, XGBoost, and LightGBM Algorithms

Weizhang Liang; Suizhi Luo; Guoyan Zhao; Hao Wu

doi:10.3390/math8050765

Mathematics ◽ 10.3390/math8050765 ◽ 2020 ◽ Vol 8 (5) ◽ pp. 765 ◽ Cited By ~ 6

Author(s): Weizhang Liang ◽ Suizhi Luo ◽ Guoyan Zhao ◽ Hao Wu

Keyword(s): Large Scale ◽ Prediction Models ◽ Hard Rock ◽ Gradient Boosting ◽ Pillar Stability ◽ Rock Pillar ◽ Light Gradient ◽ Gradient Boosting Machine ◽ Extreme Gradient Boosting ◽ Hard Rock Mines

Predicting pillar stability is a vital task in hard rock mines as pillar instability can cause large-scale collapse hazards. However, it is challenging because pillar stability is affected by many factors. With the accumulation of pillar stability cases, machine learning (ML) has shown great potential to predict pillar stability. This study aims to predict hard rock pillar stability using gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) algorithms. First, 236 cases with five indicators were collected from seven hard rock mines. Afterwards, the hyperparameters of each model were tuned using a five-fold cross-validation (CV) approach. Based on the optimal hyperparameter configuration, prediction models were constructed using the training set (70% of the data). Finally, the test set (30% of the data) was adopted to evaluate the performance of each model. The precision, recall, and F1 indexes were utilized to analyze the prediction results of each level, and the accuracy and the macro-averaged values of these indexes were used to assess the overall prediction performance. Based on the sensitivity analysis of indicators, the relative importance of each indicator was obtained. In addition, the safety factor approach and other ML algorithms were adopted as comparisons. The results showed that the GBDT, XGBoost, and LightGBM algorithms achieved a better comprehensive performance, and their prediction accuracies were 0.8310, 0.8310, and 0.8169, respectively. The average pillar stress and the ratio of pillar width to pillar height had the greatest influence on the prediction results. The proposed methodology can provide a reliable reference for pillar design and stability risk management.
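
The per-level and macro-averaged indexes named in the abstract above can be illustrated with a short sketch. It is not the authors' code: the label names and the tiny prediction lists are hypothetical, and only the Python standard library is used. Macro averaging here means the unweighted mean of the per-class values.

```python
from collections import Counter

def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label that appears."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1          # correct prediction for class t
        else:
            fp[p] += 1          # p was predicted but wrong
            fn[t] += 1          # t was the truth but missed
    scores = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[c] = (precision, recall, f1)
    return scores

def macro_average(scores):
    """Unweighted mean of precision, recall and F1 over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))

def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted level matches the true level."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical stability levels and predictions, for illustration only.
y_true = ["stable", "stable", "unstable", "failed", "failed", "unstable"]
y_pred = ["stable", "unstable", "unstable", "failed", "stable", "unstable"]

scores = per_class_scores(y_true, y_pred)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(scores)
print(accuracy(y_true, y_pred), macro_p, macro_r, macro_f1)
```

Because the macro average weights every level equally, a rarely occurring level influences the overall score as much as a frequent one, which plain accuracy does not capture.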

Protein pKa prediction by tree-based machine learning

10.26434/chemrxiv-2021-4d420 ◽ 2021

Author(s): Ada Y. Chen ◽ Juyong Lee ◽ Ana Damjanovic ◽ Bernard R. Brooks

Keyword(s): Machine Learning ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ Pka Prediction ◽ Light Gradient ◽ Structure Database ◽ Gradient Boosting Machine ◽ Extreme Gradient Boosting ◽ Better Than ◽ Protein Pka

We present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA. The overall RMSE for this model is 0.69, with surface and buried RMSE values being 0.56 and 0.78, respectively, considering six residue types (Asp, Glu, His, Lys, Cys and Tyr), and 0.63 when considering Asp, Glu, His and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observe that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.

Application of Machine-Learning-Based Fusion Model in Visibility Forecast: A Case Study of Shanghai, China

Remote Sensing ◽ 10.3390/rs13112096 ◽ 2021 ◽ Vol 13 (11) ◽ pp. 2096

Author(s): Zhongqi Yu ◽ Yuanhao Qu ◽ Yunxin Wang ◽ Jinghui Ma ◽ Yu Cao

Keyword(s): Machine Learning ◽ Prediction Models ◽ Eastern China ◽ Prediction Method ◽ Sampling Technique ◽ Environmental Modeling ◽ Gradient Boosting ◽ Fusion Model ◽ Light Gradient ◽ Extreme Gradient Boosting

A visibility forecast model called a boosting-based fusion model (BFM) was established in this study. The model uses a fusion machine learning model based on multisource data, including air pollutants, meteorological observations, moderate resolution imaging spectroradiometer (MODIS) aerosol optical depth (AOD) data, and outputs from an operational regional atmospheric environmental modeling system for eastern China (RAEMS). Extreme gradient boosting (XGBoost), a light gradient boosting machine (LightGBM), and a numerical prediction method, i.e., RAEMS, were fused to establish this prediction model. Three sets of prediction models, that is, BFM, LightGBM based on multisource data (LGBM), and RAEMS, were used to conduct visibility prediction tasks. The training set was from 1 January 2015 to 31 December 2018 and used several data pre-processing methods, including synthetic minority over-sampling technique (SMOTE) data resampling, a loss function adjustment, and 10-fold cross-validation. Moreover, apart from the basic features (variables), more spatial and temporal gradient features were considered. The testing set was from 1 January to 31 December 2019 and was adopted to validate the feasibility of the BFM, LGBM, and RAEMS. Statistical indicators confirmed that the machine learning methods improved the RAEMS forecast significantly and consistently. The root mean square error and correlation coefficient of BFM for the next 24/48 h were 5.01/5.47 km and 0.80/0.77, respectively, which were much higher than those of RAEMS. The statistics and binary score analysis for different areas in Shanghai also proved the reliability and accuracy of using BFM, particularly in low-visibility forecasting. Overall, BFM is a suitable tool for predicting visibility. It provides a more accurate visibility forecast for the next 24 and 48 h in Shanghai than LGBM and RAEMS. The results of this study provide support for real-time operational visibility forecasts.
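
The SMOTE resampling mentioned in the abstract above rests on one idea: a synthetic minority sample is placed at a random point on the segment between a minority sample and one of its nearest minority neighbours. The sketch below shows only that interpolation step, using the standard library. The function name `smote_like`, the value of `k`, and the toy points are assumptions made for illustration; the study's own features and settings are not given in the abstract.

```python
import random

def smote_like(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + u * (b - a) for a, b in zip(base, neighbour)))
    return synthetic

# Hypothetical two-feature minority class (e.g. rare low-visibility cases).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
synthetic = smote_like(minority, n_new=10, k=2, seed=1)
print(synthetic)
```

Every synthetic point lies between two existing minority samples, so the resampled class stays inside the region the original minority data already occupies. For real work a maintained implementation would normally be preferred over this sketch.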

A stacked generalization ensemble model for optimization and prediction of the gas well rate of penetration: a case study in Xinjiang

Journal of Petroleum Exploration and Production Technology ◽ 10.1007/s13202-021-01402-z ◽ 2021

Author(s): Naipeng Liu ◽ Hui Gao ◽ Zhen Zhao ◽ Yule Hu ◽ Longchen Duan

Keyword(s): Pearson Correlation ◽ Gradient Boosting ◽ Support Vector ◽ Ensemble Model ◽ Rate Of Penetration ◽ Gas Drilling ◽ Light Gradient ◽ Stacked Generalization ◽ Gradient Boosting Machine ◽ Extreme Gradient Boosting

In gas drilling operations, the rate of penetration (ROP) parameter has an important influence on drilling costs. Prediction of ROP can optimize the drilling operational parameters and reduce the overall cost. To predict ROP with satisfactory precision, a stacked generalization ensemble model is developed in this paper. Drilling data were collected from a shale gas survey well in Xinjiang, northwestern China. First, Pearson correlation analysis is used for feature selection. Then, a Savitzky-Golay smoothing filter is used to reduce noise in the dataset. In the next stage, we propose a stacked generalization ensemble model that combines six machine learning models: support vector regression (SVR), extremely randomized trees (ET), random forest (RF), gradient boosting machine (GB), light gradient boosting machine (LightGBM) and extreme gradient boosting (XGB). The stacked model generates meta-data from the five models (SVR, ET, RF, GB, LightGBM) to compute ROP predictions using an XGB model. Then, the leave-one-out method is used to verify modeling performance. The performance of the stacked model is better than that of each single model, with R2 = 0.9568 and root mean square error = 0.4853 m/h achieved on the testing dataset. Hence, the proposed approach will be useful in optimizing gas drilling. Finally, the particle swarm optimization (PSO) algorithm is used to optimize the relevant ROP parameters.
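
The Savitzky-Golay smoothing step mentioned in the abstract above can be sketched with the classic 5-point quadratic filter, whose weights are (-3, 12, 17, 12, -3) / 35. The window length, the polynomial order, the edge handling (the first and last two samples are left unchanged), and the sample readings are all assumptions made for illustration. The abstract does not state the filter settings used in the paper.

```python
# Integer weights of the 5-point quadratic Savitzky-Golay smoothing filter.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35

def savgol_5pt(values):
    """Smooth a sequence with a 5-point quadratic Savitzky-Golay filter.

    Edge samples (first two and last two) are copied unchanged, which is a
    simplifying assumption of this sketch.
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed

# Hypothetical noisy rate-of-penetration readings in m/h.
noisy = [10.0, 10.4, 9.7, 10.9, 10.1, 10.6, 9.9]
smoothed = savgol_5pt(noisy)
print(smoothed)
```

Unlike a plain moving average, this filter reproduces any quadratic trend exactly, so it removes noise while distorting smooth peaks and curvature less.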

Buckling and ultimate load prediction models for perforated steel beams using machine learning algorithms

10.31224/osf.io/mezar ◽ 2021

Author(s): Vitaliy Degtyarev ◽ Konstantinos Daniel Tsavdaridis

Keyword(s): Machine Learning ◽ Web Application ◽ Failure Modes ◽ Ultimate Load ◽ Prediction Models ◽ Machine Learning Algorithms ◽ Gradient Boosting ◽ Elastic Buckling ◽ Light Gradient ◽ Extreme Gradient Boosting

Large web openings introduce complex structural behaviors and additional failure modes of steel cellular beams, which must be considered in the design using laborious calculations (e.g., exercising SCI P355). This paper presents seven machine learning (ML) models, including decision tree (DT), random forest (RF), k-nearest neighbor (KNN), gradient boosting regressor (GBR), extreme gradient boosting (XGBoost), light gradient boosting machine (LightGBM), and gradient boosting with categorical features support (CatBoost), for predicting the elastic buckling and ultimate loads of steel cellular beams. Large datasets of finite element (FE) simulation results, validated against experimental data, were used to develop the models. The ML models were fine-tuned via an extensive hyperparameter search to obtain their best performance. The elastic buckling and ultimate loads predicted by the optimized ML models demonstrated excellent agreement with the numerical data. The accuracy of the ultimate load predictions by the ML models exceeded the accuracy provided by the existing design provisions for steel cellular beams published in SCI P355 and AISC Design Guide 31. The relative feature importance and feature dependence of the models were evaluated and discussed in the paper. An interactive Python-based notebook and a user-friendly web application for predicting the elastic buckling and ultimate loads of steel cellular beams using the developed optimized ML models were created and made publicly available. The web application deployed to the cloud allows for making predictions in any web browser on any device, including mobile. The source code of the application available on GitHub allows running the application locally and independently from the cloud service.

Pressure of different gases injected into large-scale coal matrix: Analysis of time–space dependence and prediction using light gradient boosting machine

Fuel ◽ 10.1016/j.fuel.2020.118448 ◽ 2020 ◽ Vol 279 ◽ pp. 118448

Author(s): Bin Zhou ◽ Jiang Xu ◽ Feng Han ◽ Fazhi Yan ◽ Shoujian Peng ◽ ...

Keyword(s): Large Scale ◽ Matrix Analysis ◽ Gradient Boosting ◽ Coal Matrix ◽ Light Gradient ◽ Time Space ◽ Gradient Boosting Machine ◽ Space Dependence ◽ Different Gases

Child’s Target Height Prediction Evolution

Applied Sciences ◽ 10.3390/app9245447 ◽ 2019 ◽ Vol 9 (24) ◽ pp. 5447 ◽ Cited By ~ 1

Author(s): João Rala Cordeiro ◽ Octavian Postolache ◽ João C. Ferreira

Keyword(s): Prediction Accuracy ◽ Population Studies ◽ Gradient Boosting ◽ Target Height ◽ New Approach ◽ Light Gradient ◽ Gradient Boosting Machine ◽ Extreme Gradient Boosting ◽ Height Prediction ◽ Growth Assessment

This study is a contribution to the improvement of healthcare in children and in society generally. This study aims to predict children’s height when they become adults, also known as “target height”, to allow for a better growth assessment and more personalized healthcare. The existing literature describes some prediction methods, based on longitudinal population studies and statistical techniques, which, with few information resources, are able to produce acceptable results. The challenge of this study is in using a new approach based on machine learning to forecast the target height for children and (eventually) improve the existing height prediction accuracy. The goals of the study were achieved. The extreme gradient boosting regression (XGB) and light gradient boosting machine regression (LightGBM) algorithms achieved considerably better results in height prediction. The developed model can be usefully applied by pediatricians and other clinical professionals in growth assessment.

A comparative performance of machine learning algorithm to predict electric vehicles energy consumption: A path towards sustainability

Energy & Environment ◽ 10.1177/0958305x211044998 ◽ 2021 ◽ pp. 0958305X2110449

Author(s): Irfan Ullah ◽ Kai Liu ◽ Toshiyuki Yamamoto ◽ Rabia Emhamed Al Mamlook ◽ Arshad Jamal

Keyword(s): Machine Learning ◽ Energy Consumption ◽ Electric Vehicles ◽ Absolute Error ◽ Gradient Boosting ◽ Light Gradient ◽ Gradient Boosting Machine ◽ Extreme Gradient Boosting ◽ Energy Consumption Prediction ◽ Transport Emissions

The rapid growth of the transportation sector and its related emissions is attracting the attention of policymakers seeking to ensure environmental sustainability. Therefore, the driving factors of transport emissions are extremely important to comprehend. The role of electric vehicles is imperative amid rising transport emissions. Electric vehicles pave the way towards a low-carbon economy and sustainable environment. Successful deployment of electric vehicles relies heavily on energy consumption models that can predict energy consumption efficiently and reliably. Improving electric vehicles’ energy consumption efficiency will significantly help to alleviate driver anxiety and provide an essential framework for operation, planning, and management of the charging infrastructure. To tackle the challenge of electric vehicle energy consumption prediction, this study employs two advanced machine learning models, extreme gradient boosting and light gradient boosting machine, and compares them with two traditional models, multiple linear regression and an artificial neural network. The electric vehicle energy consumption data in the analysis were collected in Aichi Prefecture, Japan. To evaluate the performance of the prediction models, three evaluation metrics were used: coefficient of determination (R2), root mean square error, and mean absolute error. The prediction results show that extreme gradient boosting and light gradient boosting machine provided better and more robust results than multiple linear regression and the artificial neural network. The models based on extreme gradient boosting and light gradient boosting machine yielded higher R2 values and lower mean absolute error and root mean square error values, and thus proved more accurate. However, the results demonstrated that the light gradient boosting machine outperformed the extreme gradient boosting model. A detailed feature importance analysis was carried out to demonstrate the impact and relative influence of different input variables on electric vehicle energy consumption prediction. The results imply that an advanced machine learning model can enhance the prediction performance for electric vehicle energy consumption.
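
The three evaluation metrics named in the abstract above have simple closed forms, sketched here with the standard library. The observed and predicted values are hypothetical and are not taken from the study.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error: square root of the mean squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute residuals."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted energy consumption values.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print(rmse(observed, predicted), mae(observed, predicted), r2(observed, predicted))
```

RMSE penalizes large residuals more heavily than MAE, so reporting both, together with R2, gives a fuller picture of how a model's errors are distributed.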

A Multi-Class Automatic Sleep Staging Method Based on Photoplethysmography Signals

Entropy ◽ 10.3390/e23010116 ◽ 2021 ◽ Vol 23 (1) ◽ pp. 116

Author(s): Xiangfa Zhao ◽ Guobing Sun

Keyword(s): Time Domain ◽ Single Channel ◽ Kappa Statistic ◽ Gradient Boosting ◽ Sleep Staging ◽ Challenging Problem ◽ Sleep State ◽ Light Gradient ◽ Gradient Boosting Machine ◽ The Time Domain

Automatic sleep staging with only one channel is a challenging problem in sleep-related research. In this paper, a simple and efficient method named PPG-based multi-class automatic sleep staging (PMSS) is proposed using only a photoplethysmography (PPG) signal. Single-channel PPG data were obtained from four categories of subjects in the CAP sleep database. After the preprocessing of PPG data, feature extraction was performed from the time domain, frequency domain, and nonlinear domain, and a total of 21 features were extracted. Finally, the Light Gradient Boosting Machine (LightGBM) classifier was used for multi-class sleep staging. The accuracy of the multi-class automatic sleep staging was over 70%, and the Cohen’s kappa statistic k was over 0.6. This showed that the PMSS method can also be applied to stage the sleep state for patients with sleep disorders.
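
Cohen's kappa, reported in the abstract above alongside accuracy, corrects the observed agreement p_o for the agreement p_e expected by chance: k = (p_o - p_e) / (1 - p_e). A minimal standard-library sketch follows. The stage names and label sequences are hypothetical and are not data from the paper.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa between two equally long label sequences."""
    n = len(labels_a)
    # Observed agreement: fraction of positions where the labels match.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the marginal frequencies, summed over labels.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)

# Hypothetical reference scoring and classifier output for six epochs.
reference = ["wake", "wake", "light", "light", "deep", "deep"]
predicted = ["wake", "light", "light", "light", "deep", "wake"]

kappa = cohen_kappa(reference, predicted)
print(kappa)
```

In this toy case the accuracy is 4/6 but kappa is only 0.5, because a third of the agreement would be expected from the label frequencies alone. This is why kappa is often reported next to accuracy for multi-class staging.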

A Review of Light Gradient Boosting Machine Method for Hate Speech Classification on Twitter

2020 2nd International Conference on Electrical, Control and Instrumentation Engineering (ICECIE) ◽ 10.1109/icecie50279.2020.9309565 ◽ 2020

Author(s): Muhammad Hafizh Abdurrahman ◽ Budhi Irawan ◽ Casi Setianingsih

Keyword(s): Hate Speech ◽ Gradient Boosting ◽ Machine Method ◽ Light Gradient ◽ Gradient Boosting Machine ◽ Speech Classification

Fertility-LightGBM: A fertility-related protein prediction model by multi-information fusion and light gradient boosting machine

Biomedical Signal Processing and Control ◽ 10.1016/j.bspc.2021.102630 ◽ 2021 ◽ Vol 68 ◽ pp. 102630

Author(s): Minghui Wang ◽ Lingling Yue ◽ Xinhua Yang ◽ Xiaolin Wang ◽ Yu Han ◽ ...

Keyword(s): Prediction Model ◽ Information Fusion ◽ Gradient Boosting ◽ Related Protein ◽ Light Gradient ◽ Protein Prediction ◽ Gradient Boosting Machine
