Passive Fetal Movement Recognition Approaches Using Hyperparameter Tuned LightGBM Model and Bayesian Optimization

Computational Intelligence and Neuroscience ◽

10.1155/2021/6252362 ◽

2021 ◽

Vol 2021 ◽

pp. 1-18

Author(s):

Sensong Liang ◽

Jiansheng Peng ◽

Yong Xu ◽

Hemin Ye

Keyword(s):

Frequency Domain ◽

Kalman Filtering ◽

Health Monitoring ◽

Fetal Movement ◽

Bayesian Optimization ◽

Gradient Boosting ◽

Wavelet Domain ◽

Prenatal Health ◽

Light Gradient ◽

Low Amplitude

Fetal movement is an important clinical indicator of fetal growth and development in the uterus. In recent years, noninvasive intelligent sensing systems that detect fetal movement and can monitor high-risk pregnancies at home have received considerable attention in the field of wearable health monitoring. However, recovering fetal movement signals from a continuous low-amplitude background that is heavily contaminated with noise, and recognizing real fetal movements, is a challenging task. This paper recognizes fetal movement efficiently by combining Kalman filtering; time domain, frequency domain, and wavelet domain feature extraction; and a hyperparameter-tuned Light Gradient Boosting Machine (LightGBM) model. First, the Kalman filtering (KF) algorithm is used to recover the fetal movement signal from a continuous low-amplitude background contaminated by noise. Second, the time domain, frequency domain, and wavelet domain (TFWD) features of the preprocessed fetal movement signal are extracted. Finally, the Bayesian Optimization algorithm (BOA) is used to find the optimal hyperparameters of the LightGBM model, yielding accurate prediction and recognition of fetal movement. On the Zenodo fetal movement dataset, the proposed KF + TFWD + BOA-LGBM approach reached a recognition accuracy of 94.06% and an F1-Score of 96.85%. Compared with 8 existing advanced methods for fetal movement signal recognition, the proposed method has better accuracy and robustness, indicating its potential medical application in wearable smart sensing systems for prenatal health monitoring.
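
To make the first stage of this pipeline concrete, the following is a minimal sketch of a scalar Kalman filter applied to a noisy one-dimensional signal. It is an illustration only, not the authors' implementation: the random-walk state model, the function name `kalman_smooth`, and the default noise variances are all assumptions chosen for the example.

```python
def kalman_smooth(samples, process_var=1e-3, measurement_var=0.5):
    """Scalar Kalman filter with an assumed random-walk state model.

    Returns one filtered estimate per input sample.
    """
    if not samples:
        return []
    estimate = samples[0]   # initial state taken from the first sample
    error_var = 1.0         # initial estimate uncertainty (assumed value)
    filtered = []
    for z in samples:
        # Predict: the state is assumed constant, so only uncertainty grows.
        error_var += process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        filtered.append(estimate)
    return filtered


if __name__ == "__main__":
    noisy = [1.0, -1.0] * 10  # alternating noise around a true level of 0
    print([round(v, 3) for v in kalman_smooth(noisy)])
```

A larger `measurement_var` relative to `process_var` makes the filter trust its own prediction more and smooth harder; in practice both values would be estimated from the sensor data.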

Prediction of River Stage Using Multistep-Ahead Machine Learning Techniques for a Tidal River of Taiwan

Water ◽

10.3390/w13070920 ◽

2021 ◽

Vol 13 (7) ◽

pp. 920

Author(s):

Wen-Dar Guo ◽

Wei-Bo Chen ◽

Sen-Hai Yeh ◽

Chih-Hsin Chang ◽

Hongey Chen

Keyword(s):

Machine Learning ◽

Flood Control ◽

Tidal River ◽

Bayesian Optimization ◽

Data Driven ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

River Stage ◽

Light Gradient

Time-series prediction of a river stage during typhoons or storms is essential for flood control and flood disaster prevention. Data-driven models using machine learning (ML) techniques have become an attractive and effective approach to modeling and analyzing river stage dynamics. However, relatively new ML techniques, such as light gradient boosting machine regression (LGBMR), have rarely been applied to predict the river stage in a tidal river. In this study, data-driven ML models were developed under a multistep-ahead prediction framework and evaluated for river stage modeling. Four ML techniques, namely support vector regression (SVR), random forest regression (RFR), multilayer perceptron regression (MLPR), and LGBMR, were employed to establish data-driven ML models with Bayesian optimization. The models were applied to simulate river stage hydrographs of the tidal reach of the Lan-Yang River Basin in Northeastern Taiwan. Historical measurements of rainfall, river stages, and tidal levels were collected from 2004 to 2017 and used for training and validation of the four models. Four scenarios were used to investigate the effect of the combinations of input variables on river stage predictions. The results indicated that (1) the tidal level at a previous stage significantly affected the prediction results; (2) the LGBMR model achieved more favorable prediction performance than the SVR, RFR, and MLPR models; and (3) the LGBMR model could efficiently and accurately predict the river stage 1–6 h ahead in the tidal river. This study provides an extensive and insightful comparison of four data-driven ML models for river stage forecasting that can be helpful for model selection and flood mitigation.
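
One common way to set up multistep-ahead prediction is the direct strategy, in which a separate training set (and model) is built for each lead time. The sketch below shows only that data-preparation step. The abstract does not say whether the study used a direct or a recursive strategy, so the approach, the function name `make_supervised`, and the lag count are assumptions for illustration.

```python
def make_supervised(series, n_lags, horizon):
    """Pair each window of n_lags past values with the value `horizon` steps ahead."""
    features, targets = [], []
    last_start = len(series) - n_lags - horizon
    for start in range(last_start + 1):
        window = series[start:start + n_lags]
        features.append(list(window))
        # Target is `horizon` steps after the last value in the window.
        targets.append(series[start + n_lags + horizon - 1])
    return features, targets


if __name__ == "__main__":
    hourly_stage = [1.2, 1.3, 1.5, 1.9, 2.4, 2.6, 2.5, 2.2, 1.8, 1.5]  # made-up values
    for lead in range(1, 7):  # one dataset per lead time, 1 to 6 hours
        X, y = make_supervised(hourly_stage, n_lags=3, horizon=lead)
        print(lead, len(X))
```

Each `(X, y)` pair could then be passed to any regressor; additional inputs such as rainfall or tidal level would be appended to each feature row in the same way.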

A Multi-Class Automatic Sleep Staging Method Based on Photoplethysmography Signals

Entropy ◽

10.3390/e23010116 ◽

2021 ◽

Vol 23 (1) ◽

pp. 116

Author(s):

Xiangfa Zhao ◽

Guobing Sun

Keyword(s):

Time Domain ◽

Single Channel ◽

Kappa Statistic ◽

Gradient Boosting ◽

Sleep Staging ◽

Challenging Problem ◽

Sleep State ◽

Light Gradient ◽

Gradient Boosting Machine ◽

The Time Domain

Automatic sleep staging with only one channel is a challenging problem in sleep-related research. In this paper, a simple and efficient method named PPG-based multi-class automatic sleep staging (PMSS) is proposed using only a photoplethysmography (PPG) signal. Single-channel PPG data were obtained from four categories of subjects in the CAP sleep database. After preprocessing of the PPG data, a total of 21 features were extracted from the time domain, frequency domain, and nonlinear domain. Finally, the Light Gradient Boosting Machine (LightGBM) classifier was used for multi-class sleep staging. The accuracy of the multi-class automatic sleep staging was over 70%, and the Cohen’s kappa statistic k was over 0.6. The results also showed that the PMSS method can be applied to stage the sleep state of patients with sleep disorders.
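
Cohen's kappa, reported above alongside accuracy, corrects the observed agreement for the agreement expected by chance. The following sketch computes it from two label sequences using the standard definition, kappa = (p_o - p_e) / (1 - p_e). The function name and the stage labels in the usage example are illustrative assumptions, not data from the paper.

```python
from collections import Counter


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa for two equal-length label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies, summed over classes.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when both sequences use a single class")
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    scored = ["wake", "light", "light", "deep", "rem", "light"]     # hypothetical labels
    predicted = ["wake", "light", "deep", "deep", "rem", "wake"]
    print(round(cohen_kappa(scored, predicted), 3))
```

Unlike raw accuracy, kappa stays near zero for a classifier that merely reproduces the class frequencies, which matters when sleep stages are unevenly represented.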

A Review of Light Gradient Boosting Machine Method for Hate Speech Classification on Twitter

2020 2nd International Conference on Electrical, Control and Instrumentation Engineering (ICECIE) ◽

10.1109/icecie50279.2020.9309565 ◽

2020 ◽

Author(s):

Muhammad Hafizh Abdurrahman ◽

Budhi Irawan ◽

Casi Setianingsih

Keyword(s):

Hate Speech ◽

Gradient Boosting ◽

Machine Method ◽

Light Gradient ◽

Gradient Boosting Machine ◽

Speech Classification

Fertility-LightGBM: A fertility-related protein prediction model by multi-information fusion and light gradient boosting machine

Biomedical Signal Processing and Control ◽

10.1016/j.bspc.2021.102630 ◽

2021 ◽

Vol 68 ◽

pp. 102630

Author(s):

Minghui Wang ◽

Lingling Yue ◽

Xinhua Yang ◽

Xiaolin Wang ◽

Yu Han ◽

...

Keyword(s):

Prediction Model ◽

Information Fusion ◽

Gradient Boosting ◽

Related Protein ◽

Light Gradient ◽

Protein Prediction ◽

Gradient Boosting Machine

Development and validation of a difficult laryngoscopy prediction model using machine learning of neck circumference and thyromental height

BMC Anesthesiology ◽

10.1186/s12871-021-01343-4 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jong Ho Kim ◽

Haewon Kim ◽

Ji Su Jang ◽

Sung Mi Hwang ◽

So Young Lim ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Confidence Interval ◽

Neck Circumference ◽

Difficult Laryngoscopy ◽

Gradient Boosting ◽

Test Set ◽

Equal Distribution ◽

Light Gradient ◽

Extreme Gradient Boosting

Background: Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning, using neck circumference and thyromental height as predictors that can be used even for patients with limited airway evaluation. Methods: Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 or 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly stratified into a training set (80%) and a test set (20%), with equal distribution of difficult laryngoscopy. Five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine) were trained on the training set. The prediction models were validated on the test set. Results: The random forest model performed best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions: Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model could be improved with more data, new variables, and a combination of models.
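
The stratified 80/20 split described in the Methods keeps the proportion of positive cases the same in both subsets. A minimal sketch of such a split follows; the function name `stratified_split`, the fixed seed, and the toy class balance are assumptions for illustration and do not reproduce the study's procedure or data.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices) with class proportions preserved."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        # Shuffle within each class, then carve off the test share of that class.
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


if __name__ == "__main__":
    labels = [1] * 10 + [0] * 90  # toy data with a 10% positive rate
    train, test = stratified_split(labels)
    print(len(train), len(test), sum(labels[i] for i in test))
```

Splitting per class matters most when the positive class is rare, since a plain random split can leave the test set with too few positives to estimate precision or recall reliably.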

Boosting Algorithm Choice in Predictive Machine Learning Models for Fracturing Applications

10.2118/205642-ms ◽

2021 ◽

Author(s):

Abdul Muqtadir Khan

Keyword(s):

Machine Learning ◽

Data Science ◽

Oil And Gas ◽

Oil And Gas Industry ◽

Injection Rate ◽

Model Construction ◽

Gradient Boosting ◽

Light Gradient ◽

Fracture Damage ◽

Boosting Technique

With the advancement of machine learning (ML) applications, some recent research has been conducted to optimize fracturing treatments. A variety of models are available, using various objective functions for optimization and different mathematical techniques. There is a need to extend the ML techniques to optimize the choice of algorithm. For fracturing treatment design, the literature on comparative algorithm performance is sparse. The research predominantly shows that, compared to the most commonly used regressors and classifiers, some sort of boosting technique consistently performs better on model testing and prediction accuracy. A database was constructed for a heterogeneous reservoir. Four widely used boosting algorithms were applied to the database to predict the design only from the output of a short injection/falloff test. Feature importance analysis was done on eight output parameters from the falloff analysis, and six were finalized for the model construction. The outputs selected for prediction were fracturing fluid efficiency, proppant mass, maximum proppant concentration, and injection rate. Extreme gradient boost (XGBoost), categorical boost (CatBoost), adaptive boost (AdaBoost), and light gradient boosting machine (LGBM) were the algorithms finalized for the comparative study. A sensitivity analysis was performed for different numbers of classes (four, five, and six) to establish a balance between accuracy and prediction granularity. The results showed that the best algorithm choice was between XGBoost and CatBoost for the predicted parameters under certain model construction conditions. The accuracy for all outputs for the holdout sets varied between 80 and 92%, showing robust significance for a wider utilization of these models. Data science has contributed to various oil and gas industry domains and has tremendous applications in the stimulation domain. The research and review conducted in this paper add a valuable resource for the user to build digital databases and use the appropriate algorithm without much trial and error. Implementing this model reduced the complexity of the proppant fracturing treatment redesign process, enhanced operational efficiency, and reduced fracture damage by eliminating minifrac steps with crosslinked gel.

The Concept of Geodetic Analyses of the Measurement Results Obtained by Hydrostatic Leveling

Geosciences ◽

10.3390/geosciences9100406 ◽

2019 ◽

Vol 9 (10) ◽

pp. 406

Author(s):

Kamiński ◽

Makowska

Keyword(s):

Structural Health Monitoring ◽

Kalman Filtering ◽

Health Monitoring ◽

Systematic Errors ◽

Monitoring Systems ◽

Measurement Results ◽

Vertical Displacements ◽

Complete Computation ◽

Computation Scheme ◽

Health Monitoring Systems

The article discusses the issue of hydrostatic leveling. Its application is presented in structural health monitoring systems in order to determine vertical displacements of controlled points. Moreover, the article includes a complete computation scheme that utilizes the estimation from observation differences, allowing the elimination of the influence of individual sensors’ systematic errors. The authors suggest two concepts of processing the measurement results depending on the sensors’ connection method. Additionally, the second concept is extended by the elements allowing the prediction of the displacements by means of Kalman filtering.
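
The idea that observation differences remove an individual sensor's systematic error can be illustrated with a simplified model in which each reading equals the true quantity plus a constant per-sensor bias. The sketch below is only that illustration, not the article's computation scheme; the function name, the choice of reference sensor, and the sign convention (a positive result meaning the reading rose relative to the reference) are assumptions.

```python
def relative_displacements(first_epoch, later_epoch, reference_sensor=0):
    """Change of each sensor's reading relative to a reference sensor between two epochs.

    A constant bias in any sensor appears in both epochs and cancels in the difference.
    """
    ref_first = first_epoch[reference_sensor]
    ref_later = later_epoch[reference_sensor]
    return [
        (later - ref_later) - (first - ref_first)
        for first, later in zip(first_epoch, later_epoch)
    ]


if __name__ == "__main__":
    true_levels = [10.0, 12.0, 15.0]   # hypothetical values in mm
    bias = [0.3, -0.2, 0.5]            # unknown constant error of each sensor
    movement = [0.0, 1.5, -0.5]        # true change between the two epochs
    epoch_0 = [t + b for t, b in zip(true_levels, bias)]
    epoch_1 = [t + m + b for t, m, b in zip(true_levels, movement, bias)]
    print([round(d, 6) for d in relative_displacements(epoch_0, epoch_1)])
```

The recovered values match the simulated movement even though the biases are never estimated, which is the property the differencing approach relies on. Errors that drift over time would not cancel this way.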

A Swarm Enhanced Light Gradient Boosting Machine for Crowdfunding Project Outcome Prediction

Machine Learning for Cyber Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-62463-7_34 ◽

2020 ◽

pp. 372-382

Author(s):

Shuang Geng ◽

Miaojia Huang ◽

Zhibo Wang

Keyword(s):

Outcome Prediction ◽

Gradient Boosting ◽

Light Gradient ◽

Project Outcome ◽

Gradient Boosting Machine

RegioML: Predicting the regioselectivity of electrophilic aromatic substitution reactions using machine learning

10.33774/chemrxiv-2021-l2fvl ◽

2021 ◽

Author(s):

Nicolai Ree ◽

Andreas H. Göller ◽

Jan H. Jensen

Keyword(s):

Machine Learning ◽

Tight Binding ◽

Reaction Centers ◽

Gradient Boosting ◽

Electrophilic Aromatic Substitution ◽

Aromatic Substitution ◽

Substitution Reactions ◽

Test Set ◽

Light Gradient ◽

Out Of Sample

We present RegioML, an atom-based machine learning model for predicting the regioselectivities of electrophilic aromatic substitution reactions. The model relies on CM5 atomic charges computed using semiempirical tight binding (GFN1-xTB) combined with the ensemble decision tree variant light gradient boosting machine (LightGBM). The model is trained and tested on 21,201 bromination reactions with 101K reaction centers, which are split into training, test, and out-of-sample datasets with 58K, 15K, and 27K reaction centers, respectively. The accuracy is 93% for the test set and 90% for the out-of-sample set, while the precision (the percentage of positive predictions that are correct) is 88% and 80%, respectively. The test-set performance is very similar to that of the graph-based WLN method developed by Struble et al. (React. Chem. Eng. 2020, 5, 896), though the comparison is complicated by the possibility that some of the test and out-of-sample molecules were used to train WLN. RegioML outperforms our physics-based RegioSQM20 method (J. Cheminform. 2021, 13:10), for which the precision is only 75%. Even for the out-of-sample dataset, RegioML slightly outperforms RegioSQM20. The good performance of RegioML and WLN is in large part due to the large datasets available for this type of reaction. However, for reactions where there is little experimental data, physics-based approaches like RegioSQM20 can be used to generate synthetic data for model training. We demonstrate this by showing that the performance of RegioSQM20 can be reproduced by an ML model trained on RegioSQM20-generated data.
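
The two metrics quoted above answer different questions: accuracy counts all correct predictions, while precision looks only at the sites predicted to be reactive. A small sketch of both, using binary labels where 1 marks a reactive center, is given below. The function name and the toy labels are assumptions for illustration only.

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision) for binary labels; precision is None without positive predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    pred_pos = sum(p == 1 for p in y_pred)
    accuracy = correct / len(y_true)
    # Precision: share of positive predictions that are actually positive.
    precision = true_pos / pred_pos if pred_pos else None
    return accuracy, precision


if __name__ == "__main__":
    observed = [1, 0, 1, 0, 0]    # hypothetical reaction-center labels
    predicted = [1, 1, 1, 0, 0]
    print(accuracy_and_precision(observed, predicted))
```

Because most atoms in a molecule are not reaction centers, accuracy can stay high even when a noticeable share of the predicted sites are wrong, which is why the two numbers are reported separately.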

Establishing a Credit Risk Evaluation System for SMEs Using the Soft Voting Fusion Model

Risks ◽

10.3390/risks9110202 ◽

2021 ◽

Vol 9 (11) ◽

pp. 202

Author(s):

Ge Gao ◽

Hongxin Wang ◽

Pengbin Gao

Keyword(s):

Credit Risk ◽

Evaluation System ◽

Predictive Accuracy ◽

Assessment System ◽

Gradient Boosting ◽

Support Vector ◽

Fusion Model ◽

Light Gradient ◽

Extreme Gradient Boosting ◽

The Government

In China, SMEs are facing financing difficulties, and commercial banks and financial institutions are the main financing channels for SMEs. Thus, a reasonable and efficient credit risk assessment system is important for credit markets. Based on traditional statistical methods and AI technology, a soft voting fusion model, which incorporates logistic regression, support vector machine (SVM), random forest (RF), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), is constructed to improve the predictive accuracy of SMEs’ credit risk. To verify the feasibility and effectiveness of the proposed model, we use data from 123 SMEs nationwide that worked with a Chinese bank from 2016 to 2020, including financial information and default records. The results show that the accuracy of the soft voting fusion model is higher than that of a single machine learning (ML) algorithm, which provides a theoretical basis for the government to control credit risk in the future and offers important references for banks to make credit decisions.
