A Compound Fault Labeling and Diagnosis Method Based on Flight Data and BIT Record of UAV

Ke Zheng; Guozhu Jia; Linchao Yang; Jiaqing Wang

doi:10.3390/app11125410

Applied Sciences ◽

10.3390/app11125410 ◽

2021 ◽

Vol 11 (12) ◽

pp. 5410

Author(s):

Ke Zheng ◽

Guozhu Jia ◽

Linchao Yang ◽

Jiaqing Wang

Keyword(s):

Flight Test ◽

Gradient Boosting ◽

Learning Needs ◽

Limit States ◽

Convolutional Network ◽

Flight Data ◽

Light Gradient ◽

Extreme Gradient Boosting ◽

Diagnosis Method ◽

Test Flight

During Unmanned Aerial Vehicle (UAV) flight testing, many compound faults occur; these may consist of concurrent single faults or over-limit states flagged by Built-In-Test (BIT) equipment. A suitable automatic labeling approach for UAV flight data that makes effective use of the BIT record is still lacking, and the performance of the originally employed flight-data-driven fault diagnosis models based on machine learning also needs to be improved. This paper proposes a compound fault labeling and diagnosis method based on actual flight data and the BIT record of the UAV during the flight test phase. The method labels the flight data with compound fault modes corresponding to concurrent single faults recorded by the BIT system, and upgrades the original diagnosis models, based on Gradient Boosting Decision Tree (GBDT) and Fully Convolutional Network (FCNN), to eXtreme Gradient Boosting (XGBoost), Light Gradient Boosting Machine (LightGBM), and a modified Convolutional Neural Network (CNN). Experimental results on actual test flight data show that the proposed method labels the flight data effectively and significantly improves diagnostic performance, which suggests that it is practical for the UAV test flight process.
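The labeling idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the `BitAlarm` record layout, the fault codes, and the `+`-joined compound label are all hypothetical choices made for illustration. Each flight-data timestamp receives a label built from every BIT alarm active at that moment, so concurrent single faults become one compound fault mode.

```python
from typing import NamedTuple


class BitAlarm(NamedTuple):
    # Hypothetical BIT record layout; the abstract does not specify one.
    start: float  # alarm onset time in seconds
    end: float    # alarm clearance time in seconds
    code: str     # single-fault or over-limit code


def label_samples(timestamps, alarms, normal_label="NORMAL"):
    """Label each timestamp with the sorted set of alarm codes active at that time."""
    labels = []
    for t in timestamps:
        active = sorted({a.code for a in alarms if a.start <= t <= a.end})
        # Concurrent single faults are joined into one compound label.
        labels.append("+".join(active) if active else normal_label)
    return labels


# Two overlapping alarms (illustrative codes, not taken from the paper).
alarms = [BitAlarm(2.0, 6.0, "GPS_LOSS"), BitAlarm(4.0, 8.0, "ENGINE_OVERTEMP")]
timestamps = [1.0, 3.0, 5.0, 7.0, 9.0]
labels = label_samples(timestamps, alarms)
print(labels)
```

Sorting the active codes keeps the compound label independent of the order in which the alarms were recorded, so the same fault combination always maps to the same class.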

Development and validation of a difficult laryngoscopy prediction model using machine learning of neck circumference and thyromental height

BMC Anesthesiology ◽

10.1186/s12871-021-01343-4 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Jong Ho Kim ◽

Haewon Kim ◽

Ji Su Jang ◽

Sung Mi Hwang ◽

So Young Lim ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Confidence Interval ◽

Neck Circumference ◽

Difficult Laryngoscopy ◽

Gradient Boosting ◽

Test Set ◽

Equal Distribution ◽

Light Gradient ◽

Extreme Gradient Boosting

Background: Predicting a difficult airway is challenging in patients with limited airway evaluation. The aim of this study is to develop and validate a model that predicts difficult laryngoscopy by machine learning of neck circumference and thyromental height as predictors that can be used even for patients with limited airway evaluation. Methods: Variables for prediction of difficult laryngoscopy included age, sex, height, weight, body mass index, neck circumference, and thyromental distance. Difficult laryngoscopy was defined as Grade 3 and 4 by the Cormack-Lehane classification. The preanesthesia and anesthesia data of 1677 patients who had undergone general anesthesia at a single center were collected. The data set was randomly stratified into a training set (80%) and a test set (20%), with equal distribution of difficult laryngoscopy. The training set was used to train five algorithms (logistic regression, multilayer perceptron, random forest, extreme gradient boosting, and light gradient boosting machine). The prediction models were validated on the test set. Results: The random forest model performed best (area under receiver operating characteristic curve = 0.79 [95% confidence interval: 0.72–0.86], area under precision-recall curve = 0.32 [95% confidence interval: 0.27–0.37]). Conclusions: Machine learning can predict difficult laryngoscopy through a combination of several predictors including neck circumference and thyromental height. The performance of the model can be improved with more data, a new variable, and a combination of models.
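The stratified 80/20 split mentioned in the Methods can be sketched as follows. This is a generic illustration, not the study's code: the function name `stratified_split`, the seed, and the toy label vector are assumptions. The split shuffles within each class so that the proportion of difficult cases is the same in both sets.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with each class split in the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)  # shuffle inside the class only
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy data: 20 "difficult" cases (1) among 100 patients (illustrative numbers).
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in test_idx) / len(test_idx))
```

With an imbalanced outcome such as difficult laryngoscopy, an unstratified random split can leave the test set with too few positive cases to estimate the precision-recall curve reliably, which is the usual motivation for stratifying.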

Establishing a Credit Risk Evaluation System for SMEs Using the Soft Voting Fusion Model

Risks ◽

10.3390/risks9110202 ◽

2021 ◽

Vol 9 (11) ◽

pp. 202

Author(s):

Ge Gao ◽

Hongxin Wang ◽

Pengbin Gao

Keyword(s):

Credit Risk ◽

Evaluation System ◽

Predictive Accuracy ◽

Assessment System ◽

Gradient Boosting ◽

Support Vector ◽

Fusion Model ◽

Light Gradient ◽

Extreme Gradient Boosting ◽

The Government

In China, SMEs are facing financing difficulties, and commercial banks and financial institutions are the main financing channels for SMEs. Thus, a reasonable and efficient credit risk assessment system is important for credit markets. Based on traditional statistical methods and AI technology, a soft voting fusion model, which incorporates logistic regression, support vector machine (SVM), random forest (RF), eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), is constructed to improve the predictive accuracy of SMEs’ credit risk. To verify the feasibility and effectiveness of the proposed model, we use data from 123 SMEs nationwide that worked with a Chinese bank from 2016 to 2020, including financial information and default records. The results show that the accuracy of the soft voting fusion model is higher than that of a single machine learning (ML) algorithm, which provides a theoretical basis for the government to control credit risk in the future and offers important references for banks to make credit decisions.
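Soft voting, as named in the abstract above, fuses models by averaging their predicted class probabilities and choosing the class with the highest average. The sketch below is a generic illustration under stated assumptions: the function `soft_vote`, the equal default weights, and the probability values are hypothetical and do not come from the paper.

```python
def soft_vote(prob_lists, weights=None):
    """Fuse per-model class probabilities by (weighted) averaging.

    prob_lists[m][i][k] is model m's probability that sample i belongs to class k.
    Returns the fused probabilities and the predicted class index per sample.
    """
    n_models = len(prob_lists)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_samples = len(prob_lists[0])
    n_classes = len(prob_lists[0][0])

    fused = []
    for i in range(n_samples):
        row = [
            sum(w * probs[i][k] for w, probs in zip(weights, prob_lists)) / total
            for k in range(n_classes)
        ]
        fused.append(row)
    predictions = [max(range(n_classes), key=row.__getitem__) for row in fused]
    return fused, predictions


# Three hypothetical models, two samples, classes 0 = no default, 1 = default.
model_a = [[0.90, 0.10], [0.40, 0.60]]
model_b = [[0.40, 0.60], [0.30, 0.70]]
model_c = [[0.45, 0.55], [0.60, 0.40]]
fused, predictions = soft_vote([model_a, model_b, model_c])
print(predictions)
```

For the first sample, two of the three models lean towards class 1, so a hard majority vote would pick class 1; soft voting picks class 0 because the one confident model outweighs the two marginal ones. That sensitivity to confidence is the main design difference between the two schemes.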

Interpretable Machine Learning for Early Neurological Deterioration Prediction in Atrial Fibrillation-Related Stroke

10.21203/rs.3.rs-446890/v1 ◽

2021 ◽

Author(s):

Seong Hwan Kim ◽

Eun-Tae Jeon ◽

Sungwook Yu ◽

Kyungmi O ◽

Chi Kyung Kim ◽

...

Keyword(s):

Machine Learning ◽

Atrial Fibrillation ◽

Neurological Deterioration ◽

Gradient Boosting ◽

Support Vector ◽

Light Gradient ◽

Interpretable Machine Learning ◽

Extreme Gradient Boosting ◽

Early Neurological Deterioration ◽

Feature Importance

We aimed to develop a novel prediction model for early neurological deterioration (END) based on an interpretable machine learning (ML) algorithm for atrial fibrillation (AF)-related stroke and to evaluate the prediction accuracy and feature importance of ML models. Data from multi-center prospective stroke registries in South Korea were collected. After stepwise data preprocessing, we utilized logistic regression, support vector machine, extreme gradient boosting, light gradient boosting machine (LightGBM), and multilayer perceptron models. We used the Shapley additive explanations (SHAP) method to evaluate feature importance. Of the 3,623 stroke patients, the 2,363 who had arrived at the hospital within 24 hours of symptom onset and had available information regarding END were included. Of these, 318 (13.5%) had END. The LightGBM model showed the highest area under the receiver operating characteristic curve (0.778; 95% CI, 0.726–0.830). The feature importance analysis revealed that fasting glucose level and the National Institutes of Health Stroke Scale score were the most influential factors. Among ML algorithms, the LightGBM model was particularly useful for predicting END, as it revealed new and diverse predictors. Additionally, the SHAP method can be adjusted to individualize the features’ effects on the predictive power of the model.

Performance Analysis of Boosting Classifiers in Recognizing Activities of Daily Living

International Journal of Environmental Research and Public Health ◽

10.3390/ijerph17031082 ◽

2020 ◽

Vol 17 (3) ◽

pp. 1082 ◽

Cited By ~ 7

Author(s):

Saifur Rahman ◽

Muhammad Irfan ◽

Mohsin Raza ◽

Khawaja Moyeezullah Ghori ◽

Shumayla Yaqoob ◽

...

Keyword(s):

Physical Activity ◽

Performance Analysis ◽

Activities Of Daily Living ◽

Daily Living ◽

Supervised Machine Learning ◽

Gradient Boosting ◽

Method Performance ◽

Light Gradient ◽

Extreme Gradient Boosting ◽

Boosting Algorithms

Physical activity is essential for physical and mental health, and its absence is highly associated with severe health conditions and disorders. Therefore, tracking activities of daily living can help promote quality of life. Wearable sensors can provide a reliable and economical means of tracking such activities, and such sensors are readily available in smartphones and watches. This study is the first of its kind to develop a wearable sensor-based physical activity classification system using a special class of supervised machine learning approaches called boosting algorithms. The study presents a performance analysis of several boosting algorithms: extreme gradient boosting (XGB), light gradient boosting machine (LGBM), gradient boosting (GB), cat boosting (CB), and AdaBoost. The comparison is made fair and unbiased by using a uniform dataset, feature set, feature selection method, performance metric, and cross-validation technique. The study utilizes a smartphone-based dataset of thirty individuals. The results showed that the proposed method could accurately classify the activities of daily living with very high performance (above 90%). These findings suggest the strength of the proposed system in classifying activities of daily living using only smartphone sensor data; the system can assist in reducing physical inactivity to promote a healthier lifestyle and wellbeing.

Modeling of nitrogen solubility in normal alkanes using machine learning methods compared with cubic and PC-SAFT equations of state

Scientific Reports ◽

10.1038/s41598-021-03643-8 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Seyed Ali Madani ◽

Mohammad-Reza Mohammadi ◽

Saeid Atashrouz ◽

Ali Abedi ◽

Abdolhossein Hemmati-Sarapardeh ◽

...

Keyword(s):

Machine Learning ◽

Molecular Weight ◽

Oil Recovery ◽

Equations Of State ◽

Coefficient Of Determination ◽

Gradient Boosting ◽

Operating Pressure ◽

Normal Alkanes ◽

Light Gradient ◽

Extreme Gradient Boosting

Accurate prediction of the solubility of gases in hydrocarbons is a crucial factor in designing enhanced oil recovery (EOR) operations by gas injection as well as separation and chemical reaction processes in a petroleum refinery. In this work, nitrogen (N2) solubility in normal alkanes, the major constituents of crude oil, was modeled using five representative machine learning (ML) models, namely gradient boosting with categorical features support (CatBoost), random forest, light gradient boosting machine (LightGBM), k-nearest neighbors (k-NN), and extreme gradient boosting (XGBoost). A large solubility databank containing 1982 data points was utilized to establish the models for predicting N2 solubility in normal alkanes as a function of pressure, temperature, and molecular weight of normal alkanes over broad ranges of operating pressure (0.0212–69.12 MPa) and temperature (91–703 K). The molecular weight of the normal alkanes ranged from 16 to 507 g/mol. In addition, five equations of state (EOSs), including Redlich–Kwong (RK), Soave–Redlich–Kwong (SRK), Zudkevitch–Joffe (ZJ), Peng–Robinson (PR), and perturbed-chain statistical associating fluid theory (PC-SAFT), were compared with the ML models for estimating N2 solubility in normal alkanes. Results revealed that the CatBoost model is the most precise model in this work, with a root mean square error of 0.0147 and a coefficient of determination of 0.9943. Among the EOSs, the ZJ EOS provided the best estimates of N2 solubility in normal alkanes. Lastly, the results of relevancy factor analysis indicated that pressure has the greatest influence on N2 solubility in normal alkanes and that N2 solubility increases with increasing molecular weight of the normal alkanes.
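The two error measures reported above, root mean square error and coefficient of determination, follow standard definitions and can be computed as in the sketch below. The solubility values are made up for illustration and are not data from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error: sqrt(mean((y_true - y_pred)^2))."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative mole-fraction solubilities (hypothetical values).
y_true = [0.10, 0.20, 0.30, 0.40]
y_pred = [0.12, 0.18, 0.33, 0.37]
print(rmse(y_true, y_pred), r_squared(y_true, y_pred))
```

RMSE is expressed in the units of the target, while the coefficient of determination is unitless, which is why the two are usually reported together when comparing models on the same databank.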

Artificial Intelligence-Based Prediction of Key Textural Properties from LUCAS and ICRAF Spectral Libraries

Agronomy ◽

10.3390/agronomy11081550 ◽

2021 ◽

Vol 11 (8) ◽

pp. 1550

Author(s):

Mohamed Zakaria Gouda ◽

El Mehdi Nagihi ◽

Lotfi Khiari ◽

Jacques Gallichand ◽

Mahmoud Ismail

Keyword(s):

Soil Texture ◽

High Performance ◽

Management Practices ◽

Textural Properties ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Accurate Estimation ◽

Light Gradient ◽

Extreme Gradient Boosting ◽

Spectral Libraries

Soil texture is a key soil property influencing many agronomic practices, including fertilization and liming. Therefore, an accurate estimation of soil texture is essential for adopting sustainable soil management practices. In this study, we used different machine learning algorithms trained on vis–NIR spectra from existing soil spectral libraries (ICRAF and LUCAS) to predict soil textural fractions (sand–silt–clay %). In addition, we predicted the soil textural groups (G1: Fine, G2: Medium, and G3: Coarse) using routine chemical characteristics as auxiliary predictors. With the ICRAF dataset, multilayer perceptron resulted in good predictions for sand and clay (R² = 0.78 and 0.85, respectively), and categorical boosting outperformed the other algorithms (random forest, extreme gradient boosting, linear regression) for silt prediction (R² = 0.81). For the LUCAS dataset, categorical boosting consistently showed high performance for sand, silt, and clay predictions (R² = 0.79, 0.76, and 0.85, respectively). Furthermore, the soil texture groups (G1, G2, and G3) were classified using the light gradient boosted machine algorithm with high accuracy (83% and 84% for ICRAF and LUCAS, respectively). These results, obtained from spectral data, are very promising for rapid diagnosis of soil texture and group in order to adjust agricultural practices.

Convolutional Neural Network Classifies Pathological Voice Change in Laryngeal Cancer with High Accuracy

Journal of Clinical Medicine ◽

10.3390/jcm9113415 ◽

2020 ◽

Vol 9 (11) ◽

pp. 3415

Author(s):

HyunBum Kim ◽

Juhyeong Jeon ◽

Yeon Jae Han ◽

YoungHoon Joo ◽

Jonghwan Lee ◽

...

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Sensitivity And Specificity ◽

Laryngeal Cancer ◽

Healthy Subjects ◽

Gradient Boosting ◽

Support Vector ◽

Vowel Sound ◽

Light Gradient ◽

Extreme Gradient Boosting

Voice changes may be the earliest signs of laryngeal cancer. We investigated whether automated voice signal analysis can be used to distinguish patients with laryngeal cancer from healthy subjects. We extracted features using the software package for speech analysis in phonetics (PRAAT) and calculated the Mel-frequency cepstral coefficients (MFCCs) from voice samples of the vowel sound /a:/. The proposed method was tested with six algorithms: support vector machine (SVM), extreme gradient boosting (XGBoost), light gradient boosted machine (LGBM), artificial neural network (ANN), one-dimensional convolutional neural network (1D-CNN), and two-dimensional convolutional neural network (2D-CNN). Their performance was evaluated in terms of accuracy, sensitivity, and specificity. The result was compared with human performance. A total of four volunteers, two of whom were trained laryngologists, rated the same files. The 1D-CNN showed the highest accuracy, 85%, with sensitivity and specificity of 78% and 93%, respectively. The two laryngologists achieved an accuracy of 69.9% but a sensitivity of only 44%. Automated analysis of voice signals could differentiate subjects with laryngeal cancer from healthy subjects with better diagnostic performance than the four volunteers.
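Accuracy, sensitivity, and specificity, the three measures used above, all derive from the binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the toy label vectors are assumptions for illustration, with 1 standing for a cancer case and 0 for a healthy subject.

```python
def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity (recall on positives) and specificity."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of healthy subjects cleared
    }


# Toy example: 4 patients and 6 healthy subjects (hypothetical predictions).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity alongside accuracy matters for screening tasks: a rater can reach a moderate accuracy while missing most true cases, which is the pattern the abstract reports for the human raters.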

Predicting Hard Rock Pillar Stability Using GBDT, XGBoost, and LightGBM Algorithms

Mathematics ◽

10.3390/math8050765 ◽

2020 ◽

Vol 8 (5) ◽

pp. 765 ◽

Cited By ~ 6

Author(s):

Weizhang Liang ◽

Suizhi Luo ◽

Guoyan Zhao ◽

Hao Wu

Keyword(s):

Large Scale ◽

Prediction Models ◽

Hard Rock ◽

Gradient Boosting ◽

Pillar Stability ◽

Rock Pillar ◽

Light Gradient ◽

Gradient Boosting Machine ◽

Extreme Gradient Boosting ◽

Hard Rock Mines

Predicting pillar stability is a vital task in hard rock mines, as pillar instability can cause large-scale collapse hazards. However, it is challenging because pillar stability is affected by many factors. With the accumulation of pillar stability cases, machine learning (ML) has shown great potential to predict pillar stability. This study aims to predict hard rock pillar stability using gradient boosting decision tree (GBDT), extreme gradient boosting (XGBoost), and light gradient boosting machine (LightGBM) algorithms. First, 236 cases with five indicators were collected from seven hard rock mines. Afterwards, the hyperparameters of each model were tuned using a five-fold cross validation (CV) approach. Based on the optimal hyperparameter configuration, prediction models were constructed using the training set (70% of the data). Finally, the test set (30% of the data) was used to evaluate the performance of each model. The precision, recall, and F1 indexes were used to analyze the prediction results for each stability level, and the accuracy and the macro averages of these indexes were used to assess the overall prediction performance. Based on a sensitivity analysis of the indicators, the relative importance of each indicator was obtained. In addition, the safety factor approach and other ML algorithms were used for comparison. The results showed that the GBDT, XGBoost, and LightGBM algorithms achieved better overall performance, with prediction accuracies of 0.8310, 0.8310, and 0.8169, respectively. The average pillar stress and the ratio of pillar width to pillar height had the greatest influence on the prediction results. The proposed methodology can provide a reliable reference for pillar design and stability risk management.
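The per-level precision, recall, and F1 indexes and their macro averages can be computed as in the sketch below. It uses the standard definitions only; the function name `macro_scores`, the three class names, and the toy predictions are hypothetical and are not taken from the study's dataset.

```python
def macro_scores(y_true, y_pred):
    """Return (macro precision, macro recall, macro F1) over all classes."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    # Macro averaging weights every class equally, regardless of its size.
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Hypothetical stability levels and predictions for six pillars.
y_true = ["stable", "stable", "unstable", "unstable", "failed", "failed"]
y_pred = ["stable", "unstable", "unstable", "unstable", "failed", "stable"]
macro_p, macro_r, macro_f1 = macro_scores(y_true, y_pred)
print(macro_p, macro_r, macro_f1)
```

Because the macro average treats each stability level equally, a model that performs poorly on a rare level is penalized even when its overall accuracy is high.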

Protein pKa prediction by tree-based machine learning

10.26434/chemrxiv-2021-4d420 ◽

2021 ◽

Author(s):

Ada Y. Chen ◽

Juyong Lee ◽

Ana Damjanovic ◽

Bernard R. Brooks

Keyword(s):

Machine Learning ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Pka Prediction ◽

Light Gradient ◽

Structure Database ◽

Gradient Boosting Machine ◽

Extreme Gradient Boosting ◽

Better Than ◽

Protein Pka

We present four tree-based machine learning models for protein pKa prediction. The four models, Random Forest, Extra Trees, eXtreme Gradient Boosting (XGBoost), and Light Gradient Boosting Machine (LightGBM), were trained on three experimental PDB and pKa datasets, two of which included a notable portion of internal residues. We observed similar performance among the four machine learning algorithms. The best model trained on the largest dataset performs 37% better than the widely used empirical pKa prediction tool PROPKA. The overall RMSE for this model is 0.69, with surface and buried RMSE values of 0.56 and 0.78, respectively, when considering six residue types (Asp, Glu, His, Lys, Cys, and Tyr), and 0.63 when considering Asp, Glu, His, and Lys only. We provide pKa predictions for proteins in the human proteome from the AlphaFold Protein Structure Database and observe that 1% of Asp/Glu/Lys residues have highly shifted pKa values close to the physiological pH.

Landslide Susceptibility Analysis using Gradient Boosting Models: A Case Study in Penang Island, Malaysia

Disaster Advances ◽

10.25303/148da2221 ◽

2021 ◽

pp. 22-37

Author(s):

Han Gao ◽

Pei Shan Fam ◽

Lea Tien Tay ◽

Heng Chin Low

Keyword(s):

Feature Selection ◽

Landslide Susceptibility ◽

Roc Curves ◽

Spatial Prediction ◽

Prediction Performance ◽

Gradient Boosting ◽

Support Vector ◽

Prediction Ability ◽

Light Gradient ◽

Extreme Gradient Boosting

Tree-based gradient boosting (TGB) models are gaining popularity in various areas due to their powerful prediction ability and fast processing speed. This study compares the landslide spatial prediction performance of TGB models and non-tree-based machine learning (NML) models in Penang Island, Malaysia. Two TGB models, eXtreme Gradient Boosting (XGBoost) and Light Gradient Boosting Machine (LightGBM), and two NML models, artificial neural network (ANN) and support vector machine (SVM), are applied to predict landslide susceptibility. Feature selection and oversampling techniques are also considered to improve prediction performance. The results are analyzed and discussed mainly on the basis of receiver operating characteristic (ROC) curves and the area under the curve (AUC). The results show that TGB models give better prediction performance than NML models regardless of sample size. The performance of the TGB models improves when they are trained on a dataset prepared with either feature selection or oversampling. The highest AUC value, 0.9525, is obtained from the combination of XGBoost and SMOTE. The landslide susceptibility maps (LSMs) produced by XGBoost and LightGBM can provide valuable information for landslide management and mitigation in Penang Island, Malaysia.
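The AUC used above to compare models has a simple probabilistic reading: it is the fraction of (positive, negative) pairs in which the positive sample receives the higher score, with ties counted as one half. The sketch below computes it directly from that definition; the function name `roc_auc`, the labels (1 = landslide, 0 = no landslide), and the scores are illustrative assumptions.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(y_true, scores) if y == 1]
    negatives = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical susceptibility scores for four map cells.
y_true = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(y_true, scores)
print(auc)
```

This pairwise form costs O(P × N) comparisons, which is fine for a small illustration; for large susceptibility maps a rank-based computation gives the same value more efficiently.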
