Landslide and Wildfire Susceptibility Assessment in Southeast Asia Using Ensemble Machine Learning Methods

Qian He; Ziyu Jiang; Ming Wang; Kai Liu

doi:10.3390/rs13081572

Remote Sensing ◽

10.3390/rs13081572 ◽

2021 ◽

Vol 13 (8) ◽

pp. 1572

Author(s):

Qian He ◽

Ziyu Jiang ◽

Ming Wang ◽

Kai Liu

Keyword(s):

Machine Learning ◽

Southeast Asia ◽

Machine Learning Algorithms ◽

Receiver Operating Curve ◽

Susceptibility Assessment ◽

Adaptive Boosting ◽

Susceptibility Modeling ◽

Ensemble Machine Learning ◽

Susceptibility Maps ◽

Wildfire Susceptibility

Southeast Asia (SEA) is a region affected by landslides and wildfires; however, few studies have modeled susceptibility to the two hazards together for this region, and the intersection and uncertainty of the two hazards are rarely assessed. This paper therefore studies the intersection of landslide and wildfire susceptibility and the spatial uncertainty of the susceptibility maps. Reliable landslide and wildfire susceptibility maps are necessary for disaster management and land use planning. This work used three ensemble machine learning algorithms, RF (Random Forest), GBDT (Gradient Boosting Decision Tree) and AdaBoost (Adaptive Boosting), to assess landslide and wildfire susceptibility for SEA. A geo-database was established with 2759 landslide locations, 1633 wildfire locations and 18 predictor variables in total. Model performance was assessed using the overall classification accuracy (ACC), Precision, the area under the receiver operating characteristic (ROC) curve (AUC) and confusion matrix values. The results showed that RF performed best in both landslide (ACC = 0.81, Precision = 0.78 and AUC = 0.89) and wildfire (ACC = 0.83, Precision = 0.83 and AUC = 0.91) susceptibility modeling, followed by GBDT and AdaBoost. The overall superiority of RF over the other models indicates that it is potentially an efficient model for landslide and wildfire susceptibility mapping. The landslide and wildfire susceptibility maps were obtained using the RF model. This paper also conducted an overlay analysis of the two hazards. The uncertainty of the susceptibility was further assessed using the coefficient of variation (CV). Additionally, the distance to roads is relatively important for both hazards: it is the most important predictor for landslides and the second most important for wildfires. The results of this paper are useful for understanding the overall hazard susceptibility situation and indicate that RF is a robust model for hazard susceptibility assessment in SEA.
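The abstract above reports ACC, Precision and AUC but does not show how they are computed. The following is a minimal sketch of the standard definitions in plain Python. The labels and scores are hypothetical toy values, not data from the paper, and the function names are chosen here for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels, where 1 marks a hazard location."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Overall classification accuracy: share of all samples classified correctly."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """Share of predicted positives that are true positives."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = hazard occurred, scores = predicted susceptibility.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 decision threshold

print("ACC:", accuracy(y_true, y_pred))
print("Precision:", precision(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))
```

ACC and Precision depend on the chosen threshold, while AUC is computed from the ranking of the scores alone, which is one reason such studies usually report both kinds of metric.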

A Machine Learning-Based Approach for Wildfire Susceptibility Mapping. The Case Study of the Liguria Region in Italy

Geosciences ◽

10.3390/geosciences10030105 ◽

2020 ◽

Vol 10 (3) ◽

pp. 105 ◽

Cited By ~ 3

Author(s):

Marj Tonini ◽

Mirko D’Andrea ◽

Guido Biondi ◽

Silvia Degli Esposti ◽

Andrea Trucchia ◽

...

Keyword(s):

Machine Learning ◽

Expert Knowledge ◽

Vegetation Type ◽

Climatic Conditions ◽

Predisposing Factors ◽

Machine Learning Algorithms ◽

Susceptibility Map ◽

Susceptibility Maps ◽

A Site ◽

Wildfire Susceptibility

Wildfire susceptibility maps display the spatial probability of an area to burn in the future, based solely on the intrinsic local properties of a site. Current studies in this field often rely on statistical models, often improved by expert knowledge for data retrieval and processing. In the last few years, machine learning algorithms have proven to be successful in this domain, thanks to their capability of learning from data through the modeling of hidden relationships. In the present study, the authors introduce an approach based on random forests to produce a wildfire susceptibility map for the Liguria region in Italy. This region is highly affected by wildfires due to its dense and heterogeneous vegetation, with more than 70% of its surface covered by forests, and due to the favorable climatic conditions. Susceptibility was assessed by considering the dataset of mapped fire perimeters, spanning a 21-year period (1997–2017), and different geo-environmental predisposing factors (i.e., land cover, vegetation type, road network, altitude, and derivatives). One main objective was to compare different models in order to evaluate the effect of (i) including or excluding the neighboring vegetation type as an additional predisposing factor and (ii) using an increasing number of folds in the spatial cross-validation procedure. Susceptibility maps for the two fire seasons were finally produced and validated. Results highlighted the capacity of the proposed approach to identify areas that could be affected by wildfires in the near future, as well as its usefulness for assessing the efficiency of fire-fighting activities.

Spatial Modeling of Asthma-Prone Areas Using Remote Sensing and Ensemble Machine Learning Algorithms

Remote Sensing ◽

10.3390/rs13163222 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3222

Author(s):

Seyed Vahid Razavi-Termeh ◽

Abolghasem Sadeghi-Niaraki ◽

Soo-Mi Choi

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Spatial Modeling ◽

Learning Algorithms ◽

Traffic Volume ◽

Machine Learning Algorithms ◽

Weight Of Evidence ◽

Adaptive Boosting ◽

AdaBoost Algorithm ◽

Ensemble Machine Learning

In this study, asthma-prone areas of Tehran, Iran were modeled using three ensemble machine learning algorithms (Bootstrap aggregating (Bagging), Adaptive Boosting (AdaBoost), and Stacking). First, a spatial database was created with 872 locations of asthma patients and the affecting factors (particulate matter (PM10 and PM2.5), ozone (O3), sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2), rainfall, wind speed, humidity, temperature, distance to street, traffic volume, and the normalized difference vegetation index (NDVI)). We created four factors using remote sensing (RS) imagery, including air pollution (O3, SO2, CO, and NO2), altitude, and NDVI. All criteria were prepared using a geographic information system (GIS). For modeling and validation, 70% and 30% of the data were used, respectively. The weight of evidence (WOE) model was used to assess the spatial relationship between the dependent and independent data. Finally, the three ensemble algorithms were used to map asthma-prone areas. According to the Gini index, the most influential factors on asthma occurrence were distance to the street, NDVI, and traffic volume. The area under the curve (AUC) of the receiver operating characteristic (ROC) for the AdaBoost, Bagging, and Stacking algorithms was 0.849, 0.82, and 0.785, respectively. According to the findings, the AdaBoost algorithm outperforms the Bagging and Stacking algorithms in spatial modeling of asthma-prone areas.
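The abstract above names the weight of evidence (WOE) model without giving its formula. As a hedged sketch, the code below uses a commonly cited formulation in which, for one class of a factor, W+ = ln[P(B|D) / P(B|not D)] and W- = ln[P(not B|D) / P(not B|not D)], with B meaning "cell lies in the class" and D meaning "case present". Whether the study used exactly this variant is an assumption, and all counts are hypothetical.

```python
import math


def weights_of_evidence(n_class_cases, n_class_cells, n_cases, n_cells):
    """Return (w_plus, w_minus, contrast) for one class of one factor.

    n_class_cases: cells in the class that contain a case
    n_class_cells: all cells in the class
    n_cases:       all cells that contain a case
    n_cells:       all cells in the study area
    All four conditional probabilities must be strictly positive,
    otherwise the logarithm is undefined and a ValueError is raised.
    """
    n_non_cases = n_cells - n_cases
    p_b_given_d = n_class_cases / n_cases
    p_b_given_not_d = (n_class_cells - n_class_cases) / n_non_cases
    p_not_b_given_d = (n_cases - n_class_cases) / n_cases
    p_not_b_given_not_d = (
        (n_cells - n_class_cells) - (n_cases - n_class_cases)
    ) / n_non_cases

    probs = (p_b_given_d, p_b_given_not_d, p_not_b_given_d, p_not_b_given_not_d)
    if min(probs) <= 0:
        raise ValueError("counts give a zero probability; WOE is undefined")

    w_plus = math.log(p_b_given_d / p_b_given_not_d)
    w_minus = math.log(p_not_b_given_d / p_not_b_given_not_d)
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts: 1000 cells, 100 with cases; the class covers 200 cells,
# 30 of which contain cases.
w_plus, w_minus, contrast = weights_of_evidence(30, 200, 100, 1000)
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, contrast = {contrast:.3f}")
```

A positive W+ indicates that cases are over-represented in the class relative to the rest of the area; the contrast (W+ minus W-) summarizes the strength of that association.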

Machine Learning Approach for Predicting Lane-Change Maneuvers using the SHRP2 Naturalistic Driving Study Data

Transportation Research Record: Journal of the Transportation Research Board ◽

10.1177/03611981211003581 ◽

2021 ◽

pp. 036119812110035

Author(s):

Anik Das ◽

Mohamed M. Ahmed

Keyword(s):

Machine Learning ◽

Prediction Accuracy ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lane Change ◽

Adaptive Boosting ◽

Extreme Gradient Boosting ◽

Naturalistic Driving Study ◽

Naturalistic Driving ◽

Change Prediction

Accurate lane-change prediction information in real time is essential to safely operate Autonomous Vehicles (AVs) on the roadways, especially at the early stage of AV deployment, when AVs and human-driven vehicles will interact. This study proposed reliable lane-change prediction models considering features from vehicle kinematics, machine vision, driver, and roadway geometric characteristics using the trajectory-level SHRP2 Naturalistic Driving Study and Roadway Information Database. Several machine learning algorithms were trained, validated, tested, and comparatively analyzed based on six different sets of features, including Classification And Regression Trees (CART), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Naïve Bayes (NB). In each feature set, relevant features were extracted through a wrapper-based algorithm named Boruta. The results showed that the XGBoost model outperformed all other models, with the highest overall prediction accuracy (97%) and F1-score (95.5%) when considering all features. However, the highest overall prediction accuracy of 97.3% and F1-score of 95.9% were observed in the XGBoost model based on vehicle kinematics features. Moreover, it was found that XGBoost was the only model that achieved a reliable and balanced prediction performance across all six feature sets. Furthermore, a simplified XGBoost model was developed for each feature set considering the practical implementation of the model. The proposed prediction model could help in trajectory planning for AVs and could be used to develop more reliable advanced driver assistance systems (ADAS) in a cooperative connected and automated vehicle environment.

Comparison of Ensemble Machine Learning Methods for Soil Erosion Pin Measurements

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10010042 ◽

2021 ◽

Vol 10 (1) ◽

pp. 42

Author(s):

Kieu Anh Nguyen ◽

Walter Chen ◽

Bor-Shiun Lin ◽

Uma Seeboonruang

Keyword(s):

Machine Learning ◽

Soil Erosion ◽

Ensemble Methods ◽

Machine Learning Algorithms ◽

Multivariate Adaptive Regression Splines ◽

Gradient Boosting ◽

Support Vector ◽

Ensemble Machine Learning ◽

Boosting Method ◽

Bagging Method

Although machine learning has been extensively used in various fields, it has only recently been applied to soil erosion pin modeling. To improve upon previous methods of quantifying soil erosion based on erosion pin measurements, this study explored the possible application of ensemble machine learning algorithms to the Shihmen Reservoir watershed in northern Taiwan. Three categories of ensemble methods were considered in this study: (a) bagging, (b) boosting, and (c) stacking. The bagging method in this study refers to bagged multivariate adaptive regression splines (bagged MARS) and random forest (RF), and the boosting method includes Cubist and gradient boosting machine (GBM). Finally, the stacking method is an ensemble method that uses a meta-model to combine the predictions of base models. This study used RF and GBM as the meta-models, and decision tree, linear regression, artificial neural network, and support vector machine as the base models. The dataset used in this study was sampled using stratified random sampling to achieve a 70/30 split for the training and test data, and the process was repeated three times. The performance of six ensemble methods in three categories was analyzed based on the average of three attempts. It was found that GBM performed the best among the ensemble models, with the lowest root-mean-square error (RMSE = 1.72 mm/year), the highest Nash-Sutcliffe efficiency (NSE = 0.54), and the highest index of agreement (d = 0.81). This result was confirmed by the spatial comparison of the absolute differences (errors) between model predictions and observations using GBM and RF in the study area. In summary, the results show that, as a group, the bagging method and the boosting method performed equally well, and the stacking method was third for the erosion pin dataset considered in this study.
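The three goodness-of-fit measures named in the abstract above (RMSE, NSE and the index of agreement d) can be sketched as follows. The formulas are the standard definitions, with d taken to be Willmott's index of agreement (an assumption, since the abstract does not spell it out). The observation and prediction values are hypothetical and are not erosion pin data from the study.

```python
import math


def rmse(obs, pred):
    """Root-mean-square error, in the same units as the observations."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def nse(obs, pred):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean of obs."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((p - o) ** 2 for o, p in zip(obs, pred))
    variance_term = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - sse / variance_term


def index_of_agreement(obs, pred):
    """Willmott's index of agreement d, bounded between 0 and 1."""
    mean_obs = sum(obs) / len(obs)
    sse = sum((p - o) ** 2 for o, p in zip(obs, pred))
    potential_error = sum(
        (abs(p - mean_obs) + abs(o - mean_obs)) ** 2 for o, p in zip(obs, pred)
    )
    return 1.0 - sse / potential_error


# Hypothetical erosion depths in mm/year.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]

print("RMSE:", rmse(observed, predicted))
print("NSE:", nse(observed, predicted))
print("d:", index_of_agreement(observed, predicted))
```

RMSE is scale dependent, while NSE and d are dimensionless, so reporting all three gives both an absolute error size and a relative measure of fit.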

SURG-02. SURVIVAL PREDICTION AFTER NEUROSURGICAL RESECTION OF BRAIN METASTASES: A MACHINE LEARNING APPROACH

Neuro-Oncology ◽

10.1093/neuonc/noaa215.849 ◽

2020 ◽

Vol 22 (Supplement_2) ◽

pp. ii203-ii203

Author(s):

Alexander Hulsbergen ◽

Yu Tung Lo ◽

Vasileios Kavouridis ◽

John Phillips ◽

Timothy Smith ◽

...

Keyword(s):

Machine Learning ◽

Brain Metastases ◽

External Validation ◽

Superior Performance ◽

Prognostic Models ◽

Receiver Operating Curve ◽

Gradient Boosting ◽

Survival Prediction ◽

Ensemble Model ◽

Adaptive Boosting

INTRODUCTION: Survival prediction in brain metastases (BMs) remains challenging. Current prognostic models have been created and validated almost completely with data from patients receiving radiotherapy only, leaving uncertainty about surgical patients. Therefore, the aim of this study was to build and validate a model predicting 6-month survival after BM resection using different machine learning (ML) algorithms. METHODS: An institutional database of 1062 patients who underwent resection for BM was split into an 80:20 training and testing set. Seven different ML algorithms were trained and assessed for performance. Moreover, an ensemble model was created incorporating random forest, adaptive boosting, gradient boosting, and logistic regression algorithms. Five-fold cross validation was used for hyperparameter tuning. Model performance was assessed using area under the receiver-operating curve (AUC) and calibration and was compared against the diagnosis-specific graded prognostic assessment (ds-GPA), the most established prognostic model in BMs. RESULTS: The ensemble model showed superior performance with an AUC of 0.81 in the hold-out test set, a calibration slope of 1.14, and a calibration intercept of -0.08, outperforming the ds-GPA (AUC 0.68). Patients were stratified into high-, medium- and low-risk groups for death at 6 months; these strata strongly predicted both 6-month and longitudinal overall survival (p < 0.001). CONCLUSIONS: We developed and internally validated an ensemble ML model that accurately predicts 6-month survival after neurosurgical resection for BM, outperforms the most established model in the literature, and allows for meaningful risk stratification. Future efforts should focus on external validation of our model.

Prediction and Analysis of Gold Prices using Ensemble Machine Learning Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.36028 ◽

2021 ◽

Vol 9 (VI) ◽

pp. 4367-4374

Author(s):

Gudipally Chandrashakar

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Gold Price ◽

Machine Learning Algorithms ◽

Series Data ◽

Gradient Boosting ◽

Support Vector ◽

Average Value ◽

Ensemble Machine Learning

In this article, we used historical time series data of gold prices up to the present. To predict the gold price, we consider a few correlated factors: the silver price, the copper price, the Standard & Poor’s 500 value, the dollar-rupee exchange rate, and the Dow Jones Industrial Average value. The data for every correlated factor and for the gold price range from January 2008 to February 2021. The machine learning algorithms used to analyze the time-series data are Random Forest Regression, Support Vector Regressor, Linear Regressor, ExtraTrees Regressor and Gradient Boosting Regression. Of these, the ExtraTrees Regressor predicted gold prices most accurately.

Feature Selection and Comparison of Machine Learning Algorithms in Classification of Grazing and Rumination Behaviour in Sheep

Sensors ◽

10.3390/s18103532 ◽

2018 ◽

Vol 18 (10) ◽

pp. 3532 ◽

Cited By ~ 16

Author(s):

Nicola Mansbridge ◽

Jurgen Mitsch ◽

Nicola Bollard ◽

Keith Ellis ◽

Giuliana Miguel-Pacheco ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Time Budget ◽

Learning Algorithms ◽

Eating Behaviour ◽

Machine Learning Algorithms ◽

Support Vector ◽

Optimum Number ◽

Eating Behaviours ◽

Adaptive Boosting

Grazing and ruminating are the most important behaviours for ruminants, as they spend most of their daily time budget performing these. Continuous surveillance of eating behaviour is an important means for monitoring ruminant health, productivity and welfare. However, surveillance performed by human operators is prone to human variance, time-consuming and costly, especially on animals kept at pasture or free-ranging. The use of sensors to automatically acquire data, and software to classify and identify behaviours, offers significant potential in addressing such issues. In this work, data collected from sheep by means of an accelerometer/gyroscope sensor attached to the ear and collar, sampled at 16 Hz, were used to develop classifiers for grazing and ruminating behaviour using various machine learning algorithms: random forest (RF), support vector machine (SVM), k nearest neighbour (kNN) and adaptive boosting (Adaboost). Multiple features extracted from the signals were ranked on their importance for classification. Several performance indicators were considered when comparing classifiers as a function of algorithm used, sensor localisation and number of used features. Random forest yielded the highest overall accuracies: 92% for collar and 91% for ear. Gyroscope-based features were shown to have the greatest relative importance for eating behaviours. The optimum number of feature characteristics to be incorporated into the model was 39, from both ear and collar data. The findings suggest that one can successfully classify eating behaviours in sheep with very high accuracy; this could be used to develop a device for automatic monitoring of feed intake in the sheep sector to monitor health and welfare.

Learning from Imbalanced Educational Data Using Ensemble Machine Learning Algorithms

Webology ◽

10.14704/web/v18si01/web18053 ◽

2021 ◽

Vol 18 (Special Issue 01) ◽

pp. 183-195

Author(s):

Thingbaijam Lenin ◽

N. Chandrasekaran

Keyword(s):

Machine Learning ◽

Random Forest ◽

Missing Values ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Adaptive Boosting ◽

Stochastic Gradient Boosting ◽

Ensemble Machine Learning ◽

Learning Techniques ◽

Student’s Performance

Students’ academic performance is one of the most important parameters for evaluating the standard of any institute. It has become of paramount importance for any institute to identify students at risk of underperforming, failing or even dropping out of a course. Machine learning techniques may be used to develop a model for predicting a student’s performance as early as the time of admission. The task, however, is challenging, as the educational data available for modelling are usually imbalanced. We explore ensemble machine learning techniques, namely a bagging algorithm, random forest (rf), and the boosting algorithms adaptive boosting (adaboost), stochastic gradient boosting (gbm) and extreme gradient boosting (xgbTree), in an attempt to develop a model for predicting student performance at a private university in Meghalaya using three categories of data: demographic, prior academic record, and personality. The collected data are found to be highly imbalanced and also contain missing values. We employ the k-nearest neighbor (knn) data imputation technique to handle the missing values. The models are developed on the imputed data with 10-fold cross-validation and are evaluated using precision, specificity, recall and kappa metrics. As the data are imbalanced, we avoid using accuracy as the evaluation metric and instead use balanced accuracy and F-score. We compare the ensemble techniques with a single classifier, C4.5. The best result is provided by random forest and adaboost, with an F-score of 66.67%, balanced accuracy of 75%, and accuracy of 96.94%.
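The gap between the reported accuracy (96.94%) and the F-score (66.67%) shows why plain accuracy misleads on imbalanced data. The sketch below computes the three metrics from a confusion matrix. The counts are hypothetical: they were chosen here only because they reproduce the reported percentages, and they are not the confusion matrix from the paper.

```python
def imbalance_metrics(tp, fn, fp, tn):
    """Return accuracy, balanced accuracy and F-score from confusion counts.

    The positive class is taken to be the rare one (for example, students
    at risk), which is an assumption made for this illustration.
    """
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    balanced_accuracy = (recall + specificity) / 2
    if precision + recall:
        f_score = 2 * precision * recall / (precision + recall)
    else:
        f_score = 0.0
    return accuracy, balanced_accuracy, f_score


# Hypothetical counts: 6 positives out of 98 samples, half of them missed.
acc, bal_acc, f1 = imbalance_metrics(tp=3, fn=3, fp=0, tn=92)
print(f"accuracy          = {acc:.2%}")
print(f"balanced accuracy = {bal_acc:.2%}")
print(f"F-score           = {f1:.2%}")
```

With these counts the classifier misses half of the rare class, yet accuracy stays near 97% because the majority class dominates the total; balanced accuracy and F-score expose the missed positives.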

A Survey on Machine Learning Algorithms for Vision State Classification and Prediction Through Electroencephalogram (EEG) Signal

10.46532/978-81-950008-1-4_093 ◽

2020 ◽

pp. 426-429

Author(s):

Devipriya A ◽

Brindha D ◽

Kousalya A

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Problem Area ◽

Machine Learning Algorithms ◽

EEG Signal ◽

Ensemble Machine Learning ◽

State Classification ◽

Machine Learning Model ◽

KNN Classification ◽

Electroencephalogram EEG

Eye state identification is a common time-series classification problem and a hot topic in recent research. Electroencephalography (EEG) is widely used in vision state classification to detect a person's cognitive state. Previous research has validated the feasibility of machine learning and statistical approaches for EEG vision state classification. This research aims to propose a novel approach for EEG vision state identification using Gradual Characteristic Learning (GCL) based on neural networks. GCL is a novel machine learning approach that gradually imports and trains features one by one. Previous studies have verified that such an approach is suitable for solving a number of pattern recognition problems. However, in these previous works, little research on GCL focused on its application to time-series problems. Therefore, it is still unclear whether GCL can be used to cope with time-series problems such as EEG vision state classification. Experimental results in this study show that, with proper feature extraction and feature ordering, GCL can not only cope efficiently with time-series classification problems but also exhibit better classification performance, in terms of classification error rates, than conventional and some other approaches. Vision state classification is performed and discussed with KNN classification, with improved accuracy; finally, vision state classification with an ensemble machine learning model is discussed.

Spam Mail Classification Using Ensemble and Non-Ensemble Machine Learning Algorithms

Machine Learning for Predictive Analysis - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-7106-0_18 ◽

2020 ◽

pp. 179-189

Author(s):

Khyati Agarwal ◽

Prakhar Uniyal ◽

Suryavanshi Virendrasingh ◽

Sai Krishna ◽

Varun Dutt

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Ensemble Machine Learning
