Assessment of Landslide-Prone Areas and Their Zonation Using Logistic Regression, LogitBoost, and NaïveBayes Machine-Learning Algorithms

Hamid Pourghasemi; Amiya Gayen; Sungjae Park; Chang-Wook Lee; Saro Lee

doi:10.3390/su10103697

Sustainability ◽

10.3390/su10103697 ◽

2018 ◽

Vol 10 (10) ◽

pp. 3697 ◽

Cited By ~ 34

Author(s):

Hamid Pourghasemi ◽

Amiya Gayen ◽

Sungjae Park ◽

Chang-Wook Lee ◽

Saro Lee

Keyword(s):

Machine Learning ◽

Land Use ◽

Logistic Regression ◽

South Korea ◽

Landslide Susceptibility ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Slope Position ◽

Conditioning Factors ◽

Susceptibility Maps

Landslides in the hilly region of South Korea are a matter of serious concern. This study produces landslide susceptibility maps for Jumunjin Country in South Korea. Three machine learning algorithms, namely Logistic Regression (LR), LogitBoost (LB), and NaïveBayes (NB), are used, and their final model outcomes are compared with each other. First, a landslide inventory map and the associated input data layers of the landslide conditioning factors were developed based on field verification, historical records, and high-resolution remote-sensing data in a geographic information system (GIS) environment. Seventeen landslide conditioning factors were prepared: aspect, slope, altitude, maximum curvature, profile curvature, topographic wetness index (TWI), topographic positioning index (TPI), distance from fault, convexity, forest type, forest diameter, forest density, land use/land cover, lithology, soil, flow accumulation, and mid slope position. The results showed that the area under the curve (AUC) values of the LR, LB, and NB models were 84.2%, 70.7%, and 85.2%, respectively. The LR and LB models thus produced reasonable accuracy relative to the NB model in landslide susceptibility assessment. The final susceptibility maps would be useful for preliminary land-use planning and hazard mitigation purposes.
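As a rough illustration of how an AUC value such as those quoted above can be computed, the sketch below uses the rank interpretation of AUC: the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The labels, scores, and the names `roc_auc`, `scores_model_a`, and `scores_model_b` are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical validation cells: 1 = landslide, 0 = no landslide.
labels = [1, 1, 1, 0, 0, 0]
# Hypothetical susceptibility scores from two models (not the paper's data).
scores_model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
scores_model_b = [0.6, 0.4, 0.7, 0.5, 0.8, 0.2]

auc_a = roc_auc(labels, scores_model_a)
auc_b = roc_auc(labels, scores_model_b)
print(f"model A AUC = {auc_a:.3f}, model B AUC = {auc_b:.3f}")
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; a sort-based rank computation would be the usual choice for full raster maps.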

Combining Logistic Regression-based hybrid optimized machine learning algorithms with sensitivity analysis to achieve robust landslide susceptibility mapping

Geocarto International ◽

10.1080/10106049.2021.2022009 ◽

2021 ◽

pp. 1-25

Author(s):

Saeed Alqadhi ◽

Javed Mallick ◽

Swapan Talukdar ◽

Ahmed Ali Bindajam ◽

Tamal Kanti Saha ◽

...

Keyword(s):

Machine Learning ◽

Sensitivity Analysis ◽

Logistic Regression ◽

Landslide Susceptibility ◽

Learning Algorithms ◽

Susceptibility Mapping ◽

Machine Learning Algorithms ◽

Landslide Susceptibility Mapping

A Novel Performance Assessment Approach using Photogrammetric Techniques for Landslide Susceptibility Mapping with Logistic Regression, ANN and Random Forest

Sensors ◽

10.3390/s19183940 ◽

2019 ◽

Vol 19 (18) ◽

pp. 3940 ◽

Cited By ~ 26

Author(s):

Sevgen ◽

Kocaman ◽

Nefeslioglu ◽

Gokceoglu

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Random Forest ◽

Performance Assessment ◽

Landslide Susceptibility ◽

Machine Learning Algorithms ◽

Landslide Susceptibility Mapping ◽

Susceptibility Maps ◽

Surface Models ◽

Landslide Susceptibility Maps

Prediction of possible landslide areas is the first stage of landslide hazard mitigation efforts and is also crucial for suitable site selection. Several statistical and machine learning methodologies have been applied for the production of landslide susceptibility maps. However, the performance assessment of such methods has conventionally been carried out by utilizing existing landslide inventories. The purpose of this study is to investigate the performance of landslide susceptibility maps produced with three different machine learning algorithms, i.e., random forest, artificial neural network, and logistic regression, in a recently constructed and activated dam reservoir, and to assess the external quality of each map by using pre- and post-event photogrammetric datasets. The methodology introduced here was applied using digital surface models generated from aerial photogrammetric flight data acquired before and after the dam construction. Aerial photogrammetric images acquired in 2012 and 2018 (after the dam was filled) were used to produce digital terrain models and orthophotos. The 2012 dataset was used for producing the landslide susceptibility maps, and the results were evaluated by comparing the Euclidean distances between the two surface models. The results show that the random forest method outperforms the other two in predicting future landslides.
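The idea of checking a susceptibility map against surface change between two epochs can be sketched as follows. This is a simplified illustration, not the authors' procedure: the distance between the two surface models is reduced to the vertical separation at co-registered grid cells, and the grids, class labels, and the 1 m threshold (`CHANGE_THRESHOLD_M`) are all hypothetical.

```python
# Hypothetical co-registered elevation grids in metres (not data from the study).
dsm_before = [
    [120.0, 121.0, 122.0],
    [119.0, 120.5, 121.5],
    [118.0, 119.0, 120.0],
]
dsm_after = [
    [120.0, 120.9, 122.1],
    [117.5, 118.0, 121.4],
    [118.1, 117.0, 120.0],
]
# Hypothetical susceptibility classes predicted from the pre-event data.
susceptibility = [
    ["low", "low", "low"],
    ["high", "high", "low"],
    ["low", "moderate", "low"],
]
CHANGE_THRESHOLD_M = 1.0  # assumed cutoff for "the surface moved"


def changed_cells(before, after, threshold):
    """Return (row, col) of cells whose vertical separation exceeds threshold."""
    cells = []
    for r, (row_b, row_a) in enumerate(zip(before, after)):
        for c, (z_b, z_a) in enumerate(zip(row_b, row_a)):
            if abs(z_a - z_b) > threshold:
                cells.append((r, c))
    return cells


moved = changed_cells(dsm_before, dsm_after, CHANGE_THRESHOLD_M)
hits = sum(1 for r, c in moved if susceptibility[r][c] == "high")
hit_rate = hits / len(moved) if moved else 0.0
print(f"{len(moved)} changed cells, {hit_rate:.0%} mapped as high susceptibility")
```

A map that places most of the changed cells in its high-susceptibility class would score well under this kind of external check, independently of any pre-existing landslide inventory.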

Incorporating land-use regression into machine learning algorithms in estimating the spatial-temporal variation of carbon monoxide in Taiwan

Environmental Modelling & Software ◽

10.1016/j.envsoft.2021.104996 ◽

2021 ◽

Vol 139 ◽

pp. 104996

Author(s):

Pei-Yi Wong ◽

Chin-Yu Hsu ◽

Jhao-Yi Wu ◽

Tee-Ann Teo ◽

Jen-Wei Huang ◽

...

Keyword(s):

Machine Learning ◽

Land Use ◽

Carbon Monoxide ◽

Temporal Variation ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Land Use Regression

Land Use/Land Cover Mapping from Airborne Hyperspectral Images with Machine Learning Algorithms and Contextual Information

Geocarto International ◽

10.1080/10106049.2021.1945149 ◽

2021 ◽

pp. 1-40

Author(s):

Ozlem Akar ◽

Esra Tunc Gormus

Keyword(s):

Machine Learning ◽

Land Use ◽

Land Cover ◽

Contextual Information ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Hyperspectral Images ◽

Land Cover Mapping ◽

Land Use Land Cover

Machine-learning algorithms for land use dynamics in Lake Haramaya Watershed, Ethiopia

Modeling Earth Systems and Environment ◽

10.1007/s40808-021-01296-0 ◽

2021 ◽

Author(s):

Gezahegn Weldu Woldemariam ◽

Degefie Tibebe ◽

Tesfamariam Engida Mengesha ◽

Tadele Bedo Gelete

Keyword(s):

Machine Learning ◽

Land Use ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Land Use Dynamics

Predicting hospitalization following psychiatric crisis care using machine learning

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01361-1 ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Prediction Models ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Gradient Boosting ◽

Ensemble Model ◽

K Nearest Neighbors ◽

Crisis Care

Abstract Background: Accurate prediction models for whether patients on the verge of a psychiatric crisis need hospitalization are lacking, and machine learning methods may help improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate the accuracy of ten machine learning algorithms, including the generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact. We also evaluate an ensemble model to optimize the accuracy, and we explore individual predictors of hospitalization. Methods: Data from 2084 patients included in the longitudinal Amsterdam Study of Acute Psychiatry with at least one reported psychiatric crisis care contact were included. The target variable for the prediction models was whether the patient was hospitalized in the 12 months following inclusion. The predictive power of 39 variables related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts was evaluated. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared, and we also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis, and the five best performing algorithms were combined in an ensemble model using stacking. Results: All models performed above chance level. We found Gradient Boosting to be the best performing algorithm (AUC = 0.774) and K-Nearest Neighbors to be the least performing (AUC = 0.702). The performance of GLM/logistic regression (AUC = 0.76) was slightly above average among the tested algorithms. In a Net Reclassification Improvement analysis, Gradient Boosting outperformed GLM/logistic regression by 2.9% and K-Nearest Neighbors by 11.3%. GLM/logistic regression outperformed K-Nearest Neighbors by 8.7%. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was in most cases modest. The results show that a predictive accuracy similar to the best performing model can be achieved when combining multiple algorithms in an ensemble model.
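To make the net reclassification improvement (NRI) comparison more concrete, here is a minimal sketch of the category-free (continuous) form of NRI. The abstract does not state which NRI variant was used, so this choice is an assumption, and the outcomes and predicted probabilities below are hypothetical. The function name `continuous_nri` is likewise illustrative.

```python
def continuous_nri(outcomes, probs_old, probs_new):
    """Category-free NRI of a new model relative to an old one.

    For events (outcome 1) a higher predicted risk is an improvement;
    for non-events (outcome 0) a lower predicted risk is an improvement.
    """
    up_event = down_event = up_nonevent = down_nonevent = 0
    n_event = n_nonevent = 0
    for y, p_old, p_new in zip(outcomes, probs_old, probs_new):
        if y == 1:
            n_event += 1
            up_event += p_new > p_old
            down_event += p_new < p_old
        else:
            n_nonevent += 1
            up_nonevent += p_new > p_old
            down_nonevent += p_new < p_old
    if n_event == 0 or n_nonevent == 0:
        raise ValueError("need at least one event and one non-event")
    nri_events = (up_event - down_event) / n_event
    nri_nonevents = (down_nonevent - up_nonevent) / n_nonevent
    return nri_events + nri_nonevents


# Hypothetical patients: 1 = hospitalized within 12 months, 0 = not.
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted risks from a reference model and a candidate model.
probs_old = [0.6, 0.5, 0.7, 0.4, 0.4, 0.3, 0.5, 0.2]
probs_new = [0.7, 0.6, 0.6, 0.5, 0.3, 0.3, 0.6, 0.1]

nri = continuous_nri(outcomes, probs_old, probs_new)
print(f"NRI = {nri:.2f}")
```

A positive value means the candidate model moves predicted risks in the right direction more often than the reference model does; a categorical NRI would instead count moves across predefined risk thresholds.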

Application of Machine Learning Algorithms and Their Ensemble for Landslide Susceptibility Mapping

Understanding and Reducing Landslide Disaster Risk - ICL Contribution to Landslide Disaster Risk Reduction ◽

10.1007/978-3-030-60227-7_25 ◽

2020 ◽

pp. 233-239

Author(s):

Bahareh Kalantar ◽

Naonori Ueda ◽

Vahideh Saeidi ◽

Parisa Ahmadi

Keyword(s):

Machine Learning ◽

Landslide Susceptibility ◽

Learning Algorithms ◽

Susceptibility Mapping ◽

Machine Learning Algorithms ◽

Landslide Susceptibility Mapping

Predicting Hospitalization following Psychiatric Crisis Care using Machine Learning

10.21203/rs.2.12338/v1 ◽

2019 ◽

Author(s):

Matthijs Blankers ◽

Louk F. M. van der Post ◽

Jack J. M. Dekker

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Learning Algorithms ◽

Nearest Neighbors ◽

Machine Learning Algorithms ◽

Predictor Variables ◽

Gradient Boosting ◽

K Nearest Neighbors ◽

Psychiatric Crisis ◽

Crisis Care

Abstract Background: It is difficult to accurately predict whether a patient on the verge of a potential psychiatric crisis will need to be hospitalized. Machine learning may be helpful to improve the accuracy of psychiatric hospitalization prediction models. In this paper we evaluate and compare the accuracy of ten machine learning algorithms, including the commonly used generalized linear model (GLM/logistic regression), to predict psychiatric hospitalization in the first 12 months after a psychiatric crisis care contact, and explore the most important predictor variables of hospitalization. Methods: Data from 2,084 patients with at least one reported psychiatric crisis care contact included in the longitudinal Amsterdam Study of Acute Psychiatry were used. The accuracy and area under the receiver operating characteristic curve (AUC) of the machine learning algorithms were compared. We also estimated the relative importance of each predictor variable. The best and least performing algorithms were compared with GLM/logistic regression using net reclassification improvement analysis. The target variable for the prediction models was whether or not the patient was hospitalized in the 12 months following inclusion in the study. The 39 predictor variables were related to patients’ socio-demographics, clinical characteristics and previous mental health care contacts. Results: We found Gradient Boosting to perform the best (AUC=0.774) and K-Nearest Neighbors to perform the worst (AUC=0.702). The performance of GLM/logistic regression (AUC=0.76) was above average among the tested algorithms. Gradient Boosting outperformed GLM/logistic regression and K-Nearest Neighbors, and GLM outperformed K-Nearest Neighbors in a Net Reclassification Improvement analysis, although the differences between Gradient Boosting and GLM/logistic regression were small. Nine of the top-10 most important predictor variables were related to previous mental health care use. Conclusions: Gradient Boosting led to the highest predictive accuracy and AUC, while GLM/logistic regression performed average among the tested algorithms. Although statistically significant, the magnitude of the differences between the machine learning algorithms was modest. Future studies may consider combining multiple algorithms in an ensemble model for optimal performance and to mitigate the risk of choosing suboptimally performing algorithms.

Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i03.002 ◽

2021 ◽

Vol 10 (3) ◽

pp. 14-25

Author(s):

Parilkumar Shiroya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

Predicting the Risk of Hypertension Based on Several Easy-to-Collect Risk Factors: A Machine Learning Method

Frontiers in Public Health ◽

10.3389/fpubh.2021.619429 ◽

2021 ◽

Vol 9 ◽

Author(s):

Huanhuan Zhao ◽

Xiaoyu Zhang ◽

Yang Xu ◽

Lisheng Gao ◽

Zuchang Ma ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Logistic Regression ◽

Risk Prediction ◽

Disease Risk ◽

Learning Algorithms ◽

Large Population ◽

Machine Learning Algorithms ◽

Hypertension Risk ◽

Model Training

Hypertension is a widespread chronic disease. Risk prediction of hypertension is an intervention that contributes to the early prevention and management of hypertension. Implementing such an intervention requires an effective and easy-to-implement hypertension risk prediction model. This study evaluated and compared the performance of four machine learning algorithms in predicting the risk of hypertension based on easy-to-collect risk factors. A dataset of 29,700 samples collected through a physical examination was used for model training and testing. First, we identified easy-to-collect risk factors for hypertension through univariate logistic regression analysis. Then, based on the selected features, 10-fold cross-validation was used to optimize four models, random forest (RF), CatBoost, MLP neural network and logistic regression (LR), to find the best hyper-parameters on the training set. Finally, the performance of the models was evaluated by AUC, accuracy, sensitivity and specificity on the test set. The experimental results showed that the RF model outperformed the other three models, achieving an AUC of 0.92, an accuracy of 0.82, a sensitivity of 0.83 and a specificity of 0.81. In addition, Body Mass Index (BMI), age, family history and waist circumference (WC) are the four primary risk factors for hypertension. These findings reveal that it is feasible to use machine learning algorithms, especially RF, to predict hypertension risk without clinical or genetic data. The technique can provide a non-invasive and economical way to prevent and manage hypertension in a large population.
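The accuracy, sensitivity and specificity figures mentioned above all derive from the confusion matrix of a classifier on the test set. A minimal sketch, using hypothetical labels and predictions rather than the study's data (the function name `classification_metrics` is illustrative):

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels and predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # share of true cases detected
        "specificity": tn / (tn + fp),  # share of non-cases correctly cleared
    }


# Hypothetical test set: 1 = hypertensive, 0 = not hypertensive.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since accuracy alone can look high for a model that rarely predicts the minority class.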
