Application of Machine Learning to Debris Flow Susceptibility Mapping along the China–Pakistan Karakoram Highway

Feng Qing; Yan Zhao; Xingmin Meng; Xiaojun Su; Tianjun Qi; Dongxia Yue

doi:10.3390/rs12182933

Remote Sensing ◽

10.3390/rs12182933 ◽

2020 ◽

Vol 12 (18) ◽

pp. 2933

Author(s):

Feng Qing ◽

Yan Zhao ◽

Xingmin Meng ◽

Xiaojun Su ◽

Tianjun Qi ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Debris Flow ◽

Debris Flows ◽

Susceptibility Mapping ◽

Support Vector ◽

Safe Operation ◽

Extreme Gradient Boosting ◽

Karakoram Highway ◽

Debris Flow Susceptibility

The China–Pakistan Karakoram Highway is an important land route from China to South Asia and the Middle East via Pakistan. Because the geological environment around the highway is extremely hazardous, landslides, debris flows, collapses, and subsidence are frequent. Debris flows are among the most serious of these geological hazards, and they often interrupt traffic and cause casualties. Debris flow susceptibility mapping along the highway can therefore potentially facilitate its safe operation. In this study, we used remote sensing, GIS, and machine learning techniques to map debris flow susceptibility along the Karakoram Highway in areas where observation data are scarce and difficult to obtain by field survey. First, 544 catchments prone to debris flows were identified through visual interpretation of remote sensing images. The factors influencing debris flow susceptibility were then analyzed, and a total of 17 parameters related to geomorphology, soil materials, and triggering conditions were selected. Models were trained using multiple common machine learning methods, including Ensemble Methods, Gaussian Processes, Generalized Linear Models, Naive Bayes, Nearest Neighbors, Support Vector Machines, Trees, Discriminant Analysis, and eXtreme Gradient Boosting. Support Vector Classification (SVC) was chosen as the final model after evaluation; its accuracy (ACC) was 0.91, and its area under the ROC curve (AUC) was 0.96. Among the factors involved in SVC, the Melton Ratio (MR) was the most important, followed by drainage density (DD), Hypsometric Integral (HI), and average slope (AS), indicating that geomorphic conditions play an important role in predicting debris flow susceptibility in the study area. SVC was used to map debris flow susceptibility in the study area, and the results will potentially facilitate the safe operation of the highway.
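The two evaluation scores named in the abstract, ACC and AUC, can be illustrated with a minimal sketch. The labels, scores, and the 0.5 decision threshold below are hypothetical and are not taken from the study; the function names are likewise illustrative. AUC is computed here as the fraction of positive/negative pairs that the model ranks correctly, which is one standard way to obtain the area under the ROC curve.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = catchment prone to debris flows, 0 = not prone.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
# Assumed decision threshold of 0.5 for turning scores into classes.
predicted = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(labels, predicted)
auc = roc_auc(labels, scores)
```

ACC depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why susceptibility studies commonly report both.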

Comparison of Different Machine Learning Methods for Debris Flow Susceptibility Mapping: A Case Study in the Sichuan Province, China

Remote Sensing ◽

10.3390/rs12020295 ◽

2020 ◽

Vol 12 (2) ◽

pp. 295 ◽

Cited By ~ 6

Author(s):

Ke Xiong ◽

Basanta Raj Adhikari ◽

Constantine A. Stamatopoulos ◽

Yu Zhan ◽

Shaolin Wu ◽

...

Keyword(s):

Machine Learning ◽

Debris Flow ◽

Sichuan Province ◽

Susceptibility Mapping ◽

Boosted Regression Trees ◽

Support Vector ◽

Learning Methods ◽

Machine Learning Methods ◽

Susceptibility Maps ◽

Debris Flow Susceptibility

Debris flow susceptibility mapping is considered to be useful for hazard prevention and mitigation. Sichuan Province, China, is an area of frequent debris flows, where many hazardous events occur annually and cause considerable damage. Therefore, this study attempted to evaluate and compare the performance of four state-of-the-art machine learning methods, namely Logistic Regression (LR), Support Vector Machines (SVM), Random Forest (RF), and Boosted Regression Trees (BRT), for debris flow susceptibility mapping in this region. Four models were constructed based on the debris flow inventory and a range of causal factors. A variety of datasets was obtained through the combined application of remote sensing (RS) and geographic information systems (GIS). The mean altitude, altitude difference, aridity index, and groove gradient played the most important roles in the assessment. The performance of these models was evaluated using predictive accuracy (ACC) and the area under the receiver operating characteristic curve (AUC). The results of this study showed that all four models were capable of producing accurate and robust debris flow susceptibility maps (ACC and AUC values were above 0.75 and 0.80, respectively). With an excellent spatial prediction capability and strong robustness, the BRT model (ACC = 0.781, AUC = 0.852) outperformed the other models and was the ideal choice. Our results also demonstrated the importance of selecting suitable mapping units and optimal predictors. Furthermore, debris flow susceptibility maps of Sichuan Province were produced, which can provide helpful data for assessing and mitigating debris flow hazards.

Spatial Predictions of Debris Flow Susceptibility Mapping Using Convolutional Neural Networks in Jilin Province, China

Water ◽

10.3390/w12082079 ◽

2020 ◽

Vol 12 (8) ◽

pp. 2079

Author(s):

Yang Chen ◽

Shengwu Qin ◽

Shuangshuang Qiao ◽

Qiang Dou ◽

Wenchao Che ◽

...

Keyword(s):

Neural Networks ◽

Debris Flow ◽

Statistical Methods ◽

Convolutional Neural Networks ◽

Debris Flows ◽

Susceptibility Mapping ◽

Jilin Province ◽

Support Vector ◽

Validation Set ◽

Debris Flow Susceptibility

Debris flows are a major geological disaster that can seriously threaten human life and physical infrastructure. The main contribution of this paper is the establishment of two-dimensional convolutional neural network (2D–CNN) models using SAME padding (S–CNN) and VALID padding (V–CNN), and their comparison with support vector machine (SVM) and artificial neural network (ANN) models, to predict the spatial probability of debris flows in Jilin Province, China. First, the dataset is randomly divided into a training set (70%) and a validation set (30%), and thirteen influencing factors are selected to build the models. Then, multicollinearity analysis and gain ratio methods are used to quantify the predictive ability of the factors. Finally, the area under the receiver operating characteristic curve (AUC) and statistical methods are utilized to measure the accuracy of the models. The results show that the S–CNN model achieves the highest AUC value of 0.901 in the validation set, followed by the SVM model, the V–CNN model, and the ANN model. Three statistical methods also show that the S–CNN model produces the smallest errors of the models compared. The S–CNN model is an important means of improving the accuracy of debris flow susceptibility mapping and provides a reasonable scientific basis for critical decisions.
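The difference between SAME and VALID padding comes down to how large the output of a convolution is along each spatial dimension. The sketch below uses the output-size rules of the widely used TensorFlow convention; the abstract does not state which framework, kernel size, or stride the authors used, so the function name and the example numbers are assumptions for illustration only.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="VALID"):
    """Output length of a convolution along one spatial dimension.

    SAME pads the input so the output covers every input position;
    VALID uses no padding, so the kernel must fit entirely inside the input.
    """
    if padding == "SAME":
        return math.ceil(input_size / stride)
    if padding == "VALID":
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")


# Hypothetical 13-cell input with a 3-wide kernel.
same_size = conv_output_size(13, 3, stride=1, padding="SAME")
valid_size = conv_output_size(13, 3, stride=1, padding="VALID")
```

With a stride of 1, SAME padding preserves the input size while VALID padding shrinks it by `kernel_size - 1`, so border information is treated differently by the two variants.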

Debris Flow Susceptibility Mapping Using Machine-Learning Techniques in Shigatse Area, China

Remote Sensing ◽

10.3390/rs11232801 ◽

2019 ◽

Vol 11 (23) ◽

pp. 2801 ◽

Cited By ~ 11

Author(s):

Yonghong Zhang ◽

Taotao Ge ◽

Wei Tian ◽

Yuei-An Liou

Keyword(s):

Neural Network ◽

Machine Learning ◽

Debris Flow ◽

Debris Flows ◽

Gradient Boosting ◽

Learning Methods ◽

Machine Learning Methods ◽

Triggering Factors ◽

Extreme Gradient Boosting ◽

Debris Flow Susceptibility

Debris flows have always been a serious problem in mountain areas. Research on the assessment of debris flow susceptibility (DFS) is useful for preventing and mitigating debris flow risks. The main purpose of this work is to study DFS in the Shigatse area of Tibet using machine learning methods, after assessing the main triggering factors of debris flows. Remote sensing and geographic information systems (GIS) are used to obtain datasets of topography, vegetation, human activities, and soil factors for local debris flows. The imbalance between debris flow susceptibility levels in the datasets is addressed with the Borderline-SMOTE method. Five machine learning methods, i.e., back propagation neural network (BPNN), one-dimensional convolutional neural network (1D-CNN), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost), are used to analyze and fit the relationship between debris flow triggering factors and occurrence, and to evaluate the weight of each triggering factor. ANOVA and Tukey HSD tests revealed that the XGBoost model exhibited the best mean accuracy (0.924) under ten-fold cross-validation and that its performance was significantly better than that of the BPNN (0.871), DT (0.816), and RF (0.901). However, the performance of XGBoost did not differ significantly from that of the 1D-CNN (0.914). This is also the first comparison experiment between the XGBoost and 1D-CNN methods in a DFS study. The DFS maps were verified with five evaluation measures: Precision, Recall, F1 score, Accuracy, and area under the curve (AUC). Experiments show that XGBoost has the best score and that the factors with the greatest impact on debris flows are aspect, annual average rainfall, profile curvature, and elevation.
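Ten-fold cross-validation, the procedure behind the mean accuracies quoted above, can be sketched as follows. This is a generic illustration rather than the authors' pipeline: the helper names, the random seed, and the `evaluate` callback (which would train a model on the training indices and return its accuracy on the held-out indices) are all hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k shuffled folds."""
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def mean_cv_score(n_samples, evaluate, k=10, seed=0):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(tr, te) for tr, te in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)


# Hypothetical dataset of 25 samples split into ten folds.
splits = list(k_fold_indices(25, k=10, seed=0))
fold_sizes = [len(test) for _, test in splits]
```

Each sample is held out exactly once, so the averaged score uses every observation for testing without ever testing a model on data it was trained on. Any oversampling step such as Borderline-SMOTE is normally applied to the training indices only, so that synthetic samples do not leak into the held-out fold.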

Assessment and Comparison of Six Machine Learning Models in Estimating Evapotranspiration over Croplands Using Remote Sensing and Meteorological Factors

Remote Sensing ◽

10.3390/rs13193838 ◽

2021 ◽

Vol 13 (19) ◽

pp. 3838

Author(s):

Yan Liu ◽

Sha Zhang ◽

Jiahua Zhang ◽

Lili Tang ◽

Yun Bai

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Hybrid Model ◽

Regional Scale ◽

Hybrid Models ◽

Gradient Boosting ◽

Accurate Estimation ◽

Support Vector ◽

Extreme Gradient Boosting ◽

Monteith Equation

Accurate estimates of evapotranspiration (ET) over croplands on a regional scale can provide useful information for agricultural management. The hybrid ET model, which combines a physical framework (the Penman-Monteith equation) with machine learning (ML) algorithms, has proven to be effective in estimating ET. However, few studies have compared the ET estimation performance of multiple hybrid model versions that use different ML algorithms. In this study, we constructed six hybrid ET models based on six classical ML algorithms, namely the K nearest neighbor algorithm, random forest, support vector machine, extreme gradient boosting algorithm, artificial neural network (ANN), and long short-term memory (LSTM), using observed data from 17 cropland eddy covariance flux sites around the globe. Each hybrid model was assessed with ten different input data combinations. In each hybrid model, the ML algorithm was used to model the stomatal conductance (Gs), and ET was then estimated using the Penman-Monteith equation together with the ML-based Gs. The results showed that all hybrid models can reasonably reproduce cropland ET when two or more remote sensing (RS) factors are used. The results also showed that although including RS factors contributes markedly to improving ET estimates, hybrid models other than LSTM that used three or more RS factors were only marginally better than those using two RS factors. We also found that the ANN-based model performs best among all ML-based models in modeling daily ET, as indicated by its lower root-mean-square error (RMSE, 18.67–21.23 W m−2) and higher correlation coefficient (r, 0.90–0.94). ANN is more suitable for modeling Gs than the other ML algorithms investigated and can provide methodological support for accurate estimation of cropland ET on a regional scale.
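For reference, a commonly used form of the Penman-Monteith equation is given below, written in terms of conductances so that the role of Gs is visible. The abstract does not give the exact formulation used in the study, so this standard textbook form and its symbol names are assumptions.

```latex
\lambda E = \frac{\Delta \,(R_n - G) + \rho_a \, c_p \, \mathrm{VPD} \, G_a}
                 {\Delta + \gamma \left( 1 + \dfrac{G_a}{G_s} \right)}
```

Here $\lambda E$ is the latent heat flux, $\Delta$ the slope of the saturation vapor pressure curve, $R_n$ the net radiation, $G$ the soil heat flux, $\rho_a$ the air density, $c_p$ the specific heat of air, $\mathrm{VPD}$ the vapor pressure deficit, $\gamma$ the psychrometric constant, $G_a$ the aerodynamic conductance, and $G_s$ the surface (stomatal) conductance. In a hybrid model of the kind described, $G_s$ would be the quantity supplied by the ML algorithm, while the remaining terms come from meteorological inputs.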

A comparison of statistical and machine learning methods for debris flow susceptibility mapping

Stochastic Environmental Research and Risk Assessment ◽

10.1007/s00477-020-01851-8 ◽

2020 ◽

Vol 34 (11) ◽

pp. 1887-1907 ◽

Cited By ~ 1

Author(s):

Zhu Liang ◽

Chang-Ming Wang ◽

Zhi-Min Zhang ◽

Kaleem-Ullah-Jan Khan

Keyword(s):

Machine Learning ◽

Debris Flow ◽

Susceptibility Mapping ◽

Learning Methods ◽

Machine Learning Methods ◽

Debris Flow Susceptibility

Monitoring the Foliar Nutrients Status of Mango Using Spectroscopy-Based Spectral Indices and PLSR-Combined Machine Learning Models

Remote Sensing ◽

10.3390/rs13040641 ◽

2021 ◽

Vol 13 (4) ◽

pp. 641

Author(s):

Gopal Ramdas Mahajan ◽

Bappa Das ◽

Dayesh Murgaokar ◽

Ittai Herrmann ◽

Katja Berger ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Partial Least Square ◽

Least Square ◽

Partial Least Square Regression ◽

Support Vector ◽

Spectral Indices ◽

Learning Models ◽

Leaf Nutrients ◽

Machine Learning Models

Conventional methods of plant nutrient estimation for nutrient management need a large number of leaf or tissue samples and extensive chemical analysis, which is time-consuming and expensive. Remote sensing is a viable tool for estimating a plant's nutritional status in order to determine the appropriate amounts of fertilizer inputs. The aim of the study was to use remote sensing to characterize the foliar nutrient status of mango through the development of spectral indices, multivariate analysis, chemometrics, and machine learning modeling of the spectral data. Leaf-sample spectra in the 350–1050 nm wavelength range and the corresponding leaf nutrient contents were analyzed to develop spectral indices and multivariate models. The normalized difference and ratio spectral indices and the multivariate models (partial least square regression (PLSR), principal component regression, and support vector regression (SVR)) were ineffective in predicting any of the leaf nutrients. An approach using PLSR-combined machine learning models was found to be the best for predicting most of the nutrients. Based on the independent validation performance and summed ranks, the best performing models were cubist (R2 ≥ 0.91, ratio of performance to deviation (RPD) ≥ 3.3, and ratio of performance to interquartile distance (RPIQ) ≥ 3.71) for nitrogen, phosphorus, potassium, and zinc; SVR (R2 ≥ 0.88, RPD ≥ 2.73, RPIQ ≥ 3.31) for calcium, iron, copper, and boron; and elastic net (R2 ≥ 0.95, RPD ≥ 4.47, RPIQ ≥ 6.11) for magnesium and sulfur. The results of the study revealed the potential of using hyperspectral remote sensing data for non-destructive estimation of mango leaf macro- and micro-nutrients. The developed approach is suggested for use within operational retrieval workflows for precision management of mango orchard nutrients.
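The three validation scores quoted above can be computed as in the following sketch. It assumes the common definitions RPD = standard deviation of the observed values divided by RMSE, and RPIQ = interquartile range of the observed values divided by RMSE; the abstract does not say whether the sample or population standard deviation was used, or which quantile method, so the choices below (sample standard deviation, Python's default quantile method) are assumptions, as are the example values.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return R2, RMSE, RPD, and RPIQ for paired observed/predicted values."""
    n = len(observed)
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = statistics.fmean(observed)
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(residual_ss / n)
    # Quartiles of the observed values (default "exclusive" method).
    q1, _, q3 = statistics.quantiles(observed, n=4)
    return {
        "r2": 1 - residual_ss / total_ss,
        "rmse": rmse,
        "rpd": statistics.stdev(observed) / rmse,  # sample SD assumed
        "rpiq": (q3 - q1) / rmse,
    }


# Hypothetical nutrient concentrations and model predictions.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
metrics = regression_metrics(observed, predicted)
```

RPD and RPIQ both scale the prediction error by the spread of the reference data; RPIQ uses the interquartile range, which makes it less sensitive to skewed nutrient distributions than the standard deviation.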

Machine Learning Approach for Predicting Lane-Change Maneuvers using the SHRP2 Naturalistic Driving Study Data

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/03611981211003581 ◽

2021 ◽

pp. 036119812110035

Author(s):

Anik Das ◽

Mohamed M. Ahmed

Keyword(s):

Machine Learning ◽

Prediction Accuracy ◽

Machine Learning Algorithms ◽

Support Vector ◽

Lane Change ◽

Adaptive Boosting ◽

Extreme Gradient Boosting ◽

Naturalistic Driving Study ◽

Naturalistic Driving ◽

Change Prediction

Accurate real-time lane-change prediction is essential for safely operating Autonomous Vehicles (AVs) on roadways, especially at the early stage of AV deployment, when AVs will interact with human-driven vehicles. This study proposed reliable lane-change prediction models considering features from vehicle kinematics, machine vision, driver, and roadway geometric characteristics using the trajectory-level SHRP2 Naturalistic Driving Study and Roadway Information Database. Several machine learning algorithms were trained, validated, tested, and comparatively analyzed based on six different sets of features, including Classification And Regression Trees (CART), Random Forest (RF), eXtreme Gradient Boosting (XGBoost), Adaptive Boosting (AdaBoost), Support Vector Machine (SVM), K Nearest Neighbor (KNN), and Naïve Bayes (NB). In each feature set, relevant features were extracted through a wrapper-based algorithm named Boruta. The results showed that, considering all features, the XGBoost model outperformed all other models, with the highest overall prediction accuracy (97%) and F1-score (95.5%). However, the highest overall prediction accuracy of 97.3% and F1-score of 95.9% were observed for the XGBoost model based on vehicle kinematics features. Moreover, XGBoost was the only model that achieved a reliable and balanced prediction performance across all six feature sets. Furthermore, a simplified XGBoost model was developed for each feature set with practical implementation in mind. The proposed prediction model could help in trajectory planning for AVs and could be used to develop more reliable advanced driver assistance systems (ADAS) in a cooperative connected and automated vehicle environment.

Machine learning models to identify low adherence to influenza vaccination among Korean adults with cardiovascular disease

BMC Cardiovascular Disorders ◽

10.1186/s12872-021-01925-7 ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Moojung Kim ◽

Young Jae Kim ◽

Sung Jin Park ◽

Kwang Gi Kim ◽

Pyung Chun Oh ◽

...

Keyword(s):

Machine Learning ◽

Cardiovascular Disease ◽

Influenza Vaccination ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

Age Group ◽

Learning Models ◽

Extreme Gradient Boosting ◽

Machine Learning Models

Background: Annual influenza vaccination is an important public health measure to prevent influenza infections and is strongly recommended for cardiovascular disease (CVD) patients, especially in the current coronavirus disease 2019 (COVID-19) pandemic. The aim of this study is to develop a machine learning model to identify Korean adult CVD patients with low adherence to influenza vaccination. Methods: Adults with CVD (n = 815) from a nationally representative dataset of the Fifth Korea National Health and Nutrition Examination Survey (KNHANES V) were analyzed. Among these adults, 500 (61.4%) had answered "yes" to whether they had received seasonal influenza vaccinations in the past 12 months. The classification process was performed using the logistic regression (LR), random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) machine learning techniques. Because the Ministry of Health and Welfare in Korea offers free influenza immunization for the elderly, separate models were developed for the < 65 and ≥ 65 age groups. Results: The accuracy of machine learning models using 16 variables as predictors of low influenza vaccination adherence was compared; for the ≥ 65 age group, XGB (84.7%) and RF (84.7%) had the best accuracies, followed by LR (82.7%) and SVM (77.6%). For the < 65 age group, SVM had the best accuracy (68.4%), followed by RF (64.9%), LR (63.2%), and XGB (61.4%). Conclusions: The machine learning models show comparable performance in classifying adult CVD patients with low adherence to influenza vaccination.

Prediction of Hanwoo Cattle Phenotypes from Genotypes Using Machine Learning Methods

Animals ◽

10.3390/ani11072066 ◽

2021 ◽

Vol 11 (7) ◽

pp. 2066

Author(s):

Swati Srivastava ◽

Bryan Irvine Lopez ◽

Himansu Kumar ◽

Myoungjin Jang ◽

Han-Ha Chai ◽

...

Keyword(s):

Machine Learning ◽

Support Vector ◽

Learning Methods ◽

Eye Muscle ◽

Important Species ◽

Machine Learning Methods ◽

Extreme Gradient Boosting ◽

Boosting Method ◽

Predictive Correlation ◽

Hanwoo Cattle

Hanwoo was originally raised for draft purposes, but the increase in local demand for red meat turned that purpose into full-scale meat-type cattle rearing; it is now considered one of the most economically important species and a vital food source for Koreans. The application of genomic selection in Hanwoo breeding programs in recent years was expected to lead to higher genetic progress. However, better statistical methods that can improve the genomic prediction accuracy are required. Hence, this study aimed to compare the predictive performance of three machine learning methods, namely random forest (RF), the extreme gradient boosting method (XGB), and support vector machine (SVM), with that of genomic best linear unbiased prediction (GBLUP) when predicting carcass weight (CWT), marbling score (MS), backfat thickness (BFT), and eye muscle area (EMA). Phenotypic and genotypic data (53,866 SNPs) from 7324 commercial Hanwoo cattle that were slaughtered at around 30 months of age were used. The results showed that the boosting method XGB achieved the highest predictive correlation for CWT and MS, followed by GBLUP, SVM, and RF. Meanwhile, the best predictive correlation for BFT and EMA was delivered by GBLUP, followed by SVM, RF, and XGB. Although XGB presented the highest predictive correlations for some traits, we did not find an advantage of XGB or any other machine learning method over GBLUP according to the mean squared error of prediction. Thus, we still recommend the use of GBLUP in the prediction of genomic breeding values for carcass traits in Hanwoo cattle.

A Novel Hybrid Method for Landslide Susceptibility Mapping-Based GeoDetector and Machine Learning Cluster: A Case of Xiaojin County, China

ISPRS International Journal of Geo-Information ◽

10.3390/ijgi10020093 ◽

2021 ◽

Vol 10 (2) ◽

pp. 93

Author(s):

Wei Xie ◽

Xiaoshuang Li ◽

Wenbin Jian ◽

Yang Yang ◽

Hongwei Liu ◽

...

Keyword(s):

Machine Learning ◽

Hybrid Method ◽

ROC Curve ◽

Landslide Susceptibility ◽

Susceptibility Mapping ◽

Assessment Model ◽

Landslide Susceptibility Mapping ◽

Support Vector ◽

Susceptibility Map ◽

Area Index

Landslide susceptibility mapping (LSM) can be an effective way to prevent landslide hazards and mitigate losses. The choice of conditioning factors is crucial to the results of LSM, and the selection of models also plays an important role. In this study, a hybrid method combining GeoDetector and a machine learning cluster was developed to provide a new perspective on how to address these two issues. We identified redundant factors by using GeoDetector to quantitatively analyze the individual and interactive impacts of the factors; the effect of this step was examined using the mean absolute error (MAE). The machine learning cluster contains four models (artificial neural network (ANN), Bayesian network (BN), logistic regression (LR), and support vector machines (SVM)) and automatically selects the best one for generating the LSM. The receiver operating characteristic (ROC) curve, prediction accuracy, and the seed cell area index (SCAI) were used to evaluate these methods. The results show that the SVM model had the best performance in the machine learning cluster, with an area under the ROC curve of 0.928 and an accuracy of 83.86%. Therefore, SVM was chosen as the assessment model to map the landslide susceptibility of the study area. The landslide susceptibility map fit the landslide inventory well, indicating that the hybrid method is effective in screening landslide influencing factors and assessing landslide susceptibility.
