Landslide Susceptibility Mapping Using GIS-Based Data Mining Algorithms

Water ◽  
2019 ◽  
Vol 11 (11) ◽  
pp. 2292 ◽  
Author(s):  
Vali Vakhshoori ◽  
Hamid Reza Pourghasemi ◽  
Mohammad Zare ◽  
Thomas Blaschke

The aim of this study was to apply data mining algorithms to produce a landslide susceptibility map of the national-scale Bandar Torkaman catchment in northern Iran. Because the volume of data at this scale made it impossible to apply advanced data mining methods directly, an intermediate approach, called normalized frequency-ratio unique condition units (NFUC), was devised to reduce the data volume. With the aid of this technique, data mining algorithms including fuzzy gamma (FG), binary logistic regression (BLR), backpropagation artificial neural network (BPANN), support vector machine (SVM), and the C5 decision tree (C5DT) were employed. The success and prediction rates of the models, calculated from the receiver operating characteristic curve, were 0.859 and 0.842 for FG, 0.887 and 0.855 for BLR, 0.893 and 0.856 for C5DT, 0.891 and 0.875 for SVM, and 0.896 and 0.872 for BPANN, which showed the highest validation rates of the compared methods. The proposed NFUC approach proved highly efficient in reducing data volume, making the application of computationally demanding algorithms feasible for large areas with voluminous data.
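The unique-condition-unit idea behind NFUC can be sketched as collapsing pixels that share the same combination of factor classes into one record, then assigning each unit a normalized frequency ratio. A minimal sketch in Python with made-up factor classes and landslide labels; the paper's exact NFUC formulation may differ:

```python
from collections import Counter

# Hypothetical per-pixel factor classes (e.g., a slope class and a lithology
# class) plus a landslide mask; the data here are illustrative only.
pixels = [("s1", "l1"), ("s1", "l2"), ("s2", "l1"), ("s1", "l1"), ("s2", "l2")]
landslide = [1, 0, 1, 1, 0]

# Group identical factor-class combinations into unique condition units (UCUs),
# so a model sees one record per combination instead of one per pixel.
units = Counter(pixels)

# Frequency ratio of each UCU: the unit's share of landslide pixels divided
# by its share of all pixels.
total_pix = len(pixels)
total_ls = sum(landslide)
fr = {}
for unit, n_pix in units.items():
    n_ls = sum(l for p, l in zip(pixels, landslide) if p == unit)
    fr[unit] = (n_ls / total_ls) / (n_pix / total_pix)

# Normalize the frequency ratios to [0, 1] so units are comparable.
lo, hi = min(fr.values()), max(fr.values())
nfr = {u: (v - lo) / (hi - lo) for u, v in fr.items()}
```

On this toy grid, five pixels shrink to four units; at catchment scale the reduction is what makes the heavier algorithms tractable.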

2021 ◽  
Vol 10 (2) ◽  
pp. 93
Author(s):  
Wei Xie ◽  
Xiaoshuang Li ◽  
Wenbin Jian ◽  
Yang Yang ◽  
Hongwei Liu ◽  
...  

Landslide susceptibility mapping (LSM) can be an effective way to prevent landslide hazards and mitigate losses. The choice of conditioning factors is crucial to the results of LSM, and the selection of models also plays an important role. In this study, a hybrid method combining GeoDetector and a machine learning cluster was developed to provide a new perspective on these two issues. Redundant factors were identified by quantitatively analyzing the single and interactive impacts of the factors with GeoDetector, and the effect of this step was examined using the mean absolute error (MAE). The machine learning cluster contains four models (artificial neural network (ANN), Bayesian network (BN), logistic regression (LR), and support vector machine (SVM)) and automatically selects the best one for generating the LSM. The receiver operating characteristic (ROC) curve, prediction accuracy, and the seed cell area index (SCAI) were used to evaluate these methods. The results show that the SVM model performed best in the machine learning cluster, with an area under the ROC curve of 0.928 and an accuracy of 83.86%. Therefore, SVM was chosen as the assessment model to map the landslide susceptibility of the study area. The landslide susceptibility map fit well with the landslide inventory, indicating that the hybrid method is effective both in screening landslide influences and in assessing landslide susceptibility.
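The cluster's automatic model selection can be sketched as picking the candidate with the highest AUC on a validation set. A minimal illustration using a rank-based AUC; the labels and per-model scores are made up, and the study's actual training and selection pipeline is more involved:

```python
def auc(labels, scores):
    # Rank-based AUC (Mann-Whitney U): the probability that a random positive
    # scores higher than a random negative, counting ties as one half.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation labels and per-model susceptibility scores; the
# model names mirror the paper's cluster but the numbers are invented.
y = [1, 1, 0, 0, 1, 0]
cluster = {
    "ANN": [0.9, 0.4, 0.6, 0.2, 0.8, 0.3],
    "BN":  [0.7, 0.6, 0.5, 0.4, 0.3, 0.5],
    "LR":  [0.8, 0.3, 0.4, 0.2, 0.9, 0.1],
    "SVM": [0.9, 0.8, 0.2, 0.1, 0.9, 0.2],
}

# Automatically select the best-performing model by validation AUC.
best = max(cluster, key=lambda m: auc(y, cluster[m]))
```

With these toy scores the SVM separates the classes perfectly and is selected, loosely echoing the study's outcome.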


2021 ◽  
Vol 13 (11) ◽  
pp. 2166
Author(s):  
Xin Yang ◽  
Rui Liu ◽  
Mei Yang ◽  
Jingjue Chen ◽  
Tianqiang Liu ◽  
...  

This study proposed a new hybrid model based on the convolutional neural network (CNN) to make effective use of historical datasets and produce a reliable landslide susceptibility map. The proposed model consists of two parts: one extracts landslide spatial information using a two-dimensional CNN and pixel windows, and the other captures the correlated features among the conditioning factors using one-dimensional convolutional operations. To evaluate the validity of the proposed model, two pure CNN models and the previously used random forest and support vector machine methods were selected as benchmarks. A total of 621 earthquake-triggered landslides in Ludian County, China, and 14 conditioning factors derived from topographic, geological, hydrological, geophysical, and land use and land cover data were used to generate a geospatial dataset. The conditioning factors were then selected and analyzed by a multicollinearity analysis and the frequency ratio method. Finally, the trained model calculated the landslide probability of each pixel in the study area and produced the resultant susceptibility map. The results indicated that the hybrid model benefitted from the feature extraction capability of the CNN and achieved high performance in terms of the area under the receiver operating characteristic curve (AUC) and statistical indices. Moreover, the proposed model improved on the two pure CNN models by 6.2% and 3.7%, respectively, in terms of the AUC. Therefore, the proposed model is capable of accurately mapping landslide susceptibility, provides a promising method for hazard mitigation and land use planning, and is recommended for application to other areas of the world.
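The multicollinearity analysis mentioned above is commonly carried out with variance inflation factors (VIFs): each factor is regressed on the others, and a VIF above 10 flags redundancy. A minimal sketch with synthetic conditioning factors; the names `slope`, `aspect`, `relief` and the threshold are illustrative, not the study's data:

```python
import numpy as np

# Synthetic conditioning factors: `relief` nearly duplicates `slope`, so the
# VIF screen should flag both of those columns as collinear.
rng = np.random.default_rng(0)
slope = rng.normal(size=200)
aspect = rng.normal(size=200)
relief = 0.95 * slope + 0.05 * rng.normal(size=200)
X = np.column_stack([slope, aspect, relief])

def vif(X, i):
    # Regress factor i on the remaining factors (with an intercept) and
    # return 1 / (1 - R^2), the variance inflation factor.
    y = X[:, i]
    others = np.delete(X, i, axis=1)
    A = np.column_stack([others, np.ones(len(y))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1 - resid.var() / y.var()
    return 1 / (1 - r2)

vifs = [vif(X, i) for i in range(X.shape[1])]
```

Here `slope` and `relief` receive very large VIFs while the independent `aspect` stays near 1, so one of the collinear pair would be dropped before modeling.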


Author(s):  
Efat Jabarpour ◽  
Amin Abedini ◽  
Abbasali Keshtkar

Introduction: Osteoporosis is a disease that reduces bone density and degrades the quality of bone microstructure, leading to an increased risk of fractures. It is one of the major causes of disability and death in elderly people. The current study aims at determining the factors influencing the incidence of osteoporosis and providing a predictive model for disease diagnosis, in order to increase diagnostic speed and reduce diagnostic costs. Methods: Individuals' data, including personal information, lifestyle, and disease information, were reviewed. A new model has been presented based on the Cross-Industry Standard Process (CRISP) methodology, and the support vector machine (SVM) and Bayesian methods (Tree Augmented Naïve Bayes (TAN)) were applied in Clementine 12 as data mining tools. Results: Several features were detected to affect this disease, and rules were extracted that can be used as a pattern for predicting patients' status. Classification precision was calculated to be 88.39% for SVM and 91.29% for TAN, with TAN achieving the higher precision of the two methods. Conclusion: The most effective factors concerning osteoporosis were detected and can be used, for a new sample with defined characteristics, to predict the possibility of osteoporosis in a person.
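The reported figures are classification precision values, which can be computed from confusion-matrix counts. A small sketch with illustrative counts chosen only to produce percentages of the same magnitude, not the study's data:

```python
# Precision: of all cases the model labeled "osteoporosis", the fraction
# that truly were. The tp/fp counts below are invented for illustration.
def precision(tp, fp):
    return tp / (tp + fp)

svm_p = precision(tp=99, fp=13)   # ~88.4%, close to the reported SVM figure
tan_p = precision(tp=115, fp=11)  # ~91.3%, close to the reported TAN figure
best = "TAN" if tan_p > svm_p else "SVM"
```

Precision is the right summary when the cost of a false "osteoporosis" label (unnecessary follow-up) is the main concern; recall would matter more if missed cases were costlier.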


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Thiago Cesar de Oliveira ◽  
Lúcio de Medeiros ◽  
Daniel Henrique Marco Detzel

Purpose: Real estate appraisals are becoming an increasingly important means of backing up financial operations based on the values of these kinds of assets. However, in very large databases there is a reduction in predictive capacity when traditional methods, such as multiple linear regression (MLR), are used. This paper aims to determine whether, in these cases, the application of data mining algorithms can achieve superior statistical results. First, real estate appraisal databases from five towns and cities in the State of Paraná, Brazil, were obtained from the Caixa Econômica Federal bank. Design/methodology/approach: After initial validations, additional databases were generated with real, transformed and nominal values, in both clean and raw form. Each was processed with a wide range of data mining algorithms (multilayer perceptron, support vector regression, K-star, M5Rules and random forest), either isolated or combined (regression by discretization – logistic, bagging and stacking), using 10-fold cross-validation in the Weka software. Findings: The results showed more varied incremental statistical gains with the use of the algorithms than with MLR, especially when combined algorithms were used. The largest increments were obtained in databases with large amounts of data and in those where only minor initial data cleaning was carried out. The paper also conducts a further analysis, including an algorithmic ranking based on the number of significant results obtained. Originality/value: The authors did not find similar studies or research conducted in Brazil.
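The 10-fold cross-validation used above splits the data into ten folds and holds each out once for testing. A minimal sketch with a synthetic appraisal series and a deliberately trivial mean-only predictor standing in for the real models:

```python
# Minimal k-fold cross-validation: round-robin fold assignment so every
# index lands in exactly one test fold.
def k_fold(n, k):
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test

# Hypothetical appraisal values; a real run would load the property database.
prices = [float(v) for v in range(100, 200)]

errors = []
for train, test in k_fold(len(prices), 10):
    # Stand-in "model": predict the mean of the training fold.
    pred = sum(prices[j] for j in train) / len(train)
    errors.extend(abs(prices[j] - pred) for j in test)

mae = sum(errors) / len(errors)  # mean absolute error over all held-out folds
```

Because every record is tested exactly once, the resulting error estimate uses all the data without ever scoring a record the model was fitted on.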


2015 ◽  
Vol 813-814 ◽  
pp. 1104-1113 ◽  
Author(s):  
A. Sumesh ◽  
Dinu Thomas Thekkuden ◽  
Binoy B. Nair ◽  
K. Rameshkumar ◽  
K. Mohandas

The quality of a weld depends on the welding parameters and the environmental conditions during welding, and improper selection of welding process parameters is one of the main causes of weld defects. In this work, arc sound signals were captured during the welding of carbon steel plates, and statistical features of the sound signals were extracted during the welding process. Data mining algorithms such as Naive Bayes, support vector machines and neural networks were used to classify the weld conditions according to the features of the sound signal. Two weld conditions were considered in this study: good welds, and welds with defects, namely lack of fusion and burn-through. The classification efficiencies of the machine learning algorithms were compared, and the neural network was found to produce better classification efficiency than the other algorithms considered.
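Typical statistical features of an arc-sound frame include the mean, standard deviation, skewness, and kurtosis. A minimal sketch over a synthetic signal (a sine plus a DC offset); the study's actual feature set is not specified here, so this is only one plausible choice:

```python
import math

# Synthetic "arc sound" frame: a 50-cycle sine with a DC offset, standing in
# for one windowed segment of the recorded welding audio.
signal = [0.5 + math.sin(2 * math.pi * 50 * t / 1000) for t in range(1000)]

n = len(signal)
mean = sum(signal) / n
var = sum((x - mean) ** 2 for x in signal) / n
std = math.sqrt(var)
# Skewness and (non-excess) kurtosis: standardized third and fourth moments.
skew = sum((x - mean) ** 3 for x in signal) / (n * std ** 3)
kurt = sum((x - mean) ** 4 for x in signal) / (n * std ** 4)

features = {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt}
```

Feature vectors like this one, computed per frame, are what the Naive Bayes, SVM and neural network classifiers would consume.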


Author(s):  
Moloud Abdar ◽  
Sharareh R. Niakan Kalhori ◽  
Tole Sutikno ◽  
Imam Much Ibnu Subroto ◽  
Goli Arji

Heart diseases are among the nation's leading causes of mortality and morbidity, and data mining techniques can predict the likelihood of patients developing heart disease. The purpose of this study is to compare different data mining algorithms for the prediction of heart diseases. This work applied and compared data mining techniques to predict the risk of heart disease. After feature analysis, models based on five algorithms, namely the decision tree (C5.0), neural network, support vector machine (SVM), logistic regression and k-nearest neighbors (KNN), were developed and validated. The C5.0 decision tree built the model with the greatest accuracy, 93.02%, while KNN, SVM and the neural network achieved 88.37%, 86.05% and 80.23%, respectively. The decision tree's results are readily interpretable and applicable; its rules can be understood easily by clinical practitioners.


Author(s):  
M. Jupri ◽  
Riyanarto Sarno

Achieving optimal tax revenue requires effective and efficient tax supervision, which can be supported by classifying taxpayer compliance with tax regulations. Considering this issue, this paper proposes the classification of taxpayer compliance using data mining algorithms, i.e., C4.5, support vector machine, K-nearest neighbor, naive Bayes, and multilayer perceptron, based on taxpayer compliance data. Taxpayer compliance can be classified into four classes: (1) formal and material compliant taxpayers, (2) formal compliant taxpayers, (3) material compliant taxpayers, and (4) formal and material non-compliant taxpayers. Furthermore, the results of the data mining algorithms are compared using Fuzzy AHP and TOPSIS to determine the best-performing classification based on the criteria of accuracy, F-score, and time required. The selection of priority taxpayers for more detailed supervision at each level of taxpayer compliance is also ranked using Fuzzy AHP and TOPSIS, based on criteria drawn from the dataset variables. The results show that C4.5 gives the best classification performance, achieving a preference value of 0.998, whereas the MLP algorithm yields the lowest preference value of 0.131. Alternative taxpayer A233 is the top-priority taxpayer with a preference value of 0.433, whereas alternative taxpayer A051 is the lowest-priority taxpayer with a preference value of 0.036.
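The TOPSIS step can be sketched as vector-normalizing the decision matrix, locating the ideal and anti-ideal alternatives, and ranking by relative closeness to the ideal. A minimal sketch with equal criterion weights and made-up accuracy, F-score, and time values; the paper combines TOPSIS with Fuzzy AHP-derived weights, which are omitted here:

```python
import math

# Decision matrix: rows are algorithms, columns are accuracy, F-score, time.
# The numbers are illustrative only; time is a cost criterion (lower is better).
matrix = {
    "C4.5": [0.95, 0.94, 2.0],
    "SVM":  [0.90, 0.89, 8.0],
    "MLP":  [0.85, 0.83, 30.0],
}
benefit = [True, True, False]

# Vector-normalize each column so criteria on different scales are comparable.
cols = list(zip(*matrix.values()))
norms = [math.sqrt(sum(v * v for v in col)) for col in cols]
norm = {a: [v / n for v, n in zip(row, norms)] for a, row in matrix.items()}

# Ideal solution: best value per criterion; anti-ideal: worst value.
ideal = [max(c) if b else min(c) for c, b in zip(zip(*norm.values()), benefit)]
worst = [min(c) if b else max(c) for c, b in zip(zip(*norm.values()), benefit)]

def closeness(row):
    # Relative closeness: distance to the anti-ideal over total distance.
    d_pos = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
    d_neg = math.sqrt(sum((v - w) ** 2 for v, w in zip(row, worst)))
    return d_neg / (d_pos + d_neg)

scores = {a: closeness(norm[a]) for a in matrix}
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this toy matrix C4.5 dominates every criterion, so it coincides with the ideal solution and ranks first, loosely mirroring the paper's preference ordering.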


Water ◽  
2021 ◽  
Vol 13 (22) ◽  
pp. 3312
Author(s):  
Jiaying Li ◽  
Weidong Wang ◽  
Yange Li ◽  
Zheng Han ◽  
Guangqi Chen

Landslides represent an increasing menace causing huge casualties and economic losses, and rainfall is a predominant factor inducing them. Landslide susceptibility assessment (LSA) is a commonly used and effective method to prevent landslide risk; however, conventional LSA does not analyze the impact of rainfall on landslides, which is significant and non-negligible. Therefore, a spatiotemporal LSA considering the inducing effect of rainfall is proposed to improve accuracy and applicability. In this study, the influencing factors are selected using the chi-square test, the out-of-bag error and a multicollinearity test. Spatial LSAs are then obtained using the random forest (RF) model, a deep belief networks model and a support vector machine, and compared using the receiver operating characteristic curve and the seed cell area index to determine the optimal assessment result. According to the heavy rainfall characteristics of the study area, the rainfall period is divided into four stages, and the effective rainfall model is employed to generate rainfall impact (RI) maps of the four stages. The spatiotemporal LSAs are obtained by coupling the optimal spatial LSA with the various RI maps and are verified using the landslide warning map. The results demonstrate that the optimal spatiotemporal LSA is obtained by combining the spatial LSA of the RF model with the temporal LSA of the rainfall data in the peak stage. It can predict areas where rainfall-induced landslides are likely to occur and thus help prevent landslide risk.
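A common form of the effective rainfall model referenced above discounts earlier rainfall by a daily decay coefficient. A minimal sketch; the decay value 0.84 and the rainfall series are assumptions for illustration, not the study's parameters:

```python
# Effective (antecedent) rainfall with a daily decay coefficient alpha:
# R_eff = R_0 + alpha * R_1 + alpha^2 * R_2 + ..., where R_0 is the current
# day's rainfall and R_i the rainfall i days earlier.
def effective_rainfall(daily_mm, alpha=0.84):
    return sum(alpha ** i * r for i, r in enumerate(daily_mm))

# Hypothetical series, most recent day first: 40 mm today, 25 mm yesterday...
series = [40.0, 25.0, 10.0, 0.0, 5.0]
r_eff = effective_rainfall(series)
```

Computing this per stage of the rainfall period would give the scalar rainfall impact that each RI map rasterizes over the study area.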


2019 ◽  
Vol 11 (22) ◽  
pp. 6323 ◽  
Author(s):  
Pham ◽  
Prakash ◽  
Chen ◽  
Ly ◽  
Ho ◽  
...  

The main objective of this study is to propose a novel hybrid model of a sequential minimal optimization and support vector machine (SMOSVM) for accurate landslide susceptibility mapping. For this task, one of the landslide-prone areas of Vietnam, the Mu Cang Chai District located in Yen Bai Province, was selected. In total, 248 landslide locations and 15 landslide-affecting factors were selected for landslide modeling and analysis. The predictive capability of SMOSVM was evaluated and compared with other landslide models, namely a hybrid cascade generalization optimization-based support vector machine (CGSVM) and individual models such as support vector machines (SVM) and naïve Bayes trees (NBT). For validation, different quantitative criteria, such as statistics-based methods and the area under the receiver operating characteristic curve (AUC), were used. The results of the study show that the SMOSVM model (AUC = 0.824) has the highest performance for landslide susceptibility mapping, followed by the CGSVM (AUC = 0.815), SVM (AUC = 0.804), and NBT (AUC = 0.800) models. Thus, the proposed novel SMOSVM model is a promising method for better landslide susceptibility mapping and prediction, which can also be applied in other landslide-prone areas.


2019 ◽  
Vol 16 (9) ◽  
pp. 3849-3853
Author(s):  
Dar Masroof Amin ◽  
Atul Garg

The globalisation of the Internet is creating enormous amounts of data on servers; the data created during the last two years alone is equivalent to the data created during all the preceding years. This exponential growth is driven by easy access to Internet of Things devices, and the resulting information has become a source for predictive analysis of future events. The versatile use of computing devices creates data of a diverse nature, and analysts predict future trends using the data of their respective domains. Over time, however, the technology used to analyse the data has become a bottleneck, mainly because the rate at which data is created far exceeds the rate at which it can be accessed. Various mining techniques are used to extract useful information. This research presents a detailed analysis of how data is used and perceived by various data mining algorithms: Naïve Bayes, support vector machines, linear discriminant analysis, artificial neural networks, C4.5, C5.0 and K-nearest neighbour are analysed, with big data files as input. The research focuses mainly on how these existing algorithms interact with big data files and was carried out on Twitter comments.

