An Initial Screening Method for Tuberculosis Diseases Using a Multi-objective Gradient Evolution-Based Support Vector Machine and C5.0 Decision Tree

Author(s):  
Ferani E. Zulvia ◽  
R. J. Kuo ◽  
E. Roflin
Author(s):  
Zohreh Manoochehri ◽  
Sara Manoochehri ◽  
Farzaneh Soltani ◽  
Majid Sadeghifar

Background: Preeclampsia is a hypertensive disorder of pregnancy that has adverse effects on both the mother and the fetus. Despite recent advances in understanding the etiology of preeclampsia, no adequate clinical screening tests have been identified to diagnose the disorder. Objective: We aimed to provide a model based on data mining approaches that can be used as a screening tool to identify patients with this syndrome and to identify the risk factors associated with it. Materials and Methods: The data for this cross-sectional study were extracted from the clinical records of 726 mothers with preeclampsia and 726 mothers without preeclampsia who were referred to Fatemieh Hospital in Hamadan City during April 2005–March 2015. Six data mining methods were adopted: logistic regression, k-nearest neighbors, C5.0 decision tree, discriminant analysis, random forest, and support vector machine. Their performance was compared using accuracy, sensitivity, and specificity. Results: Underlying condition, age, pregnancy season, and the number of pregnancies were the most important risk factors for diagnosing preeclampsia. The accuracies of the models were as follows: logistic regression (0.713), k-nearest neighbors (0.742), C5.0 decision tree (0.788), discriminant analysis (0.687), random forest (0.758), and support vector machine (0.791). Conclusion: Among the data mining methods employed in this study, the support vector machine was the most accurate in predicting preeclampsia. Therefore, this model can be considered as a screening tool to diagnose this disorder. Key words: Preeclampsia, Random forest, C5.0 decision tree, Support vector machine, Logistic regression.
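
The three comparison criteria named in this abstract (accuracy, sensitivity, specificity) all follow from a binary confusion matrix. The sketch below is only an illustration of those definitions, not the authors' code; the function name `screening_metrics` and the label vectors are made up for the example and do not come from the study data.

```python
def screening_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / len(pairs),
        # Share of true positives among actual positives (recall).
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of true negatives among actual negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical labels: 1 = preeclampsia, 0 = no preeclampsia.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = screening_metrics(y_true, y_pred)
print(metrics)
```

A screening tool usually favors high sensitivity, so reporting all three values rather than accuracy alone makes the trade-off between missed cases and false alarms visible.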


2021 ◽  
Vol 9 ◽  
Author(s):  
Qiaomei Su ◽  
Weiheng Tao ◽  
Shiguang Mei ◽  
Xiaoyuan Zhang ◽  
Kaixin Li ◽  
...  

The main purpose of this study is to establish an effective landslide susceptibility zoning model and to test whether underground mined areas and ground collapse in coal mine areas seriously affect the occurrence of landslides. The research area is the Fenxi Coal Mine Area of Shanxi Province, China, where landslide data were investigated by the Shanxi Geological Environment Monitoring Center. Using 5-fold cross-validation and geostatistical analysis, the landslide and non-landslide datasets were randomly divided in an 80:20 ratio for training and validating the models. A set of 15 condition factors, including terrain, geological, hydrological, land cover, and human engineering activity factors (distance to road, distance to mined area, ground collapse density), were selected as the evaluation indices to construct the susceptibility assessment model. Three machine learning algorithms for landslide susceptibility prediction (LSP), namely C5.0 Decision Tree (C5.0), Random Forest (RF), and Support Vector Machine (SVM), were selected and compared using the area under the receiver operating characteristic (ROC) curve (AUC) and several statistical estimates. For these three models, prediction accuracies range from 83.49 to 99.29% in the training stage and from 62.26 to 73.58% in the validation stage. In the two stages, AUCs are between 0.92 and 0.99 and between 0.71 and 0.80, respectively. Using the Jenks Natural Breaks algorithm, the LSP indices are divided into five levels: very low, low, medium, high, and very high probability of landslide. Compared with RF and SVM, C5.0 is considered better in the five categories according to the quantities and distribution of the landslides and their area percentage in the different LSP zones. Four factors (distance to road, lithology, profile curvature, and ground collapse density) are the most suitable condition factors for LSP. The distance to mined area factor has a medium contribution and plays an obvious role in the occurrence of landslides in all the models. The results reveal that C5.0 possesses better prediction efficiency than RF and SVM, and that underground mined areas and ground collapse significantly affect the occurrence of landslides in the Fenxi Coal Mine Area.
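
As a rough illustration of the evaluation steps described in this abstract, the sketch below shows a random 80:20 split and a rank-based AUC computation. It is a minimal example under stated assumptions, not the authors' pipeline: the helper names `split_80_20` and `roc_auc`, the seed, and the toy labels and scores are all hypothetical.

```python
import random


def split_80_20(samples, seed=0):
    """Shuffle a copy of the samples and split it 80:20 (train, validation)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.8 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as one half (Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical grid cells: 1 = landslide, 0 = non-landslide, with made-up
# susceptibility scores from some model.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)

train, valid = split_80_20(range(10))
print(auc, len(train), len(valid))
```

The pairwise AUC is quadratic in the number of samples, which is fine for a small illustration; a sort-based version would be the usual choice for large rasters.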


2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for a human to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms. Support Vector Machine achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
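
To make one of the three compared techniques concrete, here is a minimal sketch of a Multinomial Naïve Bayes text classifier with Laplace smoothing. It is an illustration only and does not reproduce the authors' experiments: the whitespace tokenizer, the function names `train_multinomial_nb` and `predict`, and the four toy messages are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Fit class priors and Laplace-smoothed word likelihoods."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = text.lower().split()  # naive whitespace tokenizer (assumption)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model = {"log_prior": {}, "log_likelihood": {}}
    for label, n in class_counts.items():
        model["log_prior"][label] = math.log(n / len(labels))
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest log posterior for the message."""
    scores = {}
    for label, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        # Words never seen in training are ignored.
        scores[label] = prior + sum(
            likelihood[w] for w in text.lower().split() if w in likelihood
        )
    return max(scores, key=scores.get)


# Made-up training messages, for illustration only.
messages = [
    "win a free prize now",
    "free cash offer click now",
    "are we meeting for lunch today",
    "see you at the office tomorrow",
]
labels = ["spam", "spam", "ham", "ham"]
model = train_multinomial_nb(messages, labels)
print(predict(model, "free prize offer"), predict(model, "lunch at the office"))
```

Working in log space avoids numerical underflow when many word probabilities are multiplied together.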


2019 ◽  
Vol 1255 ◽  
pp. 012067
Author(s):  
Natalina Br Sitepu ◽  
Sawaluddin ◽  
M Zarlis ◽  
Syahril Efendi ◽  
Hanna Willa Dhany

2020 ◽  
Vol 16 (2) ◽  
pp. 75
Author(s):  
Didit Widiyanto

The accuracy of an image classification is determined by the classifier. Although the RoI (Region of Interest) does not directly determine accuracy, it determines the scope of the image classification. Three algorithms can be used as RoI algorithms: Balanced Histogram Thresholding (BHT), the Otsu algorithm, and the K-Means clustering algorithm. This paper reviews the Otsu algorithm and the K-Means clustering algorithm as used by five researchers. Of the five researchers, three applied the Otsu algorithm and two applied the K-Means algorithm as the RoI algorithm. After the RoI operation, all five researchers applied GLCM (Gray Level Co-occurrence Matrix) as the texture feature extractor. The extracted features were classified using various classifiers, including SVM (Support Vector Machine), Naive Bayes, and Decision Tree. Finally, comparing the results of the five researchers, the highest accuracy obtained was 100%, with an SVM classifier using the Otsu algorithm as the RoI algorithm, and the lowest accuracy was 52%, obtained using the Otsu algorithm on the S channel of HSV (Hue, Saturation, Value) images.
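
The Otsu algorithm mentioned above picks the gray-level threshold that maximizes the between-class variance of the image histogram. The sketch below is a minimal illustration on a flat list of 8-bit pixel values, not the implementation used by any of the reviewed researchers; the function name `otsu_threshold` and the toy pixel values are assumptions.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes between-class variance.
    Pixels with value <= t form one class, pixels > t the other."""
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue  # lower class still empty
        if w1 == 0:
            break     # upper class empty from here on
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        between_var = w0 * w1 * (mu0 - mu1) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


# Made-up flattened image: four dark background pixels, four bright object pixels.
pixels = [10, 12, 11, 13, 200, 205, 198, 210]
t = otsu_threshold(pixels)
roi_mask = [v > t for v in pixels]  # True marks the candidate region of interest
print(t, roi_mask)
```

In a pipeline like the one reviewed, a mask of this kind would restrict the region from which texture features such as GLCM statistics are later extracted.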

