Improving the Prediction Accuracy of Decision Tree Mining with Data Preprocessing

Data Mining in Analysis of Biomechanical Signals

Solid State Phenomena, 2009, Vol 147-149, pp. 588-593
DOI: 10.4028/www.scientific.net/ssp.147-149.588
Cited By ~ 3

Author(s): Marcin Derlatka, Jolanta Pauk

Keyword(s): Data Mining, Principal Component Analysis, Cerebral Palsy, Spina Bifida, Decision Tree, Principal Component, Data Preprocessing, Component Analysis, Kernel Principal Component Analysis

The paper proposes a procedure for processing biomechanical data. It consists of selecting suitable noise-free data, preprocessing the data by means of model identification and Kernel Principal Component Analysis, and then classifying with a decision tree. The classification into groups (normal gait and two selected gait pathologies: Spina Bifida and Cerebral Palsy) gave very good results.


Classification and Prediction on the Effects of Nutritional Intake on Overweight/Obesity, Dyslipidemia, Hypertension and Type 2 Diabetes Mellitus Using Deep Learning Model: 4–7th Korea National Health and Nutrition Examination Survey

International Journal of Environmental Research and Public Health, 2021, Vol 18 (11), pp. 5597
DOI: 10.3390/ijerph18115597

Author(s): Hyerim Kim, Dong Hoon Lim, Yoona Kim

Keyword(s): Diabetes Mellitus, Logistic Regression, Decision Tree, Prediction Accuracy, Nutritional Intake, Nutrition Examination Survey, Learning Models, Health And Nutrition, Machine Learning Models

Few studies have used deep learning, such as a deep neural network (DNN), to classify and predict the influence of nutritional intake on overweight/obesity, dyslipidemia, hypertension and type 2 diabetes mellitus (T2DM). The present study aims to classify and predict associations between nutritional intake and the risk of overweight/obesity, dyslipidemia, hypertension and T2DM by developing a DNN model, and to compare the DNN model with popular machine learning models such as logistic regression and decision tree. Subjects aged 40 to 69 years in the 4–7th (2007 through 2018) Korea National Health and Nutrition Examination Survey (KNHANES) were included. Diagnostic criteria of dyslipidemia (n = 10,731), hypertension (n = 10,991), T2DM (n = 3889) and overweight/obesity (n = 10,980) were set as dependent variables. Nutritional intakes were set as independent variables. A DNN model comprising one input layer with 7 nodes, three hidden layers with 30, 12 and 8 nodes respectively, and one output layer with one node was implemented in Python using Keras with a TensorFlow backend. The binary cross-entropy loss function for binary classification was used with the Adam optimizer. To avoid overfitting, dropout was applied to each hidden layer. Structural equation modelling (SEM) was also performed to simultaneously estimate multivariate causal associations between nutritional intake and overweight/obesity, dyslipidemia, hypertension and T2DM. With five-fold cross-validation, the DNN model showed higher prediction accuracy (0.58654 for dyslipidemia, 0.79958 for hypertension, 0.80896 for T2DM and 0.62496 for overweight/obesity) than the two other machine learning models. Prediction accuracies for dyslipidemia, hypertension, T2DM and overweight/obesity were 0.58448, 0.79929, 0.80818 and 0.62486, respectively, with logistic regression, and 0.52148, 0.66773, 0.71587 and 0.54026, respectively, with a decision tree. The study found that a DNN model with three hidden layers of 30, 12 and 8 nodes had better prediction accuracy than the two conventional machine learning models, logistic regression and decision tree.
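
To make the stated architecture concrete, the following plain-Python sketch counts the trainable parameters of a fully connected network with the layer widths given in the abstract (7, 30, 12, 8, 1) and evaluates the binary cross-entropy loss for a single prediction. It is an illustration only, not the authors' Keras code: the helper names are hypothetical, and it assumes each layer is a standard dense layer with one bias per node (dropout adds no parameters).

```python
import math

# Layer widths as stated in the abstract: input, three hidden layers, output.
LAYER_WIDTHS = [7, 30, 12, 8, 1]


def dense_parameter_count(widths):
    """Weights plus biases of a fully connected stack (hypothetical helper)."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(widths, widths[1:]))


def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    """Loss for one example; p_pred is clipped to avoid log(0)."""
    p = min(max(p_pred, eps), 1.0 - eps)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


# 7*30+30 = 240, 30*12+12 = 372, 12*8+8 = 104, 8*1+1 = 9
total_params = dense_parameter_count(LAYER_WIDTHS)

# An uninformative prediction of 0.5 costs log(2) regardless of the label.
loss_at_half = binary_cross_entropy(1, 0.5)
```

Under these assumptions the network is small (a few hundred parameters), which is consistent with a model trained on seven nutritional-intake inputs.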


COMPARATIVE STUDY OF MACHINE LEARNING KNN, SVM, AND DECISION TREE ALGORITHM TO PREDICT STUDENT’S PERFORMANCE

International Journal of Research -GRANTHAALAYAH, 2019, Vol 7 (1), pp. 190-196
DOI: 10.29121/granthaalayah.v7.i1.2019.1048

Author(s): Slamet Wiyono, Taufiq Abidin

Keyword(s): Decision Tree, Student Performance, Prediction Accuracy, Model Building, Decision Tree Algorithm, Tree Algorithm, SVM Algorithm, Tree Algorithms, Student’s Performance, Predicting Student Performance

Students who are not active affect the number of students who graduate on time. Inactivity can be prevented by predicting student performance. The study compared the KNN, SVM, and Decision Tree algorithms to find the best predictive model. The modelling process comprised five steps: data collection, pre-processing, model building, model comparison, and evaluation. The results show that the SVM algorithm predicts best, with a precision value of 95%. The Decision Tree algorithm has a prediction accuracy of 93% and the KNN algorithm a prediction accuracy of 92%.
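
The abstract reports a precision value for SVM and accuracy values for the other two algorithms, and the two metrics are not interchangeable. The sketch below shows how they can diverge on the same predictions. The confusion-matrix counts are made up for illustration and are not taken from the paper; the function names are likewise hypothetical.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of positive predictions that are correct."""
    return tp / (tp + fp)


# Hypothetical counts for illustration only; not taken from the paper.
tp, fp, tn, fn = 19, 1, 60, 20

acc = accuracy(tp, fp, tn, fn)  # 79 correct out of 100 predictions
prec = precision(tp, fp)        # 19 correct out of 20 positive predictions
```

With these toy counts a classifier reaches high precision while missing many positive cases, so a precision figure alone does not establish that a model is the most accurate of the three.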


A Robust UWSN Handover Prediction System Using Ensemble Learning

Sensors, 2021, Vol 21 (17), pp. 5777
DOI: 10.3390/s21175777

Author(s): Esraa Eldesouky, Mahmoud Bekhit, Ahmed Fathalla, Ahmad Salah, Ahmed Ali

Keyword(s): Decision Tree, Ensemble Learning, Prediction Accuracy, Performance Metrics, Prediction Models, Sensor Nodes, Wireless Sensor, Water Current, Gradient Boosting, Marine Data

The use of underwater wireless sensor networks (UWSNs) for collaborative monitoring and marine data collection tasks is rapidly increasing. One of the major challenges associated with building these networks is handover prediction; this is because the mobility model of the sensor nodes is different from that of ground-based wireless sensor network (WSN) devices. Therefore, handover prediction is the focus of the present work. There have been limited efforts in addressing the handover prediction problem in UWSNs and in the use of ensemble learning in handover prediction for UWSNs. Hence, we propose the simulation of the sensor node mobility using real marine data collected by the Korea Hydrographic and Oceanographic Agency. These data include the water current speed and direction between data. The proposed simulation consists of a large number of sensor nodes and base stations in a UWSN. Next, we collected the handover events from the simulation, which were utilized as a dataset for the handover prediction task. Finally, we utilized four machine learning prediction algorithms (i.e., gradient boosting, decision tree (DT), Gaussian naive Bayes (GNB), and K-nearest neighbor (KNN)) to predict handover events based on historically collected handover events. The obtained prediction accuracy rates were above 95%. The best prediction accuracy rate achieved by the state-of-the-art method was 56% for any UWSN. Moreover, when the proposed models were evaluated on performance metrics, the measured evaluation scores emphasized the high quality of the proposed prediction models. While the ensemble learning model outperformed the GNB and KNN models, the performance of ensemble learning and decision tree models was almost identical.
