Analysis of Various Decision Tree Algorithms for Classification in Data Mining

Extracurricular activities are additional school activities through which students can add to or explore their skills as part of their self-development. One such activity is the foreign language programme, which covers five languages: Arabic, English, German, Mandarin, and Japanese. To gauge students' interest in extracurricular activities, a study was conducted on the level of interest in the foreign language activities among students at the Vocational School Health Analyst Abdurrab. The level of interest in foreign languages is predicted through data mining using the C4.5 algorithm, which belongs to the family of decision tree algorithms. From this research, the school can find out the extent of its students' interest in foreign languages, the school can expand its extracurricular activities, and students can develop their interest in foreign languages as they wish.

Partition Real Data in Decision Tree Using Statistical Criterion

Applied Mechanics and Materials ◽ 10.4028/www.scientific.net/amm.380-384.1469 ◽ 2013 ◽ Vol 380-384 ◽ pp. 1469-1472

Author(s): Gui Jun Shan

Keyword(s): Machine Learning ◽ Data Mining ◽ Decision Tree ◽ Classification Accuracy ◽ Real Data ◽ Statistical Criterion ◽ Partition Scheme ◽ C4.5 Decision Tree ◽ Tree Algorithms ◽ Partition Method

Partition methods for real-valued data play an extremely important role in decision tree algorithms in data mining and machine learning because these algorithms require attribute values to be discrete. In this paper, we propose a novel partition method for real data in decision trees using a statistical criterion. The method constructs a statistical criterion to find accurate merging intervals. In addition, we present a heuristic partition algorithm that achieves a desired partition result, with the aim of improving the performance of decision tree algorithms. Empirical experiments on UCI real data show that the new algorithm generates a better partition scheme, one that improves the classification accuracy of the C4.5 decision tree more than existing algorithms do.
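The abstract does not spell out its criterion, so the following is only a minimal sketch of the general idea of merging adjacent intervals under a statistical test. It swaps in the well-known chi-square (ChiMerge-style) test rather than the paper's own criterion. The function names, the 2.71 threshold (roughly the 90% critical value for one degree of freedom, which fits a two-class problem), and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def chi2_pair(a, b, classes):
    """Chi-square statistic for the class counts of two adjacent intervals."""
    total = sum(a.values()) + sum(b.values())
    chi2 = 0.0
    for row in (a, b):
        row_total = sum(row.values())
        for c in classes:
            col_total = a[c] + b[c]
            expected = row_total * col_total / total
            if expected > 0:
                chi2 += (row[c] - expected) ** 2 / expected
    return chi2


def merge_intervals(values, labels, threshold=2.71, max_intervals=None):
    """Bottom-up merging: start with one interval per distinct value and
    repeatedly merge the adjacent pair with the smallest chi-square."""
    classes = sorted(set(labels))
    counts = {}
    for v, y in zip(values, labels):
        counts.setdefault(v, Counter())[y] += 1
    # Each interval is [lower_bound, class_counts], ordered by lower bound.
    intervals = [[v, counts[v]] for v in sorted(counts)]
    while len(intervals) > 1:
        scores = [chi2_pair(intervals[i][1], intervals[i + 1][1], classes)
                  for i in range(len(intervals) - 1)]
        i = min(range(len(scores)), key=scores.__getitem__)
        too_many = max_intervals is not None and len(intervals) > max_intervals
        if scores[i] > threshold and not too_many:
            break  # every adjacent pair differs significantly; stop merging
        intervals[i][1] = intervals[i][1] + intervals[i + 1][1]
        del intervals[i + 1]
    # Cut points are the lower bounds of all intervals except the first.
    return [lower for lower, _ in intervals[1:]]


# Hypothetical real-valued attribute with two classes.
values = [1, 2, 3, 4, 10, 11, 12, 13]
labels = list("aaaabbbb")
cut_points = merge_intervals(values, labels)
print(cut_points)
```

Intervals whose class distributions look alike are merged first, so the surviving cut points mark the places where the class mix changes. The discrete intervals can then be passed to a tree learner that expects categorical attributes.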

Classification of Thyroid Disease by Using Data Mining Models: A Comparison of Decision Tree Algorithms

The Oxford Journal of Intelligent Decision and Data Science ◽ 10.5899/2016/ojids-00002 ◽ 2016 ◽ Vol 2016 (2) ◽ pp. 13-28 ◽ Cited By ~ 7

Author(s): Ebru Turanoglu-Bekar ◽ Gozde Ulutagay ◽ Suzan Kantarcı-Savas

Keyword(s): Data Mining ◽ Decision Tree ◽ Thyroid Disease ◽ Tree Algorithms ◽ Using Data

Comparison of Decision Tree Algorithms for Predicting Potential Air Pollutant Emissions with Data Mining Models

Journal of Environmental Informatics ◽ 10.3808/jei.201100186 ◽ 2011 ◽ Vol 17 (1) ◽ pp. 46-53 ◽ Cited By ~ 27

Author(s): D Birant

Keyword(s): Data Mining ◽ Decision Tree ◽ Air Pollutant ◽ Pollutant Emissions ◽ Tree Algorithms

Influence of data mining technology in information analysis of human resource management on macroscopic economic management

PLoS ONE ◽ 10.1371/journal.pone.0251483 ◽ 2021 ◽ Vol 16 (5) ◽ pp. e0251483

Author(s): Ai Zhang

Keyword(s): Data Mining ◽ Resource Management ◽ Decision Tree ◽ Human Resource Management ◽ Human Resource ◽ Classification Accuracy ◽ Ensemble Classifier ◽ Economic Management ◽ Mining Technology ◽ Tree Algorithms

This study aims to manage human resource data better and to explore the association between Human Resource Management (HRM), data mining, and economic management. An Ensemble Classifier-Decision Tree (EC-DT) algorithm, built on single decision tree algorithms, is proposed to analyze HRM data. The single decision tree algorithms involved are C4.5, Random Tree, J48, and SimpleCart. An HRM system is then established based on the designed algorithm, and its evaluation management and talent recommendation modules are tested. Finally, the designed algorithm is compared and tested. Experimental results suggest that C4.5 provides the highest classification accuracy among the single decision tree algorithms, reaching 76.69%; in contrast, the designed EC-DT algorithm provides a classification accuracy of 79.97%. The proposed EC-DT algorithm is also compared with the Content-based Recommendation Method (CRM) and the Collaborative Filtering Recommendation Method (CFRM), revealing that its Data Mining Recommendation Method (DMRM) provides the highest accuracy and recall, reaching 35.2% and 41.6%, respectively. Therefore, the data mining-based HRM system can promote and guide enterprises to develop according to quantitative evaluation results. These results can provide a reference for studying HRM systems based on data mining technology.
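The abstract does not say how EC-DT combines its base trees, so the sketch below only illustrates one common way to build an ensemble from single classifiers: majority voting. The three one-rule "trees", the attribute names, and the records are hypothetical and chosen so that each base classifier makes a different mistake; none of this is taken from the paper.

```python
from collections import Counter


def majority_vote(classifiers, record):
    """Return the label predicted by most classifiers."""
    votes = Counter(clf(record) for clf in classifiers)
    return votes.most_common(1)[0][0]


def accuracy(predict, records, labels):
    """Fraction of records for which predict() matches the true label."""
    correct = sum(predict(r) == y for r, y in zip(records, labels))
    return correct / len(labels)


# Three hypothetical one-rule "trees" over made-up employee attributes.
def tree_a(r):
    return "high" if r["score"] >= 70 else "low"


def tree_b(r):
    return "high" if r["years"] >= 3 else "low"


def tree_c(r):
    return "high" if r["projects"] >= 4 else "low"


trees = [tree_a, tree_b, tree_c]

# Made-up evaluation records and their true performance labels.
records = [
    {"score": 85, "years": 5, "projects": 2},
    {"score": 60, "years": 4, "projects": 6},
    {"score": 55, "years": 6, "projects": 1},
    {"score": 50, "years": 1, "projects": 2},
]
labels = ["high", "high", "low", "low"]


def ensemble(r):
    return majority_vote(trees, r)


for name, clf in [("tree_a", tree_a), ("tree_b", tree_b),
                  ("tree_c", tree_c), ("ensemble", ensemble)]:
    print(name, accuracy(clf, records, labels))
```

Because the base classifiers err on different records, the vote outperforms each of them on this toy set. That is the usual motivation for an ensemble, although real gains depend on how diverse the base trees actually are.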

Decision Tree Algorithms: Integration of Domain Knowledge for Data Mining

Business Information Systems Workshops - Lecture Notes in Business Information Processing ◽ 10.1007/978-3-642-34228-8_2 ◽ 2012 ◽ pp. 13-24 ◽ Cited By ~ 2

Author(s): Auksė Stravinskienė ◽ Saulius Gudas ◽ Aiste Dabrilaite

Keyword(s): Data Mining ◽ Decision Tree ◽ Domain Knowledge ◽ Tree Algorithms

Prediction on Deposit Subscription of Customer based on Bank Telemarketing using Decision Tree with Entropy Comparison

Journal of Applied Intelligent System ◽ 10.33633/jais.v4i2.2772 ◽ 2020 ◽ Vol 4 (2) ◽ pp. 57-66

Author(s): Ardytha Luthfiarta ◽ Junta Zeniarja ◽ Edi Faisal ◽ Wibowo Wicaksono

Keyword(s): Data Mining ◽ Decision Making ◽ Decision Tree ◽ Credit Card ◽ Banking System ◽ Decision Making Process ◽ Data Mining Techniques ◽ Related Information ◽ Customer Information ◽ Tree Algorithms

Banking systems collect enormous amounts of data every day. This data can take the form of customer information, transaction details, risk profiles, credit card details, limits and collateral details, compliance information related to Anti Money Laundering (AML), trade finance data, and SWIFT and telex messages. In addition, thousands of decisions are made in banking systems. For example, banks make credit decisions, relationship start-up decisions, investment decisions, and AML and illegal-financing related decisions every day. Making these decisions requires a comprehensive review of the various reports and drill-down tools provided by the banking systems. However, this is a manual process that is error-prone and time-consuming because of the large volume of transactional and historical data available. Hence, automatic knowledge mining is needed to ease the decision-making process. This research focuses on data mining techniques for handling this problem, specifically on classification using decision tree algorithms. It provides an overview of the data mining techniques and procedures to be performed, and gives insight into how these techniques can be used for deposit subscription in banking systems to make the decision-making process easier and more productive. Keywords - Telemarketing, bank deposit, decision tree, classification, data mining, entropy.
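Since the keywords mention entropy, a small sketch of how an entropy-based decision tree ranks candidate attributes may help. It computes Shannon entropy and information gain, the standard ID3-style criterion, on a made-up telemarketing table. The attribute names (`contacted_before`, `has_loan`, `subscribed`) and the rows are hypothetical, not the paper's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Reduction in target entropy obtained by splitting on attribute."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


# Hypothetical customer records; "subscribed" is the class to predict.
rows = [
    {"contacted_before": "yes", "has_loan": "no", "subscribed": "yes"},
    {"contacted_before": "yes", "has_loan": "yes", "subscribed": "yes"},
    {"contacted_before": "no", "has_loan": "no", "subscribed": "no"},
    {"contacted_before": "no", "has_loan": "yes", "subscribed": "no"},
]

candidates = ["contacted_before", "has_loan"]
best = max(candidates, key=lambda a: information_gain(rows, a, "subscribed"))
print(best)
```

The attribute with the largest gain becomes the root test, and the same calculation is repeated on each resulting subset to grow the rest of the tree.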

A Survey on Decision Tree Algorithms of Classification in Data Mining

International Journal of Science and Research (IJSR) ◽ 10.21275/v5i4.nov162954 ◽ 2016 ◽ Vol 5 (4) ◽ pp. 2094-2097 ◽ Cited By ~ 19

Keyword(s): Data Mining ◽ Decision Tree ◽ Tree Algorithms

FULL-VIEW: A VISUAL DATA-MINING ENVIRONMENT

International Journal of Image and Graphics ◽ 10.1142/s0219467802000524 ◽ 2002 ◽ Vol 02 (01) ◽ pp. 127-143 ◽ Cited By ~ 7

Author(s): FRANÇOIS POULET

Keyword(s): Data Mining ◽ Decision Tree ◽ Domain Knowledge ◽ High Performance ◽ Visual Data Mining ◽ Data Mining Algorithms ◽ Research Areas ◽ Graphical Environment ◽ Bulletin Boards ◽ Tree Algorithms

This paper presents a 3D user-centered interactive graphical environment dedicated to data mining. The aims of this environment are to involve the user in the data mining process (to use domain knowledge during the process), to improve comprehensibility (of both the data and the results of data mining algorithms), to improve interactivity, and to use algorithms from various research areas: statistics, data analysis, visualization, and machine learning. The environment is made of a set of bulletin boards onto which the graphical tools are mapped; bulletin boards are predefined or can be user-defined. Several different visualization tools may be used in a single display, and these tools are linked together to improve data comprehensibility. The tools available in the environment are both graphical and non-graphical; they can be used alone or cooperatively. One of these tools is described in more detail: CIAD, a new graphical interactive decision tree construction algorithm that allows bivariate splits and so gives smaller trees (improving result comprehensibility). Its results are compared to those of existing decision tree algorithms. This environment can be used on any personal computer (it is based on open-source software and so is platform independent) as well as on high-performance graphical systems such as reality centers.
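To illustrate why bivariate splits can give smaller trees, here is a minimal sketch (not CIAD itself) that contrasts a single-attribute threshold test with a test on a linear combination of two attributes. The points, the class names, and the boundary `x + y <= 5` are made up for the example.

```python
def is_pure_split(test, data):
    """True if the test sends each class entirely to its own side."""
    sides = {}
    for point, label in data:
        sides.setdefault(label, set()).add(test(point))
    outcomes = list(sides.values())
    each_class_on_one_side = all(len(s) == 1 for s in outcomes)
    return each_class_on_one_side and len(set.union(*outcomes)) == len(outcomes)


# Hypothetical 2-D points; the classes are separated by the line x + y = 5.
data = [
    ({"x": 1, "y": 4}, "left"),
    ({"x": 4, "y": 1}, "left"),
    ({"x": 2, "y": 5}, "right"),
    ({"x": 5, "y": 2}, "right"),
]

# Try every axis-parallel (univariate) threshold on x and on y.
univariate_ok = any(
    is_pure_split(lambda p, a=attr, t=t: p[a] <= t, data)
    for attr in ("x", "y")
    for t in range(0, 7)
)

# One bivariate test on a linear combination of both attributes.
bivariate_ok = is_pure_split(lambda p: p["x"] + p["y"] <= 5, data)

print(univariate_ok, bivariate_ok)
```

On this data no single threshold on `x` or `y` separates the classes, so an axis-parallel tree needs more than one test, while one bivariate test is enough.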

Machine Learning by Data Mining REPTree and M5P for Predicating Novel Information for PM10

Cloud Computing and Data Science ◽ 10.37256/ccds.112020418 ◽ 2020 ◽ pp. 40-48

Author(s): Yas Alsultanny

Keyword(s): Machine Learning ◽ Data Mining ◽ Decision Tree ◽ Meteorological Parameters ◽ Pm10 Concentration ◽ Divide And Conquer ◽ Time Processing ◽ Elapsed Time ◽ Tree Algorithms

We examined data mining as a technique for extracting knowledge from a database to predict PM10 concentration in relation to meteorological parameters. The purpose of this paper is to compare two decision tree algorithms used in machine learning by data mining, the Reduced Error Pruning Tree (REPTree) and the divide-and-conquer M5P, for predicting Particulate Matter 10 (PM10) concentration from meteorological parameters. The results of the analysis showed that the M5P tree gave a higher correlation than REPTree, along with lower errors and a higher number of rules, while the elapsed processing time of REPTree was less than that of M5P. Both trees showed that humidity absorbed PM10. The paper recommends REPTree and M5P for predicting PM10 and other polluting gases.
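Comparisons like this usually rest on a correlation coefficient plus error measures computed between observed and predicted values. The sketch below shows those calculations (Pearson correlation, mean absolute error, root mean squared error) in plain Python. The observed values and the two prediction lists, `model_a` and `model_b`, are hypothetical placeholders, not REPTree or M5P output.

```python
from math import sqrt


def correlation(actual, predicted):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / sqrt(var_a * var_p)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical observed PM10 values and predictions from two models.
actual = [40, 55, 70, 90]
model_a = [42, 53, 72, 88]
model_b = [50, 50, 65, 80]

for name, pred in [("model_a", model_a), ("model_b", model_b)]:
    print(name, correlation(actual, pred), mae(actual, pred), rmse(actual, pred))
```

A model with higher correlation and lower MAE and RMSE tracks the observations more closely; training time would be measured separately, for example with `time.perf_counter()` around the fitting step.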
