A Bi-criteria Optimization Model for Adjusting the Decision Tree Parameters

2021
Author(s):
Mohammad Azad
Mikhail Moshkov

Decision trees play a very important role in knowledge representation because of their simplicity and self-explanatory nature. We study the optimization of decision tree parameters to find a tree that is shorter as well as more accurate. Since these two criteria are in conflict, we need to find a decision tree with suitable parameters that offers a trade-off between the two. Hence, we design two algorithms, based on the bi-criteria optimization technique, that build a decision tree subject to a given threshold on the number of vertices. Then, we calculate the local and global misclassification rates for these trees. Our goal is to study the effect of changing the threshold on the bi-criteria optimization of decision trees. We apply our algorithms to 13 decision tables from the UCI Machine Learning Repository and recommend a suitable threshold that yields more accurate decision trees with a reasonable number of vertices.
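The trade-off described in this abstract can be illustrated with a generic Pareto-front selection. This is a minimal sketch, not the authors' algorithms: it assumes each candidate tree has already been summarized as a pair (number of vertices, misclassification rate), and the function names and the sample numbers are hypothetical.

```python
from typing import List, Optional, Tuple

# Each candidate tree is summarized as (number_of_vertices, misclassification_rate).
Candidate = Tuple[int, float]


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Return the candidates that no other candidate beats on both criteria."""
    front: List[Candidate] = []
    best_error = float("inf")
    # Sorting by size, then error, lets one pass keep only strict improvements.
    for size, error in sorted(set(candidates)):
        if error < best_error:
            front.append((size, error))
            best_error = error
    return front


def best_under_threshold(
    candidates: List[Candidate], max_vertices: int
) -> Optional[Candidate]:
    """Pick the most accurate non-dominated tree within the vertex threshold."""
    feasible = [c for c in pareto_front(candidates) if c[0] <= max_vertices]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c[1], c[0]))


# Hypothetical (size, error) pairs, for illustration only.
example_candidates = [(3, 0.30), (5, 0.22), (5, 0.25), (9, 0.22), (11, 0.15), (21, 0.15)]
print(pareto_front(example_candidates))
print(best_under_threshold(example_candidates, max_vertices=10))
```

Raising or lowering `max_vertices` moves the selection along the front, which is one simple way to see how a threshold on tree size limits the achievable accuracy.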

2002
Vol 13 (03)
pp. 445-458
Author(s):
HANS ZANTEMA
HANS L. BODLAENDER

Decision tables provide a natural framework for knowledge acquisition and representation in the area of knowledge based information systems. Decision trees provide a standard method for inductive inference in the area of machine learning. In this paper we show how decision tables can be considered as ordered decision trees: decision trees satisfying an ordering restriction on the nodes. Every decision tree can be represented by an equivalent ordered decision tree, but we show that doing so may exponentially blow up sizes, even if the choice of the order is left free. Our main result states that finding an ordered decision tree of minimal size that represents the same function as a given ordered decision tree is an NP-hard problem; in earlier work we obtained a similar result for unordered decision trees.


2021
Vol 11 (15)
pp. 6728
Author(s):
Muhammad Asfand Hafeez
Muhammad Rashid
Hassan Tariq
Zain Ul Abideen
Saud S. Alotaibi
...

Classification and regression are the major applications of machine learning algorithms, which are widely used to solve problems in numerous domains of engineering and computer science. Different classifiers based on the optimization of the decision tree have been proposed; however, this area is still evolving. This paper presents a novel and robust classifier based on a decision tree and a tabu search algorithm. With the aim of improving performance, the proposed algorithm constructs multiple decision trees while employing tabu search to consistently monitor the leaf and decision nodes in the corresponding decision trees. Additionally, the tabu search algorithm is responsible for balancing the entropy of the corresponding decision trees. For training the model, we used the clinical data of COVID-19 patients to predict whether a patient is suffering. The experimental results were obtained using our proposed classifier based on the built-in scikit-learn library in Python. An extensive performance comparison against conventional supervised machine learning algorithms is presented using Big O and statistical analysis. Moreover, a performance comparison with optimized state-of-the-art classifiers is also presented. The achieved accuracy of 98%, the required execution time of 55.6 ms, and the area under the receiver operating characteristic curve (AUROC) of 0.95 for the proposed method indicate that the proposed classifier is suitable for large datasets.


2019
Vol 8 (11)
pp. e298111473
Author(s):
Hugo Kenji Rodrigues Okada
Andre Ricardo Nascimento das Neves
Ricardo Shitsuka

Decision trees are data structures or computational methods that enable nonparametric supervised machine learning and are used in classification and regression tasks. The aim of this paper is to present a comparison between the decision tree induction algorithms C4.5 and CART. A quantitative study is performed in which the two methods are compared with respect to operation and complexity. The experiments showed practically equal hit percentages; in execution time for tree induction, however, the CART algorithm was approximately 46.24% slower than C4.5 and was considered to be more effective.
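One well-known difference in how the two algorithms operate is the split criterion: CART classification trees are usually described with Gini impurity, while C4.5 uses the gain ratio. The abstract does not spell this out, so the following is a minimal textbook-style sketch and not the paper's experimental code; the function names are illustrative, and real implementations add pruning, continuous-attribute handling, and other refinements.

```python
import math
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels: Sequence[str]) -> float:
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def gini_decrease(parent: Sequence[str], children: List[Sequence[str]]) -> float:
    """CART-style score: impurity of the parent minus weighted child impurity."""
    n = len(parent)
    return gini(parent) - sum(len(ch) / n * gini(ch) for ch in children)


def gain_ratio(parent: Sequence[str], children: List[Sequence[str]]) -> float:
    """C4.5-style score: information gain divided by the split information."""
    n = len(parent)
    info_gain = entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)
    split_info = -sum(
        (len(ch) / n) * math.log2(len(ch) / n) for ch in children if len(ch) > 0
    )
    # A split that sends every example to one branch carries no information.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy labels, for illustration only: a split that separates the classes perfectly.
parent_labels = ["a", "a", "b", "b"]
perfect_split = [["a", "a"], ["b", "b"]]
print(gini_decrease(parent_labels, perfect_split))
print(gain_ratio(parent_labels, perfect_split))
```

Both scores are computed per candidate split, and each algorithm greedily picks the split with the highest score at every node.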


Author(s):  
M. Carr ◽  
V. Ravi ◽  
G. Sridharan Reddy ◽  
D. Veranna

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and a dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. Using Decision Trees, the mobile banking adopter's profile was identified. Comparing the different machine learning techniques, it was found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Out of all the techniques, Decision Tree is recommended for profiling studies because, apart from obtaining highly accurate results, it also yields 'if–then' classification rules. The classification rules provided here can be used to target potential customers to adopt mobile banking by offering them appropriate incentives.
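The 'if–then' rules mentioned here correspond to root-to-leaf paths of a decision tree. The sketch below shows that mapping on a toy tree; the nested-dictionary representation, the feature names (`uses_internet_banking`, `owns_smartphone`), and the class labels are hypothetical and are not taken from the survey or its fourteen variables.

```python
from typing import Dict, List, Tuple, Union

# A leaf is a class label (str); an internal node is a dict with a tested
# feature and one subtree per feature value. This layout is an assumption.
Tree = Union[str, Dict[str, object]]


def extract_rules(tree: Tree, conditions: Tuple[str, ...] = ()) -> List[str]:
    """Turn every root-to-leaf path into one 'IF ... THEN ...' rule."""
    if isinstance(tree, str):
        antecedent = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {antecedent} THEN class = {tree}"]
    feature = tree["feature"]
    rules: List[str] = []
    for value, subtree in tree["branches"].items():
        rules.extend(extract_rules(subtree, conditions + (f"{feature} = {value}",)))
    return rules


# Hypothetical toy tree, for illustration only.
toy_tree: Tree = {
    "feature": "uses_internet_banking",
    "branches": {
        "yes": {
            "feature": "owns_smartphone",
            "branches": {"yes": "adopter", "no": "non-adopter"},
        },
        "no": "non-adopter",
    },
}

for rule in extract_rules(toy_tree):
    print(rule)
```

Each leaf produces exactly one rule, so the number of rules equals the number of leaves, which is one reason small trees are easier to read as profiles.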


2008
Vol 07 (01)
pp. 37-46
Author(s):
Madjid Tavana

Expert systems (ESs) are complex information systems that are expensive to build and difficult to validate. Numerous knowledge representation strategies, such as rules, semantic networks, frames, objects and logical expressions, have been developed to provide a high-level abstraction of a system. Rules are the most commonly used form of knowledge representation, and they are derived from popular techniques such as decision trees and decision tables. Despite their huge popularity, decision trees and decision tables are static and cannot model the dynamic requirements of a system. In this study, we propose Petri Nets (PNs) for dynamic system representation and rule derivation. PNs, with their graphical and precise nature and their firm mathematical foundation, are especially useful for building ESs that exhibit a variety of situations, including sequential execution, conflict, concurrency, synchronisation, merging, confusion, or prioritisation. We demonstrate the application of our methodology in the design and development of a medical diagnostic expert system.


Author(s):
Natalia Nakaryakova
Sergey Rusakov
Olga Rusakova
...

Mass education in Russian universities in specialties (directions of study) related to the exact and technical sciences is characterized by a high dropout rate, starting from the first year of study. The current level of school education and the system for selecting applicants through the USE (Unified State Exam) procedure in many cases do not guarantee that future students will be able to successfully master science-intensive specialties. An emphasis on student-centered, individual learning is possible only after students have proven themselves in the early stages of their studies. Therefore, identifying in advance whether yesterday's applicants are able to study effectively is a very urgent task. In this paper, we consider methods for constructing decision trees designed to classify students, singling out the set of those (the risk group) who, with a high degree of probability, will be expelled after the first academic cycle (trimester). Only the minimal information about the freshmen recorded in their personal files is used as input data. The model was built from data on students of the applied mathematics and computer science direction of the Perm State National Research University for the five intakes of 2014-2018: the data from 2014-2017 were used for training, and the 2018 intake was used as the test set. At the machine learning stage, several decision tree models were considered, which were optimized using balancing and restrictions on the maximum tree depth and the minimum number of elements in a leaf. The effectiveness of the binary classification was assessed using a confusion matrix and a number of numerical criteria derived from it. As a result of machine learning, a decision tree was built that placed 16 of the 17 people expelled in the first trimester into the risk group. That is, for a number of reasons, these students turned out to be unable to study in the applied mathematics and computer science direction. In addition, it was possible to determine the level of significance of the various types of initial data, showing that the USE results largely determine the success of students at this stage of training. The definition of the risk group provides certain guidelines for the purposeful activity of teachers and university psychologists, which can ultimately serve as a basis for improving the quality of education and reducing dropout rates. The work performed demonstrates the capabilities of data mining methods in solving the poorly formalized tasks characteristic of this type of human activity.
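The "16 out of 17" result is a recall figure: the share of actually expelled students that the tree placed in the risk group. A minimal sketch of that calculation follows. The function names are illustrative, and the abstract does not report the other cells of the classification matrix, so precision is defined here but not evaluated.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Share of actual positives (expelled students) that the model flagged."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined when there are no actual positives")
    return true_positives / actual_positives


def precision(true_positives: int, false_positives: int) -> float:
    """Share of flagged students (the risk group) who were actually expelled."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nothing was flagged")
    return true_positives / flagged


# From the abstract: 16 of the 17 expelled students were placed in the risk group,
# so 16 true positives and 1 false negative.
print(round(recall(true_positives=16, false_negatives=1), 3))  # 0.941
```

Recall alone does not show how many students were flagged unnecessarily; that would require the number of false positives, which is why criteria derived from the full matrix are normally reported together.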


Author(s):  
Dimitris Kalles ◽  
Athanasios Papagelis

Decision trees are one of the most successful Machine Learning paradigms. This paper presents a library of decision tree algorithms in Java that was eventually used as a programming laboratory workbench. The initial design focus was twofold: to let the non-expert user conduct experiments with decision trees using components and visual tools that facilitate tree construction and manipulation, and to let the expert user focus on algorithm design and comparison with few implementation details. The system has been built over a number of years and across various development contexts, and has been successfully used as a workbench in a programming laboratory for junior computer science students. The underlying philosophy was to achieve a solid introduction to object-oriented concepts and practices based on a fundamental machine learning paradigm.


Author(s):  
Terrence L. Chambers ◽  
Alan R. Parkinson

Many different knowledge representations, such as rules and frames, have been proposed for use with engineering expert systems. Every knowledge representation has certain inherent strengths and weaknesses. A knowledge engineer can exploit the advantages, and avoid the pitfalls, of different common knowledge representations if the knowledge can be mapped from one representation to another as needed. This paper derives the mappings between rules, logic diagrams, frames, decision tables and decision trees using the calculus of truth-functional logic. The logical mappings between these representations are illustrated through a simple example, the limitations of the technique are discussed, and the utility of the technique for the rapid-prototyping and validation of engineering expert systems is introduced.


Author(s):  
YAN ZHAO ◽  
YIYU YAO ◽  
JINGTAO YAO

A partition-based framework is presented for a formal study of classification problems. An information table is used as a knowledge representation, in which all basic notions are precisely defined by using a language known as the decision logic language. Solutions to, and solution space of, classification problems are formulated in terms of partitions. Algorithms for finding solutions are modelled as searching in a space of partitions under the refinement order relation. We focus on a particular type of solutions called conjunctively definable partitions. Two level-wise methods for decision tree construction are investigated, which are related to two different strategies: local optimization and global optimization. They are not in competition with, but are complementary to each other. Experimental results are reported to evaluate the two methods.


Author(s):  
Nina Narodytska ◽  
Alexey Ignatiev ◽  
Filipe Pereira ◽  
Joao Marques-Silva

Explanations of machine learning (ML) predictions are of fundamental importance in different settings. Moreover, explanations should be succinct, to enable easy understanding by humans. Decision trees represent an often used approach for developing explainable ML models, motivated by the natural mapping between decision tree paths and rules. Clearly, smaller trees correlate well with smaller rules, and so one challenge is to devise solutions for computing smallest-size decision trees given training data. Although simple to formulate, the computation of smallest-size decision trees turns out to be an extremely challenging computational problem, for which no practical solutions are known. This paper develops a SAT-based model for computing smallest-size decision trees given training data. In sharp contrast with past work, the proposed SAT model is shown to scale for publicly available datasets of practical interest.

