A System for Induction of Oblique Decision Trees

1994 ◽  
Vol 2 ◽  
pp. 1-32 ◽  
Author(s):  
S. K. Murthy ◽  
S. Kasif ◽  
S. Salzberg

This article describes a new system for induction of oblique decision trees. This system, OC1, combines deterministic hill-climbing with two forms of randomization to find a good oblique split (in the form of a hyperplane) at each node of a decision tree. Oblique decision tree methods are tuned especially for domains in which the attributes are numeric, although they can be adapted to symbolic or mixed symbolic/numeric attributes. We present extensive empirical studies, using both real and artificial data, that analyze OC1's ability to construct oblique trees that are smaller and more accurate than their axis-parallel counterparts. We also examine the benefits of randomization for the construction of oblique decision trees.
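The combination of hill-climbing and randomization described above can be sketched in a few lines. The toy dataset, the Gini-style impurity score, and the perturbation schedule below are illustrative assumptions, not OC1's actual implementation:

```python
import random

# Hypothetical 2-D points (features, class label); only an oblique
# boundary (roughly x0 > x1) separates the two classes here.
DATA = [((1.0, 2.0), 0), ((2.0, 3.0), 0), ((3.0, 4.0), 0),
        ((2.0, 1.0), 1), ((3.0, 2.0), 1), ((4.0, 3.0), 1)]

def above(w, x):
    # Which side of the hyperplane w[0]*x0 + w[1]*x1 + w[2] = 0 is x on?
    return w[0] * x[0] + w[1] * x[1] + w[2] > 0

def impurity(w):
    # Size-weighted Gini impurity of the two half-spaces induced by w.
    total = 0.0
    for wanted in (True, False):
        labels = [y for x, y in DATA if above(w, x) == wanted]
        if labels:
            p = labels.count(1) / len(labels)
            total += 2 * p * (1 - p) * len(labels) / len(DATA)
    return total

def find_split(restarts=10, steps=100, seed=0):
    # Hill-climb one coefficient at a time; random restarts and random
    # perturbations play the role of OC1's randomization, helping the
    # deterministic search escape local minima.
    rng = random.Random(seed)
    best_w, best_imp = None, float("inf")
    for _ in range(restarts):
        w = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        for _ in range(steps):
            candidate = list(w)
            candidate[rng.randrange(3)] += rng.uniform(-0.5, 0.5)
            if impurity(candidate) <= impurity(w):
                w = candidate
        if impurity(w) < best_imp:
            best_w, best_imp = w, impurity(w)
    return best_w, best_imp
```

On this toy data the hyperplane x0 - x1 = 0 yields zero impurity, so the randomized search typically finds a near-perfect split that no single axis-parallel test can match.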

2017 ◽  
Vol 9 (4) ◽  
pp. 16-36 ◽  
Author(s):  
Souad Taleb Zouggar ◽  
Abdelkader Adla

To compute the quality of a partition in a decision tree, we propose a new measure called NIM (“New Information Measure”). The measure is simpler than, performs comparably to, and sometimes outperforms the existing measures used with tree-based methods. The experimental results, using the MONITDIAB application (Taleb & Atmani, 2013) and datasets from the UCI repository (Asuncion & Newman, 2007), confirm the classification capabilities of our proposal in comparison to the Shannon measure used by the ID3 and C4.5 decision tree methods.
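The abstract does not give NIM's formula, so as a point of reference, here is the Shannon measure it is compared against, as used by ID3/C4.5 to score a candidate partition (the example labels are hypothetical):

```python
import math
from collections import Counter

def entropy(labels):
    # Shannon entropy (in bits) of a list of class labels.
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, partition):
    # ID3-style gain: parent entropy minus the size-weighted entropy
    # of the sublists produced by a candidate split.
    n = len(labels)
    weighted = sum(len(part) / n * entropy(part) for part in partition)
    return entropy(labels) - weighted
```

Splitting [0, 0, 1, 1] into pure halves gains the full bit of information; splitting it into two mixed halves gains nothing.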


Author(s):  
Marek Kretowski ◽  
Marek Grzes

Decision trees are, besides decision rules, one of the most popular forms of knowledge representation in the Knowledge Discovery in Databases process (Fayyad, Piatetsky-Shapiro, Smyth & Uthurusamy, 1996), and implementations of the classical decision tree induction algorithms are included in the majority of data mining systems. The hierarchical structure of a tree-based classifier, where appropriate tests from consecutive nodes are applied in sequence, closely resembles a human way of decision making. This makes decision trees natural and easy to understand, even for an inexperienced analyst. The popularity of the decision tree approach can also be explained by its ease of application, fast classification and, what may be most important, its effectiveness.

Two main types of decision trees can be distinguished by the type of tests in non-terminal nodes: univariate and multivariate decision trees. In the first group, a single attribute is used in each test. For a continuous-valued feature, an inequality test with binary outcomes is usually applied, and for a nominal attribute, mutually exclusive groups of attribute values are associated with the outcomes. As a good representative of univariate inducers, the well-known C4.5 system developed by Quinlan (1993) should be mentioned. In univariate trees, a split is equivalent to partitioning the feature space with an axis-parallel hyper-plane. If the decision boundaries of a particular dataset are not axis-parallel, using such tests may lead to an overcomplicated classifier. This situation is known as the “staircase effect”. The problem can be mitigated by applying more sophisticated multivariate tests, where more than one feature can be taken into account. The most common form of such a test is an oblique split, which is based on a linear combination of features (a hyper-plane).
A decision tree that applies only oblique tests is often called oblique or linear, whereas heterogeneous trees with univariate, linear and other multivariate (e.g., instance-based) tests can be called mixed decision trees (Llora & Wilson, 2004). It should be emphasized that the computational complexity of multivariate induction is generally significantly higher than that of univariate induction. CART (Breiman, Friedman, Olshen & Stone, 1984) and OC1 (Murthy, Kasif & Salzberg, 1994) are well-known examples of multivariate systems.
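The “staircase effect” is easy to see on a toy diagonal boundary. The four points below are a hypothetical example: no single axis-parallel threshold classifies them perfectly, while one oblique test does:

```python
# Hypothetical points whose class boundary is diagonal (label = x0 > x1).
POINTS = [((1, 0), 1), ((2, 1), 1), ((0, 1), 0), ((1, 2), 0)]

def oblique_test(x, weights, threshold):
    # Multivariate (oblique) split: linear combination of all attributes.
    return sum(w * xi for w, xi in zip(weights, x)) > threshold

def best_univariate_errors():
    # Best achievable error of any single test x[f] > t, trying every
    # distinct threshold on each feature in both directions.
    best = len(POINTS)
    for f in (0, 1):
        for t in {p[f] for p, _ in POINTS} | {-1}:
            e = sum((x[f] > t) != bool(y) for x, y in POINTS)
            best = min(best, e, len(POINTS) - e)
    return best

# One oblique test, x0 - x1 > 0, separates the classes perfectly; a
# purely univariate tree would need a staircase of axis-parallel splits.
oblique_ok = all(oblique_test(x, (1, -1), 0) == bool(y) for x, y in POINTS)
```

Here the best univariate test still misclassifies one point, which illustrates why oblique splits can yield smaller trees on such data.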


2016 ◽  
Vol 46 (4) ◽  
pp. 2924-2934 ◽  
Author(s):  
Muhammad Azam ◽  
Muhammad Aslam ◽  
Khushnoor Khan ◽  
Anwar Mughal ◽  
Awais Inayat

Author(s):  
Faiza Charfi ◽  
Ali Kraiem

A new automated approach for Electrocardiogram (ECG) arrhythmia characterization and classification, combining the wavelet transform with decision tree classification, is presented. The approach is based on two key steps. In the first step, the authors use the wavelet transform to extract the ECG signals' wavelet coefficients as initial features, and apply a combination of Principal Component Analysis (PCA) and Fast Independent Component Analysis (FastICA) to transform them into uncorrelated and mutually independent new features. In the second step, they apply several decision tree methods currently in use: C4.5, Improved C4.5, CHAID (Chi-Square Automatic Interaction Detection) and Improved CHAID, to classify ECG signals taken from the MIT-BIH database, including normal subjects and subjects affected by arrhythmia. The authors' results suggest the high reliability and high classification accuracy of the C4.5 algorithm with bootstrap aggregation.
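As a sketch of the feature-extraction step, one level of a discrete wavelet transform splits a signal into approximation (trend) and detail coefficients. The Haar wavelet used below is the simplest choice and is an assumption on our part; the abstract does not name the mother wavelet the authors used:

```python
import math

def haar_dwt(signal):
    # One level of the Haar DWT: pairwise normalized sums give the
    # approximation (low-frequency) coefficients, pairwise normalized
    # differences give the detail (high-frequency) coefficients.
    s = 1 / math.sqrt(2)
    approx = [(a + b) * s for a, b in zip(signal[::2], signal[1::2])]
    detail = [(a - b) * s for a, b in zip(signal[::2], signal[1::2])]
    return approx, detail
```

In a pipeline like the one described, coefficients from several decomposition levels would be concatenated into a feature vector and then passed to PCA/FastICA for decorrelation before tree induction.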

