Many Are Better Than One: Improving Probabilistic Estimates from Decision Trees

THE USE OF VERSION SPACE CONTROLLED GENETIC ALGORITHMS TO SOLVE THE BOOLE PROBLEM

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213093000126 ◽

1993 ◽

Vol 02 (02) ◽

pp. 219-234 ◽

Cited By ~ 10

Author(s):

ROBERT G. REYNOLDS ◽

JONATHAN I. MALETIC

Keyword(s):

Genetic Algorithm ◽

Genetic Algorithms ◽

Decision Trees ◽

General Framework ◽

Performance History ◽

New Members ◽

Symbiotic Relationships ◽

Version Space ◽

History Of ◽

Better Than

The Version Space Controlled Genetic Algorithm (VGA) uses the structure of the version space to cache generalizations about the performance history of chromosomes in the genetic algorithm. This cached experience is used to constrain the generation of new members of the genetic algorithm's population. The VGA is shown to be a specific instantiation of a more general framework, Autonomous Learning Elements (ALE). The capabilities of the VGA system are demonstrated using the Boole problem suggested by Wilson [Wilson 1987]. The performance of the VGA is compared to that of decision trees and genetic algorithms. The results suggest that the VGA is able to exploit a certain set of symbiotic relationships between its components, so that the resulting system performs better than either component individually.


Weighted Bagging in Decision Trees: Data Mining

JINAV: Journal of Information and Visualization ◽

10.35877/454ri.jinav149 ◽

2020 ◽

Vol 1 (1) ◽

pp. 1-14

Author(s):

Yousef Elgimati

Keyword(s):

Decision Trees ◽

Learning Algorithm ◽

Majority Vote ◽

Random Noise ◽

Real Data ◽

Data Sets ◽

Bootstrap Sample ◽

Training Set ◽

Multiple Classifier ◽

Better Than

This paper focuses on the use of resampling techniques to construct predictive models from data, with the goal of identifying the model that produces the best predictions. Bagging, or bootstrap aggregating, is a general method for improving the performance of a given learning algorithm: a majority vote combines the outputs of multiple classifiers, each derived from a single base classifier trained on a bootstrap resample of the training set. A bootstrap sample is generated by random sampling with replacement from the original training set. Inspired by the idea of bagging, we present an improved method based on a distance function in decision trees, called modified bagging (or weighted bagging) in this study. The experimental results show that modified bagging is superior to the usual majority vote. These results are confirmed on both real data and artificial data sets with random noise. The modified bagged classifier performs significantly better than usual bagging at various tree levels for all sample sizes. An interesting observation is that weighted bagging performs somewhat better than usual bagging with stumps.
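The standard bagging procedure described in this abstract (bootstrap resampling plus a majority vote) can be sketched as follows. This is a minimal illustration only: the one-feature stump learner, the toy data, and all function names are assumptions introduced here, and the paper's distance-based weighted variant is not reproduced because the abstract does not specify it.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) items uniformly at random from data, with replacement."""
    return [rng.choice(data) for _ in data]


def fit_stump(data):
    """Fit a decision stump (single threshold on one feature) to (x, label) pairs.

    Hypothetical base learner used only for illustration.
    """
    best = None
    for threshold in sorted({x for x, _ in data}):
        left = [y for x, y in data if x <= threshold]
        right = [y for x, y in data if x > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = sum(y != left_label for y in left) + sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda x: left_label if x <= threshold else right_label


def bagged_predict(models, x):
    """Combine the individual classifier outputs with an unweighted majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


# Toy training set (made up for this sketch): class "a" below 0.5, class "b" above.
train = [(0.1, "a"), (0.2, "a"), (0.3, "a"), (0.4, "a"),
         (0.6, "b"), (0.7, "b"), (0.8, "b"), (0.9, "b")]

rng = random.Random(0)
models = [fit_stump(bootstrap_sample(train, rng)) for _ in range(25)]
print(bagged_predict(models, 0.05), bagged_predict(models, 0.95))
```

A weighted scheme would replace the unweighted `Counter` vote in `bagged_predict` with per-classifier weights; how the paper derives those weights from its distance function is not stated in the abstract.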


Estimation of capacity of eccentrically loaded single angle struts with decision trees

Challenge Journal of Structural Mechanics ◽

10.20528/cjsmec.2019.01.001 ◽

2019 ◽

Vol 5 (1) ◽

pp. 1

Author(s):

Saha Dauji

Keyword(s):

Decision Trees ◽

Decision Rules ◽

Practical Implementation ◽

Design Code ◽

Design Standards ◽

Analysis And Design ◽

Transmission Towers ◽

Compression Members ◽

Comparable Accuracy ◽

Better Than

Single angle struts are used as compression members in many structures, including roof trusses and transmission towers. The exact analysis and design of such members is challenging due to various uncertainties, such as the end fixity or the eccentricity of the applied loads. The design standards provide guidelines that have been found to be inaccurate on the conservative side. Artificial Neural Networks (ANN) have been observed to perform better than the design standards when trained with experimental data, and this has been reported in the literature. However, practical implementation of ANN poses a problem, as the trained network as well as the know-how regarding its application must be accessible to practitioners. With another data-driven tool, Decision Trees (DT), practical application is easier because decision-based rules are generated, which are readily comprehended and implemented by designers. Hence, in this paper, DT was explored for the evaluation of the capacity of eccentrically loaded single angle struts. It was found to be robust, yielding accuracy comparable to ANN and better than the design code (AISC). This has enormous potential for easy and straightforward implementation by practicing engineers through the logic-based decision rules, which are easily programmable on a computer. For this application, using dimensionless ratios as inputs for the development of DT was found to yield better results than using the original variables as inputs.


Two-Stage Constructing Hyper-Plane for Each Test Node of Decision Tree

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.26-28.776 ◽

2010 ◽

Vol 26-28 ◽

pp. 776-779

Author(s):

Wei She ◽

Hong Li ◽

Guo Qing Yu ◽

Rui Deng

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Two Stage ◽

Combination Methods ◽

Hyper Plane ◽

Two Stages ◽

The Impact ◽

Better Than

How to construct the “appropriate” split hyper-plane in test nodes is the key to building decision trees. Unlike a univariate decision tree, a multivariate (oblique) decision tree can find a hyper-plane that is not orthogonal to the feature axes. In this paper, we re-explain the process of building test nodes in terms of geometry. Based on this, we propose a method of learning the hyper-plane in two stages. The tree induced in this way (TSDT) keeps the interpretability of univariate decision trees together with the ability of multivariate decision trees to find oblique hyper-planes. Tests of the impact of combination methods show that the TSDT-based combination algorithm is much better in accuracy than other tree-based combination methods.
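The difference between a univariate (axis-parallel) test and an oblique test at a tree node can be illustrated with the short sketch below. The function names, weights, and sample points are assumptions chosen for illustration; the sketch shows only the two kinds of node test, not the paper's two-stage procedure for learning the hyper-plane.

```python
def univariate_test(x, feature, threshold):
    """Axis-parallel test: compare a single feature against a threshold."""
    return x[feature] <= threshold


def oblique_test(x, weights, bias):
    """Oblique test: check which side of the hyper-plane w . x + b = 0 the point lies on."""
    return sum(w * v for w, v in zip(weights, x)) + bias <= 0


# Hypothetical points separated by the line x0 + x1 = 1.
below = (0.2, 0.3)
above = (0.8, 0.9)

print(oblique_test(below, (1.0, 1.0), -1.0))  # point on the lower side of the line
print(oblique_test(above, (1.0, 1.0), -1.0))  # point on the upper side of the line
print(univariate_test(below, 0, 0.5))         # axis-parallel test on feature 0 only
```

A univariate test is the special case of an oblique test in which exactly one weight is non-zero, which is why a single oblique node can replace a staircase of axis-parallel splits along a diagonal boundary.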


Machine Learning Approaches for Auto Insurance Big Data

Risks ◽

10.3390/risks9020042 ◽

2021 ◽

Vol 9 (2) ◽

pp. 42 ◽

Cited By ~ 1

Author(s):

Mohamed Hanafy ◽

Ruixing Ming

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Big Data ◽

Random Forest ◽

Decision Trees ◽

Customer Service ◽

Learning Approaches ◽

Auto Insurance ◽

New Methods ◽

Better Than

The growing trend in the number and severity of auto insurance claims creates a need for new methods to handle these claims efficiently. Machine learning (ML) is one of the methods that addresses this problem. As car insurers aim to improve their customer service, they have started adopting and applying ML to enhance the interpretation and comprehension of their data, improving customer service through a better understanding of customers' needs. This study considers how automotive insurance providers incorporate machine learning in their companies, and explores how ML models can be applied to insurance big data. We utilize various ML methods, such as logistic regression, XGBoost, random forest (RF), decision trees, naïve Bayes, and K-NN, to predict claim occurrence. Furthermore, we evaluate and compare the performance of these models. The results showed that RF outperforms the other methods, with accuracy, kappa, and AUC values of 0.8677, 0.7117, and 0.840, respectively.


Decision trees work better than feed-forward back-prop neural nets for a specific class of problems

2004 IEEE International Conference on Systems, Man and Cybernetics (IEEE Cat. No.04CH37583) ◽

10.1109/icsmc.2004.1401150 ◽

2005 ◽

Cited By ~ 2

Author(s):

Xiaomei Liu ◽

K.W. Bowyer ◽

L.O. Hall

Keyword(s):

Decision Trees ◽

Neural Nets ◽

Specific Class ◽

Feed Forward ◽

Better Than


Decision Rules Derived from Optimal Decision Trees with Hypotheses

Entropy ◽

10.3390/e23121641 ◽

2021 ◽

Vol 23 (12) ◽

pp. 1641

Author(s):

Mohammad Azad ◽

Igor Chikalov ◽

Shahid Hussain ◽

Mikhail Moshkov ◽

Beata Zielosko

Keyword(s):

Decision Trees ◽

Decision Rules ◽

Computer Experiments ◽

Optimal Decision ◽

Equivalence Queries ◽

Minimum Number ◽

Minimum Depth ◽

Decision Tables ◽

Programming Algorithms ◽

Better Than

Conventional decision trees use queries each of which is based on one attribute. In this study, we also examine decision trees that handle additional queries based on hypotheses. This kind of query is similar to the equivalence queries considered in exact learning. Earlier, we designed dynamic programming algorithms for the computation of the minimum depth and the minimum number of internal nodes in decision trees that have hypotheses. Modification of these algorithms considered in the present paper permits us to build decision trees with hypotheses that are optimal relative to the depth or relative to the number of the internal nodes. We compare the length and coverage of decision rules extracted from optimal decision trees with hypotheses and decision rules extracted from optimal conventional decision trees to choose the ones that are preferable as a tool for the representation of information. To this end, we conduct computer experiments on various decision tables from the UCI Machine Learning Repository. In addition, we also consider decision tables for randomly generated Boolean functions. The collected results show that the decision rules derived from decision trees with hypotheses in many cases are better than the rules extracted from conventional decision trees.
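Extracting decision rules from a conventional decision tree, and measuring rule length as the number of conditions, can be sketched as below. The nested-dictionary tree format, the attribute names, and the function name are assumptions made for this illustration; queries based on hypotheses and the paper's coverage measure are not modelled.

```python
def extract_rules(node, conditions=()):
    """Return one (conditions, decision) rule per root-to-leaf path of the tree.

    A leaf is any non-dict value; an internal node is a dict with the
    hypothetical keys "attribute" and "children".
    """
    if not isinstance(node, dict):
        return [(conditions, node)]
    rules = []
    for value, child in node["children"].items():
        step = (node["attribute"], value)
        rules.extend(extract_rules(child, conditions + (step,)))
    return rules


# Illustrative tree computing the Boolean function x1 AND x2.
tree = {
    "attribute": "x1",
    "children": {
        0: 0,
        1: {"attribute": "x2", "children": {0: 0, 1: 1}},
    },
}

rules = extract_rules(tree)
for conditions, decision in rules:
    text = " and ".join(f"{attr} = {val}" for attr, val in conditions)
    print(f"if {text} then {decision}  (length {len(conditions)})")
```

Each rule's length equals the depth of the leaf it ends in, so a tree of smaller depth bounds the length of every rule extracted from it.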


Performance of Various Machine Learning Classifiers on Small Datasets with Varying Dimensionalities: A Study

Circulation in Computer Science ◽

10.22632/ccs-2016-251-23 ◽

2016 ◽

Vol 1 (1) ◽

pp. 30-35 ◽

Cited By ~ 3

Author(s):

Sahil Sharma ◽

Vinod Sharma

Keyword(s):

Machine Learning ◽

Decision Trees ◽

Supervised Learning ◽

Predictive Accuracy ◽

Ensemble Method ◽

Reduced Dimensionality ◽

Linear Discriminant ◽

Machine Learning Classifiers ◽

Learning Technique ◽

Better Than

Classification is an important supervised learning technique used by many applications. An important factor in a classifier's performance is the size of the dataset on which it is trained. In this manuscript, the authors analyzed five classification techniques (decision trees, KNN, SVM, linear discriminant, and an ensemble method) in terms of AUC and predictive accuracy when trained on small datasets with different dimensionalities. The study was done using a dataset with 24 features and 400 instances (samples). The results showed that, in general, the ensemble method (using boosted trees) performed better than the others, but its performance degraded slightly with reduced dimensionality.


Applications of Data Mining Algorithm in Equipment Fault Diagnosis

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.644-650.2551 ◽

2014 ◽

Vol 644-650 ◽

pp. 2551-2555

Author(s):

Rong Xiang Li ◽

Zeng Lei Zhang ◽

Yun Liu ◽

Shan Chao Tu

Keyword(s):

Data Mining ◽

Fault Diagnosis ◽

Decision Tree ◽

Decision Trees ◽

Data Mining Algorithm ◽

Basic Principles ◽

Mining Algorithm ◽

Id3 Algorithm ◽

Better Than ◽

Improved Algorithm

The basic principles of the ID3 decision-tree algorithm for data mining are presented, and its main deficiencies are analysed. An improved algorithm based on ID3 is proposed. Taking engine fault diagnosis as an example, the traditional ID3 algorithm and the improved algorithm are applied separately to diagnose engine faults, and decision trees are constructed with each. The experimental results show that the improved algorithm is more accurate than traditional ID3 and is better suited to equipment fault diagnosis.
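The attribute-selection step of traditional ID3, which picks the attribute with the highest information gain, can be sketched as follows. The fault records, attribute names, and function names are made up for illustration, and the paper's improved algorithm is not shown because the abstract does not describe it.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Entropy of the target minus its weighted entropy after splitting on attribute."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def best_attribute(rows, attributes, target):
    """ID3 splits on the attribute with the largest information gain."""
    return max(attributes, key=lambda a: information_gain(rows, a, target))


# Hypothetical engine records; not data from the paper.
rows = [
    {"vibration": "high", "temperature": "hot", "fault": "yes"},
    {"vibration": "high", "temperature": "normal", "fault": "yes"},
    {"vibration": "low", "temperature": "hot", "fault": "no"},
    {"vibration": "low", "temperature": "normal", "fault": "no"},
]

print(best_attribute(rows, ["vibration", "temperature"], "fault"))
```

In this toy table, `vibration` separates the two classes perfectly while `temperature` carries no information about the fault, so ID3 would split on `vibration` first.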


Time and Latitude in South Africa

International Astronomical Union Colloquium ◽

10.1017/s0252921100026816 ◽

1972 ◽

Vol 1 ◽

pp. 27-38

Author(s):

J. Hers

Keyword(s):

South Africa ◽

Cape Town ◽

Astronomical Observations ◽

Royal Observatory ◽

The Republic ◽

Astronomical Determination ◽

Limited Accuracy ◽

Better Than

In South Africa the modern outlook towards time may be said to have started in 1948. Both major observatories, the Royal Observatory in Cape Town and the Union Observatory (now known as the Republic Observatory) in Johannesburg, had, of course, been involved in the astronomical determination of time almost from their inception, and the Johannesburg Observatory has been responsible for the official time of South Africa since 1908. However, the pendulum clocks then in use could not be relied on to provide an accuracy better than about 1/10 second, which was of the same order as that of the astronomical observations. It is doubtful if much use was made of even this limited accuracy outside the two observatories, and although there may occasionally have been a demand for more accurate time, it was certainly not voiced.
