Evolutionary Algorithm for Improving Decision Tree with Global Discretization in Manufacturing

Sungbum Jun

doi:10.3390/s21082849

Sensors ◽

10.3390/s21082849 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2849

Author(s):

Sungbum Jun

Keyword(s):

Decision Tree ◽

Evolutionary Algorithm ◽

Decision Trees ◽

Manufacturing Systems ◽

Ensemble Methods ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Industrial Internet ◽

Tree Models ◽

Real World Datasets

Owing to recent advances in the industrial Internet of Things (IoT) in manufacturing, the vast amount of data from sensors has triggered the need to leverage such big data for fault detection. In particular, interpretable machine learning techniques, such as tree-based algorithms, have drawn attention because of the need to implement reliable manufacturing systems and identify the root causes of faults. However, despite the high interpretability of decision trees, tree-based models make a trade-off between accuracy and interpretability. In order to improve the tree’s performance while maintaining its interpretability, an evolutionary algorithm for discretization of multiple attributes, called Decision tree Improved by Multiple sPLits with Evolutionary algorithm for Discretization (DIMPLED), is proposed. The experimental results with two real-world datasets from sensors showed that the decision tree improved by DIMPLED outperformed the single-decision-tree models (C4.5 and CART) that are widely used in practice, and it proved competitive with the ensemble methods, which have multiple decision trees. Even though the ensemble methods could produce slightly better performance, the proposed DIMPLED has a more interpretable structure while maintaining an appropriate performance level.
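
As a rough illustration of the idea of searching for discretization cut points with an evolutionary algorithm, the sketch below runs a simple (1+1) evolutionary search over the cut points of a single attribute. It is not the DIMPLED algorithm from the paper: the function names, the Gaussian mutation, the single-attribute setting, and the majority-class-per-bin fitness (a stand-in for the accuracy of a tree built on the discretized data) are all assumptions made for this example.

```python
import bisect
import random
from collections import Counter, defaultdict


def discretize(value, cuts):
    """Map a continuous value to a bin index, given sorted cut points."""
    return bisect.bisect_right(cuts, value)


def fitness(cuts, xs, ys):
    """Training accuracy of a majority-class-per-bin rule.

    Hypothetical stand-in for the accuracy of a decision tree
    trained on the discretized attribute.
    """
    bins = defaultdict(Counter)
    for x, y in zip(xs, ys):
        bins[discretize(x, cuts)][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in bins.values())
    return correct / len(ys)


def evolve_cuts(xs, ys, n_cuts=2, generations=200, seed=0):
    """(1+1) evolutionary search: mutate the cut points, keep the child if not worse."""
    rng = random.Random(seed)
    lo, hi = min(xs), max(xs)
    best = sorted(rng.uniform(lo, hi) for _ in range(n_cuts))
    best_fit = fitness(best, xs, ys)
    for _ in range(generations):
        # Gaussian mutation of every cut point, clipped to the attribute range.
        child = sorted(
            min(hi, max(lo, c + rng.gauss(0, 0.1 * (hi - lo)))) for c in best
        )
        child_fit = fitness(child, xs, ys)
        if child_fit >= best_fit:
            best, best_fit = child, child_fit
    return best, best_fit


if __name__ == "__main__":
    # Toy sensor readings: the "fault" class (1) sits in the middle of the range.
    xs = [1, 2, 3, 4, 5, 6]
    ys = [0, 0, 1, 1, 0, 0]
    cuts, score = evolve_cuts(xs, ys)
    print(cuts, score)
```

With two well-placed cut points (for example 2.5 and 4.5) the toy attribute separates the classes perfectly, which a single binary threshold cannot do; this is the kind of multi-way split that discretization makes available to a tree.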

A Practical Tutorial for Decision Tree Induction

ACM Computing Surveys ◽

10.1145/3429739 ◽

2021 ◽

Vol 54 (1) ◽

pp. 1-38

Author(s):

Víctor Adrián Sosa Hernández ◽

Raúl Monroy ◽

Miguel Angel Medina-Pérez ◽

Octavio Loyola-González ◽

Francisco Herrera

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Machine Learning Techniques ◽

Evaluation Measures ◽

Decision Tree Induction ◽

Learning Techniques ◽

Tree Models ◽

Evaluation Measure ◽

Main Components ◽

Support Decision Making

Experts from different domains have resorted to machine learning techniques to produce explainable models that support decision-making. Among existing techniques, decision trees have been useful in many application domains for classification. Decision trees can make decisions in a language that is closer to that of the experts. Many researchers have attempted to create better decision tree models by improving the components of the induction algorithm. One of the main components that has been studied and improved is the evaluation measure for candidate splits. In this article, we introduce a tutorial that explains decision tree induction. Then, we present an experimental framework to assess the performance of 21 evaluation measures that produce different C4.5 variants, considering 110 databases, two performance measures, and 10 × 10-fold cross-validation. Furthermore, we compare and rank the evaluation measures by using a Bayesian statistical analysis. From our experimental results, we present the first two performance rankings in the literature of C4.5 variants. Moreover, we organize the evaluation measures into two groups according to their performance. Finally, we introduce meta-models that automatically determine the group of evaluation measures to produce a C4.5 variant for a new database, and we outline some further opportunities for decision tree models.
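
To make "evaluation measure for candidate splits" concrete, the sketch below scores a binary threshold split with two well-known measures: information gain and the gain ratio used by standard C4.5. It is a minimal illustration only; the function names are chosen for this example, and it does not reproduce the 21 measures or the experimental framework of the article.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def split_scores(values, labels, threshold):
    """Return (information gain, gain ratio) for the split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        # A split that sends every example to one side carries no information.
        return 0.0, 0.0
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    gain = entropy(labels) - remainder
    # Split information penalizes unbalanced partitions.
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return gain, gain / split_info


if __name__ == "__main__":
    values = [1, 2, 3, 4]
    labels = ["a", "a", "b", "b"]
    for t in (1, 2, 3):
        print(t, split_scores(values, labels, t))
```

An induction algorithm evaluates such a score for every candidate split and keeps the best one; swapping the scoring function is what produces the different C4.5 variants the abstract refers to.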

Fighting Under-price DoS Attack in Ethereum with Machine Learning Techniques

ACM SIGMETRICS Performance Evaluation Review ◽

10.1145/3466826.3466835 ◽

2021 ◽

Vol 48 (4) ◽

pp. 24-27

Author(s):

Jose Eduardo A. Sousa ◽

Vinicius C. Oliveira ◽

Julia Almeida Valadares ◽

Alex Borges Vieira ◽

Heder S. Bernardino ◽

...

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Denial Of Service ◽

Ensemble Methods ◽

Machine Learning Techniques ◽

Security Threats ◽

Network Behavior ◽

Dos Attack ◽

Learning Techniques ◽

Tree Models

Ethereum is currently one of the most popular cryptocurrencies, and it has been facing security threats and attacks. As a consequence, Ethereum users may experience long waits for their transactions to be validated. Despite maintenance of the Ethereum mechanisms, there are still indications that it remains susceptible to a range of attacks. In this work, we analyze the Ethereum network behavior during an under-priced DoS attack, in which malicious users try to perform denial-of-service attacks that exploit flaws in the fee mechanism of this cryptocurrency. We propose the application of machine learning techniques and ensemble methods to detect this attack, using the available transaction attributes. The proposals achieve notable performance; the Decision Tree models, for example, reach AUC-ROC, F-score, and recall larger than 0.94, 0.82, and 0.98, respectively.
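
For readers unfamiliar with the three metrics quoted above, the following sketch computes recall, F-score (F1), and AUC-ROC from binary labels in plain Python. The data in the usage example are invented for illustration and have nothing to do with the Ethereum experiments; the AUC is computed with the pairwise-ranking definition, which is equivalent to the area under the ROC curve.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = attack, 0 = normal)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


def auc_roc(y_true, scores):
    """AUC-ROC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Invented toy labels and classifier scores.
    y_true = [1, 1, 0, 0]
    scores = [0.9, 0.4, 0.6, 0.1]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    print(precision_recall_f1(y_true, y_pred), auc_roc(y_true, scores))
```

High recall with a lower F-score, as in the reported figures, indicates that nearly all attacks are caught at the cost of some false alarms.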

Machine Learning Techniques Applied to Profile Mobile Banking Users in India

International Journal of Information Systems in the Service Sector ◽

10.4018/jisss.2013010105 ◽

2013 ◽

Vol 5 (1) ◽

pp. 82-92 ◽

Cited By ~ 8

Author(s):

M. Carr ◽

V. Ravi ◽

G. Sridharan Reddy ◽

D. Veranna

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Decision Tree ◽

Decision Trees ◽

Multilayer Perceptron ◽

Machine Learning Techniques ◽

Mobile Banking ◽

Classification Rules ◽

Learning Techniques ◽

Potential Customers

This paper profiles mobile banking users using machine learning techniques, viz. Decision Tree, Logistic Regression, Multilayer Perceptron, and SVM, to test a research model with fourteen independent variables and a dependent variable (adoption). A survey was conducted and the results were analysed using these techniques. Using Decision Trees, the profile of the mobile banking adopter was identified. Comparing the different machine learning techniques, it was found that Decision Trees outperformed Logistic Regression, Multilayer Perceptron, and SVM. Of all the techniques, Decision Tree is recommended for profiling studies because, apart from producing highly accurate results, it also yields ‘if–then’ classification rules. The classification rules provided here can be used to target potential customers to adopt mobile banking by offering them appropriate incentives.

Big Data Analytics Processes in Industrial Internet of Things Systems: Sensing and Computing Technologies, Machine Learning Techniques, and Autonomous Decision-Making Algorithms

Journal of Self-Governance and Management Economics ◽

10.22381/jsme7420194 ◽

2019 ◽

Vol 7 (4) ◽

pp. 28 ◽

Cited By ~ 4

Keyword(s):

Machine Learning ◽

Decision Making ◽

Big Data ◽

Data Analytics ◽

Big Data Analytics ◽

Machine Learning Techniques ◽

Industrial Internet Of Things ◽

Autonomous Decision ◽

Learning Techniques ◽

Industrial Internet

Application of Ensemble Models in Credit Scoring Models

Business Perspectives and Research ◽

10.1177/2278533718765531 ◽

2018 ◽

Vol 6 (2) ◽

pp. 129-141

Author(s):

Anjali Chopra ◽

Priyanka Bhilare

Keyword(s):

Decision Tree ◽

Empirical Analysis ◽

Credit Scoring ◽

Bank Loan ◽

Machine Learning Techniques ◽

Risk Scores ◽

Gradient Boosting ◽

Loan Default ◽

Linear Discriminant ◽

Learning Techniques

Loan default is a serious problem in the banking industry. Banking systems have strong processes in place for identifying customers with poor credit risk scores; however, most credit scoring models need to be constantly updated with newer variables and statistical techniques for improved accuracy. While totally eliminating default is almost impossible, loan risk teams can minimize the rate of default, thereby protecting banks from the adverse effects of loan default. Credit scoring models have used logistic regression and linear discriminant analysis for identification of potential defaulters. Newer, contemporary machine learning techniques have the ability to outperform these classic, long-established techniques. This article aims to conduct an empirical analysis on a publicly available bank loan dataset to study banking loan default, using a decision tree as the base learner and comparing it with ensemble tree learning techniques such as bagging, boosting, and random forests. The results of the empirical analysis suggest that the gradient boosting model outperforms the base decision tree learner, indicating that an ensemble model works better than individual models. The study recommends that risk teams adopt newer contemporary techniques to achieve better accuracy, resulting in effective loan recovery strategies.

Improved argumentative paragraphs detection in academic theses supported with unit segmentation

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-219237 ◽

2021 ◽

pp. 1-11

Author(s):

Jesús Miguel García-Gorrostieta ◽

Aurelio López-López ◽

Samuel González-López ◽

Adrián Pastor López-Monroy

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Automatic Detection ◽

Machine Learning Techniques ◽

Svm Classifier ◽

Complex Task ◽

Decision Tree Classifier ◽

Learning Techniques ◽

Tree Classifier ◽

Academic Author

Academic thesis writing is a complex task that requires the author to be skilled in argumentation. The goal of the academic author is to communicate clear ideas and to convince the reader of the presented claims. However, few students are good arguers, and this is a skill that takes time to master. In this paper, we present an exploration of lexical features used to model automatic detection of argumentative paragraphs using machine learning techniques. We present a novel proposal, which combines the information in the complete paragraph with the detection of argumentative segments in order to achieve improved results for the detection of argumentative paragraphs. We propose two approaches: a more descriptive one, which uses the decision tree classifier with indicators and lexical features; and a more efficient one, which uses an SVM classifier with lexical features and a Document Occurrence Representation (DOR). Both approaches consider the detection of argumentative segments to ensure that a paragraph detected as argumentative does indeed contain segments with argumentation. We achieved encouraging results for both approaches.

Analysis of Kinase Inhibitors and Druggability of Kinase-Targets Using Machine Learning Techniques

Pattern Discovery Using Sequence Data Mining ◽

10.4018/978-1-61350-056-9.ch009 ◽

2012 ◽

pp. 155-165

Author(s):

S. Prasanthi ◽

S. Durga Bhavani ◽

T. Sobha Rani ◽

Raju S. Bapi

Keyword(s):

Machine Learning ◽

Decision Tree ◽

Kinase Inhibitors ◽

Kinase Inhibitor ◽

Classification Problem ◽

Machine Learning Techniques ◽

Learning Approaches ◽

Decision Tree Classifier ◽

Data Set ◽

Learning Techniques

The vast majority of successful drugs or inhibitors achieve their activity by binding to, and modifying the activity of, a protein, leading to the concept of druggability. A target protein is druggable if it has the potential to bind drug-like molecules. Hence kinase inhibitors need to be studied to understand the specificity of a kinase inhibitor in choosing a particular kinase target. In this paper we focus on human kinase drug target sequences, since kinases are known to be potential drug targets. We also carry out a preliminary analysis of kinase inhibitors in order to study the problem in the protein-ligand space in the future. The identification of druggable kinases is treated as a classification problem in which druggable kinases are taken as the positive data set and non-druggable kinases are chosen as the negative data set. The classification problem is addressed using machine learning techniques such as support vector machine (SVM) and decision tree (DT), together with sequence-specific features. One of the challenges of this classification problem is the unbalanced data, with only 48 druggable kinases available against 509 non-druggable kinases present at Uniprot. The accuracy obtained by the decision tree classifier is 57.65, which is not satisfactory. A two-tier architecture of decision trees is carefully designed such that recognition on the non-druggable dataset also improves. The overall model is thus shown to achieve a final performance accuracy of 88.37. To the best of our knowledge, kinase druggability prediction using machine learning approaches has not been reported in the literature.

Providing Security and Managing Quality Through Machine Learning Techniques for an Image Processing Model in the Industrial Internet of Things

Smart IoT for Research and Industry - EAI/Springer Innovations in Communication and Computing ◽

10.1007/978-3-030-71485-7_10 ◽

2021 ◽

pp. 161-177

Author(s):

B. Vineetha ◽

R. B. Madhumala

Keyword(s):

Machine Learning ◽

Image Processing ◽

Internet Of Things ◽

Machine Learning Techniques ◽

Industrial Internet Of Things ◽

Learning Techniques ◽

Industrial Internet

Optimization of Evolutionary Algorithm Using Machine Learning Techniques for Pattern Mining in Transactional Database

Handbook of Research on Applications and Implementations of Machine Learning Techniques - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-5225-9902-9.ch010 ◽

2020 ◽

pp. 173-200

Author(s):

Logeswaran K. ◽

Suresh P. ◽

Savitha S. ◽

Prasanna Kumar K. R.

Keyword(s):

Evolutionary Algorithm ◽

Pattern Mining ◽

Fitness Function ◽

Search Space ◽

Machine Learning Techniques ◽

Dynamic Selection ◽

Learning Techniques ◽

Optimal Function ◽

High Utility ◽

Mining Algorithms

In recent years, data analysts have faced many challenges in mining high utility itemsets (HUIs) from a given transactional database using existing traditional techniques. The challenges in utility mining algorithms are the exponentially growing search space and the choice of a minimum utility threshold appropriate to the given database. To overcome these challenges, evolutionary algorithm-based techniques can be used to mine HUIs from a transactional database. However, testing each of the supporting functions in the optimization problem is very inefficient, and it increases the time complexity of the algorithm. To overcome this drawback, a reinforcement learning-based approach is proposed for improving the efficiency of the algorithm, so that the most appropriate fitness function for evaluation can be selected automatically during execution of the algorithm. Furthermore, during the optimization process, when distinct functions are skillful, dynamic selection of the current optimal function is performed.

Automobile Gearbox Fault Diagnosis Using Naive Bayes and Decision Tree Algorithm

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.813-814.943 ◽

2015 ◽

Vol 813-814 ◽

pp. 943-948 ◽

Cited By ~ 2

Author(s):

P.G. Sreenath ◽

Gopalakrishnan Praveen Kumare ◽

Sundar Pravin ◽

K.N. Vikram ◽

M. Saimurugan

Keyword(s):

Fault Diagnosis ◽

Decision Tree ◽

Vital Role ◽

Condition Based Maintenance ◽

Vibration Monitoring ◽

Machine Learning Techniques ◽

Decision Tree Algorithm ◽

Vibration Signals ◽

Learning Techniques ◽

Technique Comparison

Gearboxes play a vital role in various industrial fields. Failure of any component in a gearbox leads to machine downtime. Vibration monitoring is the technique used for condition-based maintenance of gearboxes. This paper discusses the use of machine learning techniques for automating the fault diagnosis of an automobile gearbox. Our experimental study monitors the vibration signals of an actual automobile gearbox with simulated fault conditions in the gear and bearing. Statistical features are extracted and classified to identify the faults using decision tree and Naïve Bayes techniques. A comparison of the classification accuracy of the two techniques is discussed.
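
The abstract does not list which statistical features were extracted, so the sketch below assumes a typical set used in vibration-based condition monitoring (mean, standard deviation, RMS, crest factor, and kurtosis). The function name and the feature selection are assumptions for illustration; the resulting feature dictionary is the kind of input a decision tree or Naïve Bayes classifier would then be trained on.

```python
import math
import statistics


def vibration_features(signal):
    """Compute common time-domain statistical features of a vibration signal.

    The feature set is an assumption; the paper's exact features are not
    given in the abstract.
    """
    n = len(signal)
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)  # population standard deviation
    rms = math.sqrt(sum(x * x for x in signal) / n)
    peak = max(abs(x) for x in signal)
    # Kurtosis (non-excess): fourth central moment divided by variance squared.
    kurtosis = (
        sum((x - mean) ** 4 for x in signal) / (n * std ** 4) if std > 0 else 0.0
    )
    return {
        "mean": mean,
        "std": std,
        "rms": rms,
        "crest_factor": peak / rms if rms > 0 else 0.0,
        "kurtosis": kurtosis,
    }


if __name__ == "__main__":
    # Invented toy signal: an alternating unit-amplitude vibration.
    print(vibration_features([1.0, -1.0, 1.0, -1.0]))
```

Impulsive faults such as a chipped gear tooth tend to raise the crest factor and kurtosis of the signal, which is why such features are commonly fed to the classifier.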
