Minimum Query Set for Decision Tree Construction

Wojciech Wieczorek; Jan Kozak; Łukasz Strąk; Arkadiusz Nowakowski

doi:10.3390/e23121682

Entropy ◽

10.3390/e23121682 ◽

2021 ◽

Vol 23 (12) ◽

pp. 1682

Author(s):

Wojciech Wieczorek ◽

Jan Kozak ◽

Łukasz Strąk ◽

Arkadiusz Nowakowski

Keyword(s):

Genetic Algorithm ◽

Decision Tree ◽

Programming Model ◽

Building Blocks ◽

Optimal Decision ◽

Second Stage ◽

Tree Construction ◽

Series Of Experiments ◽

Definition Of ◽

Classification Quality

A new two-stage method for the construction of a decision tree is developed. The first stage is based on the definition of a minimum query set, which is the smallest set of attribute-value pairs for which any two objects can be distinguished. To obtain this set, an appropriate linear programming model is proposed. The queries from this set are building blocks of the second stage in which we try to find an optimal decision tree using a genetic algorithm. In a series of experiments, we show that for some databases, our approach should be considered as an alternative method to classical ones (CART, C4.5) and other heuristic approaches in terms of classification quality.
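To make the notion of a minimum query set concrete, the following is a minimal sketch, not the authors' linear programming model. It assumes that a query is an (attribute index, value) pair answered with yes or no, and that two objects are distinguished when at least one query gives them different answers. It finds a smallest such set by brute force, which is practical only for tiny tables. The function names and the toy data are hypothetical.

```python
from itertools import combinations


def candidate_queries(objects):
    """Collect every (attribute index, value) pair that occurs in the data."""
    return sorted({(i, v) for obj in objects for i, v in enumerate(obj)})


def distinguishes(queries, a, b):
    """Return True if some query answers differently for objects a and b."""
    return any((a[i] == v) != (b[i] == v) for i, v in queries)


def minimum_query_set(objects):
    """Brute-force search for a smallest query set separating all distinct objects."""
    queries = candidate_queries(objects)
    pairs = [(a, b) for a, b in combinations(objects, 2) if a != b]
    # Try subsets in order of increasing size, so the first hit is minimum.
    for size in range(len(queries) + 1):
        for subset in combinations(queries, size):
            if all(distinguishes(subset, a, b) for a, b in pairs):
                return list(subset)
    return None  # unreachable: the full query set separates distinct objects


# Hypothetical toy table: each object is a tuple of attribute values.
objects = [
    ("sunny", "hot"),
    ("sunny", "cold"),
    ("rainy", "hot"),
    ("rainy", "cold"),
]
result = minimum_query_set(objects)
print(result)
```

For this toy table, two queries suffice, one on each attribute. An exact optimization model, as proposed in the paper, replaces the exhaustive search when the table is larger.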

Optimal Decision Trees on Simplicial Complexes

The Electronic Journal of Combinatorics ◽

10.37236/1900 ◽

2005 ◽

Vol 12 (1) ◽

Cited By ~ 1

Author(s):

Jakob Jonsson

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Simplicial Complex ◽

Elementary Theory ◽

Simplicial Complexes ◽

Optimal Decision ◽

Property A ◽

Recursive Definition ◽

Topological Combinatorics ◽

Definition Of

We consider topological aspects of decision trees on simplicial complexes, concentrating on how to use decision trees as a tool in topological combinatorics. By Robin Forman's discrete Morse theory, the number of evasive faces of a given dimension $i$ with respect to a decision tree on a simplicial complex is greater than or equal to the $i$th reduced Betti number (over any field) of the complex. Under certain favorable circumstances, a simplicial complex admits an "optimal" decision tree such that equality holds for each $i$; we may hence read off the homology directly from the tree. We provide a recursive definition of the class of semi-nonevasive simplicial complexes with this property. A certain generalization turns out to yield the class of semi-collapsible simplicial complexes that admit an optimal discrete Morse function in the analogous sense. In addition, we develop some elementary theory about semi-nonevasive and semi-collapsible complexes. Finally, we provide explicit optimal decision trees for several well-known simplicial complexes.
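Stated as a formula, with notation that is assumed here rather than taken from the paper ($\Delta$ for the simplicial complex, $T$ for a decision tree on it, $e_i(T)$ for the number of evasive faces of dimension $i$, and $\tilde{\beta}_i(\Delta;\mathbb{F})$ for the $i$th reduced Betti number over a field $\mathbb{F}$), the bound from discrete Morse theory reads:

$$e_i(T) \;\ge\; \tilde{\beta}_i(\Delta;\mathbb{F}) \quad \text{for every dimension } i \text{ and every field } \mathbb{F}.$$

An "optimal" decision tree in the sense of the abstract is one for which this inequality becomes an equality for each $i$.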

A case study of optimal decision tree construction for RFKON database

2016 International Symposium on INnovations in Intelligent SysTems and Applications (INISTA) ◽

10.1109/inista.2016.7571857 ◽

2016 ◽

Cited By ~ 1

Author(s):

Sinem Bozkurt Keser ◽

Ugur Yayan

Keyword(s):

Decision Tree ◽

Optimal Decision ◽

Tree Construction

HYPER HEURISTIC EVOLUTIONARY APPROACH FOR CONSTRUCTING DECISION TREE CLASSIFIERS

Journal of Information and Communication Technology ◽

10.32890/jict2021.20.2.5 ◽

2021 ◽

Vol 20 (Number 2) ◽

pp. 249-276

Author(s):

Sunil Kumar ◽

Saroj Ratnoo ◽

Jyoti Vashishtha

Keyword(s):

Decision Tree ◽

Heuristic Approach ◽

Decision Tree Model ◽

Evolutionary Approach ◽

Optimal Decision ◽

Decision Tree Classifier ◽

Tree Model ◽

Tree Construction ◽

Tree Classifier ◽

Optimal Values

Decision tree models have earned a special status in predictive modeling because they are considered comprehensible for human analysis and insight. The Classification and Regression Tree (CART) algorithm is one of the renowned decision tree induction algorithms for addressing classification as well as regression problems. Finding optimal values for the hyperparameters of a decision tree construction algorithm is a challenging issue. To build an effective decision tree classifier with high accuracy and comprehensibility, we need to address the question of setting optimal values for its hyperparameters, such as the maximum size of the tree, the minimum number of instances required in a node for inducing a split, the node splitting criterion, and the amount of pruning. The hyperparameter setting influences the performance of the decision tree model. No single setting of hyperparameters works equally well for different datasets: a particular setting that gives an optimal decision tree for one dataset may produce a sub-optimal decision tree model for another. In this paper, we present a hyper-heuristic approach for tuning the hyperparameters of Recursive and Partition Trees (rpart), which is a typical implementation of CART in the statistical and data analytics package R. We employ an evolutionary algorithm as the hyper-heuristic for tuning the hyperparameters of the decision tree classifier. The approach is named Hyper heuristic Evolutionary Approach with Recursive and Partition Trees (HEARpart). The proposed approach is validated on 30 datasets. Statistical tests show that HEARpart performs significantly better than WEKA’s J48 algorithm in terms of error rate, F-measure, and tree size. Further, the suggested hyper-heuristic algorithm constructs significantly more comprehensible models than WEKA’s J48, CART, and other similar decision tree construction strategies. The results show that the accuracy achieved by the hyper-heuristic approach is slightly lower than that of the other approaches compared.

Insulating Composites Made from Sulfur, Canola Oil, and Wool

10.26434/chemrxiv.13464842.v1 ◽

2020 ◽

Author(s):

Israa Bu Najmah ◽

Nicholas Lundquist ◽

Melissa K. Stanfield ◽

Filip Stojcevski ◽

Jonathan A. Campbell ◽

...

Keyword(s):

Tensile Strength ◽

Canola Oil ◽

Building Blocks ◽

Atom Economy ◽

Flame Resistance ◽

Sustainable Building ◽

Second Stage ◽

Wool Fibers ◽

Insulating Properties ◽

Polysulfide Polymer

An insulating composite was made from the sustainable building blocks wool, sulfur, and canola oil. In the first stage of the synthesis, inverse vulcanization was used to make a polysulfide polymer from the canola oil triglyceride and sulfur. This polymerization benefits from complete atom economy. In the second stage, the powdered polymer is mixed with wool, coating the fibers through electrostatic attraction. The polymer and wool mixture is then compressed with mild heating to provoke S-S metathesis in the polymer, which locks the wool in the polymer matrix. The wool fibers impart tensile strength, insulating properties, and flame resistance to the composite. All building blocks are sustainable or derived from waste and the composite is a promising lead on next-generation insulation for energy conservation.

Optimal Placement and Sizing of D-STATCOM in Radial and Meshed Distribution Networks Using a Discrete-Continuous Version of the Genetic Algorithm

Electronics ◽

10.3390/electronics10121452 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1452

Author(s):

Cristian Mateo Castiblanco-Pérez ◽

David Esteban Toro-Rodríguez ◽

Oscar Danilo Montoya ◽

Diego Armando Giral-Ramírez

Keyword(s):

Genetic Algorithm ◽

Objective Function ◽

Programming Model ◽

Distribution Networks ◽

Optimization Method ◽

Optimal Placement ◽

Mixed Integer ◽

Power Balance ◽

Continuous Version ◽

Discrete Part

In this paper, we propose a new discrete-continuous codification of the Chu–Beasley genetic algorithm to address the optimal placement and sizing problem of distribution static compensators (D-STATCOM) in electrical distribution grids. The discrete part of the codification determines the nodes where the D-STATCOM will be installed. The continuous part of the codification regulates their sizes. The objective function considered in this study is the minimization of the annual operating costs regarding energy losses and installation investments in D-STATCOM. This objective function is subject to the classical power balance constraints and the devices’ capabilities. The proposed discrete-continuous version of the genetic algorithm solves the mixed-integer non-linear programming model that the classical power balance generates. Numerical validations in the 33 test feeder with radial and meshed configurations show that the proposed approach effectively minimizes the annual operating costs of the grid. In addition, the results of the proposed optimization method are compared with those obtained using the GAMS software, which demonstrates its efficiency and robustness.
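The discrete-continuous codification can be pictured as a chromosome with two aligned parts. The sketch below illustrates only that encoding and a simple mutation on it. It is not the Chu–Beasley algorithm and contains no power flow evaluation. The function names, the convention that node 1 is excluded as the substation node, the size bound, and the mutation step are all assumptions made for illustration.

```python
import random


def random_individual(num_nodes, num_devices, max_size, rng):
    """Build one chromosome: (discrete node list, continuous size list)."""
    # Discrete part: distinct candidate nodes (node 1 assumed to be the substation).
    nodes = rng.sample(range(2, num_nodes + 1), num_devices)
    # Continuous part: one size per selected node, within [0, max_size].
    sizes = [rng.uniform(0.0, max_size) for _ in range(num_devices)]
    return nodes, sizes


def mutate(individual, num_nodes, max_size, rng):
    """Change either the location or the size of one randomly chosen device."""
    nodes, sizes = list(individual[0]), list(individual[1])
    k = rng.randrange(len(nodes))
    if rng.random() < 0.5:
        # Discrete move: relocate device k to a node not already in use.
        free = [n for n in range(2, num_nodes + 1) if n not in nodes]
        nodes[k] = rng.choice(free)
    else:
        # Continuous move: perturb the size and clip it to the allowed range.
        step = rng.gauss(0.0, 0.1 * max_size)
        sizes[k] = min(max_size, max(0.0, sizes[k] + step))
    return nodes, sizes


rng = random.Random(0)  # fixed seed so the run is repeatable
parent = random_individual(num_nodes=33, num_devices=3, max_size=2.0, rng=rng)
child = mutate(parent, num_nodes=33, max_size=2.0, rng=rng)
print(parent)
print(child)
```

Keeping the two parts aligned by index means a single gene position always describes one device, so crossover and mutation can act on locations and sizes independently without producing mismatched pairs.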

Business E-NeGotiAtion: A Method Using a Genetic Algorithm for Online Dispute Resolution in B2B Relationships

Journal of theoretical and applied electronic commerce research ◽

10.3390/jtaer16050067 ◽

2021 ◽

Vol 16 (5) ◽

pp. 1186-1216

Author(s):

Nikola Simkova ◽

Zdenek Smutny

Keyword(s):

Genetic Algorithm ◽

Dispute Resolution ◽

Single Case ◽

Computer Mediated Communication ◽

Optimal Solution ◽

Mediated Communication ◽

Online Dispute Resolution ◽

B2b Relationships ◽

Definition Of ◽

E Mail

An opportunity to resolve disputes as an out-of-court settlement through computer-mediated communication is usually easier, faster, and cheaper than filing an action in court. Artificial intelligence and law (AI & Law) research has gained importance in this area. The article presents a design of the E-NeGotiAtion method for assisted negotiation in business to business (B2B) relationships, which uses a genetic algorithm for selecting the most appropriate solution(s). The aim of the article is to present how the method is designed and contribute to knowledge on online dispute resolution (ODR) with a focus on B2B relationships. The evaluation of the method consisted of an embedded single-case study, where participants from two countries simulated the realities of negotiation between companies. For comparison, traditional negotiation via e-mail was also conducted. The evaluation confirms that the proposed E-NeGotiAtion method quickly achieves solution(s), approaching the optimal solution on which both sides can decide, and also very importantly, confirms that the method facilitates negotiation with the partner and creates a trusted result. The evaluation demonstrates that the proposed method is economically efficient for parties of the dispute compared to negotiation via e-mail. For a more complicated task with five or more products, the E-NeGotiAtion method is significantly more suitable than negotiation via e-mail for achieving a resolution that favors one side or the other as little as possible. In conclusion, it can be said that the proposed method fulfills the definition of the dual-task of ODR—it resolves disputes and builds confidence.

Optimal Sizing and Placement of Capacitor Banks in Distribution Networks Using a Genetic Algorithm

Electricity ◽

10.3390/electricity2020012 ◽

2021 ◽

Vol 2 (2) ◽

pp. 187-204

Author(s):

Gian Giuseppe Soma

Keyword(s):

Genetic Algorithm ◽

Electricity Market ◽

Distribution Systems ◽

Reactive Power ◽

Electricity Consumption ◽

Distribution Networks ◽

Optimal Placement ◽

Capacitor Banks ◽

Control Pattern ◽

Definition Of

Nowadays, response to electricity consumption growth is mainly supported by efficiency; therefore, this is the new main goal in the development of electric distribution networks, which must fully comply with the system’s constraints. In recent decades, the issue of independent reactive power services, including the optimal placement of capacitors in the grid due to the restructuring of the electricity industry and the creation of a competitive electricity market, has received attention from related companies. In this context, a genetic algorithm is proposed for optimal planning of capacitor banks. A case study derived from a real network, considering the application of suitable daily profiles for loads and generators, to obtain a better representation of the electrical conditions, is discussed in the present paper. The results confirmed that some placement solutions can be obtained with a good compromise between costs and benefits; the adopted benefits are energy losses and power factor infringements, taking into account the network technical limits. The feasibility and effectiveness of the proposed algorithm for optimal placement and sizing of capacitor banks in distribution systems, with the definition of a suitable control pattern, have been proved.

Personalised attraction recommendation for enhancing topic diversity and accuracy

Journal of Information Science ◽

10.1177/0165551521999801 ◽

2021 ◽

pp. 016555152199980

Author(s):

Yuanyuan Lin ◽

Chao Huang ◽

Wei Yao ◽

Yifei Shao

Keyword(s):

Real World ◽

Information Overload ◽

Two Stage ◽

Misclassification Cost ◽

Second Stage ◽

Low Visibility ◽

Recommendation Diversity ◽

Optimisation Model ◽

Definition Of ◽

Improved Methods

Attraction recommendation plays an important role in tourism, such as solving information overload problems and recommending proper attractions to users. Currently, most recommendation methods are dedicated to improving the accuracy of recommendations. However, recommendation methods only focusing on accuracy tend to recommend popular items that are often purchased by users, which results in a lack of diversity and low visibility of non-popular items. Hence, many studies have suggested the importance of recommendation diversity and proposed improved methods, but there is room for improvement. First, the definition of diversity for different items requires consideration for domain characteristics. Second, the existing algorithms for improving diversity sacrifice the accuracy of recommendations. Therefore, the article utilises the topic ‘features of attractions’ to define the calculation method of recommendation diversity. We developed a two-stage optimisation model to enhance recommendation diversity while maintaining the accuracy of recommendations. In the first stage, an optimisation model considering topic diversity is proposed to increase recommendation diversity and generate candidate attractions. In the second stage, we propose a minimisation misclassification cost optimisation model to balance recommendation diversity and accuracy. To assess the performance of the proposed method, experiments are conducted with real-world travel data. The results indicate that the proposed two-stage optimisation model can significantly improve the diversity and accuracy of recommendations.

A Two-Stage Mono- and Multi-Objective Method for the Optimization of General UPS Parallel Manipulators

Mathematics ◽

10.3390/math9050543 ◽

2021 ◽

Vol 9 (5) ◽

pp. 543

Author(s):

Alejandra Ríos ◽

Eusebio E. Hernández ◽

S. Ivvan Valdez

Keyword(s):

Genetic Algorithm ◽

Performance Metrics ◽

Tracking Error ◽

Parallel Manipulators ◽

Optimization Method ◽

Statistical Hypothesis ◽

Two Stage ◽

Multi Objective ◽

Second Stage ◽

Conflicting Objectives

This paper introduces a two-stage method based on bio-inspired algorithms for the design optimization of a class of general Stewart platforms. The first stage performs a mono-objective optimization in order to reach, with sufficient dexterity, a regular target workspace while minimizing the elements’ lengths. For this optimization problem, we compare three bio-inspired algorithms: the Genetic Algorithm (GA), Particle Swarm Optimization (PSO), and the Boltzmann Univariate Marginal Distribution Algorithm (BUMDA). The second stage looks for the most suitable gains of a Proportional Integral Derivative (PID) control via the minimization of two conflicting objectives: one based on energy consumption and the other on the tracking error of a target trajectory. To this effect, we compare two multi-objective algorithms: the Multiobjective Evolutionary Algorithm based on Decomposition (MOEA/D) and the Non-dominated Sorting Genetic Algorithm-III (NSGA-III). The main contributions lie in the optimization model, the proposal of a two-stage optimization method, and the findings on the performance of different bio-inspired algorithms for each stage. Furthermore, we show optimized designs delivered by the proposed method and provide directions for the best-performing algorithms through performance metrics and statistical hypothesis tests.

A Building-Block-Based Genetic Algorithm for Solving the Robots Allocation Problem in a Robotic Mobile Fulfilment System

Mathematical Problems in Engineering ◽

10.1155/2019/6153848 ◽

2019 ◽

Vol 2019 ◽

pp. 1-15

Author(s):

Jingtian Zhang ◽

Fuxing Yang ◽

Xun Weng

Keyword(s):

Genetic Algorithm ◽

Dynamic Scheduling ◽

Building Blocks ◽

Allocation Problem ◽

Accurate Analysis ◽

Allocation Problems ◽

Scheduling Method ◽

Project Scheduling Problem ◽

Block Based ◽

Better Than

The robotic mobile fulfilment system (RMFS) is an efficient and flexible order picking system in which robots carry movable shelves with items to the picking stations. This innovative parts-to-picker system, known as the Kiva system, is especially suited for e-commerce fulfilment centres and has been widely used in practice. However, there are many resource allocation problems in RMFS. The robots allocation problem of deciding which robot will be allocated to a delivery task has a significant impact on the productivity of the whole system. We model this problem as a resource-constrained project scheduling problem with transfer times (RCPSPTT) based on an accurate analysis of the driving and delivering behaviour of robots. A dedicated serial schedule generation scheme and a genetic algorithm using a building-blocks-based crossover (BBX) operator are proposed to solve this problem. The designed algorithm can be combined into a dynamic scheduling structure or used as the basis of calculation for other allocation problems. Experiment instances are generated based on the characteristics of RMFS, and the computation results show that the proposed algorithm outperforms the traditional rule-based scheduling method. The BBX operator is rapid and efficient, and it performs better than several classic and competitive crossover operators.
