Joint feature selection and hierarchical classifier design

Author(s):  
Cecille Freeman ◽  
Dana Kulic ◽  
Otman Basir


Author(s):  
Paulo V. W. Radtke ◽  
Tony Wong ◽  
Robert Sabourin
The optimization of many engineering systems is challenged by solution over-fit to the data set used to evaluate candidate solutions during the evolutionary process. Solution over-fit is hard to detect and is especially prevalent in problems involving example-based training, such as pattern feature selection and pattern classifier design. In these applications, uncontrolled over-fit can lead to biased feature extraction and degraded classifier generalization. This paper details the performance of a solution over-fit control strategy used in the multiobjective evolutionary optimization of a multilevel classification system. The control, embedded within a solution validation procedure, minimizes over-fit effects without modifying the dominance relation used to process candidate solutions. Extensive experimental analysis with multiobjective genetic and memetic algorithms demonstrates both the need for and the efficiency of the proposed over-fit control in the optimization of pattern classification systems.
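As a rough illustration of the validation idea described in this abstract (a minimal sketch, not the authors' procedure), the snippet below selects the final solution from an evolutionary archive by its score on a held-out validation set rather than the optimization-set score that drove the search. All names and the toy fitness function here are invented for the example:

```python
def select_validated(archive, fitness, val_data):
    """Return the archived candidate with the best held-out (validation) score.

    Selecting by validation score instead of the score used during evolution
    is one simple way to curb solution over-fit to the optimization set.
    """
    return max(archive, key=lambda cand: fitness(cand, val_data))

# Toy usage with an invented fitness: accuracy of a candidate rule on labeled data.
def fitness(candidate, data):
    return sum(1 for x, y in data if candidate(x) == y) / len(data)

# Three hypothetical candidate classifiers left in the archive after evolution.
archive = [lambda x: 0, lambda x: x % 2, lambda x: 1]
# Held-out validation data: inputs labeled by parity.
val_data = [(i, i % 2) for i in range(10)]

best = select_validated(archive, fitness, val_data)
print(best(3), best(4))  # → 1 0
```

The point of the sketch is only the selection step: the candidate that generalizes best to unseen data wins, even if another candidate scored higher during the search.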


Entropy ◽  
2020 ◽  
Vol 22 (4) ◽  
pp. 492
Author(s):  
Gustavo Estrela ◽  
Marco Dimas Gubitoso ◽  
Carlos Eduardo Ferreira ◽  
Junior Barrera ◽  
Marcelo S. Reis

In Machine Learning, feature selection is an important step in classifier design. It consists of finding a subset of features that is optimal for a given cost function. One way to solve feature selection is to organize all possible feature subsets into a Boolean lattice and exploit the fact that the costs along chains in that lattice describe U-shaped curves. Minimizing such a cost function is known as the U-curve problem. Recently, a study proposed U-Curve Search (UCS), an optimal algorithm for that problem, which was successfully used for feature selection. However, despite the algorithm's optimality, the time UCS required in computational assays was exponential in the number of features. Here, we show that this scalability issue arises because the U-curve problem is NP-hard. We then introduce Parallel U-Curve Search (PUCS), a new algorithm for the U-curve problem. In PUCS, we present a novel way to partition the search space into smaller Boolean lattices, rendering the algorithm highly parallelizable. We also provide computational assays with both synthetic data and Machine Learning datasets, in which the performance of PUCS was assessed against UCS and other gold-standard algorithms for feature selection.
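To make the U-curve property concrete, here is a hedged Python sketch (the cost function and all names are assumptions for illustration, not taken from the paper). A chain in the Boolean lattice is a sequence of nested feature subsets; because cost along any chain is U-shaped, a walk down a chain may stop at the first cost increase:

```python
def chain(n):
    """Yield one chain of nested feature subsets {} ⊂ {0} ⊂ {0,1} ⊂ … over n features."""
    subset = set()
    yield frozenset(subset)
    for f in range(n):
        subset.add(f)
        yield frozenset(subset)

def u_shaped_cost(subset, sweet_spot=3):
    # Invented toy cost: penalizes both too few features (poor fit)
    # and too many features (over-fit), producing a U along each chain.
    k = len(subset)
    return (k - sweet_spot) ** 2 + 1

def chain_minimum(n):
    """Walk one chain and stop at the first cost increase.

    Early stopping is valid exactly because cost along a chain is U-shaped:
    once the cost rises, no deeper subset on this chain can be better.
    """
    best_subset, best_cost = None, float("inf")
    for s in chain(n):
        c = u_shaped_cost(s)
        if c > best_cost:
            break
        best_subset, best_cost = s, c
    return best_subset, best_cost

subset, cost = chain_minimum(8)
print(sorted(subset), cost)  # → [0, 1, 2] 1
```

UCS and PUCS, as described in the abstract, exploit this chain structure over the whole lattice rather than a single chain; the sketch only shows why U-shaped chain costs permit pruning.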


2004 ◽  
Vol 26 (9) ◽  
pp. 1105-1111 ◽  
Author(s):  
B. Krishnapuram ◽  
A.J. Hartemink ◽  
L. Carin ◽  
M.A.T. Figueiredo
