Mutual information based input feature selection for classification problems
2012 ◽ Vol 54 (1) ◽ pp. 691-698
Author(s): Shuang Cang ◽ Hongnian Yu

Author(s): M. Vidyasagar

The objectives of this Perspective paper are to review some recent advances in sparse feature selection for regression and classification, as well as compressed sensing, and to discuss how these might be used to develop tools to advance personalized cancer therapy. As an illustration of the possibilities, a new algorithm for sparse regression is presented and is applied to predict the time to tumour recurrence in ovarian cancer. A new algorithm for sparse feature selection in classification problems is presented, and its validation in endometrial cancer is briefly discussed. Some open problems are also presented.
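The sparse regression the abstract refers to is typically cast as an l1-penalised least-squares problem, which drives the weights of uninformative features to exactly zero. As a minimal sketch (not the paper's own algorithm, which is not given here), the following coordinate-descent Lasso on synthetic data recovers only the features that actually drive the response; all names and data are illustrative.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate-descent Lasso: minimises (1/2n)||y - Xw||^2 + lam*||w||_1.
    The soft-thresholding step sets weak coordinates exactly to zero,
    which is what makes the method a feature selector."""
    n, d = X.shape
    w = np.zeros(d)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(d):
            # partial residual with feature j's contribution removed
            r = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ r / n
            # soft-threshold: zero unless the correlation exceeds lam
            w[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
    return w

# toy data: the response depends on only the first two of five features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=200)

w = lasso_cd(X, y, lam=0.1)
selected = np.flatnonzero(np.abs(w) > 1e-6)  # indices of surviving features
```

With this penalty the three noise features receive weight exactly zero, so `selected` contains only the informative indices.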


2018 ◽ Vol 45 (1) ◽ pp. 53-67
Author(s): Néstor Barraza ◽ Sérgio Moro ◽ Marcelo Ferreyra ◽ Adolfo de la Peña

Feature selection is a highly relevant task in any data-driven knowledge discovery project. The present research analyses the advantages and disadvantages of using mutual information (MI) and data-based sensitivity analysis (DSA) for feature selection in classification problems, applying both to a bank telemarketing case. A logistic regression model is built on the tuned set of features that each technique identifies as most influencing the success of a telemarketing contact: 13 features for MI and 9 for DSA. The latter performs better at lower false-positive rates, while the former is slightly better at higher false-positive rates. Thus, MI is the better choice when the aim is to reduce the cost of contacts slightly without risking the loss of many successes. DSA, however, achieved good prediction results with fewer features.
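MI-based feature selection ranks each candidate feature by its mutual information with the class label, I(X;Y) = Σ p(x,y) log[p(x,y) / (p(x)p(y))]. As a minimal sketch of the plug-in estimator for discrete features (not the paper's exact estimator; the feature names and data below are invented for illustration):

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Empirical mutual information (in nats) between two discrete
    sequences, using plug-in probability estimates from joint and
    marginal counts."""
    n = len(x)
    pxy = Counter(zip(x, y))   # joint counts
    px = Counter(x)            # marginal counts of x
    py = Counter(y)            # marginal counts of y
    mi = 0.0
    for (a, b), c in pxy.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), with the n's folded in
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi

# toy telemarketing-style data: 'contacted_before' tracks contact success,
# 'region' is pure noise (independent of the label)
success = [1, 1, 1, 1, 0, 0, 0, 0]
contacted_before = [1, 1, 1, 0, 0, 0, 0, 1]
region = [0, 1, 0, 1, 0, 1, 0, 1]

mi_informative = mutual_information(contacted_before, success)  # positive
mi_noise = mutual_information(region, success)                  # 0.0 (independent)
```

Ranking features by this score and keeping the top k is the usual filter-style selection step before fitting a classifier such as logistic regression.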

