Ensemble Fuzzy Feature Selection Based on Relevancy, Redundancy, and Dependency Criteria

Omar A. M. Salem; Feng Liu; Yi-Ping Phoebe Chen; Xi Chen

doi:10.3390/e22070757

Entropy ◽

10.3390/e22070757 ◽

2020 ◽

Vol 22 (7) ◽

pp. 757 ◽

Cited By ~ 1

Author(s):

Omar A. M. Salem ◽

Feng Liu ◽

Yi-Ping Phoebe Chen ◽

Xi Chen

Keyword(s):

Feature Selection ◽

Classification Performance ◽

Classification Systems ◽

Selection Methods ◽

Discriminative Ability ◽

Main Challenge ◽

Benchmark Datasets ◽

Fuzzy Feature Selection ◽

Feature Significance ◽

Experimental Comparisons

The main challenge for classification systems is the processing of undesirable data. Filter-based feature selection is an effective way to improve the performance of classification systems by selecting the significant features and discarding the undesirable ones. The success of this approach depends on the information extracted from the data characteristics. For this reason, many theories have been introduced to extract different feature relations. Unfortunately, traditional feature selection methods estimate feature significance based on either individual or dependency discriminative ability, but not both. This paper introduces a new ensemble feature selection method, called fuzzy feature selection based on relevancy, redundancy, and dependency (FFS-RRD). The proposed method considers both individual and dependency discriminative ability to extract all possible feature relations. To evaluate the proposed method, experimental comparisons are conducted with eight state-of-the-art and conventional feature selection methods. On 13 benchmark datasets, the experimental results over four well-known classifiers show that the proposed method outperforms the compared methods in terms of classification performance and stability.
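
The relevancy and redundancy criteria named in the title can be illustrated with a generic greedy filter. The sketch below is not FFS-RRD: it assumes plain (non-fuzzy) mutual information on discrete features, omits the dependency criterion, and uses hypothetical function names and toy data.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def greedy_select(features, labels, k):
    """Pick k feature names by relevance minus mean redundancy (assumed criterion)."""
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:
        def score(name):
            relevance = mutual_information(features[name], labels)
            if not selected:
                return relevance
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance - redundancy

        # Sorting makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (hypothetical): f2 duplicates f1, f3 carries different information.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f1": [0, 0, 0, 1, 1, 1, 1, 1],
    "f2": [0, 0, 0, 1, 1, 1, 1, 1],
    "f3": [0, 0, 1, 0, 0, 1, 1, 1],
}
chosen = greedy_select(features, labels, 2)
```

In this toy case the duplicate feature `f2` is skipped in favour of `f3`, because its redundancy with the already selected `f1` outweighs its relevance.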

BETTER ALTERNATIVES FOR STEPWISE DISCRIMINANT ANALYSIS

Acta Universitatis Lodziensis Folia oeconomica ◽

10.18778/0208-6018.311.02 ◽

2015 ◽

Vol 1 (311) ◽

Author(s):

Katarzyna Stąpor

Keyword(s):

Feature Selection ◽

Discriminant Analysis ◽

Tabu Search ◽

Stepwise Discriminant Analysis ◽

Selection Methods ◽

Discrimination Power ◽

Statistical Software ◽

Software Packages ◽

Benchmark Datasets

Discriminant analysis can best be defined as a technique that allows the classification of an individual into one of several distinct populations on the basis of a set of measurements. Stepwise discriminant analysis (SDA) is concerned with selecting the most important variables whilst retaining the highest discrimination power possible. Selecting a smaller number of variables is often necessary for a variety of reasons. In existing statistical software packages, SDA is based on classic feature selection methods. Many problems with such stepwise procedures have been identified. This work presents a new method based on the metaheuristic strategy tabu search, together with experimental results obtained on selected benchmark datasets. The results are promising.

Deep Discriminative Representation Learning with Attention Map for Scene Classification

Remote Sensing ◽

10.3390/rs12091366 ◽

2020 ◽

Vol 12 (9) ◽

pp. 1366 ◽

Cited By ~ 5

Author(s):

Jun Li ◽

Daoyu Lin ◽

Yang Wang ◽

Guangluan Xu ◽

Yunyan Zhang ◽

...

Keyword(s):

Remote Sensing ◽

Feature Fusion ◽

Representation Learning ◽

Classification Performance ◽

Great Success ◽

Scene Classification ◽

Remote Sensing Images ◽

Discriminative Ability ◽

Feature Representations ◽

Benchmark Datasets

In recent years, convolutional neural networks (CNNs) have shown great success in the scene classification of computer vision images. Although these CNNs can achieve excellent classification accuracy, the discriminative ability of the feature representations they extract is still limited when distinguishing more complex remote sensing images. Therefore, we propose a unified feature fusion framework based on an attention mechanism, called Deep Discriminative Representation Learning with Attention Map (DDRL-AM). First, by applying the Gradient-weighted Class Activation Mapping (Grad-CAM) algorithm, attention maps associated with the predicted results are generated so that the CNNs focus on the most salient parts of the image. Second, a spatial feature transformer (SFT) is designed to extract discriminative features from the attention maps. Then, an innovative two-channel CNN architecture is proposed that fuses the features extracted from the attention maps with those of the RGB (red green blue) stream. A new objective function that considers both center loss and cross-entropy loss is optimized to decrease the influence of inter-class dispersion and within-class variance. To show its effectiveness in classifying remote sensing images, the proposed DDRL-AM method is evaluated on four public benchmark datasets. The experimental results demonstrate the competitive scene classification performance of the DDRL-AM approach. Moreover, visualization of the features extracted by the proposed DDRL-AM method indicates that their discriminative ability has been increased.
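
An objective that combines cross-entropy loss with center loss can be sketched for a single sample as follows. The abstract does not give the exact form, so this assumes the commonly used definition of center loss (half the squared distance to the class center) and a hypothetical weighting factor `lam`. All names and values are illustrative.

```python
from math import exp, log


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample (numerically stabilised)."""
    m = max(logits)
    log_sum = m + log(sum(exp(z - m) for z in logits))
    return log_sum - logits[label]


def center_loss(feature, center):
    """Half the squared Euclidean distance between a feature and its class center."""
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))


def joint_loss(logits, feature, label, centers, lam=0.5):
    """Cross-entropy plus lam times center loss; lam is an assumed weight."""
    return cross_entropy(logits, label) + lam * center_loss(feature, centers[label])


# Hypothetical two-class example with 2-D features.
centers = {0: [1.0, 0.0], 1: [0.0, 1.0]}
loss = joint_loss(logits=[0.0, 0.0], feature=[1.0, 2.0], label=0, centers=centers)
```

The cross-entropy term separates the classes, while the center term pulls each feature toward the center of its own class.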

Fuzzy Mutual Information Feature Selection Based on Representative Samples

International Journal of Software Innovation ◽

10.4018/ijsi.2018010105 ◽

2018 ◽

Vol 6 (1) ◽

pp. 58-72

Author(s):

Omar A. M. Salem ◽

Liwei Wang

Keyword(s):

Feature Selection ◽

Mutual Information ◽

Classification Performance ◽

Feature Subset ◽

Classification Models ◽

Negative Effect ◽

Benchmark Datasets ◽

Real World Datasets ◽

And Storage ◽

Representative Samples

Building classification models from real-world datasets has become a difficult task, especially for datasets with high-dimensional features. Unfortunately, these datasets may include irrelevant or redundant features, which have a negative effect on classification performance. Selecting the significant features and eliminating the undesirable ones can improve the classification models. Fuzzy mutual information is a widely used feature selection method for finding the best feature subset before the classification process. However, it requires more computation and storage space. To overcome these limitations, this paper proposes an improved fuzzy mutual information feature selection method based on representative samples. On benchmark datasets, the experiments show that the proposed method achieves better results in terms of classification accuracy, selected feature subset size, storage, and stability.

Feature Selection Based on Binary Tree Growth Algorithm for the Classification of Myoelectric Signals

Machines ◽

10.3390/machines6040065 ◽

2018 ◽

Vol 6 (4) ◽

pp. 65 ◽

Cited By ~ 4

Author(s):

Jingwei Too ◽

Abdul Abdullah ◽

Norhashimah Mohd Saad ◽

Nursabillilah Mohd Ali

Keyword(s):

Feature Selection ◽

Tree Growth ◽

Binary Tree ◽

Feature Vector ◽

Classification Performance ◽

Feature Reduction ◽

Feature Subset ◽

Selection Methods ◽

Time Frequency ◽

Mutation Operators

Electromyography (EMG) has been widely used in rehabilitation and myoelectric prosthetic applications. However, a recent increase in the number of EMG features has led to a high-dimensional feature vector. This in turn degrades the classification performance and increases the complexity of the recognition system. In this paper, we propose two new feature selection methods based on a tree growth algorithm (TGA) for EMG signal classification. In the first approach, two transfer functions are implemented to convert the continuous TGA into a binary version. In the second approach, swap, crossover, and mutation operators are introduced in a modified binary tree growth algorithm (MBTGA) to enhance the exploitation and exploration behaviors. In this study, the short-time Fourier transform (STFT) is employed to transform the EMG signals into a time-frequency representation. Features are then extracted from the STFT coefficients to form a feature vector. Afterward, the proposed feature selection methods are applied to select the best feature subset from the large available feature set. The experimental results show the superiority of MBTGA in terms of both feature reduction and classification performance.
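
A transfer function turns a continuous search position into a binary feature mask. The abstract does not say which two transfer functions were used, so the sketch below assumes one common S-shaped choice, the sigmoid. The function names and sample position are hypothetical.

```python
import random
from math import exp


def sigmoid(x):
    """Numerically stable S-shaped transfer function mapping a real value to (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + exp(-x))
    z = exp(x)
    return z / (1.0 + z)


def binarize(position, rng):
    """Turn a continuous position vector into a 0/1 feature mask.

    Each dimension is switched on with probability sigmoid(x).
    """
    return [1 if rng.random() < sigmoid(x) else 0 for x in position]


# A seeded generator keeps the illustration reproducible.
mask = binarize([50.0, -50.0, 0.3], random.Random(0))
```

Strongly positive coordinates almost always select the feature and strongly negative ones almost never do. Values near zero behave like a coin flip.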

A feature selection model based on genetic rank aggregation for text sentiment classification

Journal of Information Science ◽

10.1177/0165551515613226 ◽

2016 ◽

Vol 43 (1) ◽

pp. 25-38 ◽

Cited By ~ 42

Author(s):

Aytuğ Onan ◽

Serdar Korukoğlu

Keyword(s):

Feature Selection ◽

Text Mining ◽

Language Processing ◽

Rank Aggregation ◽

Sentiment Classification ◽

Feature Subset ◽

Individual Feature ◽

Selection Methods ◽

Training Time ◽

Main Challenge

Sentiment analysis is an important research direction in natural language processing, text mining, and web mining that aims to extract subjective information from source materials. The main challenge encountered in machine learning-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the resulting model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely used owing to their simplicity and relatively high performance. This paper presents an ensemble approach to feature selection, which aggregates the individual feature lists obtained by different feature selection methods so that a more robust and efficient feature subset can be obtained. A genetic algorithm is used to aggregate the individual feature lists. Experimental evaluations indicate that the proposed aggregation model is efficient and outperforms individual filter-based feature selection methods on sentiment classification.
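
To make the idea of aggregating several ranked feature lists concrete, the sketch below swaps in a simple Borda count in place of the genetic algorithm used in the paper. It is only an illustration of rank aggregation. The rankings are hypothetical.

```python
def borda_aggregate(rankings):
    """Aggregate ranked feature lists with a Borda count.

    A feature at position p (0-based) in a list of length n earns n - p points;
    features missing from a list earn nothing from it.
    """
    scores = {}
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            scores[feature] = scores.get(feature, 0) + (n - position)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda f: (-scores[f], f))


# Hypothetical rankings from three filter methods.
rankings = [
    ["good", "bad", "movie"],
    ["bad", "good", "plot"],
    ["good", "plot", "bad"],
]
consensus = borda_aggregate(rankings)
```

A genetic algorithm would instead search for the consensus ordering that optimises a fitness function, which the abstract does not specify.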

Chinese Sentiment Classifier Machine Learning Based on Optimized Information Gain Feature Selection

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.988.511 ◽

2014 ◽

Vol 988 ◽

pp. 511-516 ◽

Cited By ~ 3

Author(s):

Jin Tao Shi ◽

Hui Liang Liu ◽

Yuan Xu ◽

Jun Feng Yan ◽

Jian Feng Xu

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Word Frequency ◽

Chinese Text ◽

Information Gain ◽

Classification Performance ◽

Selection Methods ◽

Text Feature ◽

Important Solution ◽

Feature Word

Machine learning is an important solution in research on Chinese text sentiment categorization, and text feature selection is critical to classification performance. However, although the classical feature selection methods work well on the global categories, they miss many representative feature words of each category. This paper presents an improved information gain method that integrates word frequency and the degree of feature word sentiment into the traditional information gain method. Experiments show that a classifier improved by this method achieves better classification.
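
The traditional information gain baseline that the paper builds on can be sketched as below. This covers only the standard presence/absence formulation and does not include the word-frequency or sentiment-degree terms added by the improved method. The toy documents are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(term, docs):
    """Information gain of a term, where docs is a list of (token_set, label) pairs."""
    labels = [label for _, label in docs]
    with_term = [label for tokens, label in docs if term in tokens]
    without_term = [label for tokens, label in docs if term not in tokens]
    n = len(docs)
    conditional = (len(with_term) / n) * entropy(with_term) + (
        len(without_term) / n
    ) * entropy(without_term)
    return entropy(labels) - conditional


# Hypothetical toy corpus: "great" separates the classes, "film" does not.
docs = [
    ({"great", "film"}, "pos"),
    ({"great", "plot"}, "pos"),
    ({"boring", "film"}, "neg"),
    ({"boring", "plot"}, "neg"),
]
ig_great = information_gain("great", docs)
ig_film = information_gain("film", docs)
```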

A Multi Criteria Decision Modelling Approach for Gait Analysis of Parkinson’s Disease Using Wearable Sensors to Compare the Classification Performance Based on the Different Feature Selection Methods

Lecture Notes in Electrical Engineering - Frontier Computing ◽

10.1007/978-981-13-3648-5_61 ◽

2019 ◽

pp. 528-534

Author(s):

Satyabrata Aich ◽

Kamalakanta Muduli ◽

Hee-Cheol Kim

Keyword(s):

Parkinson's Disease ◽

Feature Selection ◽

Gait Analysis ◽

Wearable Sensors ◽

Classification Performance ◽

Selection Methods ◽

Decision Modelling ◽

Modelling Approach

Identifying Optimal Wavelengths as Disease Signatures Using Hyperspectral Sensor and Machine Learning

Remote Sensing ◽

10.3390/rs13142833 ◽

2021 ◽

Vol 13 (14) ◽

pp. 2833

Author(s):

Xing Wei ◽

Marcela A. Johnson ◽

David B. Langston ◽

Hillary L. Mehl ◽

Song Li

Keyword(s):

Machine Learning ◽

Feature Selection ◽

Optical Sensors ◽

Minimum Distance ◽

Classification Performance ◽

Stem Rot ◽

Recursive Feature Elimination ◽

Agricultural Crop ◽

Selection Methods ◽

20 nm

Hyperspectral sensors combined with machine learning are increasingly utilized in agricultural crop systems for diverse applications, including plant disease detection. This study was designed to identify the most important wavelengths to discriminate between healthy and diseased peanut (Arachis hypogaea L.) plants infected with Athelia rolfsii, the causal agent of peanut stem rot, using in-situ spectroscopy and machine learning. In greenhouse experiments, daily measurements were conducted to inspect disease symptoms visually and to collect spectral reflectance of peanut leaves on lateral stems of plants mock-inoculated and inoculated with A. rolfsii. Spectrum files were categorized into five classes based on foliar wilting symptoms. Five feature selection methods were compared to select the top 10 ranked wavelengths with and without a custom minimum distance of 20 nm. Recursive feature elimination methods outperformed the chi-square and SelectFromModel methods. Adding the minimum distance of 20 nm into the top selected wavelengths improved classification performance. Wavelengths of 501–505, 690–694, 763 and 884 nm were repeatedly selected by two or more feature selection methods. These selected wavelengths can be applied in designing optical sensors for automated stem rot detection in peanut fields. The machine-learning-based methodology can be adapted to identify spectral signatures of disease in other plant-pathogen systems.
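
Selecting top-ranked wavelengths subject to a minimum spacing can be sketched as a greedy pass over the ranked list. The abstract does not describe how the 20 nm constraint was enforced, so the greedy rule here is an assumption. The importance scores are made up for illustration, and only the wavelength values come from the abstract.

```python
def top_k_with_min_distance(scores, k=10, min_distance=20.0):
    """Greedily keep the highest-scoring wavelengths that are at least
    min_distance nm away from every wavelength already kept.

    scores maps wavelength (nm) to an importance value.
    """
    chosen = []
    for wavelength in sorted(scores, key=lambda w: (-scores[w], w)):
        if all(abs(wavelength - c) >= min_distance for c in chosen):
            chosen.append(wavelength)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical importance scores for wavelengths mentioned in the abstract.
scores = {501: 0.9, 503: 0.8, 505: 0.7, 690: 0.6, 694: 0.5, 763: 0.4, 884: 0.3}
spaced = top_k_with_min_distance(scores, k=4, min_distance=20.0)
unspaced = top_k_with_min_distance(scores, k=4, min_distance=0.0)
```

Without the spacing rule, the top four picks cluster around 501 to 505 nm. With it, the selection spreads across the spectrum.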

Multi-Label Feature Selection Based on High-Order Label Correlation Assumption

Entropy ◽

10.3390/e22070797 ◽

2020 ◽

Vol 22 (7) ◽

pp. 797

Author(s):

Ping Zhang ◽

Wanfu Gao ◽

Juncheng Hu ◽

Yonghao Li

Keyword(s):

Feature Selection ◽

State Of The Art ◽

Classification Performance ◽

High Order ◽

Selection Methods ◽

Label Data ◽

Cumulative Summation ◽

Label Correlations ◽

Classification Information ◽

Correlation Term

Multi-label data often involve features with high dimensionality and complicated label correlations, which poses a great challenge for multi-label learning. Feature selection plays an important role in multi-label learning, and exploring label correlations is crucial for multi-label feature selection. Previous information-theoretic methods evaluate candidate features with a cumulative summation approximation, which considers only low-order label correlations. In fact, high-order label correlations exist in the label set: labels naturally cluster into several groups, with similar labels tending to fall into the same group and different labels into different groups. However, the cumulative summation approximation tends to select features related to the groups containing more labels while ignoring the classification information of groups containing fewer labels. As a result, many features related to similar labels are selected, which leads to poor classification performance. To this end, a Max-Correlation term that considers high-order label correlations is proposed. Additionally, we combine the Max-Correlation term with a feature redundancy term to ensure that the selected features are relevant to different label groups. Finally, a new method named Multi-label Feature Selection considering Max-Correlation (MCMFS) is proposed. Experimental results demonstrate the classification superiority of MCMFS in comparison with eight state-of-the-art multi-label feature selection methods.

Integrating Feature and Instance Selection Techniques in Opinion Mining

International Journal of Data Warehousing and Mining ◽

10.4018/ijdwm.2020070109 ◽

2020 ◽

Vol 16 (3) ◽

pp. 168-182

Author(s):

Zi-Hung You ◽

Ya-Han Hu ◽

Chih-Fong Tsai ◽

Yen-Ming Kuo

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Classification Performance ◽

Problem Instance ◽

Instance Selection ◽

Selection Methods ◽

Inverse Document Frequency ◽

Term Frequency ◽

Document Frequency ◽

Text Features

Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g. term frequency (TF) or term frequency–inverse document frequency (TF–IDF), can yield different numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade the classification performance. To solve this problem, instance selection can be adopted to filter out unrepresentative training documents. Therefore, this article investigates opinion mining performance when feature selection and instance selection are applied together. Two combination processes, which perform feature selection and instance selection in different orders, were compared. Specifically, two feature selection methods, namely TF and TF–IDF, and two instance selection methods, namely DROP3 and IB3, were employed for comparison. Experimental results on sentiment classifiers developed from three Twitter datasets showed that TF–IDF followed by DROP3 performs best.
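
The TF and TF–IDF weightings mentioned above can be sketched as follows. The abstract does not state which variant was used, so this assumes length-normalised term frequency and an unsmoothed `log(N / df)` inverse document frequency. The example documents are hypothetical.

```python
from collections import Counter
from math import log


def tf_idf(docs):
    """Compute TF-IDF weights for a list of tokenised documents.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    Returns one {term: weight} dict per document.
    """
    n = len(docs)
    document_frequency = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append(
            {
                term: (count / len(doc)) * log(n / document_frequency[term])
                for term, count in counts.items()
            }
        )
    return weights


# Hypothetical tokenised tweets.
docs = [["good", "phone"], ["bad", "phone"]]
weights = tf_idf(docs)
```

A term that occurs in every document, such as `phone` here, receives a weight of zero, which is why TF–IDF tends to retain fewer but more discriminative terms than raw TF.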
