Multi-Label Feature Selection Based on High-Order Label Correlation Assumption

Entropy ◽  
2020 ◽  
Vol 22 (7) ◽  
pp. 797
Author(s):  
Ping Zhang ◽  
Wanfu Gao ◽  
Juncheng Hu ◽  
Yonghao Li

Multi-label data often involve features with high dimensionality and complicated label correlations, posing a great challenge for multi-label learning. Feature selection plays an important role in multi-label learning for handling such data, and exploring label correlations is crucial for multi-label feature selection. Previous information-theoretical methods employ a cumulative-summation approximation to evaluate candidate features, which considers only low-order label correlations. In fact, high-order label correlations exist in the label set: labels naturally cluster into several groups, with similar labels tending to fall into the same group and dissimilar labels into different groups. However, the cumulative-summation approximation tends to select features related to the groups containing more labels while ignoring the classification information of groups containing fewer labels. As a result, many features related to similar labels are selected, which leads to poor classification performance. To this end, a Max-Correlation term considering high-order label correlations is proposed. Additionally, we combine the Max-Correlation term with a feature redundancy term to ensure that selected features are relevant to different label groups. Finally, a new method named Multi-label Feature Selection considering Max-Correlation (MCMFS) is proposed. Experimental results demonstrate the classification superiority of MCMFS in comparison to eight state-of-the-art multi-label feature selection methods.
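The Max-Correlation idea can be illustrated with a minimal greedy selector (a sketch, not the authors' MCMFS implementation: the data, the discretization, and the simple relevance-minus-redundancy score are assumptions). A candidate feature is scored by its maximum mutual information over the labels, so a small label group is not drowned out by a cumulative sum, minus its average redundancy with features already selected:

```python
# Hedged sketch of a Max-Correlation-style greedy multi-label feature selector.
import numpy as np
from sklearn.metrics import mutual_info_score

def mcmfs_sketch(X, Y, k):
    """Select k feature indices from discrete X (n x d) for labels Y (n x q)."""
    selected = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for f in range(X.shape[1]):
            if f in selected:
                continue
            # Max-Correlation: take the strongest label relevance, not the sum.
            relevance = max(mutual_info_score(X[:, f], Y[:, j])
                            for j in range(Y.shape[1]))
            # Feature redundancy with the already-selected subset.
            redundancy = (np.mean([mutual_info_score(X[:, f], X[:, s])
                                   for s in selected]) if selected else 0.0)
            score = relevance - redundancy
            if score > best_score:
                best, best_score = f, score
        selected.append(best)
    return selected

# Toy data: feature 0 determines label 0, feature 2 determines label 1.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(200, 4))
Y = np.stack([X[:, 0], X[:, 2]], axis=1)
chosen = mcmfs_sketch(X, Y, 2)
```

On this toy data the selector picks one feature per label group instead of two features tied to the same labels.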

Entropy ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. 1143
Author(s):  
Zhenwu Wang ◽  
Tielin Wang ◽  
Benting Wan ◽  
Mengjie Han

Multi-label classification (MLC) is a supervised learning problem in which an object is naturally associated with multiple concepts because it can be described from various dimensions. How to exploit the resulting label correlations is the key issue in MLC problems. The classifier chain (CC) is a well-known MLC approach that can learn complex coupling relationships between labels, but it suffers from two obvious drawbacks: (1) the label ordering is decided at random, although it usually has a strong effect on predictive performance; and (2) all labels are inserted into the chain, although some of them may carry irrelevant information that harms the prediction of the others. In this work, we propose a partial classifier chain method with feature selection (PCC-FS) that exploits the label correlation between the label and feature spaces and thus solves these two problems simultaneously. In the PCC-FS algorithm, feature selection is performed by learning the covariance between the feature set and the label set, eliminating the irrelevant features that can diminish classification performance. Couplings in the label set are extracted, and the coupled labels of each label are inserted simultaneously into the chain structure to carry out training and prediction. Experimental results on five metrics demonstrate that, in comparison to eight state-of-the-art MLC algorithms, the proposed method significantly improves on existing multi-label classification approaches.
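The classifier-chain mechanism that PCC-FS builds on can be made concrete with a minimal implementation (an illustration only: the fixed label order, the logistic-regression base learner, and the synthetic data are assumptions, and PCC-FS additionally prunes the chain and the feature set). Each binary classifier receives the original features plus the labels earlier in the chain:

```python
# A minimal classifier chain: downstream labels see upstream label values.
import numpy as np
from sklearn.linear_model import LogisticRegression

class SimpleClassifierChain:
    def __init__(self, order):
        self.order = order            # label ordering (chosen at random in CC)
        self.models = []

    def fit(self, X, Y):
        Xa = X.copy()
        for j in self.order:
            m = LogisticRegression(max_iter=1000).fit(Xa, Y[:, j])
            self.models.append(m)
            Xa = np.hstack([Xa, Y[:, [j]]])      # feed true label downstream
        return self

    def predict(self, X):
        Xa = X.copy()
        preds = np.zeros((X.shape[0], len(self.order)), dtype=int)
        for m, j in zip(self.models, self.order):
            p = m.predict(Xa)
            preds[:, j] = p
            Xa = np.hstack([Xa, p.reshape(-1, 1)])   # feed prediction downstream
        return preds

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
y0 = (X[:, 0] > 0).astype(int)
y1 = y0 & (X[:, 1] > 0).astype(int)   # label 1 depends on label 0
Y = np.stack([y0, y1], axis=1)
cc = SimpleClassifierChain(order=[0, 1]).fit(X, Y)
acc = float((cc.predict(X) == Y).mean())
```

Because label 1 depends on label 0, the chain can exploit the coupling that independent per-label classifiers would miss; reversing the order would illustrate drawback (1).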


Machines ◽  
2018 ◽  
Vol 6 (4) ◽  
pp. 65 ◽  
Author(s):  
Jingwei Too ◽  
Abdul Abdullah ◽  
Norhashimah Mohd Saad ◽  
Nursabillilah Mohd Ali

Electromyography (EMG) has been widely used in rehabilitation and myoelectric prosthetic applications. However, the recent growth in the number of EMG features has led to high-dimensional feature vectors, which degrade classification performance and increase the complexity of the recognition system. In this paper, we propose two new feature selection methods based on a tree growth algorithm (TGA) for EMG signal classification. In the first approach, two transfer functions are implemented to convert the continuous TGA into a binary version. In the second approach, swap, crossover, and mutation operators are introduced into a modified binary tree growth algorithm (MBTGA) to enhance its exploitation and exploration behaviors. In this study, the short-time Fourier transform (STFT) is employed to transform the EMG signals into a time-frequency representation. Features are then extracted from the STFT coefficients to form a feature vector, and the proposed feature selection methods are applied to find the best feature subset from the large available feature set. The experimental results show the superiority of MBTGA not only in terms of feature reduction but also in classification performance.
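The transfer-function step in the first approach can be sketched as follows (a hedged illustration, not the paper's code: the sigmoid S-shaped function, the random-threshold rule, and the example positions are assumptions). Each continuous TGA position is mapped to a probability and then thresholded to decide whether the corresponding EMG feature is kept:

```python
# Binarizing a continuous metaheuristic position vector via a transfer function.
import numpy as np

def s_shaped_transfer(positions, rng):
    prob = 1.0 / (1.0 + np.exp(-positions))        # sigmoid maps R -> (0, 1)
    return (rng.random(positions.shape) < prob).astype(int)

rng = np.random.default_rng(42)
continuous = np.array([-6.0, -6.0, 6.0, 6.0])      # strongly off / strongly on
binary = s_shaped_transfer(continuous, rng)        # 1 = feature selected
```

Strongly negative positions almost always yield 0 (feature dropped) and strongly positive ones yield 1, so the binary search space inherits the continuous algorithm's dynamics.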


Author(s):  
Lidia S. Chao ◽  
Derek F. Wong ◽  
Philip C. L. Chen ◽  
Wing W. Y. Ng ◽  
Daniel S. Yeung

Ordinary feature selection methods select only the explicitly relevant attributes by filtering out the irrelevant ones, trading selection accuracy for execution time and complexity. In the process, the hidden supportive information possessed by the irrelevant attributes may be lost, so such methods may miss some good attribute combinations. We believe that attributes that are useless for the classification task by themselves may sometimes provide potentially useful supportive information to other attributes and thus benefit the classification task. A strategy that retains them can minimize the information lost and is therefore able to maximize classification accuracy, especially for datasets that contain hidden interactions among attributes. This paper proposes a feature selection methodology from a new angle: it selects not only the relevant features but also targets the potentially useful falsely irrelevant attributes by measuring their supportive importance to other attributes. The empirical results validate the hypothesis by demonstrating that the proposed approach outperforms most state-of-the-art filter-based feature selection methods.
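The premise that individually useless attributes can be jointly informative is easy to demonstrate (a synthetic illustration, not the paper's measure of supportive importance: the XOR construction and the joint-vs-marginal mutual-information comparison are assumptions):

```python
# Two attributes with ~zero individual relevance are jointly fully informative.
import numpy as np
from sklearn.metrics import mutual_info_score

rng = np.random.default_rng(7)
a = rng.integers(0, 2, 5000)
b = rng.integers(0, 2, 5000)
y = a ^ b                              # class depends only on the pair (a, b)

mi_a = mutual_info_score(a, y)         # ~0: a alone looks irrelevant
mi_b = mutual_info_score(b, y)         # ~0: b alone looks irrelevant
joint = a * 2 + b                      # encode the (a, b) pair as one variable
mi_ab = mutual_info_score(joint, y)    # ~ln(2): the pair determines the class
```

A filter that discards `a` and `b` for their zero marginal relevance would throw away everything, which is exactly the failure mode the proposed method targets.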


2014 ◽  
Vol 988 ◽  
pp. 511-516 ◽  
Author(s):  
Jin Tao Shi ◽  
Hui Liang Liu ◽  
Yuan Xu ◽  
Jun Feng Yan ◽  
Jian Feng Xu

Machine learning is an important approach in research on Chinese text sentiment categorization, and text feature selection is critical to classification performance. However, while the classical feature selection methods perform well on the global categories, they miss many representative feature words of individual categories. This paper presents an improved information gain method that integrates word frequency and the sentiment degree of feature words into the traditional information gain method. Experiments show that a classifier improved by this method achieves better classification performance.
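Classical information gain, and the kind of frequency weighting the paper adds to it, can be sketched as follows (a hedged illustration: the toy corpus, the presence/absence formulation, and the simple relative-frequency weight are assumptions, not the paper's exact formula):

```python
# Information gain for text feature selection, plus a word-frequency weighting.
import math
from collections import Counter

def entropy(items):
    n = len(items)
    if n == 0:
        return 0.0
    return -sum(c / n * math.log(c / n) for c in Counter(items).values())

def information_gain(docs, labels, term):
    """Classical IG over binary term presence (natural-log entropy)."""
    present = [lab for d, lab in zip(docs, labels) if term in d]
    absent = [lab for d, lab in zip(docs, labels) if term not in d]
    n = len(labels)
    cond = len(present) / n * entropy(present) + len(absent) / n * entropy(absent)
    return entropy(labels) - cond

def weighted_ig(docs, labels, term):
    # Illustrative "improved" score: scale IG by the term's relative frequency.
    tf = sum(d.count(term) for d in docs)
    total = sum(len(d) for d in docs)
    return information_gain(docs, labels, term) * (tf / total)

docs = [["good", "movie"], ["good", "plot"], ["bad", "movie"], ["bad", "acting"]]
labels = ["pos", "pos", "neg", "neg"]
ig_good = information_gain(docs, labels, "good")    # perfectly discriminative
ig_movie = information_gain(docs, labels, "movie")  # carries no class signal
w_good = weighted_ig(docs, labels, "good")
```

Here "good" separates the classes perfectly (IG = ln 2) while "movie" contributes nothing (IG = 0); the frequency weight then favors discriminative words that also occur often in a category.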


2018 ◽  
Vol 8 (3) ◽  
pp. 46-67 ◽  
Author(s):  
Mehrnoush Barani Shirzad ◽  
Mohammad Reza Keyvanpour

This article describes how feature selection for learning-to-rank algorithms has become an interesting issue. Noisy and irrelevant features hurt performance and cause overfitting in ranking systems, so reducing the number of features by eliminating irrelevant and noisy ones is a solution. Several studies have applied feature selection to learning to rank, improving the efficiency and effectiveness of ranking models. As the number of features, and consequently the number of irrelevant and noisy features, keeps increasing, a systematic review of feature selection methods for learning to rank is required. In this article, a framework to examine research on feature selection for learning to rank (FSLR) is proposed. Under this framework, the authors review the most state-of-the-art methods and suggest several criteria for analyzing them. FSLR offers a structured classification of current algorithms that future research can use to (a) properly select strategies from existing algorithms using certain criteria or (b) find ways to develop existing methodologies.


2021 ◽  
pp. 1-11
Author(s):  
Carolina Martín-del-Campo-Rodríguez ◽  
Grigori Sidorov ◽  
Ildar Batyrshin

This paper presents a computational model for the unsupervised authorship attribution task based on a traditional machine learning scheme. An improvement over the state of the art is achieved by comparing different feature selection methods on the PAN17 author clustering dataset. To achieve this improvement, specific pre-processing and feature extraction methods are proposed, such as a method to separate tokens by type so that each is assigned to only one category. Similarly, special characters are treated as part of the punctuation marks to improve the results obtained when applying typed character n-grams. A weighted cosine similarity measure is applied to improve the B³ F-score by reducing the vector values where attributes are exclusive. This measure is used to define distances between documents, which are then used by the clustering algorithm to perform authorship attribution.
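The document-similarity step can be sketched with plain (unweighted) cosine similarity over character n-grams (an illustration under assumptions: the trigram size, the toy documents, and the use of unweighted cosine are ours; the paper's weighted variant additionally down-weights attributes that appear in only one of the two documents):

```python
# Character n-gram vectors compared with cosine similarity, as a clustering input.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

docs = ["the cat sat on the mat",
        "the cat lay on the mat",
        "quantum flux capacitors overload"]
vec = CountVectorizer(analyzer="char", ngram_range=(3, 3))  # char trigrams
X = vec.fit_transform(docs)
sim = cosine_similarity(X)    # pairwise similarities; 1 - sim gives distances
```

Documents by the same "author" share many trigrams and score high, while the unrelated document scores near zero, which is the signal the clustering algorithm exploits.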


2021 ◽  
Vol 13 (14) ◽  
pp. 2833
Author(s):  
Xing Wei ◽  
Marcela A. Johnson ◽  
David B. Langston ◽  
Hillary L. Mehl ◽  
Song Li

Hyperspectral sensors combined with machine learning are increasingly utilized in agricultural crop systems for diverse applications, including plant disease detection. This study was designed to identify the most important wavelengths to discriminate between healthy and diseased peanut (Arachis hypogaea L.) plants infected with Athelia rolfsii, the causal agent of peanut stem rot, using in-situ spectroscopy and machine learning. In greenhouse experiments, daily measurements were conducted to inspect disease symptoms visually and to collect spectral reflectance of peanut leaves on lateral stems of plants mock-inoculated and inoculated with A. rolfsii. Spectrum files were categorized into five classes based on foliar wilting symptoms. Five feature selection methods were compared to select the top 10 ranked wavelengths with and without a custom minimum distance of 20 nm. Recursive feature elimination methods outperformed the chi-square and SelectFromModel methods. Adding the minimum distance of 20 nm into the top selected wavelengths improved classification performance. Wavelengths of 501–505, 690–694, 763 and 884 nm were repeatedly selected by two or more feature selection methods. These selected wavelengths can be applied in designing optical sensors for automated stem rot detection in peanut fields. The machine-learning-based methodology can be adapted to identify spectral signatures of disease in other plant-pathogen systems.
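The minimum-distance rule used to diversify the top-ranked wavelengths can be sketched greedily (a hedged illustration: the helper name, the toy ranking values, and `k=4` are assumptions, not the study's code). Walking down the ranked list, a wavelength is kept only if it is at least 20 nm from every wavelength already kept:

```python
# Greedy selection of top-ranked wavelengths with a 20 nm minimum separation.
def select_with_min_distance(ranked_wavelengths, k=10, min_dist=20):
    kept = []
    for wl in ranked_wavelengths:          # best-ranked first
        if all(abs(wl - kw) >= min_dist for kw in kept):
            kept.append(wl)
        if len(kept) == k:
            break
    return kept

# Toy ranking: without the constraint, 501-505 nm would crowd the top slots.
ranked = [503, 501, 505, 692, 690, 694, 763, 884]
top = select_with_min_distance(ranked, k=4)
```

The constraint prevents near-duplicate neighbors from one spectral region from dominating the top-10 set, which matches the reported improvement in classification performance.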


2020 ◽  
Vol 16 (3) ◽  
pp. 168-182
Author(s):  
Zi-Hung You ◽  
Ya-Han Hu ◽  
Chih-Fong Tsai ◽  
Yen-Ming Kuo

Opinion mining focuses on extracting polarity information from texts. For textual term representation, different feature selection methods, e.g., term frequency (TF) or term frequency–inverse document frequency (TF–IDF), can yield diverse numbers of text features. In text classification, however, a selected training set may contain noisy documents (or outliers), which can degrade classification performance. To solve this problem, instance selection can be adopted to filter out unrepresentative training documents. This article therefore investigates opinion mining performance when the feature selection and instance selection steps are considered simultaneously. Two combination processes, based on performing feature selection and instance selection in different orders, were compared. Specifically, two feature selection methods, namely TF and TF–IDF, and two instance selection methods, namely DROP3 and IB3, were employed for comparison. Experimental results from developing sentiment classifiers on three Twitter datasets showed that TF–IDF followed by DROP3 performs best.
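The winning pipeline order (feature selection first, instance selection second) can be sketched as follows (an illustration under assumptions: the toy documents are ours, and a simple edited-nearest-neighbour filter stands in for DROP3, which is considerably more involved):

```python
# TF-IDF representation first, then a nearest-neighbour instance filter.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

docs = ["great phone love it", "love this great screen",
        "terrible battery awful", "awful phone terrible screen",
        "great battery love it", "love it"]
labels = np.array([1, 1, 0, 0, 1, 1])

# Step 1: term representation / feature weighting (TF-IDF).
X = TfidfVectorizer().fit_transform(docs).toarray()

# Step 2: instance selection - drop training documents whose nearest
# neighbour (excluding themselves) disagrees with their label.
nn = NearestNeighbors(n_neighbors=2).fit(X)
neigh = nn.kneighbors(X, return_distance=False)[:, 1]   # nearest non-self
keep = labels[neigh] == labels
X_clean, y_clean = X[keep], labels[keep]
```

Running instance selection on the already-weighted vectors means outliers are judged in the same space the final classifier will use, which is one plausible reason this ordering performed best.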


2021 ◽  
Author(s):  
Ping Zhang ◽  
Jiyao Sheng ◽  
Wanfu Gao ◽  
Juncheng Hu ◽  
Yonghao Li

Multi-label feature selection attracts considerable attention in multi-label learning. Information-theory-based multi-label feature selection methods aim to select the most informative features and reduce the uncertainty of the labels. Previous methods regard this uncertainty as constant. In fact, as the classification information of the label set is captured by features, the remaining uncertainty of each label changes dynamically. In this paper, we categorize labels into two groups: one contains the labels with little remaining uncertainty, meaning that most of the classification information with respect to these labels has already been obtained by the selected features; the other contains the labels with extensive remaining uncertainty, meaning that their classification information has been neglected by the already-selected features. Feature selection should aim to select new features that are highly relevant to the labels in the second group. Existing methods do not distinguish between the two label groups and ignore the dynamically changing amount of label information. To this end, a Relevancy Ratio is designed to quantify the dynamically changing amount of information of each label given the already-selected features. A Weighted Feature Relevancy is then defined to evaluate candidate features, and a new multi-label Feature Selection method based on Weighted Feature Relevancy (WFRFS) is proposed. Experiments show encouraging results for WFRFS in comparison to six multi-label feature selection methods on thirteen real-world data sets.
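The label-weighting idea can be sketched as follows (a hedged illustration, not the authors' exact WFRFS formulas: the plug-in entropy estimates, the remaining-uncertainty ratio, and the toy data are assumptions). Each label is weighted by the share of its uncertainty that remains given the already-selected features, steering candidate features toward the neglected labels:

```python
# Weighting label relevance by each label's remaining uncertainty.
import numpy as np
from sklearn.metrics import mutual_info_score

def entropy(v):
    _, counts = np.unique(v, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum())

def conditional_entropy(y, cond):
    joint = y * (cond.max() + 1) + cond       # unique code per (y, cond) pair
    return entropy(joint) - entropy(cond)

def weighted_relevancy(f, Y, selected):
    """Score feature f; labels with more remaining uncertainty weigh more."""
    score = 0.0
    for j in range(Y.shape[1]):
        h = entropy(Y[:, j])
        if h == 0:
            continue
        weight = conditional_entropy(Y[:, j], selected) / h   # remaining share
        score += weight * mutual_info_score(f, Y[:, j])
    return score

rng = np.random.default_rng(3)
X = rng.integers(0, 2, size=(400, 3))
Y = np.stack([X[:, 0], X[:, 1]], axis=1)      # label 0 <- f0, label 1 <- f1
s1 = weighted_relevancy(X[:, 1], Y, X[:, 0])  # targets the neglected label 1
s2 = weighted_relevancy(X[:, 2], Y, X[:, 0])  # pure noise feature
```

With feature 0 already selected, label 0 carries almost no remaining uncertainty and is weighted near zero, so the feature relevant to the still-uncertain label 1 scores far higher than the noise feature.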

