An Efficient Feature Subset Selection Algorithm for Classification of Multidimensional Dataset

The Scientific World JOURNAL ◽

10.1155/2015/821798 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 5

Author(s):

Senthilkumar Devaraj ◽

S. Paulraj

Keyword(s):

Feature Selection ◽

Subset Selection ◽

Feature Subset Selection ◽

Complex Nature ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Multidimensional Datasets ◽

Study Results

Multidimensional medical data classification has recently received increased attention from researchers working on machine learning and data mining. In a multidimensional dataset (MDD), each instance is associated with multiple class values. Because of this complex nature, selecting features and building a classifier from an MDD are typically more expensive or time-consuming. Therefore, we need a robust feature selection technique for selecting the optimum single subset of the features of the MDD for further analysis or to design a classifier. In this paper, an efficient feature selection algorithm is proposed for the classification of MDD. The proposed multidimensional feature subset selection (MFSS) algorithm yields a unique feature subset for further analysis or to build a classifier, and it offers a computational advantage on MDD compared with existing feature selection algorithms. The proposed work is applied to benchmark multidimensional datasets. The number of features was reduced to a minimum of 3% and a maximum of 30% by using the proposed MFSS. In conclusion, the study results show that MFSS is an efficient feature selection algorithm that does not affect classification accuracy even with the reduced number of features. The proposed MFSS algorithm is also suitable for both problem transformation and algorithm adaptation, and it has great potential in applications that generate multidimensional datasets.


SVM and KNN Based SGO Feature Selection Algorithm for Breast Cancer Diagnosis

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4428.038620 ◽

2020 ◽

Vol 8 (2S7) ◽

pp. 2237-2240

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Feature Selection ◽

Learning Algorithms ◽

Subset Selection ◽

Machine Learning Algorithms ◽

Feature Subset Selection ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm

In diagnosis and prediction systems, algorithms working on datasets with a high number of dimensions tend to take more time than those with fewer dimensions. Feature subset selection algorithms enhance the efficiency of Machine Learning algorithms in prediction problems by selecting a subset of the total features, thus pruning redundancy and noise. In this article, such a feature subset selection method is proposed and implemented to diagnose breast cancer using Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) algorithms. This feature selection algorithm is based on Social Group Optimization (SGO), an evolutionary algorithm. The proposed model achieves higher accuracy in diagnosing breast cancer than other feature selection-based Machine Learning algorithms.


Ensemble Based Classification of Sentiments Using Forest Optimization Algorithm

Data ◽

10.3390/data4020076 ◽

2019 ◽

Vol 4 (2) ◽

pp. 76 ◽

Cited By ~ 2

Author(s):

Mehreen Naz ◽

Kashif Zafar ◽

Ayesha Khan

Keyword(s):

Feature Selection ◽

Optimization Algorithm ◽

Subset Selection ◽

Feature Subset Selection ◽

Selection Problem ◽

Support Vector ◽

Feature Subset ◽

Hybrid Technique ◽

Computational Performance

Feature subset selection is a process that chooses a set of relevant features from a high-dimensional dataset to improve the performance of classifiers. The meaningful words extracted from the data form a set of features for sentiment analysis. Many evolutionary algorithms, like the Genetic Algorithm (GA) and Particle Swarm Optimization (PSO), have been applied to the feature subset selection problem, and computational performance can still be improved. This research presents a solution to the feature subset selection problem for classification of sentiments using ensemble-based classifiers. It consists of a hybrid technique of minimum redundancy and maximum relevance (mRMR) and Forest Optimization Algorithm (FOA)-based feature selection. Ensemble-based classification is implemented to optimize the results of individual classifiers. The Forest Optimization Algorithm as a feature selection technique has been applied to various classification datasets from the UCI machine learning repository. The classifiers used in the ensemble methods for the UCI repository datasets are k-Nearest Neighbor (k-NN) and Naïve Bayes (NB). For the classification of sentiments, a 15–20% improvement has been recorded. The dataset used for classification of sentiments is Blitzer’s dataset, which consists of reviews of electronic products. The results are further improved by an ensemble of k-NN, NB, and Support Vector Machine (SVM), with an accuracy of 95% for the sentiment classification task.
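The mRMR criterion named in this abstract can be illustrated with a small greedy selector. This is a minimal sketch for discrete features only, not the authors' hybrid mRMR/FOA method; the function names `mutual_information` and `mrmr`, the "relevance minus mean redundancy" scoring form, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())


def mrmr(columns, labels, k):
    """Greedily pick k features, each maximizing relevance minus mean redundancy."""
    relevance = {name: mutual_information(values, labels)
                 for name, values in columns.items()}
    selected = []
    candidates = set(columns)

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(mutual_information(columns[name], columns[s])
                         for s in selected) / len(selected)
        return relevance[name] - redundancy

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order).
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical toy data: the label is determined by "a" and "b" together;
# "a_dup" repeats "a" and therefore adds no new information.
columns = {
    "a":     [0, 0, 0, 0, 1, 1, 1, 1],
    "a_dup": [0, 0, 0, 0, 1, 1, 1, 1],
    "b":     [0, 0, 1, 1, 0, 0, 1, 1],
}
labels = [0, 0, 1, 1, 2, 2, 3, 3]
chosen = mrmr(columns, labels, 2)
```

On this toy data all three features are equally relevant on their own, so the second pick is decided by redundancy alone: the duplicate column is penalized and "b" is preferred.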


A Briefest Feature Subset Selection Algorithm Based on Preference Attribute

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.774-776.1816 ◽

2013 ◽

Vol 774-776 ◽

pp. 1816-1822

Author(s):

Kai Yang ◽

Yong Long Jin ◽

Zhi Jun He

Keyword(s):

Feature Selection ◽

Concept Lattice ◽

Subset Selection ◽

Feature Subset Selection ◽

Formal Concept ◽

Feature Subset ◽

Selection Algorithm ◽

Subjective Experiences ◽

The Given ◽

Concept Pairs

Concept lattice is the core data structure of formal concept analysis and represents the order relationship between the concepts iconically. Feature selection has been a focus of research in machine learning. It has been shown to be very effective in removing irrelevant and redundant features, increasing efficiency in the learning process, and obtaining more intelligible learned results. This paper proposes a new briefest feature subset selection algorithm based on preference attribute, building on the study of concept lattice theory. Users can put forward a preference attribute according to their subjective experiences, and all the briefest feature subsets containing the given attribute can be discovered by the algorithm. It first finds some special concept pairs and calculates their waned-value hypergraph, then obtains the minimal transversal of the hypergraph as the result. A practical example shows that the method is cogent and effective.
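The final step mentioned in this abstract, taking minimal transversals of a hypergraph, can be sketched on its own. The abstract does not describe how the waned-value hypergraph is built, so the example below assumes the hyperedges are already given; the function name `minimal_transversals` and the brute-force enumeration are illustrative choices, suitable only for small vertex sets.

```python
from itertools import combinations


def minimal_transversals(hyperedges):
    """Return all minimal vertex sets that intersect every hyperedge."""
    edges = [frozenset(e) for e in hyperedges]
    vertices = sorted(set().union(*edges)) if edges else []
    found = []
    # Enumerate candidates by increasing size, so any transversal that does
    # not contain an earlier result is guaranteed to be minimal.
    for size in range(1, len(vertices) + 1):
        for combo in combinations(vertices, size):
            candidate = frozenset(combo)
            if any(t <= candidate for t in found):
                continue  # contains a smaller transversal, so not minimal
            if all(candidate & e for e in edges):
                found.append(candidate)
    return found


# Hypothetical hypergraph over four attributes.
edges = [{"a", "b"}, {"b", "c"}, {"c", "d"}]
result = minimal_transversals(edges)
```

For this hypergraph no single attribute hits all three hyperedges, and the minimal transversals are the pairs {a, c}, {b, c}, and {b, d}.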


Reseach on Feature Selection Algorithm Based on the Margin of Support Vector Machine

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.333-335.1430 ◽

2013 ◽

Vol 333-335 ◽

pp. 1430-1434

Author(s):

Lin Fang Hu ◽

Lei Qiao ◽

Min De Huang

Keyword(s):

Feature Selection ◽

Weather Forecast ◽

Support Vector ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Optimal Hyperplane ◽

Optimal Feature Subset ◽

Optimal Feature

A feature selection algorithm based on the optimal hyperplane of SVM is proposed. Using the algorithm, the contribution of each feature in the candidate feature set to the classification is tested, and then the feature subset with the best classification ability is selected. The algorithm is used in the recognition of storm monomers in weather forecasting, and experimental data show that the classification ability of the features can be effectively evaluated; the optimal feature subset is selected to enhance the working performance of the classifier.


Efficient Feature Subset Selection Algorithm for High Dimensional Data

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i4.pp1880-1888 ◽

2016 ◽

Vol 6 (4) ◽

pp. 1880 ◽

Cited By ~ 3

Author(s):

Smita Chormunge ◽

Sudarson Jena

Keyword(s):

Feature Selection ◽

Information Gain ◽

High Dimensional Data ◽

Feature Subset Selection ◽

High Dimensional ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Computational Performance ◽

Optimal Feature Subset

The feature selection approach solves the dimensionality problem by removing irrelevant and redundant features. Existing feature selection algorithms take more time to obtain a feature subset for high dimensional data. This paper proposes a feature selection algorithm based on information gain measures for high dimensional data, termed IFSA (Information gain based Feature Selection Algorithm), to produce an optimal feature subset in efficient time and improve the computational performance of learning algorithms. The IFSA algorithm works in two steps: first, it applies a filter to the dataset; second, it produces a small feature subset by using the information gain measure. Extensive experiments are carried out to compare the proposed algorithm with other methods with respect to two different classifiers (Naive Bayes and IBK) on microarray and text data sets. The results demonstrate that IFSA not only produces the most select feature subset in efficient time but also improves the classifier performance.
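The information gain measure that IFSA relies on can be sketched as a simple ranking of discrete features. This is not the IFSA algorithm itself: the abstract does not specify the initial filter step, so only the information-gain ranking is shown, and the function names and toy data are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def top_k_by_information_gain(columns, labels, k):
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(values, labels)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data.
columns = {
    "f_copy":  [0, 0, 1, 1],  # identical to the label
    "f_noise": [0, 1, 0, 1],  # independent of the label
}
labels = [0, 0, 1, 1]
best = top_k_by_information_gain(columns, labels, 1)
```

Here the feature that copies the label has a gain of one bit and the independent feature has a gain of zero, so the ranking keeps "f_copy".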


Efficient Feature Subset Selection Algorithm for High Dimensional Data

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v6i4.9800 ◽

2016 ◽

Vol 6 (4) ◽

pp. 1880 ◽

Cited By ~ 1

Author(s):

Smita Chormunge ◽

Sudarson Jena

Keyword(s):

Feature Selection ◽

Information Gain ◽

High Dimensional Data ◽

Feature Subset Selection ◽

High Dimensional ◽

Feature Subset ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Computational Performance ◽

Optimal Feature Subset


A plug-in feature extraction and feature subset selection algorithm for classification of medicinal brain image data

2014 International Conference on Communication and Signal Processing ◽

10.1109/iccsp.2014.6950108 ◽

2014 ◽

Cited By ~ 4

Author(s):

A Veeramuthu ◽

S Meenakshi ◽

A Kameshwaran

Keyword(s):

Feature Extraction ◽

Subset Selection ◽

Image Data ◽

Feature Subset Selection ◽

Feature Subset ◽

Selection Algorithm ◽

Brain Image


A novel feature selection algorithm based on damping oscillation theory

PLoS ONE ◽

10.1371/journal.pone.0255307 ◽

2021 ◽

Vol 16 (8) ◽

pp. e0255307

Author(s):

Fujun Wang ◽

Xing Wang

Keyword(s):

Feature Selection ◽

Optimization Algorithm ◽

Euclidean Distance ◽

Oscillation Theory ◽

Feature Subset Selection ◽

Support Vector ◽

Data Sets ◽

Feature Subset ◽

Selection Algorithm ◽

Filter Model

Feature selection is an important task in big data analysis and information retrieval processing. It reduces the number of features by removing noise and extraneous data. In this paper, a feature subset selection algorithm based on damping oscillation theory and a support vector machine classifier is proposed. This algorithm is called the Maximum Kendall coefficient Maximum Euclidean Distance Improved Gray Wolf Optimization algorithm (MKMDIGWO). In MKMDIGWO, first, a filter model based on the Kendall coefficient and Euclidean distance is proposed, which is used to measure the correlation and redundancy of the candidate feature subset. Second, the wrapper model is an improved grey wolf optimization algorithm, whose position update formula has been improved in order to achieve optimal results. Third, the filter model and the wrapper model are dynamically adjusted by the damping oscillation theory to find an optimal feature subset. Therefore, MKMDIGWO achieves both the efficiency of the filter model and the high precision of the wrapper model. Experimental results on five UCI public data sets and two microarray data sets have demonstrated that the MKMDIGWO algorithm achieves higher classification accuracy than four other state-of-the-art algorithms. The maximum ACC value of the MKMDIGWO algorithm is at least 0.5% higher than that of the other algorithms on 10 data sets.
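The two measures named for the filter model can be computed as follows. This sketch assumes that "Kendall coefficient" means the Kendall rank correlation (tau-a, without tie correction) and that Euclidean distance is taken between feature columns; how MKMDIGWO combines the two is not stated in the abstract and is not reproduced here. The function name `kendall_tau` and the sample values are illustrative.

```python
from itertools import combinations
from math import dist


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant) pairs divided by all pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
        # Tied pairs (product == 0) count toward neither total.
    pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / pairs


# Hypothetical feature and target values.
feature = [1, 2, 3, 4]
target = [10, 20, 30, 40]
tau_same = kendall_tau(feature, target)            # fully concordant
tau_reversed = kendall_tau(feature, target[::-1])  # fully discordant
tau_mixed = kendall_tau([1, 2, 3], [1, 3, 2])      # two concordant, one discordant

# Euclidean distance between two feature columns (math.dist, Python 3.8+).
separation = dist([0.0, 0.0], [3.0, 4.0])
```

A filter in this spirit would favor features with a high absolute tau against the target and a large distance from features already chosen, although the exact weighting is left open here.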
