Feature Selection and Classification for High-Dimensional Incomplete Multimodal Data

Mathematical Problems in Engineering ◽

10.1155/2018/1583969 ◽

2018 ◽

Vol 2018 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Wan-Yu Deng ◽

Dan Liu ◽

Ying-Ying Dong

Keyword(s):

Feature Selection ◽

Data Fusion ◽

Classification Accuracy ◽

Missing Values ◽

High Dimensional Data ◽

Complete Data ◽

Experimental Results ◽

High Dimensional ◽

Multimodal Data ◽

Fusion Methods

Because of missing values, incomplete datasets are ubiquitous in multimodal settings. Complete data is a prerequisite for most existing multimodal data fusion methods. We propose a feature selection and classification method for incomplete, high-dimensional multimodal data. The method focuses on extracting the most relevant features from the high-dimensional feature set and thereby improving classification accuracy. Experimental results show that the method performs considerably better on incomplete multimodal data, such as the ADNI and Office datasets, than in the complete-data case.

A comparative study of various feature selection techniques in high-dimensional data set to improve classification accuracy

2015 International Conference on Computer Communication and Informatics (ICCCI) ◽

10.1109/iccci.2015.7218098 ◽

2015 ◽

Cited By ~ 3

Author(s):

Kandarp P. Shroff ◽

Hardik H. Maheta

Keyword(s):

Feature Selection ◽

Comparative Study ◽

Classification Accuracy ◽

High Dimensional Data ◽

High Dimensional ◽

Data Set ◽

Feature Selection Techniques

DEPSOSVM: variant of differential evolution based on PSO for image and text data classification

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-01-2020-0004 ◽

2020 ◽

Vol 13 (2) ◽

pp. 223-238

Author(s):

Abhishek Dixit ◽

Ashish Mani ◽

Rohit Bansal

Keyword(s):

Feature Selection ◽

Differential Evolution ◽

Classification Accuracy ◽

High Dimensional Data ◽

High Dimensional ◽

SVM Classifier ◽

Text Data ◽

Data Set ◽

Content Type ◽

Mutation Strategy

Purpose: Feature selection is an important data pre-processing step, especially for high-dimensional data sets. A model trained on a high-dimensional data set performs worse and yields poor classification accuracy. Feature selection should therefore be applied to the data set before training to improve performance and classification accuracy.

Design/methodology/approach: A novel optimization approach that hybridizes binary particle swarm optimization (BPSO) and differential evolution (DE) to fine-tune an SVM classifier is presented. The implemented classifier is named DEPSOSVM.

Findings: The approach is evaluated on 20 UCI benchmark text classification data sets. Its performance is also evaluated on a UCI benchmark image data set of cancer images. The results show that the proposed DEPSOSVM technique significantly improves performance over other feature selection algorithms in the literature. It also achieves better classification accuracy.

Originality/value: The proposed approach differs from previous work, all of which uses the DE/(rand/1) mutation strategy, whereas this study uses DE/(rand/2) and updates the mutation strategy with BPSO. Another difference lies in the crossover approach, which uses a novel method of comparing the best particle with a sigmoid function. The core contribution of this paper is to hybridize DE with BPSO, combined with an SVM classifier (DEPSOSVM), to handle feature selection problems.
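The abstract above names two standard ingredients: the DE/rand/2 mutation strategy and a sigmoid function used with binary particles. The sketch below shows only the textbook forms of these two pieces, as a rough illustration. It is not the authors' DEPSOSVM: how the paper combines DE with BPSO, its crossover rule, and the SVM-based fitness are not given in the abstract and are left out. The function names, the scale factor `f=0.5`, and the toy vectors are assumptions made for this example.

```python
import math
import random


def de_rand_2(a, b, c, d, e, f=0.5):
    """Textbook DE/rand/2 mutant: a + F*(b - c) + F*(d - e).

    a..e stand for five distinct, randomly chosen population members;
    f is the scale factor F (0.5 is an illustrative value).
    """
    return [ai + f * (bi - ci) + f * (di - ei)
            for ai, bi, ci, di, ei in zip(a, b, c, d, e)]


def sigmoid(x):
    """Logistic function mapping a real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def binarize(vector, rng):
    """Standard binary-PSO transfer step.

    Bit j becomes 1 with probability sigmoid(vector[j]); a 1 means
    "keep feature j" in a feature selection mask.
    """
    return [1 if rng.random() < sigmoid(v) else 0 for v in vector]


# Toy inputs (hypothetical): five 2-dimensional population members.
mutant = de_rand_2([1, 1], [2, 0], [0, 0], [0, 4], [0, 0], f=0.5)

# Extreme values make the outcome effectively certain: the first bit
# is kept, the second is dropped.
mask = binarize([50.0, -50.0], random.Random(0))
```

In a full wrapper method, each candidate mask would be scored by training a classifier on the selected columns; that step is omitted here because the abstract does not describe the fitness function.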

BagMeLiF: stable boosting-based hybrid-ensemble feature selection algorithm for high-dimensional data

2020 International Conference on Control, Robotics and Intelligent System ◽

10.1145/3437802.3437835 ◽

2020 ◽

Author(s):

Nikita Pilnenskiy ◽

Ivan Smetannikov

Keyword(s):

Feature Selection ◽

High Dimensional Data ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm

On fuzzy feature selection in designing fuzzy classifiers for high-dimensional data

Evolving Systems ◽

10.1007/s12530-015-9142-4 ◽

2015 ◽

Vol 7 (4) ◽

pp. 255-265 ◽

Cited By ~ 6

Author(s):

Eghbal G. Mansoori ◽

Khadijeh S. Shafiee

Keyword(s):

Feature Selection ◽

High Dimensional Data ◽

High Dimensional ◽

Fuzzy Classifiers ◽

Fuzzy Feature Selection

A hybrid feature selection algorithm combining ReliefF and Particle swarm optimization for high-dimensional medical data

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-202948 ◽

2021 ◽

pp. 1-15

Author(s):

Zhaozhao Xu ◽

Derong Shen ◽

Yue Kou ◽

Tiezheng Nie

Keyword(s):

Feature Selection ◽

Particle Swarm Optimization ◽

Random Forest ◽

Classification Accuracy ◽

Particle Swarm ◽

Medical Data ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm ◽

Swarm Optimization

Because of high-dimensional features and strong correlation among features, the classification accuracy of medical data is often lower than expected. Feature selection is a common approach to this problem: it selects effective features by reducing the dimensionality of high-dimensional data. However, traditional feature selection algorithms set thresholds blindly, and their search algorithms are liable to fall into a local optimum. To address this, this paper proposes a hybrid feature selection algorithm combining ReliefF and particle swarm optimization. The algorithm has three main parts. First, ReliefF is used to calculate feature weights, and the features are ranked by weight. Then, the ranked features are grouped by density equalization, so that the density of features in each group is the same. Finally, the particle swarm optimization algorithm searches the ranked feature groups, and feature selection is performed according to a new fitness function. Experimental results show that random forest achieves the highest classification accuracy on the selected features. More importantly, it does so with the fewest features. In addition, experimental results on 2 medical datasets show that the average accuracy of random forest reaches 90.20%, which indicates that the hybrid algorithm has some application value.
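The first step described above, weighting features and ranking them, can be illustrated with basic Relief, the simpler predecessor of ReliefF. Basic Relief handles two classes and uses a single nearest hit and nearest miss per sample, whereas ReliefF averages over k neighbours and supports multiple classes. The sketch below is therefore a simplified stand-in and not the paper's implementation. It visits every sample once instead of sampling at random, and the toy data and function names are assumptions. The density-equalization grouping and the PSO search are not specified in the abstract and are omitted.

```python
def relief_weights(X, y):
    """Basic Relief weights for numeric features and two classes."""
    n, d = len(X), len(X[0])

    # Per-feature value range, used to scale differences into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(a[j] - b[j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    w = [0.0] * d
    for i in range(n):
        same = [k for k in range(n) if k != i and y[k] == y[i]]
        other = [k for k in range(n) if y[k] != y[i]]
        hit = min(same, key=lambda k: dist(X[i], X[k]))    # nearest same-class sample
        miss = min(other, key=lambda k: dist(X[i], X[k]))  # nearest other-class sample
        for j in range(d):
            # Reward features that differ on the miss, penalize those
            # that differ on the hit.
            w[j] += (diff(X[i], X[miss], j) - diff(X[i], X[hit], j)) / n
    return w


def rank_features(weights):
    """Feature indices sorted from highest to lowest weight."""
    return sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.0, 0.3], [0.1, 0.9], [0.2, 0.1],
     [0.9, 0.2], [1.0, 0.8], [0.8, 0.5]]
y = [0, 0, 0, 1, 1, 1]

weights = relief_weights(X, y)
ranking = rank_features(weights)
```

On this toy data the informative feature receives a positive weight and the noisy one a negative weight, so it is ranked first. A search stage such as PSO would then operate on the ranked features.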

Clustering-guided particle swarm feature selection algorithm for high-dimensional imbalanced data with missing values

IEEE Transactions on Evolutionary Computation ◽

10.1109/tevc.2021.3106975 ◽

2021 ◽

pp. 1-1

Author(s):

Yong Zhang ◽

Yan-hu Wang ◽

Dun-wei Gong ◽

Xiao-yan Sun

Keyword(s):

Feature Selection ◽

Missing Values ◽

Particle Swarm ◽

Imbalanced Data ◽

High Dimensional ◽

Selection Algorithm ◽

Feature Selection Algorithm

A Hybrid Feature Selection Method Based on Symmetrical Uncertainty and Support Vector Machine for High-Dimensional Data Classification

Intelligent Information and Database Systems - Lecture Notes in Computer Science ◽

10.1007/978-3-319-54472-4_67 ◽

2017 ◽

pp. 721-727 ◽

Cited By ~ 2

Author(s):

Yongjun Piao ◽

Keun Ho Ryu

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

High Dimensional Data ◽

Feature Selection Method ◽

Data Classification ◽

Selection Method ◽

High Dimensional ◽

Support Vector ◽

Symmetrical Uncertainty

Statistics analysis on SPOT 5 classification accuracy of different data fusion methods

10.1117/12.838450 ◽

2009 ◽

Author(s):

Guifang Liu ◽

Heli Lu

Keyword(s):

Data Fusion ◽

Classification Accuracy ◽

Statistics Analysis ◽

Fusion Methods ◽

SPOT 5

High dimensional data classification and feature selection using support vector machines

European Journal of Operational Research ◽

10.1016/j.ejor.2017.08.040 ◽

2018 ◽

Vol 265 (3) ◽

pp. 993-1004 ◽

Cited By ~ 63

Author(s):

Bissan Ghaddar ◽

Joe Naoum-Sawaya

Keyword(s):

Feature Selection ◽

Support Vector Machines ◽

High Dimensional Data ◽

Data Classification ◽

High Dimensional ◽

Support Vector ◽

Vector Machines

Opportunities and Challenges of Feature Selection Methods for High Dimensional Data: A Review

Ingénierie des systèmes d'information ◽

10.18280/isi.260107 ◽

2021 ◽

Vol 26 (1) ◽

pp. 67-77

Author(s):

Siva Sankari Subbiah ◽

Jayakumar Chinnappan

Keyword(s):

Feature Selection ◽

Big Data ◽

Large Scale ◽

High Dimensional Data ◽

Research Work ◽

Basic Feature ◽

High Dimensional ◽

Selection Methods ◽

Fast Development ◽

Improved Accuracy

Nowadays, organizations collect huge volumes of data without knowing its usefulness. The fast development of the Internet helps organizations capture data in many different formats through the Internet of Things (IoT), social media, and other disparate sources. Dataset dimensions increase day by day at an extraordinary rate, resulting in large-scale datasets with high dimensionality. This paper reviews the opportunities and challenges of feature selection for processing high-dimensional data with reduced complexity and improved accuracy. In the modern big data world, feature selection plays a significant role in reducing dimensionality and overfitting in the learning process. Researchers have proposed many feature selection methods for obtaining more relevant features, especially from big datasets, which help provide accurate learning results without degrading performance. This paper discusses the importance of feature selection, basic feature selection approaches, centralized and distributed big data processing using Hadoop and Spark, and the challenges of feature selection, and it summarizes related research work. As a result, big data analysis with feature selection improves learning accuracy.
