A Review of Feature Selection and Feature Extraction Methods Applied on Microarray Data

2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Zena M. Hira ◽  
Duncan F. Gillies

We summarise various ways of performing dimensionality reduction on high-dimensional microarray data. Many different feature selection and feature extraction methods exist and are widely used. All of these methods aim to remove redundant and irrelevant features so that classification of new instances will be more accurate. A popular source of data is microarrays, a biological platform for gathering gene expressions. Analysing microarrays can be difficult because of the size of the data they provide. In addition, the complicated relations among the different genes make analysis more difficult, and removing excess features can improve the quality of the results. We present some of the most popular methods for selecting significant features and provide a comparison between them. Their advantages and disadvantages are outlined to give a clearer idea of when to use each of them to save computational time and resources.
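The difference between the two families named in this abstract can be illustrated with a minimal sketch. It is not taken from the review itself: the variance filter, the power-iteration estimate of the first principal component, the function names, and the toy matrix are all assumptions chosen for brevity. Selection keeps a subset of the original columns (genes), while extraction builds a new feature as a weighted combination of all of them.

```python
def center(X):
    """Subtract the column mean from every entry of X (list of rows)."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def select_top_variance(X, k):
    """Feature selection (illustrative filter): indices of the k highest-variance columns."""
    Xc = center(X)
    var = [sum(v * v for v in col) / (len(X) - 1) for col in zip(*Xc)]
    top = sorted(range(len(var)), key=lambda j: -var[j])[:k]
    return sorted(top)


def first_principal_component(X, iters=200):
    """Feature extraction: leading PCA direction via power iteration on Xc^T Xc.

    Assumes the all-ones start vector is not orthogonal to the leading direction.
    """
    Xc = center(X)
    d = len(Xc[0])
    w = [1.0] * d
    for _ in range(iters):
        scores = [sum(a * b for a, b in zip(row, w)) for row in Xc]
        w = [sum(s * row[j] for s, row in zip(scores, Xc)) for j in range(d)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
    projection = [sum(a * b for a, b in zip(row, w)) for row in Xc]
    return w, projection


# Toy data (hypothetical): 4 samples, 3 "genes"; gene 2 is nearly constant.
X = [[1.0, 2.0, 0.5],
     [2.0, 4.1, 0.4],
     [3.0, 5.9, 0.6],
     [4.0, 8.0, 0.5]]

kept = select_top_variance(X, 2)           # original columns survive unchanged
w, new_feature = first_principal_component(X)  # one new feature mixing all columns
```

On real microarray data the number of genes far exceeds the number of samples, so a practical pipeline would more likely rely on an established numerical library than on this pure-Python loop.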

Sensors ◽  
2019 ◽  
Vol 19 (4) ◽  
pp. 916 ◽  
Author(s):  
Wen Cao ◽  
Chunmei Liu ◽  
Pengfei Jia

Aroma plays a significant role in the quality of citrus fruits and processed products. Citrus volatiles can be detected and analysed with an electronic nose (E-nose); in this paper, an E-nose is employed to classify juice that has been stored for different numbers of days. Feature extraction and classification are two important requirements for an E-nose. During training, a classifier can optimize its own parameters to achieve better classification accuracy, but it cannot choose its input data, which is produced by feature extraction methods, so the classification result is not always ideal. Label consistent KSVD (L-KSVD) is a novel technique that extracts features and classifies the data at the same time, which can improve classification accuracy. We propose an enhanced L-KSVD, called E-LCKSVD, for the E-nose. In E-LCKSVD, we introduce a kernel function into the traditional L-KSVD and present a new technique for initializing its dictionary; finally, the weighted coefficients of the different parts of its objective function are studied, and enhanced quantum-behaved particle swarm optimization (EQPSO) is employed to optimize these coefficients. In the experimental section, we first find that the classification accuracy of KSVD and L-KSVD is improved with the help of the kernel function, which indicates that their ability to deal with nonlinear data is improved. We then compare the results of different dictionary initialization techniques and show that our proposed method performs better. Finally, we find the optimal values of the weighted coefficients of the objective function of E-LCKSVD, which allow the E-nose to reach better performance.


2020 ◽  
pp. 707-725
Author(s):  
Sujata Dash

Efficient classification and feature extraction techniques pave an effective way for diagnosing cancers from microarray datasets. It has been observed that conventional classification techniques have major limitations in discriminating the genes accurately. However, such problems can be addressed to a great extent by an ensemble technique. In this paper, a hybrid RotBagg ensemble framework is proposed to address this problem. The technique integrates the Rotation Forest and Bagging ensembles, which in turn preserves the basic characteristics of an ensemble architecture, i.e., diversity and accuracy. Three different feature selection techniques are employed to select subsets of genes to improve the effectiveness and generalization of the RotBagg ensemble. The efficiency is validated on five microarray datasets and compared with the results of the base learners. The experimental results show that the correlation-based FRFR with the PCA-based RotBagg ensemble forms a highly efficient classification model.


2019 ◽  
Vol 8 (3) ◽  
pp. 35-37
Author(s):  
R. Ravikumar ◽  
M. Babu Reddy

In machine learning, as the dimensionality of the data rises, the amount of data required to provide a reliable analysis grows exponentially. Many different feature selection and feature extraction methods exist for performing dimensionality reduction on high-dimensional microarray data, and they are widely used. All of these methods aim to remove redundant and irrelevant features so that classification of new instances will be more accurate. Analyzing microarrays can be difficult because of the size of the data they provide. In addition, the complicated relations among the different genes make analysis more difficult, and removing excess features can improve the quality of the results. Feature selection has been an active and fruitful field of research in the pattern recognition, machine learning, statistics and data mining communities. The main objective of feature selection, which is the subject of this paper, is to choose a subset of input variables by eliminating features.


CCIT Journal ◽  
2021 ◽  
Vol 14 (1) ◽  
pp. 21-27
Author(s):  
Rogayah Rogayah ◽  
Waliya Rahmawanti ◽  
Nur Azizah

The development of cellular devices makes accessing information in the form of text or images easier. In line with the growing field of computer vision, the range of processes used in image processing continues to increase. Image processing can be done by improving image quality (image enhancement) and by recovering images (image restoration). Feature extraction is divided into three types, namely shape feature extraction, texture feature extraction, and color feature extraction. Color-based feature extraction methods have been widely used by researchers in classifying various objects. This paper aims to review the technology that can be applied to image processing in a CBIR system with breast milk as the object, so that the quality of breast milk can be measured based on its color.
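As a rough illustration of color-based feature extraction in a CBIR setting, the sketch below builds a normalized joint RGB histogram and compares two images by histogram intersection. The abstract does not state which color features the reviewed systems use, so the histogram, the bin count, the function names, and the toy pixel values are all assumptions.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram of (r, g, b) tuples with values in 0..255."""
    hist = [0.0] * bins_per_channel ** 3
    count = 0
    for r, g, b in pixels:
        # Map each channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
        count += 1
    return [h / count for h in hist] if count else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] between two normalized histograms (1 means identical)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Hypothetical 4-pixel "images": one uniformly off-white, one half yellowish.
white = [(250, 250, 245)] * 4
cream = [(250, 250, 245)] * 2 + [(255, 230, 150)] * 2

h_white = color_histogram(white)
h_cream = color_histogram(cream)
similarity = histogram_intersection(h_white, h_cream)
```

A retrieval system built this way would rank stored images by their similarity to a query histogram; coarse bins make the comparison tolerant of small lighting changes at the cost of color resolution.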


Author(s):  
Vladimir Nikulin ◽  
Tian-Hsiang Huang ◽  
Geoffrey J. McLachlan

The method presented in this paper is novel as a natural combination of two mutually dependent steps. Feature selection is a key element (first step) in our classification system, which was employed during the 2010 International RSCTC data mining (bioinformatics) Challenge. The second step may be implemented using any suitable classifier such as linear regression, support vector machine or neural networks. We conducted leave-one-out (LOO) experiments with several feature selection techniques and classifiers. Based on the LOO evaluations, we decided to use feature selection with the separation type Wilcoxon-based criterion for all final submissions. The method presented in this paper was tested successfully during the RSCTC data mining Challenge, where we achieved the top score in the Basic track.
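The two-step scheme described above (rank features, then classify, evaluated by leave-one-out) can be sketched as follows. This is not the authors' implementation: the plain Wilcoxon rank-sum (Mann-Whitney U) score stands in for their "separation type" criterion, the nearest-centroid classifier is an arbitrary stand-in for "any suitable classifier", and all names and data are hypothetical. Features are re-selected inside every fold so that the held-out sample never influences the ranking.

```python
from statistics import mean


def rank_sum_score(values, labels):
    """|U - n0*n1/2| for one feature; larger means stronger class separation."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1  # average rank for tied values
        i = j + 1
    n1 = sum(labels)
    n0 = len(labels) - n1
    r1 = sum(r for r, y in zip(ranks, labels) if y == 1)
    u1 = r1 - n1 * (n1 + 1) / 2
    return abs(u1 - n0 * n1 / 2)


def select_features(X, y, k):
    """Step 1: indices of the k features with the highest rank-sum score."""
    scores = [rank_sum_score([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]


def nearest_centroid_predict(X_train, y_train, x, features):
    """Step 2 (stand-in classifier): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        rows = [r for r, t in zip(X_train, y_train) if t == label]
        centroid = [mean(r[j] for r in rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def loo_accuracy(X, y, k):
    """Leave-one-out accuracy with feature selection repeated in each fold."""
    hits = 0
    for i in range(len(X)):
        X_train, y_train = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        features = select_features(X_train, y_train, k)
        hits += nearest_centroid_predict(X_train, y_train, X[i], features) == y[i]
    return hits / len(X)


# Toy data (hypothetical): feature 0 separates the classes, feature 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.9, 2.0], [1.0, 5.5], [1.1, 0.5]]
y = [0, 0, 0, 1, 1, 1]
chosen = select_features(X, y, 1)
accuracy = loo_accuracy(X, y, 1)
```

Selecting features once on the full dataset and only then running leave-one-out would leak information from the held-out sample into the ranking, which is why the selection step sits inside the loop here.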


MethodsX ◽  
2021 ◽  
Vol 8 ◽  
pp. 101166
Author(s):  
Timothy J. Fawcett ◽  
Chad S. Cooper ◽  
Ryan J. Longenecker ◽  
Joseph P. Walton
