Gene selection via BPSO and Backward generation for cancer classification

Ahmed Bir-Jmel; Sidi Mohamed Douiri; Souad Elbernoussi

doi:10.1051/ro/2018059

Gene selection via BPSO and Backward generation for cancer classification

RAIRO - Operations Research ◽

10.1051/ro/2018059 ◽

2019 ◽

Vol 53 (1) ◽

pp. 269-288 ◽

Cited By ~ 2

Author(s):

Ahmed Bir-Jmel ◽

Sidi Mohamed Douiri ◽

Souad Elbernoussi

Keyword(s):

Gene Selection ◽

Cancer Classification ◽

Selection Problem ◽

Small Subset ◽

Large Set ◽

High Dimensions ◽

Hybrid Approaches ◽

Filter Methods ◽

Microarray Datasets

Gene expression data (DNA microarray) enable researchers to simultaneously measure the expression levels of several thousand genes. These expression levels are very important in the classification of different types of tumors. In this work, we are interested in gene selection, which is an essential step in data pre-processing for cancer classification. This selection makes it possible to extract a small subset of genes from a large set and to eliminate redundant, irrelevant or noisy genes. The combinatorial nature of the selection problem requires the development of specific techniques such as filters and wrappers, or hybrids combining several optimization processes. In this context, we propose two hybrid approaches (RBPSO-1NN and FBPSO-SVM) for the gene selection problem, based on the combination of filter methods (the Fisher criterion and the ReliefF algorithm), the BPSO metaheuristic and the Backward algorithm, using the SVM and 1NN classifiers to evaluate the relevance of the candidate subsets. To verify the performance of our methods, we tested them on eight well-known high-dimensional microarray datasets ranging from 2308 to 11225 genes. The experiments carried out on the different datasets show that our methods are very competitive with existing work.
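
The filter-then-search idea in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the two-class form of the Fisher score, the function names (`fisher_scores`, `top_genes`, `bpso_step`) and the parameter values (`w`, `c1`, `c2`, `v_max`) are all assumptions chosen for illustration. The first part ranks genes with a filter; the second shows one update of a binary PSO particle, where each bit marks whether a gene is kept.

```python
import math
import random
from statistics import mean, pvariance


def fisher_scores(samples, labels):
    """Two-class Fisher score per gene: (mean gap)^2 / (sum of class variances).

    Assumed form of the criterion; the paper may use a different variant.
    """
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError("this sketch handles exactly two classes")
    scores = []
    for j in range(len(samples[0])):
        a = [row[j] for row, y in zip(samples, labels) if y == classes[0]]
        b = [row[j] for row, y in zip(samples, labels) if y == classes[1]]
        gap = (mean(a) - mean(b)) ** 2
        spread = pvariance(a) + pvariance(b)
        if spread > 0:
            scores.append(gap / spread)
        else:
            # Constant gene in both classes: perfect separator or useless.
            scores.append(math.inf if gap > 0 else 0.0)
    return scores


def top_genes(samples, labels, k):
    """Indices of the k genes with the highest Fisher score (filter step)."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


def bpso_step(position, velocity, personal_best, global_best, rng,
              w=0.7, c1=1.5, c2=1.5, v_max=4.0):
    """One binary-PSO update of a particle (hypothetical parameter values).

    position, personal_best and global_best are 0/1 lists (1 = gene kept).
    The velocity is mapped through a sigmoid to a probability of setting
    each bit to 1.
    """
    new_position, new_velocity = [], []
    for x, v, p, g in zip(position, velocity, personal_best, global_best):
        v = w * v + c1 * rng.random() * (p - x) + c2 * rng.random() * (g - x)
        v = max(-v_max, min(v_max, v))        # clamp to avoid saturation
        prob_one = 1.0 / (1.0 + math.exp(-v))  # sigmoid transfer function
        new_velocity.append(v)
        new_position.append(1 if rng.random() < prob_one else 0)
    return new_position, new_velocity
```

In a wrapper setting, each particle's bit vector would be scored by a classifier (SVM or 1NN in the abstract) on the selected genes, and the personal and global bests updated accordingly; that evaluation loop is omitted here.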

Gene Selection in Cancer Classification Using Sparse Logistic Regression with L1/2 Regularization

Applied Sciences ◽

10.3390/app8091569 ◽

2018 ◽

Vol 8 (9) ◽

pp. 1569 ◽

Cited By ~ 3

Author(s):

Shengbing Wu ◽

Hongkun Jiang ◽

Haiwei Shen ◽

Ziyi Yang

Keyword(s):

Logistic Regression ◽

Gene Selection ◽

Classification Performance ◽

Cancer Classification ◽

Sparse Logistic Regression ◽

The Subject ◽

Selection For ◽

Microarray Datasets ◽

Sparse Methods

In recent years, gene selection for cancer classification based on the expression of a small number of gene biomarkers has been the subject of much research in genetics and molecular biology. The successful identification of gene biomarkers will help in the classification of different types of cancer and improve the prediction accuracy. Recently, regularized logistic regression using L1 regularization has been successfully applied in high-dimensional cancer classification to estimate the gene coefficients and perform gene selection simultaneously. However, the L1 penalty yields biased gene selection and does not have the oracle property. To address these problems, we investigate L1/2 regularized logistic regression for gene selection in cancer classification. Experimental results on three DNA microarray datasets demonstrate that our proposed method outperforms other commonly used sparse methods (L1 and LEN) in terms of classification performance.
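
For orientation, an L1/2-penalized logistic regression is usually written as below. The notation is an assumption and not taken from the paper: n samples, d genes, binary labels y_i, expression vectors x_i, and a tuning parameter lambda. Replacing the exponent 1/2 with 1 gives the L1 (lasso) penalty mentioned in the abstract.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}
  \Big\{ -\sum_{i=1}^{n} \big[ y_i \log \pi_i + (1 - y_i) \log (1 - \pi_i) \big]
         + \lambda \sum_{j=1}^{d} |\beta_j|^{1/2} \Big\},
\qquad
\pi_i = \frac{1}{1 + \exp\!\big(-(\beta_0 + x_i^{\top}\beta)\big)}
```

Genes whose coefficient is driven exactly to zero are the ones dropped from the model, which is how the penalty performs selection.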

A novel gene selection algorithm for cancer classification using microarray datasets

BMC Medical Genomics ◽

10.1186/s12920-018-0447-6 ◽

2019 ◽

Vol 12 (1) ◽

Cited By ~ 10

Author(s):

Russul Alanni ◽

Jingyu Hou ◽

Hasseeb Azzawi ◽

Yong Xiang

Keyword(s):

Gene Selection ◽

Cancer Classification ◽

Selection Algorithm ◽

Novel Gene ◽

Microarray Datasets ◽

Gene Selection Algorithm

HYBRIDIZATION OF GENETIC AND QUANTUM ALGORITHM FOR GENE SELECTION AND CLASSIFICATION OF MICROARRAY DATA

International Journal of Foundations of Computer Science ◽

10.1142/s0129054112400217 ◽

2012 ◽

Vol 23 (02) ◽

pp. 431-444 ◽

Cited By ~ 6

Author(s):

ALLANI ABDERRAHIM ◽

EL-GHAZALI TALBI ◽

MELLOULI KHALED

Keyword(s):

Microarray Data ◽

Gene Selection ◽

Quantum Algorithm ◽

High Accuracy ◽

High Dimensional ◽

Support Vector ◽

Small Subset ◽

Microarray Experiments ◽

Vector Machines

In this work, we hybridize the Genetic Quantum Algorithm with the Support Vector Machines classifier for gene selection and classification of high-dimensional microarray data. We named our algorithm GQA SVM. Its purpose is to identify a small subset of genes that can be used to separate two classes of samples with high accuracy. The approach was compared with different methods from the literature, in particular GA SVM and PSO SVM [2], on six datasets from microarray experiments dealing with cancer (leukemia, breast, colon, ovarian, prostate, and lung) and available on the Web. The experiments demonstrated the very good performance of the method. The first contribution shows that the GQA SVM algorithm is able to find genes of interest and improve the classification in a meaningful way. The second important contribution consists in the actual discovery of new and challenging results on the datasets used.

Hybrid Correlation based Gene Selection for Accurate Cancer Classification of Gene Expression Data

International Journal of Computer Applications ◽

10.5120/6170-8591 ◽

2012 ◽

Vol 43 (14) ◽

pp. 13-18 ◽

Cited By ~ 3

Author(s):

Vibhav Prakash Singh ◽

Singh Gaurav Arvind ◽

Arindam G Mahapatra

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Gene Selection ◽

Cancer Classification ◽

Expression Data ◽

Selection For

Application of Nature Inspired Soft Computing Techniques for Gene Selection: A Novel Frame Work for Classification of Cancer.

10.21203/rs.3.rs-1121838/v1 ◽

2022 ◽

Author(s):

Rabia Musheer Aziz

Keyword(s):

Gene Selection ◽

Optimization Technique ◽

Cuckoo Search ◽

Cancer Classification ◽

Prediction Errors ◽

Bee Colony ◽

Proposed Model ◽

Comparison Results ◽

Soft Computing Techniques

A modified Artificial Bee Colony (ABC) metaheuristic optimization technique is applied to cancer classification; it reduces the classifier's prediction errors and allows for faster convergence by selecting informative genes. The Cuckoo Search (CS) algorithm was used in the onlooker bee phase (exploitation phase) of ABC to boost performance by maintaining the balance between exploration and exploitation. The modified ABC algorithm was tuned using Naïve Bayes (NB) classifiers to further improve the accuracy of the model. Independent Component Analysis (ICA) is used for dimensionality reduction. In the first step, the reduced dataset is optimized using the modified ABC; in the second step, the optimized dataset is used to train the NB classifier. Extensive experiments were performed for a comprehensive comparative analysis of the proposed algorithm with well-known metaheuristic algorithms, namely the Genetic Algorithm (GA), used within the same framework for the classification of six high-dimensional cancer datasets. The comparison results showed that the proposed model with the CS algorithm achieves the highest performance, with maximum classification accuracy and fewer selected genes. This shows the effectiveness of the proposed algorithm for cancer classification, which is validated using ANOVA.

A phase diagram for gene selection and disease classification

10.1101/002360 ◽

2014 ◽

Author(s):

Hong-Dong Li ◽

Qing-Song Xu ◽

Yi-Zeng Liang

Keyword(s):

Phase Diagram ◽

Gene Selection ◽

Population Analysis ◽

Predictive Ability ◽

Disease Diagnosis ◽

Disease Classification ◽

Small Subset ◽

Analysis Framework ◽

Source Codes ◽

Microarray Datasets

Identifying a small subset of discriminative genes is important for predicting clinical outcomes and facilitating disease diagnosis. Based on the model population analysis framework, we present a method, called PHADIA, which outputs a phase diagram displaying the predictive ability of each variable, providing an intuitive way of selecting informative variables. Using two publicly available microarray datasets, it is demonstrated that our method can select a few informative genes and achieves significantly better or comparable classification accuracy compared to the results reported in the literature. The source codes are freely available at: www.libpls.net.

Gene Selection via a New Hybrid Ant Colony Optimization Algorithm for Cancer Classification in High-Dimensional Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2019/7828590 ◽

2019 ◽

Vol 2019 ◽

pp. 1-20 ◽

Cited By ~ 2

Author(s):

Ahmed Bir-Jmel ◽

Sidi Mohamed Douiri ◽

Souad Elbernoussi

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Gene Selection ◽

Hybrid Approach ◽

The Body ◽

Cancer Classification ◽

Microarray Data Analysis ◽

High Dimensional ◽

Small Subset ◽

Classification Problems

Recent advances in microarray data analysis make it easy to simultaneously measure the expression levels of several thousand genes. These levels can be used to distinguish cancerous tissues from normal ones. In this work, we are interested in dimension reduction of gene expression data for cancer classification, which is a common task in most microarray data analysis studies. This reduction plays an essential role in enhancing the accuracy of the classification task and helping biologists accurately predict cancer in the body; it is carried out by selecting a small subset of relevant genes and eliminating the redundant or noisy genes. In this context, we propose a hybrid approach (MWIS-ACO-LS) for the gene selection problem. It combines a new graph-based approach for gene selection (MWIS), in which we seek to minimize the redundancy between genes by considering the correlation between them and to maximize gene-ranking (Fisher) scores, with a modified ACO coupled with a local search (LS) algorithm that uses the 1NN classifier to measure the quality of the candidate subsets. To evaluate the proposed method, we tested MWIS-ACO-LS on ten well-replicated high-dimensional microarray datasets ranging from 2308 to 12600 genes. The experimental results on these ten high-dimensional microarray classification problems demonstrated the effectiveness of our proposed method.
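
Using a 1NN classifier to measure the quality of a candidate gene subset can be sketched as follows. This is an illustrative assumption, not the authors' code: the function name `loocv_1nn_accuracy` is hypothetical, and leave-one-out cross-validation with squared Euclidean distance is assumed as the evaluation protocol, which the abstract does not specify.

```python
def loocv_1nn_accuracy(samples, labels, genes):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier.

    samples: list of expression vectors (one per sample)
    labels:  class label for each sample
    genes:   indices of the candidate gene subset to evaluate
    Assumes at least two samples and a non-empty gene subset.
    """
    def dist2(a, b):
        # Squared Euclidean distance restricted to the selected genes.
        return sum((a[j] - b[j]) ** 2 for j in genes)

    correct = 0
    for i, row in enumerate(samples):
        # Nearest neighbour among all other samples (sample i is held out).
        nearest = min(
            (k for k in range(len(samples)) if k != i),
            key=lambda k: dist2(row, samples[k]),
        )
        if labels[nearest] == labels[i]:
            correct += 1
    return correct / len(samples)
```

A search procedure such as ACO or a local search would call a function like this as its fitness measure, typically combined with a term that rewards smaller subsets.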

Deep gene selection method to select genes from microarray datasets for cancer classification

BMC Bioinformatics ◽

10.1186/s12859-019-3161-2 ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Russul Alanni ◽

Jingyu Hou ◽

Hasseeb Azzawi ◽

Yong Xiang

Keyword(s):

Gene Selection ◽

Computational Cost ◽

Selection Method ◽

Cancer Classification ◽

Gene Selection Method ◽

Microarray Expression Data ◽

Efficient Gene ◽

Number Of Genes ◽

Efficiency And Effectiveness ◽

Microarray Datasets

Background: Microarray datasets consist of complex and high-dimensional samples and genes, and generally the number of samples is much smaller than the number of genes. Due to this data imbalance, gene selection is a demanding task for microarray expression data analysis. Results: The gene set selected by DGS has shown superior performance in cancer classification. DGS has a high capability of reducing the number of genes in the original microarray datasets. The experimental comparisons with other representative and state-of-the-art gene selection methods also showed that DGS achieved the best performance in terms of the number of selected genes, classification accuracy, and computational cost. Conclusions: We provide an efficient gene selection algorithm that can select relevant genes which are significantly sensitive to the samples' classes. With few discriminative genes and a lower time cost, the proposed algorithm achieved much higher prediction accuracy on several public microarray datasets, which in turn verifies the efficiency and effectiveness of the proposed gene selection method.

Swarm Intelligence Algorithms in Gene Selection Profile Based on Classification of Microarray Data: A Review

Journal of Applied Science and Technology Trends ◽

10.38094/jastt20161 ◽

2021 ◽

Vol 2 (01) ◽

pp. 01-09

Author(s):

Alan Jahwar ◽

Nawzat Ahmed

Keyword(s):

Swarm Intelligence ◽

Microarray Data ◽

Gene Selection ◽

Classification Algorithms ◽

Data Sets ◽

Paper Briefly ◽

Large Gene ◽

Microarray Datasets ◽

Selection Of

Microarray data plays a major role in diagnosing and treating cancer. In several microarray data sets, many gene fragments are not associated with the target diseases. A solution to the gene selection problem therefore becomes important when analyzing large gene datasets. The key task is to represent genes so that samples are classified with optimum accuracy. Different gene classification algorithms have been proposed in past studies; however, they have struggled with the selection of genes, mostly in high-dimensional microarray data. This paper reviews classification and feature selection on different microarray datasets, focusing on swarm intelligence algorithms. We briefly explain microarray data and its types, and introduce the most common swarm intelligence algorithms. We then review swarm intelligence algorithms for gene selection based on the classification of microarray data.

Incremental Search for Informative Gene Selection in Cancer Classification

Annals of Emerging Technologies in Computing ◽

10.33166/aetic.2021.02.002 ◽

2021 ◽

Vol 5 (2) ◽

pp. 15-21

Author(s):

Fathima Fajila ◽

Yuhanis Yusof

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Gene Selection ◽

Subset Selection ◽

Cancer Classification ◽

Microarray Data Analysis ◽

Informative Gene ◽

Incremental Search ◽

Selection Approach ◽

Microarray Datasets

Although numerous methods of using microarray data analysis for classification have been reported, there is still room in the field of cancer classification for new approaches to informative gene selection. This study introduces a new incremental search-based gene selection approach for cancer classification. The strength of wrappers in determining relevant genes in a gene pool can be increased as they evaluate each possible gene subset. Nevertheless, the search algorithms play a major role in gene subset selection. Hence, there is the possibility of finding more informative genes with incremental application. Thus, we introduce an approach which uses two search algorithms in gene subset selection. The approach was efficient enough to classify five out of six microarray datasets with 100% accuracy using only a few biomarkers, while the remaining dataset was classified with only one misclassification.
