Determining Cutoff Point of Ensemble Trees Based on Sample Size in Predicting Clinical Dose with DNA Microarray Data

Computational and Mathematical Methods in Medicine ◽

10.1155/2016/6794916 ◽

2016 ◽

Vol 2016 ◽

pp. 1-9 ◽

Cited By ~ 1

Author(s):

Selen Yılmaz Isıkhan ◽

Erdem Karabulut ◽

Celal Reha Alpar

Keyword(s):

Gene Expression ◽

Sample Size ◽

Feature Selection Method ◽

Microarray Gene Expression Data ◽

Cutoff Point ◽

Gradient Boosting ◽

Support Vector ◽

Microarray Gene Expression ◽

Dose Prediction ◽

Clinical Dose

Background/Aim. Evaluating the success of dose prediction based on genetic or clinical data has substantially advanced recently. The aim of this study is to predict various clinical dose values from DNA gene expression datasets using data mining techniques. Materials and Methods. Eleven real gene expression datasets containing dose values were included. First, important genes for dose prediction were selected using iterative sure independence screening. Then, the performances of regression trees (RTs), support vector regression (SVR), RT bagging, SVR bagging, and RT boosting were examined. Results. The results demonstrated that a regression-based feature selection method substantially reduced the number of irrelevant genes from raw datasets. Overall, the best prediction performance in nine of 11 datasets was achieved using SVR; the second most accurate performance was provided by a gradient-boosting machine (GBM). Conclusion. Analysis of various dose values based on microarray gene expression data identified genes common to our study and the referenced studies. According to our findings, SVR and GBM can be good predictors for dose-gene datasets. The study also identified a sample size of n=25 as a cutoff point for RT bagging to outperform a single RT.
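The abstract names iterative sure independence screening as the gene-selection step. As a rough illustration of the screening idea only, the sketch below ranks genes by the absolute Pearson correlation between expression and dose and keeps the top few. It performs a single screening pass rather than the iterative procedure, and the function names, gene names, and toy values are all hypothetical, not taken from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # A constant gene carries no marginal signal; treat it as uncorrelated.
        return 0.0
    return sxy / sqrt(sxx * syy)


def screen_genes(expression, dose, keep):
    """Keep the `keep` genes with the largest absolute marginal correlation."""
    ranked = sorted(
        expression,
        key=lambda gene: abs(pearson(expression[gene], dose)),
        reverse=True,
    )
    return ranked[:keep]


# Hypothetical toy data: six samples, four genes.
dose = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
expression = {
    "gene_up": [0.9, 2.1, 2.9, 4.2, 5.1, 5.8],     # rises with dose
    "gene_down": [6.1, 5.0, 3.9, 3.1, 2.0, 1.2],   # falls with dose
    "gene_flat": [3.0, 3.0, 3.0, 3.0, 3.0, 3.0],   # constant
    "gene_noise": [2.0, 5.0, 1.0, 4.0, 6.0, 3.0],  # weakly related
}

print(screen_genes(expression, dose, keep=2))
```

Ranking by absolute correlation keeps genes that fall with dose as well as genes that rise with it. The regression models named in the abstract would then be fitted on the retained genes only.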

A Coupling Support Vector Machines with the Feature Learning of Deep Convolutional Neural Networks for Classifying Microarray Gene Expression Data

Modern Approaches for Intelligent Information and Database Systems - Studies in Computational Intelligence ◽

10.1007/978-3-319-76081-0_20 ◽

2018 ◽

pp. 233-243 ◽

Cited By ~ 4

Author(s):

Phuoc-Hai Huynh ◽

Van-Hoa Nguyen ◽

Thanh-Nghi Do

Keyword(s):

Gene Expression ◽

Neural Networks ◽

Support Vector Machines ◽

Feature Learning ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Deep Convolutional Neural Networks ◽

Microarray Gene Expression ◽

Vector Machines ◽

Microarray Gene

A feature selection method using fixed-point algorithm for DNA microarray gene expression data

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-140285 ◽

2014 ◽

Vol 18 (1) ◽

pp. 55-59 ◽

Cited By ~ 4

Author(s):

Alok Sharma ◽

Kuldip K. Paliwal ◽

Seiya Imoto ◽

Satoru Miyano ◽

Vandana Sharma ◽

...

Keyword(s):

Gene Expression ◽

Fixed Point ◽

Feature Selection ◽

Feature Selection Method ◽

Microarray Gene Expression Data ◽

Selection Method ◽

Expression Data ◽

Microarray Gene Expression ◽

Fixed Point Algorithm ◽

Microarray Gene

Web-Based Application for Accurately Classifying Cancer Type from Microarray Gene Expression Data Using a Support Vector Machine (SVM) Learning Algorithm

Bioinformatics and Biomedical Engineering - Lecture Notes in Computer Science ◽

10.1007/978-3-030-17935-9_14 ◽

2019 ◽

pp. 149-154 ◽

Cited By ~ 1

Author(s):

Shrikant Pawar

Keyword(s):

Gene Expression ◽

Support Vector Machine ◽

Gene Expression Data ◽

Learning Algorithm ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Cancer Type ◽

Microarray Gene Expression ◽

Web Based ◽

Microarray Gene

Knowledge-based analysis of microarray gene expression data by using support vector machines

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.97.1.262 ◽

2000 ◽

Vol 97 (1) ◽

pp. 262-267 ◽

Cited By ~ 1324

Author(s):

M. P. S. Brown ◽

W. N. Grundy ◽

D. Lin ◽

N. Cristianini ◽

C. W. Sugnet ◽

...

Keyword(s):

Gene Expression ◽

Support Vector Machines ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Expression Data ◽

Microarray Gene Expression ◽

Knowledge Based ◽

Vector Machines ◽

Microarray Gene

Comparison of machine-learning methodologies for accurate diagnosis of sepsis using microarray gene expression data

PLoS ONE ◽

10.1371/journal.pone.0251800 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0251800

Author(s):

Dominik Schaack ◽

Markus A. Weigand ◽

Florian Uhle

Keyword(s):

Gene Expression ◽

Machine Learning ◽

Intensive Care ◽

Gene Expression Data ◽

Meta Analysis ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

We investigate the feasibility of molecular-level sample classification of sepsis using microarray gene expression data merged by in silico meta-analysis. Publicly available data series were extracted from NCBI Gene Expression Omnibus and EMBL-EBI ArrayExpress to create a comprehensive meta-analysis microarray expression set (meta-expression set). Measurements had to be obtained via microarray-technique from whole blood samples of adult or pediatric patients with sepsis diagnosed based on international consensus definition immediately after admission to the intensive care unit. We aggregate trauma patients, systemic inflammatory response syndrome (SIRS) patients, and healthy controls in a non-septic entity. Differential expression (DE) analysis is compared with machine-learning-based solutions like decision tree (DT), random forest (RF), support vector machine (SVM), and deep-learning neural networks (DNNs). We evaluated classifier training and discrimination performance in 100 independent iterations. To test diagnostic resilience, we gradually degraded expression data in multiple levels. Clustering of expression values based on DE genes results in partial identification of sepsis samples. In contrast, RF, SVM, and DNN provide excellent diagnostic performance measured in terms of accuracy and area under the curve (>0.96 and >0.99, respectively). We prove DNNs as the most resilient methodology, virtually unaffected by targeted removal of DE genes. By surpassing most other published solutions, the presented approach substantially augments current diagnostic capability in intensive care medicine.

Classification of Microarray Gene Expression Data Using a New Binary Support Vector System

2005 International Conference on Neural Networks and Brain ◽

10.1109/icnnb.2005.1614659 ◽

2006 ◽

Cited By ~ 2

Author(s):

Tung-Shou Chen ◽

Rong-Chang ◽

Tzu-Hsin Tsai ◽

Shuan-Yow Li ◽

Xun Liang

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Vector System ◽

Support Vector ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

Minimum Redundancy Feature Selection from Microarray Gene Expression Data

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720005001004 ◽

2005 ◽

Vol 03 (02) ◽

pp. 185-205 ◽

Cited By ~ 937

Author(s):

Chris Ding ◽

Hanchuan Peng

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Small Subset ◽

Expression Data ◽

Microarray Gene Expression ◽

Linear Discriminant ◽

Selection Framework

How to select a small subset out of the thousands of genes in microarray data is important for accurate classification of phenotypes. Widely used methods typically rank genes according to their differential expressions among phenotypes and pick the top-ranked genes. We observe that feature sets so obtained have certain redundancy and study methods to minimize it. We propose a minimum redundancy-maximum relevance (MRMR) feature selection framework. Genes selected via MRMR provide a more balanced coverage of the space and capture broader characteristics of phenotypes. They lead to significantly improved class predictions in extensive experiments on 6 gene expression data sets: NCI, Lymphoma, Lung, Child Leukemia, Leukemia, and Colon. Improvements are observed consistently among 4 classification methods: Naïve Bayes, Linear discriminant analysis, Logistic regression, and Support vector machines. Supplementary: The top 60 MRMR genes for each of the datasets are listed in . More information related to MRMR methods can be found at .
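The contrast between top-ranked selection and redundancy-aware selection can be made concrete with a small sketch. The code below assumes discretized expression values, uses mutual information for both relevance and redundancy, and greedily adds the gene that maximizes relevance minus mean redundancy with the genes already chosen. This is one plausible reading of the framework rather than the authors' implementation, and the function names, gene names, and toy data are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy selection scoring relevance minus mean redundancy."""
    relevance = {
        name: mutual_information(column, labels)
        for name, column in features.items()
    }
    # Start from the single most relevant gene.
    selected = [max(relevance, key=relevance.get)]
    while len(selected) < min(k, len(features)):

        def score(name):
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Hypothetical toy data: eight samples, two phenotype classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "gene_a": [0, 0, 0, 1, 1, 1, 1, 0],       # informative
    "gene_a_copy": [0, 0, 0, 1, 1, 1, 1, 0],  # exact duplicate of gene_a
    "gene_b": [0, 0, 1, 0, 1, 1, 0, 1],       # informative, independent of gene_a
    "gene_noise": [0, 1, 0, 1, 0, 1, 0, 1],   # unrelated to the labels
}

print(mrmr_select(features, labels, k=2))
```

On this toy data a pure relevance ranking cannot tell the duplicate from the original, whereas the redundancy penalty passes over gene_a_copy in favor of the complementary gene_b.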

Impact of Feature Selection on Support Vector Machine Using Microarray Gene Expression Data

2009 Second International Conference on Machine Vision ◽

10.1109/icmv.2009.46 ◽

2009 ◽

Cited By ~ 1

Author(s):

Choudhury Muhammad Mufassil Wahid ◽

A.B.M. Shawkat Ali ◽

Kevin Tickle

Keyword(s):

Gene Expression ◽

Support Vector Machine ◽

Feature Selection ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Expression Data ◽

Microarray Gene Expression ◽

Microarray Gene

Transductive support vector machines for classification of microarray gene expression data

Proceedings of the International Joint Conference on Neural Networks, 2003. ◽

10.1109/ijcnn.2003.1224039 ◽

2004 ◽

Cited By ~ 1

Author(s):

R. Semolini ◽

F.J. Von Zuben

Keyword(s):

Gene Expression ◽

Support Vector Machines ◽

Gene Expression Data ◽

Microarray Gene Expression Data ◽

Support Vector ◽

Expression Data ◽

Microarray Gene Expression ◽

Vector Machines ◽

Microarray Gene

Robust proportional overlapping analysis for feature selection in binary classification within functional genomic experiments

PeerJ Computer Science ◽

10.7717/peerj-cs.562 ◽

2021 ◽

Vol 7 ◽

pp. e562

Author(s):

Muhammad Hamraz ◽

Naz Gul ◽

Mushtaq Raza ◽

Dost Muhammad Khan ◽

Umair Khalil ◽

...

Keyword(s):

Gene Expression ◽

Feature Selection ◽

Gene Selection ◽

Binary Classification ◽

Feature Selection Method ◽

Brier Score ◽

Classification Error ◽

Support Vector ◽

Microarray Gene Expression ◽

Absolute Deviation

In this paper, a novel feature selection method called Robust Proportional Overlapping Score (RPOS), for microarray gene expression datasets has been proposed, by utilizing the robust measure of dispersion, i.e., Median Absolute Deviation (MAD). This method robustly identifies the most discriminative genes by considering the overlapping scores of the gene expression values for binary class problems. Genes with a high degree of overlap between classes are discarded and the ones that discriminate between the classes are selected. The results of the proposed method are compared with five state-of-the-art gene selection methods based on classification error, Brier score, and sensitivity, by considering eleven gene expression datasets. Classification of observations for different sets of selected genes by the proposed method is carried out by three different classifiers, i.e., random forest, k-nearest neighbors (k-NN), and support vector machine (SVM). Box-plots and stability scores of the results are also shown in this paper. The results reveal that in most of the cases the proposed method outperforms the other methods.
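To illustrate why the median absolute deviation gives a robust basis for an overlap measure, the sketch below builds a median ± MAD interval for each class and reports how much of the combined range the two intervals share. This is a simplified stand-in, not the RPOS score defined in the paper. The interval width, the overlap formula, and all names and values are assumptions made for illustration.

```python
from statistics import median


def mad(values):
    """Median absolute deviation around the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


def robust_interval(values, width=1.0):
    """Interval median ± width * MAD (width is an assumed tuning constant)."""
    center, spread = median(values), mad(values)
    return center - width * spread, center + width * spread


def overlap_fraction(class_a, class_b, width=1.0):
    """Share of the combined range covered by both class intervals."""
    lo_a, hi_a = robust_interval(class_a, width)
    lo_b, hi_b = robust_interval(class_b, width)
    overlap = max(0.0, min(hi_a, hi_b) - max(lo_a, lo_b))
    span = max(hi_a, hi_b) - min(lo_a, lo_b)
    # Two identical zero-width intervals overlap completely.
    return overlap / span if span > 0 else 1.0


# Hypothetical expression values for one gene in two classes.
# The 9.0 in class_a is an outlier that would inflate a standard deviation.
class_a = [1.0, 1.2, 0.9, 1.1, 9.0]
class_b = [5.0, 5.2, 4.9, 5.1, 5.3]

print(mad(class_a), overlap_fraction(class_a, class_b))
```

Because the MAD is barely moved by the single outlier, the class_a interval stays tight around 1.1 and does not reach the class_b interval, so this gene would count as discriminative under the sketch.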
