Applying Cost-Sensitive Extreme Learning Machine and Dissimilarity Integration to Gene Expression Data Classification

2016 ◽  
Vol 2016 ◽  
pp. 1-9 ◽  
Author(s):  
Yanqiu Liu ◽  
Huijuan Lu ◽  
Ke Yan ◽  
Haixia Xia ◽  
Chunlin An

Embedding cost-sensitive factors into classifiers increases classification stability and reduces classification costs for large-scale, redundant, and imbalanced datasets such as gene expression data. In this study, we extend our previous work, Dissimilar ELM (D-ELM), by introducing misclassification costs into the classifier. We call the proposed algorithm cost-sensitive D-ELM (CS-D-ELM). Furthermore, we embed a rejection cost into CS-D-ELM to increase the classification stability of the proposed algorithm. Experimental results show that CS-D-ELM with the embedded rejection cost effectively reduces the average and overall cost of the classification process, while classification accuracy remains competitive. The proposed method can be extended to classification problems on other redundant and imbalanced data.
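The idea of combining misclassification costs with a rejection cost can be illustrated with the standard minimum-expected-cost decision rule. The sketch below is not the authors' CS-D-ELM; it assumes some classifier has already produced class probabilities, and the names `min_cost_decision`, `REJECT`, and the example cost values are hypothetical.

```python
REJECT = -1  # hypothetical label meaning "decline to classify"


def min_cost_decision(posterior, cost, reject_cost):
    """Pick the prediction with the lowest expected cost, or reject.

    posterior[k] : estimated probability that the sample belongs to class k
    cost[k][j]   : cost of predicting class j when the true class is k
    reject_cost  : fixed cost paid for declining to predict
    """
    n_classes = len(posterior)
    # Expected cost of predicting each class j, averaged over the true class k.
    expected = [
        sum(posterior[k] * cost[k][j] for k in range(n_classes))
        for j in range(n_classes)
    ]
    best = min(range(n_classes), key=expected.__getitem__)
    # Reject only when even the cheapest prediction costs more than rejecting.
    return best if expected[best] <= reject_cost else REJECT


# Assumed costs: missing class 1 (row 1, column 0) is five times worse
# than a false alarm (row 0, column 1).
example_cost = [[0.0, 1.0],
                [5.0, 0.0]]
print(min_cost_decision([0.95, 0.05], example_cost, reject_cost=0.5))
print(min_cost_decision([0.70, 0.30], example_cost, reject_cost=0.5))
```

With these assumed costs, a confident sample is assigned to a class, while an ambiguous one is rejected because every prediction has an expected cost above the rejection cost.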

2019 ◽  
Vol 20 (S25) ◽  
Author(s):  
Huijuan Lu ◽  
Yige Xu ◽  
Minchao Ye ◽  
Ke Yan ◽  
Zhigang Gao ◽  
...  

Background: Cost-sensitive algorithms are an effective strategy for solving imbalanced classification problems. However, the misclassification costs are usually determined empirically based on user expertise, which leads to unstable performance of cost-sensitive classification. Therefore, an efficient and accurate method is needed to calculate the optimal cost weights. Results: In this paper, two approaches are proposed to search for the optimal cost weights, targeting the highest weighted classification accuracy (WCA). One is a grid search over the cost weights and the other is function fitting. The two approaches are compared. In experiments, we classify imbalanced gene expression data using an extreme learning machine to test the cost weights obtained by the two approaches. Conclusions: Comprehensive experimental results show that the function fitting method is generally more efficient and finds the optimal cost weights with acceptable WCA.
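A grid search over cost weights can be sketched as follows. This is an illustration only: the abstract does not define WCA or the classifier interface, so the sketch assumes WCA is a weighted average of sensitivity and specificity, assumes a generic binary classifier that outputs positive-class scores in [0, 1] (not an extreme learning machine), and uses hypothetical function names and toy data.

```python
def weighted_accuracy(y_true, y_pred, w_pos=0.5):
    """Assumed WCA: w_pos * sensitivity + (1 - w_pos) * specificity.

    Both classes must be present in y_true.
    """
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    sensitivity = sum(p == 1 for p in pos) / len(pos)
    specificity = sum(p == 0 for p in neg) / len(neg)
    return w_pos * sensitivity + (1 - w_pos) * specificity


def predict_with_cost(scores, cost_weight):
    """Predict positive when the cost-weighted positive score wins."""
    return [1 if cost_weight * s > (1 - s) else 0 for s in scores]


def grid_search_cost_weight(scores, y_true, grid, w_pos=0.5):
    """Return (best cost weight, best WCA); ties keep the earliest grid value."""
    best_weight, best_wca = None, -1.0
    for c in grid:
        wca = weighted_accuracy(y_true, predict_with_cost(scores, c), w_pos)
        if wca > best_wca:
            best_weight, best_wca = c, wca
    return best_weight, best_wca


# Toy imbalanced data: four negatives, two positives with modest scores.
toy_scores = [0.1, 0.2, 0.3, 0.35, 0.4, 0.6]
toy_labels = [0, 0, 0, 0, 1, 1]
toy_grid = [1.0, 1.2, 1.4, 1.6, 1.8, 2.0]
print(grid_search_cost_weight(toy_scores, toy_labels, toy_grid))
```

The grid search evaluates every candidate weight, so its cost grows with the grid resolution; this is the efficiency trade-off that motivates fitting a function to a few evaluated points instead.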


2012 ◽  
Vol 51 (02) ◽  
pp. 162-167 ◽  
Author(s):  
Z. Wang

Background: Multi-class molecular cancer classification has great potential clinical implications. Such applications require statistical methods to accurately classify cancer types with a small subset of genes from thousands of genes in the data. Objectives: This paper presents a new functional gradient descent boosting algorithm that directly extends the HingeBoost algorithm from the binary case to the multi-class case without reducing the original problem to multiple binary problems. Methods: Minimizing a multi-class hinge loss with boosting technique, the proposed HingeBoost has good theoretical properties by implementing the Bayes decision rule and providing a unifying framework with either equal or unequal misclassification costs. Furthermore, we propose Twin HingeBoost which has better feature selection behavior than HingeBoost by reducing the number of ineffective covariates. Simulated data, benchmark data and two cancer gene expression data sets are utilized to evaluate the performance of the proposed approach. Results: Simulations and the benchmark data showed that the multi-class HingeBoost generated accurate predictions when compared with the alternative methods, especially with high-dimensional covariates. The multi-class HingeBoost also produced more accurate prediction or comparable prediction in two cancer classification problems using gene expression data. Conclusions: This work has shown that the HingeBoost provides a powerful tool for multi-classification problems. In many applications, the classification accuracy and feature selection behavior can be further improved when using Twin HingeBoost.
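To make the notion of a multi-class hinge loss with equal or unequal misclassification costs concrete, the sketch below evaluates one common form (the Crammer-Singer style loss) for a single sample. The abstract does not state the exact loss used by HingeBoost, so this is an assumption, as are the function name `multiclass_hinge`, the cost-scaled margin, and the example values; the boosting procedure itself is not shown.

```python
def multiclass_hinge(scores, y, cost=None):
    """Multi-class hinge loss for one sample.

    scores[j]  : real-valued score the model assigns to class j
    y          : index of the true class
    cost[y][j] : optional cost of predicting j when the truth is y;
                 used here as the required margin (assumed variant).
                 With cost=None every wrong class needs a margin of 1.
    """
    worst = 0.0
    for j in range(len(scores)):
        if j == y:
            continue
        margin = 1.0 if cost is None else cost[y][j]
        # Penalize a wrong class whose score comes within `margin` of the true one.
        worst = max(worst, margin + scores[j] - scores[y])
    return worst


sample_scores = [2.0, 0.5, -1.0]
unequal_cost = [[0.0, 1.0, 1.0],
                [3.0, 0.0, 1.0],
                [1.0, 1.0, 0.0]]
print(multiclass_hinge(sample_scores, y=0))
print(multiclass_hinge(sample_scores, y=1))
print(multiclass_hinge(sample_scores, y=1, cost=unequal_cost))
```

The loss is zero when the true class leads every other class by the required margin, and it grows when a costly confusion (here, true class 1 scored below class 0) is not separated by a correspondingly larger margin.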


2019 ◽  
Vol 20 (S9) ◽  
Author(s):  
Damiano Verda ◽  
Stefano Parodi ◽  
Enrico Ferrari ◽  
Marco Muselli

Background: Logic Learning Machine (LLM) is an innovative method of supervised analysis capable of constructing models based on simple and intelligible rules. In this investigation the performance of LLM in classifying patients with cancer was evaluated using a set of eight publicly available gene expression databases for cancer diagnosis. LLM accuracy was assessed by summary ROC curve (sROC) analysis and estimated by the area under an sROC curve (sAUC). Its performance was compared in cross validation with that of standard supervised methods, namely: decision tree, artificial neural network, support vector machine (SVM) and k-nearest neighbor classifier. Results: LLM showed an excellent accuracy (sAUC = 0.99, 95% CI: 0.98–1.0) and outperformed any other method except SVM. Conclusions: LLM is a new powerful tool for the analysis of gene expression data for cancer diagnosis. Simple rules generated by LLM could contribute to a better understanding of cancer biology, potentially addressing therapeutic approaches.


2012 ◽  
Vol 75 (1) ◽  
pp. 33-42 ◽  
Author(s):  
Ana C. Lorena ◽  
Ivan G. Costa ◽  
Newton Spolaôr ◽  
Marcilio C.P. de Souto

2020 ◽  
Author(s):  
Aixiang Jiang ◽  
Laura K. Hilton ◽  
Jeffrey Tang ◽  
Christopher K. Rushton ◽  
Bruno M. Grande ◽  
...  

Binary classification using gene expression data is commonly used to stratify cancers into molecular subgroups that may have distinct prognoses and therapeutic options. A limitation of many such methods is the requirement for comparable training and testing data sets. Here, we describe and demonstrate a self-training implementation of probability ratio-based classification prediction score (PRPS-ST) that facilitates the porting of existing classification models to other gene expression data sets. We demonstrate its robustness through application to two binary classification problems in diffuse large B-cell lymphoma using a diverse variety of gene expression data types and normalization methods.

