Analyzing the Performance of Hierarchical Binary Classifiers for Multi-class Classification Problem Using Biological Data

Author(s):  
Salma Begum ◽  
Ramazan S. Aygun
Author(s):  
Kanae Takahashi ◽  
Kouji Yamamoto ◽  
Aya Kuchiba ◽  
Tatsuki Koyama

Abstract: Binary classification problems are common in the medical field, and we often use sensitivity, specificity, accuracy, and negative and positive predictive values as measures of a binary predictor’s performance. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier’s performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in information retrieval and information extraction evaluation since it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to multi-class classification. There are three types of F1 scores, and the statistical properties of these F1 scores have hardly ever been discussed. We propose methods based on the large sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
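
The definition of the F1 score as the harmonic mean of precision and recall can be illustrated with a minimal Python sketch. The function names and the confusion-matrix layout below are hypothetical, and macro and micro averaging are shown only as two commonly used multi-class variants; whether they correspond to the three types mentioned in the abstract is an assumption. The sketch does not implement the confidence-interval methods the abstract proposes.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative counts. Undefined ratios are reported as 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def macro_micro_f1(confusion):
    """Macro- and micro-averaged F1 for a square confusion matrix where
    confusion[i][j] counts items of true class i predicted as class j
    (this layout is an assumption made for illustration)."""
    k = len(confusion)
    per_class_f1 = []
    tp_sum = fp_sum = fn_sum = 0
    for c in range(k):
        tp = confusion[c][c]
        fp = sum(confusion[r][c] for r in range(k)) - tp  # column total minus diagonal
        fn = sum(confusion[c]) - tp                       # row total minus diagonal
        per_class_f1.append(precision_recall_f1(tp, fp, fn)[2])
        tp_sum += tp
        fp_sum += fp
        fn_sum += fn
    macro = sum(per_class_f1) / k                             # mean of per-class F1
    micro = precision_recall_f1(tp_sum, fp_sum, fn_sum)[2]    # F1 of pooled counts
    return macro, micro


if __name__ == "__main__":
    print(precision_recall_f1(8, 2, 4))
    print(macro_micro_f1([[5, 1, 0], [2, 3, 1], [0, 1, 7]]))
```

In the single-label setting, the pooled false positives equal the pooled false negatives, so the micro-averaged F1 in this sketch coincides with overall accuracy.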


2018 ◽  
Vol 7 (2) ◽  
pp. 817
Author(s):  
Senthilselvan Natarajan ◽  
Rajarajan S ◽  
Subramaniyaswamy V

Biological data are often high-dimensional, incomplete, and redundant, which makes multi-class classification difficult. Breast cancer is currently one of the predominant causes of death in women around the globe. The current methods for classifying a tumour as malignant or benign involve physical procedures, which often lead to mental stress. Research has now sought to implement soft computing techniques in order to classify tumours based on the data available. In this paper, a novel classifier model is implemented using Artificial Neural Networks. The neural network is optimized with a meta-heuristic algorithm called the Whale Swarm Algorithm in order to make the classifier model accurate. Experimental results show that the new technique outperforms other existing models.


2019 ◽  
Author(s):  
Seda Bilaloglu ◽  
Joyce Wu ◽  
Eduardo Fierro ◽  
Raul Delgado Sanchez ◽  
Paolo Santiago Ocampo ◽  
...  

Abstract: Visual analysis of solid tissue mounted on glass slides is currently the primary method used by pathologists for determining the stage, type and subtypes of cancer. Although whole slide images are usually large (tens to hundreds of thousands of pixels wide), an exhaustive though time-consuming assessment is necessary to reduce the risk of misdiagnosis. In an effort to address the many diagnostic challenges faced by trained experts, recent research has focused on developing automatic prediction systems for this multi-class classification problem. Typically, complex convolutional neural network (CNN) architectures, such as Google’s Inception, are used to tackle this problem. Here, we introduce a greatly simplified CNN architecture, PathCNN, which allows for more efficient use of computational resources and better classification performance. Using this improved architecture, we trained simultaneously on whole-slide images from multiple tumor sites and corresponding non-neoplastic tissue. Dimensionality reduction analysis of the weights of the last layer of the network captures groups of images that faithfully represent the different types of cancer, while also highlighting differences in staining and capturing outliers, artifacts and misclassification errors. Our code is available online at: https://github.com/sedab/PathCNN.


2017 ◽  
Author(s):  
Marie Lachaize ◽  
Sylvie Le Hégarat-Mascle ◽  
Emanuel Aldea ◽  
Aude Maitrot ◽  
Roger Reynaud

Data Mining ◽  
2013 ◽  
pp. 970-990
Author(s):  
Weiqi Wang ◽  
Yanbo J. Wang ◽  
Qin Xin ◽  
René Bañares-Alcántara ◽  
Frans Coenen ◽  
...  

Discovering how Mesenchymal Stem Cells (MSCs) can be differentiated is an important topic in stem cell therapy and tissue engineering. In a general context, such differentiation analysis can be modeled as a classification problem in data mining, specifically as a single-label multi-class classification task. Previous studies on this topic suggest Associative Classification (AC) over alternative classification techniques and present classification results based on the CMAR (Classification based on Multiple Association Rules) associative classifier. Other AC algorithms include CBA (Classification Based on Associations), PRM (Predictive Rule Mining), CPAR (Classification based on Predictive Association Rules) and TFPC (Total From Partial Classification). The main aim of this chapter is to compare the performance of different associative classifiers with respect to MSC differentiation analysis, in terms of classification accuracy, efficiency, number of rules generated, quality of those rules, and the maximum number of attributes in rule antecedents.


2019 ◽  
Vol 9 (19) ◽  
pp. 4036
Author(s):  
You ◽  
Wu ◽  
Lee ◽  
Liu

Multi-class classification is a very important technique in engineering applications, e.g., mechanical systems, mechanics and design innovations, applied materials in nanotechnologies, etc. A large amount of research has been done on single-label classification, where objects are associated with a single category. However, in many application domains, an object can belong to two or more categories, and multi-label classification is needed. Traditionally, statistical methods were used; recently, machine learning techniques, in particular neural networks, have been proposed to solve the multi-class classification problem. In this paper, we develop radial basis function (RBF)-based neural network schemes for single-label and multi-label classification, respectively. The number of hidden nodes and the parameters of the basis functions are determined automatically by applying an iterative self-constructing clustering algorithm to the given training dataset, and biases and weights are derived optimally by least squares. Dimensionality reduction techniques are adopted and integrated to help reduce the overfitting problem associated with RBF networks. Experimental results from benchmark datasets are presented to show the effectiveness of the proposed schemes.


2019 ◽  
Vol 9 (15) ◽  
pp. 3007
Author(s):  
Dengyong Zhang ◽  
Shanshan Wang ◽  
Jin Wang ◽  
Arun Kumar Sangaiah ◽  
Feng Li ◽  
...  

There are many image resizing techniques, including scaling, scale-and-stretch, seam carving, and so on. They have their own advantages and are suitable for different application scenarios. Therefore, a universal method for detecting tampering by image resizing is more practical. In preliminary experiments, we found that no matter which image resizing technique is adopted, it destroys local texture and spatial correlations among adjacent pixels to some extent. Motivated by the excellent performance of local Tchebichef moments (LTM) in texture classification, we present in this paper a method that uses LTM to detect tampering by image resizing. The tampered images are obtained by removing pixels from original images using image resizing (scaling, scale-and-stretch and seam carving). Firstly, the residual is obtained by image pre-processing. Then, the histogram features of LTM are extracted from the residual. Finally, an error-correcting output code strategy is adopted by ensemble learning, which turns a multi-class classification problem into binary classification sub-problems. Experimental results show that the proposed approach can obtain acceptable detection accuracy for the three content-aware image re-targeting techniques.
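
The idea of an error-correcting output code, which turns a multi-class problem into binary sub-problems, can be sketched in a few lines of Python. Everything named here is an assumption made for illustration: the class labels `A` to `D`, the 7-bit code matrix, and the helper functions are hypothetical and are not taken from the paper, and the binary classifiers themselves are left out. Each column of the matrix defines one binary sub-problem, and a sample is assigned to the class whose codeword is closest in Hamming distance to the vector of binary predictions.

```python
# Hypothetical code matrix: one 7-bit codeword per class.
# Every pair of codewords differs in 4 positions, so a single
# wrong binary prediction can still be decoded correctly.
CODE_MATRIX = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def binary_targets(labels, column):
    """Relabel multi-class training labels as 0/1 targets for the binary
    sub-problem defined by one column of the code matrix."""
    return [CODE_MATRIX[label][column] for label in labels]


def decode(predicted_bits):
    """Return the class whose codeword is nearest to the predicted bits."""
    return min(CODE_MATRIX, key=lambda c: hamming(CODE_MATRIX[c], predicted_bits))


if __name__ == "__main__":
    # Targets for the first binary sub-problem on a toy label list.
    print(binary_targets(["A", "C", "D", "B"], 0))
    # Codeword of class C with its first bit flipped by a faulty classifier.
    print(decode((1, 0, 1, 1, 0, 0, 1)))
```

In practice, one binary classifier would be trained per column on the relabeled targets, and its outputs would be concatenated into the bit vector passed to `decode`.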


Sensors ◽  
2019 ◽  
Vol 19 (23) ◽  
pp. 5153
Author(s):  
Hamid Khodakarami ◽  
Lucia Ricciardi ◽  
Maria Contarino ◽  
Rajesh Pahwa ◽  
Kelly Lyons ◽  
...  

The response to levodopa (LR) is important for managing Parkinson’s Disease and is measured with clinical scales prior to (OFF) and after (ON) levodopa. The aim of this study was to ascertain whether an ambulatory wearable device could predict the LR from the response to the first morning dose. The ON and OFF scores were sorted into six categories of severity so that separating Parkinson’s Kinetigraph (PKG) features corresponding to the ON and OFF scores became a multi-class classification problem according to whether they fell below or above the threshold for each class. Candidate features were extracted from the PKG data and matched to the class labels. Several linear and non-linear candidate statistical models were examined and compared to classify the six categories of severity. The resulting model predicted a clinically significant LR with an area under the receiver operating characteristic curve of 0.92. This study shows that ambulatory data could be used to identify a clinically significant response to levodopa. This study has also identified practical steps that would enhance the reliability of this test in future studies.
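
Sorting scores into six severity categories by thresholds can be sketched as follows. The threshold values, the function name, and the rule that a score equal to a threshold falls into the higher category are all hypothetical choices made for illustration; the abstract does not state the actual cut-offs or the clinical scale used.

```python
from bisect import bisect_right

# Hypothetical cut-offs: five thresholds split the score range
# into six ordered severity categories, numbered 0 to 5.
THRESHOLDS = [10, 20, 30, 40, 50]


def severity_class(score, thresholds=THRESHOLDS):
    """Return the index of the severity category for a score.
    A score equal to a threshold is placed in the higher category."""
    return bisect_right(thresholds, score)


if __name__ == "__main__":
    for s in (5, 10, 34.5, 55):
        print(s, severity_class(s))
```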

