Multivariate binary classification of imbalanced datasets-A case study based on high-dimensional multiplex autoimmune assay data

Laura Schlieker; Anna Telaar; Angelika Lueking; Peter Schulz-Knappe; Carmen Theek; Katja Ickstadt

doi:10.1002/bimj.201600207

Sensing structure in learning-based binary classification of high-dimensional data

2011 49th Annual Allerton Conference on Communication, Control, and Computing (Allerton) ◽

10.1109/allerton.2011.6120348 ◽

2011 ◽

Author(s):

B. Orten ◽

P. Ishwar ◽

W. C. Karl ◽

V. Saligrama

Keyword(s):

Binary Classification ◽

High Dimensional Data ◽

High Dimensional

Image Processing Using Artificial Intelligence: Case Study on Classification of High-Dimensional Remotely Sensed Images

10.1201/9781003140351-6 ◽

2021 ◽

pp. 39-49

Author(s):

Dibyajyoti Chutia ◽

Avinash Chouhan ◽

Nilay Nishant ◽

P. Subhash Singh ◽

D. K. Bhattacharyya ◽

...

Keyword(s):

Artificial Intelligence ◽

Image Processing ◽

Remotely Sensed ◽

High Dimensional ◽

Remotely Sensed Images

Binary classification of imbalanced datasets using conformal prediction

Journal of Molecular Graphics and Modelling ◽

10.1016/j.jmgm.2017.01.008 ◽

2017 ◽

Vol 72 ◽

pp. 256-265 ◽

Cited By ~ 23

Author(s):

Ulf Norinder ◽

Scott Boyer

Keyword(s):

Binary Classification ◽

Imbalanced Datasets ◽

Conformal Prediction

1D embedding multi-category classification methods

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691316400063 ◽

2016 ◽

Vol 14 (02) ◽

pp. 1640006 ◽

Cited By ~ 3

Author(s):

Luoqing Li ◽

Chuanwu Yang ◽

Qiwei Xie

Keyword(s):

Binary Classification ◽

High Dimensional Data ◽

Classification Performance ◽

Classification Method ◽

High Dimensional ◽

Classification Methods ◽

One Dimensional ◽

Interpolation Technique ◽

Facial Images

In this paper, we propose a novel semi-supervised multi-category classification method based on one-dimensional (1D) multi-embedding. Using an interpolation technique based on multiple 1D embeddings, we embed the high-dimensional data into several different 1D manifolds and first perform binary classification. We then construct the multi-category classifiers using the one-versus-rest and one-versus-one strategies separately. A weighting strategy is employed in our algorithm to improve classification performance. The proposed method shows promising results in the classification of handwritten digits and facial images.
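The one-versus-rest and one-versus-one constructions mentioned in this abstract can be sketched as follows. This is only an illustration of how multi-category classifiers are assembled from binary ones, not the authors' method: the 1D-embedding interpolation is replaced here by a simple nearest-centroid binary scorer, and all names (`fit_binary`, `fit_one_vs_rest`, `fit_one_vs_one`, `DATA`) and the toy points are hypothetical.

```python
from itertools import combinations

def centroid(points):
    """Component-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def sqdist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def fit_binary(pos, neg):
    """Stand-in binary classifier (assumption, not the paper's embedding step).
    The returned score is positive when x is closer to the 'pos' centroid."""
    cp, cn = centroid(pos), centroid(neg)
    return lambda x: sqdist(x, cn) - sqdist(x, cp)

def fit_one_vs_rest(data):
    """Train one binary scorer per class against all other classes;
    predict the class whose scorer gives the highest score."""
    models = {}
    for label, pts in data.items():
        rest = [p for other, q in data.items() if other != label for p in q]
        models[label] = fit_binary(pts, rest)
    return lambda x: max(sorted(models), key=lambda lab: models[lab](x))

def fit_one_vs_one(data):
    """Train one binary scorer per unordered class pair;
    predict by majority vote over all pairwise decisions."""
    models = {(a, b): fit_binary(data[a], data[b])
              for a, b in combinations(sorted(data), 2)}

    def predict(x):
        votes = {label: 0 for label in data}
        for (a, b), score in models.items():
            votes[a if score(x) > 0 else b] += 1
        return max(sorted(votes), key=votes.get)

    return predict

# Toy data: three well-separated 2D clusters (hypothetical values).
DATA = {
    "A": [(0, 0), (0, 1), (1, 0)],
    "B": [(10, 0), (10, 1), (11, 0)],
    "C": [(0, 10), (1, 10), (0, 11)],
}

ovr = fit_one_vs_rest(DATA)
ovo = fit_one_vs_one(DATA)
print(ovr((10, 0.5)), ovo((10, 0.5)))
```

One-versus-rest trains as many binary models as there are classes, while one-versus-one trains one per class pair and resolves disagreements by voting. The weighting strategy the abstract mentions is not modelled in this sketch.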

Unsupervised Feature Selection Based on Ultrametricity and Sparse Training Data: A Case Study for the Classification of High-Dimensional Hyperspectral Data

Remote Sensing ◽

10.3390/rs10101564 ◽

2018 ◽

Vol 10 (10) ◽

pp. 1564 ◽

Cited By ~ 3

Author(s):

Patrick Bradley ◽

Sina Keller ◽

Martin Weinmann

Keyword(s):

Feature Selection ◽

Dimensionality Reduction ◽

Hyperspectral Data ◽

Training Data ◽

High Dimensional ◽

Unsupervised Feature Selection ◽

Feature Selection Techniques ◽

The Given

In this paper, we investigate the potential of unsupervised feature selection techniques for classification tasks where only sparse training data are available. This is motivated by the fact that unsupervised feature selection techniques combine the advantages of standard dimensionality reduction techniques (which rely only on the given feature vectors and not on the corresponding labels) and supervised feature selection techniques (which retain a subset of the original set of features). Thus, feature selection becomes independent of the given classification task and, consequently, a subset of generally versatile features is retained. We present different techniques relying on the topology of the given sparse training data. The topology is described by an ultrametricity index, for which we consider the Murtagh Ultrametricity Index (MUI), which is defined on the basis of triangles within the given data, and the Topological Ultrametricity Index (TUI), which is defined on the basis of a specific graph structure. In a case study addressing the classification of high-dimensional hyperspectral data based on sparse training data, we compare the performance of the proposed unsupervised feature selection techniques with standard dimensionality reduction and supervised feature selection techniques on four commonly used benchmark datasets. The classification results reveal that supervised feature selection techniques lead to classification results similar to those of unsupervised feature selection techniques, while the latter perform feature selection independently of the given classification task and thus deliver generally versatile features.

Computational Information Geometry for Binary Classification of High-Dimensional Random Tensors

Entropy ◽

10.3390/e20030203 ◽

2018 ◽

Vol 20 (3) ◽

pp. 203 ◽

Cited By ~ 1

Author(s):

Gia-Thuy Pham ◽

Rémy Boyer ◽

Frank Nielsen

Keyword(s):

Binary Classification ◽

Information Geometry ◽

High Dimensional

Binary classification of imbalanced datasets: The case of CoIL challenge 2000

Expert Systems with Applications ◽

10.1016/j.eswa.2019.03.024 ◽

2019 ◽

Vol 128 ◽

pp. 169-186 ◽

Cited By ~ 3

Author(s):

Mohammad Rasoul Khalilpour Darzi ◽

Seyed Taghi Akhavan Niaki ◽

Majid Khedmati

Keyword(s):

Binary Classification ◽

Imbalanced Datasets

Comparative study of quality estimation of binary classification

Informatics ◽

10.37661/1816-0301-2020-17-1-87-101 ◽

2020 ◽

Vol 17 (1) ◽

pp. 87-101

Author(s):

V. V. Starovoitov ◽

Yu. I. Golub

Keyword(s):

Matthews Correlation Coefficient ◽

Binary Classification ◽

Confusion Matrix ◽

Diagnostic Odds Ratio ◽

Quality Estimation ◽

Imbalanced Datasets ◽

Error Matrix ◽

Classification Errors ◽

Sensitivity Specificity

The paper describes the results of an analytical and experimental analysis of seventeen functions used to evaluate binary classification results on arbitrary data. The results are presented as 2×2 error matrices. The behavior and properties of the main functions calculated from the elements of such matrices are studied. Classification scenarios with balanced and imbalanced datasets are analyzed. It is shown that there are linear dependencies between some functions and that many functions are invariant to transposition of the error matrix, which allows the estimate to be calculated without specifying the order in which the elements were written to the matrix. It has been proven that all classical measures, such as Sensitivity, Specificity, Precision, Accuracy, F1, F2, GM, and the Jaccard index, are sensitive to the imbalance of classified data and distort the estimation of classification errors for objects of the smaller class. Sensitivity to imbalance is also found in the Matthews correlation coefficient and Cohen’s kappa. It has been experimentally shown that functions such as the confusion entropy, the discriminatory power, and the diagnostic odds ratio should not be used for the analysis of binary classification of imbalanced datasets. The last two functions are invariant to the imbalance of classified data, but poorly evaluate results with an approximately equal overall percentage of classification errors in the two classes. We proved that the area under the ROC curve (AUC) and the Youden index calculated from the binary classification confusion matrix are linearly dependent and are the best estimation functions for both balanced and imbalanced datasets.
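The linear dependence between AUC and the Youden index stated in this abstract can be checked on a single confusion matrix. The sketch below is not the authors' code. It assumes the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), J = sensitivity + specificity - 1) and takes AUC to be the area under the ROC polyline through (0,0), (FPR, TPR) and (1,1). The function names and example counts are hypothetical.

```python
from fractions import Fraction

def rates(tp, fn, fp, tn):
    """Return (sensitivity, specificity) as exact fractions."""
    sens = Fraction(tp, tp + fn)
    spec = Fraction(tn, tn + fp)
    return sens, spec

def youden_index(tp, fn, fp, tn):
    """Youden index J = sensitivity + specificity - 1."""
    sens, spec = rates(tp, fn, fp, tn)
    return sens + spec - 1

def single_point_auc(tp, fn, fp, tn):
    """Area under the ROC polyline (0,0) -> (FPR,TPR) -> (1,1),
    computed with the trapezoid rule (assumed single operating point)."""
    sens, spec = rates(tp, fn, fp, tn)
    fpr = 1 - spec
    left = fpr * sens / 2                # triangle from x=0 to x=FPR
    right = (1 - fpr) * (sens + 1) / 2   # trapezoid from x=FPR to x=1
    return left + right

# Two hypothetical confusion matrices (TP, FN, FP, TN) with the same
# per-class error rates but very different class proportions.
balanced = (80, 20, 10, 90)
imbalanced = (8, 2, 100, 900)

for cm in (balanced, imbalanced):
    j = youden_index(*cm)
    auc = single_point_auc(*cm)
    # The linear relation AUC = (J + 1) / 2 holds exactly.
    print(cm, "J =", j, "AUC =", auc, "(J+1)/2 =", (j + 1) / 2)
```

Under these assumptions the trapezoid areas simplify to (sensitivity + specificity)/2, so AUC = (J + 1)/2 for any confusion matrix with both classes present.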

Method of determination of the text direction on the image with the use of convolutional neural network

Informatization and communication ◽

10.34219/2078-8320-2020-11-2-96-99 ◽

2020 ◽

pp. 96-99

Author(s):

P.L. Nikolaev

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Deep Neural Network ◽

Binary Classification ◽

Synthetic Data ◽

Real Data ◽

Method Of Determination ◽

Classification Of Images

This article deals with a method for the binary classification of images containing small text. Classification is based on the fact that the text can have two directions: it can be positioned horizontally and read from left to right, or it can be turned 180 degrees, so that the image must be rotated to read it. This type of text can be found on the covers of a variety of books, so when recognizing covers it is necessary to determine the direction of the text before recognizing the text itself. The article proposes a deep neural network for determining the text position in the context of book cover recognition. The results of training and testing a convolutional neural network on synthetic data are presented, along with examples of the network operating on real data.

A Study on Multi Class Classification from Breast Cancer Images using Ensemble Network and Transfer Learning

Recent Patents on Engineering ◽

10.2174/1872212114999201109205421 ◽

2020 ◽

Vol 14 ◽

Author(s):

Lahari Tipirneni ◽

Rizwan Patan

Keyword(s):

Breast Cancer ◽

Neural Network ◽

Convolutional Neural Network ◽

Binary Classification ◽

Disease Diagnosis ◽

Feature Descriptors ◽

Histopathological Images ◽

Viable Approach ◽

Multi Class Classification

Millions of deaths all over the world are caused by breast cancer every year. It has become the most common type of cancer in women. Early detection supports a better prognosis and increases the chance of survival. Automating the classification using Computer-Aided Diagnosis (CAD) systems can make the diagnosis less prone to errors. Multi-class and binary classification of breast cancer is a challenging problem. Convolutional neural network architectures extract specific feature descriptors from images, which cannot represent different types of breast cancer. This leads to false positives in classification, which is undesirable in disease diagnosis. The current paper presents an ensemble convolutional neural network for multi-class and binary classification of breast cancer. The feature descriptors from each network are combined to produce the final classification. In this paper, histopathological images are taken from the publicly available BreakHis dataset and classified into 8 classes. The proposed ensemble model can perform better than the methods proposed in the literature. The results showed that the proposed model could be a viable approach for breast cancer classification.
