A Framework for Supervised Classification Performance Analysis with Information-Theoretic Methods

Francisco J. Valverde-Albacete; Carmen Pelaez-Moreno

doi:10.1109/tkde.2019.2915643

IEEE Transactions on Knowledge and Data Engineering ◽

10.1109/tkde.2019.2915643 ◽

2020 ◽

Vol 32 (11) ◽

pp. 2075-2087

Author(s):

Francisco J. Valverde-Albacete ◽

Carmen Pelaez-Moreno

Keyword(s):

Performance Analysis ◽

Supervised Classification ◽

Classification Performance ◽

Information Theoretic ◽

Information Theoretic Methods

Information-Theoretic Representation Learning for Positive-Unlabeled Classification

Neural Computation ◽

10.1162/neco_a_01337 ◽

2021 ◽

Vol 33 (1) ◽

pp. 244-268

Author(s):

Tomoya Sakai ◽

Gang Niu ◽

Masashi Sugiyama

Keyword(s):

Supervised Classification ◽

Principal Component ◽

Representation Learning ◽

Classification Performance ◽

Accurate Estimate ◽

Information Theoretic ◽

Information Maximization ◽

Weakly Supervised ◽

Weakly Supervised Classification ◽

Maximization Principle

Recent advances in weakly supervised classification allow us to train a classifier from only positive and unlabeled (PU) data. However, existing PU classification methods typically require an accurate estimate of the class-prior probability, which is a critical bottleneck, particularly for high-dimensional data. This problem has commonly been addressed by applying principal component analysis in advance, but such unsupervised dimension reduction can collapse the underlying class structure. In this letter, we propose a novel representation learning method from PU data based on the information-maximization principle. Our method does not require class-prior estimation and thus can be used as a preprocessing method for PU classification. Through experiments, we demonstrate that our method, combined with deep neural networks, substantially improves the accuracy of PU class-prior estimation, leading to state-of-the-art PU classification performance.
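To illustrate why the class prior matters for existing PU methods, the sketch below implements one widely used unbiased PU risk estimator. It is not the representation learning method proposed in the letter; the function names, the sigmoid loss, and the toy scores are all assumptions chosen for illustration.

```python
import math


def sigmoid_loss(margin):
    """Sigmoid loss l(z) = 1 / (1 + exp(z)), where z = y * g(x)."""
    return 1.0 / (1.0 + math.exp(margin))


def mean(values):
    return sum(values) / len(values)


def pu_risk(scores_pos, scores_unl, class_prior, loss=sigmoid_loss):
    """Unbiased PU risk estimate (illustrative, hypothetical helper).

    scores_pos:  classifier outputs g(x) on labeled positive samples
    scores_unl:  classifier outputs g(x) on unlabeled samples
    class_prior: assumed fraction of positives in the unlabeled data
    """
    # Risk of treating positives as positive (y = +1).
    risk_pos_as_pos = mean([loss(g) for g in scores_pos])
    # Risk of treating positives as negative (y = -1).
    risk_pos_as_neg = mean([loss(-g) for g in scores_pos])
    # Risk of treating all unlabeled samples as negative (y = -1).
    risk_unl_as_neg = mean([loss(-g) for g in scores_unl])
    # The unlabeled term is corrected by the positive part it contains,
    # which is why the class prior appears in the estimate.
    return (class_prior * risk_pos_as_pos
            + risk_unl_as_neg
            - class_prior * risk_pos_as_neg)


# Toy scores: the unlabeled set is half positive, half negative.
scores_pos = [2.0, 1.0]
scores_neg = [-1.0, -2.0]
scores_unl = scores_pos + scores_neg

risk_true_prior = pu_risk(scores_pos, scores_unl, class_prior=0.5)
risk_wrong_prior = pu_risk(scores_pos, scores_unl, class_prior=0.9)
```

With the correct prior of 0.5, the estimate on this toy data coincides with the ordinary supervised risk computed from the positive and negative scores; with a misspecified prior it does not, which is the sensitivity the abstract refers to.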

Semi-Supervised Classification and its Application to Filtering IDS False Positives

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.427-429.2309 ◽

2013 ◽

Vol 427-429 ◽

pp. 2309-2312

Author(s):

Hai Bin Mei ◽

Ming Hua Zhang

Keyword(s):

Supervised Learning ◽

Supervised Classification ◽

Classification Performance ◽

False Positives ◽

Training Data ◽

Classification Model ◽

Classification Technique

Alert classifiers built with supervised classification techniques require large amounts of labeled training alerts. Preparing such training data is difficult and expensive, which greatly restricts the accuracy and feasibility of current classifiers. This paper employs semi-supervised learning to build an alert classification model that reduces the number of labeled training alerts needed. Alert context properties are also introduced to improve classification performance. Experiments demonstrate the accuracy and feasibility of our approach.

Information-Theoretic Methods Applied to Dispatch of Emergency Services Data

Augmented Cognition. Human Cognition and Behavior - Lecture Notes in Computer Science ◽

10.1007/978-3-030-50439-7_24 ◽

2020 ◽

pp. 353-370

Author(s):

Monte Hancock ◽

Katy Hancock ◽

Marie Tree ◽

Mitchell Kirshner ◽

Benjamin Bowles

Keyword(s):

Emergency Services ◽

Information Theoretic ◽

Information Theoretic Methods

Information-theoretic methods for studying population codes

Neural Networks ◽

10.1016/j.neunet.2010.05.008 ◽

2010 ◽

Vol 23 (6) ◽

pp. 713-727 ◽

Cited By ~ 36

Author(s):

Robin A.A. Ince ◽

Riccardo Senatore ◽

Ehsan Arabzadeh ◽

Fernando Montani ◽

Mathew E. Diamond ◽

...

Keyword(s):

Information Theoretic ◽

Population Codes ◽

Information Theoretic Methods

An Evaluation of Information-Theoretic Methods for Detecting Structural Microbial Biosignatures

Astrobiology ◽

10.1089/ast.2008.0301 ◽

2010 ◽

Vol 10 (4) ◽

pp. 363-379 ◽

Cited By ~ 8

Author(s):

Kiri L. Wagstaff ◽

Frank A. Corsetti

Keyword(s):

Information Theoretic ◽

Information Theoretic Methods

Language engineering and information theoretic methods in protein sequence similarity studies

Computational Intelligence in Medical Informatics - Studies in Computational Intelligence ◽

10.1007/978-3-540-75767-2_8 ◽

2008 ◽

pp. 151-183 ◽

Cited By ~ 5

Author(s):

A. Bogan-Marta ◽

A. Hategan ◽

I. Pitas

Keyword(s):

Protein Sequence ◽

Sequence Similarity ◽

Language Engineering ◽

Information Theoretic ◽

Protein Sequence Similarity ◽

Information Theoretic Methods

Generalizing Benford's Law

Benford's Law ◽

10.23943/princeton/9780691147611.003.0017 ◽

2015 ◽

Author(s):

Joanne Lee ◽

Wendy K. Tam Cho ◽

George Judge

Keyword(s):

Scientific Misconduct ◽

Adaptive Methods ◽

Data Sets ◽

Federal Grants ◽

Information Theoretic ◽

Data Adaptive ◽

The University ◽

University Of Vermont ◽

Case Data ◽

Information Theoretic Methods

This chapter searches for evidence of fraud in two clinical data sets from a highly publicized case of scientific misconduct. In this case, data were falsified by Eric Poehlman, a faculty member at the University of Vermont, who pleaded guilty to fabricating more than a decade of data, some connected to federal grants from the National Institutes of Health. Poehlman had authored influential studies on many topics, including obesity, menopause, lipids, and aging. The chapter's classical Benford analysis, along with a presentation of a more general class of Benford-like distributions, yields insights into this and similar cases. In addition, the chapter demonstrates how information-theoretic methods and other data-adaptive methods are promising tools for generating benchmark distributions of first significant digits (FSDs) and examining data sets for departures from expectations.
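As a minimal sketch of the classical Benford check mentioned above, the code below compares observed first-significant-digit frequencies with the Benford benchmark P(d) = log10(1 + 1/d) and measures the departure with the Kullback-Leibler divergence. The function names and the choice of divergence are assumptions for illustration; the chapter's generalized Benford-like distributions and its actual data are not reproduced here.

```python
import math
from collections import Counter


def benford_expected():
    """Classical Benford probabilities for first digits 1 through 9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_significant_digit(x):
    """Return the first nonzero digit of a nonzero number."""
    # Scientific notation puts the leading digit first: '1.23...e-03'.
    return int(f"{abs(x):.15e}"[0])


def fsd_frequencies(values):
    """Relative frequency of each first significant digit, zeros skipped."""
    digits = [first_significant_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one nonzero value")
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def kl_divergence(observed, expected):
    """KL divergence D(observed || expected) in nats; zero terms skipped."""
    return sum(p * math.log(p / expected[d])
               for d, p in observed.items() if p > 0)


# Illustrative data only: powers of two are known to follow Benford's law
# closely, whereas uniformly spread leading digits do not.
benford_like = [float(2 ** n) for n in range(100)]
uniform_like = [float(d) for d in range(1, 10)] * 10

expected = benford_expected()
departure_benford_like = kl_divergence(fsd_frequencies(benford_like), expected)
departure_uniform_like = kl_divergence(fsd_frequencies(uniform_like), expected)
```

A small divergence indicates that a data set's leading digits are close to the benchmark; a large one flags the data set for closer inspection, not as proof of fabrication.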
