class skew Latest Research Papers

A class skew-insensitive ACO-based decision tree algorithm for imbalanced data sets

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v21.i1.pp412-419 ◽ 2021 ◽ Vol 21 (1) ◽ pp. 412-419

Author(s): Muhamad Hasbullah Bin Mohd Razali ◽ Rizauddin Bin Saian ◽ Yap Bee Wah ◽ Ku Ruhana Ku-Mahamud

Keyword(s): Decision Tree ◽ Statistical Significance ◽ Imbalanced Data ◽ Predictive Ability ◽ Significance Test ◽ Data Sets ◽ Decision Tree Algorithm ◽ Tree Algorithm ◽ Imbalanced Data Sets ◽ Class Skew

Ant-tree-miner (ATM) has an advantage over conventional decision tree algorithms in terms of feature selection. However, real-world applications commonly involve imbalanced class problems in which the classes have different importance. This condition prevents the entropy-based heuristic of the existing ATM algorithm from developing effective decision boundaries because of its bias towards the dominant class. Consequently, the induced decision trees are dominated by the majority class and lack predictive ability on the rare class. This study proposes an enhanced algorithm called Hellinger-ant-tree-miner (HATM), inspired by the ant colony optimization (ACO) metaheuristic, for imbalanced learning using a decision tree classification algorithm. The proposed algorithm was compared with the existing algorithm, ATM, on nine publicly available imbalanced data sets. A simulation study reveals the superiority of HATM when the sample size increases with skewed class (Imbalanced Ratio < 50%). Experimental results demonstrate that the performance of the existing algorithm, measured by balanced accuracy (BACC), is improved by the class skew-insensitivity of the Hellinger distance. The statistical significance test shows that HATM has a higher mean BACC score than ATM.
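The abstract does not spell out the HATM heuristic, so the following is only a minimal sketch of the two quantities it names, assuming the binary-class Hellinger split value used in Hellinger distance decision trees and the usual definition of balanced accuracy as the mean of the true positive and true negative rates. The function names and toy data are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def hellinger_split_value(feature_values, labels, positive):
    """Hellinger distance between the two class-conditional distributions
    of a single categorical feature (assumed binary-class setting)."""
    pos = Counter(v for v, y in zip(feature_values, labels) if y == positive)
    neg = Counter(v for v, y in zip(feature_values, labels) if y != positive)
    n_pos, n_neg = sum(pos.values()), sum(neg.values())
    if n_pos == 0 or n_neg == 0:
        return 0.0  # only one class present, nothing to separate
    squared = sum(
        (math.sqrt(pos[v] / n_pos) - math.sqrt(neg[v] / n_neg)) ** 2
        for v in set(pos) | set(neg)
    )
    return math.sqrt(squared)


def balanced_accuracy(y_true, y_pred, positive):
    """Mean of the true positive rate and the true negative rate."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    n_pos = sum(1 for t in y_true if t == positive)
    n_neg = len(y_true) - n_pos
    return 0.5 * (tp / n_pos + tn / n_neg)


# Toy data (hypothetical): 4 rare and 4 common examples.
feature = ["a", "a", "a", "b", "a", "b", "b", "b"]
labels = ["rare"] * 4 + ["common"] * 4
balanced_value = hellinger_split_value(feature, labels, "rare")

# Replicate the common class 10 times to create class skew.
skewed_feature = feature[:4] + feature[4:] * 10
skewed_labels = labels[:4] + labels[4:] * 10
skewed_value = hellinger_split_value(skewed_feature, skewed_labels, "rare")

# A classifier that always predicts the majority class.
y_true = ["rare"] * 2 + ["common"] * 8
y_pred = ["common"] * 10
majority_bacc = balanced_accuracy(y_true, y_pred, "rare")
```

Because the split value depends only on within-class proportions, replicating the majority class leaves it unchanged in this sketch, which is the sense in which the criterion is skew-insensitive. The always-majority classifier reaches 80% plain accuracy on the toy labels but only 0.5 balanced accuracy.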

On Using Meta-Features to Learn Under Class Skew in Biomedical Domains

2020 IEEE 33rd International Symposium on Computer-Based Medical Systems (CBMS) ◽ 10.1109/cbms49503.2020.00054 ◽ 2020

Author(s): Rosa Sicilia ◽ Ermanno Cordelli ◽ Paolo Soda

Keyword(s): Class Skew

Subpopulation Discovery in Epidemiological Data with Subspace Clustering

Foundations of Computing and Decision Sciences ◽ 10.2478/fcds-2014-0015 ◽ 2014 ◽ Vol 39 (4) ◽ pp. 271-300 ◽ Cited By ~ 3

Author(s): Uli Niemann ◽ Myra Spiliopoulou ◽ Henry Völzke ◽ Jens-Peter Kühn

Keyword(s): Risk Factors ◽ Personalized Medicine ◽ Quality Assessment ◽ Cluster Structure ◽ Subspace Clustering ◽ Ground Truth ◽ Epidemiological Data ◽ Clustering Methods ◽ Class Skew ◽ Epidemiological Cohort

A prerequisite of personalized medicine is the identification of groups of people who share specific risk factors towards an outcome. We investigate the potential of subspace clustering for finding such groups in epidemiological data. We propose a workflow that encompasses clusterability assessment before cluster discovery and quality assessment after learning the clusters. Epidemiological data usually do not have a ground truth for the verification of clusters found in subspaces. Hence, we introduce quality assessment through juxtaposition of the learned models to “models-of-randomness”, i.e. models that do not reflect a true cluster structure. On the basis of this workflow, we select subspace clustering methods and compare and discuss their performance. We use a dataset with hepatic steatosis as the outcome, but our findings apply to arbitrary epidemiological cohort data that have tens of variables and exhibit class skew.

Model Assessment with ROC Curves

Encyclopedia of Data Warehousing and Mining, Second Edition ◽ 10.4018/978-1-60566-010-3.ch204 ◽ 2011 ◽ pp. 1316-1323 ◽ Cited By ~ 5

Author(s): Lutz Hamel

Keyword(s): Performance Metrics ◽ Confusion Matrix ◽ Model Performance ◽ ROC Curves ◽ Parametric Models ◽ Classification Model ◽ Model Assessment ◽ Classification Models ◽ The Difference ◽ Class Skew

Classification models, and in particular binary classification models, are ubiquitous in many branches of science and business. Consider, for example, classification models in bioinformatics that classify catalytic protein structures as being in an active or inactive conformation. As an example from the field of medical informatics, we might consider a classification model that, given the parameters of a tumor, will classify it as malignant or benign. Finally, a classification model in a bank might be used to tell the difference between a legal and a fraudulent transaction. Central to constructing, deploying, and using classification models is the question of model performance assessment (Hastie, Tibshirani, & Friedman, 2001). Traditionally this is accomplished by using metrics derived from the confusion matrix or contingency table. However, it has been recognized that (a) a scalar is a poor summary of the performance of a model, in particular when deploying non-parametric models such as artificial neural networks or decision trees (Provost, Fawcett, & Kohavi, 1998), and (b) some performance metrics derived from the confusion matrix are sensitive to data anomalies such as class skew (Fawcett & Flach, 2005). Recently it has been observed that Receiver Operating Characteristic (ROC) curves visually convey the same information as the confusion matrix in a much more intuitive and robust fashion (Swets, Dawes, & Monahan, 2000). Here we take a look at model performance metrics derived from the confusion matrix. We highlight their shortcomings and illustrate how ROC curves can be deployed for model assessment in order to provide a much deeper and perhaps more intuitive analysis of the models. We also briefly address the problem of model selection.
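The contrast between confusion-matrix metrics and ROC analysis can be illustrated with a small sketch. It is not code from the chapter: the helper names, the toy scores, and the 0.5 decision threshold are all assumptions chosen for illustration. It computes accuracy, true positive rate, and false positive rate from a confusion matrix, builds ROC points by sweeping the threshold, and then repeats the calculation after replicating the negative class tenfold.

```python
def rates(y_true, y_pred):
    """Accuracy, TPR and FPR from the binary confusion matrix (1 = positive)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + fn + fp + tn),
        "tpr": tp / (tp + fn),
        "fpr": fp / (fp + tn),
    }


def roc_points(y_true, scores):
    """(FPR, TPR) pairs obtained by lowering the threshold one score at a time."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): 3 positives and 3 negatives with classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Skewed copy: every negative example is replicated 10 times.
skewed = [(t, s) for t, s in zip(y_true, scores)
          for _ in range(1 if t == 1 else 10)]
skewed_true = [t for t, _ in skewed]
skewed_scores = [s for _, s in skewed]

balanced_rates = rates(y_true, [int(s >= 0.5) for s in scores])
skewed_rates = rates(skewed_true, [int(s >= 0.5) for s in skewed_scores])
balanced_auc = auc(roc_points(y_true, scores))
skewed_auc = auc(roc_points(skewed_true, skewed_scores))
```

In this toy setup accuracy drops from 5/6 to 23/33 once the negatives are replicated, while the true positive rate, the false positive rate, and the area under the ROC curve stay the same, since each of them is computed within a single class.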

Four-class skew-symmetric association schemes

Journal of Combinatorial Theory Series A ◽ 10.1016/j.jcta.2010.12.002 ◽ 2011 ◽ Vol 118 (4) ◽ pp. 1381-1391 ◽ Cited By ~ 2

Author(s): Jianmin Ma ◽ Kaishun Wang

Keyword(s): Association Schemes ◽ Class Skew

Mitotic HEp-2 Cells Recognition under Class Skew

Image Analysis and Processing – ICIAP 2011 - Lecture Notes in Computer Science ◽ 10.1007/978-3-642-24088-1_37 ◽ 2011 ◽ pp. 353-362 ◽ Cited By ~ 7

Author(s): Gennaro Percannella ◽ Paolo Soda ◽ Mario Vento

Keyword(s): Class Skew

Dealing with Class Skew in Context Recognition

26th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW'06) ◽ 10.1109/icdcsw.2006.36 ◽ 2006 ◽ Cited By ~ 8

Author(s): M. Stager ◽ P. Lukowicz ◽ G. Troster

Keyword(s): Context Recognition ◽ Class Skew
