Improved PSO_AdaBoost Ensemble Algorithm for Imbalanced Data

Kewen Li; Guangyue Zhou; Jiannan Zhai; Fulai Li; Mingwen Shao

doi:10.3390/s19061476

Sensors ◽

10.3390/s19061476 ◽

2019 ◽

Vol 19 (6) ◽

pp. 1476 ◽

Cited By ~ 5

Author(s):

Kewen Li ◽

Guangyue Zhou ◽

Jiannan Zhai ◽

Fulai Li ◽

Mingwen Shao

Keyword(s):

Area Under Curve ◽

Imbalanced Data ◽

Local Optimum ◽

Adaptive Boosting ◽

Adaboost Algorithm ◽

Learning Framework ◽

Comprehensive Performance ◽

Good Classification ◽

Misclassification Probability ◽

Ensemble Algorithm

The Adaptive Boosting (AdaBoost) algorithm is a widely used ensemble learning framework that achieves good classification results on general datasets. However, it is challenging to apply AdaBoost directly to imbalanced data, since it is designed mainly to handle misclassified samples rather than minority-class samples. To better process imbalanced data, this paper introduces the Area Under Curve (AUC) indicator, which reflects the comprehensive performance of the model, and proposes an improved AdaBoost algorithm based on AUC (AdaBoost-A), which improves the error calculation of AdaBoost by jointly considering misclassification probability and AUC. To prevent the redundant or useless weak classifiers generated by the traditional AdaBoost algorithm from consuming too many system resources, this paper proposes an ensemble algorithm, PSOPD-AdaBoost-A, which can re-initialize parameters to avoid falling into a local optimum and which optimizes the coefficients of the AdaBoost weak classifiers. Experimental results show that the proposed algorithm is effective for processing imbalanced data, especially data with relatively high imbalance.
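
The abstract does not give the AdaBoost-A error formula, so the following is only a rough sketch of the general idea of folding AUC into the AdaBoost error term. The blending rule `lam * miss + (1 - lam) * (1 - AUC)`, the parameter `lam`, and all function names are assumptions made for illustration, not the authors' method; the PSO-based coefficient optimization is not shown.

```python
import numpy as np

def auc_score(y, scores):
    """AUC by pairwise comparison of positives vs. negatives (ties count 0.5)."""
    pos, neg = scores[y == 1], scores[y == -1]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def stump_predict(X, stump):
    """Predict +1/-1 with a one-feature threshold rule."""
    j, t, pol = stump
    return np.where(pol * (X[:, j] - t) >= 0, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively pick the stump with the lowest weighted error."""
    best, best_err = (0, 0.0, 1), np.inf
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                err = w[stump_predict(X, (j, t, pol)) != y].sum()
                if err < best_err:
                    best, best_err = (j, t, pol), err
    return best

def adaboost_auc(X, y, rounds=10, lam=0.5):
    """AdaBoost loop whose error term blends weighted error with (1 - AUC)."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w)
        pred = stump_predict(X, stump)
        miss = w[pred != y].sum()
        # ASSUMPTION: hypothetical blend, not the formula from the paper.
        err = lam * miss + (1 - lam) * (1 - auc_score(y, pred))
        err = float(np.clip(err, 1e-10, 1 - 1e-10))
        if err >= 0.5:  # weak learner no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / err)
        ensemble.append((alpha, stump))
        w *= np.exp(-alpha * y * pred)  # up-weight misclassified samples
        w /= w.sum()
    return ensemble

def predict(X, ensemble):
    """Sign of the alpha-weighted vote of all stumps."""
    score = np.zeros(len(X))
    for alpha, stump in ensemble:
        score += alpha * stump_predict(X, stump)
    return np.where(score >= 0, 1, -1)
```

Labels are assumed to be in {-1, +1}, with +1 as the minority class. Because a stump that ignores the minority class has an AUC near 0.5, the blended error penalizes it even when its weighted misclassification rate is low.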

A Measure Optimized Cost-Sensitive Learning Framework for Imbalanced Data Classification

Artificial Intelligence ◽

10.4018/978-1-5225-1759-7.ch026 ◽

2017 ◽

pp. 611-640

Author(s):

Peng Cao ◽

Osmar R. Zaiane ◽

Dazhe Zhao

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Cost Sensitive Learning ◽

Learning Framework ◽

Imbalanced Data Classification

A Multi-label Multimodal Deep Learning Framework for Imbalanced Data Classification

2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR) ◽

10.1109/mipr.2019.00043 ◽

2019 ◽

Cited By ~ 3

Author(s):

Samira Pouyanfar ◽

Tianyi Wang ◽

Shu-Ching Chen

Keyword(s):

Deep Learning ◽

Imbalanced Data ◽

Data Classification ◽

Learning Framework ◽

Imbalanced Data Classification

Ensemble Method of Effective AdaBoost Algorithm for Decision Tree Classifiers

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213017500075 ◽

2016 ◽

Vol 26 (03) ◽

pp. 1750007 ◽

Cited By ~ 2

Author(s):

S. Dinakaran ◽

P. Ranjit Jeba Thangaiah

Keyword(s):

Random Forest ◽

Decision Tree ◽

Classification Accuracy ◽

Time Complexity ◽

Statistical Test ◽

Ensemble Method ◽

Adaptive Boosting ◽

Adaboost Algorithm ◽

Comparison Results ◽

Boosting Algorithms

This article introduces a novel ensemble method named eAdaBoost (Effective Adaptive Boosting), a meta classifier developed by enhancing the existing AdaBoost algorithm to reduce time complexity and to produce the best classification accuracy. eAdaBoost reduces the error rate compared with existing methods and achieves the best accuracy by reweighing each feature for further processing. The comparison results of an extensive experimental evaluation of the proposed method are reported on datasets from the UCI machine learning repository. Classifier accuracy and statistical tests are compared against various boosting algorithms. The proposed eAdaBoost has also been implemented with different decision tree classifiers, namely C4.5, Decision Stump, NB Tree and Random Forest. The algorithm has been run on various datasets with different weight thresholds, and its performance is analyzed. For a few datasets, the proposed method produces better results with Random Forest and NB Tree as the base classifier than with Decision Stump and C4.5. eAdaBoost gives better classification accuracy and prediction accuracy, and its execution time is also lower than that of other classifiers.

DeepGly: A Deep Learning Framework With Recurrent and Convolutional Neural Networks to Identify Protein Glycation Sites From Imbalanced Data

IEEE Access ◽

10.1109/access.2019.2944411 ◽

2019 ◽

Vol 7 ◽

pp. 142368-142378

Author(s):

Jingui Chen ◽

Runtao Yang ◽

Chengjin Zhang ◽

Lina Zhang ◽

Qian Zhang

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Convolutional Neural Networks ◽

Imbalanced Data ◽

Protein Glycation ◽

Learning Framework

A Learning Framework for Medical Image-Based Intelligent Diagnosis from Imbalanced Datasets

10.3233/shti210801 ◽

2021 ◽

Author(s):

Tetiana Biloborodova ◽

Inna Skarga-Bandurova ◽

Mark Koverha ◽

Illia Skarha-Bandurov ◽

Yelyzaveta Yevsieieva

Keyword(s):

Image Classification ◽

Predictive Models ◽

Medical Image ◽

Imbalanced Data ◽

Classification Performance ◽

Data Reuse ◽

Imbalanced Datasets ◽

Learning Framework ◽

Class Distribution ◽

Medical Image Classification

Medical image classification and diagnosis based on machine learning have made significant achievements and gradually penetrated the healthcare industry. However, medical data characteristics such as relatively small datasets for rare diseases or imbalance in class distribution for rare conditions significantly restrain their adoption and reuse. Imbalanced datasets lead to difficulties in learning and obtaining accurate predictive models. This paper follows the FAIR paradigm and proposes a technique for the alignment of class distribution, which improves image classification performance on imbalanced data and supports data reuse. Experiments on an acne disease dataset indicate that the proposed framework outperforms the baselines and achieves up to a 5% improvement in image classification.
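
The abstract does not describe how the class distribution is aligned. As a generic stand-in, the sketch below shows plain random oversampling, which duplicates minority-class samples until every class matches the largest one. The function name `oversample_to_balance` and its parameters are hypothetical, and this is not claimed to be the technique proposed in the paper.

```python
import numpy as np

def oversample_to_balance(X, y, seed=0):
    """Resample with replacement so every class has as many rows as the largest class."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    chosen = []
    for c, n in zip(classes, counts):
        members = np.flatnonzero(y == c)
        # Keep every original sample, then draw the shortfall at random.
        extra = rng.choice(members, size=target - n, replace=True)
        chosen.append(np.concatenate([members, extra]))
    idx = np.concatenate(chosen)
    return X[idx], y[idx]
```

Oversampling should be applied to the training split only; duplicating samples before splitting would leak copies of the same image into the evaluation data.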

A Hybrid Imbalanced Data Learning Framework to Tackle Opinion Imbalance in Movie Reviews

Communication Software and Networks - Lecture Notes in Networks and Systems ◽

10.1007/978-981-15-5397-4_46 ◽

2020 ◽

pp. 453-462

Author(s):

Salina Adinarayana ◽

E. Ilavarasan

Keyword(s):

Imbalanced Data ◽

Learning Framework ◽

Imbalanced Data Learning

Improving ADABoost Algorithm with Weighted SVM for Imbalanced Data Classification

10.1007/978-3-030-91387-8_9 ◽

2021 ◽

pp. 125-136

Author(s):

Vo Duc Quang ◽

Tran Dinh Khang ◽

Nguyen Minh Huy

Keyword(s):

Imbalanced Data ◽

Data Classification ◽

Adaboost Algorithm ◽

Imbalanced Data Classification

Object Detection using Feature Mining in a Distributed Machine Learning Framework

10.51202/9783186855107 ◽

2017 ◽

Author(s):

Arne Ehlers

Keyword(s):

Machine Learning ◽

Object Detection ◽

Training Data ◽

Visual Object ◽

Ensemble Classifiers ◽

Adaptive Boosting ◽

Learning Framework ◽

Theory Of Evidence ◽

Feature Mining ◽

Distributed Machine Learning

This dissertation addresses the problem of visual object detection based on machine-learned classifiers. A distributed machine learning framework is developed to learn detectors for several object classes, creating cascaded ensemble classifiers with the Adaptive Boosting algorithm. Methods are proposed that enhance several components of an object detection framework. First, the thesis deals with augmenting the training data in order to improve the performance of object detectors learned from sparse training sets. Secondly, feature mining strategies are introduced to create feature sets that are customized to the object class to be detected. Furthermore, a novel class of fractal features is proposed that can represent a wide variety of shapes. Thirdly, a method is introduced that models and combines internal confidences and uncertainties of the cascaded detector using Dempster’s theory of evidence in order to increase the quality of the post-processing. ...

A Selective Ensemble Learning Framework for ECG-Based Heartbeat Classification with Imbalanced Data

2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM) ◽

10.1109/bibm.2018.8621523 ◽

2018 ◽

Cited By ~ 1

Author(s):

Hongwei Ge ◽

Keyi Sun ◽

Liang Sun ◽

Mingde Zhao ◽

Chunguo Wu

Keyword(s):

Ensemble Learning ◽

Imbalanced Data ◽

Heartbeat Classification ◽

Learning Framework ◽

Selective Ensemble

Dual Adversarial Co-Learning for Multi-Domain Text Classification

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.6115 ◽

2020 ◽

Vol 34 (04) ◽

pp. 6438-6445

Author(s):

Yuan Wu ◽

Yuhong Guo

Keyword(s):

Text Classification ◽

State Of The Art ◽

Digital Data ◽

Classification Model ◽

Classification Models ◽

Learning Framework ◽

Good Classification ◽

Classification Tasks ◽

Multiple Domains ◽

Learned Features

With the advent of deep learning, the performance of text classification models has improved significantly. Nevertheless, the successful training of a good classification model requires a sufficient amount of labeled data, and annotating data is expensive and time consuming. With the rapid growth of digital data, similar classification tasks typically occur in multiple domains, while the availability of labeled data can vary widely across domains. Some domains may have abundant labeled data, while others may have only a limited amount (or none). Meanwhile, text classification tasks are highly domain-dependent: a text classifier trained in one domain may not perform well in another domain. To address these issues, in this paper we propose a novel dual adversarial co-learning approach for multi-domain text classification (MDTC). The approach learns shared-private networks for feature extraction and deploys dual adversarial regularizations to align features across different domains and between labeled and unlabeled data simultaneously under a discrepancy-based co-learning framework, aiming to improve the classifiers' generalization capacity with the learned features. We conduct experiments on multi-domain sentiment classification datasets. The results show that the proposed approach achieves state-of-the-art MDTC performance.
