A comprehensive survey of imbalanced learning methods for bankruptcy prediction

Evaluating Active Learning Methods for Bankruptcy Prediction

Brain Function Assessment in Learning - Lecture Notes in Computer Science ◽

10.1007/978-3-319-67615-9_5 ◽

2017 ◽

pp. 57-66 ◽

Cited By ~ 2

Author(s):

Georgios Kostopoulos ◽

Stamatis Karlos ◽

Sotiris Kotsiantis ◽

Vassilis Tampakas

Keyword(s):

Active Learning ◽

Bankruptcy Prediction ◽

Learning Methods

Exploring Video Captioning Techniques: A Comprehensive Survey on Deep Learning Methods

SN Computer Science ◽

10.1007/s42979-021-00487-x ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Saiful Islam ◽

Aurpan Dash ◽

Ashek Seum ◽

Amir Hossain Raj ◽

Tonmoy Hossain ◽

...

Keyword(s):

Deep Learning ◽

Learning Methods ◽

Video Captioning ◽

Comprehensive Survey

A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

Abstract and Applied Analysis ◽

10.1155/2013/196256 ◽

2013 ◽

Vol 2013 ◽

pp. 1-6 ◽

Cited By ~ 9

Author(s):

Yong Zhang ◽

Dapeng Wang

Keyword(s):

Imbalanced Data ◽

Ensemble Method ◽

Support Vector ◽

Data Sets ◽

Imbalanced Learning ◽

Imbalanced Datasets ◽

Learning Methods ◽

Training Samples ◽

Imbalanced Data Classification ◽

Area Under Roc Curve

In imbalanced learning, resampling methods modify an imbalanced dataset to form a balanced one, and many base classifiers perform better on balanced datasets than on imbalanced ones. This paper proposes a cost-sensitive ensemble method based on a cost-sensitive support vector machine (SVM) and query-by-committee (QBC) to solve imbalanced data classification. The proposed method first divides the majority-class dataset into several sub-datasets according to the proportion of imbalanced samples and trains subclassifiers using the AdaBoost method. It then generates candidate training samples with the QBC active learning method and uses the cost-sensitive SVM to learn from them. Experimental results on 5 class-imbalanced datasets show that the proposed method achieves a higher area under the ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods.
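The first step described in this abstract, dividing the majority class into sub-datasets according to the imbalance ratio, might look roughly like the sketch below. The function name `split_majority` and the rule "number of sub-datasets = rounded majority/minority ratio" are assumptions made for illustration; the abstract does not give the exact partitioning rule, and the AdaBoost, QBC, and cost-sensitive SVM stages are not shown.

```python
import random


def split_majority(majority, minority, seed=0):
    """Split the majority class into roughly minority-sized chunks.

    Assumption (not from the paper): the number of chunks is the rounded
    majority/minority ratio. Each chunk is paired with the full minority
    class, giving one roughly balanced training set per subclassifier.
    """
    if not minority:
        raise ValueError("minority class must be non-empty")
    n_subsets = max(1, round(len(majority) / len(minority)))
    shuffled = list(majority)
    random.Random(seed).shuffle(shuffled)
    # Deal samples round-robin so chunk sizes differ by at most one.
    chunks = [shuffled[i::n_subsets] for i in range(n_subsets)]
    return [(chunk, list(minority)) for chunk in chunks]


# Hypothetical data: 100 majority samples and 10 minority samples.
subsets = split_majority(list(range(100)), list(range(100, 110)))
```

Each `(chunk, minority)` pair could then be passed to whatever base learner is used for the subclassifiers.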

A Comprehensive Survey on Various Machine Learning Methods used for Intrusion Detection System

2020 IEEE 9th International Conference on Communication Systems and Network Technologies (CSNT) ◽

10.1109/csnt48778.2020.9115764 ◽

2020 ◽

Author(s):

Akshay Ramesh bhai Gupta ◽

Jitendra Agrawal

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Detection System ◽

Learning Methods ◽

Machine Learning Methods ◽

Comprehensive Survey

A Survey on Deep Learning for Human Activity Recognition

ACM Computing Surveys ◽

10.1145/3472290 ◽

2022 ◽

Vol 54 (8) ◽

pp. 1-34

Author(s):

Fuqiang Gu ◽

Mu-Huan Chung ◽

Mark Chignell ◽

Shahrokh Valaee ◽

Baoding Zhou ◽

...

Keyword(s):

Deep Learning ◽

Activity Recognition ◽

Human Activity ◽

Smart Home ◽

State Of The Art ◽

Human Activity Recognition ◽

Learning Methods ◽

Machine Learning Methods ◽

Comprehensive Survey ◽

Conventional Machine

Human activity recognition is key to many applications, such as healthcare and smart homes. In this study, we provide a comprehensive survey of recent advances and challenges in human activity recognition (HAR) with deep learning. Although there are many surveys on HAR, they focus mainly on the taxonomy of HAR and review state-of-the-art HAR systems implemented with conventional machine learning methods. Several recent works have also reviewed studies that use deep models for HAR, but these works cover only a few deep models and their variants. There is still a need for a comprehensive and in-depth survey on HAR with recently developed deep learning methods.

Tackling the Problem of Class Imbalance in Multi-class Sentiment Classification: An Experimental Study

Foundations of Computing and Decision Sciences ◽

10.2478/fcds-2019-0009 ◽

2019 ◽

Vol 44 (2) ◽

pp. 151-178 ◽

Cited By ~ 1

Author(s):

Mateusz Lango

Keyword(s):

Experimental Study ◽

Sentiment Analysis ◽

Learning Algorithms ◽

Class Imbalance ◽

Sentiment Classification ◽

Important Task ◽

Imbalanced Learning ◽

Learning Methods ◽

Feature Representations ◽

The Impact

Sentiment classification is an important task that has gained extensive attention in both academia and industry. Many issues related to this task, such as the handling of negation or sarcastic utterances, were analyzed and addressed in previous works. However, the issue of class imbalance, which often compromises the prediction capabilities of learning algorithms, has scarcely been studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study including twelve imbalanced learning preprocessing methods, four feature representations, and a dozen datasets is carried out to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, the data difficulty factors commonly studied in imbalanced learning are investigated on sentiment corpora to evaluate the impact of class imbalance.

An Impartial Semi-Supervised Learning Strategy for Imbalanced Classification on VHR Images

Sensors ◽

10.3390/s20226699 ◽

2020 ◽

Vol 20 (22) ◽

pp. 6699

Author(s):

Fei Sun ◽

Fang Fang ◽

Run Wang ◽

Bo Wan ◽

Qinghua Guo ◽

...

Keyword(s):

Remote Sensing ◽

Supervised Learning ◽

Learning Strategy ◽

Gradient Boosting ◽

Support Vector ◽

Imbalanced Learning ◽

Learning Methods ◽

Minority Class ◽

Imbalanced Classification ◽

Extreme Gradient Boosting

Imbalanced learning is a common problem in remote sensing imagery-based land-use and land-cover classifications. Imbalanced learning can lead to a reduction in classification accuracy and even the omission of the minority class. In this paper, an impartial semi-supervised learning strategy based on extreme gradient boosting (ISS-XGB) is proposed to classify very high resolution (VHR) images with imbalanced data. ISS-XGB solves multi-class classification by using several semi-supervised classifiers. It first employs multi-group unlabeled data to eliminate the imbalance of training samples and then utilizes gradient boosting-based regression to simulate the target classes with positive and unlabeled samples. In this study, experiments were conducted on eight study areas with different imbalanced situations. The results showed that ISS-XGB provided a comparable but more stable performance than most commonly used classification approaches (i.e., random forest (RF), XGB, multilayer perceptron (MLP), and support vector machine (SVM)), positive and unlabeled learning (PU-Learning) methods (PU-BP and PU-SVM), and typical synthetic sample-based imbalanced learning methods. Especially under extremely imbalanced situations, ISS-XGB can provide high accuracy for the minority class without losing overall performance (the average overall accuracy achieves 85.92%). The proposed strategy has great potential in solving the imbalanced classification problems in remote sensing.

Consensus Clustering-Based Undersampling Approach to Imbalanced Learning

Scientific Programming ◽

10.1155/2019/5901087 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14 ◽

Cited By ~ 7

Author(s):

Aytuğ Onan

Keyword(s):

Large Scale ◽

Nearest Neighbor ◽

Clustering Algorithms ◽

Class Imbalance ◽

Bankruptcy Prediction ◽

Support Vector ◽

Small Scale ◽

Consensus Clustering ◽

Imbalanced Learning ◽

Minority Class

Class imbalance is an important problem in machine learning applications, in which one class (the minority class) has an extremely small number of instances and the other class (the majority class) has a very large number of instances. Imbalanced datasets arise in several important real-world applications, including medical diagnosis, malware detection, anomaly identification, bankruptcy prediction, and spam filtering. In this paper, we present a consensus clustering-based undersampling approach to imbalanced learning. In this scheme, the majority class is undersampled using a consensus clustering-based scheme. In the empirical analysis, 44 small-scale and 2 large-scale imbalanced classification benchmarks were used. In the consensus clustering schemes, five clustering algorithms (namely, k-means, k-modes, k-means++, self-organizing maps, and the DIANA algorithm) and their combinations were considered. In the classification phase, five supervised learning methods (namely, naïve Bayes, logistic regression, support vector machines, random forests, and the k-nearest neighbor algorithm) and three ensemble learning methods (namely, AdaBoost, bagging, and the random subspace algorithm) were used. The empirical results indicate that the proposed heterogeneous consensus clustering-based undersampling scheme yields better predictive performance.
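The general idea of clustering-based undersampling can be sketched as follows. This is a simplified illustration that uses a single plain k-means run instead of the consensus of several clustering algorithms described in the abstract; the function names, and the choice to keep the one majority sample nearest each centroid, are assumptions made for the example and are not taken from the paper.

```python
import random


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means over a list of tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(n_iter):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda j: _sq_dist(p, centroids[j]))
            clusters[idx].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids


def cluster_undersample(majority, n_keep, seed=0):
    """Keep one real majority sample per cluster (the one nearest its centroid)."""
    centroids = kmeans(majority, n_keep, seed=seed)
    remaining = list(majority)
    kept = []
    for c in centroids:
        nearest = min(remaining, key=lambda p: _sq_dist(p, c))
        kept.append(nearest)
        remaining.remove(nearest)  # never pick the same sample twice
    return kept


# Hypothetical majority class: two well-separated groups of 2-D points.
majority = [(i * 0.01, 0.0) for i in range(20)] + [(10 + i * 0.01, 10.0) for i in range(20)]
reduced = cluster_undersample(majority, n_keep=2)
```

Setting `n_keep` to the size of the minority class would yield a balanced training set; a consensus variant would combine the partitions of several clustering algorithms before choosing representatives.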

A Comprehensive Survey on Identification and Analysis of Phishing Website based on Machine Learning Methods

2021 IEEE 11th IEEE Symposium on Computer Applications & Industrial Electronics (ISCAIE) ◽

10.1109/iscaie51753.2021.9431794 ◽

2021 ◽

Author(s):

Mohammed Hazim Alkawaz ◽

Stephanie Joanne Steven ◽

Asif Iqbal Hajamydeen ◽

Rusyaizila Ramli

Keyword(s):

Machine Learning ◽

Learning Methods ◽

Machine Learning Methods ◽

Comprehensive Survey

A Survey on Representation Learning for User Modeling

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/695 ◽

2020 ◽

Cited By ~ 1

Author(s):

Sheng Li ◽

Handong Zhao

Keyword(s):

Intelligent Systems ◽

User Modeling ◽

User Behavior ◽

Representation Learning ◽

Learning Methods ◽

The Past ◽

Research Problems ◽

Comprehensive Survey ◽

Movie Recommendation ◽

Latent Representations

Artificial intelligence systems are changing every aspect of our daily life. In the past decades, numerous approaches have been developed to characterize user behavior in order to deliver a personalized experience to users in scenarios like online shopping or movie recommendation. This paper presents a comprehensive survey of recent advances in user modeling from the perspective of representation learning. In particular, we formulate user modeling as a process of learning latent representations for users. We discuss both static and sequential representation learning methods for user modeling, and review representative approaches in each category, such as matrix factorization, deep collaborative filtering, and recurrent neural networks. Both shallow and deep learning methods are reviewed and discussed. Finally, we conclude this survey and discuss a number of open research problems that could inspire further research in this field.
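Matrix factorization, one of the static approaches named in this abstract, learns a latent vector per user and per item so that their dot product approximates an observed rating. The sketch below is a generic textbook version trained with stochastic gradient descent; the function names, hyperparameters, and toy ratings are all assumptions for illustration and do not come from the survey.

```python
import random


def factorize(ratings, n_users, n_items, dim=2, lr=0.02, reg=0.01,
              epochs=1000, seed=0):
    """Learn latent user vectors U and item vectors V from (user, item, rating) triples."""
    rng = random.Random(seed)
    U = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - predict(U, V, u, i)
            for k in range(dim):
                uk, vk = U[u][k], V[i][k]
                # Gradient step on squared error with L2 regularization.
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V


def predict(U, V, u, i):
    """Predicted rating: dot product of the user and item latent vectors."""
    return sum(a * b for a, b in zip(U[u], V[i]))


# Hypothetical toy data: (user index, item index, rating).
ratings = [(0, 0, 5), (0, 1, 1), (1, 0, 4), (1, 1, 1), (2, 1, 5)]
U, V = factorize(ratings, n_users=3, n_items=2)
```

The rows of `U` are the learned user representations; `predict(U, V, 2, 0)` would estimate a rating that user 2 never gave.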
