scholarly journals A comprehensive survey of imbalanced learning methods for bankruptcy prediction

2021 ◽  
Author(s):  
Tuong Le
Author(s):  
Georgios Kostopoulos ◽  
Stamatis Karlos ◽  
Sotiris Kotsiantis ◽  
Vassilis Tampakas

2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Saiful Islam ◽  
Aurpan Dash ◽  
Ashek Seum ◽  
Amir Hossain Raj ◽  
Tonmoy Hossain ◽  
...  

2013 ◽  
Vol 2013 ◽  
pp. 1-6 ◽  
Author(s):  
Yong Zhang ◽  
Dapeng Wang

In imbalanced learning methods, resampling methods modify an imbalanced dataset to form a balanced dataset. Balanced data sets perform better than imbalanced datasets for many base classifiers. This paper proposes a cost-sensitive ensemble method based on cost-sensitive support vector machine (SVM), and query-by-committee (QBC) to solve imbalanced data classification. The proposed method first divides the majority-class dataset into several subdatasets according to the proportion of imbalanced samples and trains subclassifiers using AdaBoost method. Then, the proposed method generates candidate training samples by QBC active learning method and uses cost-sensitive SVM to learn the training samples. By using 5 class-imbalanced datasets, experimental results show that the proposed method has higher area under ROC curve (AUC), F-measure, and G-mean than many existing class-imbalanced learning methods.


2022 ◽  
Vol 54 (8) ◽  
pp. 1-34
Author(s):  
Fuqiang Gu ◽  
Mu-Huan Chung ◽  
Mark Chignell ◽  
Shahrokh Valaee ◽  
Baoding Zhou ◽  
...  

Human activity recognition is a key to a lot of applications such as healthcare and smart home. In this study, we provide a comprehensive survey on recent advances and challenges in human activity recognition (HAR) with deep learning. Although there are many surveys on HAR, they focused mainly on the taxonomy of HAR and reviewed the state-of-the-art HAR systems implemented with conventional machine learning methods. Recently, several works have also been done on reviewing studies that use deep models for HAR, whereas these works cover few deep models and their variants. There is still a need for a comprehensive and in-depth survey on HAR with recently developed deep learning methods.


2019 ◽  
Vol 44 (2) ◽  
pp. 151-178 ◽  
Author(s):  
Mateusz Lango

Abstract Sentiment classification is an important task which gained extensive attention both in academia and in industry. Many issues related to this task such as handling of negation or of sarcastic utterances were analyzed and accordingly addressed in previous works. However, the issue of class imbalance which often compromises the prediction capabilities of learning algorithms was scarcely studied. In this work, we aim to bridge the gap between imbalanced learning and sentiment analysis. An experimental study including twelve imbalanced learning preprocessing methods, four feature representations, and a dozen of datasets, is carried out in order to analyze the usefulness of imbalanced learning methods for sentiment classification. Moreover, the data difficulty factors — commonly studied in imbalanced learning — are investigated on sentiment corpora to evaluate the impact of class imbalance.


Sensors ◽  
2020 ◽  
Vol 20 (22) ◽  
pp. 6699
Author(s):  
Fei Sun ◽  
Fang Fang ◽  
Run Wang ◽  
Bo Wan ◽  
Qinghua Guo ◽  
...  

Imbalanced learning is a common problem in remote sensing imagery-based land-use and land-cover classifications. Imbalanced learning can lead to a reduction in classification accuracy and even the omission of the minority class. In this paper, an impartial semi-supervised learning strategy based on extreme gradient boosting (ISS-XGB) is proposed to classify very high resolution (VHR) images with imbalanced data. ISS-XGB solves multi-class classification by using several semi-supervised classifiers. It first employs multi-group unlabeled data to eliminate the imbalance of training samples and then utilizes gradient boosting-based regression to simulate the target classes with positive and unlabeled samples. In this study, experiments were conducted on eight study areas with different imbalanced situations. The results showed that ISS-XGB provided a comparable but more stable performance than most commonly used classification approaches (i.e., random forest (RF), XGB, multilayer perceptron (MLP), and support vector machine (SVM)), positive and unlabeled learning (PU-Learning) methods (PU-BP and PU-SVM), and typical synthetic sample-based imbalanced learning methods. Especially under extremely imbalanced situations, ISS-XGB can provide high accuracy for the minority class without losing overall performance (the average overall accuracy achieves 85.92%). The proposed strategy has great potential in solving the imbalanced classification problems in remote sensing.


2019 ◽  
Vol 2019 ◽  
pp. 1-14 ◽  
Author(s):  
Aytuğ Onan

Class imbalance is an important problem, encountered in machine learning applications, where one class (named as, the minority class) has extremely small number of instances and the other class (referred as, the majority class) has immense quantity of instances. Imbalanced datasets can be of great importance in several real-world applications, including medical diagnosis, malware detection, anomaly identification, bankruptcy prediction, and spam filtering. In this paper, we present a consensus clustering based-undersampling approach to imbalanced learning. In this scheme, the number of instances in the majority class was undersampled by utilizing a consensus clustering-based scheme. In the empirical analysis, 44 small-scale and 2 large-scale imbalanced classification benchmarks have been utilized. In the consensus clustering schemes, five clustering algorithms (namely, k-means, k-modes, k-means++, self-organizing maps, and DIANA algorithm) and their combinations were taken into consideration. In the classification phase, five supervised learning methods (namely, naïve Bayes, logistic regression, support vector machines, random forests, and k-nearest neighbor algorithm) and three ensemble learner methods (namely, AdaBoost, bagging, and random subspace algorithm) were utilized. The empirical results indicate that the proposed heterogeneous consensus clustering-based undersampling scheme yields better predictive performance.


Author(s):  
Sheng Li ◽  
Handong Zhao

Artificial intelligent systems are changing every aspect of our daily life. In the past decades, numerous approaches have been developed to characterize user behavior, in order to deliver personalized experience to users in scenarios like online shopping or movie recommendation. This paper presents a comprehensive survey of recent advances in user modeling from the perspective of representation learning. In particular, we formulate user modeling as a process of learning latent representations for users. We discuss both the static and sequential representation learning methods for the purpose of user modeling, and review representative approaches in each category, such as matrix factorization, deep collaborative filtering, and recurrent neural networks. Both shallow and deep learning methods are reviewed and discussed. Finally, we conclude this survey and discuss a number of open research problems that would inspire further research in this field.


Sign in / Sign up

Export Citation Format

Share Document