Particle swarm optimization-deep belief network-based rare class prediction model for highly class imbalance problem

2017 ◽  
Vol 29 (11) ◽  
pp. e4128 ◽  
Author(s):  
Jae Kwon Kim ◽  
Young Shin Han ◽  
Jong Sik Lee
Author(s):  
Sayan Surya Shaw ◽  
Shameem Ahmed ◽  
Samir Malakar ◽  
Laura Garcia-Hernandez ◽  
Ajith Abraham ◽  
...  

Many real-life datasets are imbalanced: the number of samples in one class (the minority class) is far smaller than the number in the other class (the majority class). If such a dataset is fed directly to a standard classifier for training, the classifier often overlooks the minority class samples while estimating the class-separating hyperplane(s) and, as a result, misclassifies them. Over the years, researchers have followed different approaches to solve this problem. However, selecting truly representative samples from the majority class is still considered an open research problem. A better solution would be helpful in many applications, such as fraud detection, disease prediction and text classification. Recent studies also show that the problem requires analyzing not only the disproportion between classes but also other difficulties rooted in the nature of the data, and therefore calls for a more flexible, self-adaptable, computationally efficient and real-time method for selecting majority class samples without losing much of their important information. With this in mind, we propose a hybrid model combining Particle Swarm Optimization (PSO), a popular swarm intelligence-based meta-heuristic algorithm, and the Ring Theory (RT)-based Evolutionary Algorithm (RTEA), a recently proposed physics-based meta-heuristic algorithm. We call the algorithm RT-based PSO, or RTPSO for short. RTPSO can select the most representative samples from the majority class because it takes advantage of the efficient exploration and exploitation phases of its parent algorithms to strengthen the search process. We use an AdaBoost classifier to obtain the final classification results of our model. The effectiveness of the proposed method has been evaluated on 15 standard real-life datasets with low to extreme imbalance ratios. The performance of RTPSO has been compared with PSO, RTEA and other standard undersampling methods. The results demonstrate the superiority of RTPSO over the state-of-the-art class imbalance problem-solvers considered here for comparison. The source code of this work is available at https://github.com/Sayansurya/RTPSO_Class_imbalance.
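
The abstract above describes selecting a subset of majority class samples. One common way to represent a candidate subset in a swarm-based search is a bit mask over the majority samples. The sketch below illustrates only that encoding and the resulting imbalance ratio; it is not the RTPSO algorithm itself, and the names `imbalance_ratio` and `apply_mask`, the 0/1 label convention, and the toy data are all assumptions made for illustration. The authors' actual implementation is in the repository linked above.

```python
def imbalance_ratio(labels, minority=1):
    """Return (majority count) / (minority count) for a two-class label list."""
    n_min = sum(1 for y in labels if y == minority)
    n_maj = len(labels) - n_min
    return n_maj / n_min


def apply_mask(X, y, mask, majority=0):
    """Decode a candidate solution into an undersampled training set.

    Every minority sample is kept. The k-th majority sample (in order of
    appearance) is kept only if mask[k] == 1. The mask is a hypothetical
    particle position, one bit per majority sample.
    """
    X_out, y_out = [], []
    k = 0  # index into the mask, advanced once per majority sample
    for xi, yi in zip(X, y):
        if yi == majority:
            keep = mask[k] == 1
            k += 1
        else:
            keep = True
        if keep:
            X_out.append(xi)
            y_out.append(yi)
    if k != len(mask):
        raise ValueError("mask length must equal the number of majority samples")
    return X_out, y_out


# Toy data: 8 majority samples (label 0) and 2 minority samples (label 1).
X = list(range(10))
y = [0] * 8 + [1] * 2
mask = [1, 0, 0, 1, 0, 0, 0, 0]  # keep majority samples 0 and 3 only

X_sub, y_sub = apply_mask(X, y, mask)
ratio_before = imbalance_ratio(y)
ratio_after = imbalance_ratio(y_sub)
```

In a full search, each mask would be scored by training a classifier (AdaBoost in the abstract) on the decoded subset and measuring its performance on held-out data; that scoring step is omitted here.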


2012 ◽  
Vol 9 (1) ◽  
Author(s):  
Rok Blagus ◽  
Lara Lusa

The goal of multi-class supervised classification is to develop a rule that accurately predicts the class membership of new samples when the number of classes is larger than two. In this paper we consider high-dimensional class-imbalanced data: the number of variables greatly exceeds the number of samples, and the number of samples in each class is not equal. We focus on Friedman's one-versus-one approach for three-class problems and show how its class probabilities depend on the class probabilities from the binary classification sub-problems. We further explore its performance using diagonal linear discriminant analysis (DLDA) as a base classifier and compare it with multi-class DLDA, using simulated and real data. Our results show that class imbalance has a significant effect on the classification results: the classification is biased towards the majority class, as in two-class problems, and the problem is magnified when the number of variables is large. The amount of bias also depends jointly on the magnitude of the differences between the classes and on the sample size: the bias diminishes when the difference between the classes is larger or the sample size is increased. Variable selection also plays an important role in the class-imbalance problem, and the most effective strategy depends on the type of differences that exist between classes. DLDA seems to be among the classifiers least sensitive to class imbalance, and its use is also recommended for multi-class problems. Whenever possible, experiments should be planned using balanced data in order to avoid the class-imbalance problem.
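
To make the one-versus-one idea concrete, the following minimal sketch combines the outputs of the three binary sub-problems of a three-class task with a max-wins vote. It is an illustration, not the derivation from the paper: the function name `one_vs_one_predict`, the input format, and the tie-breaking rule (summed pairwise probability) are assumptions, and the pairwise probabilities are made-up numbers rather than outputs of a fitted DLDA classifier.

```python
from itertools import combinations


def one_vs_one_predict(pairwise_prob, classes):
    """Max-wins vote over all class pairs.

    pairwise_prob[(i, j)] is the estimated probability that the sample
    belongs to class i, given that it belongs to class i or class j,
    for each pair (i, j) taken in the order of `classes`.
    """
    votes = {c: 0 for c in classes}
    support = {c: 0.0 for c in classes}
    for i, j in combinations(classes, 2):
        p = pairwise_prob[(i, j)]
        winner = i if p >= 0.5 else j
        votes[winner] += 1
        # Accumulated pairwise probability, used only to break vote ties
        # (an assumption of this sketch).
        support[i] += p
        support[j] += 1.0 - p
    predicted = max(classes, key=lambda c: (votes[c], support[c]))
    return predicted, votes


classes = ["A", "B", "C"]

# Clear winner: C beats both A and B in its binary sub-problems.
clear = {("A", "B"): 0.7, ("A", "C"): 0.4, ("B", "C"): 0.2}
pred_clear, votes_clear = one_vs_one_predict(clear, classes)

# Cyclic case: A beats B, C beats A, B beats C, so every class gets one vote
# and the tie-break decides.
cyclic = {("A", "B"): 0.9, ("A", "C"): 0.4, ("B", "C"): 0.6}
pred_cyclic, votes_cyclic = one_vs_one_predict(cyclic, classes)
```

The cyclic case shows why the final class assignment depends on the binary probabilities and not only on the vote counts: with three classes, a three-way tie is possible whenever the pairwise decisions form a cycle.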


2020 ◽  
Vol 2020 ◽  
pp. 1-10
Author(s):  
Jianjian Yang ◽  
Boshen Chang ◽  
Xiaolin Wang ◽  
Qiang Zhang ◽  
Chao Wang ◽  
...  

Traditional shallow networks recognize data with deep fault attributes poorly under semisupervised and weakly labeled conditions, so a deep belief network (DBN) was proposed for deep fault detection. Because the network structure and training parameters of a DBN are difficult to select, a stochastic adaptive particle swarm optimization (RSAPSO) algorithm was proposed in this study to optimize the DBN. The method introduces a stochastic criterion that makes particles jump out of their current search position with a certain probability, reducing the probability of falling into a local optimum. The RSAPSO-DBN method used sample data to train the DBN and used the final diagnostic error rate to construct the fitness function of the particle swarm algorithm. The minimum fitness values of the particles were compared to rank the candidate models, and the particle with the lowest fitness value was selected. The optimal DBN classifier for fault diagnosis was then generated from the corresponding number of network nodes, learning rate, and momentum parameters. Finally, the validity of the method was verified using bearing data from Case Western Reserve University in the United States and data collected in the laboratory. Compared with a BP neural network, a support vector machine, and heterogeneous particle swarm optimization DBN methods, the proposed method achieved the highest recognition rates, 87.75% and 93.75%. This indicates that the proposed method is broadly applicable to fault diagnosis and provides new ideas for identifying data with different fault depth attributes.
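
The search loop described above can be sketched as a standard particle swarm with an extra random-jump step. This is a minimal illustration under stated assumptions, not the RSAPSO algorithm from the paper: the jump is implemented here as a uniform resample of the particle's position, the inertia and acceleration coefficients are arbitrary, and `surrogate_error` is a hypothetical stand-in for the diagnostic error rate of a trained DBN, which is far too costly to train in a short example.

```python
import random


def pso_with_jump(fitness, bounds, n_particles=12, n_iter=40,
                  w=0.7, c1=1.5, c2=1.5, jump_prob=0.1, seed=0):
    """Minimize `fitness` over box `bounds` with PSO plus random jumps."""
    rng = random.Random(seed)
    dim = len(bounds)

    def sample():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    pos = [sample() for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]  # best fitness found so far, per iteration

    for _ in range(n_iter):
        for i in range(n_particles):
            if rng.random() < jump_prob:
                # Stochastic jump (assumed form): restart at a random point.
                pos[i] = sample()
                vel[i] = [0.0] * dim
            else:
                for d, (lo, hi) in enumerate(bounds):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (w * vel[i][d]
                                 + c1 * r1 * (pbest[i][d] - pos[i][d])
                                 + c2 * r2 * (gbest[d] - pos[i][d]))
                    # Clamp the new position to the search box.
                    pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        history.append(gbest_fit)
    return gbest, gbest_fit, history


def surrogate_error(params):
    """Hypothetical error surface over (hidden nodes, learning rate, momentum)."""
    nodes, lr, momentum = params
    return ((nodes - 100) / 100) ** 2 + (lr - 0.05) ** 2 + (momentum - 0.9) ** 2


# Search ranges are illustrative assumptions, not values from the paper.
BOUNDS = [(10, 300), (0.001, 0.5), (0.5, 0.99)]
best, best_fit, history = pso_with_jump(surrogate_error, BOUNDS)
```

In practice the fitness call would train a DBN with the candidate node count, learning rate, and momentum and return its error rate on validation data, so each evaluation is expensive and the swarm size and iteration count would be chosen accordingly.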

