An Efficient Cost-Sensitive Feature Selection Using Chaos Genetic Algorithm for Class Imbalance Problem

Mathematical Problems in Engineering ◽ 10.1155/2016/8752181 ◽ 2016 ◽ Vol 2016 ◽ pp. 1-9 ◽ Cited By ~ 7

Author(s): Jing Bian ◽ Xin-guang Peng ◽ Ying Wang ◽ Hai Zhang

Keyword(s): Genetic Algorithm ◽ Feature Selection ◽ Network Security ◽ Classification Accuracy ◽ Large Scale ◽ Class Imbalance ◽ Evaluation Function ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Chaos Genetic Algorithm

In the era of big data, feature selection is an essential step in machine learning. Although the class imbalance problem has recently attracted a great deal of attention, little effort has gone into developing feature selection techniques for it. In addition, most applications of feature selection focus on classification accuracy and ignore cost, even though costs are important. To cope with imbalance problems, we developed a cost-sensitive feature selection algorithm, referred to as CSFSG, that adds a cost-based evaluation function to a filter feature selection method using a chaos genetic algorithm. The evaluation function considers both feature-acquisition costs (test costs) and misclassification costs in the field of network security, thereby weakening the influence of the many instances from the majority classes in large-scale datasets. The CSFSG algorithm reduces the total cost of feature selection and trades off the two cost factors. The behavior of the CSFSG algorithm is tested on a large-scale network security dataset using two kinds of classifiers: C4.5 and k-nearest neighbor (KNN). The experimental results show that the approach is efficient, improves classification accuracy, and decreases classification time. In addition, the results of our method are more promising than those of other cost-sensitive feature selection algorithms.
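The abstract above does not give the CSFSG evaluation function itself. As a rough illustration of what "total cost" can mean when test costs and misclassification costs are combined, the sketch below adds the acquisition cost of the selected features to the average misclassification cost per instance. The function name `total_cost`, the equal weighting of the two terms, and all numbers are assumptions made for illustration; none of them come from the paper.

```python
import numpy as np

def total_cost(selected, test_costs, confusion, cost_matrix):
    """Hypothetical per-instance total cost of a feature subset.

    selected    : boolean mask over features (True = feature is acquired)
    test_costs  : cost of acquiring each feature for one instance
    confusion   : confusion[i, j] = number of class-i instances predicted as j
    cost_matrix : cost_matrix[i, j] = cost of predicting j when the truth is i
    """
    selected = np.asarray(selected, dtype=bool)
    confusion = np.asarray(confusion, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)

    # Test cost: every instance pays for each selected feature.
    test = float(np.sum(np.asarray(test_costs, dtype=float)[selected]))
    # Misclassification cost: cost-weighted errors averaged over all instances.
    misclass = float(np.sum(confusion * cost_matrix) / confusion.sum())
    return test + misclass

# Illustrative values only: three features, two classes, class 1 is the
# minority class and missing it is ten times as costly as a false alarm.
cost = total_cost(
    selected=[True, False, True],
    test_costs=[1.0, 2.0, 0.5],
    confusion=[[90, 10], [5, 5]],
    cost_matrix=[[0, 1], [10, 0]],
)
print(round(cost, 4))
```

A search procedure such as a genetic algorithm could use a quantity of this kind as the fitness to minimize, so that a subset is penalized both for expensive features and for costly errors on the minority class.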

Cost-Sensitive Feature Selection for Class Imbalance Problem

Information Systems Architecture and Technology: Proceedings of 38th International Conference on Information Systems Architecture and Technology – ISAT 2017 - Advances in Intelligent Systems and Computing ◽ 10.1007/978-3-319-67220-5_17 ◽ 2017 ◽ pp. 182-194 ◽ Cited By ~ 3

Author(s): Małgorzata Bach ◽ Aleksandra Werner

Keyword(s): Feature Selection ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Selection For

Feature Selection Techniques to Counter Class Imbalance Problem for Aging Related Bug Prediction

Proceedings of the 11th Innovations in Software Engineering Conference on - ISEC '18 ◽ 10.1145/3172871.3172872 ◽ 2018 ◽ Cited By ~ 1

Author(s): Lov Kumar ◽ Ashish Sureka

Keyword(s): Feature Selection ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Feature Selection Techniques

A Cost-Sensitive Sparse Representation Based Classification for Class-Imbalance Problem

Scientific Programming ◽ 10.1155/2016/8035089 ◽ 2016 ◽ Vol 2016 ◽ pp. 1-9

Author(s): Zhenbing Liu ◽ Chunyang Gao ◽ Huihua Yang ◽ Qijia He

Keyword(s): Sparse Representation ◽ Classification Accuracy ◽ Class Imbalance ◽ Misclassification Rate ◽ Data Sets ◽ Class Imbalance Problem ◽ Misclassification Cost ◽ Practical Applications ◽ Imbalance Problem ◽ Positive Class

Sparse representation has been successfully used in pattern recognition and machine learning. However, most existing sparse representation based classification (SRC) methods aim to achieve the highest classification accuracy, assuming the same loss for every type of misclassification. This assumption may not hold in many practical applications, as different types of misclassification can lead to different losses. Moreover, in real-world applications many data sets have an imbalanced class distribution. To address these problems, we propose a cost-sensitive sparse representation based classification (CSSRC) method for the class-imbalance problem using probabilistic modeling. Unlike traditional SRC methods, we predict the class label of test samples by minimizing the misclassification losses, which are obtained by computing the posterior probabilities. Experimental results on the UCI databases validate the efficacy of the proposed approach in terms of average misclassification cost, positive class misclassification rate, and negative class misclassification rate. In addition, we sampled test samples and training samples with different imbalance ratios and used F-measure, G-mean, classification accuracy, and running time to evaluate the performance of the proposed method. The experiments show that our proposed method performs competitively compared to SRC, CSSVM, and CS4VM.
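The decision rule described in this abstract, predicting the label that minimizes the misclassification loss computed from posterior probabilities, matches the standard minimum expected cost rule. The sketch below shows only that rule. It takes the posteriors as given and does not reproduce how CSSRC derives them from the sparse representation; the function name `min_expected_cost_label` and the cost values are illustrative assumptions.

```python
import numpy as np

def min_expected_cost_label(posteriors, cost_matrix):
    """Pick, for each sample, the label with the lowest expected cost.

    posteriors  : posteriors[n, i] = P(class i | sample n), rows sum to 1
    cost_matrix : cost_matrix[i, j] = cost of predicting j when the truth is i
    """
    posteriors = np.asarray(posteriors, dtype=float)
    cost_matrix = np.asarray(cost_matrix, dtype=float)
    # expected[n, j] = sum_i P(i | x_n) * cost_matrix[i, j]
    expected = posteriors @ cost_matrix
    return expected.argmin(axis=1)

# Illustrative costs (assumed): missing the positive class (1) costs 5,
# a false alarm costs 1, correct predictions cost nothing.
costs = [[0, 1],
         [5, 0]]
posteriors = [[0.8, 0.2],   # expected costs: predict 0 -> 1.0, predict 1 -> 0.8
              [0.9, 0.1]]   # expected costs: predict 0 -> 0.5, predict 1 -> 0.9
print(min_expected_cost_label(posteriors, costs))  # [1 0]
```

The first sample is assigned to the positive class even though its posterior is only 0.2, because an error on that class is five times as costly. A plain arg-max over the posteriors would label both samples as class 0.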

Feature Selection and the Class Imbalance Problem in Predicting Protein Function from Sequence

Applied Bioinformatics ◽ 10.2165/00822942-200504030-00004 ◽ 2005 ◽ Vol 4 (3) ◽ pp. 195-203 ◽ Cited By ~ 66

Author(s): Ali Al-Shahib ◽ Rainer Breitling ◽ David Gilbert

Keyword(s): Feature Selection ◽ Protein Function ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Imbalance Problem

Oversample Based Large Scale Support Vector Machine for Online Class Imbalance Problem

Big Data Analytics - Lecture Notes in Computer Science ◽ 10.1007/978-3-030-04780-1_24 ◽ 2018 ◽ pp. 348-362

Author(s): D. Himaja ◽ T. Maruthi Padmaja ◽ P. Radha Krishna

Keyword(s): Support Vector Machine ◽ Large Scale ◽ Class Imbalance ◽ Support Vector ◽ Class Imbalance Problem ◽ Online Class ◽ Imbalance Problem

Combining Synthetic Minority Oversampling Technique and Subset Feature Selection Technique For Class Imbalance Problem

Proceedings of the International Conference on Advances in Information Communication Technology & Computing - AICTC '16 ◽ 10.1145/2979779.2979804 ◽ 2016

Author(s): Pawan Lachheta ◽ Seema Bawa

Keyword(s): Feature Selection ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Feature Selection Technique ◽ Selection Technique ◽ Imbalance Problem

Addressing class imbalance problem in medical diagnosis: A genetic algorithm approach

2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC) ◽ 10.1109/icomicon.2017.8279150 ◽ 2017 ◽ Cited By ~ 4

Author(s): Anju Jain ◽ Saroj Ratnoo ◽ Dinesh Kumar

Keyword(s): Genetic Algorithm ◽ Medical Diagnosis ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Genetic Algorithm Approach

Combining feature selection and hybrid approach redefinition in handling class imbalance and overlapping for multi-class imbalanced

Indonesian Journal of Electrical Engineering and Computer Science ◽ 10.11591/ijeecs.v21.i3.pp1513-1522 ◽ 2021 ◽ Vol 21 (3) ◽ pp. 1513

Author(s): Hartono Hartono ◽ Erianto Ongko ◽ Yeni Risyani

Keyword(s): Feature Selection ◽ Learning Algorithm ◽ Hybrid Approach ◽ Feature Selection Method ◽ Class Imbalance ◽ Poor Performance ◽ Class Imbalance Problem ◽ Imbalance Problem ◽ Classifier Performance ◽ F Measure

In classification tasks with class imbalance, the uneven distribution of instances causes poor performance, and overlapping between classes degrades performance further. This paper proposes a method that combines feature selection and the hybrid approach redefinition (HAR) method to handle class imbalance and overlapping in multi-class imbalanced data. HAR is a hybrid ensemble method for handling the class imbalance problem. The main contribution of this work is a new method that can overcome both class imbalance and overlapping in the multi-class imbalance problem. The method should give better results in terms of classifier performance and overlap degree in multi-class problems. This is achieved by improving an ensemble learning algorithm and a preprocessing technique in HAR using minimizing overlapping selection under SMOTE (MOSS). MOSS is known as a very popular feature selection method for handling overlapping. To validate the accuracy of the proposed method, this research uses augmented R-Value, Mean AUC, Mean F-Measure, Mean G-Mean, and Mean Precision. The performance of the model is evaluated against the hybrid method (MBP+CGE), a popular method for handling class imbalance and overlapping in multi-class imbalanced data. The proposed method is found to be superior in classifier performance, as indicated by better Mean AUC, F-Measure, G-Mean, and precision.
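The abstract does not define how its F-Measure and G-Mean are computed for multi-class data. One common reading, sketched below as an assumption, takes the F-measure as the unweighted mean of the per-class F1 scores and the G-mean as the geometric mean of the per-class recalls, both derived from a confusion matrix. The function name `macro_f_and_gmean` and the example matrix are illustrative only.

```python
import numpy as np

def macro_f_and_gmean(confusion):
    """Macro-averaged F1 and geometric mean of per-class recall.

    confusion[i, j] = number of class-i instances predicted as class j.
    Classes with no true or no predicted instances get a score of 0.
    """
    cm = np.asarray(confusion, dtype=float)
    tp = np.diag(cm)
    support = cm.sum(axis=1)      # true instances per class
    predicted = cm.sum(axis=0)    # predicted instances per class

    recall = np.divide(tp, support, out=np.zeros_like(tp), where=support > 0)
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    denom = precision + recall
    f1 = np.divide(2 * precision * recall, denom,
                   out=np.zeros_like(tp), where=denom > 0)

    gmean = float(np.prod(recall) ** (1.0 / len(recall)))
    return float(f1.mean()), gmean

# Illustrative imbalanced two-class result: 10 majority vs 2 minority instances.
macro_f, gmean = macro_f_and_gmean([[8, 2],
                                    [1, 1]])
print(round(macro_f, 4), round(gmean, 4))
```

Both scores give every class equal weight, which is why they are preferred over plain accuracy on imbalanced data: in the example the accuracy is 0.75, while the minority-class recall of 0.5 pulls both scores down.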

Feature Selection Method from Multiclass Text with Class Imbalance Problem

Journal of the Korean Institute of Industrial Engineers ◽ 10.7232/jkiie.2019.45.2.093 ◽ 2019 ◽ Vol 45 (2) ◽ pp. 93-100

Author(s): Minji Seo ◽ Gilseung Ahn ◽ Sun Hur

Keyword(s): Feature Selection ◽ Feature Selection Method ◽ Class Imbalance ◽ Selection Method ◽ Class Imbalance Problem ◽ Imbalance Problem

Improving Fashion Style Classification Accuracy using VAE in Class Imbalance Problem

The Journal of Korean Institute of Information Technology ◽ 10.14801/jkiit.2021.19.2.1 ◽ 2021 ◽ Vol 19 (2) ◽ pp. 1-10

Author(s): Jonghyuk Park

Keyword(s): Classification Accuracy ◽ Class Imbalance ◽ Class Imbalance Problem ◽ Imbalance Problem
