Class Imbalance Reduction (CIR): A Novel Approach to Software Defect Prediction in the Presence of Class Imbalance

Symmetry ◽  
2020 ◽  
Vol 12 (3) ◽  
pp. 407 ◽  
Author(s):  
Kiran Kumar Bejjanki ◽  
Jayadev Gyani ◽  
Narsimha Gugulothu

Software defect prediction (SDP) is a technique for predicting the occurrence of defects in the early stages of the software development process. Early prediction of defects reduces the overall cost of software and increases its reliability. Most of the defect prediction methods proposed in the literature suffer from the class imbalance problem. In this paper, a novel class imbalance reduction (CIR) algorithm is proposed that creates symmetry between the defect and non-defect records in imbalanced datasets by considering the distribution properties of the datasets. CIR is compared with SMOTE (synthetic minority oversampling technique), which is built into many machine learning tools and is considered a benchmark for handling class imbalance problems, and with K-Means SMOTE. We conducted the experiment on forty open source software defect datasets from the PRedictOr Models In Software Engineering (PROMISE) repository using eight different classifiers and evaluated the results with six performance measures. The results show that the proposed CIR method performs better than SMOTE and K-Means SMOTE.
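The abstract does not describe how CIR works internally, so the following is only a minimal sketch of the SMOTE baseline it is compared against: each synthetic minority record is placed on the line segment between a minority sample and one of its nearest minority neighbours. The function name, the toy feature values, and the choice of `k` are illustrative assumptions, not taken from the paper.

```python
import random

def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours, excluding the base itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sq_dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic

# Toy data (hypothetical): 3 defective modules versus 8 clean ones,
# each described by two made-up code metrics.
defective = [[10.0, 3.0], [12.0, 4.0], [11.0, 5.0]]
clean_count = 8
new_points = smote_like_oversample(defective, clean_count - len(defective))
balanced_defective = defective + new_points
```

After oversampling, the defective class has as many records as the clean class. Production implementations (for example in imbalanced-learn) add options this sketch omits, such as sampling strategies and handling of categorical features.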

IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 24184-24195 ◽  
Author(s):  
Shamsul Huda ◽  
Kevin Liu ◽  
Mohamed Abdelrazek ◽  
Amani Ibrahim ◽  
Sultan Alyahya ◽  
...  

Author(s):  
R. Srivastava ◽  
Aman Kumar Jain

Objective: Defects in delivered software products not only have financial implications but also damage the reputation of the organisation and waste time and human resources. This paper aims to detect defects in software modules. Methods: Our approach sequentially combines the SMOTE algorithm to deal with the class imbalance problem, the K-means clustering algorithm to obtain a set of key features based on inter-class and intra-class coefficients of correlation, and ensemble modelling to predict defects in software modules. After careful examination, an ensemble framework of XGBoost, Decision Tree and Random Forest is used for prediction of software defects, owing to the numerous merits of the ensembling approach. Results: We used five open-source datasets from the NASA PROMISE Repository for Software Engineering. The results obtained from our approach are compared with those of the individual algorithms used in the ensemble. A confidence interval for the accuracy of our approach with respect to the performance evaluation metrics, namely Accuracy, Precision, Recall, F1 score and AUC score, has also been constructed at a significance level of 0.01. Conclusion: Results are depicted pictorially.
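The abstract does not say how the confidence interval was constructed. As one plausible reading, the sketch below builds a two-sided interval at significance level 0.01 (that is, a 99% interval) from repeated scores of a single metric using a normal approximation. The function name and the fold scores are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def normal_confidence_interval(scores, alpha=0.01):
    """Two-sided (1 - alpha) confidence interval for the mean of `scores`,
    using the normal approximation: mean +/- z * s / sqrt(n)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two scores")
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 2.576 for alpha = 0.01
    centre = mean(scores)
    half_width = z * stdev(scores) / sqrt(n)
    return centre - half_width, centre + half_width

# Hypothetical accuracy values from five evaluation folds.
fold_accuracy = [0.91, 0.89, 0.93, 0.90, 0.92]
low, high = normal_confidence_interval(fold_accuracy)
```

With only a handful of folds, a Student's t interval would be wider and usually more appropriate; the normal approximation is used here only to keep the example within the standard library.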


2021 ◽  
Vol 11 (5) ◽  
pp. 2002
Author(s):  
Jonggu Kang ◽  
Sunjae Kwon ◽  
Duksan Ryu ◽  
Jongmoon Baik

Software plays the most important role in recent vehicle innovations, and consequently the amount of software has grown rapidly in recent decades. The safety-critical nature of ships, one kind of vehicle, makes software quality assurance (SQA) a fundamental prerequisite. Just-in-time software defect prediction (JIT-SDP) aims to conduct software defect prediction (SDP) on commit-level code changes to achieve effective SQA resource allocation. The first case study of SDP in the maritime domain reported feasible prediction performance. However, we consider that the prediction model still has room for improvement, since the parameters of the model have not yet been optimized. Harmony search (HS) is a widely used music-inspired meta-heuristic optimization algorithm. In this article, we demonstrate that JIT-SDP can achieve better prediction performance by applying HS-based parameter optimization with a balanced fitness value. Using two real-world datasets from a maritime software project, we obtained an optimized model that exceeds the baseline performance of the previous case study across various defect-to-non-defect class imbalance ratios of the datasets. Experiments with open source software also showed better recall for all datasets, even though we used balance as the performance index. HS-based parameter-optimized JIT-SDP can be applied to maritime domain software with a high class imbalance ratio. Finally, we expect that our research can be extended to improve the performance of JIT-SDP not only in maritime domain software but also in open source software.
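A minimal sketch of harmony search maximising a balance-style fitness is shown below. It assumes the definition of balance commonly used in defect prediction (one minus the normalised distance from the ideal point with probability of detection 1 and probability of false alarm 0); the abstract does not spell out the paper's exact fitness. The `toy_fitness` function is a hypothetical stand-in for training a JIT-SDP model and measuring it, and all parameter values are illustrative.

```python
import math
import random

def balance(pd, pf):
    """1 minus the normalised distance from the ideal point (PD=1, PF=0)."""
    return 1 - math.sqrt((0 - pf) ** 2 + (1 - pd) ** 2) / math.sqrt(2)

def harmony_search(fitness, bounds, hms=10, hmcr=0.9, par=0.3, bw=0.05,
                   iterations=500, seed=0):
    """Maximise `fitness` over box `bounds` with a basic harmony search.
    hms: harmony memory size, hmcr: memory considering rate,
    par: pitch adjusting rate, bw: pitch bandwidth (fraction of range)."""
    rng = random.Random(seed)
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(hms)]
    scores = [fitness(h) for h in memory]
    for _ in range(iterations):
        new = []
        for d, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Reuse a value from memory, optionally nudging it slightly.
                value = memory[rng.randrange(hms)][d]
                if rng.random() < par:
                    value += rng.uniform(-1, 1) * bw * (hi - lo)
            else:
                value = rng.uniform(lo, hi)  # fresh random value
            new.append(min(hi, max(lo, value)))
        new_score = fitness(new)
        worst = min(range(hms), key=scores.__getitem__)
        if new_score > scores[worst]:
            memory[worst], scores[worst] = new, new_score
    best = max(range(hms), key=scores.__getitem__)
    return memory[best], scores[best]

def toy_fitness(params):
    """Hypothetical model: a single threshold trades recall for false alarms."""
    (threshold,) = params
    pd = 1 - threshold ** 2    # detection falls as the threshold rises
    pf = (1 - threshold) ** 2  # false alarms fall as the threshold rises
    return balance(pd, pf)

best_params, best_score = harmony_search(toy_fitness, [(0.0, 1.0)])
```

In a real setting, `fitness` would train the prediction model with the candidate parameters and return the balance measured on validation data, which makes each iteration far more expensive than in this toy example.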

