The Effect of Dimensionality Reduction on Software Vulnerability Prediction Models

Any software vulnerability is a security threat that can help hackers gain unauthorized access to resources. Vulnerability prediction models help software engineers allocate their resources effectively to find vulnerable classes before the software is delivered to customers. Vulnerable classes must be carefully reviewed by security experts and tested to identify potential threats that may arise in the future. In the present work, a novel technique based on the Grey wolf algorithm and Random forest is proposed for software vulnerability prediction. The Grey wolf technique is a metaheuristic used to select the best subset of features. The proposed technique is compared with other machine learning techniques. Experiments were performed on three publicly available datasets. The proposed technique (GW-RF) outperformed all the other techniques in software vulnerability prediction.
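The feature-selection idea in this abstract can be illustrated with a minimal sketch. It is not the authors' GW-RF implementation: the function names, the 0.5 threshold used to turn a wolf's position into a feature mask, and the toy fitness function are all assumptions made for illustration. In a real setting the fitness callback would presumably return a cross-validated Random forest score for the selected features.

```python
import random


def gwo_select(fitness, n_features, n_wolves=8, n_iter=30, seed=0):
    """Grey-wolf-style search over feature masks (higher fitness is better).

    Returns (best_mask, best_score) over every mask that was evaluated.
    """
    if n_wolves < 3:
        raise ValueError("need at least 3 wolves for alpha, beta and delta")
    rng = random.Random(seed)

    def to_mask(position):
        # Assumption: a feature is kept when its coordinate exceeds 0.5.
        return [x > 0.5 for x in position]

    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]
    best_mask, best_score = None, float("-inf")

    for it in range(n_iter + 1):
        # Score the pack and remember the best mask seen so far.
        scored = sorted(((fitness(to_mask(w)), w) for w in wolves),
                        key=lambda pair: pair[0], reverse=True)
        if scored[0][0] > best_score:
            best_score, best_mask = scored[0][0], to_mask(scored[0][1])
        if it == n_iter:
            break
        # The three best wolves (alpha, beta, delta) guide the rest of the pack.
        leaders = [list(w) for _, w in scored[:3]]
        a = 2.0 * (1 - it / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for j in range(n_features):
                pulled = 0.0
                for lead in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    pulled += lead[j] - A * abs(C * lead[j] - w[j])
                # Average the three pulls and keep the coordinate in [0, 1].
                w[j] = min(1.0, max(0.0, pulled / 3))
    return best_mask, best_score


def toy_fitness(mask):
    # Hypothetical stand-in for a classifier score: features 0 and 2 are
    # "informative", and every selected feature costs a small penalty.
    informative = {0, 2}
    hits = sum(1 for j, keep in enumerate(mask) if keep and j in informative)
    return hits - 0.1 * sum(mask)


mask, score = gwo_select(toy_fitness, n_features=6)
print(mask, score)
```

Passing the scoring step in as a callback keeps the search independent of any particular classifier, so the toy fitness can be swapped for a real model evaluation without touching the optimizer.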


Software Vulnerability Prediction Models Based on Complex Network

DEStech Transactions on Computer Science and Engineering ◽

10.12783/dtcse/cimns2017/17397 ◽

2018 ◽

Author(s):

XIAO-LIN ZHAO ◽

QUAN-BAO CHEN ◽

JIA-TONG GAO ◽

XIAN-HUA ZHANG ◽

JIAN-YANG DING

Keyword(s):

Complex Network ◽

Prediction Models ◽

Software Vulnerability ◽

Vulnerability Prediction


The impact of feature types, classifiers, and data balancing techniques on software vulnerability prediction models

Journal of Software Evolution and Process ◽

10.1002/smr.2164 ◽

2019 ◽

Vol 31 (9) ◽

Author(s):

Aydin Kaya ◽

Ali Seydi Keceli ◽

Cagatay Catal ◽

Bedir Tekinerdogan

Keyword(s):

Prediction Models ◽

Software Vulnerability ◽

Vulnerability Prediction ◽

The Impact


The effect of Bellwether analysis on software vulnerability severity prediction models

Software Quality Journal ◽

10.1007/s11219-019-09490-1 ◽

2020 ◽

Vol 28 (4) ◽

pp. 1413-1446 ◽

Cited By ~ 3

Author(s):

Patrick Kwaku Kudjo ◽

Jinfu Chen ◽

Solomon Mensah ◽

Richard Amankwah ◽

Christopher Kudjo

Keyword(s):

Prediction Models ◽

Software Vulnerability ◽

Severity Prediction


Machine Learning Dimensionality Reduction Showed Marginal Performance Benefit Over Stepwise Regression for Risk Stratification of Chest Pain Patients in the Emergency Department

10.21203/rs.3.rs-43703/v1 ◽

2020 ◽

Author(s):

Nan Liu ◽

Marcel Lucas Chee ◽

Zhi Xiong Koh ◽

Su Li Leow ◽

Andrew Fu Wah Ho ◽

...

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Chest Pain ◽

Risk Stratification ◽

Dimensionality Reduction ◽

Prediction Models ◽

Stepwise Logistic Regression ◽

Stepwise Variable Selection ◽

Reduction Methods ◽

Pain Patients

Abstract

Background: Chest pain is among the most common presenting complaints in the emergency department (ED). Swift and accurate risk stratification of chest pain patients in the ED may improve patient outcomes and reduce unnecessary costs. Traditional logistic regression with stepwise variable selection has been used to build risk prediction models for ED chest pain patients. In this study, we aimed to investigate whether machine learning dimensionality reduction methods can achieve better performance than the stepwise approach in deriving risk stratification models.

Methods: A retrospective analysis was conducted on the data of patients >20 years old who presented to the ED of Singapore General Hospital with chest pain between September 2010 and July 2015. Variables used included demographics, medical history, laboratory findings, heart rate variability (HRV), and HRnV parameters calculated from five- to six-minute electrocardiograms (ECGs). The primary outcome was 30-day major adverse cardiac events (MACE), which included death, acute myocardial infarction, and revascularization. Candidate variables identified using univariable analysis were then used to generate the stepwise logistic regression model and eight machine learning dimensionality reduction prediction models. A separate set of models was derived by excluding troponin. Receiver operating characteristic (ROC) and calibration analyses were used to compare model performance.

Results: A total of 795 patients were included in the analysis, of whom 247 (31%) met the primary outcome of 30-day MACE. Patients with MACE were older and more likely to be male. All eight dimensionality reduction methods marginally but non-significantly outperformed stepwise variable selection; the multidimensional scaling algorithm performed the best, with an area under the curve (AUC) of 0.901. All HRnV-based models generated in this study outperformed several existing clinical scores in ROC analysis.

Conclusions: HRnV-based models using stepwise logistic regression performed better than existing chest pain scores for predicting MACE, with only marginal improvements from machine learning dimensionality reduction. Moreover, the traditional stepwise approach benefits from model transparency and interpretability; in comparison, machine learning dimensionality reduction models are black boxes, making them difficult to explain in clinical practice.
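The AUC used to compare the models above has a simple pairwise reading: it is the probability that a randomly chosen patient with the outcome receives a higher risk score than a randomly chosen patient without it, with ties counted as half. The sketch below computes it that way. The function name and the four-patient example are hypothetical and are not taken from the study's data or code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from pairwise comparisons (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    # Count how often a positive case outranks a negative case.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = outcome occurred, 0 = it did not.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.1]
print(roc_auc(labels, scores))  # 0.75: three of four positive/negative pairs are ordered correctly
```

This pairwise form is quadratic in the number of patients, which is fine for a cohort of a few hundred; a rank-based formulation gives the same value more efficiently on larger data.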
