The Effect of Dimensionality Reduction on Software Vulnerability Prediction Models

2017 ◽  
Vol 66 (1) ◽  
pp. 17-37 ◽  
Author(s):  
Jeffrey Stuckman ◽  
James Walden ◽  
Riccardo Scandariato
2022 ◽  
Vol 13 (1) ◽  
pp. 0-0

Any vulnerability in software creates a security threat and helps hackers gain unauthorized access to resources. Vulnerability prediction models help software engineers allocate their resources effectively to find vulnerable classes in the software before it is delivered to customers. Vulnerable classes must be carefully reviewed by security experts and tested to identify potential threats that may arise in the future. In the present work, a novel technique based on the Grey wolf algorithm and Random forest is proposed for software vulnerability prediction. The Grey wolf technique is a metaheuristic and is used to select the best subset of features. The proposed technique is compared with other machine learning techniques. Experiments were performed on three publicly available datasets. The proposed technique (GW-RF) outperformed all other techniques for software vulnerability prediction.
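The feature selection step described in this abstract can be illustrated with a minimal sketch of a binary grey wolf search. This is not the authors' implementation: the function name `binary_gwo`, the 0.5 threshold used to turn positions into feature masks, and all parameter defaults are assumptions made for illustration. The abstract pairs the search with a random forest; here a hypothetical `toy_fitness` function stands in for the classifier's validation score so that the example stays self-contained.

```python
import random


def binary_gwo(fitness, n_features, n_wolves=8, n_iters=30, seed=0):
    """Return (best_mask, best_score) found by a simple binary grey wolf search.

    Assumptions (not from the abstract): positions live in [0, 1] and a
    feature counts as selected when its position exceeds 0.5.
    Requires n_wolves >= 3 and n_iters >= 1.
    """
    rng = random.Random(seed)
    wolves = [[rng.random() for _ in range(n_features)] for _ in range(n_wolves)]

    def to_mask(position):
        return [value > 0.5 for value in position]

    best_mask, best_score = None, float("-inf")
    for it in range(n_iters):
        # Rank the pack; the three best wolves act as alpha, beta and delta.
        ranked = sorted(wolves, key=lambda w: fitness(to_mask(w)), reverse=True)
        leaders = [list(w) for w in ranked[:3]]
        top_score = fitness(to_mask(leaders[0]))
        if top_score > best_score:
            best_mask, best_score = to_mask(leaders[0]), top_score

        a = 2.0 * (1 - it / n_iters)  # decreases linearly from 2 towards 0
        for wolf in wolves:
            for j in range(n_features):
                estimate = 0.0
                for leader in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    D = abs(C * leader[j] - wolf[j])
                    estimate += leader[j] - A * D
                # Average the three leader-guided moves and clamp to [0, 1].
                wolf[j] = min(1.0, max(0.0, estimate / 3))

    # Score the final positions as well, since the loop updates after ranking.
    for wolf in wolves:
        score = fitness(to_mask(wolf))
        if score > best_score:
            best_mask, best_score = to_mask(wolf), score
    return best_mask, best_score


# Hypothetical stand-in for a classifier's validation score:
# pretend only features 0 and 3 carry signal, and penalise larger subsets.
INFORMATIVE = {0, 3}


def toy_fitness(mask):
    hits = sum(1 for i in INFORMATIVE if mask[i])
    return hits - 0.1 * sum(mask)


if __name__ == "__main__":
    mask, score = binary_gwo(toy_fitness, n_features=6)
    print("selected features:", [i for i, keep in enumerate(mask) if keep])
    print("fitness:", score)
```

In a setting closer to the one described, `toy_fitness` would be replaced by a function that trains a random forest on the selected columns and returns a cross-validated score.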


2020 ◽  
Vol 28 (4) ◽  
pp. 1413-1446 ◽  
Author(s):  
Patrick Kwaku Kudjo ◽  
Jinfu Chen ◽  
Solomon Mensah ◽  
Richard Amankwah ◽  
Christopher Kudjo

2020 ◽  
Author(s):  
Nan Liu ◽  
Marcel Lucas Chee ◽  
Zhi Xiong Koh ◽  
Su Li Leow ◽  
Andrew Fu Wah Ho ◽  
...  

Abstract. Background: Chest pain is among the most common presenting complaints in the emergency department (ED). Swift and accurate risk stratification of chest pain patients in the ED may improve patient outcomes and reduce unnecessary costs. Traditional logistic regression with stepwise variable selection has been used to build risk prediction models for ED chest pain patients. In this study, we aimed to investigate whether machine learning dimensionality reduction methods can achieve better performance than the stepwise approach in deriving risk stratification models. Methods: A retrospective analysis was conducted on the data of patients >20 years old who presented to the ED of Singapore General Hospital with chest pain between September 2010 and July 2015. Variables used included demographics, medical history, laboratory findings, heart rate variability (HRV), and HRnV parameters calculated from five- to six-minute electrocardiograms (ECGs). The primary outcome was 30-day major adverse cardiac events (MACE), which included death, acute myocardial infarction, and revascularization. Candidate variables identified using univariable analysis were then used to generate the stepwise logistic regression model and eight machine learning dimensionality reduction prediction models. A separate set of models was derived by excluding troponin. Receiver operating characteristic (ROC) and calibration analyses were used to compare model performance. Results: 795 patients were included in the analysis, of whom 247 (31%) met the primary outcome of 30-day MACE. Patients with MACE were older and more likely to be male. All eight dimensionality reduction methods marginally but non-significantly outperformed stepwise variable selection; the multidimensional scaling algorithm performed the best, with an area under the curve (AUC) of 0.901. All HRnV-based models generated in this study outperformed several existing clinical scores in ROC analysis. Conclusions: HRnV-based models using stepwise logistic regression performed better than existing chest pain scores for predicting MACE, with only marginal improvements using machine learning dimensionality reduction. Moreover, the traditional stepwise approach benefits from model transparency and interpretability; in comparison, machine learning dimensionality reduction models are black boxes, making them difficult to explain in clinical practice.
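The models in this abstract are compared by area under the ROC curve. As a minimal sketch of how that metric can be computed from predicted risk scores, the hypothetical helper `roc_auc` below uses the pairwise-ranking definition: the fraction of (event, non-event) pairs in which the event case receives the higher score, with ties counted as half. The labels and scores are made-up illustration values, not data from the study.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one.

    labels: iterable of 0/1 outcomes (1 = event, e.g. MACE).
    scores: predicted risks, in the same order as labels.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Made-up example values, for illustration only.
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
    print(roc_auc(labels, scores))
```

The double loop is quadratic in the number of cases, which is fine for a cohort of a few hundred patients; a rank-based formulation would scale better for larger datasets.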

