Buffer Overflow Vulnerability Prediction from x86 Executables Using Static Analysis and Machine Learning

Author(s):  
Bindu Madhavi Padmanabhuni ◽  
Hee Beng Kuan Tan
Symmetry ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 35
Author(s):  
Sungjoong Kim ◽  
Seongkyu Yeom ◽  
Haengrok Oh ◽  
Dongil Shin ◽  
Dongkyoo Shin

The development of information and communication technology (ICT) is making daily life more convenient by allowing access to information at anytime and anywhere and by improving the efficiency of organizations. Unfortunately, malicious code is also proliferating and becoming increasingly complex and sophisticated. In fact, even novices can now easily create it using hacking tools, which is causing it to increase and spread exponentially. It has become difficult for humans to respond to such a surge. As a result, many studies have pursued methods to automatically analyze and classify malicious code. There are currently two methods for analyzing it: a dynamic analysis method that executes the program directly and confirms the execution result, and a static analysis method that analyzes the program without executing it. This paper proposes a static analysis automation technique for malicious code that uses machine learning. This classification system was designed by combining a method for classifying malicious code using a portable executable (PE) structure and a method for classifying it using a PE structure. The system has 98.77% accuracy when classifying normal and malicious files. The proposed system can be used to classify various types of malware from PE files to shell code.


2021 ◽  
Vol 58 ◽  
pp. 102735
Author(s):  
Wadi’ Hijawi ◽  
Ja’far Alqatawna ◽  
Ala’ M. Al-Zoubi ◽  
Mohammad A. Hassonah ◽  
Hossam Faris

2021 ◽  
Vol 20 (1) ◽  
Author(s):  
Xiaoya Guo ◽  
Akiko Maehara ◽  
Mitsuaki Matsumura ◽  
Liang Wang ◽  
Jie Zheng ◽  
...  

Background: Coronary plaque vulnerability prediction is difficult because plaque vulnerability is non-trivial to quantify, clinically available medical imaging modalities are not sufficient to quantify thin cap thickness, prediction methods with high accuracy still need to be developed, and gold-standard data to validate vulnerability prediction are often not available. Patient follow-up intravascular ultrasound (IVUS), optical coherence tomography (OCT) and angiography data were acquired to construct 3D fluid–structure interaction (FSI) coronary models, and four machine-learning methods were compared to identify the optimal method for predicting future plaque vulnerability. Methods: Baseline and 10-month follow-up in vivo IVUS and OCT coronary plaque data were acquired from two arteries of one patient using IRB-approved protocols with informed consent obtained. IVUS and OCT-based FSI models were constructed to obtain plaque wall stress/strain and wall shear stress. Forty-five slices were selected as the machine learning sample database for the vulnerability prediction study. Thirteen key morphological factors from IVUS and OCT images and biomechanical factors from the FSI model were extracted from the 45 slices at baseline for analysis. Lipid percentage index (LPI), cap thickness index (CTI) and morphological plaque vulnerability index (MPVI) were quantified to measure plaque vulnerability. Four machine learning methods (least square support vector machine, discriminant analysis, random forest and ensemble learning) were employed to predict the changes of the three indices using all combinations of the 13 factors. A standard fivefold cross-validation procedure was used to evaluate prediction results. Results: For LPI change prediction using support vector machine, wall thickness was the optimal single-factor predictor with an area under curve (AUC) of 0.883, and the optimal combinational-factor predictor achieved an AUC of 0.963. For CTI change prediction using discriminant analysis, minimum cap thickness was the optimal single-factor predictor with an AUC of 0.818, while the optimal combinational-factor predictor achieved an AUC of 0.836. Using random forest for predicting MPVI change, minimum cap thickness was the optimal single-factor predictor with an AUC of 0.785, and the optimal combinational-factor predictor achieved an AUC of 0.847. Conclusion: This feasibility study demonstrated that machine learning methods could be used to accurately predict plaque vulnerability change based on morphological and biomechanical factors from multi-modality image-based FSI models. Large-scale studies are needed to verify our findings.
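The evaluation described above rests on two standard pieces: a fivefold split of the 45 slices and the AUC of a predictor's scores. The sketch below shows one plain-Python way to compute both. The helper names `kfold_indices` and `auc`, the shuffling seed, and the toy labels and scores are assumptions for illustration, not the authors' pipeline.

```python
import random


def kfold_indices(n_samples: int, k: int = 5, seed: int = 0):
    """Return k (train_indices, test_indices) pairs covering all samples."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Take every k-th shuffled index so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    return [(sorted(set(indices) - set(fold)), sorted(fold)) for fold in folds]


def auc(labels, scores) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Assumes both classes are present.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 45 samples split five ways gives test folds of 9 and training sets of 36.
for train, test in kfold_indices(45, k=5):
    print(len(train), len(test))

# Toy scores: three of the four positive/negative pairs are ordered correctly.
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

With only 45 samples, each test fold holds 9 slices, so per-fold AUC values can vary noticeably; pooling the out-of-fold scores before computing AUC is one common way to stabilize the estimate.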


2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Qingkun Meng ◽  
Chao Feng ◽  
Bin Zhang ◽  
Chaojing Tang

A buffer overflow vulnerability arises when a programmer's intentions are not implemented correctly. In this paper, a static analysis method based on machine learning is proposed to assist in auditing buffer overflow vulnerabilities. First, an extended code property graph is constructed from the source code to extract seven kinds of static attributes, which are used to describe buffer properties. After embedding these attributes into a vector space, five frequently used machine learning algorithms are employed to classify the functions into suspicious vulnerable functions and secure ones. The five classifiers reached an average recall of 83.5%, an average true negative rate of 85.9%, a best recall of 96.6%, and a best true negative rate of 91.4%. Due to the imbalance of the training samples, the average precision of the classifiers is 68.9% and the average F1 score is 75.2%. When the classifiers were applied to a new program, our method reduced false positives to 1/12 of Flawfinder's.
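The metrics quoted above all derive from the four cells of a binary confusion matrix. The sketch below computes them and uses hypothetical counts (not taken from the paper) to show why class imbalance can pull precision well below recall even when the true negative rate is high.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Recall, true negative rate, precision and F1 from confusion counts.

    Assumes every denominator is non-zero.
    """
    recall = tp / (tp + fn)        # share of vulnerable functions that were flagged
    tnr = tn / (tn + fp)           # share of secure functions that were passed
    precision = tp / (tp + fp)     # share of flagged functions that are vulnerable
    f1 = 2 * precision * recall / (precision + recall)
    return {"recall": recall, "tnr": tnr, "precision": precision, "f1": f1}


# Hypothetical imbalanced test set: 100 vulnerable and 900 secure functions.
metrics = classification_metrics(tp=85, fp=126, tn=774, fn=15)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this made-up case recall is 0.85 and the true negative rate is 0.86, yet precision is only about 0.40, because a 14% error rate on 900 secure functions yields more false alarms than there are true positives.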


2018 ◽  
Vol 7 (4.6) ◽  
pp. 410
Author(s):  
Hetal Suresh ◽  
Joseph Raymond V

Mobile phones have become an integral part of our day-to-day life. In the digitalized world, most of our day-to-day activities rely on mobile phones, such as banking, wallet payments, credentials and social accounts. Every technology that offers an advantage also carries a disadvantage. Users keep all their private and sensitive data on their mobile phones and download random applications from different platforms such as the Play Store and the App Store, so there is a high possibility that the downloaded applications are malicious. Existing systems detect such applications with antivirus software that relies on pre-built signatures; these signatures can identify already known malware, which a hacker can still modify and manipulate. In this project, our purpose is to identify malicious applications using machine learning. By combining static analysis and dynamic analysis, we use a hybrid approach for analysing and detecting malware threats in Android applications using a Recurrent Neural Network (RNN). The main aim of this project is to ensure that an installed application is benign; if it is not, the system should block the application and notify the user.


2022 ◽  
Vol 24 (3) ◽  
pp. 1-25
Author(s):  
Nishtha Paul ◽  
Arpita Jadhav Bhatt ◽  
Sakeena Rizvi ◽  
Shubhangi

The frequency of malware attacks involving Android apps is increasing day by day. Current studies have revealed startling facts about data harvesting incidents in which users' personal data is at stake. To preserve the privacy of users, a permission-induced risk interface, MalApp, is proposed to identify privacy violations arising from granting permissions during app installation. It comprises a multi-fold process that performs static analysis based on the app's category. First, reverse engineering is applied to extract app permissions and construct a Boolean-valued permission matrix. Second, permissions are ranked to identify the risky permissions across categories. Third, machine learning and ensembling techniques are incorporated to test the efficacy of the proposed approach on a data set of 404 benign and 409 malicious apps. The empirical studies show that the proposed algorithm gives a best-case malware detection rate of 98.33%. The highlight of the interface is that any app can be classified as benign or malicious using static analysis, even before running it.
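The first two steps above can be sketched in a few lines. The example builds a Boolean-valued permission matrix and ranks permissions by how much more often malicious apps request them than benign ones. That ranking score is an assumption chosen for illustration, since the abstract does not state the ranking criterion, and the app data is hypothetical.

```python
def permission_matrix(apps, permissions):
    """One row per app, one Boolean column per permission."""
    return [[perm in requested for perm in permissions] for requested in apps]


def rank_permissions(matrix, labels, permissions):
    """Rank permissions by (malicious request rate - benign request rate).

    Illustrative score only; labels use 1 for malicious and 0 for benign,
    and both classes are assumed to be present.
    """
    n_malicious = sum(labels)
    n_benign = len(labels) - n_malicious
    scores = {}
    for j, perm in enumerate(permissions):
        mal = sum(row[j] for row, y in zip(matrix, labels) if y == 1)
        ben = sum(row[j] for row, y in zip(matrix, labels) if y == 0)
        scores[perm] = mal / n_malicious - ben / n_benign
    return sorted(permissions, key=lambda p: scores[p], reverse=True)


# Hypothetical permission sets, as if extracted from four app manifests.
permissions = ["INTERNET", "SEND_SMS", "READ_CONTACTS"]
apps = [
    {"INTERNET"},
    {"INTERNET", "SEND_SMS", "READ_CONTACTS"},
    {"INTERNET", "SEND_SMS"},
    {"INTERNET", "READ_CONTACTS"},
]
labels = [0, 1, 1, 0]

matrix = permission_matrix(apps, permissions)
print(rank_permissions(matrix, labels, permissions))
```

Here `SEND_SMS` ranks first because only the malicious apps request it, while `INTERNET` carries no signal because every app requests it. The rows of the matrix could then serve as feature vectors for the classifiers in the third step.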


Author(s):  
Syed Khurram Jah Rizvi ◽  
Warda Aslam ◽  
Muhammad Shahzad ◽  
Shahzad Saleem ◽  
Muhammad Moazam Fraz

Enterprises are striving to remain protected against malware-based cyber-attacks on their infrastructure, facilities, networks and systems. Static analysis is an effective approach to detect malware, i.e., malicious Portable Executable (PE) files. It performs an in-depth analysis of PE files without executing them, which is highly useful for minimizing the risk of a malicious PE contaminating the system. Yet instant detection using static analysis has become very difficult due to the exponential rise in the volume and variety of malware. The compelling need for early-stage detection of malware-based attacks strongly motivates research into automated malware detection. Recent machine learning-aided malware detection approaches using static analysis are mostly supervised. Supervised malware detection using static analysis requires manual labelling and human feedback; therefore, it is less effective in a rapidly evolving and dynamic threat space. To this end, we propose a progressive deep unsupervised framework with a feature attention block for static analysis-based malware detection (PROUD-MAL). The framework is based on cascading blocks of unsupervised clustering and a feature attention-based deep neural network. The proposed deep neural network, embedded with the feature attention block, is trained on the pseudo labels. To evaluate the proposed unsupervised framework, we collected a real-time malware dataset by deploying low- and high-interaction honeypots on an enterprise organizational network. Moreover, an endpoint security solution was also deployed on an enterprise organizational network to collect malware samples. After post-processing and cleaning, the novel dataset consists of 15,457 PE samples comprising 8775 malicious and 6681 benign ones. The proposed PROUD-MAL framework achieved an accuracy of more than 98.09% with better quantitative performance in standard evaluation parameters on the collected dataset and outperformed other conventional machine learning algorithms. The implementation and dataset are available at https://bit.ly/35Sne3a.

