To Incorporate Sequential Dynamic Features in Malware Detection Engines

Mobile malware consists of malicious programs that target mobile devices. It is a growing problem, as seen in the rising number of mobile malware samples detected per year. The number of active smartphone users is expected to grow, which underscores the importance of research on mobile malware detection. Detection methods for mobile malware exist but are still limited. In this paper, we propose dynamic malware-detection methods that use device information such as CPU usage, battery usage, and memory usage to detect 10 subtypes of Mobile Trojans on the Android Operating System (OS). We use a real-life sensor dataset containing device and malware data from 47 users over one year (2016) to create multiple mobile malware detection methods. We examine which features, i.e., aspects of a device, are most important to monitor in order to detect (subtypes of) Mobile Trojans. The focus of this paper is on dynamic hardware features. Using these dynamic features, we apply the following machine learning classifiers: Random Forest, K-Nearest Neighbour, and AdaBoost.
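
To make the classification step concrete, the following is a minimal sketch of one of the named classifiers, K-Nearest Neighbour, applied to device readings. It is not the authors' implementation: the function name `knn_predict`, the feature layout (CPU, battery, memory), and every number in `TRAIN` are hypothetical values chosen only for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    dists = sorted((math.dist(vec, query), label) for vec, label in train)
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy readings: (CPU usage %, battery drain %/h, memory MB).
# These are illustrative values, not data from the paper.
TRAIN = [
    ((5.0, 1.0, 200.0), "benign"),
    ((7.0, 1.5, 220.0), "benign"),
    ((6.0, 1.2, 210.0), "benign"),
    ((60.0, 9.0, 600.0), "trojan"),
    ((55.0, 8.0, 580.0), "trojan"),
    ((65.0, 10.0, 620.0), "trojan"),
]

if __name__ == "__main__":
    print(knn_predict(TRAIN, (58.0, 8.5, 590.0)))
```

In practice the features would likely need to be scaled to a common range first, since the raw memory values here dominate the Euclidean distance.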

cHybriDroid: A Machine Learning-Based Hybrid Technique for Securing the Edge Computing

Security and Communication Networks ◽

10.1155/2020/8861639 ◽

2020 ◽

Vol 2020 ◽

pp. 1-14

Author(s):

Afifa Maryam ◽

Usman Ahmed ◽

Muhammad Aleem ◽

Jerry Chun-Wei Lin ◽

Muhammad Arshad Islam ◽

...

Keyword(s):

Machine Learning ◽

Hybrid Approach ◽

Malware Detection ◽

Edge Computing ◽

Cloud Services ◽

Hybrid Technique ◽

Dynamic Features ◽

Hybrid Features ◽

Android Malware ◽

Detection Techniques

Smartphones are an integral component of the mobile edge computing (MEC) framework. Securing the data stored on mobile devices is crucial for ensuring the smooth operation of cloud services. The growing number of malicious Android applications demands an in-depth investigation of their malicious intent in order to design effective malware detection techniques. The contemporary state-of-the-art model suggests that hybrid features based on machine learning (ML) techniques could play a significant role in Android malware detection. The selection of an application's features plays a crucial role in capturing the behavioural patterns of malware instances needed for a useful classification of mobile applications. In this study, we propose a novel hybrid approach to detect Android malware, wherein static features are employed in conjunction with dynamic features of smartphone applications. We collect these hybrid features using permissions, intents, and run-time features (such as information leakage, exploitation of cryptography, and network manipulations) to analyse the effectiveness of the employed techniques for malware detection. We conduct experiments using over 5,000 real-world applications. The outcomes of the study reveal that the proposed set of features detects malware threats with a 97% F-measure.
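
As a rough illustration of the two ideas in this abstract, the sketch below concatenates static and dynamic indicators into one hybrid feature vector and computes the F-measure from confusion counts. The helper names (`hybrid_vector`, `f_measure`), the vocabularies, and the counts in the example are assumptions made for illustration; they are not taken from the paper.

```python
def hybrid_vector(static_feats, dynamic_feats, static_vocab, dynamic_vocab):
    """Build a binary hybrid vector: static indicators followed by dynamic ones."""
    static_part = [1 if name in static_feats else 0 for name in static_vocab]
    dynamic_part = [1 if name in dynamic_feats else 0 for name in dynamic_vocab]
    return static_part + dynamic_part


def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical vocabularies (permission names and run-time behaviour labels).
STATIC_VOCAB = ["SEND_SMS", "READ_CONTACTS"]
DYNAMIC_VOCAB = ["information_leakage", "network_manipulation"]

if __name__ == "__main__":
    vec = hybrid_vector({"SEND_SMS"}, {"network_manipulation"},
                        STATIC_VOCAB, DYNAMIC_VOCAB)
    print(vec)
    # Hypothetical counts: precision and recall are both 97/100 here.
    print(f_measure(tp=97, fp=3, fn=3))
```

With `beta=1.0` the score is the harmonic mean of precision and recall, which is the usual meaning of "F-measure" when no other weighting is stated.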

Machine Learning for Malware Detection: Beyond Accuracy Rates

10.5753/sbseg_estendido.2019.14005 ◽

2019 ◽

Author(s):

Lucas Galante ◽

Marcus Botacin ◽

André Grégio ◽

Paulo De Geus

Keyword(s):

Machine Learning ◽

Research Work ◽

Malware Detection ◽

Detection Accuracy ◽

Electronic Systems ◽

Dynamic Features ◽

Daily Lives ◽

Feedback Information ◽

The Cost ◽

Accuracy Rates

Today's world is supported by connected electronic systems, so ensuring their secure operation is essential to our daily lives. A major threat to system security is malware infections, which cause financial and image losses to corporate and end users, thus motivating the development of malware detectors. In this scenario, Machine Learning (ML) has been demonstrated to be a powerful technique for developing classifiers able to distinguish malware from goodware samples. However, many ML research works on malware detection focus only on the final detection accuracy rate and overlook other important aspects of a classifier's implementation and evaluation, such as feature extraction and parameter selection. In this paper, we shed light on these aspects to highlight the challenges and drawbacks of developing ML-based malware classifiers. We trained 25 distinct classification models and applied them to 2,800 real x86 Linux ELF malware binaries. Our results show that: (i) dynamic features outperform static features when the same classifiers are considered; (ii) discrete-bounded features present smaller accuracy variance over time in comparison to continuous features, at the cost of some time-localized accuracy loss; (iii) datasets presenting distinct characteristics (e.g., temporal changes) impose generalization challenges on ML models; and (iv) feature analysis can be used as feedback information for malware detection and infection prevention. We expect that our work can help other researchers when developing their ML-based malware classification solutions.

BrainShield: A Hybrid Machine Learning-Based Malware Detection Model for Android Devices

Electronics ◽

10.3390/electronics10232948 ◽

2021 ◽

Vol 10 (23) ◽

pp. 2948

Author(s):

Corentin Rodrigo ◽

Samuel Pierre ◽

Ronald Beaubrun ◽

Franjieh El Khoury

Keyword(s):

Neural Network ◽

Malware Detection ◽

Detection Methods ◽

Dynamic Features ◽

Android Malware ◽

Detection Model ◽

The Third ◽

Android Malware Detection ◽

Fully Connected ◽

Server Architecture

Android has become the leading operating system for mobile devices, and the most targeted one by malware. Therefore, many analysis methods have been proposed for detecting Android malware. However, few of them use proper datasets for evaluation. In this paper, we propose BrainShield, a hybrid malware detection model trained on the Omnidroid dataset to reduce attacks on Android devices. The latter is the most diversified dataset in terms of the number of different features, and contains the largest number of samples, 22,000 samples, for model evaluation in the Android malware detection field. BrainShield’s implementation is based on a client/server architecture and consists of three fully connected neural networks: (1) the first is used for static analysis and reaches an accuracy of 92.9% trained on 840 static features; (2) the second is a dynamic neural network that reaches an accuracy of 81.1% trained on 3722 dynamic features; and (3) the third neural network proposed is hybrid, reaching an accuracy of 91.1% trained on 7081 static and dynamic features. Simulation results show that BrainShield is able to improve the accuracy and the precision of well-known malware detection methods.

Runtime Detection Framework for Android Malware

Mobile Information Systems ◽

10.1155/2018/8094314 ◽

2018 ◽

Vol 2018 ◽

pp. 1-15 ◽

Cited By ~ 1

Author(s):

TaeGuen Kim ◽

BooJoong Kang ◽

Eul Gyu Im

Keyword(s):

Dynamic Analysis ◽

Static Analysis ◽

Suffix Tree ◽

Malware Detection ◽

Application Programming Interface ◽

Detection Methods ◽

Detection Accuracy ◽

Dynamic Features ◽

Android Malware ◽

Android Malware Detection

As the number of Android malware samples has increased rapidly over the years, various malware detection methods have been proposed. Existing methods can be classified into two categories: static analysis-based methods and dynamic analysis-based methods. Both approaches have limitations. Static analysis-based methods are relatively easy to evade through transformation techniques such as junk instruction insertion and code reordering. Dynamic analysis-based methods, in turn, incur relatively high analysis overheads and may require kernel modification to extract dynamic features. In this paper, we propose a dynamic analysis framework for Android malware detection that overcomes the aforementioned shortcomings. To reduce the malware detection overhead, the framework uses a suffix tree that contains API (Application Programming Interface) subtraces and their probabilistic confidence values, which are generated using HMMs (Hidden Markov Models). We designed the framework with a client-server architecture, since the suffix tree is infeasible to deploy on mobile devices. In addition, an application rewriting technique is used to trace API invocations without any modifications to the Android kernel. In our experiments, we measured the detection accuracy and the computational overheads to evaluate the effectiveness and efficiency of the proposed framework.
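
The lookup structure described here can be approximated, under simplifying assumptions, by a plain trie keyed on API call sequences. The sketch below is not the paper's suffix tree and does not involve an HMM: the class `SubtraceTrie`, the API names, and the confidence values are hypothetical placeholders that only show how stored subtraces with confidence scores might be matched against an observed trace.

```python
class SubtraceTrie:
    """Trie of API call subtraces, each stored with a confidence value."""

    def __init__(self):
        self.root = {"children": {}, "confidence": None}

    def insert(self, subtrace, confidence):
        """Store one subtrace (a list of API names) with its confidence."""
        node = self.root
        for api in subtrace:
            node = node["children"].setdefault(
                api, {"children": {}, "confidence": None}
            )
        node["confidence"] = confidence

    def best_match(self, trace):
        """Return the highest confidence among stored subtraces that occur
        contiguously in `trace`, or 0.0 if none occurs."""
        best = 0.0
        for start in range(len(trace)):
            node = self.root
            for api in trace[start:]:
                node = node["children"].get(api)
                if node is None:
                    break
                if node["confidence"] is not None:
                    best = max(best, node["confidence"])
        return best


if __name__ == "__main__":
    trie = SubtraceTrie()
    # Hypothetical subtraces and confidences (in the paper these come from HMMs).
    trie.insert(["getDeviceId", "openConnection", "write"], 0.9)
    trie.insert(["getDeviceId", "openConnection"], 0.6)
    print(trie.best_match(
        ["onCreate", "getDeviceId", "openConnection", "write", "close"]
    ))
```

Scanning every start position keeps the sketch short; a real suffix tree would avoid that repeated work, which is presumably part of the overhead reduction the abstract refers to.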

A Simhash-Based Integrative Features Extraction Algorithm for Malware Detection

Algorithms ◽

10.3390/a11080124 ◽

2018 ◽

Vol 11 (8) ◽

pp. 124 ◽

Cited By ~ 1

Author(s):

Yihong Li ◽

Fangzheng Liu ◽

Zhenyu Du ◽

Dubing Zhang

Keyword(s):

Feature Extraction ◽

Malware Detection ◽

Application Programming Interface ◽

Classification Performance ◽

Detection Performance ◽

Machine Learning Algorithms ◽

Dynamic Features ◽

Dynamic Information ◽

Static Information ◽

Extraction Algorithm

In the malware detection process, obfuscated malicious code cannot be detected efficiently and accurately in the dynamic or the static feature space alone. To address this problem, an integrative feature extraction algorithm based on simhash is proposed, which combines the static information, e.g., API (Application Programming Interface) calls, and the dynamic information (such as file, registry, and network behaviors) of malicious samples to form integrative features. The experiment extracts the integrative features from selected static and dynamic information, and then compares the classification, time, and obfuscated-detection performance of the static, dynamic, and integrated features using several common machine learning algorithms. The results show that the integrative features have better time performance than the static features, better classification performance than the dynamic features, and almost the same obfuscated-detection performance as the dynamic features. This algorithm can provide some support for feature extraction in malware detection.
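
For readers unfamiliar with simhash, the following is a minimal sketch of the standard weighted simhash fingerprint applied to a mixed bag of static and dynamic features. It illustrates the general technique only and is not the integrative algorithm from the paper; the `static:`/`dynamic:` feature names, the weights, and the choice of MD5 as the underlying hash are assumptions.

```python
import hashlib


def simhash(features, bits=64):
    """Compute a simhash fingerprint from a {feature_name: weight} mapping.

    `bits` must be a multiple of 8 and at most 128 (the MD5 digest size).
    """
    weights = [0] * bits
    for feat, w in features.items():
        digest = hashlib.md5(feat.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Add the weight where the feature hash has a 1 bit, subtract otherwise.
            weights[i] += w if (h >> i) & 1 else -w
    fingerprint = 0
    for i, total in enumerate(weights):
        if total > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming(a, b):
    """Number of differing bits between two fingerprints."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # Hypothetical feature bags mixing static and dynamic information.
    sample_a = {"static:CreateFileW": 2, "dynamic:registry_write": 1,
                "dynamic:network_connect": 1}
    sample_b = {"static:CreateFileW": 2, "dynamic:registry_write": 1,
                "dynamic:file_delete": 1}
    print(hamming(simhash(sample_a), simhash(sample_b)))
```

Samples that share most of their weighted features tend to produce fingerprints with a small Hamming distance, which is what makes the fingerprint usable as a compact feature.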

Malware Detection Based on Static and Dynamic Features Analysis

Machine Learning for Cyber Security - Lecture Notes in Computer Science ◽

10.1007/978-3-030-62223-7_10 ◽

2020 ◽

pp. 111-124

Author(s):

Budong Xu ◽

Yongqin Li ◽

Xiaomei Yu

Keyword(s):

Malware Detection ◽

Dynamic Features

DeepMal: maliciousness-Preserving adversarial instruction learning against static malware detection

Cybersecurity ◽

10.1186/s42400-021-00079-5 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Chun Yang ◽

Jinghui Xu ◽

Shuangshuang Liang ◽

Yanna Wu ◽

Yu Wen ◽

...

Keyword(s):

Computer Vision ◽

Language Processing ◽

Real World ◽

Malware Detection ◽

Small Scale ◽

Dynamic Features ◽

Adversarial Learning ◽

Adversarial Attack ◽

The Mean ◽

Real World Datasets

Beyond the explosively successful applications of deep learning (DL) in natural language processing, computer vision, and information retrieval, numerous Deep Neural Network (DNN) based alternatives have been proposed for common security-related scenarios, with malware detection among the more popular. Recently, adversarial learning has gained much attention. However, unlike in computer vision applications, a malware adversarial attack is expected to preserve the malware's original malicious semantics. This paper proposes DeepMal, a novel adversarial instruction learning technique for attacking static malware detection. As far as we know, DeepMal is the first practical and systematic adversarial learning method that can directly produce adversarial samples and effectively bypass static malware detectors powered by DL and machine learning (ML) models while preserving attack functionality in the real world. Moreover, our method conducts small-scale attacks, which can evade typical malware variant analysis (e.g., duplication checks). We evaluate DeepMal on two real-world datasets, six typical DL models, and three typical ML models. Experimental results demonstrate that, on both datasets, DeepMal can attack typical malware detectors, with the mean F1-score and the F1-score decreasing by up to 93.94% and 82.86%, respectively. In addition, three typical types of malware samples (Trojan horses, Backdoors, Ransomware) are shown to preserve their original attack functionality, and the mean duplication check ratio of malware adversarial samples is below 2.0%. Finally, DeepMal can evade dynamic detectors and can be easily enhanced by learning more dynamic features with specific constraints.
