Automatic Benchmark Generation Framework for Malware Detection

Security and Communication Networks ◽

10.1155/2018/4947695 ◽

2018 ◽

Vol 2018 ◽

pp. 1-8 ◽

Cited By ~ 2

Author(s):

Guanghui Liang ◽

Jianmin Pang ◽

Zheng Shan ◽

Runqing Yang ◽

Yihang Chen

Keyword(s):

Initial Data ◽

Malware Detection ◽

Detection Methods ◽

Small Data ◽

Security Threats ◽

Improved Genetic Algorithm ◽

Data Set ◽

Detection Model ◽

Manual Selection ◽

Selection Of

To address emerging security threats, various malware detection methods are proposed every year. A small but representative set of malware samples is therefore usually needed for detection models, especially machine-learning-based ones. However, the current manual selection of representative samples from a large collection of unknown files is labor intensive and not scalable. In this paper, we first propose a framework that can automatically generate a small data set for malware detection. With this framework, we extract behavior features from a large initial data set and then use a hierarchical clustering technique to identify different types of malware. An improved genetic algorithm based on roulette wheel sampling is implemented to generate the final test data set. The final data set is only one-eighteenth the volume of the initial data set, and evaluations show that the data set selected by the proposed framework is much smaller than the original one yet loses almost none of its semantics.
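
The abstract names roulette wheel sampling but does not describe the authors' improvements to the genetic algorithm. As a rough illustration of the basic selection step only, the following minimal Python sketch draws individuals with probability proportional to fitness. The function name `roulette_wheel_select` and the fitness values are hypothetical and are not taken from the paper.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_wheel_select(fitness, k, rng):
    """Draw k indices, each with probability proportional to its fitness.

    Plain fitness-proportionate selection; NOT the paper's improved variant.
    """
    if not fitness or any(f < 0 for f in fitness) or sum(fitness) == 0:
        raise ValueError("fitness must be non-empty, non-negative, not all zero")
    cumulative = list(accumulate(fitness))  # running totals form the wheel
    total = cumulative[-1]
    picks = []
    for _ in range(k):
        r = rng.random() * total  # spin: a point in [0, total)
        # First slot whose cumulative total exceeds r; clamp as a safeguard.
        picks.append(min(bisect_right(cumulative, r), len(fitness) - 1))
    return picks


# Hypothetical fitness scores for four candidate samples.
rng = random.Random(42)
print(roulette_wheel_select([0.1, 0.4, 0.3, 0.2], k=5, rng=rng))
```

Candidates with zero fitness are never drawn, and candidates with larger fitness occupy a wider slice of the wheel.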

Analytic Study of Features for the Detection of Covert Timing Channels in Network Traffic

Journal of Cyber Security and Mobility ◽

10.13052/2245-1439.632 ◽

2017 ◽

Author(s):

Félix Iglesias Vázquez ◽

Robert Annessi ◽

Tanja Zseby

Keyword(s):

Experimental Studies ◽

Machine Learning Algorithms ◽

Detection Methods ◽

Security Threats ◽

Building Detection ◽

Problem Space ◽

Timing Channels ◽

Covert Timing Channels ◽

Expert Community ◽

Selection Of

Covert timing channels are security threats that have concerned the expert community since the beginnings of secure computer networks. In this paper we explore the nature of covert timing channels by studying the behavior of a selection of features used for their detection. Insights are obtained from experimental studies based on ten covert timing channel techniques published in the literature, which include popular and novel approaches. The study digs into the shapes of flows containing covert timing channels from a statistical perspective as well as using supervised and unsupervised machine learning algorithms. Our experiments reveal which features are recommended for building detection methods and draw meaningful representations to understand the problem space. Covert timing channels show high histogram-distance-based outlierness, but this is insufficient to clearly discriminate them from normal traffic. On the other hand, traffic features do show dependencies that allow separating subspaces and facilitate the identification of covert timing channels. The conducted study shows the detection difficulties due to the high shape variability of normal traffic and suggests the implementation of semi-supervised techniques to develop accurate and reliable detectors.

Learning-Based Detection for Malicious Android Application Using Code Vectorization

Security and Communication Networks ◽

10.1155/2021/9964224 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Lin Liu ◽

Wang Ren ◽

Feng Xie ◽

Shengwei Yi ◽

Junkai Yi ◽

...

Keyword(s):

Hybrid Model ◽

Malware Detection ◽

Virus Detection ◽

Training Model ◽

Malicious Code ◽

Detection Methods ◽

Android Application ◽

Fusion Model ◽

Data Set ◽

Android Applications

Makers of malicious APKs (Android Application Packages) use techniques such as code obfuscation and code encryption to evade existing detection methods, which poses new challenges for accurate virus detection and makes malicious code increasingly difficult to detect. A report indicates that a new malicious app for Android is created every 10 seconds. To combat this serious malware activity, a scalable malware detection approach is needed that can effectively and efficiently identify malware apps. Common static detection methods often rely on hash matching and analysis of viruses, which cannot quickly detect new malicious Android applications and their variants. In this paper, a malicious Android application detection method is proposed, which is implemented by a deep network fusion model. The hybrid model only needs to use the sample training model to achieve high accuracy in the identification of malicious applications, which makes it more suitable than existing methods for detecting new malicious Android applications. This method extracts the static features in the core code of the Android application by decompiling APK files, then performs code vectorization processing, and uses the deep learning network for classification and discrimination. Our experiments with a data set containing 10,170 apps show that the decisions from the hybrid model can increase the malware detection rate significantly on a real device, which verifies the superiority of this method in the detection of malicious code.

Mlifdect: Android Malware Detection Based on Parallel Machine Learning and Information Fusion

Security and Communication Networks ◽

10.1155/2017/6451260 ◽

2017 ◽

Vol 2017 ◽

pp. 1-14 ◽

Cited By ~ 8

Author(s):

Xin Wang ◽

Dafang Zhang ◽

Xin Su ◽

Wenjia Li

Keyword(s):

Machine Learning ◽

Information Fusion ◽

Malware Detection ◽

Parallel Machine ◽

Detection Methods ◽

Detection Accuracy ◽

Android Malware ◽

Detection Model ◽

Android Apps ◽

Android Malware Detection

In recent years, Android malware has continued to grow at an alarming rate. More recent malicious apps employ highly sophisticated detection avoidance techniques, which makes traditional machine learning based malware detection methods far less effective. More specifically, these methods cannot cope with various types of Android malware and are limited in detection because they utilize a single classification algorithm. To address this limitation, we propose a novel approach in this paper, named Mlifdect, that leverages parallel machine learning and information fusion techniques for better Android malware detection. To implement this approach, we first extract eight types of features from static analysis on Android apps and build two kinds of feature sets after feature selection. Then, a parallel machine learning detection model is developed for speeding up the process of classification. Finally, we investigate the probability analysis based and Dempster-Shafer theory based information fusion approaches, which can effectively obtain the detection results. To validate our method, other state-of-the-art detection works are selected for comparison with real-world Android apps. The experimental results demonstrate that Mlifdect is capable of achieving higher detection accuracy as well as remarkable run-time efficiency compared to the existing malware detection solutions.
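
The abstract mentions Dempster-Shafer theory based information fusion without giving details of how Mlifdect applies it. As a hedged illustration of the general idea, the sketch below implements the standard Dempster rule of combination for two classifiers over the frame {malware, benign}. The function name `dempster_combine` and the mass values are hypothetical and are not taken from the paper.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions (dict: frozenset -> mass) with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            # Agreeing evidence supports the intersection of the two sets.
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Disjoint sets contradict each other; track the conflicting mass.
            conflict += wa * wb
    if 1.0 - conflict <= 1e-12:
        raise ValueError("sources are in total conflict; rule is undefined")
    # Renormalize so the remaining masses sum to 1.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


MALWARE = frozenset({"malware"})
BENIGN = frozenset({"benign"})
UNSURE = MALWARE | BENIGN  # mass left on the whole frame expresses ignorance

# Hypothetical outputs of two classifiers, for illustration only.
clf_a = {MALWARE: 0.6, BENIGN: 0.1, UNSURE: 0.3}
clf_b = {MALWARE: 0.7, BENIGN: 0.2, UNSURE: 0.1}
fused = dempster_combine(clf_a, clf_b)
print({"/".join(sorted(s)): round(w, 3) for s, w in fused.items()})
```

When both sources lean toward the same hypothesis, the fused belief in that hypothesis is higher than either source's individual belief.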

An Ensemble-Based Malware Detection Model Using Minimum Feature Set

MENDEL ◽

10.13164/mendel.2019.2.001 ◽

2019 ◽

Vol 25 (2) ◽

pp. 1-10 ◽

Cited By ~ 2

Author(s):

Ivan Zelinka ◽

Eslam Amer

Keyword(s):

Machine Learning ◽

False Positive Rate ◽

Malware Detection ◽

Machine Learning Techniques ◽

Detection Methods ◽

Detection Model ◽

Learning Techniques ◽

Proposed Model ◽

Positive Rate ◽

Minimum Number

Current commercial antivirus detection engines still rely on signature-based methods. However, with the huge increase in the number of new malware samples, current detection methods are no longer suitable. In this paper, we introduce a malware detection model based on ensemble learning. The model is trained using the minimum number of significant features that are extracted from the file header. Evaluations show that the ensemble models slightly outperform individual classification models. Experimental evaluations show that our model can predict unseen malware with an accuracy rate of 0.998 and with a false positive rate of 0.002. The paper also includes a comparison between the performance of the proposed model and that of different machine learning techniques. We emphasize the use of machine learning based approaches to replace conventional signature-based methods.
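
For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and false positive rate are computed from a confusion matrix. The counts are invented purely so that the arithmetic is easy to follow; they are not the paper's evaluation data, and the function name `accuracy_and_fpr` is hypothetical.

```python
def accuracy_and_fpr(tp, tn, fp, fn):
    """Return (accuracy, false positive rate) from confusion-matrix counts."""
    total = tp + tn + fp + fn
    negatives = tn + fp  # all truly benign files
    if total == 0 or negatives == 0:
        raise ValueError("need at least one sample and one benign sample")
    accuracy = (tp + tn) / total  # share of all files classified correctly
    fpr = fp / negatives          # share of benign files flagged as malware
    return accuracy, fpr


# Hypothetical counts for illustration only; NOT taken from the paper.
acc, fpr = accuracy_and_fpr(tp=998, tn=998, fp=2, fn=2)
print(f"accuracy={acc:.3f} fpr={fpr:.3f}")
```

Note that the false positive rate is computed over benign files only, so it can differ sharply from the overall error rate when the classes are imbalanced.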

BrainShield: A Hybrid Machine Learning-Based Malware Detection Model for Android Devices

Electronics ◽

10.3390/electronics10232948 ◽

2021 ◽

Vol 10 (23) ◽

pp. 2948

Author(s):

Corentin Rodrigo ◽

Samuel Pierre ◽

Ronald Beaubrun ◽

Franjieh El Khoury

Keyword(s):

Neural Network ◽

Malware Detection ◽

Detection Methods ◽

Dynamic Features ◽

Android Malware ◽

Detection Model ◽

The Third ◽

Android Malware Detection ◽

Fully Connected ◽

Server Architecture

Android has become the leading operating system for mobile devices, and the most targeted one by malware. Therefore, many analysis methods have been proposed for detecting Android malware. However, few of them use proper datasets for evaluation. In this paper, we propose BrainShield, a hybrid malware detection model trained on the Omnidroid dataset to reduce attacks on Android devices. The latter is the most diversified dataset in terms of the number of different features, and contains the largest number of samples, 22,000 samples, for model evaluation in the Android malware detection field. BrainShield’s implementation is based on a client/server architecture and consists of three fully connected neural networks: (1) the first is used for static analysis and reaches an accuracy of 92.9% trained on 840 static features; (2) the second is a dynamic neural network that reaches an accuracy of 81.1% trained on 3722 dynamic features; and (3) the third neural network proposed is hybrid, reaching an accuracy of 91.1% trained on 7081 static and dynamic features. Simulation results show that BrainShield is able to improve the accuracy and the precision of well-known malware detection methods.

Packed malware variants detection using deep belief networks

MATEC Web of Conferences ◽

10.1051/matecconf/202030902002 ◽

2020 ◽

Vol 309 ◽

pp. 02002 ◽

Cited By ~ 1

Author(s):

Zhigang Zhang ◽

Chaowen Chang ◽

Peisheng Han ◽

Hongtao Zhang

Keyword(s):

Information Gain ◽

Malware Detection ◽

Normal System ◽

Detection Accuracy ◽

System Call ◽

Security Threats ◽

Malware Analysis ◽

Deep Belief Networks ◽

Detection Model ◽

Detection Technology

Malware is one of the most serious network security threats. To detect unknown variants of malware, many researchers have proposed various methods of malware detection based on machine learning in recent years. However, modern malware is often protected by software packers, obfuscation, and other technologies, which bring challenges to malware analysis and detection. In this paper, we propose a system call based malware detection technology. By comparing malware and benign software in a sandbox environment, a sensitive system call context is extracted based on information gain, which reduces obfuscation caused by normal system calls. By using a deep belief network, we train a malware detection model with the sensitive system call context to improve the detection accuracy.
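
The abstract does not specify how the system call context is encoded before information gain is applied. Assuming, for illustration only, a binary feature that records whether a given system call appears in a sample's trace, information gain can be sketched as follows. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy obtained by splitting on a feature."""
    n = len(labels)
    gain = entropy(labels)
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        gain -= len(subset) / n * entropy(subset)
    return gain


# Toy data (assumed): 1 means the system call was observed in the trace.
labels = ["malware", "malware", "benign", "benign"]
sensitive_call = [1, 1, 0, 0]  # appears only in malware traces
common_call = [1, 0, 1, 0]     # appears equally often in both classes

print(information_gain(sensitive_call, labels))
print(information_gain(common_call, labels))
```

A call that appears equally often in both classes yields no gain, which matches the abstract's point that filtering by information gain suppresses the noise from normal system calls.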

Multitype Damage Detection of Container Using CNN Based on Transfer Learning

Mathematical Problems in Engineering ◽

10.1155/2021/5395494 ◽

2021 ◽

Vol 2021 ◽

pp. 1-12

Author(s):

Zixin Wang ◽

Jing Gao ◽

Qingcheng Zeng ◽

Yuhui Sun

Keyword(s):

Service Life ◽

Damage Detection ◽

Transfer Learning ◽

Vital Role ◽

Detection Methods ◽

Data Set ◽

Detection Model ◽

Natural Factors ◽

Container Management

Owing to repeated exposure to mechanical operations and natural factors, containers suffer various types of damage during use. Adopting effective container damage detection methods plays a vital role in prolonging their service life and preserving their function. This paper proposes a multitype damage detection model for containers based on transfer learning and MobileNetV2. In addition, a data set containing nine typical types of container damage is established. To ensure the validity and practicability of the model, we conducted tests and verifications in an actual port environment. The results show that the model can identify multiple types of container damage. Compared with existing models, the damage detection model proposed in this paper can reliably identify various types of container damage, which makes it more suitable for actual container inspection. This method can provide a new approach to damage detection for container management in ports.

Application Research on the Medical imaging testing of Novel Coronavirus Pneumonia COVID-19 based on Transfer Learning

10.21203/rs.3.rs-771648/v1 ◽

2021 ◽

Author(s):

Shouqiang Liu ◽

Mingyue Jiang ◽

Liming Chen ◽

Yang Wang

Keyword(s):

Medical Imaging ◽

Transfer Learning ◽

Image Data ◽

Training Model ◽

Small Sample ◽

Detection Methods ◽

Nucleic Acid Detection ◽

Small Data ◽

Data Set ◽

Novel Coronavirus

Novel coronavirus pneumonia (COVID-19) is a highly infectious and fatal pneumonia-type disease that poses a great threat to public safety. A fast and efficient method for screening COVID-19-positive patients is essential. At present, the main detection methods are nucleic acid detection of manual diagnosis and medical imaging (CT image/X-ray image), both of which take a long time to obtain the diagnosis result. This paper discusses the common processing methods for the problem of insufficient medical image data. Then, transfer learning and a convolutional neural network are used to construct a screening and diagnosis model for COVID-19, and different migration models are analyzed and compared to select a better pre-training model, which is trained and analyzed on small data sets. Finally, the paper analyzes and discusses how to train a highly reliable model that can quickly help doctors provide advice at the critical moment of epidemic prevention and control when only a small sample data set is available.

ON THE EXPERT’S RIGHT TO BE PRESENT AT LEGAL PROCEEDINGS

Theory and Practice of Forensic Science and Criminalistics ◽

10.32353/khrife.2015.19 ◽

2016 ◽

Vol 15 ◽

pp. 163-171

Author(s):

M. G. Shcherbakovskiy

Keyword(s):

Initial Data ◽

Legal Proceedings ◽

Reliable Assessment ◽

Data Acquiring ◽

Selection Of

The article discusses the reasons for an expert to participate in legal proceedings. The gnoseological reason is the poor quality of the materials subject to examination, which either renders the examination completely impossible or compromises an objective, reasoned and reliable assessment of the findings. The procedural reason is the proscription against an expert collecting evidence himself or herself. The author investigates the ways in which an expert can participate in legal proceedings. If the defense invites an expert to participate in the proceedings, then it is recommended that his or her involvement take place in the presence of attesting witnesses and be recorded in the protocol. In the course of the legal proceedings an expert has the following tasks: adding initial data, acquiring new initial data, understanding the situation of the incident, and acquiring new objects to be studied, including samples for examination. An expert’s participation in legal proceedings differs from the participation of a specialist or an examination on the scene of the incident. The author describes the tasks that an expert solves in the course of legal proceedings, the peculiarities of investigation experiment practices, and the selection of samples for an examination, inspection, and interrogation.

Building Damage Detection from Post-Event Aerial Imagery Using Single Shot Multibox Detector

Applied Sciences ◽

10.3390/app9061128 ◽

2019 ◽

Vol 9 (6) ◽

pp. 1128 ◽

Cited By ~ 12

Author(s):

Yundong Li ◽

Wei Hu ◽

Han Dong ◽

Xueyan Zhang

Keyword(s):

Machine Learning ◽

Data Augmentation ◽

Hurricane Sandy ◽

Training Data ◽

Aerial Images ◽

Detection Methods ◽

Single Shot ◽

Data Set ◽

Augmentation Strategies ◽

Post Disaster

Using aerial cameras, satellite remote sensing, or unmanned aerial vehicles (UAVs) equipped with cameras can facilitate search and rescue tasks after disasters. The traditional manual interpretation of huge aerial images is inefficient and could be replaced by machine learning-based methods combined with image processing techniques. Given the development of machine learning, researchers find that convolutional neural networks can effectively extract features from images. Some target detection methods based on deep learning, such as the single-shot multibox detector (SSD) algorithm, can achieve better results than traditional methods. However, the impressive performance of machine learning-based methods depends on numerous labeled samples. Given the complexity of post-disaster scenarios, obtaining many samples in the aftermath of disasters is difficult. To address this issue, a damaged building assessment method using SSD with pretraining and data augmentation is proposed in the current study, with the following highlights. (1) Objects can be detected and classified into undamaged buildings, damaged buildings, and ruins. (2) A convolution auto-encoder (CAE) that consists of VGG16 is constructed and trained using unlabeled post-disaster images. As a transfer learning strategy, the weights of the SSD model are initialized using the weights of the CAE counterpart. (3) Data augmentation strategies, such as image mirroring, rotation, Gaussian blur, and Gaussian noise processing, are utilized to augment the training data set. As a case study, aerial images of Hurricane Sandy in 2012 were used to validate the proposed method’s effectiveness. Experiments show that the pretraining strategy can improve overall accuracy by 10% compared with the SSD trained from scratch. These experiments also demonstrate that using data augmentation strategies can improve mAP and mF1 by 72% and 20%, respectively. Finally, the experiment is further verified on another dataset from Hurricane Irma, and it is concluded that the proposed method is feasible.
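
Three of the augmentation strategies named in the abstract (mirroring, rotation, and Gaussian noise) can be sketched on a tiny grayscale image stored as a nested list. This is a minimal standard-library illustration, not the authors' pipeline: the function names, the 90-degree rotation angle, and the noise level are all assumptions, and Gaussian blur is omitted for brevity.

```python
import random


def mirror(image):
    """Flip the image horizontally (left-right)."""
    return [row[::-1] for row in image]


def rotate90(image):
    """Rotate the image 90 degrees clockwise."""
    return [list(col) for col in zip(*image[::-1])]


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise, clamping pixels to the range 0..255."""
    return [
        [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        for row in image
    ]


# Toy 2x2 grayscale image (assumed data).
image = [[10, 20],
         [30, 40]]
rng = random.Random(0)
augmented = [mirror(image), rotate90(image), add_gaussian_noise(image, 5.0, rng)]
for variant in augmented:
    print(variant)
```

Each transform yields a new training image that shares the original's label, which is how augmentation enlarges a small labeled set. For object detection, bounding boxes must be transformed along with the pixels.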
