scholarly journals Machine Learning and Deep Learning Methods for Intrusion Detection Systems: A Survey

2019 ◽  
Vol 9 (20) ◽  
pp. 4396 ◽  
Author(s):  
Hongyu Liu ◽  
Bo Lang

Networks play important roles in modern life, and cyber security has become a vital research area. An intrusion detection system (IDS) which is an important cyber security technique, monitors the state of software and hardware running in the network. Despite decades of development, existing IDSs still face challenges in improving the detection accuracy, reducing the false alarm rate and detecting unknown attacks. To solve the above problems, many researchers have focused on developing IDSs that capitalize on machine learning methods. Machine learning methods can automatically discover the essential differences between normal data and abnormal data with high accuracy. In addition, machine learning methods have strong generalizability, so they are also able to detect unknown attacks. Deep learning is a branch of machine learning, whose performance is remarkable and has become a research hotspot. This survey proposes a taxonomy of IDS that takes data objects as the main dimension to classify and summarize machine learning-based and deep learning-based IDS literature. We believe that this type of taxonomy framework is fit for cyber security researchers. The survey first clarifies the concept and taxonomy of IDSs. Then, the machine learning algorithms frequently used in IDSs, metrics, and benchmark datasets are introduced. Next, combined with the representative literature, we take the proposed taxonomic system as a baseline and explain how to solve key IDS issues with machine learning and deep learning techniques. Finally, challenges and future developments are discussed by reviewing recent representative studies.

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1113
Author(s):  
Ming Zhong ◽  
Yajin Zhou ◽  
Gang Chen

IoT plays an important role in daily life; commands and data transfer rapidly between the servers and objects to provide services. However, cyber threats have become a critical factor, especially for IoT servers. There should be a vigorous way to protect the network infrastructures from various attacks. IDS (Intrusion Detection System) is the invisible guardian for IoT servers. Many machine learning methods have been applied in IDS. However, there is a need to improve the IDS system for both accuracy and performance. Deep learning is a promising technique that has been used in many areas, including pattern recognition, natural language processing, etc. The deep learning reveals more potential than traditional machine learning methods. In this paper, sequential model is the key point, and new methods are proposed by the features of the model. The model can collect features from the network layer via tcpdump packets and application layer via system routines. Text-CNN and GRU methods are chosen because the can treat sequential data as a language model. The advantage compared with the traditional methods is that they can extract more features from the data and the experiments show that the deep learning methods have higher F1-score. We conclude that the sequential model-based intrusion detection system using deep learning method can contribute to the security of the IoT servers.


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Yue Wang ◽  
Yiming Jiang ◽  
Julong Lan

When traditional machine learning methods are applied to network intrusion detection, they need to rely on expert knowledge to extract feature vectors in advance, which incurs lack of flexibility and versatility. Recently, deep learning methods have shown superior performance compared with traditional machine learning methods. Deep learning methods can learn the raw data directly, but they are faced with expensive computing cost. To solve this problem, a preprocessing method based on multipacket input unit and compression is proposed, which takes m data packets as the input unit to maximize the retention of information and greatly compresses the raw traffic to shorten the data learning and training time. In our proposed method, the CNN network structure is optimized and the weights of some convolution layers are assigned directly by using the Gabor filter. Experimental results on the benchmark data set show that compared with the existing models, the proposed method improves the detection accuracy by 2.49% and reduces the training time by 62.1%. In addition, the experiments show that the proposed compression method has obvious advantages in detection accuracy and computational efficiency compared with the existing compression methods.


2020 ◽  
Author(s):  
Thomas R. Lane ◽  
Daniel H. Foil ◽  
Eni Minerali ◽  
Fabio Urbina ◽  
Kimberley M. Zorn ◽  
...  

<p>Machine learning methods are attracting considerable attention from the pharmaceutical industry for use in drug discovery and applications beyond. In recent studies we have applied multiple machine learning algorithms, modeling metrics and in some cases compared molecular descriptors to build models for individual targets or properties on a relatively small scale. Several research groups have used large numbers of datasets from public databases such as ChEMBL in order to evaluate machine learning methods of interest to them. The largest of these types of studies used on the order of 1400 datasets. We have now extracted well over 5000 datasets from CHEMBL for use with the ECFP6 fingerprint and comparison of our proprietary software Assay Central<sup>TM</sup> with random forest, k-Nearest Neighbors, support vector classification, naïve Bayesian, AdaBoosted decision trees, and deep neural networks (3 levels). Model performance <a>was</a> assessed using an array of five-fold cross-validation metrics including area-under-the-curve, F1 score, Cohen’s kappa and Matthews correlation coefficient. <a>Based on ranked normalized scores for the metrics or datasets all methods appeared comparable while the distance from the top indicated Assay Central<sup>TM</sup> and support vector classification were comparable. </a>Unlike prior studies which have placed considerable emphasis on deep neural networks (deep learning), no advantage was seen in this case where minimal tuning was performed of any of the methods. If anything, Assay Central<sup>TM</sup> may have been at a slight advantage as the activity cutoff for each of the over 5000 datasets representing over 570,000 unique compounds was based on Assay Central<sup>TM</sup>performance, but support vector classification seems to be a strong competitor. We also apply Assay Central<sup>TM</sup> to prospective predictions for PXR and hERG to further validate these models. This work currently appears to be the largest comparison of machine learning algorithms to date. Future studies will likely evaluate additional databases, descriptors and algorithms, as well as further refining methods for evaluating and comparing models. </p><p><b> </b></p>


Author(s):  
Ashish Prajapati ◽  
Shital Gupta

This survey paper describes the literature survey for cyber analytics in support of intrusion detection of machine learnings (ML) and data mining (DM) methods. Short ML/DM method tutorial details will be given. Documents representing each method were categorized, read and summarized based on the number of citations and significance of an evolving method. Since data is so important.


Sign in / Sign up

Export Citation Format

Share Document