Effects of Feature Selection and Normalization on Network Intrusion Detection

2020 ◽ 
Author(s):  
Mubarak Albarka Umar ◽  
Chen Zhanfang

The rapid rise of cyberattacks and the gradual failure of traditional defense systems and approaches have led to the use of Machine Learning (ML) techniques to build more efficient and reliable Intrusion Detection Systems (IDSs). However, the advent of larger IDS datasets has negatively impacted the performance and computational time of ML-based IDSs. To overcome such issues, many researchers have utilized data preprocessing techniques such as feature selection and normalization. While most of these researchers reported the success of these preprocessing techniques at a shallow level, very few studies have examined their effects on a wider scale. Furthermore, the performance of an IDS model depends not only on the preprocessing techniques used but also on the dataset and the ML algorithm, a point to which most existing studies on preprocessing techniques give little emphasis. Thus, this study provides an in-depth analysis of the effects of feature selection and normalization on various IDS models built using two separate IDS datasets and five different ML algorithms. A wrapper-based decision tree and min-max scaling are used for feature selection and normalization, respectively. The models are evaluated and compared using evaluation metrics popular in IDS research. The study found normalization to be more important than feature selection in improving the performance and computational time of models on both datasets; feature selection on UNSW-NB15 failed to reduce the models' computational time, and on NSL-KDD it decreased their performance. The study also reveals that, compared to the UNSW-NB15 dataset, the NSL-KDD dataset is less complex and unsuitable for building reliable modern-day IDS models. Furthermore, the best performance on both datasets is achieved by Random Forest, with accuracies of 99.75% and 98.51% on NSL-KDD and UNSW-NB15, respectively.
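As a concrete illustration of the two preprocessing steps this abstract names, the following is a minimal scikit-learn sketch. The synthetic data, the number of selected features, and the forward search direction are illustrative assumptions, not the paper's settings; sequential forward selection is one common wrapper formulation, and the paper's exact search procedure may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic tabular data standing in for an IDS dataset such as NSL-KDD.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=10,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

# Min-max normalization: rescale each feature to [0, 1] using statistics
# from the training split only, to avoid leaking test information.
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Wrapper-based feature selection: a decision tree scores candidate feature
# subsets by cross-validated accuracy.
selector = SequentialFeatureSelector(
    DecisionTreeClassifier(random_state=0),
    n_features_to_select=15,   # illustrative target, not the paper's count
    direction="forward",
    cv=5,
)
selector.fit(X_train, y_train)
print("Selected feature indices:", np.flatnonzero(selector.get_support()))
```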


Webology ◽  
2021 ◽  
Vol 18 (Special Issue 04) ◽  
pp. 626-640
Author(s):  
Rana Nazhan Hadi ◽  
Dr. Rasha Orban Mahmoud ◽  
Dr. Adly S. Tag Eldien

Network Intrusion Detection Systems (IDSs) have been widely used to monitor and manage network connections and prevent unauthorized ones. Machine learning models have been utilized to classify connections as normal or attack connections based on users' behavior. Among the most common issues facing IDSs are the detection system's low classification accuracy and the high dimensionality of the feature space; feature selection methods are therefore used to reduce dataset redundancy and enhance classification performance. In this paper, a Chaotic Salp Swarm Algorithm (CSSA) was integrated with an Extreme Learning Machine (ELM) classifier to select the most relevant subset of features and decrease the dimensionality of a dataset. Each salp in the population was represented in binary form, where 1 denoted a selected feature and 0 a removed one. The proposed feature selection algorithm was evaluated on the NSL-KDD dataset, which consists of 41 features. The results were compared with those of other approaches and show that the proposed algorithm achieved a classification accuracy of up to 97.814% while minimizing the number of selected features.
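The following is a minimal sketch of the binary CSSA-plus-ELM pipeline this abstract describes, with 1-bits selecting features as stated above. The logistic chaotic map, the 0.5 threshold transfer function, the swarm size, the iteration count, and the error/subset-size fitness weighting are illustrative assumptions rather than the paper's settings, and synthetic 41-feature data stands in for NSL-KDD.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def elm_accuracy(Xtr, ytr, Xval, yval, hidden=50):
    """Minimal ELM: random hidden layer, closed-form output weights."""
    W = rng.normal(size=(Xtr.shape[1], hidden))
    b = rng.normal(size=hidden)
    H = np.tanh(Xtr @ W + b)
    T = np.eye(int(ytr.max()) + 1)[ytr]        # one-hot targets
    beta = np.linalg.pinv(H) @ T               # least-squares output weights
    pred = np.argmax(np.tanh(Xval @ W + b) @ beta, axis=1)
    return (pred == yval).mean()

def fitness(mask, Xtr, ytr, Xval, yval, alpha=0.99):
    """Lower is better: ELM error plus a small penalty on subset size."""
    if not mask.any():
        return 1.0                             # reject empty feature subsets
    err = 1.0 - elm_accuracy(Xtr[:, mask], ytr, Xval[:, mask], yval)
    return alpha * err + (1 - alpha) * mask.mean()

# Synthetic 41-feature data standing in for NSL-KDD.
X, y = make_classification(n_samples=1000, n_features=41, n_informative=12,
                           random_state=0)
Xtr, Xval, ytr, yval = train_test_split(X, y, test_size=0.3, random_state=0)

n_salps, n_feat, n_iter = 20, X.shape[1], 30
pos = rng.random((n_salps, n_feat))            # continuous salp positions in [0, 1]
chaos = 0.7                                    # logistic-map state
best_fit, best_mask = np.inf, None

for t in range(1, n_iter + 1):
    masks = pos > 0.5                          # binary transfer: 1 = feature kept
    fits = np.array([fitness(m, Xtr, ytr, Xval, yval) for m in masks])
    if fits.min() < best_fit:
        best_fit = fits.min()
        best_mask = masks[fits.argmin()].copy()
        food = pos[fits.argmin()].copy()       # food source = best salp so far
    c1 = 2 * np.exp(-(4 * t / n_iter) ** 2)    # standard SSA exploration coefficient
    for i in range(n_salps):
        if i < n_salps // 2:                   # leaders move around the food source
            chaos = 4 * chaos * (1 - chaos)    # logistic chaotic map replaces c2
            c3 = rng.random(n_feat)
            pos[i] = np.where(c3 < 0.5, food + c1 * chaos, food - c1 * chaos)
        else:                                  # followers average with predecessor
            pos[i] = (pos[i] + pos[i - 1]) / 2
    pos = np.clip(pos, 0.0, 1.0)

print(f"Best fitness {best_fit:.4f} using {best_mask.sum()} of {n_feat} features")
```

The chaotic map stands in for one of the stochastic SSA coefficients, which is the usual way a "chaotic" SSA variant improves exploration over uniform random draws.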

