Design and Development of an Efficient Network Intrusion Detection System Using Machine Learning Techniques

2021 ◽  
Vol 2021 ◽  
pp. 1-35
Author(s):  
Thomas Rincy N ◽  
Roopam Gupta

Today’s internet is made up of nearly half a million different networks. In any network connection, identifying attacks by type is difficult, because different attacks may involve different connections, and their number may vary from a few to hundreds of network connections. To solve this problem, this manuscript proposes a novel hybrid network IDS called NID-Shield, which classifies the dataset according to different attack types. Furthermore, the attack names found within the attack types are classified individually, which helps considerably in predicting the vulnerability of individual attacks in various networks. The hybrid NID-Shield NIDS applies an efficient feature subset selection technique called CAPPER together with distinct machine learning methods. The UNSW-NB15 and NSL-KDD datasets are used for evaluation. Machine learning algorithms are trained on the reduced, high-merit feature subsets obtained from CAPPER and then assessed by cross-validation on the reduced attributes. Various performance metrics show that the hybrid NID-Shield NIDS with the CAPPER approach achieves a good accuracy rate and a low false positive rate (FPR) on the UNSW-NB15 and NSL-KDD datasets, and that it performs well when compared with various approaches found in the existing literature.

A Network Intrusion Detection System (NIDS) is a framework for identifying network intrusions and misuse by monitoring network traffic and classifying it as either normal or anomalous. Numerous intrusion detection systems have been implemented using simulated datasets such as the KDD’99 intrusion dataset, but none of them uses a real-time dataset. The proposed work runs and evaluates experiments that survey different machine learning models on the KDD’99 intrusion dataset and on a dataset created in real time. The models were used to compute the performance metrics required to assess the chosen classifiers. The emphasis was on the accuracy metric, so as to improve the detection rate of the intrusion detection system. The implemented algorithms showed that the decision tree classifier achieved the highest accuracy, while the logistic regression classifier achieved the lowest accuracy, on both of the datasets used.


Author(s):  
Manuel Gonçalves da Silva Neto ◽  
Danielo G. Gomes

With the increasing popularization of computer network-based technologies, security has become a daily concern, and intrusion detection systems (IDS) play an essential role in the supervision of computer networks. One established approach to combating network intrusions is to develop intrusion detection systems using machine learning techniques. The intrusion detection performance of these systems depends heavily on the quality of the IDS dataset used in their design, and choosing the most suitable machine learning algorithm becomes a difficult task. This paper evaluates the accuracy of intrusion detection models built with different machine learning algorithms under two resampling techniques using the new CICIDS2017 dataset, where Decision Trees, MLPs, and Random Forests with Stratified 10-Fold give highly stable results, with Precision, Recall, and F1-Scores of 98% and 99% and low execution times.
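The stratified 10-fold resampling mentioned in this abstract can be illustrated with a minimal, standard-library sketch. The function name `stratified_k_fold`, the round-robin assignment, and the toy labels are assumptions made for illustration; they are not taken from the paper, and no classifier is trained here.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose class mix mirrors the full dataset.

    Hypothetical helper: a simplified stand-in for a library implementation.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a fair share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test_idx = sorted(folds[i])
        train_idx = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train_idx, test_idx


# Toy, made-up label vector: 90 benign flows and 10 attack flows.
labels = ["benign"] * 90 + ["attack"] * 10
splits = list(stratified_k_fold(labels, k=10))
```

Each of the ten test folds in this toy example holds nine benign samples and one attack sample, which is the property that keeps per-fold scores comparable on imbalanced traffic data.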


2021 ◽  
Vol 35 (1) ◽  
pp. 11-21
Author(s):  
Himani Tyagi ◽  
Rajendra Kumar

IoT is characterized by communication between things (devices) that constantly share data, analyze it, and make decisions while connected to the internet. This interconnected architecture attracts cyber criminals who seek to expose the IoT system to failure. It therefore becomes imperative to develop a system that can accurately and automatically detect anomalies and attacks occurring in IoT networks. In this paper, an Intrusion Detection System (IDS) based on a novel extracted feature set synthesized from the BoT-IoT dataset is developed that can swiftly, accurately, and automatically differentiate benign and malicious traffic. Instead of using available feature reduction techniques like PCA, which can change the core meaning of variables, a unique feature set consisting of only seven lightweight features is developed that is also IoT specific and independent of attack traffic. The results of the study demonstrate the effectiveness of the seven constructed features in detecting four different types of attack, namely DDoS, DoS, Reconnaissance, and Information Theft. Furthermore, the study demonstrates the applicability and efficiency of supervised machine learning algorithms (KNN, LR, SVM, MLP, DT, RF) in IoT security. The performance of the proposed system is validated using performance metrics such as accuracy, precision, recall, F-Score, and ROC. Although the Decision Tree (99.9%) and Random Forest (99.9%) classifiers have the same accuracy, other metrics such as training and testing time show Random Forest to be comparatively better.
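For readers unfamiliar with the metrics listed in this abstract, the following sketch computes accuracy, precision, recall, and F-Score for a binary benign/malicious decision. The function name `binary_metrics` and the label vectors are hypothetical; ROC analysis is omitted because it needs classifier scores rather than hard labels.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute standard binary classification metrics from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up example: 1 = malicious, 0 = benign.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case there are three true positives, one false negative, one false positive, and five true negatives, so accuracy is 0.8 while precision, recall, and F-Score are each 0.75.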


Sales forecasting is important for companies engaged in retailing, logistics, manufacturing, marketing, and wholesaling. It allows companies to allocate resources efficiently, to estimate sales revenue, and to plan strategies that are better for the company’s future. In this paper, product sales at a particular store are predicted in a way that produces better performance than other machine learning algorithms. The dataset used for this project is the Big Mart Sales data of 2013. Nowadays, shopping malls and supermarkets keep track of the sales data of each individual item in order to predict future customer demand. This data contains a large amount of customer data and item attributes. Frequent patterns are then detected by mining the data from the data warehouse. The data can then be used to predict future sales with the help of several machine learning techniques (algorithms) for companies like Big Mart. In this project, we propose a model using the XGBoost algorithm for predicting the sales of companies like Big Mart and find that it produces better performance than other existing models. The project also compares this model with other models in terms of their performance metrics. Big Mart is an online marketplace where people can buy, sell, or advertise merchandise at low cost. The goal of the paper is to make Big Mart a shopping paradise for buyers and a marketing solution for sellers. The ultimate aim is the complete satisfaction of the customers. The project “SUPERMARKET SALES PREDICTION” builds a predictive model and estimates the sales of each product at a particular store. Big Mart uses this model to understand the properties of the products that play a major role in increasing sales. This can also be done on the basis of hypotheses formulated before looking at the data.


Entropy ◽  
2021 ◽  
Vol 23 (10) ◽  
pp. 1258
Author(s):  
Taher Al-Shehari ◽  
Rakan A. Alsowail

Insider threats are malicious acts that can be carried out by an authorized employee within an organization. Insider threats represent a major cybersecurity challenge for private and public organizations, as an insider attack can cause far more damage to an organization’s assets than external attacks. Most existing approaches in the field of insider threat have focused on detecting general insider attack scenarios. However, insider attacks can be carried out in different ways, and the most dangerous one is a data leakage attack that can be executed by a malicious insider before leaving an organization. This paper proposes a machine learning-based model for detecting such serious insider threat incidents. The proposed model addresses the possible bias of detection results that can occur due to an inappropriate encoding process by employing feature scaling and one-hot encoding techniques. Furthermore, the imbalance of the dataset is addressed using the synthetic minority oversampling technique (SMOTE). Well-known machine learning algorithms are employed to identify the most accurate classifier for detecting data leakage events executed by malicious insiders during the sensitive period before they leave an organization. We provide a proof of concept for our model by applying it to the CMU-CERT Insider Threat Dataset and comparing its performance with the ground truth. The experimental results show that our model detects insider data leakage events with an AUC-ROC value of 0.99, outperforming the existing approaches that are validated on the same dataset. The proposed model provides effective methods to address possible bias and class imbalance issues for the aim of devising an effective insider data leakage detection system.
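The core idea of SMOTE referred to in this abstract, creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, can be sketched as follows. This is a simplified illustration and not the authors' pipeline: the function name `smote_like`, the choice of `k`, and the toy points are assumptions, and the paper's feature scaling and one-hot encoding steps are not shown.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points on segments between minority neighbours.

    Hypothetical, simplified SMOTE-style oversampler for numeric feature tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_i = rng.randrange(len(minority))
        base = minority[base_i]
        # The k nearest minority neighbours of the base point, excluding itself.
        others = [p for j, p in enumerate(minority) if j != base_i]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        # Pick a random position along the segment from base to neighbour.
        gap = rng.random()
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up minority-class samples (already scaled to [0, 1]).
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=6)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies rather than being exact duplicates.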


Optimization algorithms are widely used for intrusion detection. This is attributable to the increasing number of audit data features and the decreasing performance of human-based smart Intrusion Detection Systems (IDS) in terms of classification accuracy and training time. In this paper, an improved intrusion detection method for binary classification was presented and discussed in detail. The proposed method combined the New Teaching-Learning-Based Optimization Algorithm (NTLBO), used for feature selection and weighting, with the supervised machine learning techniques Support Vector Machine (SVM), Extreme Learning Machine (ELM), and Logistic Regression (LR) for Feature Subset Selection (FSS). In FSS, selecting the smallest number of features without affecting the accuracy of the result was treated as a multi-objective optimization problem. The NTLBO was proposed in this paper as an FSS mechanism; its algorithm-specific, parameter-less concept (which requires no parameter tuning during optimization) was explored. The experiments were performed on prominent intrusion machine-learning datasets (KDDCUP’99 and CICIDS 2017), where significant enhancements were observed with the suggested NTLBO algorithm compared to the classical Teaching-Learning-Based Optimization algorithm (TLBO). NTLBO presented better results than TLBO and many existing works. The results showed that NTLBO reached 100% accuracy on the KDDCUP’99 dataset and 97% on the CICIDS dataset.
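Treating feature subset selection as a multi-objective problem means that a candidate subset is judged on two criteria at once: fewer features and higher accuracy. The sketch below shows only that comparison step, by filtering candidates down to the non-dominated (Pareto-optimal) ones. It does not implement NTLBO or TLBO, and the candidate names and numbers are invented for illustration.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    A candidate dominates another if it uses no more features and is at least
    as accurate, and is strictly better on at least one of the two objectives.
    """
    def dominates(a, b):
        no_worse = (
            a["n_features"] <= b["n_features"] and a["accuracy"] >= b["accuracy"]
        )
        strictly_better = (
            a["n_features"] < b["n_features"] or a["accuracy"] > b["accuracy"]
        )
        return no_worse and strictly_better

    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]


# Hypothetical feature subsets with made-up sizes and accuracies.
candidates = [
    {"name": "A", "n_features": 40, "accuracy": 0.97},
    {"name": "B", "n_features": 12, "accuracy": 0.97},
    {"name": "C", "n_features": 8, "accuracy": 0.95},
    {"name": "D", "n_features": 20, "accuracy": 0.94},
]
front = pareto_front(candidates)
```

Here subsets B and C survive: A matches B's accuracy with more features, and D is both larger and less accurate than B. An optimizer such as the one described in the abstract would search for subsets of this non-dominated kind.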


Author(s):  
Ramakanta Mohanty ◽  
Vadlamani Ravi

Over the past 10 years, many researchers have proposed predicting software defects using various metrics based on measurable aspects of source code entities (e.g. methods, classes, files or modules) and the social structure of the software project. However, these metrics could not achieve very high performance in terms of sensitivity, specificity and accuracy. In this chapter, we propose the use of machine learning techniques to predict software defects. The effectiveness of all these techniques is demonstrated on ten datasets taken from the literature. Based on an experiment, it is observed that PNN outperformed all other techniques in terms of accuracy and sensitivity on all the software defect datasets, followed by CART and the Group Method of Data Handling. We also performed feature selection with a t-statistic-based approach, selecting feature subsets across different folds for a given technique, followed by feature subset selection. Using the most important variables, we invoked the classifiers again and observed that PNN outperformed the other classifiers in terms of sensitivity and accuracy. Moreover, the set of ‘if-then’ rules yielded by J48 and CART can be used as an expert system for the prediction of software defects.
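A t-statistic-based feature ranking of the kind mentioned in this abstract can be sketched by scoring each feature on how strongly its mean differs between defective and non-defective modules. The abstract does not say which t-statistic variant was used; Welch's unequal-variance form is assumed here, and the function names and data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (assumed variant, see lead-in)."""
    return (mean(a) - mean(b)) / math.sqrt(
        variance(a) / len(a) + variance(b) / len(b)
    )


def rank_features_by_t(rows, labels):
    """Return feature indices ordered from most to least discriminative."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        defective = [r[j] for r, y in zip(rows, labels) if y == 1]
        clean = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append((abs(welch_t(defective, clean)), j))
    return [j for _, j in sorted(scores, reverse=True)]


# Made-up data: feature 0 is uninformative, feature 1 separates the classes.
rows = [
    (1.0, 10.0), (2.0, 12.0), (3.0, 11.0),  # defective modules
    (2.0, 2.0), (1.0, 3.0), (3.0, 1.0),     # non-defective modules
]
labels = [1, 1, 1, 0, 0, 0]
ranking = rank_features_by_t(rows, labels)
```

The top-ranked indices would then be kept as the reduced feature subset before the classifiers are invoked again; repeating the ranking inside each cross-validation fold avoids leaking test information into the selection.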


2013 ◽  
Vol 7 (4) ◽  
pp. 37-52
Author(s):  
Srinivasa K G

The increase in the number of network-based transactions for both personal and professional use has given network security a significant and indispensable status. The possible attacks that an Intrusion Detection System (IDS) has to tackle can be of an existing type or of an entirely new type. The challenge for researchers is to develop an intelligent IDS which can detect new attacks as efficiently as it detects known ones. Intrusion Detection Systems are rendered intelligent by employing machine learning techniques. In this paper we present a statistical machine learning approach to the IDS using the Support Vector Machine (SVM). Unlike conventional SVMs, this paper describes a multi-model approach which makes use of an extra layer over the existing SVM. The network traffic is modeled into connections based on protocols at various network layers. These connection statistics are given as input to the SVM, which in turn plots each input vector. New attacks are identified by plotting them with respect to the trained system. The experimental results demonstrate the lower execution time of the proposed system, with a high detection rate and a low number of false positives. The 1999 DARPA IDS dataset is used as the evaluation dataset for both training and testing. The proposed system, SVM NIDS, is benchmarked against SNORT (Roesch, M. 1999), an open source IDS.


2020 ◽  
Vol 10 (2) ◽  
pp. 1-26
Author(s):  
Naghmeh Moradpoor Sheykhkanloo ◽  
Adam Hall

An insider threat can take on many forms and fall under different categories. These include the malicious insider, the careless/unaware/uneducated/naïve employee, and the third-party contractor. Machine learning techniques have been studied in published literature as a promising solution for such threats. However, they can be biased and/or inaccurate when the associated dataset is hugely imbalanced. Therefore, this article addresses insider threat detection on an extremely imbalanced dataset, which includes employing a popular balancing technique known as spread subsample. The results show that although balancing the dataset using this technique did not improve performance metrics, it did reduce the time taken to build the model and the time taken to test the model. Additionally, the authors found that running the chosen classifiers with parameters other than the default ones has an impact in both the balanced and imbalanced scenarios, but the impact is significantly stronger when using the imbalanced dataset.
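The balancing step described in this abstract can be pictured as random undersampling that caps how much larger any class may be than the rarest one. The sketch below is an assumption-laden illustration of that idea, not the implementation used by the authors: the function name `spread_subsample`, the `max_spread` parameter, and the toy class counts are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def spread_subsample(samples, labels, max_spread=1.0, seed=0):
    """Randomly undersample so no class exceeds max_spread x the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    smallest = min(len(items) for items in by_class.values())
    cap = int(smallest * max_spread)

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Classes at or under the cap are kept whole; larger ones are sampled down.
        kept = items if len(items) <= cap else rng.sample(items, cap)
        out_samples.extend(kept)
        out_labels.extend([label] * len(kept))
    return out_samples, out_labels


# Made-up, heavily imbalanced data: 990 normal events and 10 insider events.
samples = list(range(1000))
labels = ["normal"] * 990 + ["insider"] * 10
sub_samples, sub_labels = spread_subsample(samples, labels, max_spread=1.0)
counts = Counter(sub_labels)
```

With `max_spread=1.0` the result is a fully balanced set of 10 events per class. The much smaller training set is consistent with the abstract's observation that balancing shortened model build and test times even though it did not improve the performance metrics.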

