Binary Classification of Network-Generated Flow Data Using a Machine Learning Algorithm

2021 ◽  
Vol 15 (1) ◽  
pp. 26-43
Author(s):  
Sikha Bagui ◽  
Keenal M. Shah ◽  
Yizhi Hu ◽  
Subhash Bagui

This study proposes a model for building intrusion detection systems. The dataset used, CICIDS 2017, contains 14 different attacks with 85 features for each attack. This high dimensionality of the data is a major challenge when building efficient intrusion detection systems, especially in today's big data environment, since a lot of the features are redundant. The main goal in this paper was to reduce the number of features and present a detailed discussion of the important features. For feature selection, information gain was used in an iterative way, and for classification, a machine learning algorithm, the J48 decision tree algorithm, was used. The important features for the classification of each attack were identified, and the features that were important for classifying multiple attacks were also identified and discussed.

2014 ◽  
Vol 687-691 ◽  
pp. 2693-2697
Author(s):  
Li Ding ◽  
Li Mao ◽  
Xiao Feng Wang

One single machine learning algorithm presents shortcomings when the data environment changes in the process of application. This article puts forward a heteromorphic ensemble learning model made up of bayes, support vector machine (SVM) and decision tree which classifies P2P traffic by voting principle. The experiment shows that the model can significantly improve the classification accuracy, and has a good stability.


Author(s):  
N. Pratik Neelakantan ◽  
C. Nagesh

Intrusion Detection Systems are important for protecting network and its resources from illegal penetration. For 802.11network, the features used for training and testing the intrusion detection systems consist of basic information related to the TCP/IP header, with no considerable attention to the features associated with lower level protocol frames. The resulting detectors were efficient and accurate in detecting network attacks at the network and transport layers, but unfortunately, not capable of detecting 802.11-specific attacks such as de authentication attacks or MAC layer DoS attack. IDS systems can also identify and alert to the presence of unauthorized MAC addresses on the networks. The IDS is based a novel hybrid model that efficiently selects the optimal set of features in order to detect 802.11-specific intrusions. This model for feature selection uses the information gain ratio measure as a means to compute the relevance of each feature and the k-means classifier to select the optimal set of MAC layer features that can improve the accuracy of intrusion detection systems while reducing the learning time of their learning algorithm.


Author(s):  
Manuel Gonçalves da Silva Neto ◽  
Danielo G. Gomes

With the increasing popularization of computer network-based technologies, security has become a daily concern, and intrusion detection systems (IDS) play an essential role in the supervision of computer networks. An employed approach to combat network intrusions is the development of intrusion detection systems via machine learning techniques. The intrusion detection performance of these systems depends highly on the quality of the IDS dataset used in their design and the decision making for the most suitable machine learning algorithm becomes a difficult task. The proposed paper focuses on evaluate and accurate the model of intrusion detection system of different machine learning algorithms on two resampling techniques using the new CICIDS2017 dataset where Decision Trees, MLPs, and Random Forests on Stratified 10-Fold gives high stability in results with Precision, Recall, and F1-Scores of 98% and 99% with low execution times.


2018 ◽  
Vol 5 (1) ◽  
Author(s):  
Suad Mohammed Othman ◽  
Fadl Mutaher Ba-Alwi ◽  
Nabeel T. Alsohybe ◽  
Amal Y. Al-Hashida

2021 ◽  
Vol 11 (3) ◽  
pp. 92
Author(s):  
Mehdi Berriri ◽  
Sofiane Djema ◽  
Gaëtan Rey ◽  
Christel Dartigues-Pallez

Today, many students are moving towards higher education courses that do not suit them and end up failing. The purpose of this study is to help provide counselors with better knowledge so that they can offer future students courses corresponding to their profile. The second objective is to allow the teaching staff to propose training courses adapted to students by anticipating their possible difficulties. This is possible thanks to a machine learning algorithm called Random Forest, allowing for the classification of the students depending on their results. We had to process data, generate models using our algorithm, and cross the results obtained to have a better final prediction. We tested our method on different use cases, from two classes to five classes. These sets of classes represent the different intervals with an average ranging from 0 to 20. Thus, an accuracy of 75% was achieved with a set of five classes and up to 85% for sets of two and three classes.


2021 ◽  
Vol 1 (2) ◽  
pp. 252-273
Author(s):  
Pavlos Papadopoulos ◽  
Oliver Thornewill von Essen ◽  
Nikolaos Pitropakis ◽  
Christos Chrysoulas ◽  
Alexios Mylonas ◽  
...  

As the internet continues to be populated with new devices and emerging technologies, the attack surface grows exponentially. Technology is shifting towards a profit-driven Internet of Things market where security is an afterthought. Traditional defending approaches are no longer sufficient to detect both known and unknown attacks to high accuracy. Machine learning intrusion detection systems have proven their success in identifying unknown attacks with high precision. Nevertheless, machine learning models are also vulnerable to attacks. Adversarial examples can be used to evaluate the robustness of a designed model before it is deployed. Further, using adversarial examples is critical to creating a robust model designed for an adversarial environment. Our work evaluates both traditional machine learning and deep learning models’ robustness using the Bot-IoT dataset. Our methodology included two main approaches. First, label poisoning, used to cause incorrect classification by the model. Second, the fast gradient sign method, used to evade detection measures. The experiments demonstrated that an attacker could manipulate or circumvent detection with significant probability.


Sign in / Sign up

Export Citation Format

Share Document