Internet Traffic Classification by Aggregating Correlated Decision Tree Classifier

Author(s):  
Lekshmi M Nair ◽  
G P Sajeev
2019 ◽  
Vol 9 (11) ◽  
pp. 2375 ◽  
Author(s):  
Riaz Ullah Khan ◽  
Xiaosong Zhang ◽  
Rajesh Kumar ◽  
Abubakar Sharif ◽  
Noorbakhsh Amiri Golilarz ◽  
...  

In recent years, the botnets have been the most common threats to network security since it exploits multiple malicious codes like a worm, Trojans, Rootkit, etc. The botnets have been used to carry phishing links, to perform attacks and provide malicious services on the internet. It is challenging to identify Peer-to-peer (P2P) botnets as compared to Internet Relay Chat (IRC), Hypertext Transfer Protocol (HTTP) and other types of botnets because P2P traffic has typical features of the centralization and distribution. To resolve the issues of P2P botnet identification, we propose an effective multi-layer traffic classification method by applying machine learning classifiers on features of network traffic. Our work presents a framework based on decision trees which effectively detects P2P botnets. A decision tree algorithm is applied for feature selection to extract the most relevant features and ignore the irrelevant features. At the first layer, we filter non-P2P packets to reduce the amount of network traffic through well-known ports, Domain Name System (DNS). query, and flow counting. The second layer further characterized the captured network traffic into non-P2P and P2P. At the third layer of our model, we reduced the features which may marginally affect the classification. At the final layer, we successfully detected P2P botnets using decision tree Classifier by extracting network communication features. Furthermore, our experimental evaluations show the significance of the proposed method in P2P botnets detection and demonstrate an average accuracy of 98.7%.


2016 ◽  
Vol 2016 ◽  
pp. 1-21 ◽  
Author(s):  
Taimur Bakhshi ◽  
Bogdan Ghita

Traffic classification utilizing flow measurement enables operators to perform essential network management. Flow accounting methods such as NetFlow are, however, considered inadequate for classification requiring additional packet-level information, host behaviour analysis, and specialized hardware limiting their practical adoption. This paper aims to overcome these challenges by proposing two-phased machine learning classification mechanism with NetFlow as input. The individual flow classes are derived per application throughk-means and are further used to train a C5.0 decision tree classifier. As part of validation, the initial unsupervised phase used flow records of fifteen popular Internet applications that were collected and independently subjected tok-means clustering to determine unique flow classes generated per application. The derived flow classes were afterwards used to train and test a supervised C5.0 based decision tree. The resulting classifier reported an average accuracy of 92.37% on approximately 3.4 million test cases increasing to 96.67% with adaptive boosting. The classifier specificity factor which accounted for differentiating content specific from supplementary flows ranged between 98.37% and 99.57%. Furthermore, the computational performance and accuracy of the proposed methodology in comparison with similar machine learning techniques lead us to recommend its extension to other applications in achieving highly granular real-time traffic classification.


Sign in / Sign up

Export Citation Format

Share Document