Reducing Network Traffic Data Sets

Author(s):  
A. Botta ◽  
A. Dainotti ◽  
A. Pescape ◽  
G. Ventre

Author(s):  
Sahisnu Mazumder ◽  
Tuhin Sharma ◽  
Rahul Mitra ◽  
Nandita Sengupta ◽  
Jaya Sil

Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 507
Author(s):  
Piotr Białczak ◽  
Wojciech Mazurczyk

Malicious software uses the HTTP protocol for communication, creating network traffic that is hard to identify because it blends into the traffic generated by benign applications. To this end, fingerprinting tools have been developed to help track and identify such traffic by providing a short representation of malicious HTTP requests. However, existing tools either do not analyze all the information included in the HTTP message or analyze it insufficiently. To address these issues, we propose Hfinger, a novel malware HTTP request fingerprinting tool. It extracts information from parts of the request such as the URI, protocol information, headers, and payload, providing a concise request representation that preserves the extracted information in a form interpretable by a human analyst. We performed an extensive experimental evaluation of the developed solution using real-world data sets and compared Hfinger with the most closely related and popular existing tools, such as FATT, Mercury, and p0f. The effectiveness analysis reveals that, on average, only 1.85% of requests fingerprinted by Hfinger collide between malware families, which is 8–34 times lower than for existing tools. Moreover, unlike these tools, in default mode Hfinger does not introduce collisions between malware and benign applications, and it achieves this by increasing the number of fingerprints by at most 3 times. As a result, Hfinger can effectively track and hunt malware by providing more unique fingerprints than other standard tools.


2018 ◽  
Vol 14 (4) ◽  
pp. 423-437 ◽  
Author(s):  
David Prantl ◽  
Martin Prantl

Purpose: The purpose of this paper is to examine and verify the competitive intelligence tools Alexa and SimilarWeb, which are broadly used for website traffic data estimation. The tested tools represent the state of the art in this area.
Design/methodology/approach: The authors use a quantitative approach. The research was conducted on a sample of Czech websites for which accurate traffic data values are available; the less accurate data sets provided by Alexa and SimilarWeb are compared against these values.
Findings: The results show that neither tool can accurately determine the ranking of websites on the internet. However, it is possible to approximately determine the significance of a particular website. These results are useful for other research studies that use data from Alexa or SimilarWeb. Moreover, the results show that it is still not possible to accurately estimate the traffic of an arbitrary website.
Research limitations/implications: The limitation of the research lies in the fact that it was conducted solely in the Czech market.
Originality/value: A significant number of research studies use data sets provided by Alexa and SimilarWeb. However, none of these studies focuses on the quality of the website traffic data acquired from Alexa or SimilarWeb, nor do any of them refer to other studies that deal with this issue. Furthermore, the authors describe approaches to measuring website traffic and, based on the analysis, discuss the possible usability of these methods.


2018 ◽  
Vol 77 (9) ◽  
pp. 11459-11487 ◽  
Author(s):  
Zichan Ruan ◽  
Yuantian Miao ◽  
Lei Pan ◽  
Yang Xiang ◽  
Jun Zhang

Author(s):  
Seung Bae Jeon ◽  
Muhammad Sarfraz Khan ◽  
Jung Hwan Lee ◽  
Myeong Hun Jeong

2020 ◽  
Author(s):  
Marta Catillo ◽  
Antonio Pecchia ◽  
Massimiliano Rak ◽  
Umberto Villano
