web robot
Recently Published Documents


TOTAL DOCUMENTS

44
(FIVE YEARS 1)

H-INDEX

10
(FIVE YEARS 0)

2021 ◽  
Vol 2010 (1) ◽  
pp. 012161
Author(s):  
Xue Chen ◽  
Yang Song ◽  
Wei Xiong ◽  
Yutao Lu ◽  
Xingen Wang

2020 ◽  
Vol 50 (11) ◽  
pp. 4017-4028
Author(s):  
Athanasios Lagopoulos ◽  
Grigorios Tsoumakas

Author(s):  
A. A. Menshchikov ◽  
Yu. A. Gatchin

Recent research suggests that robotic traffic on web resources exceeds human traffic in both volume and intensity. Web robots threaten data privacy and copyright, and they degrade performance, weaken security, and distort usage statistics. Efficient methods for detecting web robots and protecting against them are therefore needed. Existing techniques rely on syntactic and analytical processing of web server logs. This article proposes analyzing the graph of a web robot's visits, taking into account timing as well as the topical connectivity of the visited pages. We provide an algorithm for data selection and cleansing, a procedure for extracting semantic features of the pages on a web resource, and the proposed detection parameters. We describe in detail how the ground truth is formed and how existing sessions are labelled as legitimate or robotic. We propose using the capabilities of the web server to identify sessions uniquely. The clustering procedure and the selection of a suitable classification model are discussed. For each model studied, hyperparameters are selected and the results are cross-validated. We analyze performance and detection accuracy and compare them with the results of existing approaches. Empirical results on web resources show that the proposed method achieves better web robot detection accuracy and precision than existing approaches.
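As a rough illustration of the session-level, time-aware processing the abstract above describes, the following Python sketch groups requests into sessions and derives a few simple features. It is not the authors' algorithm: the `client_id` session key, the 30-minute inactivity timeout, and the feature names are all assumptions made for this example, and the semantic (topic-connectivity) features are omitted.

```python
from collections import defaultdict
from dataclasses import dataclass
from statistics import mean

# Assumed inactivity threshold; the abstract does not state one.
SESSION_TIMEOUT = 30 * 60  # seconds


@dataclass
class Request:
    client_id: str    # hypothetical session key (e.g. a server-issued ID)
    timestamp: float  # seconds since some epoch
    path: str         # requested page


def sessionize(requests, timeout=SESSION_TIMEOUT):
    """Split each client's requests into sessions at gaps longer than `timeout`."""
    by_client = defaultdict(list)
    for r in sorted(requests, key=lambda r: r.timestamp):
        by_client[r.client_id].append(r)

    sessions = []
    for reqs in by_client.values():
        current = [reqs[0]]
        for prev, cur in zip(reqs, reqs[1:]):
            if cur.timestamp - prev.timestamp > timeout:
                sessions.append(current)
                current = []
            current.append(cur)
        sessions.append(current)
    return sessions


def session_features(session):
    """Compute simple time-aware features for one session (illustrative only)."""
    gaps = [b.timestamp - a.timestamp for a, b in zip(session, session[1:])]
    return {
        "n_requests": len(session),
        "n_distinct_pages": len({r.path for r in session}),
        "mean_gap_s": mean(gaps) if gaps else 0.0,
        "duration_s": session[-1].timestamp - session[0].timestamp,
    }
```

Feature dictionaries of this kind could then be passed to whichever clustering or classification model is being evaluated; the choice of model and hyperparameters is left open here, as in the abstract.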


Author(s):  
Dilip Singh Sisodia

Web robots are autonomous software agents that crawl websites automatically, for both benign and malicious purposes. With the popularity of Web 2.0 services, web robots are proliferating and growing in sophistication, and web servers are flooded with their access requests. These requests are recorded in web server logs, which contain significant knowledge about visitors' access patterns. The presence of web robot requests in log repositories distorts the actual access patterns of human visitors, which are potentially useful for improving services and user satisfaction or for optimizing server resources. In this chapter, the correlative access patterns of human visitors and web robots are discussed using the web server access logs of a portal.
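To make the idea of separating robot and human access patterns in a server log concrete, here is a minimal Python sketch. It assumes the log uses the Combined Log Format and labels a client as a robot with a simple heuristic (a request for `/robots.txt` or a User-Agent containing a robot-like keyword). Both the heuristic and the names `ROBOT_HINTS` and `split_access_patterns` are assumptions for illustration, not the method used in the chapter.

```python
import re
from collections import Counter

# Combined Log Format: ip ident user [time] "method path proto" status size "referer" "agent"
LOG_RE = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+) "([^"]*)" "([^"]*)"$'
)

# Assumed keyword heuristic; real robots may not identify themselves.
ROBOT_HINTS = ("bot", "crawler", "spider")


def parse(line):
    """Return ((ip, user_agent), path) for a log line, or None if it does not match."""
    m = LOG_RE.match(line.strip())
    if m is None:
        return None
    ip, _time, _method, path, _status, _size, _referer, agent = m.groups()
    return (ip, agent), path


def split_access_patterns(lines):
    """Count requested paths separately for clients labelled robot and human."""
    records = [r for r in map(parse, lines) if r is not None]

    # First pass: flag clients that fetch robots.txt or advertise a robot-like agent.
    robots = {
        client
        for client, path in records
        if path == "/robots.txt"
        or any(hint in client[1].lower() for hint in ROBOT_HINTS)
    }

    # Second pass: tally page requests per group.
    patterns = {"robot": Counter(), "human": Counter()}
    for client, path in records:
        patterns["robot" if client in robots else "human"][path] += 1
    return patterns
```

Comparing the two counters (for example, their most requested paths) gives a first, coarse view of how robot requests differ from, and can distort, the human access pattern.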


2017 ◽  
Vol 87 ◽  
pp. 129-140 ◽  
Author(s):  
Mahdieh Zabihimayvan ◽  
Reza Sadeghi ◽  
H. Nathan Rude ◽  
Derek Doran
