Visual similarity comparison for Web page retrieval

Author(s):  
Y. Takama ◽  
N. Mitsuhashi
Author(s):  
Toru Furusawa ◽  
Yasuyuki Watai ◽  
Toshihiko Yamasaki ◽  
Kiyoharu Aizawa

2019 ◽  
Vol 16 (3) ◽  
pp. 815-830
Author(s):  
Xingchen Li ◽  
Weizhe Zhang ◽  
Desheng Wang ◽  
Bin Zhang ◽  
Hui He

Phishing pages often deceive users because their layout closely resembles that of the genuine pages, and they cause considerable losses to society. Detecting phishing sites has therefore become an urgent task. By studying phishing web pages through their screenshots, we find that such pages use numerous web page screenshots to closely imitate the genuine page and to evade detection based on text and structure similarity. This study introduces a new similarity matching algorithm based on visual blocks. First, the RenderLayer tree of the web page is obtained to extract the visual blocks. Second, an algorithm is designed to tidy up the disordered visual blocks, which includes deleting small visual blocks and handling overlapping visual blocks. Finally, the similarity between the two web pages is assessed. The proposed algorithm is evaluated under different thresholds to obtain the best miss and false alarm rates.
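
The three steps in the abstract above (extract blocks, tidy them, compare pages) can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the `Block` rectangle type, the `min_area` and `max_overlap` cut-offs, the intersection-over-union matching, and the `threshold` value are all assumptions chosen for illustration, and the blocks are taken as given rather than read from a RenderLayer tree.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical visual block: an axis-aligned rectangle in page pixels."""
    x: float
    y: float
    w: float
    h: float

    @property
    def area(self) -> float:
        return self.w * self.h


def intersection(a: Block, b: Block) -> float:
    """Area shared by two rectangles (0.0 if they do not overlap)."""
    dx = min(a.x + a.w, b.x + b.w) - max(a.x, b.x)
    dy = min(a.y + a.h, b.y + b.h) - max(a.y, b.y)
    return dx * dy if dx > 0 and dy > 0 else 0.0


def iou(a: Block, b: Block) -> float:
    """Intersection over union of two rectangles."""
    inter = intersection(a, b)
    union = a.area + b.area - inter
    return inter / union if union > 0 else 0.0


def clean_blocks(blocks, min_area=100.0, max_overlap=0.8):
    """Drop small blocks, then drop blocks mostly covered by a larger kept one.

    Both cut-offs are assumed values, not taken from the paper.
    """
    big_enough = [b for b in blocks if b.area >= min_area]
    kept = []
    for b in sorted(big_enough, key=lambda blk: blk.area, reverse=True):
        if all(intersection(b, k) / b.area <= max_overlap for k in kept):
            kept.append(b)
    return kept


def page_similarity(page_a, page_b) -> float:
    """Symmetric average of each block's best IoU match on the other page."""
    a, b = clean_blocks(page_a), clean_blocks(page_b)
    if not a or not b:
        return 0.0

    def one_way(src, dst):
        return sum(max(iou(s, d) for d in dst) for s in src) / len(src)

    return 0.5 * (one_way(a, b) + one_way(b, a))


def is_suspect(page_a, page_b, threshold=0.7) -> bool:
    """Flag the pair as visually similar; the threshold is an assumption."""
    return page_similarity(page_a, page_b) >= threshold
```

Raising `threshold` in this sketch lowers false alarms at the cost of more misses, which is the trade-off the abstract refers to when it mentions setting different thresholds.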


2012 ◽  
Vol 204-208 ◽  
pp. 4928-4931
Author(s):  
Yang Xin Yu

A Web information retrieval algorithm based on Web page segmentation is presented. Its key idea is to segment each Web page into topic areas, or segments, according to its HTML tags and content, since Web pages are semi-structured. The algorithm first builds an HTML tag tree and then merges nodes in the tree according to content similarity and visual similarity. During retrieval and ranking, the algorithm uses the segmentation information to rank the relevant pages. The experimental results show that this method significantly improves search precision and may serve as a useful reference for the design of future search engines.
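
As a rough illustration of the merging step described in the abstract above, the following Python sketch merges adjacent sibling segments when they look alike and share vocabulary. It is a simplification under stated assumptions, not the paper's method: the `Segment` type, the `style` tuple used as a visual signature, the Jaccard word-overlap measure, and the `content_threshold` value are all hypothetical, and the input is a flat list of siblings rather than a full HTML tag tree.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical page segment: its text plus a visual signature."""
    text: str
    style: tuple  # e.g. (font size, background colour); assumed representation


def jaccard(a: str, b: str) -> float:
    """Word-level Jaccard similarity between two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def merge_siblings(segments, content_threshold=0.2):
    """Merge each segment into its predecessor when both rules hold.

    Visual rule (assumed): identical style signature.
    Content rule (assumed): Jaccard similarity >= content_threshold.
    """
    merged = []
    for seg in segments:
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev.style == seg.style
            and jaccard(prev.text, seg.text) >= content_threshold
        ):
            merged[-1] = Segment(prev.text + " " + seg.text, prev.style)
        else:
            merged.append(Segment(seg.text, seg.style))
    return merged
```

A retrieval system could then index and score each merged segment separately, so that a query matching one topic area is not diluted by unrelated parts of the same page.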


2006 ◽  
Vol 42 (4) ◽  
pp. 310-318
Author(s):  
Yasufumi TAKAMA ◽  
Keisuke NAKAHARA ◽  
Noriaki MITSUHASHI ◽  
Toru YAMAGUCHI

2005 ◽  
Author(s):  
Aaron W. Bangor ◽  
James T. Miller