Web Spam Detection Using Link-Based Ant Colony Optimization

Author(s):  
Apichat Taweesiriwate ◽  
Bundit Manaskasemsak ◽  
Arnon Rungsawang
2015 ◽  
Vol 11 (2) ◽  
pp. 142-161 ◽  
Author(s):  
Bundit Manaskasemsak ◽  
Arnon Rungsawang

Purpose – This paper presents a machine learning approach to the problem of Web spam detection. Adopting ant colony optimization (ACO), it proposes three algorithms that construct rule-based classifiers to distinguish non-spam hosts from spam hosts. It also proposes an adaptive learning technique to enhance spam detection performance. Design/methodology/approach – The Trust-ACO algorithm lets an ant start from a non-spam seed and then decide which paths to walk through in the host graph. Trails (i.e. trust paths) discovered by ants are then interpreted and compiled into non-spam classification rules. Similarly, the Distrust-ACO algorithm generates spam classification rules. The third algorithm, Combine-ACO, accumulates the rules produced by the first two. Moreover, an adaptive learning technique lets ants walk with longer (or shorter) steps by rewarding them when they find desirable paths and penalizing them otherwise. Findings – Experiments are conducted on two publicly available datasets, WEBSPAM-UK2006 and WEBSPAM-UK2007. The results show that the proposed algorithms outperform well-known rule-based classification baselines. In particular, the proposed adaptive learning technique helps improve the AUC scores to as high as 0.899 and 0.784 on the former and the latter datasets, respectively. Originality/value – To the best of our knowledge, this is the first comprehensive study that adopts the ACO learning approach to solve the problem of Web spam detection. In addition, we have improved the traditional ACO by using the adaptive learning technique.
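
The trust-path idea in the abstract above can be illustrated with a small sketch. It is not the authors' Trust-ACO: the function name `trust_walk`, the pheromone update, the definition of a "desirable" trail (one that leaves the seed and touches no host labelled spam), and the one-step reward or penalty on walk length are all assumptions made for illustration, and the compilation of trails into classification rules is not modelled.

```python
import random

def trust_walk(host_graph, labels, seed_host, n_ants=50, init_steps=2,
               min_steps=1, max_steps=5, rho=0.1, seed=0):
    """Hypothetical sketch: ants follow out-links from a non-spam seed host."""
    rng = random.Random(seed)
    # One pheromone value per directed host-to-host link.
    tau = {(u, v): 1.0 for u, outs in host_graph.items() for v in outs}
    steps, trails = init_steps, []
    for _ in range(n_ants):
        trail = [seed_host]
        for _ in range(steps):
            here = trail[-1]
            outs = [v for v in host_graph.get(here, []) if v not in trail]
            if not outs:
                break
            # Pick the next host with probability proportional to pheromone.
            weights = [tau[(here, v)] for v in outs]
            trail.append(rng.choices(outs, weights=weights)[0])
        # Assumption: "desirable" means the trail left the seed and
        # contains no host that is labelled spam.
        desirable = len(trail) > 1 and all(labels.get(h) != "spam" for h in trail)
        for edge in tau:
            tau[edge] *= 1.0 - rho          # evaporation on every link
        if desirable:
            for edge in zip(trail, trail[1:]):
                tau[edge] += 1.0            # reinforce the trust path
            steps = min(steps + 1, max_steps)   # reward: walk further next time
            trails.append(trail)
        else:
            steps = max(steps - 1, min_steps)   # penalty: walk less far
    return tau, trails

# Toy host graph with made-up host names.
hosts = {
    "seed.example": ["good.example", "bad.example"],
    "good.example": ["news.example"],
    "news.example": [],
    "bad.example": [],
}
labels = {"seed.example": "nonspam", "good.example": "nonspam", "bad.example": "spam"}
tau, trails = trust_walk(hosts, labels, "seed.example")
```

In this toy graph the link into the spam-labelled host only ever evaporates, so pheromone concentrates on the link toward the non-spam host.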


Author(s):  
Rathika Natarajan ◽  
Abolfazl Mehbodniya ◽  
Murugesan Ganapathy ◽  
Rahul Neware ◽  
Swimpy Pahuja ◽  
...  

Electronic mail (email) has been widely adopted by organizations and individuals as an efficient means of communication. Despite the pervasiveness of alternatives such as social networks, mobile SMS, and electronic messages, the number of email users keeps growing. This growth attracts more spammers, who send unsolicited emails to anonymous users. These spam emails may contain malware, misleading information, phishing links, etc. that can imperil the privacy of benign users. The paper proposes a self-adaptive hybrid algorithm that combines big bang–big crunch (BB–BC) with ant colony optimization (ACO) for email spam detection. The BB–BC algorithm is based on the physics-inspired theory of the evolution of the universe, and the ACO algorithm is inspired by the collective interaction behavior of ants. Here, the ant miner plus (AMP) variant of ACO, a data mining variant that is efficient for classification, is adopted. The proposed hybrid algorithm (HB3C-AMP) adopts the attributes of B3C (BB–BC) for local exploitation and AMP for global exploration. It evaluates the center of mass while taking into account the pheromone values evaluated by the best ants to detect email spam efficiently. The experiments for the proposed HB3C-AMP algorithm are conducted with the Ling Spam and CSDMC2010 datasets. Different experiments are conducted to determine the effect of the pre-processing modules, the number of iterations, and the population size on the proposed algorithm. The results are also evaluated for the AM (ant miner), AM2 (ant miner2), AM3 (ant miner3), and AMP algorithms. The performance comparison demonstrates that the proposed HB3C-AMP algorithm is superior to the other techniques.
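
For readers unfamiliar with the center of mass mentioned above, the following is a minimal sketch of the standard BB–BC formulation for a minimization problem: the "big crunch" contracts the population to a cost-weighted mean, and the "big bang" scatters new candidates around it with a spread that shrinks as iterations proceed. The function names and parameters are illustrative assumptions, and the abstract does not say how HB3C-AMP combines this step with pheromone values, so that part is not shown.

```python
import random

def center_of_mass(candidates, costs):
    """Big crunch: mean of candidates weighted by 1/cost (costs must be > 0)."""
    inv = [1.0 / c for c in costs]
    total = sum(inv)
    dim = len(candidates[0])
    return [sum(w * x[d] for w, x in zip(inv, candidates)) / total
            for d in range(dim)]

def big_bang(center, n_candidates, spread, iteration, rng):
    """Big bang: sample around the center; the spread shrinks as 1/iteration."""
    return [[c + spread * rng.gauss(0.0, 1.0) / iteration for c in center]
            for _ in range(n_candidates)]

# Two candidate points; the cheaper one (cost 1.0) pulls the center toward it.
center = center_of_mass([[0.0, 0.0], [2.0, 2.0]], [1.0, 3.0])
population = big_bang(center, n_candidates=5, spread=1.0, iteration=2,
                      rng=random.Random(0))
```

Weighting by inverse cost means low-cost (better) candidates dominate the center, which is why the step suits local exploitation.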


2012 ◽  
Author(s):  
Earth B. Ugat ◽  
Jennifer Joyce M. Montemayor ◽  
Mark Anthony N. Manlimos ◽  
Dante D. Dinawanao

2012 ◽  
Vol 3 (3) ◽  
pp. 122-125
Author(s):  
THAHASSIN C ◽  
A. GEETHA ◽  
RASEEK C

Author(s):  
Achmad Fanany Onnilita Gaffar ◽  
Agusma Wajiansyah ◽  
Supriadi Supriadi

The shortest path problem is an optimization problem in which the quantity being optimized is a distance. In general, shortest-route search can be solved with two kinds of methods: conventional methods and heuristic methods. Ant Colony Optimization (ACO) is one of the optimization algorithms based on a heuristic method. ACO is adopted from the behavior of ant colonies, which are naturally able to find the shortest route from the nest to food sources. In this study, ACO is used to determine the shortest route from Bumi Senyiur Hotel (origin point) to the East Kalimantan Governor's Office (destination point). These origin and destination points were selected because a large number of major roads can connect them. The data source is the base map of Samarinda City, cropped at certain coordinates using the Google Earth app so that it covers the selected origin and destination points. The acquired base map image is pre-processed to obtain its numerical data. ACO is then applied to the data to obtain the shortest path from the origin point to the destination point. The results show that the number of ants used affects how many candidate solutions are explored on the way to the optimal one. The number of tours affects the amount of pheromone left on each edge traversed by ants. Because the global pheromone update is applied on each tour, a path that ants have traversed may run out of pheromone by the end of the tour. This can lead to inconsistent results when the number of ants is smaller than the number of tours.
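
The interplay of ants, tours, and the global pheromone update described above can be sketched as follows. This is a generic ACO shortest-path routine on a toy graph, not the study's implementation or its Samarinda road data; the parameter names (`alpha`, `beta`, `rho`, `q`) follow common ACO convention, and the `tau_min` floor is an added assumption that keeps an edge from running out of pheromone entirely.

```python
import random

def aco_shortest_path(graph, source, target, n_ants=10, n_tours=30,
                      alpha=1.0, beta=2.0, rho=0.5, q=1.0,
                      tau_min=1e-9, seed=0):
    """graph[u][v] is the positive length of edge u -> v."""
    rng = random.Random(seed)
    tau = {(u, v): 1.0 for u in graph for v in graph[u]}
    best_path, best_len = None, float("inf")
    for _ in range(n_tours):
        tours = []
        for _ in range(n_ants):
            path, visited = [source], {source}
            while path is not None and path[-1] != target:
                node = path[-1]
                choices = [v for v in graph[node] if v not in visited]
                if not choices:          # dead end: discard this ant
                    path = None
                    break
                # Favour edges with more pheromone and shorter length.
                weights = [tau[(node, v)] ** alpha * (1.0 / graph[node][v]) ** beta
                           for v in choices]
                nxt = rng.choices(choices, weights=weights)[0]
                path.append(nxt)
                visited.add(nxt)
            if path is not None:
                length = sum(graph[a][b] for a, b in zip(path, path[1:]))
                tours.append((path, length))
                if length < best_len:
                    best_path, best_len = path, length
        # Global update once per tour: evaporate everywhere, then deposit.
        for edge in tau:
            tau[edge] = max(tau[edge] * (1.0 - rho), tau_min)
        for path, length in tours:
            for a, b in zip(path, path[1:]):
                tau[(a, b)] += q / length
    return best_path, best_len

# Toy undirected road graph (made-up nodes and distances).
roads = {
    "A": {"B": 2, "C": 5},
    "B": {"A": 2, "C": 1, "D": 4},
    "C": {"A": 5, "B": 1, "D": 1},
    "D": {"B": 4, "C": 1},
}
path, length = aco_shortest_path(roads, "A", "D")
```

With few ants per tour, an edge that no ant happens to use in a given tour only evaporates, which is one way the inconsistency noted in the abstract can arise.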


2020 ◽  
Vol 26 (11) ◽  
pp. 2427-2447
Author(s):  
S.N. Yashin ◽  
E.V. Koshelev ◽  
S.A. Borisov

Subject. This article discusses issues related to developing a technology for modeling and optimizing economic, financial, information, and logistics cluster-cluster cooperation within a federal district. Objectives. The article aims to propose a model for determining the optimal center of industrial agglomeration for innovation and industry clusters located in a federal district. Methods. For the study, we used the ant colony optimization algorithm. Results. The article proposes an original model of cluster-cluster cooperation and, using the Volga Federal District as a case study, identifies the cities of Samara, Ulyanovsk, and Dimitrovgrad as the best option for the industrial agglomeration. Conclusions. If the industrial agglomeration center is located in these three cities, the resulting reduction in overall transportation costs and in natural population decline in the Volga Federal District will make it possible to qualitatively improve foresight of the evolution of the district's large innovation system.

