scholarly journals Evading obscure communication from spam emails

2021 ◽  
Vol 19 (2) ◽  
pp. 1926-1943
Author(s):  
Khan Farhan Rafat ◽  
◽  
Qin Xin ◽  
Abdul Rehman Javed ◽  
Zunera Jalil ◽  
...  

<abstract><p>Spam is any form of annoying and unsought digital communication sent in bulk and may contain offensive content feasting viruses and cyber-attacks. The voluminous increase in spam has necessitated developing more reliable and vigorous artificial intelligence-based anti-spam filters. Besides text, an email sometimes contains multimedia content such as audio, video, and images. However, text-centric email spam filtering employing text classification techniques remains today's preferred choice. In this paper, we show that text pre-processing techniques nullify the detection of malicious contents in an obscure communication framework. We use <italic>Spamassassin</italic> corpus with and without text pre-processing and examined it using machine learning (ML) and deep learning (DL) algorithms to classify these as ham or spam emails. The proposed DL-based approach consistently outperforms ML models. In the first stage, using pre-processing techniques, the long-short-term memory (LSTM) model achieves the highest results of 93.46% precision, 96.81% recall, and 95% F1-score. In the second stage, without using pre-processing techniques, LSTM achieves the best results of 95.26% precision, 97.18% recall, and 96% F1-score. Results show the supremacy of DL algorithms over the standard ones in filtering spam. However, the effects are unsatisfactory for detecting encrypted communication for both forms of ML algorithms.</p></abstract>

2017 ◽  
Vol 2017 ◽  
pp. 1-9 ◽  
Author(s):  
Xin Liu ◽  
Pingjun Zou ◽  
Weishan Zhang ◽  
Jiehan Zhou ◽  
Changying Dai ◽  
...  

Email spam consumes a lot of network resources and threatens many systems because of its unwanted or malicious content. Most existing spam filters only target complete-spam but ignore semispam. This paper proposes a novel and comprehensive CPSFS scheme: Credible Personalized Spam Filtering Scheme, which classifies spam into two categories: complete-spam and semispam, and targets filtering both kinds of spam. Complete-spam is always spam for all users; semispam is an email identified as spam by some users and as regular email by other users. Most existing spam filters target complete-spam but ignore semispam. In CPSFS, Bayesian filtering is deployed at email servers to identify complete-spam, while semispam is identified at client side by crowdsourcing. An email user client can distinguish junk from legitimate emails according to spam reports from credible contacts with the similar interests. Social trust and interest similarity between users and their contacts are calculated so that spam reports are more accurately targeted to similar users. The experimental results show that the proposed CPSFS can improve the accuracy rate of distinguishing spam from legitimate emails compared with that of Bayesian filter alone.


Spam emails, also known as non-self, are unsolicited commercial emails or fraudulent emails sent to a particular individual or company, or to a group of individuals. Machine learning algorithms in the area of spam filtering is commonly used. There has been a lot of effort to render spam filtering more efficient in classifying e-mails as either ham (valid messages) or spam (unwanted messages) through the ML classifiers. We may recognize the distinguishing features of the material of documents. Much important work has been carried out in the area of spam filtering which cannot be adapted to various conditions and problems which are limited to certain domains. Our analysis contrasts the positives methods as well as some shortcomings of current ML methods and open spam filters study challenges. We suggest some of the new ongoing approaches towards deep leaning as potential tactics that can tackle the challenge of spam emails efficiently.


Author(s):  
Palla Srikanth ◽  
Dantu W. Ram ◽  
Cangussu João

Email spam is the most effective form of online advertising. Unlike telephone marketing, email spamming does not require huge human or financial resources investment. Most existing spam filtering techniques concentrate on the emails’ content. However, most spammers obfuscate their emails’ content to circumvent content-based spam filters. An integrated solution for restricting spam emails is needed as content analyses alone might not provide a solution for filtering unsolicited emails. Here we present a new method for isolating unsolicited emails. Though spammers obfuscate their emails’ content, they do not have access to all the fields in the email header. Our classification method is based on the path an email traverses instead of content. Overall, our classifier produced fewer false positives when compared to current filters such as SpamAssassin. We achieved a precision of 98.65% which compares well with the precisions achieved by SPF, DNSRBL blacklists.


Author(s):  
RajKishore Sahni

The upsurge in the volume of unwanted emails called spam has created an intense need for the development of more dependable and robust antispam filters. Machine learning methods of recent are being used to successfully detect and filter spam emails. We present a systematic review of some of the popular machine learning based email spam filtering approaches. Our review covers survey of the important concepts, attempts, efficiency, and the research trend in spam filtering. The preliminary discussion in the study background examines the applications of machine learning techniques to the email spam filtering process of the leading internet service providers (ISPs) like Gmail, Yahoo and Outlook emails spam filters. Discussion on general email spam filtering process, and the various efforts by different researchers in combating spam through the use machine learning techniques was done. Our review compares the strengths and drawbacks of existing machine learning approaches and the open research problems in spam filtering. We recommended deep learning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails


Author(s):  
Iqbal H. Sarker

Deep learning (DL), which is originated from an artificial neural network (ANN), is one of the major technologies of today's smart cybersecurity systems or policies to function in an intelligent manner. Popular deep learning techniques, such as Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL or Deep TL), Deep Reinforcement Learning (DRL or Deep RL), or their ensembles and hybrid approaches can be used to intelligently tackle the diverse cybersecurity issues. In this paper, we aim to present a comprehensive overview from the perspective of these neural networks and deep learning techniques according to today's diverse needs. We also discuss the applicability of these techniques in various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, predicting cyber-attacks, e.g. denial of service (DoS), fraud detection or cyber-anomalies, etc. Finally, we highlight several research issues and future directions within the scope of our study in the field. Overall, the ultimate goal of this paper is to serve as a reference point and guidelines for the academia and professionals in the cyber industries, especially from the deep learning point of view.


Author(s):  
Wasan Shaker Awad ◽  
Wafa M. Rafiq

Email is the most popular choice of communication due to its low-cost and easy accessibility, which makes email spam a major issue. Emails can be incorrectly marked by a spam filter and legitimate emails can get lost in the spam folder or the spam emails can deluge the users' inboxes. Therefore, various methods based on statistics and machine learning have been developed to classify emails accurately. In this chapter, the existing spam filtering methods were studied comprehensively, and a spam email classifier based on the genetic algorithm was proposed. The proposed algorithm was successful in achieving high accuracy by reducing the rate of false positives, but at the same time, it also maintained an acceptable rate of false negatives. The proposed algorithm was tested on 2000 emails from the two popular spam datasets, Enron and LingSpam, and the accuracy was found to be nearly 90%. The results showed that the genetic algorithm is an effective method for spam classification and with further enhancements that will provide a more robust spam filter.


Sign in / Sign up

Export Citation Format

Share Document