Evading obscure communication from spam emails

Khan Farhan Rafat; Qin Xin; Abdul Rehman Javed; Zunera Jalil; Rana Zeeshan Ahmad

doi:10.3934/mbe.2022091

Mathematical Biosciences and Engineering ◽

10.3934/mbe.2022091 ◽

2021 ◽

Vol 19 (2) ◽

pp. 1926-1943

Author(s):

Khan Farhan Rafat ◽

Qin Xin ◽

Abdul Rehman Javed ◽

Zunera Jalil ◽

...

Keyword(s):

Short Term Memory ◽

Cyber Attacks ◽

Spam Filtering ◽

Communication Framework ◽

Second Stage ◽

Spam Filters ◽

Encrypted Communication ◽

Processing Techniques ◽

Audio Video ◽

Email Spam

Spam is any form of annoying, unsolicited digital communication sent in bulk; it may contain offensive content and can spread viruses and cyber-attacks. The sharp increase in spam has necessitated more reliable and robust artificial intelligence-based anti-spam filters. Besides text, an email sometimes contains multimedia content such as audio, video, and images. However, text-centric email spam filtering based on text classification techniques remains today's preferred choice. In this paper, we show that text pre-processing techniques nullify the detection of malicious content in an obscure communication framework. We use the SpamAssassin corpus, with and without text pre-processing, and examine it using machine learning (ML) and deep learning (DL) algorithms to classify emails as ham or spam. The proposed DL-based approach consistently outperforms the ML models. In the first stage, with pre-processing, the long short-term memory (LSTM) model achieves the highest results: 93.46% precision, 96.81% recall, and a 95% F1-score. In the second stage, without pre-processing, LSTM achieves the best results: 95.26% precision, 97.18% recall, and a 96% F1-score. The results show that DL algorithms outperform the standard ones in filtering spam. However, the results are unsatisfactory for detecting encrypted communication with both ML and DL algorithms.
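
The F1-scores quoted in this abstract can be cross-checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only an arithmetic check on the published figures; the function name `f1_score` is ours and nothing here reproduces the authors' models or data.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract, in percent.
with_preprocessing = f1_score(93.46, 96.81)
without_preprocessing = f1_score(95.26, 97.18)

print(round(with_preprocessing, 2))     # 95.11, reported as 95%
print(round(without_preprocessing, 2))  # 96.21, reported as 96%
```

Both values round to the F1-scores stated in the abstract, so the three reported metrics are mutually consistent.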

CPSFS: A Credible Personalized Spam Filtering Scheme by Crowdsourcing

Wireless Communications and Mobile Computing ◽

10.1155/2017/1457870 ◽

2017 ◽

Vol 2017 ◽

pp. 1-9 ◽

Cited By ~ 2

Author(s):

Xin Liu ◽

Pingjun Zou ◽

Weishan Zhang ◽

Jiehan Zhou ◽

Changying Dai ◽

...

Keyword(s):

Social Trust ◽

Experimental Results ◽

Bayesian Filtering ◽

Spam Filtering ◽

Network Resources ◽

Accuracy Rate ◽

Spam Filters ◽

Interest Similarity ◽

Client Side ◽

Email Spam

Email spam consumes a lot of network resources and threatens many systems because of its unwanted or malicious content. Most existing spam filters target only complete-spam and ignore semispam. This paper proposes a novel and comprehensive scheme, the Credible Personalized Spam Filtering Scheme (CPSFS), which classifies spam into two categories, complete-spam and semispam, and filters both. Complete-spam is spam for all users; semispam is an email identified as spam by some users and as regular email by others. In CPSFS, Bayesian filtering is deployed at email servers to identify complete-spam, while semispam is identified on the client side by crowdsourcing. An email client can distinguish junk from legitimate emails according to spam reports from credible contacts with similar interests. Social trust and interest similarity between users and their contacts are calculated so that spam reports are targeted more accurately to similar users. The experimental results show that CPSFS improves the accuracy of distinguishing spam from legitimate emails compared with a Bayesian filter alone.
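
One way to picture the client-side step is as a weighted vote over spam reports from contacts. The abstract does not give the formulas CPSFS uses, so everything in the sketch below is an assumption made for illustration: the `Report` fields, the use of trust multiplied by similarity as a weight, and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Report:
    trust: float       # hypothetical social-trust score in [0, 1]
    similarity: float  # hypothetical interest-similarity score in [0, 1]
    is_spam: bool      # whether this contact reported the email as spam


def semispam_score(reports: list[Report]) -> float:
    """Share of the total report weight that says "spam".

    Assumption: a report's weight is trust * similarity. The paper may
    combine the two quantities differently.
    """
    weights = [r.trust * r.similarity for r in reports]
    total = sum(weights)
    if total == 0:
        return 0.0
    spam_weight = sum(w for w, r in zip(weights, reports) if r.is_spam)
    return spam_weight / total


def is_semispam(reports: list[Report], threshold: float = 0.5) -> bool:
    """Flag the email when the weighted spam share reaches the threshold."""
    return semispam_score(reports) >= threshold


reports = [
    Report(trust=0.9, similarity=0.8, is_spam=True),
    Report(trust=0.2, similarity=0.1, is_spam=False),
    Report(trust=0.5, similarity=0.5, is_spam=True),
]
print(is_semispam(reports))
```

Under this weighting, a report from a distant, low-trust contact barely moves the score, which matches the abstract's stated goal of targeting reports to similar users.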

A Machine Learning Based Email Spam Classification Framework Model: Related Challenges and Issues

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.d1561.029420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 3137-3144

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Spam Filtering ◽

Important Work ◽

Classification Framework ◽

Framework Model ◽

Spam Filters ◽

Distinguishing Features ◽

Email Spam

Spam emails, also known as non-self, are unsolicited commercial or fraudulent emails sent to a particular individual or company, or to a group of individuals. Machine learning algorithms are commonly used in spam filtering. Considerable effort has gone into making ML classifiers more efficient at classifying e-mails as either ham (valid messages) or spam (unwanted messages) by recognizing distinguishing features of the document content. Much important work has been carried out in spam filtering, but it is often limited to certain domains and cannot be adapted to varying conditions and problems. Our analysis contrasts the strengths and shortcomings of current ML methods and outlines open research challenges for spam filters. We suggest some new, ongoing approaches based on deep learning as potential tactics that can tackle the challenge of spam emails efficiently.

Spam Classification Based on E-Mail Path Analysis

Pervasive Information Security and Privacy Developments ◽

10.4018/978-1-61692-000-5.ch021 ◽

2011 ◽

pp. 332-355

Author(s):

Srikanth Palla ◽

Ram Dantu ◽

João W. Cangussu

Keyword(s):

Online Advertising ◽

False Positives ◽

New Method ◽

Financial Resources ◽

Spam Filtering ◽

Effective Form ◽

Content Analyses ◽

Spam Filters ◽

E Mail ◽

Email Spam

Email spam is the most effective form of online advertising. Unlike telephone marketing, email spamming does not require a large investment of human or financial resources. Most existing spam filtering techniques concentrate on the emails' content. However, most spammers obfuscate their emails' content to circumvent content-based spam filters. An integrated solution for restricting spam emails is needed, as content analysis alone might not be enough to filter unsolicited emails. Here we present a new method for isolating unsolicited emails. Although spammers obfuscate their emails' content, they do not have access to all the fields in the email header. Our classification method is based on the path an email traverses rather than on its content. Overall, our classifier produced fewer false positives than current filters such as SpamAssassin. We achieved a precision of 98.65%, which compares well with the precision achieved by SPF and DNSRBL blacklists.
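
To make "the path an email traverses" concrete, the sketch below reads the relay path from a message's `Received` headers using Python's standard `email` package. It illustrates only the header extraction step, not the classifier described in the abstract; the sample message, the host names, and the helper `relay_path` are invented for the example, and which header fields the authors actually rely on is not stated here.

```python
import re
from email import message_from_string

# Invented sample message; each relay prepends its own Received header,
# so the topmost header is the most recent hop.
RAW = """\
Received: from mx.example.org (mx.example.org [203.0.113.7])
\tby mail.example.com with ESMTP; Mon, 1 Jan 2024 10:00:02 +0000
Received: from sender.example.net (sender.example.net [198.51.100.23])
\tby mx.example.org with ESMTP; Mon, 1 Jan 2024 10:00:01 +0000
From: alice@example.net
To: bob@example.com
Subject: hello

body
"""


def relay_path(raw: str) -> list[str]:
    """Return the relaying hosts in the order the message travelled."""
    msg = message_from_string(raw)
    hops = []
    # Reverse so the list runs from the origin towards the recipient.
    for header in reversed(msg.get_all("Received", [])):
        match = re.search(r"from\s+(\S+)", header)
        if match:
            hops.append(match.group(1))
    return hops


print(relay_path(RAW))
```

A path-based filter would then derive features from such a hop list (for example, whether the relays are known or consistent with the claimed sender) instead of from the message body.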

Supervised Machine Learning Classifier for Email Spam Filtering

Innovations in Computer Science and Engineering - Lecture Notes in Networks and Systems ◽

10.1007/978-981-13-7082-3_41 ◽

2019 ◽

pp. 357-363 ◽

Cited By ~ 1

Author(s):

Deepika Mallampati ◽

K. Chandra Shekar ◽

K. Ravikanth

Keyword(s):

Machine Learning ◽

Supervised Machine Learning ◽

Spam Filtering ◽

Learning Classifier ◽

Email Spam

Email Spam Detection using Bidirectional Long Short Term Memory with Convolutional Neural Network

2020 IEEE Region 10 Symposium (TENSYMP) ◽

10.1109/tensymp50017.2020.9230769 ◽

2020 ◽

Author(s):

Sefat E Rahman ◽

Shofi Ullah

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Short Term Memory ◽

Spam Detection ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Email Spam

Three-way Email Spam Filtering with Game-theoretic Rough Sets

2019 International Conference on Computing, Networking and Communications (ICNC) ◽

10.1109/iccnc.2019.8685642 ◽

2019 ◽

Cited By ~ 2

Author(s):

Yan Zhang ◽

PengFei Liu ◽

JingTao Yao

Keyword(s):

Rough Sets ◽

Spam Filtering ◽

Game Theoretic ◽

Email Spam

Comparison of Deep and Traditional Learning Methods for Email Spam Filtering

International Journal of Advanced Computer Science and Applications ◽

10.14569/ijacsa.2021.0120164 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Abdullah Sheneamer

Keyword(s):

Traditional Learning ◽

Spam Filtering ◽

Learning Methods ◽

Email Spam

Analysis of Naïve Bayes Algorithm for Email Spam Filtering

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtst0701002 ◽

2021 ◽

Vol 7 (01) ◽

pp. 5-9

Author(s):

RajKishore Sahni

Keyword(s):

Machine Learning ◽

Service Providers ◽

Machine Learning Techniques ◽

Research Trend ◽

Learning Approaches ◽

Spam Filtering ◽

Internet Service ◽

Learning Techniques ◽

Bayes Algorithm ◽

Email Spam

The upsurge in the volume of unwanted emails, called spam, has created an intense need for more dependable and robust antispam filters. Machine learning methods have recently been used to detect and filter spam emails successfully. We present a systematic review of some of the popular machine learning-based email spam filtering approaches. Our review surveys the important concepts, attempts, efficiency, and research trends in spam filtering. The preliminary discussion in the study background examines how machine learning techniques are applied in the email spam filtering process of leading internet service providers (ISPs) such as Gmail, Yahoo, and Outlook. We discuss the general email spam filtering process and the efforts of different researchers to combat spam using machine learning techniques. Our review compares the strengths and drawbacks of existing machine learning approaches and identifies open research problems in spam filtering. We recommend deep learning and deep adversarial learning as the future techniques that can effectively handle the menace of spam emails.

Deep Cybersecurity: A Comprehensive Overview from Neural Network and Deep Learning Perspective

10.20944/preprints202102.0340.v1 ◽

2021 ◽

Author(s):

Iqbal H. Sarker

Keyword(s):

Neural Network ◽

Deep Learning ◽

Short Term Memory ◽

Denial Of Service ◽

Cyber Attacks ◽

Self Organizing Map ◽

Comprehensive Overview ◽

Generative Adversarial Network ◽

Research Issues ◽

Learning Techniques

Deep learning (DL), which originates from the artificial neural network (ANN), is one of the major technologies that enable today's smart cybersecurity systems and policies to function intelligently. Popular deep learning techniques, such as Multi-layer Perceptron (MLP), Convolutional Neural Network (CNN or ConvNet), Recurrent Neural Network (RNN) or Long Short-Term Memory (LSTM), Self-organizing Map (SOM), Auto-Encoder (AE), Restricted Boltzmann Machine (RBM), Deep Belief Networks (DBN), Generative Adversarial Network (GAN), Deep Transfer Learning (DTL or Deep TL), Deep Reinforcement Learning (DRL or Deep RL), or their ensembles and hybrid approaches, can be used to tackle diverse cybersecurity issues intelligently. In this paper, we present a comprehensive overview from the perspective of these neural networks and deep learning techniques according to today's diverse needs. We also discuss the applicability of these techniques to various cybersecurity tasks such as intrusion detection, identification of malware or botnets, phishing, prediction of cyber-attacks (e.g., denial of service (DoS)), and detection of fraud or cyber-anomalies. Finally, we highlight several research issues and future directions within the scope of our study. Overall, the goal of this paper is to serve as a reference point and guideline for academics and professionals in the cyber industries, especially from the deep learning point of view.

Improving Spam Email Filtering Systems Using Data Mining Techniques

Implementing Computational Intelligence Techniques for Security Systems Design - Advances in Computational Intelligence and Robotics ◽

10.4018/978-1-7998-2418-3.ch003 ◽

2020 ◽

pp. 43-72

Author(s):

Wasan Shaker Awad ◽

Wafa M. Rafiq

Keyword(s):

Machine Learning ◽

Data Mining ◽

Genetic Algorithm ◽

Low Cost ◽

High Accuracy ◽

False Positives ◽

Spam Filtering ◽

Spam Filter ◽

Using Data ◽

Email Spam

Email is the most popular choice of communication due to its low cost and easy accessibility, which makes email spam a major issue. A spam filter can mark emails incorrectly, so legitimate emails can get lost in the spam folder or spam emails can deluge users' inboxes. Therefore, various methods based on statistics and machine learning have been developed to classify emails accurately. In this chapter, the existing spam filtering methods are studied comprehensively, and a spam email classifier based on the genetic algorithm is proposed. The proposed algorithm achieved high accuracy by reducing the rate of false positives while maintaining an acceptable rate of false negatives. It was tested on 2000 emails from two popular spam datasets, Enron and LingSpam, and the accuracy was found to be nearly 90%. The results show that the genetic algorithm is an effective method for spam classification and that, with further enhancements, it can provide a more robust spam filter.
