E-Mail Worm Detection Using Data Mining

Mohammad M. Masud; Latifur Khan; Bhavani Thuraisingham

doi:10.4018/jisp.2007100103

E-Mail Worm Detection Using Data Mining

Data Warehousing and Mining ◽

10.4018/978-1-59904-951-9.ch121 ◽

2008 ◽

pp. 2036-2050

Author(s):

Mohammad M. Masud ◽

Latifur Khan ◽

Bhavani Thuraisingham

Keyword(s):

Data Mining ◽

Feature Selection ◽

Dimension Reduction ◽

Principal Component ◽

Two Phase ◽

Feature Selection Technique ◽

Worm Detection ◽

Phase Selection ◽

Using Data ◽

E Mail

This work applies data mining techniques to detect e-mail worms. E-mail messages contain a number of different features such as the total number of words in message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature-selection and dimension-reduction. This step is necessary to reduce noise and redundancy from the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of decision tree and greedy selection algorithm. The dimension-reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes, and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.
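The greedy half of a feature-selection step like the one described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' TPS algorithm: the decision-tree phase is omitted, a simple nearest-centroid scorer stands in for the SVM and Naïve Bayes classifiers, and the function names, feature columns, and toy messages are all hypothetical.

```python
from statistics import mean


def centroid_accuracy(rows, labels, features):
    """Training accuracy of a nearest-centroid classifier restricted to `features`.

    Assumption: this scorer is a stand-in for a real classifier such as an SVM.
    """
    centroids = {}
    for label in set(labels):
        members = [r for r, y in zip(rows, labels) if y == label]
        centroids[label] = [mean(r[f] for r in members) for f in features]

    def predict(row):
        point = [row[f] for f in features]
        return min(
            centroids,
            key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
        )

    hits = sum(predict(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def greedy_select(rows, labels, n_features):
    """Forward selection: keep adding the feature that most improves accuracy."""
    selected, best_score = [], 0.0
    remaining = list(range(n_features))
    while remaining:
        score, feature = max(
            (centroid_accuracy(rows, labels, selected + [f]), f) for f in remaining
        )
        if score <= best_score:
            break  # no remaining feature improves the score
        selected.append(feature)
        remaining.remove(feature)
        best_score = score
    return selected, best_score


# Hypothetical toy messages: [word_count, has_binary_attachment, noise_value]
rows = [
    [120, 0, 5], [90, 0, 1], [110, 0, 3],   # clean messages
    [100, 1, 4], [95, 1, 2], [115, 1, 5],   # worm messages
]
labels = ["clean", "clean", "clean", "worm", "worm", "worm"]

selected, score = greedy_select(rows, labels, n_features=3)
print(selected, score)
```

On this toy data only the attachment flag (column 1) is selected, because it separates the two classes on its own and adding either other column does not improve the score. A real evaluation would score candidates on held-out data rather than on the training set.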

Email Worm Detection Using Data Mining

Techniques and Applications for Advanced Information Privacy and Security ◽

10.4018/978-1-60566-210-7.ch002 ◽

2011 ◽

pp. 20-34

Author(s):

Mohammad M. Masud ◽

Latifur Khan ◽

Bhavani Thuraisingham

Keyword(s):

Data Mining ◽

Feature Selection ◽

Principal Component ◽

Classification Model ◽

Support Vector ◽

Two Phase ◽

Feature Selection Technique ◽

Worm Detection ◽

Phase Selection ◽

Using Data

This chapter applies data mining techniques to detect email worms. Email messages contain a number of different features such as the total number of words in message body/subject, presence/absence of binary attachments, type of attachments, and so on. The goal is to obtain an efficient classification model based on these features. The solution consists of several steps. First, the number of features is reduced using two different approaches: feature-selection and dimension-reduction. This step is necessary to reduce noise and redundancy from the data. The feature-selection technique is called Two-phase Selection (TPS), which is a novel combination of decision tree and greedy selection algorithm. The dimension-reduction is performed by Principal Component Analysis. Second, the reduced data is used to train a classifier. Different classification techniques have been used, such as Support Vector Machine (SVM), Naïve Bayes, and their combination. Finally, the trained classifiers are tested on a dataset containing both known and unknown types of worms. These results have been compared with published results. It is found that the proposed TPS selection along with SVM classification achieves the best accuracy in detecting both known and unknown types of worms.
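The dimension-reduction step mentioned above can be illustrated with a small Principal Component Analysis sketch. This is an assumption-laden toy, not the chapter's implementation: it recovers only the first principal component, uses plain power iteration on the covariance matrix, and the function name and sample points are hypothetical.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (direction, projections) for the top principal component.

    Assumptions: the data has non-zero variance, and the all-ones start
    vector is not orthogonal to the dominant eigenvector.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its dominant eigenvector.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project each centered row onto the component (d features -> 1 value).
    projections = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, projections


# Hypothetical points lying exactly on the line y = 2x.
points = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
direction, reduced = first_principal_component(points)
print(direction, reduced)
```

Because the sample points lie on a single line, the recovered direction is proportional to (1, 2) and the one-dimensional projections preserve all of the variance. Real feature vectors would typically need several components and a numerically robust eigensolver.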

A Study on Detection of Small Size Malicious Code using Data Mining Method

Journal of Information and Security ◽

10.33778/kcsa.2019.19.1.011 ◽

2019 ◽

Vol 19 (1) ◽

pp. 11-17

Author(s):

Taek-Hyun Lee ◽

Ho Kook Kwang

Keyword(s):

Data Mining ◽

Malicious Code ◽

Mining Method ◽

Data Mining Method ◽

Using Data

A Study on the Analysis of Employment Decision Factor of the Visually Impaired using Data Mining Technique

Disability & Employment ◽

10.15707/disem.2013.23.1.011 ◽

2013 ◽

Vol 23 (1) ◽

pp. 273-302 ◽

Cited By ~ 6

Author(s):

임은정 ◽

신현욱 ◽

김성진

Keyword(s):

Data Mining ◽

Visually Impaired ◽

Data Mining Technique ◽

Mining Technique ◽

Employment Decision ◽

Using Data

Analysis of Crop Yield Prediction of Kharif & Rabi Jowar Crops Using Data Mining Techniques

International Journal of Advanced Research in Computer Science and Software Engineering ◽

10.23956/ijarcsse.v7i11.468 ◽

2017 ◽

Vol 7 (11) ◽

pp. 79

Author(s):

Sujata Mulik

Keyword(s):

Data Mining ◽

Crop Yield ◽

Crop Production ◽

Climatic Factors ◽

Crop Productivity ◽

Yield Prediction ◽

Data Mining Techniques ◽

Agriculture Sector ◽

Using Data ◽

Rabi Crops

The agriculture sector in India faces a serious challenge in maximizing crop productivity. More than 60 percent of the crop still depends on climatic factors such as rainfall, temperature, and humidity. This paper discusses the use of various data mining applications in the agriculture sector, where data mining is used to solve a range of problems, including yield prediction. Yield prediction remains a major problem to be solved from the available data, and data mining techniques are a good choice for this purpose. Different data mining techniques are used and evaluated in agriculture for estimating a future year's crop production. This paper focuses on predicting the crop yield of Kharif and Rabi crops.

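Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar) [Designing an Application to Predict On-Time Graduation of New Students Using Data Mining Techniques (Case Study: Student Academic Data of STMIK Dipanegara Makassar)]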

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

This research aims to predict whether new students are likely to complete their studies on time, using data mining analysis with the K-Nearest Neighbor (KNN) algorithm to explore accumulated historical data. The application developed in this study uses a variety of attributes, including national examination (Ujian Nasional, UN) scores, school/region of origin, gender, parents' occupation and income, number of siblings, and others. By applying KNN analysis, a prediction can be made based on the proximity of existing historical data to new data, indicating whether or not a student is likely to complete their studies on time. In testing with the KNN algorithm, using sample data from alumni who graduated in 2004 through 2010 as the existing cases and data from alumni who graduated in 2011 as the new cases, an accuracy rate of 83.36% was obtained.
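The KNN prediction described above, which classifies a new case by its proximity to historical cases, can be sketched as follows. This is a minimal illustration, not the paper's application: the two numeric features (an exam score and a sibling count), the labels, and all records are hypothetical, and real attributes would need encoding and scaling before distances are meaningful.

```python
import math
from collections import Counter


def knn_predict(history, query, k=3):
    """Classify `query` by majority vote among its k nearest historical records."""
    by_distance = sorted(history, key=lambda rec: math.dist(rec[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical historical cases: ((exam_score, siblings), outcome)
history = [
    ((8.5, 1), "on_time"),
    ((8.0, 2), "on_time"),
    ((7.8, 0), "on_time"),
    ((5.5, 4), "late"),
    ((6.0, 5), "late"),
    ((5.0, 3), "late"),
]

print(knn_predict(history, (7.9, 1)))  # nearest records are all "on_time"
print(knn_predict(history, (5.4, 4)))  # nearest records are all "late"
```

An accuracy figure like the one reported would come from running such a predictor over a separate set of new cases and counting how many predictions match the known outcomes.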

Predicting Student Performance using Data Mining

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i10.172177 ◽

2018 ◽

Vol 6 (10) ◽

pp. 172-177

Author(s):

Mabel Christina

Keyword(s):

Data Mining ◽

Student Performance ◽

Predicting Student Performance ◽

Using Data

Forecasting Automobile Retail Sales Using Data Mining The Case of Ranchi, Jharkhand, India

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i9.572574 ◽

2018 ◽

Vol 6 (9) ◽

pp. 572-574

Author(s):

Gyaneshwar Mahto ◽

Umesh Prasad ◽

Rajiv Kumar Dwivedi

Keyword(s):

Data Mining ◽

Retail Sales ◽

Using Data

CROP YIELD PREDICTION AND SOIL DATA ANALYSIS USING DATA MINING TECHNIQUES IN KRISHNAGIRI DISTRICT

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6si8.4955 ◽

2018 ◽

Vol 06 (08) ◽

pp. 49-55

Author(s):

K. Samundeeswari ◽

K. Srinivasan

Keyword(s):

Data Mining ◽

Data Analysis ◽

Crop Yield ◽

Yield Prediction ◽

Data Mining Techniques ◽

Using Data

Diabetes Prediction using Data Mining

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i3.749753 ◽

2019 ◽

Vol 7 (3) ◽

pp. 749-753

Author(s):

Suhasini Vijaykumar ◽

Manjiri Moghe

Keyword(s):

Data Mining ◽

Diabetes Prediction ◽

Using Data
