Appraisal of the Classification Technique in Data Mining of Student Performance using J48 Decision Tree, K-Nearest Neighbor and Multilayer Perceptron Algorithms

2018 ◽  
Vol 179 (33) ◽  
pp. 39-46 ◽  
Author(s):  
Faiza Umar ◽  
Najim Ussiph
Author(s):  
Tikaridha Hardiani

The number of students at Universitas ‘Aisyiyah Yogyakarta (UNISA) has been increasing, including in the Faculty of Health Sciences. In 2016 the total number of UNISA students was 1851. This yearly growth leads to a large volume of data stored in the university database. The data provide useful information for predicting each student's study period: whether a student graduates on time, with a study period of 4 years, or late, with a study period of more than 4 years. Such predictions can be produced with the data mining technique of classification. Classification needs data on students who have already graduated as training data and data on students who are still enrolled as testing data. The training data comprised 501 records with 10 goals and the testing data comprised 428 records. The data mining process followed the Cross-Industry Standard Process for Data Mining (CRISP-DM). The algorithms used in this study were Naive Bayes, K-Nearest Neighbor (K-NN) and Decision Tree. The three algorithms were compared on accuracy using RapidMiner software. K-NN was the most accurate in predicting student graduation, with an accuracy of 91.82%. The K-NN model predicted that 100% of the students of the Nursing study program at Universitas ‘Aisyiyah Yogyakarta would graduate on time.


2016 ◽  
Vol 7 (4) ◽  
Author(s):  
Mochammad Yusa ◽  
Ema Utami ◽  
Emha T. Luthfi

Abstract. Readmission is associated with quality measures for patients in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, age, and others, tend to make the calculation of quality of care complicated. Classification techniques from data mining can address this problem. In this paper, three classifiers, i.e. Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated with various parameter settings using the 10-Fold Cross Validation technique. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, but Naive Bayes outperforms the other classifiers in terms of MAE. Keywords: k-NN, naive bayes, diabetes, readmission

Abstract (translated from Indonesian). Readmission is associated with measuring the quality of patient care in hospitals. Differences in the attributes related to diabetic patients, such as the medication process, ethnicity, race, lifestyle, age, and others, tend to make the quality calculation complicated. Data mining classification techniques can be a solution for this calculation. Classification is one of the data mining techniques that has developed quite significantly. In this study, the Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes classification models are evaluated with various parameter settings on Accuracy, Mean Absolute Error (MAE), and Kappa Statistic using the 10-Fold Cross Validation method. The evaluated dataset has 47 attributes and 49,735 records. The results show that the best accuracy, MAE, and Kappa Statistic were obtained from the Naive Bayes model. Keywords: k-NN, naive bayes, diabetes, readmission
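The three evaluation measures named in this abstract can be illustrated with a minimal Python sketch. The toy labels and probabilities are hypothetical and not taken from the paper. The MAE definition used here (absolute difference between the 0/1 target and the predicted probability of the positive class) is an assumption; the tool used by the authors may define it differently.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between predictions and labels, corrected for chance."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: both sides pick the same class independently.
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def mean_absolute_error(y_true, p_positive, positive="yes"):
    """Mean |target - predicted probability| for the positive class (assumed definition)."""
    return sum(abs((t == positive) - p) for t, p in zip(y_true, p_positive)) / len(y_true)


# Hypothetical toy data: "yes" means the patient was readmitted.
y_true = ["yes", "yes", "no", "no", "no", "yes", "no", "no"]
y_pred = ["yes", "no", "no", "no", "yes", "yes", "no", "no"]
p_yes = [0.9, 0.4, 0.2, 0.1, 0.6, 0.8, 0.3, 0.2]

print(accuracy(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
print(mean_absolute_error(y_true, p_yes))
```

Kappa can rank classifiers differently from accuracy when classes are imbalanced, which is one reason a study might report both.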


2018 ◽  
Vol 4 (2) ◽  
pp. 83
Author(s):  
Tutus Praningki ◽  
Indra Budi

Historical medical record data for cervical cancer patients are available at health care institutions, but they are not accompanied by a process that extracts them into knowledge or information. Data mining techniques have strong potential to be implemented in a system that can predict cervical cancer. This study focuses on a dataset of medical diagnoses of patients who are about to take a Pap smear test. The algorithms used to classify cervical cancer are Classification And Regression Trees (CART), Naive Bayes, and k-Nearest Neighbor (k-NN). The CART decision tree, Naive Bayes, and k-NN algorithms were tested using a confusion matrix, with the dataset split by the holdout technique. The tests show that Naive Bayes had the best accuracy at 94.44%, while CART and k-NN reached 88.89% and 85.04%, respectively. The performance obtained by each algorithm makes it feasible to use a cervical cancer prediction system to support clinical decisions for new patients.
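As a rough illustration of the evaluation procedure mentioned here, the following Python sketch shows a holdout split and an accuracy computed from a confusion matrix. The function names, the 30% test fraction, and the toy labels are all assumptions for illustration; the study's actual split ratio and data are not given in the abstract.

```python
import random
from collections import Counter


def holdout_split(records, test_fraction=0.3, seed=0):
    """Shuffle once, then reserve the last part of the data for testing."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) - int(round(len(shuffled) * test_fraction))
    return shuffled[:cut], shuffled[cut:]


def confusion_matrix(y_true, y_pred):
    """Count (actual, predicted) label pairs."""
    return Counter(zip(y_true, y_pred))


def accuracy_from_matrix(matrix):
    """Correct predictions are the cells where actual == predicted."""
    correct = sum(n for (actual, predicted), n in matrix.items() if actual == predicted)
    return correct / sum(matrix.values())


# Hypothetical example: ten record ids split 70/30.
train_ids, test_ids = holdout_split(list(range(10)), test_fraction=0.3)

# Hypothetical predictions on a five-record test set.
actual = ["pos", "pos", "neg", "neg", "neg"]
predicted = ["pos", "neg", "neg", "neg", "pos"]
matrix = confusion_matrix(actual, predicted)
print(matrix)
print(accuracy_from_matrix(matrix))
```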


2020 ◽  
Vol 6 (3) ◽  
pp. 337
Author(s):  
Seno Hartono ◽  
Anggi Perwitasari ◽  
Herry Sujaini

Classification is a data mining method that organizes and categorizes data into distinct classes. This study aims to compare nonparametric algorithms and determine the best one for classifying face images. The nonparametric classification algorithms used are k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Decision Tree, and AdaBoost, applied to classify face images of Indonesian residents from the Batak, Dayak, Javanese, Malay, and Chinese ethnic groups. The study uses the Orange Data Mining Tool to carry out the data mining process. Of the four algorithms, SVM gave better accuracy than the others. The average precision values of the four algorithms were, in order, 37.5% for Support Vector Machine, followed by 31.55% for k-Nearest Neighbor, 30.25% for AdaBoost, and 29.75% for Decision Tree.


2020 ◽  
Vol 5 (2) ◽  
pp. 265-270 ◽  
Author(s):  
Agus Budiyantara ◽  
Irwansyah Irwansyah ◽  
Egi Prengki ◽  
Pandi Ahmad Pratama ◽  
Ninuk Wiliani

Private universities (PTS) compete intensely to produce quality graduates. The large number of universities in Indonesia, both public (PTN) and private, further increases the competition. Universities therefore strive to improve quality and provide the best education for their service recipients, namely students. One obstacle to a university's progress is students who graduate late. University management needs predictions of on-time graduation in order to set preventive policies for the early prevention of Drop Out (DO) cases. This prediction aims to determine the academic factors that influence the study period and to build the best prediction model with data mining techniques. Eleven attributes were used for classification: NPM, Gender, Age, Department, Class, Occupation, Semester 1 Achievement Index, Semester 2 Achievement Index, Semester 3 Achievement Index, Semester 4 Achievement Index, and Information as the result attribute. In evaluation and validation carried out with RapidMiner, the Decision Tree (C4.5) method reached an accuracy of 98.04% in the third test, the Naïve Bayes method 96.00% in the fourth test, and the K-Nearest Neighbor (K-NN) method 90.00% in the second test.


2020 ◽  
Vol 17 (9) ◽  
pp. 4548-4552
Author(s):  
Vikas Rattan ◽  
Ruchi Mittal ◽  
Varun Malik

The tremendous growth in the number of educational institutions has pushed them to adopt data mining techniques to uncover important and previously unknown facts in educational data and so gain a competitive edge over their counterparts. In this paper, a student performance dataset comprising 131 records is taken from the UCI repository, and the data mining tool Orange is used to compare the accuracy of four classifiers, namely random forest, k-nearest neighbor (KNN), decision tree, and naïve Bayes, in classifying student performance in graduation. The results show that the decision tree has the highest accuracy of the four classifiers.


Author(s):  
M. Khairul Anam ◽  
Bunga Nanti Pikir ◽  
Muhammad Bambang Firdaus

The Pekanbaru government has adopted technology in its system of governance, but the implementation still draws complaints from the public, for example a command center public service that only part of the population knows about and CCTV installed on traffic signal devices (Alat Pemberi Isyarat Lalu Lintas, APILL) that does not yet work well. Other uses of technology by the Pekanbaru government can be seen in its official web portals. Netizens' varied comments, meanwhile, can be seen on Twitter. Twitter is a source of data that the public expresses through tweets posted to the timeline. Sentiment analysis was carried out to see the opinions of netizens toward the Pekanbaru government, classified as positive, negative, or neutral. The data used were a dataset of 150 tweets. The data were then analyzed to turn them into information. The analysis used the data mining methods Naïve Bayes Classifier, K-Nearest Neighbor (KNN), and Decision Tree. These three approaches were used to categorize netizens' comments on the use of technology after sentiment analysis and to compare the accuracy of the three methods. The resulting accuracies varied considerably: 100% for Naïve Bayes, 98.25% for KNN, and 62.28% for the decision tree.


2017 ◽  
Vol 2 (2) ◽  
pp. 103
Author(s):  
Devi Yunita

Comparison of the K-Nearest Neighbor and Decision Tree Algorithms for Car Ownership Credit Risk. Credit is a means by which a person or company can borrow capital or money and repay it within a specified period. For credit to be granted in line with its purpose, namely safely, a credit analysis needs to be carried out. Credit analysis is a study conducted to determine the feasibility of a credit case. This credit analysis study compares the K-Nearest Neighbor (K-NN) algorithm, a method that looks for the closeness between the criteria of a new case and the criteria of old cases and relies on the closest-matching cases, with the Decision Tree method, one of the classification techniques in data mining. The results, obtained with the RapidMiner application, show that the K-Nearest Neighbor (K-NN) algorithm has the better accuracy.
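The nearest-case idea behind K-NN described in this abstract can be sketched in a few lines of Python. This is a minimal illustration assuming Euclidean distance and majority voting. The two numeric features and the "good"/"bad" labels are hypothetical and do not come from the study.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training cases.

    `train` is a list of (feature_vector, label) pairs.
    """
    # Sort old cases by Euclidean distance to the new case and keep the k closest.
    nearest = sorted(train, key=lambda case: math.dist(case[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical old credit cases: two scaled features and a risk label.
old_cases = [
    ((1.0, 1.0), "good"),
    ((1.2, 0.8), "good"),
    ((0.9, 1.1), "good"),
    ((3.0, 3.2), "bad"),
    ((3.1, 2.9), "bad"),
    ((2.8, 3.0), "bad"),
]

print(knn_predict(old_cases, (1.1, 1.0)))
print(knn_predict(old_cases, (3.0, 3.0)))
```

In practice features are usually scaled to a common range first, since a distance-based method is otherwise dominated by the attribute with the largest numeric range.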


Educational data mining is a field that extracts knowledge from educational data. One of its applications is predicting student performance, which helps teachers identify students who need more support. This can potentially increase learning effectiveness and raise students' overall grades. There are various algorithms and optimization techniques for predicting student performance. In this paper, we use real data from one of Indonesia's public junior high schools to compare the naive Bayes, decision tree, and k-nearest neighbor algorithms, and we apply feature selection and parameter optimization to identify which combination of algorithm and optimization achieves the highest accuracy in predicting student grades, i.e. a 7-grade classification. The results show that k-NN achieves the highest accuracy, 77.36%, when both feature selection and parameter optimization are applied.


2020 ◽  
Vol 17 (1) ◽  
pp. 22-30
Author(s):  
Tiska Pattiasina ◽  
Didi Rosiyadi

Data mining is a series of processes for extracting added value from a database in the form of information that cannot be found manually. In education, data mining can be used to obtain information about student performance. In this study the researchers took samples from class XI (eleven) students at SMAN 3 Ambon and classified student performance based on thirteen attributes, namely: age, sex, school organization, extracurricular activities, pocket money, duration of study at home, duration of social media use, duration of online gaming, attendance, illness, permits, and semester 1 and semester 2 grades. The study used the KDD (Knowledge Discovery in Databases) method and three classification algorithms: decision tree, Naïve Bayes, and K-Nearest Neighbor. The models were then tested using k-fold cross validation.
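The k-fold cross validation step mentioned at the end of this abstract can be sketched as follows. This is a minimal illustration that only generates the train/test index sets; it assumes contiguous, unshuffled folds, whereas tools commonly shuffle or stratify the data first. The function name is hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Each record is used for testing exactly once and for training k - 1 times.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), test)
```

The reported score is then typically the average of the per-fold accuracies.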

