PENERAPAN ANALISIS SENTIMEN PADA PENGGUNA TWITTER MENGGUNAKAN METODE K-NEAREST NEIGHBOR

Akhmad Deviyanto; Muhammad Didik Rohmad Wahyudi

doi:10.14421/jiska.2018.31-01

PENERAPAN ANALISIS SENTIMEN PADA PENGGUNA TWITTER MENGGUNAKAN METODE K-NEAREST NEIGHBOR

JISKA (Jurnal Informatika Sunan Kalijaga) ◽

10.14421/jiska.2018.31-01 ◽

2018 ◽

Vol 3 (1) ◽

pp. 1 ◽

Cited By ~ 2

Author(s):

Akhmad Deviyanto ◽

Muhammad Didik Rohmad Wahyudi

Keyword(s):

Sentiment Analysis ◽

Nearest Neighbor ◽

Cosine Similarity ◽

K Nearest Neighbor ◽

Term Weighting ◽

K Nearest Neighbor Algorithm ◽

Cosine Similarity Measure ◽

Analysis System ◽

Test Result ◽

Python Package

Abstract: This research implements the KNN (K-Nearest Neighbor) algorithm for sentiment analysis of Twitter users on the topic of the 2017 Jakarta gubernatorial election (Pilkada DKI 2017). The data consist of 2000 Indonesian-language tweets collected during January 2017 using a Python package called Twitterscraper. The system uses KNN with TF-IDF term weighting and the cosine similarity measure to classify sentiment into two classes: positive and negative. In testing, the highest accuracy was 67.2% at k=5, the highest precision was 56.94% at k=5, and the highest recall was 78.24% at k=15. Keywords: K-Nearest Neighbor, Twitterscraper, TF-IDF, Cosine Similarity
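The pipeline named in the abstract above (TF-IDF term weighting, cosine similarity, KNN majority vote) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy English tokens, the labels, the `log(n/df) + 1` IDF variant, and all function names are assumptions made for illustration only.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Compute an IDF weight per term from tokenized training documents.
    The smoothing (log(n/df) + 1) is an assumed variant, not taken from the paper."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {term: math.log(n / count) + 1.0 for term, count in df.items()}

def vectorize(tokens, idf):
    """Sparse TF-IDF vector as a dict; terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(tokens).items() if t in idf}

def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_predict(train_vecs, train_labels, query_vec, k):
    """Majority label among the k training vectors most similar to the query."""
    ranked = sorted(zip(train_vecs, train_labels),
                    key=lambda pair: cosine(pair[0], query_vec),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data standing in for preprocessed tweets.
train_docs = [
    ["good", "great", "support"],
    ["great", "leader", "good"],
    ["bad", "corrupt", "liar"],
    ["bad", "fail", "corrupt"],
]
train_labels = ["positive", "positive", "negative", "negative"]

idf = fit_idf(train_docs)
train_vecs = [vectorize(doc, idf) for doc in train_docs]
predicted = knn_predict(train_vecs, train_labels, vectorize(["good", "leader"], idf), k=3)
print(predicted)
```

With k=3 the two positive documents share terms with the query and outvote the single negative neighbor, so the sketch prints `positive`.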

Sentiment Analysis about Large-Scale Social Restrictions in Social Media Twitter Using Algoritm K-Nearest Neighbor

Jurnal Online Informatika ◽

10.15575/join.v6i1.670 ◽

2021 ◽

Vol 6 (1) ◽

pp. 96

Author(s):

Ikhsan Romli ◽

Shanti Prameswari R ◽

Antika Zahrotul Kamalia

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Large Scale ◽

Nearest Neighbor ◽

Cosine Similarity ◽

Manhattan Distance ◽

K Nearest Neighbor ◽

Distance Calculation ◽

K Nearest Neighbor Algorithm ◽

Similarity Distance

Sentiment analysis is a form of data processing that identifies the topics people discuss and their sentiments toward those topics; the topic in this study is large-scale social restrictions (PSBB). The study classifies Indonesian-language tweets about PSBB into negative and positive sentiment using the K-Nearest Neighbor algorithm, and compares the accuracy obtained with three distance calculations: cosine similarity, Euclidean distance, and Manhattan distance. K-Nearest Neighbor reached 82% accuracy with cosine similarity at k = 3, 81% with Euclidean distance at k = 11, and 80% with Manhattan distance at k = 5, 7, 9, 11, and 13. In this study, K-Nearest Neighbor with cosine similarity therefore achieved the highest accuracy.
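For reference, the three measures compared in the abstract above can be written out as follows. This is a generic sketch of the standard definitions on dense vectors; the example vectors `u` and `v` are hypothetical and unrelated to the study's data.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; higher means more similar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def euclidean_distance(a, b):
    """Straight-line (L2) distance; lower means more similar."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan_distance(a, b):
    """Sum of absolute coordinate differences (L1); lower means more similar."""
    return sum(abs(x - y) for x, y in zip(a, b))

# Hypothetical vectors: v points in the same direction as u but is twice as long.
u = [1.0, 0.0, 2.0]
v = [2.0, 0.0, 4.0]

cos_uv = cosine_similarity(u, v)
euc_uv = euclidean_distance(u, v)
man_uv = manhattan_distance(u, v)
print(cos_uv, euc_uv, man_uv)
```

Because cosine similarity ignores vector length, `u` and `v` count as maximally similar under it while still being separated under the Euclidean and Manhattan distances. Note also that cosine is a similarity, so a KNN ranking sorts it in descending order, or uses `1 - cosine` as a distance.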

Song Recommendations Based on Artists with Cosine Similarity Algorithms and K-Nearest Neighbor

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2020.v08.i04.p01 ◽

2020 ◽

Vol 8 (4) ◽

pp. 367

Author(s):

Muhammad Arief Budiman ◽

Gst. Ayu Vida Mastrika Giri

Keyword(s):

Collaborative Filtering ◽

Mobile Phones ◽

Recommendation System ◽

Nearest Neighbor ◽

Cosine Similarity ◽

Music Recommendation ◽

K Nearest Neighbor ◽

Filtering Method ◽

K Nearest Neighbor Algorithm ◽

Music Recommendation System

The music industry is growing rapidly, with millions of works released by a wide range of artists. Technology has followed this growth: mobile phone applications such as Spotify, Joox, and GrooveShark offer music subscription services. Users increasingly turn to these application-based services to stream music, whether free or paid. This paper proposes a music recommendation system that recommends songs based on their similarity to artists the user likes or has listened to. The research uses the Collaborative Filtering method with Cosine Similarity and the K-Nearest Neighbor algorithm. The result is a system that can recommend songs based on artists who are related to one another.

Sentiment Analysis System for Myanmar News using K Nearest Neighbor and Naïve Bayes

Proceedings of 2020 the 10th International Workshop on Computer Science and Engineering ◽

10.18178/wcse.2020.02.001 ◽

2020 ◽

Keyword(s):

Sentiment Analysis ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

K Nearest Neighbor ◽

Analysis System

ASPECT BASED SENTIMENT ANALYSIS DATA KUESIONER DI RUMAH SAKIT MUHAMMADIYAH LAMONGAN MENGGUNAKAN ALGORITMA K-NN.

JOUTICA ◽

10.30736/jti.v6i2.677 ◽

2021 ◽

Vol 6 (2) ◽

pp. 506

Author(s):

Mustain

Keyword(s):

Vector Space ◽

Sentiment Analysis ◽

Nearest Neighbor ◽

Vector Space Model ◽

Analysis Data ◽

Cosine Similarity ◽

K Nearest Neighbor ◽

Space Model

The difficulty of organizing conventional questionnaire data motivated this research. A system was therefore built to group questionnaire data automatically, together with the sentiment the data contain. The dataset is questionnaire data from Muhammadiyah Lamongan Hospital, and the study handles only text-form questionnaire responses. Paper responses were tabulated and entered into a database along with their work-unit category and sentiment. The dataset then underwent pre-processing consisting of negation handling, case folding, tokenizing, filtering, and stemming. As test data, questionnaire comments are pre-processed in the same way, and their document similarity is then computed using the K-Nearest Neighbor method and the Vector Space Model. The amount of data handled affects system performance, particularly accuracy and speed during classification. The system outputs a ranking of the documents most similar to the dataset, ordered by cosine similarity. Classification testing by category class yielded an accuracy of 91%, testing by sentiment class yielded 94%, and the combination of the two yielded 86%.
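The pre-processing steps listed in the abstract above can be sketched roughly as below. This is an assumed illustration, not the paper's code: the stopword and negation lists are tiny hypothetical samples, negation handling is approximated by joining a negation word to the token that follows it, the order of the steps is a guess, and stemming is omitted because it would require an Indonesian stemmer.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
STOPWORDS = {"yang", "dan", "di", "ke", "dari"}
NEGATIONS = {"tidak", "bukan", "kurang"}

def preprocess(text):
    """Case folding, tokenizing, negation merging, and stopword filtering."""
    # Case folding and tokenizing: lowercase, then keep alphabetic runs only.
    tokens = re.findall(r"[a-z]+", text.lower())

    # Negation handling (assumed approach): merge "tidak ramah" into "tidak_ramah"
    # so the negated word is weighted as a term of its own.
    merged = []
    i = 0
    while i < len(tokens):
        if tokens[i] in NEGATIONS and i + 1 < len(tokens):
            merged.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1

    # Filtering: drop stopwords.
    return [t for t in merged if t not in STOPWORDS]

sample = "Pelayanan di ruang rawat TIDAK ramah dan lambat"
cleaned = preprocess(sample)
print(cleaned)
```

For the sample comment this yields `['pelayanan', 'ruang', 'rawat', 'tidak_ramah', 'lambat']`, which would then be passed on to term weighting.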

Twitter text mining for sentiment analysis on government’s response to forest fires with vader lexicon polarity detection and k-nearest neighbor algorithm

Journal of Physics Conference Series ◽

10.1088/1742-6596/1567/3/032024 ◽

2020 ◽

Vol 1567 ◽

pp. 032024

Author(s):

T Mustaqim ◽

K Umam ◽

M A Muslim

Keyword(s):

Text Mining ◽

Sentiment Analysis ◽

Forest Fires ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Implementasi Algoritma K-Nearest Neighbor untuk Melakukan Klasifikasi Produk dari beberapa E-marketplace

Jurnal Teknik Informatika dan Sistem Informasi ◽

10.28932/jutisi.v5i1.1581 ◽

2019 ◽

Vol 5 (1) ◽

Author(s):

Danny Sebastian

Keyword(s):

Nearest Neighbor ◽

Cosine Similarity ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Product Data ◽

K Nearest Neighbor Algorithm ◽

Similarity Distance ◽

Brand Product

E-marketplaces have gained popularity in Indonesian society, increasing the number of products offered. Consequently, customers need more effort to search for products. In this study, we classified products from several e-marketplaces. The classification used the TF-IDF method for weighting, cosine similarity to calculate the similarity distance between products, and the k-nearest neighbor algorithm. In the first test, using 150 product data, the k-nearest neighbor method with k=5 correctly classified 146 data and assigned 4 to the wrong class. This value of k=5 gave the best result for this case, with an accuracy of 97.33%. In the second test, using 150 mixed-brand product data, the k-nearest neighbor method correctly classified 145 data and assigned 5 to the wrong class, for an accuracy of 96.67%.
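The two accuracy figures in the abstract above follow directly from the reported counts, as this small check shows. The helper name `accuracy_percent` is introduced here for illustration.

```python
def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to two decimal places."""
    return round(100.0 * correct / total, 2)

# Counts reported in the abstract: 146 of 150 and 145 of 150 classified correctly.
first_test = accuracy_percent(146, 150)
second_test = accuracy_percent(145, 150)
print(first_test, second_test)
```

The results match the reported 97.33% and 96.67%.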

Ensemble of Classifiers and Term Weighting Schemes for Sentiment Analysis in Turkish

10.52460/src.2021.004 ◽

2021 ◽

Vol 1 (1) ◽

pp. 1-12

Author(s):

Aytuğ Onan ◽

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Nearest Neighbor ◽

Text Messages ◽

Support Vector ◽

K Nearest Neighbor ◽

Term Weighting ◽

Text Documents ◽

Weighting Schemes ◽

Short Text

With the advancement of information and communication technology, social networking and microblogging sites have become a vital source of information. Individuals can express their opinions, grievances, feelings, and attitudes about a variety of topics. Through microblogging platforms, they can express their opinions on current events and products. Sentiment analysis is a significant area of research in natural language processing because it aims to define the orientation of the sentiment contained in source materials. Twitter is one of the most popular microblogging sites on the internet, with millions of users daily publishing over one hundred million text messages (referred to as tweets). Choosing an appropriate term representation scheme for short text messages is critical. Term weighting schemes are critical representation schemes for text documents in the vector space model. We present a comprehensive analysis of Turkish sentiment analysis using nine supervised and unsupervised term weighting schemes in this paper. The predictive efficiency of term weighting schemes is investigated using four supervised learning algorithms (Naive Bayes, support vector machines, the k-nearest neighbor algorithm, and logistic regression) and three ensemble learning methods (AdaBoost, Bagging, and Random Subspace). The empirical evidence suggests that supervised term weighting models can outperform unsupervised term weighting models.

N-Gram and K-Nearest Neighbor Algorithm for Sentiment Analysis on Capital Relocation

10.1109/citsm52892.2021.9587919 ◽

2021 ◽

Author(s):

Muhammad Ilham Ramadhon ◽

Arini Arini ◽

Fitri Mintarsih ◽

Iik Muhamad Malik Matin

Keyword(s):

Sentiment Analysis ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

N Gram

KLASIFIKASI SENTIMENT ANALYSIS PADA KOMENTAR PESERTA DIKLAT MENGGUNAKAN METODE K-NEAREST NEIGHBOR

Kilat ◽

10.33322/kilat.v8i1.421 ◽

2019 ◽

Vol 8 (1) ◽

Author(s):

Riki Ruli A. Siregar ◽

Zuhdiyyah Ulfah Siregar ◽

Rakhmat Arianto

Keyword(s):

Sentiment Analysis ◽

Test Data ◽

Nearest Neighbor ◽

Cosine Similarity ◽

Training Data ◽

K Nearest Neighbor ◽

Term Frequency ◽

Document Frequency ◽

Negative Comments

Analyzing and classifying comment data by reading the comments one at a time, sorting out the negative ones, and classifying them manually in Ms. Excel is not effective when large amounts of data must be processed. This study therefore applies sentiment analysis to comment data using the K-Nearest Neighbor (KNN) method. The data are comments written by participants in training sessions at Udiklat Jakarta. The comments are pre-processed, the words are weighted using Term Frequency-Inverse Document Frequency, and the similarity between training data and test data is computed with cosine similarity. Sentiment analysis determines whether each comment is positive or negative. The comments are then classified into four categories, namely: instructors, materials, facilities and infrastructure. The resulting system classifies comment data automatically with an accuracy of 94.23%.

Klasifikasi Berita Politik Menggunakan Algoritma K-nearst Neighbor

BERKALA SAINSTEK ◽

10.19184/bst.v6i2.9256 ◽

2018 ◽

Vol 6 (2) ◽

pp. 106

Author(s):

Difari Afreyna Fauziah ◽

Achmad Maududie ◽

Ifrina Nuritha

Keyword(s):

Nearest Neighbor ◽

Confusion Matrix ◽

Cosine Similarity ◽

K Nearest Neighbor ◽

Term Weighting ◽

F Measure

Classifying political news content with the K-Nearest Neighbor algorithm is a process of sorting political news into three more specific subcategories: regional elections (pilkada), the mass organizations law (UU ORMAS), and cabinet reshuffles. K-Nearest Neighbor is a classification approach that looks for the training data most similar to, or at the smallest distance from, the test data. It was chosen because it is simple: it takes the majority category among the K nearest items, where K is fixed beforehand. The K values used in this study are K=3, K=5, K=7, and K=9. The classification system begins with preprocessing. News entered into the system passes through four preprocessing stages: case folding, tokenizing, stopword removal, and stemming. Next comes term weighting, the process of assigning a value to each term extracted during preprocessing; this study uses the TF-IDF algorithm for weighting. Once the term weights are obtained, the distance between documents is computed using cosine similarity. The training data are then sorted by the computed distance, and the K closest items are taken from the sorted result. The aim of the research is a system that can apply the KNN algorithm to documents with high similarity. Three tests were run, using three different dataset variations and four K values. The best accuracy was obtained at K=9, with precision of 100%, recall of 100%, and f-measure of 100%. Keywords: classification, K-Nearest Neighbor algorithm, TF-IDF, cosine similarity, confusion matrix.
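The precision, recall, and f-measure reported in the abstract above are derived from confusion-matrix counts. The sketch below shows the standard definitions for a single class; the counts passed in are hypothetical, since the abstract does not give the underlying confusion matrix.

```python
def precision_recall_f1(tp, fp, fn):
    """Standard per-class metrics from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts: a class with no misclassifications scores 100% on all three.
perfect = precision_recall_f1(tp=10, fp=0, fn=0)

# Hypothetical counts with some errors, for contrast.
imperfect = precision_recall_f1(tp=8, fp=2, fn=2)
print(perfect, imperfect)
```

A result of 100% on all three metrics, as reported at K=9, corresponds to a confusion matrix with no false positives and no false negatives for the evaluated classes.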
