Classification of YouTube Data based on Opinion Mining

Index of suicide risk in Mexico using Twitter

Journal of Social Researches ◽

10.35429/jsr.2019.15.5.1.13 ◽

2019 ◽

pp. 1-13

Author(s):

Luz Judith Rodríguez-Esparza ◽

Diana Barraza-Barraza ◽

Jesús Salazar-Ibarra ◽

Rafael Gerardo Vargas-Pasaye

Keyword(s):

Social Networks ◽

Analytic Hierarchy Process ◽

Suicide Risk ◽

Opinion Mining ◽

Real Data ◽

Analytic Hierarchy ◽

Specialized Care ◽

The Analytic Hierarchy Process ◽

Hierarchy Process

Objectives: To identify early signs of suicide risk in depressive subjects so that specialized care can be provided. Various studies have examined expressions on social networks, where users pour out their emotions, to determine whether they show signs of depression. However, these studies have neglected to quantify the risk of suicide. This article therefore proposes a new index for identifying suicide risk in Mexico. Methodology: The proposed index is constructed through opinion mining of Twitter data and the Analytic Hierarchy Process. Contribution: Using the R statistical package, a study on real data is presented in which people are classified according to the obtained index and information from psychologists. The proposed methodology represents an innovative alternative for suicide prevention.
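The abstract names the Analytic Hierarchy Process but does not give the criteria or comparison values used. As a minimal sketch only, the following shows how AHP priority weights can be approximated from a pairwise comparison matrix with the row geometric-mean method and then combined into a weighted index. The matrix, the three criteria, and the per-criterion scores are all hypothetical and are not taken from the paper, which used R rather than Python.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def weighted_index(weights, scores):
    """Weighted sum of per-criterion scores, each assumed to lie in [0, 1]."""
    return sum(w * s for w, s in zip(weights, scores))

# Hypothetical reciprocal comparison matrix for three made-up criteria:
# entry [i][j] says how much more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(pairwise)

# Hypothetical normalized scores for one user on the three criteria.
index = weighted_index(weights, [0.8, 0.4, 0.1])
print([round(w, 3) for w in weights], round(index, 3))
```

Because this example matrix is perfectly consistent, the geometric-mean weights coincide with the principal-eigenvector weights; for inconsistent judgments the two can differ slightly, and a consistency check would normally be added.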


Polarity Classification of Arabic Sentiments

International Journal of Information Technology and Web Engineering ◽

10.4018/ijitwe.2016070103 ◽

2016 ◽

Vol 11 (3) ◽

pp. 32-49 ◽

Cited By ~ 5

Author(s):

Mohammed N. Al-Kabi ◽

Heider A. Wahsheh ◽

Izzat M. Alsmadi

Keyword(s):

Social Media ◽

Sentiment Analysis ◽

Operating Characteristic ◽

Opinion Mining ◽

Online Social Network ◽

The Social ◽

Polarity Classification ◽

Arabic Sentiment Analysis ◽

Modern Standard

Sentiment analysis, or opinion mining, is associated with social media and usually aims to automatically identify the polarity of social media users' points of view on different aspects of life. The polarity of a sentiment reflects its author's point of view on a certain issue. This study presents a new method for identifying the polarity of Arabic reviews and comments, whether they are written in Modern Standard Arabic (MSA) or one of the Arabic dialects, and whether or not they include emoticons. The proposed method is called Detection of Arabic Sentiment Analysis Polarity (DASAP). A modest dataset of Arabic comments, posts, and reviews was collected from online social network websites (Facebook, blogs, YouTube, and Twitter) and used to evaluate the effectiveness of DASAP with Receiver Operating Characteristic (ROC) prediction quality measurements.
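The abstract reports ROC-based evaluation without further detail. As an illustration of what such a measurement involves, the sketch below computes the area under the ROC curve as the fraction of (positive, negative) pairs that a classifier's scores rank correctly. The function name, labels, and scores are assumptions for the example and do not come from the DASAP study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier confidence that each item is positive.
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up polarity labels and scores for four comments.
example_auc = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1])
print(example_auc)
```

The pairwise loop is quadratic in the number of examples, which is acceptable for a modest dataset; a rank-based formulation would scale better.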


Classification of Fake Product Ratings Using a Timeline Based Approach

International Journal of Business Administration and Management Research ◽

10.24178/ijbamr.2017.3.2.12 ◽

2017 ◽

Vol 3 (2) ◽

pp. 12 ◽

Cited By ~ 1

Author(s):

Neha Thomas ◽

Susan Elias

Keyword(s):

Language Processing ◽

Opinion Mining ◽

Optimal Point ◽

Linear Classifiers ◽

Wide Range ◽

Text Content ◽

Classification Tool ◽

Fake Reviews ◽

Product Ratings

Detection of fake reviews and reviewers is currently a challenging problem in cyberspace, primarily because the methods used to fake reviews change constantly. Several aspects must be considered when analyzing reviews in order to classify them effectively as genuine or fake. Sentiment analysis, opinion mining, and intent mining are fields of research that pursue this goal through natural language processing of the review text. This paper presents an approach that uses review ratings evaluated along a timeline. An Amazon dataset comprising ratings for a wide range of products was used for the analysis, which was carried out for an electronic product over a period of six years. The computed average rating helps to identify linear classifiers that define solution boundaries within the data space. This enables product-specific classification of review ratings, and suitable recommendations can also be generated automatically. The paper explains a methodology for evaluating average product ratings over time and presents the research outcomes using a novel classification tool. The proposed approach helps to determine the optimal point for distinguishing between fake and genuine ratings for each product. Index Terms: Fake Reviews, Fake Ratings, Product Ratings, Online Shopping, Amazon Dataset.


Classification of Opinion Mining Techniques

International Journal of Computer Applications ◽

10.5120/8948-3122 ◽

2012 ◽

Vol 56 (13) ◽

pp. 1-6 ◽

Cited By ~ 14

Author(s):

Nidhi Mishra ◽

C. K. Jha

Keyword(s):

Opinion Mining


A gradient boosted decision tree-based sentiment classification of twitter data

International Journal of Wavelets Multiresolution and Information Processing ◽

10.1142/s0219691320500277 ◽

2020 ◽

Vol 18 (04) ◽

pp. 2050027

Author(s):

S. Neelakandan ◽

D. Paulraj

Keyword(s):

Decision Tree ◽

Opinion Mining ◽

Research Topic ◽

Sentiment Classification ◽

Decision Tree Classifier ◽

Twitter Data ◽

Tree Classifier ◽

Boosted Decision Tree ◽

Text Sentiment Analysis

People communicate their views, arguments, and emotions about everyday life on social media (SM) platforms such as Twitter and Facebook. Twitter is an international micro-blogging service built around brief messages called tweets. Freestyle writing, incorrect grammar, typographical errors, and abbreviations are some of the noise that occurs in this text. Sentiment analysis (SA) of users' tweets and opinion mining (OM) of customer reviews are well-known research topics. Texts gathered from users' tweets are analyzed by means of OM and automatic SA using a ternary classification: positive, neutral, and negative. Ascertaining sentiment in Twitter data is very challenging for researchers because of its limited length, misspellings, unstructured nature, abbreviations, and slang. This paper proposes an efficient SA and sentiment classification (SC) of Twitter data using the Gradient Boosted Decision Tree (GBDT) classifier. First, the Twitter data undergoes pre-processing. Next, the pre-processed data is processed using HDFS MapReduce. Features are then extracted from the processed data, and efficient features are selected using the Improved Elephant Herd Optimization (I-EHO) technique. Score values are calculated for each selected feature and passed to the classifier. Finally, the GBDT classifier classifies the data as negative, positive, or neutral. Experimental results are analyzed and compared with other conventional techniques to show the superior performance of the proposed method.


Opinion Mining and Classification of Music Lyrics Using Supervised Learning Algorithms

2018 First International Conference on Secure Cyber Computing and Communication (ICSCCC) ◽

10.1109/icsccc.2018.8703292 ◽

2018 ◽

Author(s):

Mahesh Ahuja ◽

A. L. Sangal

Keyword(s):

Supervised Learning ◽

Opinion Mining ◽

Learning Algorithms ◽

Supervised Learning Algorithms ◽

Music Lyrics


Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media provides useful information, especially for migrants and tourists; it is also valuable to food stall and restaurant owners as input for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the accuracy of classifying consumer ratings of culinary food is still not optimal, because the value weights in the data preprocessing stage are not optimal. This paper proposes a hybrid feature selection model that addresses the suboptimal selection of feature attributes by combining information gain (IG) and a genetic algorithm (GA). Experiments comparing the proposed model with other algorithms showed that it produced the best accuracy, 93%.
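Information gain, one half of the hybrid feature selection described in the abstract, can be illustrated with a short sketch. The code below scores a term by how much knowing whether it appears in a review reduces the entropy of the class labels. The toy reviews, labels, and function names are assumptions for the example; the genetic algorithm stage and the authors' actual preprocessing are not reproduced.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy reduction from splitting documents on presence of `term`.

    docs: list of token sets, one per document.
    """
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part)
                      for part in (with_term, without_term) if part)
    return entropy(labels) - conditional

# Made-up tokenized reviews and sentiment labels.
docs = [{"tasty", "cheap"}, {"tasty", "fresh"},
        {"bland", "cheap"}, {"bland", "slow"}]
labels = ["pos", "pos", "neg", "neg"]

vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
print(ranked[:2])
```

Terms with the highest scores would be kept as features for the classifier; in this toy data "tasty" and "bland" separate the classes perfectly, while "cheap" carries no information.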


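Analisis Sentimen Sistem E-Tilang Menggunakan Algoritma Naive Bayes Dengan Optimalisasi Information Gain [Sentiment Analysis of the E-Tilang System Using the Naive Bayes Algorithm with Information Gain Optimization]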

Journal of Informatic and Information Security ◽

10.31599/jiforty.v1i1.137 ◽

2020 ◽

Vol 1 (1) ◽

pp. 19-26

Author(s):

Rakhmi Khalida ◽

Siti Setiawati

Keyword(s):

Sentiment Analysis ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Traffic Violations ◽

The Government ◽

Bayes Algorithm ◽

User Friendly

The Government of Indonesia has taken steps to improve public services for traffic violations by implementing the e-ticketing (e-Tilang) system. The system is intended to discipline motorists who commit traffic violations. E-ticketing is also a way to prevent misconduct by law enforcers, from illegal levies and on-the-spot "peace" settlements to accountability for fine payments. This study performs sentiment analysis, or opinion mining, on the e-ticketing system to classify public comments as positive, negative, or neutral. Twitter is used as the source of opinions because it is user friendly, covers current topics, and provides open access to tweets. Opinions on Twitter are collected and preprocessed; information gain feature selection then helps reduce noise caused by irrelevant labels; finally, sentiments are classified with the Naïve Bayes algorithm to determine sentiment polarity. The study achieved an accuracy of 41.82%, a precision of 50.51%, and a recall of 45.45%. Keywords: Sentiment Analysis, E-ticketing, Information Gain, Naive Bayes
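The abstract reports accuracy, precision, and recall for a three-class problem but does not say how the per-class values were averaged. The sketch below shows one common convention, macro averaging, on made-up labels; the averaging choice, function name, and data are assumptions and may not match the study.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) for multi-class labels."""
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precisions, recalls = [], []
    for c in classes:
        true_pos = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        predicted = sum(p == c for p in y_pred)   # items labeled c by the model
        actual = sum(t == c for t in y_true)      # items that really are c
        precisions.append(true_pos / predicted if predicted else 0.0)
        recalls.append(true_pos / actual if actual else 0.0)
    return (accuracy,
            sum(precisions) / len(classes),
            sum(recalls) / len(classes))

# Made-up gold labels and predictions for six tweets.
y_true = ["pos", "pos", "neg", "neg", "neu", "neu"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]
accuracy, precision, recall = evaluate(y_true, y_pred)
print(round(accuracy, 4), round(precision, 4), round(recall, 4))
```

Macro averaging weights every class equally, so precision and recall can sit above or below accuracy when class sizes or error patterns are uneven, as they do in the figures reported above.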


Evolution of hybrid distance based kNN classification

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v10.i2.pp510-518 ◽

2021 ◽

Vol 10 (2) ◽

pp. 510

Author(s):

N. Suresh Kumar ◽

Pothina Praveena

Keyword(s):

Opinion Mining ◽

Linear Models ◽

Data Sets ◽

Review Analysis ◽

Average Accuracy ◽

Non Linear ◽

Distance Weighted ◽

Hybrid Distance ◽

Linear Algorithms

The classification of opinion mining and user review analysis has evolved over decades, reaching into ubiquitous computing in efforts such as movie review analysis. The performance of linear and non-linear models in classifying positive and negative reviews in movie data sets is discussed. The effectiveness of linear and non-linear algorithms is tested and compared in terms of average accuracy by implementing them on the Internet Movie Database (IMDB). The hybrid kNN model optimizes classification performance in terms of accuracy. The accuracy of polarity prediction is improved with random-distance-weighted-kNN-ABC compared with the kNN algorithm applied alone.
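The abstract does not define the random-distance-weighted-kNN-ABC variant, so it is not reproduced here. As a simpler illustration of the distance-weighting idea alone, the sketch below implements plain distance-weighted kNN, in which each of the k nearest neighbors votes with weight inversely proportional to its distance. The feature vectors, labels, and function name are hypothetical.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_x, train_y, query, k=3):
    """Predict a label by inverse-distance-weighted vote of the k nearest points."""
    neighbors = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    votes = defaultdict(float)
    for distance, label in neighbors:
        # Small constant avoids division by zero for exact matches.
        votes[label] += 1.0 / (distance + 1e-9)
    return max(votes, key=votes.get)

# Made-up 2-D feature vectors for three reviews.
train_x = [(0.0, 0.0), (3.0, 0.0), (3.0, 1.0)]
train_y = ["pos", "neg", "neg"]

prediction = weighted_knn_predict(train_x, train_y, (0.1, 0.0), k=3)
print(prediction)
```

With k=3 an unweighted majority vote would pick "neg" here (two of three neighbors), whereas the weighted vote follows the much closer "pos" point, which is the behavior distance weighting is meant to provide.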


Opinion Mining with SentiWordNet

Knowledge Discovery Practices and Emerging Applications of Data Mining - Advances in Data Mining and Database Management ◽

10.4018/978-1-60960-067-9.ch013 ◽

2010 ◽

pp. 266-286 ◽

Cited By ~ 1

Author(s):

Bruno Ohana ◽

Brendan Tierney

Keyword(s):

Opinion Mining ◽

Sentiment Classification ◽

The Novel ◽

Data Set ◽

Learning Classifier ◽

Novel Approach ◽

Supervised Methods ◽

Classification Tasks ◽

Film Reviews

Opinion mining is an emerging field of research concerned with applying computational methods to the treatment of subjectivity in text, with applications in fields such as recommendation systems, contextual advertising, and business intelligence. In this chapter the authors survey the area of opinion mining and discuss SentiWordNet, a lexicon of sentiment information for terms derived from WordNet. They also present the results of their research in applying this lexicon to sentiment classification of film reviews, along with a novel approach that leverages opinion lexicons to build a data set of features used as input to a supervised learning classifier. The results obtained are in line with other experiments based on manually built opinion lexicons, with further improvements obtained by using the novel approach, and indicate that lexicons built using semi-supervised methods, such as SentiWordNet, can be an important resource in sentiment classification tasks. Considerations on future improvements are also presented, based on a detailed analysis of classification results.
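Lexicon-based scoring of the kind the abstract describes can be sketched in a few lines. The example below sums positive and negative term scores over a review, uses the difference to assign a polarity, and exposes the same sums as features that could feed a supervised classifier. The mini-lexicon and its scores are invented for illustration and are not SentiWordNet entries; the authors' actual feature set is not described in the abstract.

```python
# Hypothetical mini-lexicon: term -> (positive score, negative score).
# Values are made up for illustration and are not taken from SentiWordNet.
TOY_LEXICON = {
    "brilliant": (0.875, 0.0),
    "enjoyable": (0.625, 0.0),
    "dull": (0.0, 0.75),
    "awful": (0.0, 0.875),
}

def lexicon_features(tokens, lexicon=TOY_LEXICON):
    """Aggregate lexicon scores over a tokenized review."""
    pos = sum(lexicon.get(t, (0.0, 0.0))[0] for t in tokens)
    neg = sum(lexicon.get(t, (0.0, 0.0))[1] for t in tokens)
    return {"pos_sum": pos, "neg_sum": neg, "diff": pos - neg}

def lexicon_polarity(tokens, lexicon=TOY_LEXICON):
    """Classify by the sign of the positive-minus-negative score."""
    diff = lexicon_features(tokens, lexicon)["diff"]
    if diff > 0:
        return "positive"
    if diff < 0:
        return "negative"
    return "neutral"

review = "a dull plot and awful pacing".split()
features = lexicon_features(review)
print(features, lexicon_polarity(review))
```

Terms missing from the lexicon contribute nothing, so coverage of the lexicon directly limits what this kind of scorer can detect.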
