Sentiment Analysis of Persian Movie Reviews Using Deep Learning

Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 596
Author(s):  
Kia Dashtipour ◽  
Mandar Gogate ◽  
Ahsan Adeel ◽  
Hadi Larijani ◽  
Amir Hussain

Sentiment analysis aims to automatically classify the subject’s sentiment (e.g., positive, negative, or neutral) towards a particular aspect such as a topic, product, movie, or news item. Deep learning has recently emerged as a powerful machine learning technique to meet the growing demand for accurate sentiment analysis. However, most research efforts are devoted to English only, while information of great importance is also available in other languages. This paper presents a novel, context-aware, deep-learning-driven Persian sentiment analysis approach. Specifically, the proposed deep-learning-driven automated feature-engineering approach classifies Persian movie reviews as having positive or negative sentiment. Two deep learning algorithms, convolutional neural networks (CNN) and long short-term memory (LSTM), are applied and compared with our previously proposed manual-feature-engineering-driven, SVM-based approach. Simulation results demonstrate that LSTM outperformed the multilayer perceptron (MLP), autoencoder, support vector machine (SVM), logistic regression, and CNN algorithms.

2021 ◽  
pp. 016555152110065
Author(s):  
Rahma Alahmary ◽  
Hmood Al-Dossari

Sentiment analysis (SA) aims to extract users’ opinions automatically from their posts and comments. Almost all prior works have used machine learning algorithms. Recently, SA research has shown promising performance using the deep learning approach. However, deep learning is data-hungry and requires large datasets to learn from, so data annotation takes more time. In this research, we propose a semiautomatic approach using Naïve Bayes (NB) to annotate a new dataset in order to reduce the human effort and time spent on the annotation process. We created a dataset for training and testing the classifier by collecting Saudi dialect tweets. The accuracy achieved by the NB classifier was 83%. The trained semiautomatic model was then used to annotate the new dataset before it was fed into the deep learning classifiers, which were trained and tested on it to perform Saudi dialect SA. The three deep learning classifiers tested in this research were convolutional neural network (CNN), long short-term memory (LSTM) and bidirectional long short-term memory (Bi-LSTM). Support vector machine (SVM) was used as the baseline for comparison. Overall, the performance of the deep learning classifiers exceeded that of SVM. CNN reported the highest performance, Bi-LSTM outperformed both LSTM and SVM, and LSTM in turn outperformed SVM. The proposed semiautomatic annotation approach is usable and promises to save time and effort in the annotation process.
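The semiautomatic annotation idea in this abstract can be illustrated with a minimal sketch. It assumes a multinomial Naïve Bayes model with add-one smoothing and whitespace tokenization; the function names, the toy English sentences, and the 0.6 confidence threshold are hypothetical and are not taken from the paper, which worked with Saudi dialect tweets.

```python
import math
from collections import Counter, defaultdict


def train_nb(labeled):
    """Fit a multinomial Naive Bayes model on (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled:
        class_counts[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict_proba(model, text):
    """Return {label: posterior probability}, using add-one smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to normalized probabilities in a stable way.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    norm = sum(exp_scores.values())
    return {k: v / norm for k, v in exp_scores.items()}


def annotate(model, texts, threshold=0.6):
    """Auto-label confident predictions; route the rest to human annotators."""
    auto, manual = [], []
    for text in texts:
        probs = predict_proba(model, text)
        label = max(probs, key=probs.get)
        if probs[label] >= threshold:
            auto.append((text, label))
        else:
            manual.append(text)
    return auto, manual


# Hypothetical seed set labeled by hand, then a few unlabeled posts.
seed = [
    ("great phone love it", "pos"),
    ("love the great screen", "pos"),
    ("terrible battery hate it", "neg"),
    ("hate the terrible camera", "neg"),
]
model = train_nb(seed)
auto, manual = annotate(
    model,
    ["love this great camera", "hate this terrible screen", "the it"],
)
```

In this sketch the automatically labeled pairs in `auto` would feed the downstream deep learning classifiers, while the low-confidence texts in `manual` are left for human review.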


Author(s):  
Ralph Sherwin A. Corpuz ◽  

Analyzing natural language-based Customer Satisfaction (CS) is a tedious process. This is particularly true if one is to manually categorize large datasets. Fortunately, the advent of supervised machine learning techniques has paved the way toward the design of efficient categorization systems for CS. This paper presents the feasibility of designing a text categorization model using two popular and robust algorithms, the Support Vector Machine (SVM) and the Long Short-Term Memory (LSTM) neural network, in order to automatically categorize complaints, suggestions, feedback, and commendations. The study found that, in terms of training accuracy, SVM has a best rating of 98.63% while LSTM has a best rating of 99.32%. These results mean that the SVM and LSTM algorithms are on par with each other in terms of training accuracy, but SVM is faster than LSTM by approximately 35.47 s. The training performance of both algorithms is attributed to the limitations of the dataset size, the high dimensionality of both the English and Tagalog languages, and the applicability of the feature engineering techniques used. Interestingly, based on the results of actual implementation, both algorithms are found to be 100% effective in accurately predicting the correct CS categories. Hence, the preference between the two algorithms comes down to the available dataset and the skill in optimizing these algorithms through feature engineering techniques and in implementing them in actual text categorization applications.


Computers ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 4 ◽  
Author(s):  
Jurgita Kapočiūtė-Dzikienė ◽  
Robertas Damaševičius ◽  
Marcin Woźniak

We describe the sentiment analysis experiments that were performed on the Lithuanian Internet comment dataset using traditional machine learning (Naïve Bayes Multinomial, NBM, and Support Vector Machine, SVM) and deep learning (Long Short-Term Memory, LSTM, and Convolutional Neural Network, CNN) approaches. The traditional machine learning techniques were used with features based on lexical, morphological, and character information. The deep learning approaches were applied on top of two types of word embeddings (Word2Vec continuous bag-of-words with negative sampling and FastText). Both traditional and deep learning approaches had to solve the positive/negative/neutral sentiment classification task on the balanced and full dataset versions. The best deep learning results (reaching an accuracy of 0.706) were achieved on the full dataset with CNN applied on top of the FastText embeddings, replaced emoticons, and eliminated diacritics. The traditional machine learning approaches demonstrated the best performance (an accuracy of 0.735) on the full dataset with the NBM method, replaced emoticons, restored diacritics, and lemma unigrams as features. Although traditional machine learning approaches were superior to the deep learning methods, deep learning demonstrated good results when applied to the small datasets.
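The two preprocessing steps named in this abstract, replacing emoticons and eliminating diacritics, can be sketched as follows. This is only an illustration: the emoticon-to-token mapping and the function names are hypothetical, and the abstract does not say how the authors implemented either step. Diacritics are removed here by Unicode decomposition, which is one common approach.

```python
import unicodedata

# Hypothetical mapping; the paper's actual emoticon inventory is not given.
EMOTICON_TOKENS = {":)": "emo_pos", ":(": "emo_neg"}


def strip_diacritics(text):
    """Drop combining marks so that, for example, 'š' becomes 's'."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", stripped)


def replace_emoticons(text):
    """Swap whitespace-separated emoticons for placeholder word tokens."""
    return " ".join(EMOTICON_TOKENS.get(tok, tok) for tok in text.split())


def preprocess(text):
    """Apply emoticon replacement first, then diacritic elimination."""
    return strip_diacritics(replace_emoticons(text))


cleaned = preprocess("Labai gražus filmas :)")
```

Replacing emoticons with word-like tokens lets an embedding model treat them as ordinary vocabulary items, while stripping diacritics merges spelling variants that commenters often type without accents.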


2018 ◽  
Vol 7 (2.27) ◽  
pp. 88 ◽  
Author(s):  
Merin Thomas ◽  
Latha C.A

Sentiment analysis has been an important topic of discussion for two decades, since Lee published his first paper on sentiment analysis in 2002. Beyond English, sentiment analysis has spread to other natural languages, which are highly significant in a multilingual country like India. Traditional machine learning approaches have delivered improved accuracy for the analysis. Deep learning approaches have gained momentum in sentiment analysis in recent years. Deep learning mimics human learning, so it is expected to reach higher levels of accuracy. In this paper, we implement sentiment analysis of tweets in the South Indian language Malayalam. The model used is a Long Short-Term Memory recurrent neural network, a deep learning technique, to predict sentiment. The achieved accuracy was found to increase with the quality and depth of the datasets.


2021 ◽  
Vol 10 (11) ◽  
pp. e33101119347
Author(s):  
Ewethon Dyego de Araujo Batista ◽  
Wellington Candeia de Araújo ◽  
Romeryto Vieira Lira ◽  
Laryssa Izabel de Araujo Batista

Introduction: dengue is an arboviral disease caused by the DENV virus and transmitted to humans by the Aedes aegypti mosquito. Currently, there is no vaccine effective against all serotypes of the virus. Consequently, efforts to combat the disease focus on preventive measures against the proliferation of the mosquito. Researchers are using Machine Learning (ML) and Deep Learning (DL) as tools to predict dengue cases and help governments in this effort. Objective: to identify which ML and DL techniques and approaches are being used to predict dengue. Methods: a systematic review of databases in the fields of Medicine and Computing, conducted to answer the following research questions: is it possible to predict dengue cases using ML and DL techniques, which techniques are used, where are the studies being carried out, and how and which data are being used? Results: after running the searches, applying the inclusion and exclusion criteria, and reading the papers in depth, 14 articles were accepted. The Random Forest (RF), Support Vector Regression (SVR), and Long Short-Term Memory (LSTM) techniques appear in 85% of the studies. Regarding the data, most studies used 10 years of historical disease data together with climate information. Finally, Root Mean Squared Error (RMSE) was the preferred measure of error. Conclusion: the review showed that ML and DL techniques are viable for predicting dengue cases, with a low error rate and validation through statistical techniques.
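For readers unfamiliar with the error measure mentioned in this abstract, the following minimal sketch computes RMSE, taken here in its usual sense of root mean squared error. The weekly case counts and forecasts are made-up numbers for illustration only and do not come from any of the reviewed studies.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical weekly dengue case counts and a model's forecasts.
weekly_cases = [10, 12, 15, 20]
forecast = [12, 12, 14, 16]
error = rmse(weekly_cases, forecast)
```

Because the differences are squared before averaging, a single large miss (such as the last week in this toy data) weighs more heavily than several small ones.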


2021 ◽  
Vol 8 (1) ◽  
pp. 64
Author(s):  
Dedi Tri Hermanto ◽  
Arief Setyanto ◽  
Emha Taufiq Luthfi

Online media produce many kinds of news, including economics, politics, health, sports, and science. Among these, economics is one of the most interesting topics to discuss. The economy has a direct impact on citizens, companies, and even traditional markets, depending on the economic conditions in a country. The sentiment contained in the news can influence people's views on a matter or on government policy. Economics is an interesting topic for research because it has a direct impact on Indonesian society. However, there are still few studies that apply deep learning methods, namely Long Short-Term Memory (LSTM) and CNN, to sentiment analysis of finance articles in Indonesia. This study aims to classify Indonesian-language news headlines by positive and negative sentiment using the LSTM, LSTM-CNN, and CNN-LSTM methods. The dataset consists of Indonesian-language article titles taken from the Detik Finance website. The test results show that the LSTM, LSTM-CNN, and CNN-LSTM methods achieve accuracies of 62%, 65%, and 74%, respectively.
Keywords: LSTM, sentiment analysis, CNN


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Najla M. Alharbi ◽  
Norah S. Alghamdi ◽  
Eman H. Alkhammash ◽  
Jehad F. Al Amri

Consumer feedback is highly valuable to businesses for assessing their performance and is also beneficial to customers, as it gives them an idea of what to expect from new products. In this research, the aim is to evaluate different deep learning approaches to accurately predict the opinion of customers based on mobile phone reviews obtained from Amazon.com. The prediction is based on analysing these reviews and categorizing them as positive, negative, or neutral. Different deep learning algorithms have been implemented and evaluated, namely simple RNN and its four variants: Long Short-Term Memory Networks (LRNN), Group Long Short-Term Memory Networks (GLRNN), gated recurrent unit (GRNN), and update recurrent unit (UGRNN). All evaluated algorithms are combined with word embeddings as the feature extraction approach for sentiment analysis, including GloVe, word2vec, and FastText with skip-grams. The five algorithms with the three feature extraction methods are evaluated based on accuracy, recall, precision, and F1-score for both balanced and unbalanced datasets. For the unbalanced dataset, the GLRNN algorithm with FastText feature extraction scored the highest accuracy of 93.75%. This is the highest accuracy on this dataset when compared with other methods mentioned in the literature. For the balanced dataset, the highest achieved accuracy was 88.39%, obtained by the LRNN algorithm.
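The four evaluation measures listed in this abstract can be computed as in the sketch below. It is an illustration under stated assumptions: the function names and the toy label sequences are hypothetical, and precision, recall, and F1 are computed for one class at a time, whereas the abstract does not say how the authors aggregated them across the three classes.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, target):
    """Precision, recall, and F1 for a single target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold labels and model predictions for six reviews.
y_true = ["pos", "pos", "neg", "neu", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "neu", "pos", "pos"]

acc = accuracy(y_true, y_pred)
pos_precision, pos_recall, pos_f1 = precision_recall_f1(y_true, y_pred, "pos")
```

On an unbalanced dataset, accuracy alone can look high when the majority class dominates, which is one reason per-class precision, recall, and F1 are usually reported alongside it.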


2021 ◽  
Vol 2 (2) ◽  
Author(s):  
Imane Guellil ◽  
Ahsan Adeel ◽  
Faical Azouaou ◽  
Fodil Benali ◽  
Ala-Eddine Hachani ◽  
...  

In this paper, we propose a semi-supervised approach for sentiment analysis of Arabic and its dialects. This approach is based on a sentiment corpus, constructed automatically and reviewed manually by native speakers of the Algerian dialect. The approach consists of constructing and applying a set of deep learning algorithms to classify the sentiment of Arabic messages as positive or negative. It was applied to Facebook messages written in Modern Standard Arabic (MSA) as well as in the Algerian dialect (DALG, a low-resource dialect spoken by more than 40 million people) in both scripts, Arabic and Arabizi. To handle Arabizi, we consider both options: transliteration (widely used in the research literature for handling Arabizi) and translation (never used in the research literature for handling Arabizi). To highlight the effectiveness of a semi-supervised approach, we carried out different experiments using both corpora for training (i.e., the corpus constructed automatically and the one that was reviewed manually). The experiments were done on many test corpora dedicated to MSA/DALG, which were proposed and evaluated in the research literature. Both shallow and deep learning classifiers are used, such as Random Forest (RF), Logistic Regression (LR), Convolutional Neural Network (CNN), and Long Short-Term Memory (LSTM). These classifiers are combined with word embedding models such as Word2vec and fastText for sentiment classification. Experimental results (F1 score up to 95% for intrinsic experiments and up to 89% for extrinsic experiments) showed that the proposed system outperforms the existing state-of-the-art methodologies (the best improvement is up to 25%).

