A Multilabel Classifier for Text Classification and Enhanced BERT System

Bhavana R. Bhamare; Jeyanthi Prabhu

doi:10.18280/ria.350209

Revue d'Intelligence Artificielle ◽

10.18280/ria.350209 ◽

2021 ◽

Vol 35 (2) ◽

pp. 167-176

Author(s):

Bhavana R. Bhamare ◽

Jeyanthi Prabhu

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Language Model ◽

Classification Problem ◽

Automated System ◽

Two Phase ◽

Hybrid Features ◽

Textual Data ◽

Dependency Rule ◽

Weighted Correlation

Nowadays, a vast variety of reviews is published on the web. As a result, an automated system to analyze and extract knowledge from such textual data is needed. Sentiment analysis is a well-known sub-area of Natural Language Processing (NLP). In earlier research, sentiments were determined without considering the aspects specified in a review instance. Aspect-based sentiment analysis (ABSA) has since caught the attention of researchers. Many existing systems treat ABSA as a single-label classification problem. This study addresses that drawback by proposing three approaches that use multilabel classifiers. In the first approach, the performance of a model with hybrid features is analyzed using a multilabel classifier. The hybrid feature set includes word dependency rule-based features and unigram features selected using the proposed two-phase weighted correlation feature selection (WCFS) approach. The second and third approaches use the Bidirectional Encoder Representations from Transformers (BERT) language model. In the second approach, a BERT system is enhanced by applying max pooling on the target terms that specify an aspect of a review instance, and a multibit label is given as input to the BERT system. In the third approach, the basic BERT system is used for word embedding only, and classification is done using multilabel classifiers. In all approaches, the label used for each training instance specifies the aspects together with their sentiments. The experiments show that the results of the system proposed in the first approach are comparable to those of the BERT system, and that the Enhanced BERT system gives better results than the existing systems.
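As a rough illustration of two ideas named in this abstract, the sketch below encodes a set of aspect and sentiment pairs as a multibit label and applies element-wise max pooling over the token vectors of a target term. The aspect list, sentiment list, function names, and toy embeddings are all assumptions made for the example; the abstract does not specify them, and this is not the authors' implementation.

```python
# Hypothetical aspect and sentiment inventories; the abstract does not list them.
ASPECTS = ["food", "service", "price"]
SENTIMENTS = ["positive", "negative"]

# One bit per (aspect, sentiment) pair, in a fixed order.
LABEL_SLOTS = [(a, s) for a in ASPECTS for s in SENTIMENTS]


def encode_multibit_label(pairs):
    """Turn a collection of (aspect, sentiment) pairs into a 0/1 vector."""
    present = set(pairs)
    return [1 if slot in present else 0 for slot in LABEL_SLOTS]


def max_pool(token_vectors):
    """Element-wise maximum over the vectors of the target-term tokens."""
    if not token_vectors:
        raise ValueError("need at least one token vector")
    return [max(column) for column in zip(*token_vectors)]


# A review that praises the food but criticises the price.
label = encode_multibit_label([("food", "positive"), ("price", "negative")])

# Toy 3-dimensional embeddings for a two-token target term (made-up numbers).
pooled = max_pool([[0.1, 0.9, -0.3], [0.4, 0.2, -0.1]])
```

With one bit per aspect and sentiment pair, a single review can carry several labels at once, which is what separates the multilabel setting from single-label classification.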

EMOSIS Sentiment Analysis on Tweets with Emotion and Intensity Level Recognition Considering Ending Punctuation Marks

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4518.118419 ◽

2019 ◽

Vol 8 (4) ◽

pp. 10289-10293

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Emotion Recognition ◽

Sentiment Analysis ◽

Language Processing ◽

Significant Role ◽

Language Model ◽

Intensity Level ◽

Processing Stage ◽

Overall Performance

Sentiment Analysis is a tool used for determining the polarity or emotion of a sentence. It is a field of Natural Language Processing that focuses on the study of opinions. In this study, the researchers addressed one key challenge in Sentiment Analysis: accounting for the ending punctuation marks present in a sentence. Ending punctuation marks play a significant role in Emotion Recognition and Intensity Level Recognition. The research made use of tweets expressing opinions about Philippine President Rodrigo Duterte. These downloaded tweets served as the inputs. They were first subjected to a pre-processing stage to prepare the sentences for processing. A Language Model was created to serve as the classifier for determining the scores of the tweets. The scores give the polarity of the sentence. Accuracy is very important in sentiment analysis. To increase the chance of correctly identifying the polarity of the tweets, the input underwent Intensity Level Recognition, which identifies the intensifiers and negations within the sentences. The system was evaluated with an overall performance of 80.27%.
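The interaction of intensifiers, negations, and ending punctuation described above can be sketched with a toy lexicon-based scorer. Every lexicon entry, weight, and function name below is a hypothetical placeholder chosen for illustration; the abstract does not give the paper's actual rules or values.

```python
# All lexicons and weights are illustrative assumptions, not the paper's values.
POLARITY = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
INTENSIFIERS = {"very": 1.5, "really": 1.5, "extremely": 2.0}
NEGATIONS = {"not", "never", "no"}
PUNCTUATION_BOOST = {"!": 1.5, "?": 0.5, ".": 1.0}


def score_sentence(sentence):
    """Return a signed score: sign is polarity, magnitude is intensity."""
    text = sentence.strip().lower()
    # Treat a missing or unknown ending mark like a plain full stop.
    ending = text[-1] if text and text[-1] in PUNCTUATION_BOOST else "."
    tokens = text.rstrip("!?.").split()

    score = 0.0
    multiplier = 1.0
    negate = False
    for token in tokens:
        if token in NEGATIONS:
            negate = True
        elif token in INTENSIFIERS:
            multiplier *= INTENSIFIERS[token]
        elif token in POLARITY:
            value = POLARITY[token] * multiplier
            score += -value if negate else value
            # Modifiers apply only to the next polarity word.
            multiplier, negate = 1.0, False
    return score * PUNCTUATION_BOOST[ending]
```

For example, `score_sentence("This is very good!")` scales the base polarity of "good" first by the intensifier and then by the exclamation mark, while `score_sentence("This is not good.")` flips the sign.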

Data Mining and Machine Learning: Design a Generalized Real Time Sentiment Analysis System on Tweeter Data Using Natural Language Processing

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f8492.088619 ◽

2019 ◽

Vol 8 (6) ◽

pp. 2139-2142

Keyword(s):

Sentiment Analysis ◽

Real Time ◽

Language Processing ◽

Automated System ◽

Product Reviews ◽

Analysis System ◽

Result Analysis ◽

Python Programming ◽

The Impact ◽

Analyze Data

Sentiment analysis is a task that has recently become important for numerous companies, because consumers on social media such as Facebook, Twitter, and other sites post reviews of their products. A company may want to track tweets about its brand to monitor their impact over time, and many websites analyze the comments on their articles; sentiment analysis helps them track these comments and their impact. Sentiment analysis is thus an automated system that collects and analyzes content and generates the desired results. This paper proposes a sentiment analysis system for Twitter posts. The proposed system works on real-time tweets and is designed so that it can analyze data related to any topic. The Python programming language is used to extract tweets from Twitter feeds. The proposed system also calculates the level of sentiment, that is, how negative or positive the tweets are. This paper also presents some real-time result analysis.

A Novel Framework Using Neutrosophy for Integrated Speech and Text Sentiment Analysis

Symmetry ◽

10.3390/sym12101715 ◽

2020 ◽

Vol 12 (10) ◽

pp. 1715

Author(s):

Kritika Mishra ◽

Ilanthenral Kandasamy ◽

Vasantha Kandasamy W. B. ◽

Florentin Smarandache

Keyword(s):

Sentiment Analysis ◽

Language Processing ◽

Wide Spectrum ◽

Neutrosophic Set ◽

Text File ◽

Neutrosophic Sets ◽

Plain Text ◽

Textual Data ◽

Audio Files ◽

Text Sentiment Analysis

With increasing data on the Internet, it is becoming difficult to analyze every bit and make sure it can be used efficiently by businesses. One useful technique in Natural Language Processing (NLP) is sentiment analysis. Various algorithms can be used to classify textual data on scales ranging from positive-negative and positive-neutral-negative to a wide spectrum of emotions. While a lot of work has been done on text, comparatively little research has been done on audio datasets. An audio file contains more features, which can be extracted from its amplitude and frequency, than a plain text file. The neutrosophic set is symmetric in nature, and so is the refined neutrosophic set, which has the refined indeterminacies I1 and I2 in the middle between the extremes Truth T and False F. Neutrosophy, which deals with the concept of indeterminacy, is another little-explored topic in NLP. Though neutrosophy has been used in sentiment analysis of textual data, it has not been used in speech sentiment analysis. We propose a novel framework that performs sentiment analysis on audio files by calculating their Single-Valued Neutrosophic Sets (SVNS) and clustering them into positive-neutral-negative, and that combines these results with those obtained by performing sentiment analysis on the text transcripts of the same audio.

SENTIMENT ANALYSIS OF CUSTOMER REVIEWS

Azerbaijan Journal of High Performance Computing ◽

10.32010/26166127.2021.4.1.113.125 ◽

2021 ◽

Vol 4 (1) ◽

pp. 113-125

Author(s):

Syed Rashiq Nazar ◽

Tapalina Bhattasali

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Binary Classification ◽

Classification Problem ◽

Supervised Machine Learning ◽

Frequency Model ◽

Customer Reviews ◽

Machine Learning Model ◽

Logistic Regression Algorithm

Sentiment analysis is a process in which we classify text data as positive, negative, or neutral or into some other category, which helps understand the sentiment behind the data. Mainly machine learning and natural language processing methods are combined in this process. One can find customer sentiment in reviews, tweets, comments, etc. A company needs to evaluate the sentiment behind the reviews of its product. Customer sentiment can be a valuable asset to the company. This ultimately helps the company make better decisions regarding its product marketing and improving product quality. This paper focuses on the sentiment analysis of customer reviews from Amazon. The reviews contain textual feedback along with a rating system. The aim is to build a supervised machine learning model to classify the review as positive or negative. As reviews are in the text format, there is a need to vectorize the text to numerical format for the computer to process the data. To do this, we use the Bag-of-words model and the TF-IDF (Term Frequency-Inverse Document Frequency) model. These two models are related to each other, and the aim is to find which model performs better in our case. The problem in our case is a binary classification problem; the logistic regression algorithm is used. Finally, the performance of the model is calculated using a metric called the F1 score.
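To make the two vectorization schemes and the evaluation metric concrete, the sketch below builds Bag-of-words count vectors, reweights them with an unsmoothed textbook TF-IDF (idf = log(N / df)), and computes the F1 score from predictions. It uses only the standard library; the function names, the whitespace tokenizer, and the choice of idf variant are assumptions for illustration, and the logistic regression step is omitted.

```python
import math
from collections import Counter


def bag_of_words(docs):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Weight each count by idf = log(N / df), the unsmoothed textbook form."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = [sum(1 for vec in counts if vec[i] > 0) for i in range(len(vocab))]
    idf = [math.log(n_docs / df) for df in doc_freq]
    vectors = [[tf * w for tf, w in zip(vec, idf)] for vec in counts]
    return vocab, vectors


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In this form, a term that occurs in every document (such as "product" across a set of product reviews) receives an idf of zero, which is the main practical difference from raw counts.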

Topic features for machine learning-based sentiment analysis in Indonesian tweets

International Journal of Intelligent Computing and Cybernetics ◽

10.1108/ijicc-04-2018-0057 ◽

2019 ◽

Vol 12 (1) ◽

pp. 70-81 ◽

Cited By ~ 1

Author(s):

Hendri Murfi ◽

Furida Lusi Siagian ◽

Yudi Satria

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Analysis Data ◽

Data Representation ◽

Support Vector ◽

Data Sets ◽

Content Type ◽

Textual Data ◽

Standard Word

Purpose: The purpose of this paper is to analyze topics as alternative features for sentiment analysis in Indonesian tweets. Design/methodology/approach: Given Indonesian tweets, the process of sentiment analysis starts by extracting features from the tweets. The features are words or topics. The authors use non-negative matrix factorization to extract the topics and apply a support vector machine to classify the tweets into their sentiment classes. Findings: The authors analyze the accuracy using two-class and three-class sentiment analysis data sets. Both data sets concern sentiments about candidates in the Indonesian presidential election. The experiments show that the standard word features give better accuracies than the topic features for the two-class sentiment analysis. Moreover, the topic features can slightly improve the accuracy of the standard word features. The topic features can also improve the accuracy of the standard word features for the three-class sentiment analysis. Originality/value: The standard textual data representation for sentiment analysis using machine learning is the bag of words and its extensions, mainly created by natural language processing. This paper applies topics as novel features for machine learning-based sentiment analysis in Indonesian tweets.

A Survey

International Journal of Applied Evolutionary Computation ◽

10.4018/ijaec.2020010102 ◽

2020 ◽

Vol 11 (1) ◽

pp. 28-33

Author(s):

Reshma Radheshamjee Baheti ◽

Supriya Kinariwala

Keyword(s):

Sentiment Analysis ◽

Life Events ◽

Language Processing ◽

Social Networking Sites ◽

Human Life ◽

Discussion Forums ◽

Huge Number ◽

Textual Data ◽

Human Stress ◽

Social Media Network

Human stress has been increasing rapidly in recent times. School and college students, working professionals, and many other people work under pressure. Over the last few decades, research has examined how to predict whether people feel under pressure or relaxed in their duties. This survey considers how sentiment analysis can be used to find people's emotions or feelings about their daily life by analyzing social media networks such as Facebook, Twitter, and other networking sites, where users can share personal feelings such as being happy, angry, stressed, or relaxed, or any other emotion, to express life events or views on any topic. A huge number of informal messages are posted every day on social networking sites, and blogs and discussion forums are also available. Emotions frequently appear to be vital in these texts for expressing friendship and offering social support as part of opinions or views. In this article, a survey is done of existing techniques for sentiment analysis of textual data. In the textual data, the positive and negative sentences have to be found to check the emotions of the user. The survey also covers natural language processing, the lexical parser, sentiment analysis, the classifier algorithm, and several kinds of Twitter datasets. It is found that 85% of the work on sentiment analysis categorizes sentences as positive or negative.

The Evolution of Language Models Applied to Emotion Analysis of Arabic Tweets

Information ◽

10.3390/info12020084 ◽

2021 ◽

Vol 12 (2) ◽

pp. 84

Author(s):

Nora Al-Twairesh

Keyword(s):

Machine Learning ◽

Language Processing ◽

Language Model ◽

Classification Problem ◽

Language Models ◽

Machine Learning Techniques ◽

Emotional States ◽

Evolution Of Language ◽

Emotion Analysis ◽

Language Representation

The field of natural language processing (NLP) has witnessed a boom in language representation models with the introduction of pretrained language models that are trained on massive textual data and then fine-tuned on downstream NLP tasks. In this paper, we aim to study the evolution of language representation models by analyzing their effect on an under-researched NLP task, emotion analysis, for a low-resource language, Arabic. Most studies in the field of affect analysis have focused on sentiment analysis, i.e., classifying text by valence (positive, negative, neutral), while few studies go further to analyze finer-grained emotional states (happiness, sadness, anger, etc.). Emotion analysis is a text classification problem that is tackled using machine learning techniques. Different language representation models have been used as features for these machine learning models to learn from. In this paper, we perform an empirical study on the evolution of language models, from the traditional term frequency–inverse document frequency (TF–IDF) to the more sophisticated word embedding word2vec, and finally the recent state-of-the-art pretrained language model, bidirectional encoder representations from transformers (BERT). We observe and analyze how the performance increases as we change the language model. We also investigate different BERT models for Arabic. We find that the best performance is achieved with the ArabicBERT large model, which is a BERT model trained on a large dataset of Arabic text. The increase in F1-score was significant, at +7–21%.

Sentiment Analysis Based on Movie Reviews using Various Classification Techniques : A Review

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit217329 ◽

2021 ◽

pp. 197-208

Author(s):

Karishma Kaushik ◽

Mahesh Parmar

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Language Processing ◽

Social Networking Sites ◽

Opinion Mining ◽

Classification Problem ◽

Analysis Model ◽

Linguistic Processing ◽

General Feeling ◽

Language Characteristics

Sentiment analysis, also called "opinion mining", analyses attitudes and classifies the views expressed in text. It relates to the use of natural language processing, text processing, and linguistic processing. A huge amount of data is created with the rapid growth of web technologies. Social networking sites are now popular and commonplace venues where feelings can be shared in short messages. These sentiments include happiness, sadness, anxiety, fear, etc. The analysis of short texts aims to recognize the crowd's sentiment. Sentiment analysis of IMDb movie reviews describes a reviewer's general feeling or impression of a movie. Since human perceptions affect the effectiveness of products, since a movie's success or failure depends on its reviews, and since costs are rising, a good sentiment analysis model that classifies movie reviews needs to be developed. Machine learning methods use ML algorithms to carry out sentiment analysis as a standard classification problem using syntactic and language characteristics. This paper reviews several machine learning methods used for sentiment analysis. Most sentiment analysis is performed using the SVM, RF, ANN, NB, DT, BN, and KNN algorithms.

Machine Learning Techniques for Sentiment Analysis of Indian Languages

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.b1456.0982s1119 ◽

2019 ◽

Vol 8 (2S11) ◽

pp. 3630-3636

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Sentiment Analysis ◽

Language Processing ◽

Machine Learning Techniques ◽

Indian Languages ◽

Indian Language ◽

Learning Techniques ◽

Textual Data ◽

Language Text

Sentiment Analysis is the domain of automatically understanding the emotions, feelings, and opinions in textual data. It is a way of understanding how a product, brand, service, idea, or event is viewed by ordinary people, customers, and stakeholders. Sentiment Analysis systems are used by politicians, business leaders, developers, and researchers to infer useful information according to their specific needs. They are used in business decision-making to take account of customers' views. Sentiment analysis has become a hot topic of scientific and market research in the field of Natural Language Processing. India is a highly populated country, and its number of Internet users is also huge. Most people share their experiences in English. However, during the last decade, due to the accessibility of the Internet and advances in language modelling, people have been expressing their views in their own native Indian languages. With the increase in Indian language text, researchers find it quite fascinating to infer valuable information from this unstructured text data. A number of machine learning techniques have been applied to this textual data. This paper discusses the basic concepts of sentiment analysis with a focus on Indian language text. Due to the non-availability of rich lexicon resources for unsupervised learning techniques and the better evaluation measures available for supervised learning techniques, the latter have become the first choice for researchers in the field of Natural Language Processing. A comparative analysis is made of various supervised machine learning techniques in the context of Indian languages.

Sentiment Analysis of Movie Review using Machine Learning Approach

IJOSTHE ◽

10.24113/ojssports.v5i1.83 ◽

2017 ◽

Vol 5 (1) ◽

pp. 10

Author(s):

Rajul Rai ◽

Pradeep Mewada

Keyword(s):

Machine Learning ◽

Natural Language Processing ◽

Natural Language ◽

Sentiment Analysis ◽

Language Processing ◽

Opinion Mining ◽

The People ◽

Textual Data ◽

Machine Learning Approach ◽

Regional Languages

With the development of the Internet and Natural Language Processing, the use of regional languages for communication has also grown. Sentiment analysis is a natural language processing task that extracts useful information from various forms of data, such as reviews, and categorizes them on the basis of polarity. Sentiment analysis is one of the sub-domains of opinion mining; it focuses on extracting people's emotions and opinions towards a particular topic from textual data. In this paper, sentiment analysis is performed on the IMDB movie review database. We examine the sentiment expressions to classify the polarity of a movie review on a scale from negative to positive, perform feature extraction and ranking, and use these features to train our multilevel classifier to assign each movie review its correct label. Movie reviews are classified into positive and negative classes with the help of machine learning. The proposed approach, using classification techniques, achieves a best accuracy of about 99%.
