Enhanced Twofold-LDA Model for Aspect Discovery and Sentiment Classification

2019 ◽  
Vol 9 (4) ◽  
pp. 1-20 ◽  
Author(s):  
Nicola Burns ◽  
Yaxin Bi ◽  
Hui Wang ◽  
Terry Anderson

There is a need to automatically classify information from online reviews. Customers want useful information about the different aspects of a product or service, as well as the sentiment expressed towards each aspect. This article proposes an Enhanced Twofold-LDA (Latent Dirichlet Allocation) model, in which one LDA is used for aspect assignment and another for sentiment classification, with the aim of automatically determining aspect and sentiment. The enhanced model incorporates domain knowledge (i.e., seed words) to produce more focused topics and can handle two aspects simultaneously at the sentence level. The experimental results show that the Enhanced Twofold-LDA model produces topics more closely related to aspects than the state-of-the-art method ASUM (Aspect and Sentiment Unification Model), while performing comparably to ASUM on sentiment classification.
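One common way to incorporate seed words into LDA is to bias the topic-word Dirichlet prior towards them. The sketch below illustrates only that general idea; the function name `build_seeded_prior`, the `base` and `boost` values, and the example aspects are assumptions made for illustration and are not taken from the article.

```python
def build_seeded_prior(vocab, seed_words, base=0.01, boost=1.0):
    """Build a topic-word Dirichlet prior biased towards seed words.

    vocab      -- list of vocabulary words
    seed_words -- dict mapping an aspect name to its seed words
    base       -- symmetric pseudo-count given to every word (assumed value)
    boost      -- extra pseudo-count added for a seed word (assumed value)
    """
    index = {word: i for i, word in enumerate(vocab)}
    aspects = list(seed_words)
    prior = []
    for aspect in aspects:
        row = [base] * len(vocab)
        for word in seed_words[aspect]:
            if word in index:              # ignore seeds outside the vocabulary
                row[index[word]] += boost
        prior.append(row)
    return aspects, prior


# Hypothetical hotel-review vocabulary and seed words.
vocab = ["room", "bed", "staff", "friendly", "price"]
seeds = {"room": ["room", "bed"], "service": ["staff", "friendly"]}
aspects, prior = build_seeded_prior(vocab, seeds)
```

Each row of `prior` could then be passed to an LDA implementation that accepts a per-topic word prior, so that each topic starts out anchored to one aspect.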

2021 ◽  
pp. 016555152110077
Author(s):  
Sulong Zhou ◽  
Pengyu Kan ◽  
Qunying Huang ◽  
Janet Silbernagel

Natural disasters cause significant damage, casualties and economic losses. Twitter has been used to support prompt disaster response and management because people tend to communicate and spread information on public social media platforms during disaster events. To retrieve real-time situational awareness (SA) information from tweets, the most effective way to mine text is to use natural language processing (NLP). Among the advanced NLP models, the supervised approach can classify tweets into different categories to gain insight and leverage useful SA information from social media data. However, high-performing supervised models require domain knowledge to specify categories and involve costly labelling tasks. This research proposes a guided latent Dirichlet allocation (LDA) workflow to investigate temporal latent topics from tweets during a recent disaster event, the 2020 Hurricane Laura. With integration of prior knowledge, a coherence model, LDA topic visualisation and validation against official reports, our guided approach reveals that most tweets contain several latent topics during the 10-day period of Hurricane Laura. This result indicates that state-of-the-art supervised models have not fully utilised tweet information because they assign each tweet only a single label. In contrast, our model not only identifies emerging topics during different disaster events but also provides multilabel references for the classification schema. In addition, our results can help to quickly identify and extract SA information for responders, stakeholders and the general public so that they can adopt timely response strategies and allocate resources wisely during hurricane events.
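The contrast between single-label and multilabel assignment can be sketched from a per-tweet topic distribution. This is a minimal illustration, not the authors' workflow: the function names, the 0.2 threshold and the example probabilities are all assumptions.

```python
def single_label(topic_probs):
    """Return the index of the single most probable topic."""
    return max(range(len(topic_probs)), key=lambda k: topic_probs[k])


def multi_label(topic_probs, threshold=0.2):
    """Return the indices of all topics whose probability reaches the threshold."""
    return [k for k, p in enumerate(topic_probs) if p >= threshold]


# Hypothetical topic mixture for one tweet over four latent topics.
tweet_topics = [0.45, 0.35, 0.15, 0.05]
top = single_label(tweet_topics)
labels = multi_label(tweet_topics)
```

Here a single-label scheme would keep only topic 0, while the thresholded view also retains topic 1, which carries a substantial share of the tweet.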


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Ziang Wang ◽  
Feng Yang

Purpose: Obtaining consumers’ product evaluations from massive volumes of online reviews has long been a topic of interest for online retailers. In online shopping, there is no face-to-face interaction between retailers and customers. After collecting the reviews left by customers, online retailers are eager to answer questions such as: which product attributes attract consumers, and which step of the shopping process gives consumers a better experience? This paper aims to associate the latent Dirichlet allocation (LDA) model with consumers’ attitudes and provides a method to calculate a numerical measure of the product evaluation expressed in each word.
Design/methodology/approach: First, all possible pairs of reviews are each organized as a document to build the corpus. Next, the latent topics of the traditional LDA model, referred to as the standard LDA model, are separated into shared and differential topics. The authors then associate the model with the consumer’s attitude in each review, which is labeled as either positive or non-positive. The product evaluation reflected in consumers’ binary attitudes is extended to each word that appears in the corpus. Finally, variational optimization is introduced to estimate the parameters of the expanded LDA model.
Findings: The experimental results show that the proposed model, referred to as the expanded LDA model, successfully assigns sufficient probability to words related to product attributes or consumers’ product evaluations. Compared with the standard LDA model, the expanded model tends to assign higher probability to words that rank higher within each topic. The expanded model also achieves higher precision on the prediction set, which shows that breaking the topics down into two categories fits the data set better than the standard LDA model. The product evaluation of each word is calculated by the expanded model and presented at the end of the experiment.
Originality/value: This research provides a new method to calculate consumers’ product evaluations from reviews at the word level. Words may be used to describe product attributes or consumers’ experiences in reviews. Assigning numerical measures to words makes it possible to analyze consumers’ product evaluations quantitatively. In addition, because words themselves act as labels, they can be ranked once a numerical measure is given. Online retailers can use the results for label selection, advertising or product recommendation.
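The corpus-building step, in which every pair of reviews forms one document, can be sketched as follows. How the two reviews are combined is not stated in the abstract; concatenating their whitespace-split tokens, like the function name `build_pair_corpus` and the sample reviews, is an assumption made for illustration.

```python
from itertools import combinations


def build_pair_corpus(reviews):
    """Return one tokenised document per unordered pair of reviews."""
    corpus = []
    for (i, first), (j, second) in combinations(enumerate(reviews), 2):
        corpus.append({
            "pair": (i, j),                              # indices of the paired reviews
            "tokens": first.split() + second.split(),    # assumed: simple concatenation
        })
    return corpus


# Hypothetical reviews.
reviews = ["battery lasts long", "screen is bright", "battery drains fast"]
corpus = build_pair_corpus(reviews)
```

Note that n reviews yield n(n-1)/2 documents, so the corpus grows quadratically with the number of reviews.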


2021 ◽  
Vol 297 ◽  
pp. 01071
Author(s):  
Sifi Fatima-Zahrae ◽  
Sabbar Wafae ◽  
El Mzabi Amal

Sentiment classification is one of the most active research areas in Natural Language Processing (NLP). While it aims to detect the sentiment polarity of a given opinion and classify it, it requires a large number of aspect extractions. However, extracting aspects takes human effort and a long time. To reduce this cost, the Latent Dirichlet Allocation (LDA) method has recently been applied to the problem. In this paper, an efficient preprocessing method for sentiment classification is presented and used to analyze users’ comments on the Twitter social network. For this purpose, different text preprocessing techniques are applied to the dataset to obtain acceptably standardized text. Latent Dirichlet Allocation is then applied to the data obtained from this fast and accurate preprocessing phase. Different sentiment analysis methods are implemented, and their results are compared and evaluated. The experimental results show that combining the preprocessing method of this paper with Latent Dirichlet Allocation yields acceptable results compared to other basic methods.
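The abstract does not list the preprocessing techniques used. As a rough sketch of typical tweet cleaning before topic modeling, the example below lower-cases the text, removes URLs, mentions and non-letter characters, and drops stop words; the function name, the regular expressions and the tiny stop-word list are assumptions, not the paper's method.

```python
import re

# Small illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "it", "this"}


def preprocess_tweet(text):
    """Lower-case a tweet, strip URLs, mentions and non-letters, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"@\w+", " ", text)           # remove user mentions
    text = re.sub(r"[^a-z\s]", " ", text)       # keep letters only (drops '#', digits, punctuation)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = preprocess_tweet("@shop The new phone is GREAT!! #happy https://t.co/abc")
# tokens == ["new", "phone", "great", "happy"]
```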


2020 ◽  
pp. 1-10
Author(s):  
Junegak Joung ◽  
Harrison M. Kim

Identifying product attributes from the perspective of a customer is essential to measure the satisfaction, importance, and Kano category of each product attribute for product design. This paper proposes automated keyword filtering to identify product attributes from online customer reviews based on latent Dirichlet allocation. The preprocessing for latent Dirichlet allocation is important because it affects the results of topic modeling; however, previous research performed latent Dirichlet allocation either without removing noise keywords or by manually eliminating them. The proposed method improves the preprocessing for latent Dirichlet allocation by conducting automated filtering to remove the noise keywords that are not related to the product. A case study of Android smartphones is performed to validate the proposed method. The performance of latent Dirichlet allocation with the proposed method is compared to that with a previous method, and according to the latent Dirichlet allocation results, the former performs better than the latter.
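The abstract does not describe the filtering criterion. Purely to illustrate what automated keyword filtering before LDA can look like, the sketch below uses a different, generic technique: dropping tokens by document frequency. The function name, the thresholds and the sample documents are assumptions and should not be read as the proposed method.

```python
from collections import Counter


def filter_by_document_frequency(docs, min_docs=2, max_fraction=0.8):
    """Drop tokens that occur in too few or too many documents."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))               # count each token once per document
    max_docs = max_fraction * len(docs)
    keep = {tok for tok, df in doc_freq.items() if min_docs <= df <= max_docs}
    return [[tok for tok in doc if tok in keep] for doc in docs]


# Hypothetical tokenised smartphone reviews.
docs = [
    ["phone", "battery", "good", "lol"],
    ["phone", "screen", "good"],
    ["phone", "battery", "weak"],
    ["phone", "screen", "bright"],
]
filtered = filter_by_document_frequency(docs)
```

In this toy corpus, "phone" is removed because it appears in every document, and one-off tokens such as "lol" are removed as too rare.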


2020 ◽  
Vol 39 (5) ◽  
pp. 7909-7919
Author(s):  
Chuantao Wang ◽  
Xuexin Yang ◽  
Linkai Ding

The purpose of sentiment classification is to automatically judge sentiment tendency. In sentiment classification of text data (such as online reviews), traditional deep learning models focus on algorithm optimization but ignore the imbalanced distribution of sample counts across classes, which degrades classification performance in practical applications. In this paper, the experiment is divided into two stages. In the first stage, minority-class samples are used to train a sequence generative adversarial network, so that it can learn the features of the minority-class samples in depth. In the second stage, the trained generator of the sequence generative adversarial network is used to generate synthetic minority-class samples, which are mixed with the original samples to balance the sample distribution. The mixed samples are then fed into the deep sentiment classification model to complete model training. Experimental results show that the model achieves excellent classification performance on hotel-review sentiment classification compared with a variety of deep learning models based on classic imbalanced learning methods.
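The second stage, mixing generated minority-class samples with the originals until the classes are balanced, can be sketched as below. The `generate` callable is a hypothetical stand-in for the trained generator; `fake_generator` merely resamples fixed phrases and does not implement a generative adversarial network. All names and data are assumptions.

```python
import random
from collections import Counter


def balance_with_generator(samples, labels, generate):
    """Add generated minority-class samples until it matches the largest class.

    generate -- callable returning one synthetic minority-class sample; it
                stands in for the trained generator (hypothetical placeholder).
    """
    counts = Counter(labels)
    minority = min(counts, key=counts.get)
    needed = max(counts.values()) - counts[minority]
    new_samples = [generate() for _ in range(needed)]
    mixed = list(zip(samples + new_samples, labels + [minority] * needed))
    random.shuffle(mixed)                        # mix synthetic and original samples
    mixed_samples, mixed_labels = zip(*mixed)
    return list(mixed_samples), list(mixed_labels)


def fake_generator():
    # Placeholder only: resamples fixed phrases instead of running a trained model.
    return random.choice(["room was dirty", "staff were rude"])


# Hypothetical imbalanced hotel-review data.
samples = ["great stay", "lovely view", "nice pool", "clean room", "bad food"]
labels = ["pos", "pos", "pos", "pos", "neg"]
balanced_samples, balanced_labels = balance_with_generator(samples, labels, fake_generator)
```

The balanced lists would then serve as the training input for the sentiment classifier.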


2021 ◽  
Author(s):  
Zongxi Li ◽  
Xinhong Chen ◽  
Haoran Xie ◽  
Qing Li ◽  
Xiaohui Tao ◽  
...  

Exploiting hand-crafted lexicon knowledge to enhance emotional or sentimental features at the word level has become a widely adopted method in emotion-relevant classification studies. However, few attempts have been made to explore emotion construction in the classification task, which would provide insight into how a sentence’s emotion is constructed. The major challenge in exploring emotion construction is that current studies treat the dataset labels as relatively independent emotions, which overlooks the connections among different emotions. This work aims to understand coarse-grained emotion construction and the dependencies among emotions by incorporating fine-grained emotions from domain knowledge. Incorporating domain knowledge and dimensional sentiment lexicons, our previous work proposes a novel method named EmoChannel to capture the intensity variation of a particular emotion in time series. We utilize the resultant knowledge of 151 available fine-grained emotions to compose the representation of sentence-level emotion construction. Furthermore, this work explicitly employs a self-attention module to extract the dependency relationships among all emotions and proposes the EmoChannel-SA Network to enhance emotion classification performance. We conducted experiments to demonstrate that the proposed method achieves competitive performance against state-of-the-art baselines on both multi-class datasets and sentiment analysis datasets.
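To illustrate how self-attention can relate emotions to one another, the sketch below computes plain scaled dot-product self-attention over a handful of emotion feature vectors. It is a simplification under stated assumptions: queries, keys and values are all the raw inputs (no learned projections), and the feature values are made up. It is not the EmoChannel-SA architecture.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with Q = K = V = x (no learned weights).

    x -- list of n vectors, each of dimension d
    Returns (attended vectors, attention weights).
    """
    d = len(x[0])
    weights = []
    for query in x:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in x]
        weights.append(softmax(scores))
    output = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(d)]
        for row in weights
    ]
    return output, weights


# Hypothetical intensity features for three emotions (values are made up).
emotions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(emotions)
```

Each row of `attn` sums to one and indicates how strongly one emotion attends to the others, which is the kind of dependency signal a self-attention module exposes.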

