A Fuzzy Computing Model for Identifying Polarity of Chinese Sentiment Words

2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Bingkun Wang ◽  
Yongfeng Huang ◽  
Xian Wu ◽  
Xing Li

With the surge of user-generated content on the web, sentiment analysis has become a very active research topic in data mining and natural language processing. Sentiment words, which convey positive or negative polarity, are the most important indicators of sentiment and are therefore instrumental for sentiment analysis. However, most existing methods for identifying the polarity of sentiment words treat positive and negative polarity as classical (Cantor) sets and pay no attention to the fuzziness of the polarity intensity of sentiment words. To improve performance, we propose a fuzzy computing model to identify the polarity of Chinese sentiment words. This paper makes three major contributions. First, we propose a method to compute the polarity intensity of sentiment morphemes and sentiment words. Second, we construct a fuzzy sentiment classifier and propose two different methods to compute its parameter. Third, we conduct extensive experiments on four sentiment word datasets and three review datasets; the experimental results indicate that our model performs better than state-of-the-art methods.
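To make the idea of fuzzy polarity concrete, the following is a minimal sketch and not the paper's actual formulas. It assumes that each morpheme carries a polarity intensity in [-1, 1], that a word's intensity is the mean of its morphemes' intensities, and that membership in the fuzzy set "positive" follows a logistic curve. The function names and the `threshold` and `steepness` parameters are hypothetical, as are the example scores.

```python
import math

def word_intensity(morpheme_scores):
    """Average the polarity intensities of a word's morphemes (assumed in [-1, 1])."""
    if not morpheme_scores:
        raise ValueError("a word needs at least one morpheme score")
    return sum(morpheme_scores) / len(morpheme_scores)

def positive_membership(intensity, threshold=0.0, steepness=5.0):
    """Degree of membership in the fuzzy set 'positive', strictly between 0 and 1.

    The logistic shape and both parameters are illustrative assumptions.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (intensity - threshold)))

def classify(intensity, threshold=0.0, steepness=5.0):
    """Return a crisp label together with the fuzzy membership degree."""
    mu_pos = positive_membership(intensity, threshold, steepness)
    label = "positive" if mu_pos >= 0.5 else "negative"
    return label, mu_pos

# Hypothetical morpheme scores for a two-morpheme word.
label, degree = classify(word_intensity([0.8, 0.4]))
```

Unlike a classical set, the membership degree keeps the information about how strongly positive a word is, while the crisp label is still available when a binary decision is needed.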

2020 ◽  
Vol 10 (12) ◽  
pp. 4386 ◽  
Author(s):  
Sandra Rizkallah ◽  
Amir F. Atiya ◽  
Samir Shaheen

Embedding words from a dictionary as vectors in a space has become an active research field, due to its many uses in natural language processing applications. Distances between the vectors should reflect the relatedness between the corresponding words. The problem with existing word embedding methods is that they often fail to distinguish between synonymous, antonymous, and unrelated word pairs. Meanwhile, polarity detection is crucial for applications such as sentiment analysis. In this work we propose an embedding approach that is designed to capture polarity. The approach is based on embedding the word vectors into a sphere, whereby the dot product between any two vectors represents their similarity. Vectors corresponding to synonymous words would be close to each other on the sphere, while a word and its antonym would lie at opposite poles of the sphere. The approach used to design the vectors is a simple relaxation algorithm. The proposed word embedding is successful in distinguishing between synonyms, antonyms, and unrelated word pairs. It achieves results that are better than those of some of the state-of-the-art techniques and competes well with the others.
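The geometry described above can be illustrated with a small sketch. This is not the authors' algorithm: it assumes target dot products of +1 for synonyms, -1 for antonyms, and 0 for unrelated pairs, and uses a plain iterative relaxation that nudges each pair toward its target and then projects the vectors back onto the unit sphere. The word list, the `relax` function, and its parameters are all hypothetical.

```python
import math
import random

def normalize(v):
    """Project a vector back onto the unit sphere."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def relax(words, targets, dim=3, steps=2000, lr=0.1, seed=0):
    """Iteratively move unit vectors so that pairwise dot products approach targets."""
    rng = random.Random(seed)
    vecs = {w: normalize([rng.gauss(0, 1) for _ in range(dim)]) for w in words}
    for _ in range(steps):
        for (a, b), target in targets.items():
            va, vb = vecs[a], vecs[b]
            err = dot(va, vb) - target
            # Both updates use the old vectors, then re-project onto the sphere.
            vecs[a] = normalize([x - lr * err * y for x, y in zip(va, vb)])
            vecs[b] = normalize([y - lr * err * x for x, y in zip(va, vb)])
    return vecs

# Toy relations: synonyms -> +1, antonyms -> -1, unrelated -> 0.
targets = {
    ("good", "great"): 1.0,
    ("good", "bad"): -1.0,
    ("great", "bad"): -1.0,
    ("good", "table"): 0.0,
    ("great", "table"): 0.0,
    ("bad", "table"): 0.0,
}
vecs = relax(["good", "great", "bad", "table"], targets)
```

After relaxation, `dot(vecs["good"], vecs["great"])` should be close to 1, `dot(vecs["good"], vecs["bad"])` close to -1, and the dot products with `"table"` close to 0, which is the separation between synonyms, antonyms, and unrelated pairs that the abstract describes.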


Sentiment analysis is a field that deals with assessing the sentiments or emotions of users regarding products and services. It takes user comments as input and applies natural language processing techniques to identify the mood of the user. A sentiment is usually deemed positive, negative, or neutral depending on the mood the user expresses in comments or feedback. It is widely used by businesses to improve products and services and to present customers with products and services based on their likes and dislikes. Many techniques have been applied in the past, such as linear regression and SVM models. Recurrent Neural Networks (RNNs) have improved the accuracy of sentiment analysis, but they suffer from a major drawback when applied to longer sentences. This paper proposes a sentiment analysis model based on Long Short-Term Memory (LSTM), a variant of RNNs that handles long sentences well. The model is applied to reviews from the IMDB dataset, a large dataset containing 50K reviews, of which 50% are used for training and 50% for testing. The model achieves a training accuracy of 92% and a validation accuracy of 85%, which indicates neither overfitting nor underfitting. The overall accuracy of 85% appears to be better than that of some existing techniques, such as SVM with a linear kernel.


2021 ◽  
pp. 1-13
Author(s):  
Qingtian Zeng ◽  
Xishi Zhao ◽  
Xiaohui Hu ◽  
Hua Duan ◽  
Zhongying Zhao ◽  
...  

Word embeddings have been successfully applied in many natural language processing tasks due to their effectiveness. However, the state-of-the-art algorithms for learning word representations from large amounts of text documents ignore emotional information, which is a significant research problem that must be addressed. To solve this problem, we propose an emotional word embedding (EWE) model for sentiment analysis. The method first applies pre-trained word vectors to represent document features using two different linear weighting methods. The resulting document vectors are then input to a neural-network classification model and used to train a text sentiment classifier. In this way, the emotional polarity of the text is propagated into the word vectors. The experimental results on three kinds of real-world data sets demonstrate that the proposed EWE model achieves superior performance on text sentiment prediction, text similarity calculation, and word emotional expression tasks compared to other state-of-the-art models.
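The first step, building a document vector as a linear weighting of pre-trained word vectors, can be sketched as follows. The abstract does not name its two weighting methods, so this example assumes two common choices purely for illustration: a uniform average and an average with per-word weights supplied by the caller (for instance IDF-style weights). The `doc_vector` function and the toy vectors are hypothetical.

```python
def doc_vector(tokens, vectors, weights=None):
    """Weighted average of the word vectors of a document's tokens.

    tokens  : list of words in the document
    vectors : dict mapping word -> list of floats (pre-trained embedding)
    weights : optional dict mapping word -> positive weight; uniform if None
    """
    known = [t for t in tokens if t in vectors]
    if not known:
        raise ValueError("no token has a pre-trained vector")
    dim = len(vectors[known[0]])
    total = [0.0] * dim
    weight_sum = 0.0
    for t in known:
        # Words without an explicit weight fall back to 1.0.
        w = 1.0 if weights is None else weights.get(t, 1.0)
        weight_sum += w
        for i, x in enumerate(vectors[t]):
            total[i] += w * x
    return [x / weight_sum for x in total]

# Toy two-dimensional "pre-trained" vectors (illustrative values only).
toy_vectors = {"good": [1.0, 0.0], "movie": [0.0, 1.0]}
uniform = doc_vector(["good", "movie"], toy_vectors)
weighted = doc_vector(["good", "movie"], toy_vectors, {"good": 3.0, "movie": 1.0})
```

Either vector could then be fed to a classifier; in the EWE setting the training signal from that classifier is what carries sentiment information back into the word vectors.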


2019 ◽  
Vol 34 (4) ◽  
pp. 295-310 ◽  
Author(s):  
Huyen T M Nguyen ◽  
Hung V Nguyen ◽  
Quyen T Ngo ◽  
Luong X Vu ◽  
Vu Mai Tran ◽  
...  

Sentiment analysis is a natural language processing (NLP) task of identifying or extracting the sentiment content of a text unit. This task has been an active research topic since the early 2000s. During the last two editions of the VLSP workshop series, a shared task on Sentiment Analysis (SA) for Vietnamese was organized to provide an objective evaluation of the performance (quality) of sentiment analysis tools, to encourage the development of Vietnamese sentiment analysis systems, and to provide benchmark datasets for this task. The first campaign in 2016 focused only on sentiment polarity classification, with a dataset containing reviews of electronic products. The second campaign in 2018 addressed the problem of Aspect Based Sentiment Analysis (ABSA) for Vietnamese, providing two datasets containing reviews in the restaurant and hotel domains. These data are accessible for research purposes via the VLSP website vlsp.org.vn/resources. This paper describes the datasets built as well as the evaluation results of the systems participating in these campaigns.


Author(s):  
Yong Li ◽  
Qingyu Jin ◽  
Min Zuo ◽  
Haisheng Li ◽  
Xiaojun Yang ◽  
...  

Sentiment analysis has become one of the most active research hotspots in natural language processing in recent years. However, the inability to fully and effectively use emotional information remains a problem in current deep learning models. A single Chinese character has different meanings in different words, so character embeddings are combined with word embeddings to extract more precise meaning. In this paper, both single Chinese characters and words are used as input units for training. Based on BLSTM, an attention mechanism built on the vocabulary semantics of the food field is introduced to extract distance-related sequence semantic features. A CNN is then used to perform sentiment classification on these sequence semantic features. On this basis, a model based on multiple neural networks for sentiment information extraction and analysis is proposed. Experiments show that the model performs well in sentiment analysis and obtains high accuracy and F value.


Author(s):  
Bhushan R. Chincholkar

Sentiment analysis is one of the fastest growing fields, with demand and potential benefits that increase every day. Sentiment analysis aims to classify the polarity of a document through natural language processing and text analysis. With the help of the internet and modern technology, there has been tremendous growth in the amount of data. Each individual is in a position to express his or her own ideas freely on social media. All of this data can be analyzed and used to draw benefits and quality information. In this paper, the focus is on cyber-hate classification based on public opinion or views, since the spread of hate speech through social media can have disruptive impacts on social sentiment analysis. In particular, we propose a modified approach with two-stage training for dealing with text ambiguity and classifying three types of sentiment (positive, negative, and neutral), and compare its performance with popular methods as well as some existing fuzzy approaches. We then compare the performance of the proposed approach with commonly used sentiment classifiers that are known to perform well in this task. The experimental results indicate that our modified approach performs marginally better than the other algorithms.


Information ◽  
2021 ◽  
Vol 12 (9) ◽  
pp. 374
Author(s):  
Babacar Gaye ◽  
Dezheng Zhang ◽  
Aziguli Wulamu

With the extensive availability of social media platforms, Twitter has become a significant tool for the acquisition of people's views, opinions, attitudes, and emotions towards certain entities. Within this frame of reference, sentiment analysis of tweets has become one of the most fascinating research areas in the field of natural language processing. A variety of techniques have been devised for sentiment analysis, but there is still room for improvement where the accuracy and efficacy of the system are concerned. This study proposes a novel approach that exploits the advantages of the lexical dictionary, machine learning, and deep learning classifiers. We classified the tweets based on the sentiments extracted by TextBlob using a stacked ensemble of three long short-term memory (LSTM) networks as base classifiers and logistic regression (LR) as a meta classifier. The proposed model proved to be effective and time-saving since it does not require feature extraction, as LSTM extracts features without any human intervention. We compared our proposed approach with conventional machine learning models such as logistic regression, AdaBoost, and random forest, and also with state-of-the-art deep learning models. Experiments were conducted on the sentiment140 dataset and were evaluated in terms of accuracy, precision, recall, and F1 score. Empirical results showed that our proposed approach achieved state-of-the-art results, with an accuracy score of 99%.
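The four evaluation measures named above have standard definitions for a binary task. The sketch below computes them from true and predicted labels; the function name and the toy labels are hypothetical and are not taken from the study.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 score for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Toy labels: 1 = positive tweet, 0 = negative tweet.
scores = binary_metrics([1, 1, 1, 0, 0, 0, 0, 0], [1, 1, 0, 1, 0, 0, 0, 0])
```

Reporting precision, recall, and F1 alongside accuracy matters because accuracy alone can look high on imbalanced data even when one class is classified poorly.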


Author(s):  
Nurul Husna Mahadzir et al.

In recent times, sentiment analysis has become one of the most active and increasingly popular research areas in information retrieval and text mining. To date, sentiment analysis has been applied in various domains such as product, movie, sport, and political reviews. Most previous work in this field has focused on analyzing a single language, especially English. However, with globalization and the increasing number of Internet users worldwide, it is common to see posts written in multiple languages. Moreover, in unstructured content such as Twitter posts, people tend to mix languages within one sentence, which makes sentiment analysis even more challenging. This paper reviews the state of the art of sentiment analysis for code-mixed text, including detailed discussions of each focus area, a qualitative comparison, and the limitations of current approaches. It also highlights challenges along this line of research and suggests several recommendations for future work.


Author(s):  
Lucia Specia ◽  
Yorick Wilks

Machine Translation (MT) is and always has been a core application in the field of natural language processing. It is a very active research area and has been attracting significant commercial interest, most of which has been driven by the deployment of corpus-based, statistical approaches, which can be built in a much shorter time and at a fraction of the cost of traditional, rule-based approaches, and yet produce translations of comparable or superior quality. This chapter aims to introduce MT and its main approaches. It provides a historical overview of the field, an introduction to different translation methods, both rationalist (rule-based) and empirical, and a more in-depth description of state-of-the-art statistical methods. Finally, it covers popular metrics for evaluating the output of machine translation systems.


2021 ◽  
Vol 12 (2) ◽  
pp. 1-24
Author(s):  
Md Abul Bashar ◽  
Richi Nayak

Language models (LMs) have become a common method of transfer learning in Natural Language Processing (NLP) tasks when working with small labeled datasets. An LM is pretrained on an easily available large unlabelled text corpus and is fine-tuned with the labelled data to apply it to the target (i.e., downstream) task. As an LM is designed to capture the linguistic aspects of semantics, it can be biased toward linguistic features. We argue that exposing an LM during fine-tuning to instances that capture diverse semantic aspects (e.g., topical, linguistic, semantic relations) present in the dataset will improve its performance on the underlying task. We propose a Mixed Aspect Sampling (MAS) framework to sample instances that capture different semantic aspects of the dataset and use an ensemble classifier to improve the classification performance. Experimental results show that MAS performs better than random sampling as well as state-of-the-art active learning models on abuse detection tasks, where it is hard to collect the labelled data for building an accurate classifier.

