Targeted Sentiment Classification Based on Attentional Encoding and Graph Convolutional Networks

2020 ◽  
Vol 10 (3) ◽  
pp. 957 ◽  
Author(s):  
Luwei Xiao ◽  
Xiaohui Hu ◽  
Yinong Chen ◽  
Yun Xue ◽  
Donghong Gu ◽  
...  

Targeted sentiment classification aims to predict the emotional trend toward a specific target. Currently, most methods (e.g., recurrent neural networks and convolutional neural networks combined with an attention mechanism) cannot fully capture the semantic information of the context, and they lack a mechanism to account for relevant syntactic constraints and long-range word dependencies. Therefore, syntactically irrelevant context words may mistakenly be recognized as clues for predicting the target sentiment. To tackle these problems, this paper argues that semantic information, syntactic information, and their interaction are crucial to targeted sentiment analysis, and proposes an attentional-encoding-based graph convolutional network (AEGCN) model. The proposed model is mainly composed of multi-head attention and an improved graph convolutional network built over the dependency tree of a sentence. Pre-trained BERT is applied to this task, and new state-of-the-art performance is achieved. Experiments on five datasets show the effectiveness of the proposed model compared with a series of recent models.
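As a concrete illustration of the two channels this abstract describes, here is a minimal PyTorch sketch pairing a multi-head-attention branch (semantics) with a dependency-tree GCN branch (syntax). This is not the authors' code; the class names, the mean-aggregation rule, and the final concatenation are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DependencyGCNLayer(nn.Module):
    """One graph-convolution step over a dependency-tree adjacency matrix."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # h: (batch, seq_len, dim); adj: (batch, seq_len, seq_len), 0/1 arcs
        deg = adj.sum(dim=-1, keepdim=True).clamp(min=1)  # node degrees
        h = adj @ self.linear(h) / deg                    # mean-aggregate neighbours
        return F.relu(h)

class AEGCNSketch(nn.Module):
    """Hypothetical pairing of multi-head self-attention (semantic channel)
    with dependency-tree GCN layers (syntactic channel)."""
    def __init__(self, dim=768, heads=8, gcn_layers=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.gcn = nn.ModuleList(DependencyGCNLayer(dim) for _ in range(gcn_layers))

    def forward(self, h, adj):
        sem, _ = self.attn(h, h, h)        # semantic representation
        syn = h
        for layer in self.gcn:             # syntactic representation
            syn = layer(syn, adj)
        return torch.cat([sem, syn], dim=-1)
```

In this reading, `h` would come from pre-trained BERT and `adj` from an off-the-shelf dependency parser; how the paper actually fuses the two channels may differ.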

2021 ◽  
Vol 11 (21) ◽  
pp. 9910
Author(s):  
Yo-Han Park ◽  
Gyong-Ho Lee ◽  
Yong-Seok Choi ◽  
Kong-Joo Lee

Sentence compression is a natural language processing task that produces a short paraphrase of an input sentence by deleting words while ensuring grammatical correctness and preserving the core information. This study introduces a graph convolutional network (GCN) into the sentence compression task to encode syntactic information such as dependency trees. Because we extend the GCN to handle directed edges, a compression model with GCN layers can distinguish between parent and child nodes in a dependency tree when aggregating adjacent nodes. Furthermore, by increasing the number of GCN layers, the model can gradually collect higher-order information from the dependency tree as node information propagates through the layers. We implement sentence compression models for Korean and English. Each model consists of three components: a pre-trained BERT model, GCN layers, and a scoring layer. The scoring layer determines whether a word should remain in the compressed sentence, relying on the word vector containing the contextual and syntactic information encoded by the BERT and GCN layers. To train and evaluate the proposed model, we used the Google sentence compression dataset for English and a Korean sentence compression corpus containing about 140,000 sentence pairs. The experimental results demonstrate that the proposed model achieves state-of-the-art performance for English. To the best of our knowledge, this is the first attempt for Korean to train a deep-learning-based sentence compression model on a large-scale corpus.
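The per-token keep/drop decision can be made concrete with a small PyTorch sketch following the abstract's description: separate weights for head-to-dependent and dependent-to-head arcs, and a sigmoid scoring layer over BERT states. Names such as DirectedGCNLayer and DeletionScorer are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DirectedGCNLayer(nn.Module):
    """GCN layer with separate weights for head->dependent and
    dependent->head arcs, so parents and children are distinguished."""
    def __init__(self, dim):
        super().__init__()
        self.w_child = nn.Linear(dim, dim)   # messages from dependents
        self.w_parent = nn.Linear(dim, dim)  # messages from heads
        self.w_self = nn.Linear(dim, dim)

    def forward(self, h, adj):
        # adj[b, i, j] = 1 if token i is the head of token j
        down = adj @ self.w_child(h)                 # aggregate children
        up = adj.transpose(1, 2) @ self.w_parent(h)  # aggregate parents
        return F.relu(self.w_self(h) + down + up)

class DeletionScorer(nn.Module):
    """Scores each token's probability of surviving compression."""
    def __init__(self, dim):
        super().__init__()
        self.gcn = DirectedGCNLayer(dim)
        self.score = nn.Linear(dim, 1)

    def forward(self, bert_states, adj):
        h = self.gcn(bert_states, adj)                   # contextual + syntactic
        return torch.sigmoid(self.score(h)).squeeze(-1)  # keep probabilities
```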


2021 ◽  
Vol 11 (8) ◽  
pp. 3640
Author(s):  
Guangtao Xu ◽  
Peiyu Liu ◽  
Zhenfang Zhu ◽  
Jie Liu ◽  
Fuyong Xu

The purpose of aspect-based sentiment classification is to identify the sentiment polarity of each aspect in a sentence. Recently, owing to the introduction of Graph Convolutional Networks (GCNs), more and more studies have used sentence structure information to establish the connection between aspects and opinion words. However, the accuracy of these methods is limited by noisy information and dependency tree parsing performance. To solve this problem, we propose an attention-enhanced graph convolutional network (AEGCN) for aspect-based sentiment classification with multi-head attention (MHA). The proposed method better combines semantic and syntactic information by introducing MHA and GCN, and an attention mechanism is added to the GCN to enhance its performance. To verify the effectiveness of the proposed method, we conducted extensive experiments on five benchmark datasets. The experimental results show that the proposed method makes more reasonable use of semantic and syntactic information and further improves the performance of the GCN.
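One plausible reading of "adding an attention mechanism to GCN" is to re-weight each node's dependency neighbours by learned affinities before aggregating. The PyTorch sketch below implements that reading; the scaled dot-product scoring and masking choices are assumptions, not this paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGCNLayer(nn.Module):
    """GCN layer whose neighbour aggregation is re-weighted by attention,
    one guess at the attention-enhanced GCN idea in the abstract."""
    def __init__(self, dim):
        super().__init__()
        self.linear = nn.Linear(dim, dim)
        self.scale = dim ** 0.5

    def forward(self, h, adj):
        scores = (h @ h.transpose(1, 2)) / self.scale         # pairwise affinities
        scores = scores.masked_fill(adj == 0, float("-inf"))  # keep syntax arcs only
        attn = torch.softmax(scores, dim=-1)
        attn = torch.nan_to_num(attn)                         # rows with no arcs -> 0
        return F.relu(attn @ self.linear(h))
```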


2021 ◽  
Vol 11 (4) ◽  
pp. 1528
Author(s):  
Jie Liu ◽  
Peiyu Liu ◽  
Zhenfang Zhu ◽  
Xiaowen Li ◽  
Guangtao Xu

Aspect-based sentiment classification aims at determining the sentiment associated with a particular aspect. Many sophisticated approaches, such as attention mechanisms and Graph Convolutional Networks, have been widely used to address this challenge. However, most previous methods neither adequately analyze the role of individual words and long-distance dependencies nor fully realize the interaction between context and aspect terms, which greatly limits their effectiveness. In this paper, we propose an effective and novel method using an attention mechanism and a graph convolutional network (ATGCN). Firstly, we make full use of multi-head attention and a point-wise convolution transformation to obtain the hidden states. Secondly, we introduce position coding into the model and use Graph Convolutional Networks to capture syntactic information and long-distance dependencies. Finally, the interaction between context and aspect terms is further realized by bidirectional attention. Experiments on three benchmark collections indicate the effectiveness of ATGCN.
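The abstract names two concrete ingredients, the point-wise convolution transformation and position coding; a minimal sketch of both follows. The decay formula in position_weights is a hypothetical choice made for illustration; the paper may weight positions differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PointwiseConv(nn.Module):
    """Point-wise convolution transformation: two 1x1 convolutions
    applied position by position after multi-head attention."""
    def __init__(self, dim):
        super().__init__()
        self.conv1 = nn.Conv1d(dim, dim, kernel_size=1)
        self.conv2 = nn.Conv1d(dim, dim, kernel_size=1)

    def forward(self, h):                       # h: (batch, seq_len, dim)
        x = h.transpose(1, 2)                   # Conv1d wants (batch, dim, seq_len)
        x = self.conv2(F.relu(self.conv1(x)))
        return x.transpose(1, 2)

def position_weights(seq_len, aspect_start, aspect_end):
    """Hypothetical position coding: decay a token's weight with its
    distance from the aspect-term span [aspect_start, aspect_end]."""
    pos = torch.arange(seq_len, dtype=torch.float)
    dist = torch.where(pos < aspect_start, aspect_start - pos,
                       torch.clamp(pos - aspect_end, min=0))
    return 1.0 - dist / seq_len                 # (seq_len,), 1 inside the span
```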


2020 ◽  
Vol 34 (01) ◽  
pp. 27-34 ◽  
Author(s):  
Lei Chen ◽  
Le Wu ◽  
Richang Hong ◽  
Kun Zhang ◽  
Meng Wang

Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), by treating the user-item interaction behavior as a bipartite graph, some researchers have modeled higher-layer collaborative signals with GCNs. These GCN-based recommender models show superior performance compared to traditional approaches. However, they suffer from training difficulties caused by non-linear activations on large user-item graphs. Besides, most GCN-based models cannot go deeper due to the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN-based CF models from two aspects. First, we empirically show that removing the non-linearities enhances recommendation performance, which is consistent with the theory behind simple graph convolutional networks. Second, we propose a residual network structure specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem of graph convolution aggregation with sparse user-item interaction data. The proposed model is linear, easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.
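The two claims, removing non-linearities and predicting from a residual combination of all layers, can be summarized in a short linear-propagation sketch. LinearGCCF and its concatenation-based residual are illustrative guesses; the released code at the URL above is authoritative.

```python
import torch
import torch.nn as nn

class LinearGCCF(nn.Module):
    """Linear residual graph convolution for CF, in the spirit of the paper:
    no non-linear activation, and every layer's embedding contributes to
    the final prediction (residual structure)."""
    def __init__(self, num_nodes, dim=64, num_layers=3):
        super().__init__()
        self.emb = nn.Embedding(num_nodes, dim)  # users and items in one table
        self.num_layers = num_layers

    def forward(self, norm_adj):
        # norm_adj: (num_nodes, num_nodes) symmetrically normalized
        # user-item bipartite adjacency, here a dense tensor for simplicity.
        h = self.emb.weight
        outs = [h]
        for _ in range(self.num_layers):
            h = norm_adj @ h                     # linear propagation, no ReLU
            outs.append(h)
        return torch.cat(outs, dim=-1)           # residual: concat all layers

    def score(self, final_emb, users, items):
        # inner-product preference prediction
        return (final_emb[users] * final_emb[items]).sum(-1)
```

Because no layer applies an activation, the whole propagation is a fixed linear map of the embeddings, which is what makes the model easy to train at scale.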


2019 ◽  
Vol 9 (7) ◽  
pp. 1330 ◽  
Author(s):  
Yalong Jiang ◽  
Zheru Chi

Although state-of-the-art performance has been achieved in pixel-specific tasks such as saliency prediction and depth estimation, convolutional neural networks (CNNs) still perform unsatisfactorily in human parsing, where semantic information of detailed regions needs to be perceived under variations in viewpoints, poses, and occlusions. In this paper, we propose to improve the robustness of human parsing modules by introducing a depth-estimation module. A novel scheme is proposed for integrating the depth-estimation module with the human-parsing module, and the robustness of the overall model is improved with automatically obtained depth labels. As another major concern, computational efficiency is also discussed. Our proposed human parsing module with 24 layers achieves a performance similar to that of the baseline CNN model with over 100 layers, and the overall model has fewer parameters than the baseline. Furthermore, we propose to reduce the computational burden by replacing a conventional CNN layer with a stack of simplified sub-layers, further reducing the overall number of trainable parameters. Experimental results show that the integration of the two modules improves human parsing without additional human labeling. The proposed model outperforms the benchmark solutions, and its capacity is better matched to the complexity of the task.
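The abstract does not specify how a conventional layer is decomposed into "simplified sub-layers"; one standard decomposition with the same flavor is a depthwise convolution followed by a point-wise one, sketched below to show the parameter saving. This is an assumption about the technique, not the paper's definition.

```python
import torch.nn as nn

def simplified_block(in_ch, out_ch):
    """One possible reading of 'a stack of simplified sub-layers': a
    depthwise 3x3 convolution followed by a point-wise 1x1 convolution,
    replacing a dense 3x3 layer with far fewer parameters."""
    return nn.Sequential(
        nn.Conv2d(in_ch, in_ch, 3, padding=1, groups=in_ch),  # depthwise
        nn.BatchNorm2d(in_ch),
        nn.ReLU(inplace=True),
        nn.Conv2d(in_ch, out_ch, 1),                          # pointwise
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Weight count: 3*3*C + C*K here, versus 3*3*C*K for a conventional layer.
```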


PLoS ONE ◽  
2021 ◽  
Vol 16 (9) ◽  
pp. e0257092
Author(s):  
Jianyi Liu ◽  
Xi Duan ◽  
Ru Zhang ◽  
Youqiang Sun ◽  
Lei Guan ◽  
...  

Recent relation extraction architectures have evolved from shallow neural networks, such as convolutional and recurrent neural networks, to pre-trained language models such as BERT. However, these methods do not fully consider the internal semantic information of the sequence or the long-distance dependence problem, even though that internal semantic information may contain knowledge useful for relation classification. Focusing on these problems, this paper proposes a BERT-based relation classification method. Compared with existing BERT-based architectures, the proposed model can capture the internal semantic information between an entity pair and better handle long-distance semantic dependence. A pre-trained BERT model is fine-tuned to abstract the semantic representation of the sequence, and piecewise convolution is then adopted to extract the semantic information that influences the extraction results. Owing to the internal semantic information extracted from the sequence, the proposed method achieves better accuracy on the relation extraction task than existing methods. Meanwhile, generalization ability remains a problem that cannot be ignored, because the numbers of instances differ greatly between relation categories. The focal loss function is adopted to address this imbalance by assigning heavier weights to rare or hard-to-classify categories. Finally, compared with existing methods, the proposed method reaches a superior F1 score of 89.95% on the SemEval-2010 Task 8 dataset.
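The focal loss the abstract adopts is standard and can be stated precisely; below is a minimal multi-class version in PyTorch. The gamma=2 default is the usual choice from the focal loss literature, not a value reported by this paper.

```python
import torch
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss for multi-class relation classification: down-weights
    easy examples so rare or hard relation types dominate the gradient."""
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)  # log p of true class
    pt = log_pt.exp()
    return (-((1.0 - pt) ** gamma) * log_pt).mean()

# Example: 3 relation classes, batch of 2
logits = torch.tensor([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])
print(focal_loss(logits, targets))  # confident predictions -> small loss
```

When the model is already confident (pt near 1), the (1 - pt)^gamma factor shrinks the loss toward zero, which is exactly the class-imbalance behavior the abstract describes.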


2020 ◽  
Vol 2020 ◽  
pp. 1-9
Author(s):  
Fangyuan Lei ◽  
Xun Liu ◽  
Qingyun Dai ◽  
Bingo Wing-Kuen Ling ◽  
Huimin Zhao ◽  
...  

With the higher-order neighborhood information of a graph network, the accuracy of graph representation learning classification can be significantly improved. However, current higher-order graph convolutional networks have a large number of parameters and high computational complexity. Therefore, we propose a hybrid lower-order and higher-order graph convolutional network (HLHG) learning model, which uses a weight-sharing mechanism to reduce the number of network parameters. To reduce the computational complexity, we propose a novel information fusion pooling layer that combines the higher-order and lower-order neighborhood matrix information. We theoretically compare the computational complexity and the number of parameters of the proposed model with those of other state-of-the-art models. Experimentally, we verify the proposed model on large-scale text network datasets using supervised learning and on citation network datasets using semi-supervised learning. The experimental results show that the proposed model achieves higher classification accuracy with a small set of trainable weight parameters.
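A minimal sketch of how weight sharing across orders and a fusion pooling step might look is given below. Treating "fusion pooling" as an element-wise max over the 1-hop and 2-hop propagations is an assumption, since the abstract does not define the operator.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridOrderGCNLayer(nn.Module):
    """One reading of the hybrid layer: propagate with the 1st-order and
    2nd-order adjacency using a single shared weight matrix, then fuse
    the two neighbourhoods with an element-wise pooling step."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.shared = nn.Linear(in_dim, out_dim)  # weights shared across orders

    def forward(self, h, adj):
        h = self.shared(h)
        low = adj @ h                        # 1-hop neighbourhood
        high = adj @ (adj @ h)               # 2-hop neighbourhood, no new weights
        return F.relu(torch.max(low, high))  # fusion pooling (element-wise max)
```

Note that the 2-hop term reuses the same projected features, so the higher order adds computation but no parameters, consistent with the weight-sharing claim.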


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 566
Author(s):  
Xiaoqiang Chi ◽  
Yang Xiang

Paraphrase generation is an important yet challenging task in natural language processing. Neural network-based approaches have achieved remarkable success in sequence-to-sequence learning. Previous paraphrase generation work generally ignores syntactic information regardless of its availability, with the assumption that neural nets could learn such linguistic knowledge implicitly. In this work, we make an endeavor to probe into the efficacy of explicit syntactic information for the task of paraphrase generation. Syntactic information can appear in the form of dependency trees, which could be easily acquired from off-the-shelf syntactic parsers. Such tree structures could be conveniently encoded via graph convolutional networks to obtain more meaningful sentence representations, which could improve generated paraphrases. Through extensive experiments on four paraphrase datasets with different sizes and genres, we demonstrate the utility of syntactic information in neural paraphrase generation under the framework of sequence-to-sequence modeling. Specifically, our graph convolutional network-enhanced models consistently outperform their syntax-agnostic counterparts using multiple evaluation metrics.
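To make the encoding step concrete, here is a small sketch of a sequence encoder whose states are refined by one dependency-GCN step before decoding. The BiLSTM backbone and the residual fusion are illustrative assumptions; the paper's sequence-to-sequence models may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SyntaxAwareEncoder(nn.Module):
    """Encoder whose contextual states are refined by a GCN over the
    source sentence's dependency tree before being handed to a decoder."""
    def __init__(self, vocab_size, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.LSTM(dim, dim // 2, batch_first=True, bidirectional=True)
        self.gcn = nn.Linear(dim, dim)

    def forward(self, tokens, adj):
        # tokens: (batch, seq_len); adj: (batch, seq_len, seq_len) parse arcs
        h, _ = self.rnn(self.embed(tokens))        # contextual states
        deg = adj.sum(-1, keepdim=True).clamp(min=1)
        syn = F.relu(adj @ self.gcn(h) / deg)      # one dependency-GCN step
        return h + syn                             # residual fusion for the decoder
```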


2020 ◽  
Vol 34 (05) ◽  
pp. 8928-8935
Author(s):  
Kai Sun ◽  
Richong Zhang ◽  
Yongyi Mao ◽  
Samuel Mensah ◽  
Xudong Liu

A large majority of approaches have been proposed to leverage the dependency tree in the relation classification task, and recent works have focused on pruning irrelevant information from it. The state-of-the-art Attention Guided Graph Convolutional Networks (AGGCNs) transform the dependency tree into a weighted graph to distinguish the relevance of nodes and edges for relation classification. However, in that approach the graph is fully connected, which destroys the structure of the original dependency tree. How to make effective use of relevant information while ignoring irrelevant information from dependency trees remains a challenge in the relation classification task. In this work, we learn to transform the dependency tree into a weighted graph by considering the syntactic dependencies of the connected nodes while preserving the structure of the original tree. We refer to this graph as a syntax-transport graph. We further propose a learnable syntax-transport attention graph convolutional network (LST-AGCN) that operates on the syntax-transport graph directly to distill a final representation sufficient for classification. Experiments on SemEval-2010 Task 8 and TACRED show that our approach outperforms previous methods.
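One way to read the syntax-transport construction is sketched below: arc weights derived from dependency relation types, normalized only over the arcs of the original tree, in contrast to AGGCN's fully connected attention. The per-relation scalar embedding is a hypothetical simplification.

```python
import torch
import torch.nn as nn

class SyntaxTransportGraph(nn.Module):
    """A guess at the syntax-transport idea: keep only the original tree's
    arcs, but give each arc a learned weight derived from its dependency
    relation type, rather than fully connecting the graph as AGGCN does."""
    def __init__(self, num_relations):
        super().__init__()
        self.rel_weight = nn.Embedding(num_relations, 1)

    def forward(self, adj, rel_ids):
        # adj: (batch, n, n) 0/1 tree arcs; rel_ids: (batch, n, n) relation ids
        w = self.rel_weight(rel_ids).squeeze(-1)    # per-arc scalar weights
        w = w.masked_fill(adj == 0, float("-inf"))  # non-arcs get zero mass
        w = torch.softmax(w, dim=-1)                # normalise over each node's arcs
        return torch.nan_to_num(w)                  # tree structure is preserved
```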


Author(s):  
Elshan Mustafayev ◽  
Rustam Azimov

Introduction. The implementation of information technologies in various spheres of public life dictates the creation of efficient and productive systems for entering information into computer systems. In such systems, it is important to build an effective recognition module. At the moment, the most effective method for solving this problem is the use of artificial multilayer neural and convolutional networks. The purpose of the paper. This paper is devoted to a comparative analysis of the results of recognizing handwritten characters of the Azerbaijani alphabet using multilayer neural networks and convolutional neural networks. Results. The dependence of the recognition results on the following parameters is analyzed: the architecture of the neural networks, the size of the training base, the choice of the subsampling algorithm, and the use of a feature extraction algorithm. To enlarge the training sample, an image augmentation technique was used: from a real base of 14,000 characters, bases of 28,000, 42,000 and 72,000 characters were formed. A description of the feature extraction algorithm is given. Conclusions. Analysis of the recognition results on the test sample showed the following. As expected, convolutional neural networks produced higher results than multilayer neural networks, and the classical convolutional network LeNet-5 showed the highest results among all types of networks. However, a 3-layer multilayer network whose input was the feature extraction results also showed rather high results, comparable with the convolutional networks. There is no definite advantage in the choice of the method in the subsampling layer; the subsampling method (max-pooling or average-pooling) for a particular model can be selected experimentally. Increasing the training database did not give a tangible improvement in recognition results for the convolutional networks or the networks with preliminary feature extraction; however, for the networks trained without feature extraction, an increase in the size of the database led to a noticeable improvement in performance. Keywords: neural networks, feature extraction, OCR.
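For reference, the LeNet-5 layout the abstract identifies as the best performer is easy to state in PyTorch; the subsampling operator is left as a constructor argument because the paper reports no clear winner between max- and average-pooling. The exact layer sizes follow the classic LeNet-5, which may differ in detail from the authors' variant.

```python
import torch.nn as nn

def lenet5(num_classes, pool="max"):
    """Classic LeNet-5 layout; the subsampling operator is a parameter
    because the abstract reports the max- vs. average-pooling choice
    should be made experimentally."""
    Pool = nn.MaxPool2d if pool == "max" else nn.AvgPool2d
    return nn.Sequential(
        nn.Conv2d(1, 6, 5), nn.Tanh(), Pool(2),    # 32x32 -> 28x28 -> 14x14
        nn.Conv2d(6, 16, 5), nn.Tanh(), Pool(2),   # 14x14 -> 10x10 -> 5x5
        nn.Flatten(),
        nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
        nn.Linear(120, 84), nn.Tanh(),
        nn.Linear(84, num_classes),
    )

# Expects 32x32 single-channel character images, as in the original LeNet-5;
# num_classes would be the size of the Azerbaijani character set.
```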

