Clustered genetic semantic graph approach for multi-document abstractive summarization

Author(s):  
Atif Khan ◽  
Naomie Salim ◽  
Haleem Farman
Author(s):  
J. Balaji ◽  
T.V. Geetha ◽  
Ranjani Parthasarathi

Customization of information from web documents is an immense job that mainly involves shortening the original texts. This task is carried out using summarization techniques. In general, automatically generated summaries are of two types: extractive and abstractive. Extractive methods use surface-level and statistical features to select important sentences, without considering the meaning conveyed by those sentences. In contrast, abstractive methods need a formal semantic representation, where the selection of important components and the rephrasing of the selected components are carried out using the semantic features associated with the words as well as the context. Furthermore, a deep linguistic analysis is needed for generating summaries. The bottleneck of abstractive summarization is that it requires semantic representation, inference rules and natural language generation. In this paper, the authors propose a semi-supervised bootstrapping approach for the identification of important components for abstractive summarization. The input to the proposed approach is a fully connected semantic graph of a document: semantic graphs are constructed for individual sentences and then connected by synonym concepts and co-referring entities to form a complete semantic graph. The direction of the traversal of nodes is determined by a modified spreading activation algorithm, in which the importance of nodes and edges is decided based on the node under consideration and its connected edges. The summary obtained using the proposed approach is compared with extractive and template-based summaries, and is also evaluated using ROUGE scores.
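As an illustration of the ROUGE evaluation mentioned above, the following is a minimal sketch of ROUGE-N recall with clipped n-gram counts. It assumes plain whitespace tokenization and lowercasing, with no stemming or stop-word removal; the function names are hypothetical, and this is not the evaluation setup used in the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Counts are clipped, so a repeated n-gram is only credited as often
    as it appears in the candidate summary.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the flood damaged the city"
    candidate = "the flood hit the city"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

Published ROUGE results usually rely on the official toolkit or an established reimplementation, so scores from a sketch like this are not directly comparable to reported numbers.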


2021 ◽  
Author(s):  
Wenhao Wu ◽  
Wei Li ◽  
Xinyan Xiao ◽  
Jiachen Liu ◽  
Ziqiang Cao ◽  
...  

Author(s):  
Balaji Jagan ◽  
Ranjani Parthasarathi ◽  
Geetha T. V.

Customization of information from web documents is an immense job that mainly involves shortening the original texts. Extractive methods use surface-level and statistical features to select important sentences. In contrast, abstractive methods need a formal semantic representation, where the selection of important components and the rephrasing of the selected components are carried out using the semantic features associated with the words as well as the context. In this paper, we propose a semi-supervised bootstrapping approach for the identification of important components for abstractive summarization. The input to the proposed approach is a fully connected semantic graph of a document: semantic graphs are constructed for individual sentences and then connected by synonym concepts and co-referring entities to form a complete semantic graph. The direction of the traversal of nodes is determined by a modified spreading activation algorithm, in which the importance of nodes and edges is decided based on the node under consideration and its connected edges.
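The abstract does not say how the spreading activation algorithm is modified, so the sketch below shows only plain spreading activation over a small weighted concept graph. The `decay`, `threshold` and `max_steps` parameters, the edge weights and the example concepts are all assumptions made for illustration.

```python
from collections import defaultdict


def spreading_activation(edges, seeds, decay=0.5, threshold=0.05, max_steps=10):
    """Plain spreading activation over a weighted undirected graph.

    edges: iterable of (node_a, node_b, weight), weight in (0, 1].
    seeds: dict mapping seed nodes to their initial activation.
    Returns a dict mapping each reached node to the highest activation
    it received.
    """
    neighbours = defaultdict(list)
    for a, b, weight in edges:
        neighbours[a].append((b, weight))
        neighbours[b].append((a, weight))

    activation = dict(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = {}
        for node, value in frontier.items():
            for other, weight in neighbours[node]:
                passed = value * weight * decay
                if passed < threshold:
                    continue  # too weak to propagate further
                if passed > activation.get(other, 0.0):
                    activation[other] = passed
                    next_frontier[other] = passed
        if not next_frontier:
            break
        frontier = next_frontier
    return activation


if __name__ == "__main__":
    # Hypothetical concept graph; weights stand in for edge importance.
    edges = [
        ("flood", "city", 1.0),
        ("city", "mayor", 0.8),
        ("mayor", "speech", 0.2),
    ]
    print(spreading_activation(edges, {"flood": 1.0}))
```

In this sketch, nodes whose incoming activation falls below the threshold are never reached, which is one simple way to restrict traversal to the more important parts of a graph.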


2015 ◽  
Vol 77 (18) ◽  
Author(s):  
Atif Khan ◽  
Naomie Salim ◽  
Waleed Reafee ◽  
Anupong Sukprasert ◽  
Yogan Jaya Kumar

Multi-document abstractive summarization aims to create a compact version of the source text that preserves the important information. Existing graph-based methods rely on the Bag of Words approach, which treats a sentence as a bag of words and relies on a content similarity measure. The obvious limitation of the Bag of Words approach is that it ignores semantic relationships among words, so the summary produced from the source text may not be adequate. This paper proposes a clustered semantic graph-based approach for multi-document abstractive summarization. The approach employs semantic role labeling (SRL) to extract the semantic structure (predicate argument structures) from the document text. The predicate argument structures (PASs) are compared pairwise using the Lin semantic similarity measure to build a semantic similarity matrix, which is then represented as a semantic graph in which the vertices represent the PASs and the edges carry the semantic similarity weights between the vertices. Content selection for the summary is made by ranking the important graph vertices (PASs) with a modified graph-based ranking algorithm. Agglomerative hierarchical clustering is performed to eliminate redundancy: the representative PAS with the highest salience score is chosen from each cluster and fed to language generation to produce summary sentences. The experiments in this study are performed on DUC-2002, a standard corpus for text summarization. Experimental results reveal that the proposed approach outperforms other summarization systems.
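The content-selection steps described above can be sketched roughly as follows. The abstract does not specify the modified ranking algorithm or the clustering linkage, so this sketch substitutes a weighted PageRank-style ranking and average-link clustering with a similarity cut-off. The similarity matrix is assumed to be precomputed (for example with the Lin measure), and all function names, parameters and example values are hypothetical.

```python
def rank_vertices(sim, damping=0.85, iterations=50):
    """Weighted PageRank-style salience scores for a symmetric similarity matrix."""
    n = len(sim)
    out_weight = [sum(sim[i][j] for j in range(n) if j != i) for i in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if j != i and out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def agglomerative_clusters(sim, min_similarity=0.5):
    """Average-link agglomerative clustering.

    Merging stops once no pair of clusters has an average similarity
    of at least min_similarity.
    """
    clusters = [[i] for i in range(len(sim))]

    def average_link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best_pair, best_value = None, min_similarity
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                value = average_link(clusters[x], clusters[y])
                if value >= best_value:
                    best_pair, best_value = (x, y), value
        if best_pair is None:
            break
        x, y = best_pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


def select_representatives(sim):
    """Pick the highest-scoring vertex (PAS index) from each cluster."""
    scores = rank_vertices(sim)
    clusters = agglomerative_clusters(sim)
    return sorted(max(cluster, key=lambda i: scores[i]) for cluster in clusters)


if __name__ == "__main__":
    # Hypothetical pairwise similarities between four PASs.
    sim = [
        [1.0, 0.9, 0.1, 0.2],
        [0.9, 1.0, 0.1, 0.1],
        [0.1, 0.1, 1.0, 0.8],
        [0.2, 0.1, 0.8, 1.0],
    ]
    print(agglomerative_clusters(sim))
    print(select_representatives(sim))
```

Keeping one representative per cluster is what removes redundancy here: near-duplicate PASs end up in the same cluster, and only the most salient one is passed on to language generation.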


2021 ◽  
pp. 1-15
Author(s):  
Qiwei Bi ◽  
Haoyuan Li ◽  
Kun Lu ◽  
Hanfang Yang

2021 ◽  
pp. 106996
Author(s):  
Xiaoyan Cai ◽  
Kaile Shi ◽  
Yuehan Jiang ◽  
Libin Yang ◽  
Sen Liu
