Leveraging Pre-Trained Language Model for Summary Generation on Short Text

IEEE Access ◽  
2020 ◽  
Vol 8 ◽  
pp. 228798-228803
Author(s):  
Shuai Zhao ◽  
Fucheng You ◽  
Zeng Yuan Liu

2021 ◽  
Vol 15 (6) ◽  
pp. 1-22
Author(s):  
Yashen Wang ◽  
Huanhuan Zhang ◽  
Zhirun Liu ◽  
Qiang Zhou

For guiding natural language generation, many semantic-driven methods have been proposed. While they clearly improve performance on end-to-end training tasks, these existing semantic-driven methods still have clear limitations: (i) they utilize only shallow semantic signals (e.g., from topic models) with a single stochastic hidden layer in their data generation process, which makes them susceptible to noise (especially on short texts) and hard to interpret; and (ii) they ignore sentence order and document context, treating each document as a bag of sentences and thus failing to capture long-distance dependencies and the global semantic meaning of a document. To overcome these problems, we propose a novel semantic-driven language modeling framework that learns a Hierarchical Language Model and a Recurrent Conceptualization-enhanced Gamma Belief Network simultaneously. For scalable inference, we develop auto-encoding Variational Recurrent Inference, allowing efficient end-to-end training while capturing global semantics from a text corpus. In particular, this article introduces concept information derived from Probase, a high-quality lexical knowledge graph, which lends strong interpretability and anti-noise capability to the proposed model. Moreover, the proposed model captures not only intra-sentence word dependencies but also temporal transitions between sentences and inter-sentence concept dependencies. Experiments conducted on several NLP tasks validate the superiority of the proposed approach, which can effectively infer the meaningful hierarchical concept structure of a document and hierarchical multi-scale structures of sequences, even when compared with the latest state-of-the-art Transformer-based models.
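The hierarchical idea behind such models — encode each sentence, then model transitions between sentence representations so that sentence order matters — can be sketched minimally. This is an illustrative toy, not the authors' model: the hash-based pseudo-embeddings, the averaging encoder, and the decayed tanh update are all invented stand-ins for learned components.

```python
import math
import hashlib

DIM = 8

def word_vec(word):
    # Deterministic pseudo-embedding from a hash (stand-in for learned embeddings).
    h = hashlib.md5(word.encode()).digest()
    return [b / 255.0 - 0.5 for b in h[:DIM]]

def sentence_vec(sentence):
    # Bag-of-words average captures intra-sentence content only.
    vecs = [word_vec(w) for w in sentence.split()]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(DIM)]

def document_state(sentences, decay=0.5):
    # Recurrent update over sentence vectors: unlike a bag-of-sentences
    # model, the resulting document state depends on sentence order.
    state = [0.0] * DIM
    for s in sentences:
        sv = sentence_vec(s)
        state = [math.tanh(decay * st + (1 - decay) * x)
                 for st, x in zip(state, sv)]
    return state

doc = ["the model reads text", "it tracks sentence order"]
print(document_state(doc) != document_state(list(reversed(doc))))  # order matters
```

Swapping the two sentences changes the final state, which is exactly the property a bag-of-sentences model lacks.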


2020 ◽  
Vol 34 (05) ◽  
pp. 8253-8260 ◽  
Author(s):  
Xin Li ◽  
Piji Li ◽  
Wei Bi ◽  
Xiaojiang Liu ◽  
Wai Lam

Despite the effectiveness of the sequence-to-sequence framework on the task of Short-Text Conversation (STC), the issue of under-exploitation of training data (i.e., the supervision signals from the query text are ignored) remains unresolved. Also, the adopted maximization-based decoding strategies, which are inclined to generate generic or repetitive responses, are unsuited to the STC task. In this paper, we propose to formulate the STC task as a language modeling problem and tailor-make a training strategy to adapt a language model for response generation. To enhance generation performance, we design a relevance-promoting transformer language model, which performs additional supervised source attention after the self-attention to increase the importance of informative query tokens in calculating the token-level representation. The model further refines the query representation with relevance clues inferred from its multiple references during training. In testing, we adopt a randomization-over-maximization strategy to reduce the generation of generic responses. Experimental results on a large Chinese STC dataset demonstrate the superiority of the proposed model on relevance and diversity metrics.
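The contrast between maximization-based decoding and a randomized alternative can be illustrated with a minimal sketch. The abstract does not spell out the paper's exact strategy, so this shows one common randomization scheme (top-k sampling) as an assumption, alongside plain argmax decoding, for a single decoding step:

```python
import random

def argmax_decode(probs):
    # Maximization: always pick the single most likely token,
    # which tends to produce generic responses.
    return max(probs, key=probs.get)

def randomized_decode(probs, k=3, rng=None):
    # Illustrative randomization-over-maximization: restrict to the
    # top-k tokens, then sample proportionally to their probability.
    rng = rng or random.Random()
    top = sorted(probs.items(), key=lambda kv: -kv[1])[:k]
    tokens, weights = zip(*top)
    return rng.choices(tokens, weights=weights, k=1)[0]

step = {"ok": 0.4, "sure": 0.3, "maybe": 0.2, "no": 0.1}
print(argmax_decode(step))           # always "ok"
print(randomized_decode(step, k=3))  # one of "ok", "sure", "maybe"
```

Argmax collapses every query onto the same high-frequency token, while sampling from the truncated distribution trades a little likelihood for diversity.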


2021 ◽  
Vol 18 (2) ◽  
pp. 54-75
Author(s):  
Mingxin Gan ◽  
Xiongtao Zhang

As a typical characteristic of microblog information, short text length makes microblog recommendation hard for new users. Moreover, user cold start makes it difficult to accurately explore the interests of microblog users. Therefore, the authors proposed a microblog recommendation model that integrates both users' interests from their communities and the semantics of their neighbors' microblogs. Based on the Kullback-Leibler (KL) language model, the proposed model estimated an interest-based language model and a microblog-based language model. Specifically, the interest-based language model was estimated from both the user's own word set of interest and that of their community. Meanwhile, the microblog-based language model was estimated by combining the word set of a microblog, the neighbor semantics, and the microblog set. Real data from Sina Weibo were crawled to evaluate recommendation performance. Results showed that the proposed model significantly outperforms state-of-the-art models.
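The core of a KL-based recommender can be sketched in a few lines: estimate a smoothed unigram language model for the user's interests and for each candidate microblog, then rank microblogs by how little they diverge from the interest model. This is a minimal sketch of the general KL language-model idea, not the paper's full model — the community and neighbor components are omitted, and the toy data is invented:

```python
import math
from collections import Counter

def unigram_dist(words, vocab, alpha=0.1):
    # Laplace-smoothed unigram language model over a fixed vocabulary.
    counts = Counter(words)
    total = len(words) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def kl_divergence(p, q):
    # KL(p || q): how poorly q (a microblog model) explains p (the interest model).
    return sum(p[w] * math.log(p[w] / q[w]) for w in p)

interest = "music concert guitar band".split()
blogs = {
    "b1": "guitar band live music".split(),
    "b2": "stock market finance news".split(),
}
vocab = set(interest) | {w for b in blogs.values() for w in b}
p = unigram_dist(interest, vocab)
best = min(blogs, key=lambda b: kl_divergence(p, unigram_dist(blogs[b], vocab)))
print(best)  # the microblog whose language model is closest to the user's interests
```

Smoothing matters here: without it, any interest word absent from a short microblog would make the KL divergence infinite, which is exactly the short-text problem the abstract describes.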


Author(s):  
T. Sashchuk

The article presents the results of a study of the communicative competence of politicians, based on an analysis of the messages on their official pages on the Facebook social network. The research used the following general scientific methods: descriptive and comparative, as well as analysis, synthesis and generalization. The quantitative content analysis method with qualitative elements was used to distinguish the peculiarities of the information messages that provide communication of the deputies of the Verkhovna Rada (Ukrainian Parliament) on their official Facebook pages. Information messages were analyzed by the following three criteria: subject matter, structure and language.

For the first time, the article draws a parallel between communicative competence and the ability to communicate with voters on official pages of Facebook, the most popular social network in Ukraine. As established, communicative competence in the analyzed cases is determined not by education but by the previous professional activity of a politician. The most successful and highest-quality communication came from a current parliamentarian who had worked as a journalist in the past. More than half of the messages that provided successful communication consisted of sufficiently structured short text and a video. The topics cover the activity of the parliamentarian in the Verkhovna Rada and in his district. More than half of the messages are written in the first person.

The findings of the study can be used in teaching such subjects as Political PR and Electronic PR, and may be of interest to politicians and their assistants.

Key words: competence and competency, communicative competence, political discourse, official page of a deputy of the Verkhovna Rada of Ukraine on the Facebook social network, subject matter and structure of the information message, first-person narrative, correspondence of communication to the level of communicative competence.


Vestnik MEI ◽  
2020 ◽  
Vol 5 (5) ◽  
pp. 132-139
Author(s):  
Ivan E. Kurilenko ◽  
Igor E. Nikonov

A method for solving the problem of classifying the short text messages that customers speak when calling an organization's telephone line is considered. To solve this problem, a classifier was developed that is based on a combination of two methods: a description of the subject area in the form of a hierarchy of entities, and plausible reasoning based on the case-based reasoning approach, which is actively used in artificial intelligence systems. In various problems of artificial-intelligence-based data analysis, these methods have shown a high degree of efficiency, scalability, and independence from data structure. As part of using the case-based reasoning approach in the classifier, it is proposed to modify the TF-IDF (Term Frequency - Inverse Document Frequency) measure of assessing text content to take into account known information about the distribution of documents by topic. The proposed modification improves classification quality in comparison with classical measures, since it takes into account information about the distribution of words not only in a separate document or topic but in the entire case base. Experimental results are presented that confirm the effectiveness of the proposed metric and the developed classifier as applied to classifying customer sentences and providing customers with the necessary information depending on the classification result. The developed text classification service prototype is used as part of the voice interaction module in an effort to automate the telephone call routing system, shifting from button-based user-system interaction to voice-based interaction.
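The classical TF-IDF baseline, and one hypothetical way to fold in topic-level document distribution, can be sketched as follows. The abstract does not give the paper's exact formula, so the `topic_tf_idf` adjustment below (down-weighting terms that occur across many topics) is an assumption for illustration only:

```python
import math
from collections import Counter

def tf_idf(term, doc, docs):
    # Classical TF-IDF over a case base of tokenized documents.
    tf = Counter(doc)[term] / len(doc)
    df = sum(1 for d in docs if term in d)
    idf = math.log(len(docs) / (1 + df))
    return tf * idf

def topic_tf_idf(term, doc, docs, topics):
    # Hypothetical topic-aware variant: additionally reward terms that are
    # concentrated in few topics, using a topic-level document frequency.
    topic_df = sum(1 for t in topics.values() if any(term in d for d in t))
    topic_idf = math.log((len(topics) + 1) / (1 + topic_df))
    return tf_idf(term, doc, docs) * (1 + topic_idf)

docs = [["open", "account"], ["close", "account"], ["card", "lost"]]
topics = {"accounts": docs[:2], "cards": docs[2:]}
score = topic_tf_idf("card", docs[2], docs, topics)
```

A term like "card", confined to one topic, gets boosted relative to plain TF-IDF, while a term spread across every topic is penalized — the same intuition as using the distribution of words over the whole case base rather than a single document.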


2018 ◽  
Vol 15 ◽  
pp. 101-112
Author(s):  
So-Hyun Park ◽  
Ae-Rin Song ◽  
Young-Ho Park ◽  
Sun-Young Ihm

Author(s):  
Larisa V. Kalashnikova

The article addresses the problem of nonsense and its role in the development of creative thinking and fantasy, and how the interpretation of nonsense affects children's imagination. The function of imagination inherent to a person, and especially to a child, has a powerful potential: to create artificially new metaphorical models, absurd and most incredible situations based on self-amazement. Children are able to measure the properties of unfamiliar objects against the properties of known things. It is not difficult for small researchers to replace incomprehensible meanings with familiar ones; to think through situations, to make analogies, to transfer the signs and properties of one object to another. The problem of nonsense research is interesting and relevant. The element of play is an integral component of nonsense. In the process of playing, children come to know the world and learn to interact with it, imitating adults' behavior. Imagination and fantasy help the child to invent his own rules of the game, to choose the language elements that best suit his ideas. The child uses the learned productive models of the language system to create his own models and his own language, drawing on language signs: words, morphs, sentences. The children's lexicon stimulates word formation and language nomination processes. Nonsense words are the product of the children's lexicon, speech errors and occasional formations, presented in the form of contamination, phonetic transformations and lexical substitution, implemented according to certain models. The first two models are phonetic imitation and hybrid speech, based on the natural language model. The third model of constructing nonsense is represented by words that have no meaning at all and can be classed as portmanteau words.
Due to the flexibility of interframe relationships and the lack of algorithmic thinking, children can not only capture the implicit similarity of objects and phenomena but also create it through their imagination. Interpretation of nonsense is an effective method of developing imagination in children, because metaphors and nonsense, as means of creating new meanings and modeling new content from fragments of one's own experience, are a powerful incentive for creative thinking.

