Semi-supervised Relational Topic Model for Weakly Annotated Image Recognition in Social Media

Author(s):  
Zhenxing Niu ◽  
Gang Hua ◽  
Xinbo Gao ◽  
Qi Tian
2021 ◽  
Vol 10 (7) ◽  
pp. 474
Author(s):  
Bingqing Wang ◽  
Bin Meng ◽  
Juan Wang ◽  
Siyu Chen ◽  
Jian Liu

Social media data contains information expressed in real time, including text and geographical location. As a new data source for crowd behavior research in the era of big data, it can reflect some aspects of the behavior of residents. In this study, a text classification model based on the BERT and Transformers framework was constructed and used to classify and extract more than 210,000 residents’ festival activities from 1.13 million Sina Weibo (Chinese “Twitter”) records collected in Beijing in 2019. On this basis, word frequency statistics, part-of-speech analysis, topic modeling, sentiment analysis and other methods were used to perceive different types of festival activities and quantitatively analyze the spatial differences between different types of festivals. The results show that traditional culture significantly influences residents’ festivals, reflecting residents’ motivation to participate in festivals, how they participate, and how they express their emotions. There are apparent spatial differences among residents in participating in festival activities. The main festival activities are distributed in the central area within the Fifth Ring Road in Beijing. In contrast, expressions of feelings during festivals are mainly distributed outside the Fifth Ring Road. The research integrates natural language processing, topic model analysis, spatial statistical analysis, and other techniques. It also broadens the application field of social media data, especially text data, providing a new research paradigm for studying residents’ festival activities and adding residents’ perception of the festival. The research results provide a basis for the design and management of the Chinese festival system.


2021 ◽  
Vol 19 (7) ◽  
pp. 59-82
Author(s):  
Md Ashraf Ahmed, PhD Candidate ◽  
Arif Mohaimin Sadri, PhD ◽  
M. Hadi Amini, PhD, DEng

Risk perception and risk-averting behaviors of public agencies in the emergence and spread of COVID-19 can be retrieved through online social media (Twitter), and such interactions can be echoed in other information outlets. This study collected time-sensitive online social media data and analyzed patterns of health risk communication of public health and emergency agencies in the emergence and spread of the novel coronavirus using data-driven methods. The main focus is on understanding how policy-making agencies communicate risk and response information through social media during a pandemic and influence community response (i.e., timing of lockdown, timing of reopening, etc.) and disease outbreak indicators (i.e., number of confirmed cases and number of deaths). Twitter data of six major public organizations (1,000-4,500 tweets per organization) were collected from February 21, 2020 to June 6, 2020. Several machine learning algorithms, including dynamic topic modeling and sentiment analysis, are applied over time to identify the topic dynamics over the specific timeline of the pandemic. Organizations emphasized various topics at different points in the timeline, e.g., the importance of wearing face masks, home quarantine, understanding the symptoms, social distancing and contact tracing, emerging community transmission, lack of personal protective equipment, COVID-19 testing and medical supplies, the effect of tobacco, pandemic stress management, increasing hospitalization rates, the upcoming hurricane season, use of convalescent plasma for COVID-19 treatment, maintaining hygiene, and the role of healthcare podcasts. The findings can help emergency management, policymakers, and public health agencies identify targeted information dissemination policies for a public with diverse needs, based on how local, federal, and international agencies reacted to COVID-19.


2021 ◽  
pp. 1-10
Author(s):  
Wang Gao ◽  
Hongtao Deng ◽  
Xun Zhu ◽  
Yuan Fang

Harmful information identification is a critical research topic in natural language processing. Existing approaches have focused either on rule-based methods or on harmful text identification in normal documents. In this paper, we propose a BERT-based model, called Topic-BERT, to identify harmful information from social media. Firstly, Topic-BERT utilizes BERT to take additional information as input to alleviate the sparseness of short texts. The GPU-DMM topic model is used to capture hidden topics of short texts for attention weight calculation. Secondly, the proposed model divides harmful short text identification into two stages, and labels of different granularity are identified by two similar sub-models. Finally, we conduct extensive experiments on a real-world social media dataset to evaluate our model. Experimental results demonstrate that our model can significantly improve the classification performance compared with baseline methods.


2019 ◽  
Vol 1 (1) ◽  
pp. 45-78
Author(s):  
Chankyung Pak

To disseminate their stories efficiently via social media, news organizations make decisions that resemble traditional editorial decisions. However, the decisions for social media may deviate from traditional ones because they are often made outside the newsroom and guided by audience metrics. This study focuses on selective link sharing as quasi-gatekeeping on Twitter: conditioning a link sharing decision about news content. It illustrates how selective link sharing resembles and deviates from gatekeeping for the publication of news stories. Using a computational data collection method and a machine learning technique called Structural Topic Model (STM), this study shows that selective link sharing generates a different topic distribution between news websites and Twitter and thus significantly revokes the specialty of news organizations. This finding implies that emergent logic, which governs news organizations’ decisions for social media, can undermine the provision of diverse news.


2018 ◽  
Vol 24 (2) ◽  
pp. 221-264
Author(s):  
SABINE GRÜNDER-FAHRER ◽  
ANTJE SCHLAF ◽  
GREGOR WIEDEMANN ◽  
GERHARD HEYER

Social media are an emerging new paradigm in interdisciplinary research in crisis informatics. They bring many opportunities as well as challenges to all fields of application and research involved in the project of using social media content for improved disaster management. Using the 2013 Central European flooding as our case study, we optimize and apply methods from the field of natural language processing and unsupervised machine learning to investigate the thematic and temporal structure of German social media communication. By means of topic model analysis, we will investigate which kind of content was shared on social media during the event. On this basis, we will, furthermore, investigate the development of topics over time and apply temporal clustering techniques to automatically identify different characteristic phases of communication. From the results, we, first, want to reveal properties of social media content and show what potential social media have for improving disaster management in Germany. Second, we will be concerned with the methodological issue of finding and adapting natural language processing methods that are suitable for analysing social media data in order to obtain information relevant for disaster management. With respect to the first, application-oriented focal point, our study reveals high potential of social media content in the factual, organizational and psychological dimension of the disaster and during all stages of the disaster management life cycle. Interestingly, there appear to be systematic differences in thematic profile between the different platforms Facebook and Twitter and between different stages of the event. In the context of our methodological investigation, we claim that if topic model analysis is combined with appropriate optimization techniques, it shows high applicability for thematic and temporal social media analysis in disaster management.


2020 ◽  
Vol 14 (02) ◽  
pp. 273-293
Author(s):  
Yingcheng Sun ◽  
Richard Kolacinski ◽  
Kenneth Loparo

With the explosive growth of online discussions published every day on social media platforms, comprehension and discovery of the most popular topics have become a challenging problem. Conventional topic models have had limited success in online discussions because the corpus is extremely sparse and noisy. To overcome their limitations, we use the discussion thread tree structure and propose a “popularity” metric to quantify the number of replies to a comment to extend the frequency of word occurrences, and the “transitivity” concept to characterize topic dependency among nodes in a nested discussion thread. We build a Conversational Structure Aware Topic Model (CSATM) based on popularity and transitivity to infer topics and their assignments to comments. Experiments on real forum datasets are used to demonstrate improved performance for topic extraction with six different measurements of coherence and impressive accuracy for topic assignments.
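
The idea of letting reply counts extend word frequencies can be illustrated with a minimal sketch. This is not the CSATM implementation: the `Comment` class, the `popularity_weighted_counts` function, and the weighting rule (each comment's words count once plus once per direct reply) are assumptions made purely for illustration, since the abstract does not give the actual formula.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Comment:
    """One node of a discussion thread tree (hypothetical structure)."""
    text: str
    replies: list[Comment] = field(default_factory=list)


def popularity_weighted_counts(root: Comment) -> Counter:
    """Count words over a thread, weighting each comment by 1 + its direct reply count.

    The weighting rule is an assumed stand-in for the paper's "popularity" metric.
    """
    counts: Counter = Counter()
    stack = [root]
    while stack:
        node = stack.pop()
        weight = 1 + len(node.replies)  # assumed weighting, not the paper's formula
        for word in node.text.lower().split():
            counts[word] += weight
        stack.extend(node.replies)  # visit nested replies as well
    return counts


thread = Comment("topic models rock", [Comment("topic drift"), Comment("agreed")])
counts = popularity_weighted_counts(thread)
```

Under this assumed rule, words in heavily replied-to comments contribute more to the corpus statistics, which is one simple way to densify an otherwise sparse word-count matrix before fitting a topic model.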


2020 ◽  
Vol 39 (4) ◽  
pp. 827-846
Author(s):  
Ning Zhong ◽  
David A. Schweidel

We develop a topic model with multiple latent changepoints and demonstrate an approach to detect changes in the topics mentioned in brand-related social media posts.


2020 ◽  
Vol 25 (3) ◽  
pp. 295-313
Author(s):  
Sander van Haperen ◽  
Justus Uitermark ◽  
Alex van der Zeeuw

The Movement for Black Lives has connected millions of people online. How are their outrage and hope mediated through social media? To address this question, this article extends Randall Collins’s Interaction Ritual Theory to social media. Employing semisupervised image recognition methods on a million Instagram posts with the hashtag #blacklivesmatter, we identify four different interaction ritual types, each with distinct geographies. Instagram posts featuring interactions with physical copresence are concentrated in urban areas. We identify two different types of such areas: arenas where contention plays out and milieus where movement identities are affirmed. Instagram posts that do not feature physical copresence are more geographically dispersed. These posts, including memes and selfies, allow people to engage with the movement even when they are not embedded in activist environments. Our analysis helps to understand how different forms of engagement are embedded in particular places and connected through the circulation of social media posts.


2019 ◽  
Vol 119 (1) ◽  
pp. 111-128
Author(s):  
Jianhong Luo ◽  
Xuwei Pan ◽  
Shixiong Wang ◽  
Yujing Huang

Purpose: Delivering messages and information to potentially interested users is one of the distinguishing applications of online enterprise social network (ESN). The purpose of this paper is to provide insights to better understand the repost preferences of users and provide personalized information service in enterprise social media marketing. Design/methodology/approach: It is accomplished by constructing a target audience identification framework. A repost preference latent Dirichlet allocation (RPLDA) topic model is proposed to understand the mass user online repost preferences toward different contents. A topic-oriented preference metric is proposed to measure the preference degree of individual users, and a repost forecasting function is formulated to identify the target audience. Findings: The empirical research shows the following: 20 percent of the repost users in ESN represent the key active users who are particularly interested in the latent topic of messages in ESN, which fits a Pareto distribution; and the target audience identification framework can successfully identify different target key users for messages with different latent topics. Practical implications: The findings should motivate marketing managers to improve the enterprise brand by identifying the key target audience in ESN and marketing in a way that truthfully reflects personalized preferences. Originality/value: This study runs counter to most current business practices, which tend to use simple popularity to seek important users. Adaptively and dynamically identifying the target audience appears to have considerable potential, especially in the rapidly growing area of enterprise social media information service.

