What Counts? Reflections on the Multivalence of Social Media Data

Carolin Gerlitz

doi:10.14361/dcs-2016-0203

Digital Culture & Society ◽

10.14361/dcs-2016-0203 ◽

2016 ◽

Vol 2 (2) ◽

pp. 19-38 ◽

Cited By ~ 14

Author(s):

Carolin Gerlitz

Keyword(s):

Social Media ◽

Structured Data ◽

Social Media Data ◽

Orders Of Worth ◽

Social Media Platforms ◽

Set Up ◽

Empirical Experiment ◽

Media Data

Abstract Social media platforms have been characterised by their programmability, affordances, constraints and stakeholders - the question of value and valuation of platforms, their data and features has, however, received less attention in platform studies. This paper explores the specific socio-technical conditions for valuating platform data and suggests that platforms set up their data to become multivalent, that is, to be valuable alongside multiple, possibly conflicting value regimes. Drawing on both platform and valuation studies, it asks how the production, storing and circulation of data, its connection to user action and the various stakeholders of platforms contribute to its valuation. Platform data, the paper suggests, is the outcome of capture systems which allow action and its capture to be collapsed into pre-structured data forms which remain open to divergent interpretations. Platforms offer such grammars of action both to users and other stakeholders in front- and back-ends, inviting them to produce and engage with their data following heterogeneous orders of worth. Platform data can participate in different valuation regimes at the same time - however, the paper concludes, not all actors can participate in all modes of valuation, as in the end, it is the platform that sets the conditions for participation. The paper offers a conceptual perspective to interrogate what data counts by attending to questions of quantification, its entanglement with valuation and the various technologies and stakeholders involved. It finishes with an empirical experiment to map the various ways in which Instagram data is made to count.

Information extraction from digital social trace data with applications to social media and scholarly communication data

ACM SIGIR Forum ◽

10.1145/3451964.3451981 ◽

2020 ◽

Vol 54 (1) ◽

pp. 1-2

Author(s):

Shubhanshu Mishra

Keyword(s):

Social Media ◽

Information Extraction ◽

Scholarly Communication ◽

Structured Data ◽

Graph Structure ◽

Learning Models ◽

Social Media Data ◽

Scholarly Data ◽

Media Data ◽

Machine Learning Models

Information extraction (IE) aims at extracting structured data from unstructured or semi-structured data. The thesis starts by identifying social media data and scholarly communication data as a special case of digital social trace data (DSTD). This identification allows us to utilize the graph structure of the data (e.g., user connected to a tweet, author connected to a paper, author connected to authors, etc.) for developing new information extraction tasks. The thesis focuses on information extraction from DSTD, first, using only the text data from tweets and scholarly paper abstracts, and then using the full graph structure of Twitter and scholarly communications datasets. This thesis makes three major contributions. First, new IE tasks based on DSTD representation of the data are introduced. For scholarly communication data, methods are developed to identify article and author level novelty [Mishra and Torvik, 2016] and expertise. Furthermore, interfaces for examining the extracted information are introduced. A social communication temporal graph (SCTG) is introduced for comparing different communication data like tweets tagged with sentiment, tweets about a search query, and Facebook group posts. For social media, new text classification categories are introduced, with the aim of identifying enthusiastic and supportive users, via their tweets. Additionally, the correlation between sentiment classes and Twitter meta-data in public corpora is analyzed, leading to the development of a better model for sentiment classification [Mishra and Diesner, 2018]. Second, methods are introduced for extracting information from social media and scholarly data. For scholarly data, a semi-automatic method is introduced for the construction of a large-scale taxonomy of computer science concepts. The method relies on the Wikipedia category tree. The constructed taxonomy is used for identifying key computer science phrases in scholarly papers, and tracking their evolution over time. Similarly, for social media data, machine learning models based on human-in-the-loop learning [Mishra et al., 2015], semi-supervised learning [Mishra and Diesner, 2016], and multi-task learning [Mishra, 2019] are introduced for identifying sentiment, named entities, part of speech tags, phrase chunks, and super-sense tags. The machine learning models are developed with a focus on leveraging all available data. The multi-task models presented here result in competitive performance against other methods, for most of the tasks, while reducing inference time computational costs. Finally, this thesis has resulted in the creation of multiple open source tools and public data sets (see URL below), which can be utilized by the research community. The thesis aims to act as a bridge between research questions and techniques used in DSTD from different domains. The methods and tools presented here can help advance work in the areas of social media and scholarly data analysis.

Embed2Detect: temporally clustered embedded words for event detection in social media

Machine Learning ◽

10.1007/s10994-021-05988-7 ◽

2021 ◽

Author(s):

Hansi Hettiarachchi ◽

Mariam Adedoyin-Olowe ◽

Jagdev Bhogal ◽

Mohamed Medhat Gaber

Keyword(s):

Social Media ◽

Event Detection ◽

High Volume ◽

Detection Methods ◽

Word Embeddings ◽

Agglomerative Clustering ◽

Data Set ◽

Social Media Data ◽

Social Media Platforms ◽

Media Data

Abstract Social media is becoming a primary medium to discuss what is happening around the world. Therefore, the data generated by social media platforms contain rich information which describes the ongoing events. Further, the timeliness associated with these data is capable of facilitating immediate insights. However, considering the dynamic nature and high volume of data production in social media data streams, it is impractical to filter the events manually and therefore, automated event detection mechanisms are invaluable to the community. Apart from a few notable exceptions, most previous research on automated event detection has focused only on statistical and syntactical features in data and lacked the involvement of underlying semantics, which are important for effective information retrieval from text since they represent the connections between words and their meanings. In this paper, we propose a novel method termed Embed2Detect for event detection in social media by combining the characteristics in word embeddings and hierarchical agglomerative clustering. The adoption of word embeddings gives Embed2Detect the capability to incorporate powerful semantical features into event detection and overcome a major limitation inherent in previous approaches. We evaluated our method on two recent real social media data sets which represent the sports and political domains and also compared the results to several state-of-the-art methods. The obtained results show that Embed2Detect is capable of effective and efficient event detection and that it outperforms the recent event detection methods. For the sports data set, Embed2Detect achieved a 27% higher F-measure than the best-performing baseline, and for the political data set, the increase was 29%.
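The abstract above pairs word embeddings with hierarchical agglomerative clustering. As a rough illustration of that pairing only, the following sketch groups a handful of word vectors by average-linkage clustering over cosine distance. It is not the authors' Embed2Detect implementation and omits the temporal comparison between time windows that event detection would need; the function names, the three-dimensional toy vectors in `TOY_VECTORS`, and the 0.2 distance threshold are all invented for this example.

```python
import math
from itertools import combinations

# Hypothetical toy "embeddings"; real word vectors would be learned from text.
TOY_VECTORS = {
    "goal": [0.9, 0.1, 0.0],
    "penalty": [0.8, 0.2, 0.1],
    "vote": [0.1, 0.9, 0.1],
    "ballot": [0.0, 0.8, 0.2],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative_clusters(vectors, threshold):
    """Average-linkage agglomerative clustering of words.

    vectors: dict mapping word -> embedding (list of floats).
    threshold: merging stops once the closest pair of clusters is
               farther apart than this average cosine distance.
    Returns a sorted list of sorted word lists.
    """
    # Start with every word in its own cluster.
    clusters = [[word] for word in sorted(vectors)]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(
            cosine_distance(vectors[a], vectors[b]) for a in c1 for b in c2
        )
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (i, j), best = min(
            (
                ((a, b), linkage(clusters[a], clusters[b]))
                for a, b in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        merged = sorted(clusters[i] + clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(clusters)
```

Calling `agglomerative_clusters(TOY_VECTORS, 0.2)` on these toy vectors separates the two sports-like words from the two election-like words. With real embeddings, a library implementation of hierarchical clustering would normally replace this quadratic loop.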

Google Plus as a Contentious Field of Revolutionary Identity

Comparative Sociology ◽

10.1163/15691330-bja10036 ◽

2021 ◽

Vol 20 (3) ◽

pp. 402-416

Author(s):

Amirhossein Teimouri

Keyword(s):

Social Media ◽

Iranian Revolution ◽

Social Media Data ◽

Social Media Platforms ◽

New Generation ◽

Media Data

Abstract Social media platforms have been increasingly reinvigorating extreme movements, especially rightist movements. Utilizing unique Google Plus data, the author shows the rise and fall of the 2015 rightist anti-Nuclear Deal movement in Iran. He argues that the Google Plus platform in 2015 provided the new generation of revolutionary Islamist rightist activists with a contentious space of mobilization, enabling them to develop a new revolutionary rightist identity. This revolutionary identity and its corresponding language and discourse did not fully unfold in Iranian mainstream rightist media, even though rightist groups, compared to liberal groups, are not censored and repressed. The new generation of rightist activists perceived the Nuclear Deal as an existential threat to revolutionary principles of the country, and thus played out their outrage and identity anxieties on Google Plus. The author contends that this online outrage, due to the activists’ identity bond with the regime and the 1979 Iranian Revolution, however, did not translate into any massive offline mobilization against the Nuclear Deal. He also discusses the methodological implications of using social media data, especially the discontinuation of Google Plus.

Review of Data Visualization for Social Media Postings

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.38.27613 ◽

2018 ◽

Vol 7 (4.38) ◽

pp. 939

Author(s):

Nur Atiqah Sia Abdullah ◽

Hamizah Binti Anuar

Keyword(s):

Social Media ◽

Data Visualization ◽

Line Graph ◽

Data Types ◽

Social Media Data ◽

Data Analyst ◽

The Social ◽

Social Media Platforms ◽

Types Of Information ◽

Media Data

Facebook and Twitter are the most popular social media platforms among netizens. People now express their opinions, perceptions, and emotions more aggressively through social media platforms. These massive data provide great value for the data analyst seeking to understand patterns and emotions related to a certain issue. Mining the data takes technique and time, so data visualization has become a popular way of representing these types of information. This paper aims to review data visualization studies that involved data from social media postings. Past literature used node-link diagrams, node-link trees, directed graphs, line graphs, heatmaps, and stream graphs to represent the data collected from social media platforms. An analysis comparing the social media data types, representations, and data visualization techniques is carried out based on the previous studies. This paper critically discusses the comparison and suggests suitable data visualizations for the type of social media data in hand.

Social Media to Social Media Analytics

International Journal of Technoethics ◽

10.4018/ijt.2019070104 ◽

2019 ◽

Vol 10 (2) ◽

pp. 57-70 ◽

Cited By ~ 4

Author(s):

Vikas Kumar ◽

Pooja Nanda

Keyword(s):

Social Media ◽

Social Media Analytics ◽

Social Media Data ◽

Ethical Concerns ◽

Ethical Implications ◽

The Social ◽

Social Media Platforms ◽

The Individual ◽

The Impact ◽

Media Data

With the amplification of social media platforms, the importance of social media analytics has exponentially increased for many brands and organizations across the world. Tracking and analyzing social media data has contributed to the success of such organizations; however, the data is being poorly harnessed. Therefore, the ethical implications of social media analytics need to be identified and explored for both the organizations and targeted users of social media data. The present work is an exploratory study to identify the various techno-ethical concerns of social media engagement, as well as social media analytics. The impact of these concerns on individuals, organizations, and society as a whole is discussed. Ethical engagement for the most common social media platforms has been outlined with a number of specific examples to understand the prominent techno-ethical concerns. Both the individual and organizational perspectives have been taken into account to identify the implications of social media analytics.

“It’s Not Like It’s Life or Death or Whatever”: Young People’s Understandings of Social Media Data

Social Media + Society ◽

10.1177/2056305118787808 ◽

2018 ◽

Vol 4 (3) ◽

pp. 205630511878780 ◽

Cited By ~ 7

Author(s):

Luci Pangrazio ◽

Neil Selwyn

Keyword(s):

Social Media ◽

Lived Experiences ◽

Personal Data ◽

Third Parties ◽

Third Party ◽

Cultural Issues ◽

Social Media Data ◽

Critical Data ◽

Social Media Platforms ◽

Media Data

Young people’s engagements with social media now generate large quantities of personal data, with “big social data” becoming an increasingly important “currency” in the digital economy. While using social media platforms is ostensibly “free,” users nevertheless “pay” for these services through their personal data—enabling advertisers, content developers, and other third parties to profile, predict, and position individuals. Such developments have prompted calls for social media users to adopt more informed and critical stances toward how and why their data are being used—that is, to build “critical data literacies.” This article reports on research that explores young social media users’ understandings of their personal data and its attendant issues. Drawing on research with groups of young people (aged 13–17 years), the article investigates the consequences of making third party (re)uses of personal data openly available for social media users to interpret and make critical sense of. The findings provide valuable insights into young people’s understandings of the technical, social, and cultural issues that underpin their ability to engage with, and make sense of, social media data. The article concludes by considering how research into critical data literacies might connect in more meaningful and effective ways with everyday lived experiences of social media use.

Analyzing Social Media Research: A Data Quality and Research Reproducibility Perspective

IIM Kozhikode Society & Management Review ◽

10.1177/22779752211011810 ◽

2021 ◽

pp. 227797522110118

Author(s):

Amit K. Srivastava ◽

Rajhans Mishra

Keyword(s):

Social Media ◽

Data Quality ◽

Quality Of Data ◽

Social Media Data ◽

National Crisis ◽

Social Media Platforms ◽

Quality Issues ◽

The One ◽

Media Data

Social media platforms have become very popular among individuals and organizations. On the one hand, organizations use social media as a tool to create awareness of their products among consumers; on the other hand, social media data is used to predict national crises, election polls, stock movements, etc. However, there is an ongoing debate about whether the data generated on social media platforms is of sufficient quality and relevance for prediction and generalization. The article discusses the relevance and quality of data obtained from social media in the context of research and development. Social media data quality issues may impact the generalizability and reproducibility of the results of a study. The paper explores possible reasons for quality issues in the data generated on social media platforms, along with suggested measures to minimize them using the proposed social media data quality framework.

Journalists’ Use of Social Media to Infer Public Opinion: The citizens’ perspective

10.32920/14637978.v1 ◽

2021 ◽

Author(s):

Elizabeth Dubois ◽

Anatoliy Gruzd ◽

Jenna Jacobson

Keyword(s):

Social Media ◽

Public Opinion ◽

Online Survey ◽

Social Media Analytics ◽

Social Media Data ◽

Social Media Platforms ◽

Use Of Social Media ◽

Traditional Approaches ◽

Journalistic Practice ◽

Media Data

Journalists increasingly use social media data to infer and report public opinion by quoting social media posts, identifying trending topics, and reporting general sentiment. In contrast to traditional approaches of inferring public opinion, citizens are often unaware of how their publicly available social media data is being used and how public opinion is constructed using social media analytics. In this exploratory study based on a census-weighted online survey of Canadian adults (N=1,500), we examine citizens’ perceptions of journalistic use of social media data. We demonstrate that: (1) people find it more appropriate for journalists to use aggregate social media data rather than personally identifiable data; (2) people who use more social media are more likely to positively perceive journalistic use of social media data to infer public opinion; and (3) the frequency of political posting is positively related to acceptance of this emerging journalistic practice, which suggests some citizens want to be heard publicly on social media while others do not. We provide recommendations for journalists on the ethical use of social media data and social media platforms on opt-in functionality.

A Novel Machine Learning Framework for Comparison of Viral COVID-19–Related Sina Weibo and Twitter Posts: Workflow Development and Content Analysis

Journal of Medical Internet Research ◽

10.2196/24889 ◽

2021 ◽

Vol 23 (1) ◽

pp. e24889

Author(s):

Shi Chen ◽

Lina Zhou ◽

Yunya Song ◽

Qian Xu ◽

Ping Wang ◽

...

Keyword(s):

Machine Learning ◽

Social Media ◽

Public Discourse ◽

Analytical Framework ◽

Health Issues ◽

Social Media Data ◽

Sina Weibo ◽

Social Media Platforms ◽

Media Data ◽

Content Feature

Background: Social media plays a critical role in health communications, especially during global health emergencies such as the current COVID-19 pandemic. However, there is a lack of a universal analytical framework to extract, quantify, and compare content features in public discourse of emerging health issues on different social media platforms across a broad sociocultural spectrum. Objective: We aimed to develop a novel and universal content feature extraction and analytical framework and contrast how content features differ with sociocultural background in discussions of the emerging COVID-19 global health crisis on major social media platforms. Methods: We sampled the 1000 most shared viral Twitter and Sina Weibo posts regarding COVID-19, developed a comprehensive coding scheme to identify 77 potential features across six major categories (e.g., clinical and epidemiological, countermeasures, politics and policy, responses), quantified feature values (0 or 1, indicating whether or not the content feature is mentioned in the post) in each viral post across social media platforms, and performed subsequent comparative analyses. Machine learning dimension reduction and clustering analysis were then applied to harness the power of social media data and provide more unbiased characterization of web-based health communications. Results: There were substantially different distributions, prevalence, and associations of content features in public discourse about the COVID-19 pandemic on the two social media platforms. Weibo users were more likely to focus on the disease itself and health aspects, while Twitter users engaged more about policy, politics, and other societal issues. Conclusions: We extracted a rich set of content features from social media data to accurately characterize public discourse related to COVID-19 in different sociocultural backgrounds. In addition, this universal framework can be adopted to analyze social media discussions of other emerging health issues beyond the COVID-19 pandemic.
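The coding step described in the methods, where each post receives a 0 or 1 per content feature and prevalence is then compared across platforms, might be sketched as follows. This is an illustrative assumption about how such a comparison could be computed, not the study's code. The function names, the two feature labels, and every value in `PLATFORM_A_TOY` and `PLATFORM_B_TOY` are invented placeholders rather than data or results from the paper.

```python
# Hypothetical feature labels, loosely named after two of the coding categories.
FEATURES = ["clinical", "politics"]

# Invented toy codings: one dict per post, 1 = feature mentioned, 0 = not.
PLATFORM_A_TOY = [
    {"clinical": 1, "politics": 0},
    {"clinical": 1, "politics": 0},
    {"clinical": 0, "politics": 1},
    {"clinical": 1, "politics": 0},
]
PLATFORM_B_TOY = [
    {"clinical": 0, "politics": 1},
    {"clinical": 1, "politics": 1},
    {"clinical": 0, "politics": 1},
    {"clinical": 0, "politics": 0},
]


def feature_prevalence(posts, features):
    """Share of posts in which each binary feature equals 1."""
    if not posts:
        raise ValueError("posts must not be empty")
    return {
        name: sum(post.get(name, 0) for post in posts) / len(posts)
        for name in features
    }


def prevalence_gap(posts_a, posts_b, features):
    """Per-feature prevalence difference (platform A minus platform B)."""
    prev_a = feature_prevalence(posts_a, features)
    prev_b = feature_prevalence(posts_b, features)
    return {name: prev_a[name] - prev_b[name] for name in features}
```

A positive value from `prevalence_gap(PLATFORM_A_TOY, PLATFORM_B_TOY, FEATURES)` means the feature appears in a larger share of platform A's posts. The dimension reduction and clustering mentioned in the abstract would operate on the same binary post-by-feature matrix.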

Success in an Online Giving Day: The Role of Social Media in Fundraising

Nonprofit and Voluntary Sector Quarterly ◽

10.1177/0899764019868849 ◽

2019 ◽

Vol 49 (1) ◽

pp. 74-92 ◽

Cited By ~ 6

Author(s):

Abhishek Bhati ◽

Diarmuid McDonnell

Keyword(s):

Social Media ◽

Organizational Factors ◽

Network Size ◽

Audience Engagement ◽

Social Media Data ◽

Size Number ◽

Social Media Platforms ◽

The Relationship ◽

Media Data

Social media platforms offer nonprofits considerable potential for crafting, supporting, and executing successful fundraising campaigns. How impactful are attempts by these organizations to utilize social media to support fundraising activities associated with online Giving Days? We address this question by testing a number of hypotheses of the effectiveness of using Facebook for fundraising purposes by all 704 nonprofits participating in Omaha Gives 2015. Using linked administrative and social media data, we find that fundraising success—as measured by the number of donors and value of donations—is positively associated with a nonprofit’s Facebook network size (number of likes), activity (number of posts), and audience engagement (number of shares), as well as net effects of organizational factors including budget size, age, and program service area. These results provide important new empirical insights into the relationship between social media utilization and fundraising success of nonprofits.
