Using Social Media Data to Understand Consumers' Information Needs and Emotions Regarding Cancer: Ontology-Based Data Analysis Study

Jooyun Lee; Hyeoun-Ae Park; Seul Ki Park; Tae-Min Song

doi:10.2196/18767

Journal of Medical Internet Research, 2020, Vol 22 (12), pp. e18767

Keyword(s): Social Media; Health Information; Language Processing; Information Needs; Semantic Analysis; Care Process; Cancer Type; Social Media Data; Related Information; Media Data

Background: Analysis of posts on social media is effective in investigating health information needs for disease management and identifying people’s emotional status related to disease. An ontology is needed for semantic analysis of social media data.

Objective: This study was performed to develop a cancer ontology with terminology containing consumer terms and to analyze social media data to identify health information needs and emotions related to cancer.

Methods: A cancer ontology was developed using social media data, collected with a crawler, from online communities and blogs between January 1, 2014, and June 30, 2017, in South Korea. The relative frequencies of posts containing ontology concepts were counted and compared by cancer type.

Results: The ontology had 9 superclasses, 213 class concepts, and 4061 synonyms. Ontology-driven natural language processing was performed on the text from 754,744 cancer-related posts. Colon, breast, stomach, cervical, lung, liver, pancreatic, and prostate cancer; brain tumors; and leukemia appeared most in these posts. At the superclass level, risk factor was the most frequent, followed by emotions, symptoms, treatments, and dealing with cancer.

Conclusions: Information needs and emotions differed according to cancer type. The observations of this study could be used to provide tailored information to consumers according to cancer type and care process. Attention should be paid to provision of cancer-related information to not only patients but also their families and the general public seeking information on cancer.
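The counting step described in the Methods (relative frequencies of posts containing ontology concepts, compared by cancer type) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny `ONTOLOGY` and `CANCER_TYPES` dictionaries, the function names, and the plain substring matching are all assumptions made for illustration, and a real pipeline would need proper tokenization, especially for Korean text.

```python
from collections import Counter, defaultdict

# Hypothetical mini-ontology: superclass -> class concept -> consumer synonyms.
# The real ontology has 9 superclasses, 213 class concepts, and 4061 synonyms.
ONTOLOGY = {
    "symptoms": {"pain": ["pain", "ache"], "fatigue": ["fatigue", "tired"]},
    "emotions": {"fear": ["afraid", "scared"], "hope": ["hope"]},
}
# Hypothetical cancer-type lexicon: cancer type -> synonyms.
CANCER_TYPES = {
    "breast cancer": ["breast cancer"],
    "lung cancer": ["lung cancer"],
}


def find_labels(text, lexicon):
    """Return the labels whose synonyms occur in the lowercased text.

    Substring matching is a simplification (it would also match "retired"
    for "tired"); it is used here only to keep the sketch short.
    """
    lowered = text.lower()
    return {label for label, synonyms in lexicon.items()
            if any(s in lowered for s in synonyms)}


def relative_frequencies(posts):
    """For each cancer type, the share of its posts mentioning each superclass."""
    totals = Counter()            # posts per cancer type
    hits = defaultdict(Counter)   # cancer type -> superclass -> post count
    for post in posts:
        superclasses = {sc for sc, concepts in ONTOLOGY.items()
                        if find_labels(post, concepts)}
        for cancer in find_labels(post, CANCER_TYPES):
            totals[cancer] += 1
            for sc in superclasses:
                hits[cancer][sc] += 1
    return {cancer: {sc: hits[cancer][sc] / totals[cancer] for sc in ONTOLOGY}
            for cancer in totals}
```

Counting each post at most once per superclass keeps the result a share of posts rather than a share of term occurrences, which matches the "posts containing ontology concepts" wording of the abstract.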

Using Social Media Data to Understand Consumers' Information Needs and Emotions Regarding Cancer: Ontology-Based Data Analysis Study (Preprint)

2020. doi:10.2196/preprints.18767

Author(s): Jooyun Lee; Hyeoun-Ae Park; Seul Ki Park; Tae-Min Song

Keyword(s): Social Media; Health Information; Language Processing; Information Needs; Semantic Analysis; Care Process; Cancer Type; Social Media Data; Related Information; Media Data

Background: Analysis of posts on social media is effective in investigating health information needs for disease management and identifying people’s emotional status related to disease. An ontology is needed for semantic analysis of social media data.

Objective: This study was performed to develop a cancer ontology with terminology containing consumer terms and to analyze social media data to identify health information needs and emotions related to cancer.

Methods: A cancer ontology was developed using social media data, collected with a crawler, from online communities and blogs between January 1, 2014, and June 30, 2017, in South Korea. The relative frequencies of posts containing ontology concepts were counted and compared by cancer type.

Results: The ontology had 9 superclasses, 213 class concepts, and 4061 synonyms. Ontology-driven natural language processing was performed on the text from 754,744 cancer-related posts. Colon, breast, stomach, cervical, lung, liver, pancreatic, and prostate cancer; brain tumors; and leukemia appeared most in these posts. At the superclass level, risk factor was the most frequent, followed by emotions, symptoms, treatments, and dealing with cancer.

Conclusions: Information needs and emotions differed according to cancer type. The observations of this study could be used to provide tailored information to consumers according to cancer type and care process. Attention should be paid to provision of cancer-related information to not only patients but also their families and the general public seeking information on cancer.

Ontology-Based Natural Language Processing of Social Media Data in the Assessment of Health Information Sought During Pregnancy

2021. doi:10.3233/shti210668

Author(s): Joo Yun Lee

Keyword(s): Social Media; Natural Language Processing; South Korea; Natural Language; Family Support; Health Information; Language Processing; Social Media Data; Media Data

This study used ontology-based natural language processing to analyze social media data collected in South Korea that contained keywords related to “pregnancy”. Of the 504,725 documents, those containing concepts related to “maternal emotion” were the most frequent, followed by those related to “family support”. Social media were used as a means of exchanging information and expressing emotions.

Design and implementation of natural language processing with syntax and semantic analysis for extract traffic conditions from social media data

2015 5th IEEE International Conference on System Engineering and Technology (ICSET), 2015. doi:10.1109/icsengt.2015.7412443

Cited By ~ 2

Author(s): Mochamad Vicky Ghani Aziz; Ary Setijadi Prihatmanto; Diotra Henriyan; Rifki Wijaya

Keyword(s): Social Media; Natural Language Processing; Natural Language; Language Processing; Semantic Analysis; Social Media Data; Traffic Conditions; Design And Implementation; Media Data

Exploring the Spatiotemporal Patterns of Residents’ Daily Activities Using Text-Based Social Media Data: A Case Study of Beijing, China

ISPRS International Journal of Geo-Information, 2021, Vol 10 (6), pp. 389. doi:10.3390/ijgi10060389

Author(s): Jian Liu; Bin Meng; Juan Wang; Siyu Chen; Bin Tian; ...

Keyword(s): Social Media; Language Processing; Semantic Analysis; Activity Rhythm; New Paradigm; Social Media Data; Spatiotemporal Information; Spatiotemporal Behavior; Semantic Resources; Media Data

Social media data provide powerful support for revealing the spatiotemporal characteristics and mechanisms of human activity because they integrate rich spatiotemporal and textual semantic information. However, previous research has not fully utilized this semantic and spatiotemporal information because of technical and algorithmic limitations, and the deep mining of textual semantic resources has been inefficient. In this research, a multiclass text classification model based on natural language processing technology and the Bidirectional Encoder Representations from Transformers (BERT) framework was constructed. Residents’ activities in Beijing were then classified using Sina Weibo data from 2019. The results showed that the classification accuracy was more than 90%. The types and distribution of residents’ activities were closely related to the characteristics of the activities and to holiday arrangements. On a short timescale, the activity rhythm on weekends was delayed by one hour compared with that on weekdays. There was significant agglomeration of residents’ activities, which presented a spatial co-location cluster pattern, but the proportion of balanced co-location cluster areas was small. The research demonstrated that location conditions, especially the microlocation condition (the distance to the nearest subway station), were the driving factors that affected the resident activity cluster patterns. The proposed framework integrates textual semantic analysis, statistical methods, and spatial techniques; broadens the application areas of social media data, especially text data; and provides a new paradigm for research on residents’ activities and spatiotemporal behavior.
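The weekday versus weekend rhythm comparison can be illustrated with a small Python sketch. The abstract does not say how the one-hour delay was measured, so the sketch assumes a simple proxy, the hour of day with the most posts; the function names `peak_hour` and `weekend_delay` are hypothetical, and the BERT classification step is not reproduced here.

```python
from collections import Counter
from datetime import datetime


def peak_hour(timestamps):
    """Return the hour of day (0-23) with the most posts.

    Ties are broken in favor of the earliest hour. Raises ValueError
    if the list is empty.
    """
    counts = Counter(ts.hour for ts in timestamps)
    return max(counts, key=lambda h: (counts[h], -h))


def weekend_delay(timestamps):
    """Peak-hour shift in hours (weekend peak minus weekday peak).

    Assumption: Saturday and Sunday count as the weekend, and peaks
    that wrap around midnight are not handled.
    """
    weekday = [ts for ts in timestamps if ts.weekday() < 5]
    weekend = [ts for ts in timestamps if ts.weekday() >= 5]
    return peak_hour(weekend) - peak_hour(weekday)
```

A positive return value means the weekend peak falls later in the day than the weekday peak. A fuller analysis would compare the whole hourly distribution per activity type rather than a single peak.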

Perceiving Residents’ Festival Activities Based on Social Media Data: A Case Study in Beijing, China

ISPRS International Journal of Geo-Information, 2021, Vol 10 (7), pp. 474. doi:10.3390/ijgi10070474

Author(s): Bingqing Wang; Bin Meng; Juan Wang; Siyu Chen; Jian Liu

Keyword(s): Social Media; Language Processing; Topic Model; Central Area; Classification Model; Social Media Data; Ring Road; Different Types; Spatial Differences; Media Data

Social media data contain information expressed in real time, including text and geographical location. As a new data source for crowd behavior research in the era of big data, they can reflect some aspects of residents’ behavior. In this study, a text classification model based on the BERT and Transformers framework was constructed and used to classify and extract more than 210,000 residents’ festival activities from 1.13 million Sina Weibo (Chinese “Twitter”) posts collected in Beijing in 2019. On this basis, word frequency statistics, part-of-speech analysis, topic modeling, sentiment analysis, and other methods were used to perceive different types of festival activities and to quantitatively analyze the spatial differences among different types of festivals. The results show that traditional culture significantly influences residents’ festivals, shaping residents’ motivation to participate in festivals, how they participate, and how they express their emotions. There are apparent spatial differences in residents’ participation in festival activities. The main festival activities are distributed in the central area within the Fifth Ring Road in Beijing, whereas posts expressing feelings during the festival are mainly distributed outside the Fifth Ring Road. The research integrates natural language processing technology, topic model analysis, spatial statistical analysis, and other technologies. It broadens the application field of social media data, especially text data, provides a new research paradigm for studying residents’ festival activities, and adds residents’ perception of the festival. The research results provide a basis for the design and management of the Chinese festival system.

A Multi-platform Approach to Monitoring Negative Dominance for COVID-19 Vaccine-Related Information Online

Disaster Medicine and Public Health Preparedness, 2021, pp. 1-24. doi:10.1017/dmp.2021.136

Author(s): Paola Pascual-Ferrá; Neil Alperstein; Daniel J. Barnett

Keyword(s): Social Media; Adverse Events; News Media; Search Behavior; Social Media Data; Future Studies; Related Information; Youtube Videos; Negative Dominance; Media Data

Objective: The aim of this study was to test the appearance of negative dominance in COVID-19 vaccine-related information and activity online. We hypothesized that if negative dominance appeared, it would be a reflection of peaks in adverse events related to the vaccine, that negative content would attract more engagement on social media than other vaccine-related posts, and that posts referencing adverse events related to COVID-19 vaccination would have a higher average toxicity score.

Methods: We collected data using Google Trends for search behavior, CrowdTangle for social media data, and Media Cloud for media stories, and compared them against the dates of key adverse events related to COVID-19. We used Communalytic to analyze the toxicity of social media posts by platform and topic.

Results: While our first hypothesis was partially supported, with peaks in search behavior for image and YouTube videos driven by adverse events, we did not find negative dominance in other types of searches or patterns of attention by news media or on social media.

Conclusion: We did not find evidence in our data to prove the negative dominance of adverse events related to COVID-19 vaccination on social media. Future studies should corroborate these findings and, if consistent, focus on explaining why this may be the case.

A Pipeline to Understand Emerging Illness Via Social Media Data Analysis: Case Study on Breast Implant Illness (Preprint)

2021. doi:10.2196/preprints.29768

Author(s): Vishal Dey; Peter Krasniak; Minh Nguyen; Clara Lee; Xia Ning

Keyword(s): Mental Health; Social Media; Natural Language Processing; Data Analysis; Natural Language; Language Processing; Breast Implant; Public Attention; Social Media Data; Media Data

Background: A new illness can come to public attention through social media before it is medically defined, formally documented, or systematically studied. One example is a condition known as breast implant illness (BII), which has been extensively discussed on social media, although it is vaguely defined in the medical literature.

Objective: The objective of this study is to construct a data analysis pipeline to understand emerging illnesses using social media data and to apply the pipeline to understand the key attributes of BII.

Methods: We constructed a pipeline of social media data analysis using natural language processing and topic modeling. Mentions related to signs, symptoms, diseases, disorders, and medical procedures were extracted from social media data using the clinical Text Analysis and Knowledge Extraction System. We mapped the mentions to standard medical concepts and then summarized these mapped concepts as topics using latent Dirichlet allocation. Finally, we applied this pipeline to understand BII from several BII-dedicated social media sites.

Results: Our pipeline identified topics related to toxicity, cancer, and mental health issues that were highly associated with BII. Our pipeline also showed that cancers, autoimmune disorders, and mental health problems were emerging concerns associated with breast implants, based on social media discussions. Furthermore, the pipeline identified mentions such as rupture, infection, pain, and fatigue as common self-reported issues among the public, as well as concerns about toxicity from silicone implants.

Conclusions: Our study could inspire future studies on the suggested symptoms and factors of BII. Our study provides the first analysis and derived knowledge of BII from social media using natural language processing techniques and demonstrates the potential of using social media information to better understand similar emerging illnesses.

Using social media data to map the areas most affected by ISIS in Syria

Proceedings of the International conference “InterCarto/InterGIS”, 2020, Vol 26 (1), pp. 464-470. doi:10.35595/2414-9179-2020-1-26-464-470

Author(s): Mohamad Hasan

Keyword(s): Social Media; Language Processing; Geographical Information; Data Mapping; Islamic State; Social Media Data; The Social; Mapping Process; Data Source; Media Data

This paper presents a model to collect, save, geocode, and analyze social media data. The model is used to collect and process social media data concerning the ISIS terrorist group (the Islamic State in Iraq and Syria) and to map the areas in Syria most affected by ISIS according to those data. The mapping process is the automated compilation of a density map of the geocoded tweets. Data mined from social media (e.g., Twitter and Facebook) are recognized as a dynamic and easily accessible resource that can be used as a data source in spatial analysis and geographical information systems. Social media data can be represented as topic data and geocoding data based on the text mined from social media and processed using natural language processing (NLP) methods. NLP is a subdomain of artificial intelligence concerned with programming computers to analyze natural human language and texts. NLP makes it possible to identify the words that the developed geocoding algorithm uses as its input data. In this study, the needed words were identified using two corpora. The first corpus contained the names of populated places in Syria. The second corpus was compiled from a statistical analysis of the tweets by picking the words that carry a location meaning (e.g., schools, temples). After identifying the words, the algorithm used the Google Maps geocoding API to obtain the coordinates for the posts.
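The collect, geocode, and map steps can be sketched in Python under some loud assumptions. The paper obtains coordinates from the Google Maps geocoding API; the sketch below swaps in a small in-memory gazetteer so that it runs offline. The `GAZETTEER` entries (with approximate coordinates), the function names, and the half-degree grid are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical gazetteer standing in for the first corpus (populated places)
# and for the geocoding API: place name -> approximate (lat, lon).
GAZETTEER = {
    "aleppo": (36.20, 37.16),
    "raqqa": (35.95, 39.01),
}


def geocode(text):
    """Return coordinates for every gazetteer place named in the text.

    Words are split on whitespace and stripped of basic punctuation,
    so multi-word place names are not handled in this sketch.
    """
    words = {w.strip(".,!?#").lower() for w in text.split()}
    return [GAZETTEER[name] for name in sorted(words & GAZETTEER.keys())]


def density_grid(texts, cell_deg=0.5):
    """Count geocoded mentions per grid cell of cell_deg degrees.

    Keys are (row, column) cell indices obtained by floor division of
    latitude and longitude by the cell size.
    """
    grid = Counter()
    for text in texts:
        for lat, lon in geocode(text):
            grid[(int(lat // cell_deg), int(lon // cell_deg))] += 1
    return grid
```

The resulting counter is the raw input for a density map: cells with higher counts correspond to places mentioned more often in the collected posts.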

A Sentiment Analysis and Role of Twitter for Health Communications

2022, pp. 188-205. doi:10.4018/978-1-7998-8421-7.ch011

Author(s): Erkan Çiçek; Uğur Gündüz

Keyword(s): Social Media; Crisis Management; Semantic Analysis; Social Media Data; Global Pandemic; The Difference; Textual Content; The Impact; Media Data

Social media has become so much a part of our lives that global pandemics, which themselves make up an important part of our lives, are undeniably affected by these networks, exist within them, and are shared by their users. The purpose of this hashtag analysis is to reveal differences in discourse and language in Twitter data and to evaluate the effects of a global pandemic crisis on language, message, and crisis management using social media data. This form of analysis is typically carried out by collecting textual content data and then investigating the “sentiment” conveyed. Within the scope of the study, 11,300 Twitter messages posted with the #stayhome hashtag between 30 May 2020 and 6 June 2020 were examined. The impact and reliability of social media in disaster management can be examined through a content analysis based entirely on the semantic analysis of the Twitter messages, using the phrases they contain and their frequencies.

Anatomy of a Protest: Spatial Information, Social Media, and Urban Space

Social Media + Society, 2020, Vol 6 (1), pp. 205630511989732. doi:10.1177/2056305119897320

Author(s): Alireza Karduni; Eric Sauda

Keyword(s): Social Media; Public Space; Urban Space; Language Processing; Local Community; Spatial Information; Social Media Data; Public Events; Use Of Social Media; Media Data

Black Lives Matter, like many modern movements in the age of information, makes significant use of social media as well as public space to demand justice. In this article, we study the protests in response to the shooting of Keith Lamont Scott by police in Charlotte, North Carolina, in September 2016. Our goal is to measure the significance of urban space within the virtual and physical network of protesters. Using a mixed-methods approach, we identify and study urban space and social media generated by these protests. We conducted interviews with protesters who were among the first to join the Keith Lamont Scott shooting demonstrations. From the interviews, we identify places that were significant in our interviewees’ narratives. Using a combination of natural language processing and social network analysis, we analyze social media data related to the Charlotte protests retrieved from Twitter. We found that social media, local community, and public space work together to organize and motivate protests and that public events such as protests cause a discernible increase in social media activity. Finally, we find that there are two distinct communities that engage with social media in different ways: one group involved with social media, local community, and urban space, and a second group connected almost exclusively through social media.
