Identifying health-related discussions of cannabis use on Twitter: a proof-of-concept study (Preprint)

2021 ◽  
Author(s):  
Jon-Patrick Allem ◽  
Anuja Majmundar ◽  
Allison Dormanesh ◽  
Scott Donaldson

BACKGROUND The cannabis product and regulatory landscape is changing in the United States. Against the backdrop of these changes, there have been increasing reports on health-related motives for cannabis use and of adverse events from its use. The use of social media data in monitoring cannabis-related health conversations may be useful to state and federal-level regulatory agencies as they grapple with identifying cannabis safety signals in a comprehensive and scalable fashion.
OBJECTIVE This study attempted to determine the extent to which a medical dictionary, the Unified Medical Language System (UMLS) Consumer Health Vocabulary (CHV), could identify cannabis-related motivations of use and health consequences of its use as discussed on Twitter in 2020.
METHODS Twitter posts containing cannabis-related terms were obtained from January 1 to August 31, 2020. Each post from the sample (n = 353,353) was classified into at least one of 17 a priori categories of common health-related topics, using a rule-based classifier with each category defined by the terms in the medical dictionary. A subsample of posts (n = 1094) was then manually annotated to help validate the rule-based classifier and determine whether each post pertained to health-related motivations for cannabis use, perceived adverse health effects from its use, or neither.
RESULTS The validation process suggested that the medical dictionary could identify health-related conversations in 31.2% of posts. Specifically, 20.4% of posts were accurately identified as relating to a health-related motivation for cannabis use, while 10.8% of posts were accurately identified as relating to a health-related consequence from cannabis use. Potential health-related conversations around cannabis use ranged from issues with the respiratory system and stress to the immune system and gastrointestinal problems, among other health topics.
CONCLUSIONS The mining of social media data may prove helpful in improving surveillance of cannabis products and their adverse health effects. However, future research needs to develop and validate a dictionary and codebook that captures cannabis use-specific health conversations on Twitter.
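The rule-based classification and manual validation described in METHODS can be illustrated with a minimal sketch. It is not the authors' implementation: the category names and terms in `CATEGORY_TERMS` are hypothetical stand-ins for the CHV-derived dictionary, and `classify` and `label_shares` are names made up for this example. Unlike the study's setup, a post that matches no term simply gets an empty list here.

```python
import re
from collections import Counter

# Hypothetical categories and terms for illustration only; the study defined
# 17 categories from UMLS Consumer Health Vocabulary terms, not these.
CATEGORY_TERMS = {
    "respiratory": ["cough", "asthma", "lungs"],
    "stress": ["stress", "anxiety"],
    "gastrointestinal": ["nausea", "vomiting"],
}


def compile_patterns(category_terms):
    """Build one case-insensitive, whole-word regex per category."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for category, terms in category_terms.items()
    }


PATTERNS = compile_patterns(CATEGORY_TERMS)


def classify(post):
    """Return every category whose dictionary terms appear in the post."""
    return sorted(
        category for category, pattern in PATTERNS.items() if pattern.search(post)
    )


def label_shares(labels):
    """Share of manually annotated posts per label (assumes a non-empty list)."""
    counts = Counter(labels)
    return {label: count / len(labels) for label, count in counts.items()}
```

For example, `classify("Weed helps my anxiety but makes me cough")` matches both the "stress" and "respiratory" categories, since a post can fall into more than one. Applying `label_shares` to annotation labels such as "motivation", "consequence", and "neither" gives the kind of per-label percentages reported in RESULTS, where 20.4% and 10.8% sum to the 31.2% overall figure.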

2020 ◽  
Vol 3 (1) ◽  
pp. 433-458 ◽  
Author(s):  
Rion Brattig Correia ◽  
Ian B. Wood ◽  
Johan Bollen ◽  
Luis M. Rocha

Social media data have been increasingly used to study biomedical and health-related phenomena. From cohort-level discussions of a condition to population-level analyses of sentiment, social media have provided scientists with unprecedented amounts of data to study human behavior associated with a variety of health conditions and medical treatments. Here we review recent work in mining social media for biomedical, epidemiological, and social phenomena information relevant to the multilevel complexity of human health. We pay particular attention to topics where social media data analysis has shown the most progress, including pharmacovigilance and sentiment analysis, especially for mental health. We also discuss a variety of innovative uses of social media data for health-related applications as well as important limitations of social media data access and use.


2020 ◽  
Vol 8 (1) ◽  
pp. e001190
Author(s):  
Adrian Ahne ◽  
Francisco Orchard ◽  
Xavier Tannier ◽  
Camille Perchoux ◽  
Beverley Balkau ◽  
...  

INTRODUCTION Little research has been done to systematically evaluate concerns of people living with diabetes through social media, which has been a powerful tool for social change and to better understand perceptions around health-related issues. This study aims to identify key diabetes-related concerns in the USA and primary emotions associated with those concerns using information shared on Twitter.
RESEARCH DESIGN AND METHODS A total of 11.7 million diabetes-related tweets in English were collected between April 2017 and July 2019. Machine learning methods were used to filter tweets with personal content, to geolocate (to the USA) and to identify clusters of tweets with emotional elements. A sentiment analysis was then applied to each cluster.
RESULTS We identified 46 407 tweets with emotional elements in the USA from which 30 clusters were identified; 5 clusters (18% of tweets) were related to insulin pricing with both positive emotions (joy, love) referring to advocacy for affordable insulin and sadness emotions related to the frustration of insulin prices, 5 clusters (12% of tweets) to solidarity and support with a majority of joy and love emotions expressed. The most negative topics (10% of tweets) were related to diabetes distress (24% sadness, 27% anger, 21% fear elements), to diabetic and insulin shock (45% anger, 46% fear) and comorbidities (40% sadness).
CONCLUSIONS Using social media data, we have been able to describe key diabetes-related concerns and their associated emotions. More specifically, we were able to highlight the real-world concerns of insulin pricing and its negative impact on mood. Using such data can be a useful addition to current measures that inform public decision making around topics of concern and burden among people with diabetes.


BMJ Open ◽  
2018 ◽  
Vol 8 (12) ◽  
pp. e022931 ◽  
Author(s):  
Joanna Taylor ◽  
Claudia Pagliari

INTRODUCTION The rising popularity of social media, since their inception around 20 years ago, has been echoed in the growth of health-related research using data derived from them. This has created a demand for literature reviews to synthesise this emerging evidence base and inform future activities. Existing reviews tend to be narrow in scope, with limited consideration of the different types of data, analytical methods and ethical issues involved. There has also been a tendency for research to be siloed within different academic communities (eg, computer science, public health), hindering knowledge translation. To address these limitations, we will undertake a comprehensive scoping review, to systematically capture the broad corpus of published, health-related research based on social media data. Here, we present the review protocol and the pilot analyses used to inform it.
METHODS A version of Arksey and O’Malley’s five-stage scoping review framework will be followed: (1) identifying the research question; (2) identifying the relevant literature; (3) selecting the studies; (4) charting the data and (5) collating, summarising and reporting the results. To inform the search strategy, we developed an inclusive list of keyword combinations related to social media, health and relevant methodologies. The frequency and variability of terms were charted over time and cross referenced with significant events, such as the advent of Twitter. Five leading health, informatics, business and cross-disciplinary databases will be searched: PubMed, Scopus, Association for Computing Machinery, Institute of Electrical and Electronics Engineers and Applied Social Sciences Index and Abstracts, alongside the Google search engine. There will be no restriction by date.
ETHICS AND DISSEMINATION The review focuses on published research in the public domain; therefore, no ethics approval is required. The completed review will be submitted for publication to a peer-reviewed, interdisciplinary open access journal, and conferences on public health and digital research.


2021 ◽  
Vol 8 (1) ◽  
pp. 205395172110103
Author(s):  
Sabina Leonelli ◽  
Rebecca Lovell ◽  
Benedict W Wheeler ◽  
Lora Fleming ◽  
Hywel Williams

The paper problematises the reliability and ethics of using social media data, such as sourced from Twitter or Instagram, to carry out health-related research. As in many other domains, the opportunity to mine social media for information has been hailed as transformative for research on well-being and disease. Considerations around the fairness, responsibilities and accountabilities relating to using such data have often been set aside, on the understanding that as long as data were anonymised, no real ethical or scientific issue would arise. We first counter this perception by emphasising that the use of social media data in health research can yield problematic and unethical results. We then provide a conceptualisation of methodological data fairness that can complement data management principles such as FAIR by enhancing the actionability of social media data for future research. We highlight the forms that methodological data fairness can take at different stages of the research process and identify practical steps through which researchers can ensure that their practices and outcomes are scientifically sound as well as fair to society at large. We conclude that making research data fair as well as FAIR is inextricably linked to concerns around the adequacy of data practices. The failure to act on those concerns raises serious ethical, methodological and epistemic issues with the knowledge and evidence that are being produced.


2020 ◽  
Author(s):  
Oladapo Oyebode ◽  
Chinenye Ndulue ◽  
Ashfaq Adib ◽  
Dinesh Mulchandani ◽  
Banuchitra Suruliraj ◽  
...  

BACKGROUND The COVID-19 pandemic has caused a global health crisis that affects many aspects of human lives. In the absence of vaccines and antivirals, several behavioural change and policy initiatives, such as physical distancing, have been implemented to control the spread of the coronavirus. Social media data can reveal public perceptions toward how governments and health agencies across the globe are handling the pandemic, as well as the impact of the disease on people regardless of their geographic locations in line with various factors that hinder or facilitate the efforts to control the spread of the pandemic globally.
OBJECTIVE This paper aims to investigate the impact of the COVID-19 pandemic on people globally using social media data.
METHODS We apply natural language processing (NLP) and thematic analysis to understand public opinions, experiences, and issues with respect to the COVID-19 pandemic using social media data. First, we collect over 47 million COVID-19-related comments from Twitter, Facebook, YouTube, and three online discussion forums. Second, we perform data preprocessing, which involves applying NLP techniques to clean and prepare the data for automated theme extraction. Third, we apply a context-aware NLP approach to extract meaningful keyphrases or themes from over 1 million randomly selected comments, as well as compute sentiment scores for each theme and assign sentiment polarity (i.e., positive, negative, or neutral) based on the scores using a lexicon-based technique. Fourth, we categorize related themes into broader themes.
RESULTS A total of 34 negative themes emerged, out of which 15 are health-related issues, psychosocial issues, and social issues related to the COVID-19 pandemic from the public perspective. Some of the health-related issues are increased mortality, health concerns, struggling health systems, and fitness issues; while some of the psychosocial issues include frustrations due to life disruptions, panic shopping, and expression of fear. Social issues include harassment, domestic violence, and wrong societal attitude. In addition, 20 positive themes emerged from our results. Some of the positive themes include public awareness, encouragement, gratitude, cleaner environment, online learning, charity, spiritual support, and innovative research.
CONCLUSIONS We uncover various negative and positive themes representing public perceptions toward the COVID-19 pandemic and recommend interventions that can help address the health, psychosocial, and social issues based on the positive themes and other remedial ideas rooted in research. These interventions will help governments, health professionals and agencies, institutions, and individuals in their efforts to curb the spread of COVID-19 and minimize its impact, as well as in reacting to any future pandemics.
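The third step in METHODS, scoring a theme and mapping the score to a polarity with a lexicon, might look roughly like the sketch below. The abstract does not name the lexicon or the thresholds used, so the toy `LEXICON`, the averaging rule in `sentiment_score`, and the `threshold` value in `polarity` are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon: word -> valence in [-1, 1]. The study's actual
# lexicon is not specified in the abstract.
LEXICON = {
    "gratitude": 1.0,
    "support": 0.5,
    "frustration": -0.5,
    "fear": -1.0,
    "panic": -1.0,
}


def sentiment_score(text):
    """Average valence of the lexicon words found in the text (0.0 if none)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    hits = [LEXICON[token] for token in tokens if token in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def polarity(score, threshold=0.05):
    """Map a score to a label; the threshold is an assumed cut-off."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"
```

With this toy lexicon, `polarity(sentiment_score("panic shopping and fear"))` is "negative", while a phrase with no lexicon words, such as "online learning", scores 0.0 and is labelled "neutral". The neutral label for unmatched text shows how strongly a lexicon-based result depends on vocabulary coverage.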


Author(s):  
Jon Parker ◽  
Andrew Yates ◽  
Nazli Goharian ◽  
Ophir Frieder

Author(s):  
Lu He ◽  
Tingjue Yin ◽  
Zhaoxian Hu ◽  
Yunan Chen ◽  
David A Hanauer ◽  
...  

OBJECTIVE Sentiment analysis is a popular tool for analyzing health-related social media content. However, existing studies exhibit numerous methodological issues and inconsistencies with respect to research design and results reporting, which could lead to biased data, imprecise or incorrect conclusions, or incomparable results across studies. This article reports a systematic analysis of the literature with respect to such issues. The objective was to develop a standardized protocol for improving the research validity and comparability of results in future relevant studies.
MATERIALS AND METHODS We developed the Protocol of Analysis of senTiment in Health (PATH) based on a systematic review that analyzed common research design choices and how such choices were made, or reported, among eligible studies published 2010-2019.
RESULTS Of 409 articles screened, 89 met the inclusion criteria. A total of 16 distinctive research design choices were identified, 9 of which have significant methodological or reporting inconsistencies among the articles reviewed, ranging from how relevance of study data was determined to how the sentiment analysis tool selected was validated. Based on this result, we developed the PATH protocol that encompasses all these distinctive design choices and highlights the ones for which careful consideration and detailed reporting are particularly warranted.
CONCLUSIONS A substantial degree of methodological and reporting inconsistency exists in the extant literature that applied sentiment analysis to analyzing health-related social media data. The PATH protocol developed through this research may contribute to mitigating such issues in future relevant studies.


2019 ◽  
Vol 15 (3) ◽  
pp. 187-201
Author(s):  
Chris Norval ◽  
Tristan Henderson

Social media have become a rich source of data, particularly in health research. Yet, the use of such data raises significant ethical questions about the need for the informed consent of those being studied. Consent mechanisms, if even obtained, are typically broad and inflexible, or place a significant burden on the participant. Machine learning algorithms show much promise for facilitating a “middle-ground” approach: using trained models to predict and automate granular consent decisions. Such techniques, however, raise a myriad of follow-on ethical and technical considerations. In this article, we present an exploratory user study (n = 67) in which we find that we can predict the appropriate flow of health-related social media data with reasonable accuracy, while minimizing undesired data leaks. We then attempt to deconstruct the findings of this study, identifying and discussing a number of real-world implications if such a technique were put into practice.


2016 ◽  
Vol 150 (4) ◽  
pp. S89
Author(s):  
Mark W. Reid ◽  
Michelle S. Keller ◽  
Cynthia B. Whitman ◽  
Corey Arnold ◽  
Francis Dailey ◽  
...  
