Using Natural Language Processing to Examine the Uptake, Content, and Readability of Media Coverage of a Pan-Canadian Drug Safety Research Project: Cross-Sectional Observational Study (Preprint)

Mapping Intimacies ◽

10.2196/preprints.13296 ◽

2019 ◽

Author(s):

Hossein Mohammadhassanzadeh ◽

Ingrid Sketris ◽

Robyn Traynor ◽

Susan Alexander ◽

Brandace Winquist ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Health Information ◽

Language Processing ◽

Drug Safety ◽

Media Coverage ◽

Reading Level ◽

Safety Communication ◽

Safety Research ◽

The Media

BACKGROUND Isotretinoin, used to treat cystic acne, increases the risk of miscarriage and fetal abnormalities when taken during pregnancy. The Health Canada–approved product monograph for isotretinoin includes pregnancy prevention guidelines. A recent study by the Canadian Network for Observational Drug Effect Studies (CNODES) on the occurrence of pregnancy and pregnancy outcomes during isotretinoin therapy estimated poor adherence to these guidelines. Media uptake of this study was unknown; awareness of this uptake could help improve drug safety communication. OBJECTIVE The aim of this study was to understand how the media present pharmacoepidemiological research, using the CNODES isotretinoin study as a case study. METHODS Google News was searched (April 25-May 6, 2016), using a predefined set of terms, for mentions of the CNODES study. In total, 26 articles and 3 CNODES publications (original article, press release, and podcast) were identified. The article texts were cleaned (eg, advertisements and links removed), and the podcast was transcribed. A dictionary of 1295 unique words was created using natural language processing (NLP) techniques (term frequency-inverse document frequency, Porter stemming, and stop-word filtering) to identify common words and phrases. Similarity between the articles and reference publications was calculated using Euclidean distance; articles were grouped using hierarchical agglomerative clustering. Nine readability scales were applied to measure text readability based on factors such as number of words, difficult words, syllables, sentence counts, and other textual metrics. RESULTS The top 5 dictionary words were "pregnancy" (250 appearances), "isotretinoin" (220), "study" (209), "drug" (201), and "women" (185). Three distinct clusters were identified: Clusters 2 (5 articles) and 3 (4 articles) were from health-related websites and media, respectively; Cluster 1 (18 articles) contained largely media sources; 2 articles fell outside these clusters. Use of the term "isotretinoin" versus "Accutane" (a brand name of isotretinoin), discussion of pregnancy complications, and assignment of responsibility for guideline adherence varied between clusters. For example, the stemmed term "pregnanc" appeared most often in Clusters 1 (14.6 average times per article) and 2 (11.4) and relatively infrequently in Cluster 3 (1.8). The average reading grade level across all articles was high (eg, Flesch-Kincaid, 13; Gunning Fog, 15; SMOG Index, 10; Coleman Liau Index, 15; Linsear Write Index, 13; and Text Standard, 13). Readability improved from Cluster 2 (Gunning Fog of 16.9) to Cluster 3 (12.2). Reading level varied between clusters (average 13th-15th grade) but overall exceeded the recommended health information reading level (6th to 8th grade). CONCLUSIONS Media interpretation of the CNODES study varied, with differences in synonym usage and areas of focus. All articles were written above the recommended health information reading level. Analyzing media using NLP techniques can help determine drug safety communication effectiveness. This project is important for understanding how drug safety studies are taken up and redistributed in the media.
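The similarity and clustering steps named in the methods can be illustrated with a minimal sketch. It is not the authors' pipeline: the four-document corpus, the whitespace tokenizer, the unsmoothed log(N/df) weighting, and the average-linkage merge rule are all assumptions made for illustration, and Porter stemming and stop-word filtering are omitted.

```python
import math
from collections import Counter

# Hypothetical mini-corpus; the study's 26 articles are not reproduced here.
docs = {
    "press_release": "isotretinoin pregnancy study women guidelines adherence",
    "news_a": "isotretinoin pregnancy study women risk guidelines",
    "news_b": "accutane acne drug birth defects warning",
    "blog_c": "accutane acne drug warning teens",
}

def tfidf_vectors(texts):
    """Return (vocabulary, {name: vector}) with weights tf * log(N / df)."""
    tokens = {name: text.lower().split() for name, text in texts.items()}
    vocab = sorted({w for words in tokens.values() for w in words})
    n_docs = len(tokens)
    # Document frequency: number of documents containing each word.
    df = Counter(w for words in tokens.values() for w in set(words))
    vectors = {}
    for name, words in tokens.items():
        tf = Counter(words)
        vectors[name] = [tf[w] * math.log(n_docs / df[w]) for w in vocab]
    return vocab, vectors

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def agglomerate(vectors, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Average linkage is an assumption; the abstract does not name a linkage.
    """
    clusters = [[name] for name in vectors]

    def linkage(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(euclidean(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)

    while len(clusters) > n_clusters:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

vocab, vectors = tfidf_vectors(docs)
clusters = agglomerate(vectors, n_clusters=2)
print(clusters)
```

In this toy corpus the two documents that use "isotretinoin" end up in one cluster and the two that use "Accutane" in the other, which mirrors the kind of synonym-driven separation the abstract reports.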


Using Natural Language Processing to Examine the Uptake, Content, and Readability of Media Coverage of a Pan-Canadian Drug Safety Research Project: Cross-Sectional Observational Study

JMIR Formative Research ◽

10.2196/13296 ◽

2020 ◽

Vol 4 (1) ◽

pp. e13296

Author(s):

Hossein Mohammadhassanzadeh ◽

Ingrid Sketris ◽

Robyn Traynor ◽

Susan Alexander ◽

Brandace Winquist ◽

...

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Health Information ◽

Language Processing ◽

Drug Safety ◽

Media Coverage ◽

Reading Level ◽

Safety Communication ◽

Safety Research ◽

The Media

Background Isotretinoin, used to treat cystic acne, increases the risk of miscarriage and fetal abnormalities when taken during pregnancy. The Health Canada–approved product monograph for isotretinoin includes pregnancy prevention guidelines. A recent study by the Canadian Network for Observational Drug Effect Studies (CNODES) on the occurrence of pregnancy and pregnancy outcomes during isotretinoin therapy estimated poor adherence to these guidelines. Media uptake of this study was unknown; awareness of this uptake could help improve drug safety communication. Objective The aim of this study was to understand how the media present pharmacoepidemiological research, using the CNODES isotretinoin study as a case study. Methods Google News was searched (April 25-May 6, 2016), using a predefined set of terms, for mentions of the CNODES study. In total, 26 articles and 3 CNODES publications (original article, press release, and podcast) were identified. The article texts were cleaned (eg, advertisements and links removed), and the podcast was transcribed. A dictionary of 1295 unique words was created using natural language processing (NLP) techniques (term frequency-inverse document frequency, Porter stemming, and stop-word filtering) to identify common words and phrases. Similarity between the articles and reference publications was calculated using Euclidean distance; articles were grouped using hierarchical agglomerative clustering. Nine readability scales were applied to measure text readability based on factors such as number of words, difficult words, syllables, sentence counts, and other textual metrics. Results The top 5 dictionary words were "pregnancy" (250 appearances), "isotretinoin" (220), "study" (209), "drug" (201), and "women" (185). Three distinct clusters were identified: Clusters 2 (5 articles) and 3 (4 articles) were from health-related websites and media, respectively; Cluster 1 (18 articles) contained largely media sources; 2 articles fell outside these clusters. Use of the term "isotretinoin" versus "Accutane" (a brand name of isotretinoin), discussion of pregnancy complications, and assignment of responsibility for guideline adherence varied between clusters. For example, the stemmed term "pregnanc" appeared most often in Clusters 1 (14.6 average times per article) and 2 (11.4) and relatively infrequently in Cluster 3 (1.8). The average reading grade level across all articles was high (eg, Flesch-Kincaid, 13; Gunning Fog, 15; SMOG Index, 10; Coleman Liau Index, 15; Linsear Write Index, 13; and Text Standard, 13). Readability improved from Cluster 2 (Gunning Fog of 16.9) to Cluster 3 (12.2). Reading level varied between clusters (average 13th-15th grade) but overall exceeded the recommended health information reading level (6th to 8th grade). Conclusions Media interpretation of the CNODES study varied, with differences in synonym usage and areas of focus. All articles were written above the recommended health information reading level. Analyzing media using NLP techniques can help determine drug safety communication effectiveness. This project is important for understanding how drug safety studies are taken up and redistributed in the media.
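To make the grade-level figures concrete, the sketch below computes one of the named scales, the Flesch-Kincaid grade level, as 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The syllable counter is a rough vowel-group heuristic and the two sample texts are invented for illustration; the tooling the authors actually used is not stated in the abstract, so its scores may differ.

```python
import re

def count_syllables(word):
    """Rough heuristic: count vowel groups, discounting a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level for a non-empty English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59

# Hypothetical sample texts, not taken from the analyzed articles.
simple = "The drug can harm a baby. Do not take it if you may be pregnant."
dense = ("Isotretinoin exposure during pregnancy substantially increases the "
         "probability of spontaneous abortion and congenital abnormalities.")

print(round(flesch_kincaid_grade(simple), 1))
print(round(flesch_kincaid_grade(dense), 1))
```

Short sentences built from short words score near the bottom of the scale, while a single long sentence of polysyllabic terms scores far above the 6th to 8th grade range recommended for health information.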


Correction: Using Natural Language Processing to Examine the Uptake, Content, and Readability of Media Coverage of a Pan-Canadian Drug Safety Research Project: Cross-Sectional Observational Study

JMIR Formative Research ◽

10.2196/20211 ◽

2020 ◽

Vol 4 (6) ◽

pp. e20211

Author(s):

Hossein Mohammadhassanzadeh ◽

Ingrid Sketris ◽

Robyn Traynor ◽

Susan Alexander ◽

Brandace Winquist ◽

...

Keyword(s):

Natural Language Processing ◽

Observational Study ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Media Coverage ◽

Research Project ◽

Cross Sectional ◽

Safety Research


Correction: Using Natural Language Processing to Examine the Uptake, Content, and Readability of Media Coverage of a Pan-Canadian Drug Safety Research Project: Cross-Sectional Observational Study (Preprint)

10.2196/preprints.20211 ◽

2020 ◽

Author(s):

Hossein Mohammadhassanzadeh ◽

Ingrid Sketris ◽

Robyn Traynor ◽

Susan Alexander ◽

Brandace Winquist ◽

...

Keyword(s):

Natural Language Processing ◽

Observational Study ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Media Coverage ◽

Research Project ◽

Cross Sectional ◽

Safety Research


Social Media and Political Campaigning

The International Journal of Press/Politics ◽

10.1177/1940161216673196 ◽

2016 ◽

Vol 22 (1) ◽

pp. 23-42 ◽

Cited By ~ 16

Author(s):

Michael J. Jensen

Keyword(s):

Social Media ◽

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Public Figure ◽

Political Campaigning ◽

The Media ◽

Formal Properties ◽

Political Figure ◽

Communication Operations

This paper develops a way of analyzing the structure of campaign communications within Twitter. The structure of communication affordances creates opportunities for a horizontal organization of power within Twitter interactions. However, one cannot infer the structure of interactions as they materialize from the formal properties of the technical environment in which the communications occur. Consequently, the paper identifies three categories of empowering communication operations that can occur on Twitter: campaigns can respond to others, campaigns can retweet others, and campaigns can call for others to become involved in the campaign on their own terms. The paper operationalizes these categories in the context of the 2015 U.K. general election. To determine whether Twitter is used to empower laypersons, the profiles of each account retweeted and replied to were retrieved and analyzed using natural language processing to identify whether an account belongs to a political figure, a member of the media, or some other public figure. In addition, tweets and retweets are compared with respect to the manner in which key election issues are discussed. The findings indicate that empowering uses of Twitter are fairly marginal, and that retweets use almost the same policy language as the original campaign tweets.


Exploring the Non-Medical impacts of Covid-19 using Natural Language Processing

10.20944/preprints202011.0056.v1 ◽

2020 ◽

Author(s):

Amol Agade ◽

Samta Balpande

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Global Economy ◽

Latent Dirichlet Allocation ◽

Topic Modelling ◽

Financial Industry ◽

Distance Map ◽

The Media ◽

Non Negative Matrix Factorization

The ongoing COVID-19 pandemic has caused massive damage to various parts of the global economy, disrupting human livelihoods. Natural language processing (NLP) has been used extensively across organizations to categorize sentiment, make recommendations, summarize information, and model topics. This research aims to understand the non-medical impact of COVID-19 on the global economy using NLP methods. The methodology comprises text classification, including topic modelling, on an unstructured dataset of COVID-19 media articles provided by Anacode. Two NLP algorithms, latent Dirichlet allocation (LDA) and non-negative matrix factorization (NMF), are proposed to classify the media articles in order to analyze the impact of the COVID-19 pandemic on different sectors of the global economy. Model accuracy was assessed using coherence and perplexity scores, which were 0.51 and -10.90, respectively, for the LDA algorithm. Both the LDA and NMF algorithms identified similar prevalent topics affected by the COVID-19 pandemic across multiple sectors of the economy. The intertopic distance map produced by the LDA algorithm suggests that general industries, which include children's schooling, parental care, and family gatherings, were affected most, followed by the business sector and the financial industry.
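As a rough illustration of one of the two topic models named above, the following sketch factorizes a small document-term matrix V into non-negative factors W (documents by topics) and H (topics by terms) using multiplicative updates for squared error. The update rule, the toy matrix, and the helper names (`matmul`, `transpose`, `squared_error`, `nmf`) are assumptions for illustration; the abstract does not say which NMF solver or library was used.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def squared_error(v, w, h):
    """Sum of squared differences between V and the reconstruction W @ H."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, n_topics, n_iter=200, seed=0, eps=1e-9):
    """Factorize V ~ W @ H with multiplicative updates (assumed solver)."""
    rng = random.Random(seed)
    n_docs, n_terms = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(n_topics)] for _ in range(n_docs)]
    h = [[rng.random() + 0.1 for _ in range(n_terms)] for _ in range(n_topics)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(n_terms)]
             for k in range(n_topics)]
        # W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(n_topics)]
             for i in range(n_docs)]
    return w, h

# Hypothetical document-term counts: rows are articles, columns are terms.
v = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 4, 1],
    [0, 0, 1, 4],
]

w, h = nmf(v, n_topics=2)
print(round(squared_error(v, w, h), 3))
```

Each row of H can be read as a topic's term weights and each row of W as an article's topic mixture; in practice the input would be a TF-IDF or count matrix built from the article texts.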


4349 Survey of Regulatory Reforms to Address Comprehension of Clinical Trial Results

Journal of Clinical and Translational Science ◽

10.1017/cts.2020.350 ◽

2020 ◽

Vol 4 (s1) ◽

pp. 115-115

Author(s):

Matthieu Kirkland ◽

Christian Reyes ◽

Nancy Pire-Smerkanich ◽

Eunjoo Pacifici

Keyword(s):

Clinical Trial ◽

Natural Language Processing ◽

Natural Language ◽

Clinical Research ◽

Language Processing ◽

Reading Level ◽

Medical Community ◽

European Medicines Agency ◽

Plain Language ◽

Clinical Trial Results

OBJECTIVES/GOALS: Clinical research is the backbone of the medical community. However, there are few regulations to ensure clinical trial participants can understand their results, leading to volunteers feeling unvalued and unlikely to enroll in trials [1]. This study examines the need for lay summaries. METHODS/STUDY POPULATION: To understand the current landscape of clinical trial summaries, literature searches were conducted using the University of Southern California Library database with keywords Title contains “lay language” OR “lay summary” AND any field contains “Trial” OR “clinical”, and Title contains “natural language processing” AND “clinical trial” OR “Summary”. Studies were deemed relevant if they discussed lay language summaries in health care or the use of Natural Language Processing (NLP) to increase comprehension. Papers published by the Center for Information and Study on Clinical Research Participation (CISCRP) were reviewed and their Associate Director was interviewed. RESULTS/ANTICIPATED RESULTS: Of 67 total results, 14 were determined to be relevant. Ten of the relevant results examined lay language summaries and their regulation and 4 were NLP studies. The European Medicines Agency set regulations mandating clinical trial summaries. However, researchers have difficulty validating summaries to an appropriate reading level [2]. Difficulty and potential bias halted a U.S. mandate of lay summaries [3]. The nonprofit CISCRP has partnered with industry to develop unbiased clinical trial summaries, resulting in all volunteers feeling appreciated and 91% understanding clinical trial results after reading the summary [1]. Similarly, NLP software for annotating Electronic Health Records increased comprehension for 77% of patients [4]. DISCUSSION/SIGNIFICANCE OF IMPACT: In the U.S., a lack of regulations mandating lay summaries may be related to concerns by regulatory agencies that summaries in plain language may introduce bias [3]. Future work on integrating NLP systems into clinical trials may create unbiased summaries and allow for FDA regulation.


Representing the True and False Text Information About Human Papillomavirus Vaccines

Proceedings of the International Symposium on Human Factors and Ergonomics in Health Care ◽

10.1177/2327857920091070 ◽

2020 ◽

Vol 9 (1) ◽

pp. 317-321

Author(s):

Chieh-Li Chin ◽

Wen-Yuh Su ◽

Jessie Chin

Keyword(s):

Natural Language Processing ◽

Human Papillomavirus ◽

Natural Language ◽

Health Information ◽

Language Processing ◽

False Information ◽

Computational Approaches ◽

Global Issues ◽

Health Domains ◽

Text Information

While the virality of misinformation has been recognized as a significant global issue in modern societies, few studies have examined computational approaches to representing and identifying false information in health domains. The current study aimed to use both psycholinguistic and natural language processing models to represent verified true and false texts about human papillomavirus (HPV) vaccines. Compared with conventional word-embedding models that represent texts at the level of words, sentences, or documents, results showed that introducing embeddings at the level of propositions best differentiated the semantic representations of true and false texts. The study advances our understanding of how to represent health texts and has implications for detecting false health information.


Identification of Adverse Drug Event–Related Japanese Articles: Natural Language Processing Analysis (Preprint)

10.2196/preprints.22661 ◽

2020 ◽

Author(s):

Shogo Ujiie ◽

Shuntaro Yada ◽

Shoko Wakamiya ◽

Eiji Aramaki

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Drug Safety ◽

Automated System ◽

Pharmaceutical Companies ◽

Manual Labor ◽

Sentence Level ◽

Medical Articles ◽

Document Level

BACKGROUND Medical articles covering adverse drug events (ADEs) are systematically reported by pharmaceutical companies for drug safety information purposes. Although policies governing reporting to regulatory bodies vary among countries and regions, all medical article reporting may be categorized as precision or recall based. Recall-based reporting, which is implemented in Japan, requires the reporting of any possible ADE. Therefore, recall-based reporting can introduce numerous false negatives or substantial amounts of noise, a problem that is difficult to address using limited manual labor. OBJECTIVE Our aim was to develop an automated system that could identify ADE-related medical articles, support recall-based reporting, and alleviate manual labor in Japanese pharmaceutical companies. METHODS Using medical articles as input, our system based on natural language processing applies document-level classification to extract articles containing ADEs (replacing manual labor in the first screening) and sentence-level classification to extract sentences within those articles that imply ADEs (thus supporting experts in the second screening). We used 509 Japanese medical articles annotated by a medical engineer to evaluate the performance of the proposed system. RESULTS Document-level classification yielded an F1 of 0.903. Sentence-level classification yielded an F1 of 0.413. These were averages of fivefold cross-validations. CONCLUSIONS A simple automated system may alleviate the manual labor involved in screening drug safety–related medical articles in pharmaceutical companies. After improving the accuracy of the sentence-level classification by considering a wider context, we intend to apply this system toward real-world postmarketing surveillance.
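The evaluation metric reported above can be sketched as follows: F1 is the harmonic mean of precision and recall, and the abstract states that the reported values are averages over fivefold cross-validation. The per-fold counts below are hypothetical placeholders, not the study's data, and macro-averaging per-fold F1 is one plausible reading of "averages".

```python
def f1_score(tp, fp, fn):
    """F1 from true positives, false positives, and false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical (tp, fp, fn) counts for five cross-validation folds.
folds = [(45, 5, 4), (44, 6, 5), (46, 4, 6), (43, 5, 5), (45, 6, 3)]

# Average the per-fold F1 values (assumed averaging scheme).
mean_f1 = sum(f1_score(*fold) for fold in folds) / len(folds)
print(round(mean_f1, 3))
```

Because F1 ignores true negatives, it suits screening tasks like this one, where articles or sentences without ADEs are expected to far outnumber those with them.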


Ontology-Based Natural Language Processing of Social Media Data in the Assessment of Health Information Sought During Pregnancy

10.3233/shti210668 ◽

2021 ◽

Author(s):

Joo Yun Lee

Keyword(s):

Social Media ◽

Natural Language Processing ◽

South Korea ◽

Natural Language ◽

Family Support ◽

Health Information ◽

Language Processing ◽

Social Media Data ◽

Media Data

This study analyzed collected social media data from South Korea containing keywords related to “pregnancy” using ontology-based natural language processing. Of the 504,725 documents, those containing concepts related to “maternal emotion” were the most frequent, followed by “family support”. Social media were used as a means of exchanging information and expressing emotions.


Applying Natural Language Processing to Evaluate News Media Coverage of Bullying and Cyberbullying

Prevention Science ◽

10.1007/s11121-019-01029-x ◽

2019 ◽

Vol 20 (8) ◽

pp. 1274-1283 ◽

Cited By ~ 3

Author(s):

Megan A. Moreno ◽

Aubrey D. Gower ◽

Heather Brittain ◽

Tracy Vaillancourt

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

News Media ◽

Media Coverage
