Survey of Regulatory Reforms to Address Comprehension of Clinical Trial Results

2020 ◽  
Vol 4 (s1) ◽  
pp. 115-115
Author(s):  
Matthieu Kirkland ◽  
Christian Reyes ◽  
Nancy Pire-Smerkanich ◽  
Eunjoo Pacifici

OBJECTIVES/GOALS: Clinical research is the backbone of the medical community. However, few regulations ensure that clinical trial participants can understand their results, leaving volunteers feeling unvalued and unlikely to enroll in future trials1. This study examines the need for lay summaries. METHODS/STUDY POPULATION: To understand the current landscape of clinical trial summaries, literature searches were conducted using the University of Southern California Library database with the keywords Title contains “lay language” OR “lay summary” AND any field contains “Trial” OR “clinical”, and Title contains “natural language processing” AND “clinical trial” OR “Summary”. Studies were deemed relevant if they discussed lay language summaries in health care settings or the use of Natural Language Processing (NLP) to increase comprehension. Papers published by the Center for Information and Study on Clinical Research Participation (CISCRP) were reviewed, and their Associate Director was interviewed. RESULTS/ANTICIPATED RESULTS: Of 67 total results, 14 were determined to be relevant. Ten of the relevant results examined lay language summaries and their regulation, and 4 were NLP studies. The European Medicines Agency has set regulations mandating clinical trial summaries; however, researchers have difficulty validating them at an appropriate reading level2. Difficulty and potential bias halted a U.S. mandate of lay summaries3. The nonprofit CISCRP has partnered with industry to develop unbiased clinical trial summaries, resulting in all volunteers feeling appreciated and 91% understanding clinical trial results after reading a summary1. Similarly, NLP software for annotating Electronic Health Records increased comprehension for 77% of patients4. DISCUSSION/SIGNIFICANCE OF IMPACT: In the U.S., the lack of regulations mandating lay summaries may be related to concerns by regulatory agencies that summaries in plain language may introduce bias3. Future research into integrating NLP systems into clinical trials may produce unbiased summaries and allow for FDA regulation.
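The difficulty of validating summaries at an appropriate reading level, noted above, is often approached with standard readability formulas. The sketch below computes the Flesch Reading Ease score with a rough vowel-group syllable heuristic; the sample texts and the idea of comparing a plain summary against dense clinical prose are illustrative, not taken from the study.

```python
import re

def count_syllables(word: str) -> int:
    """Rough syllable count: number of vowel groups, minimum 1."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text.
    Scores around 60+ roughly correspond to plain language."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * len(words) / len(sentences)
            - 84.6 * syllables / len(words))

plain = "The drug was safe. Most people felt better. Few had side effects."
dense = ("Pharmacokinetic evaluation demonstrated satisfactory tolerability "
         "notwithstanding heterogeneous interindividual variability.")
print(flesch_reading_ease(plain) > flesch_reading_ease(dense))  # → True
```

A validation step like this could flag draft summaries that fall below a target readability threshold before they reach participants.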

2021 ◽  
Vol 30 (01) ◽  
pp. 263-263

Jin Q, Tan C, Chen M, Liu X, Huang S. Predicting clinical trial results by implicit evidence integration. https://arxiv.org/abs/2010.05639
Poerner N, Waltinger U, Schütze H. Inexpensive domain adaptation of pre-trained language models: case studies on biomedical NER and Covid-19 QA. https://arxiv.org/abs/2004.03354
Ive J, Viani N, Kam J, Yin L, Verma S, Puntis S, Cardinal R, Roberts A, Stewart R, Velupillai S. Generation and evaluation of artificial mental health records for natural language processing. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7224173/


2021 ◽  
Vol 11 (6) ◽  
pp. 2663
Author(s):  
Zhengru Shen ◽  
Marco Spruit

The Summary of Product Characteristics from the European Medicines Agency is a reference document on medicines in the EU. It contains textual information for clinical experts on how to use medicines safely, including adverse drug reactions. Using natural language processing (NLP) techniques to automatically extract adverse drug reactions from such unstructured text helps clinical experts use them effectively and efficiently in daily practice. Such techniques have been developed for Structured Product Labels from the Food and Drug Administration (FDA), but no research has focused on extraction from the Summary of Product Characteristics. In this work, we built an NLP pipeline that automatically scrapes Summaries of Product Characteristics online and then extracts adverse drug reactions from them. In addition, we have made the method and its output publicly available so that they can be reused and further evaluated in clinical practice. In total, we extracted 32,797 common adverse drug reactions for 647 common medicines scraped from the Electronic Medicines Compendium. A manual review of 37 commonly used medicines indicated good performance, with a recall of 0.99 and a precision of 0.934.
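The extraction step of a pipeline like this can be sketched as a dictionary lookup over the "Undesirable effects" section of an SmPC-style document. The section-numbering regex follows the standard SmPC template, but the term list and sample text below are hypothetical stand-ins; the authors' actual pipeline scrapes the Electronic Medicines Compendium and would draw on a full ADR vocabulary such as MedDRA.

```python
import re

# Hypothetical mini-dictionary of ADR terms (a real pipeline would use
# a controlled vocabulary such as MedDRA preferred terms).
ADR_TERMS = {"headache", "nausea", "dizziness", "rash", "fatigue"}

def extract_adr_section(smpc_text: str) -> str:
    """Pull section 4.8 ('Undesirable effects') out of SmPC-style text.
    Section numbering follows the standard SmPC template."""
    m = re.search(r"4\.8[^\n]*\n(.*?)(?=\n4\.9|\Z)", smpc_text, re.S)
    return m.group(1) if m else ""

def extract_adrs(smpc_text: str) -> set:
    """Dictionary lookup of known ADR terms within the 4.8 section."""
    section = extract_adr_section(smpc_text).lower()
    words = set(re.findall(r"[a-z]+", section))
    return ADR_TERMS & words

sample = """4.7 Effects on ability to drive
None known.
4.8 Undesirable effects
Very common: headache, nausea.
Common: rash.
4.9 Overdose
Supportive care."""
print(sorted(extract_adrs(sample)))  # → ['headache', 'nausea', 'rash']
```

Restricting matching to section 4.8 avoids false positives from contraindication or overdose sections, which mention the same symptom terms in a different sense.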


2019 ◽  
Vol 19 (1) ◽  
Author(s):  
Simon Geletta ◽  
Lendie Follett ◽  
Marcia Laugerman

Abstract Background This study used natural language processing (NLP) and machine learning (ML) techniques to identify reliable patterns within research narrative documents that distinguish studies that complete successfully from those that terminate. Recent research findings report that at least 10% of all studies funded by major research funding agencies terminate without yielding useful results. Since it is well known that studies receiving funding from major agencies are carefully planned and rigorously vetted through peer review, it was surprising to us that study terminations are this prevalent. Moreover, our review of the literature suggested that the reasons for study terminations are not well understood. We therefore aimed to address that knowledge gap by seeking to identify the factors that contribute to study failures. Method We used data from the ClinicalTrials.gov repository, from which we extracted both structured data (study characteristics) and unstructured data (the narrative descriptions of the studies). We applied NLP techniques to the unstructured data to quantify the risk of termination by identifying distinctive topics that are more frequently associated with terminated versus completed trials. We used Latent Dirichlet Allocation (LDA) to derive 25 “topics” with corresponding sets of probabilities, which we then used to predict study termination via random forest modeling. We fit two distinct models – one using only structured data as predictors, and another using both the structured data and the 25 text topics derived from the unstructured data. Results In this paper, we demonstrate the interpretive and predictive value of LDA as it relates to predicting clinical trial failure.
The results also demonstrate that the combined modeling approach yields robust predictive probabilities in terms of both sensitivity and specificity, relative to a model that uses the structured data alone. Conclusions Our study demonstrated that topic modeling with LDA significantly raises the utility of unstructured data in predicting the completion versus termination of studies. This study sets the direction for future research to evaluate the viability of health study designs.
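The two-stage approach described here (LDA topic proportions feeding a random forest) can be sketched with scikit-learn. The toy narratives, labels, and reduced topic count below are invented stand-ins for the paper's ClinicalTrials.gov study descriptions and its 25 topics.

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.ensemble import RandomForestClassifier

# Toy narratives standing in for ClinicalTrials.gov study descriptions.
docs = [
    "randomized placebo controlled trial of a novel oral agent",
    "study terminated early due to slow enrollment and funding loss",
    "open label safety study of an approved injectable therapy",
    "trial halted after sponsor withdrew funding amid low accrual",
] * 5
labels = np.array([0, 1, 0, 1] * 5)  # 0 = completed, 1 = terminated

# Step 1: bag-of-words counts, since LDA expects raw term frequencies.
counts = CountVectorizer().fit_transform(docs)

# Step 2: derive per-document topic proportions (the paper used 25 topics).
lda = LatentDirichletAllocation(n_components=4, random_state=0)
topic_probs = lda.fit_transform(counts)

# Step 3: feed the topic proportions into a random forest classifier.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(topic_probs, labels)
print(rf.score(topic_probs, labels))  # training accuracy on the toy corpus
```

The paper's second model would simply concatenate structured study characteristics alongside `topic_probs` before fitting the forest.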


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 1555-1555
Author(s):  
Eric J. Clayton ◽  
Imon Banerjee ◽  
Patrick J. Ward ◽  
Maggie D Howell ◽  
Beth Lohmueller ◽  
...  

1555 Background: Screening every patient for clinical trials is time-consuming, costly, and inefficient. Developing an automated method for identifying patients with potential disease progression, at the point where the practice first receives their radiology reports but prior to the patient’s office visit, would greatly increase the efficiency of clinical trial operations and likely result in more patients being offered trial opportunities. Methods: Using Natural Language Processing (NLP) methodology, we developed a text parsing algorithm to automatically extract information about potential new disease or disease progression from multi-institutional, free-text radiology reports (CT, PET, bone scan, MRI, or x-ray). We combined semantic dictionary mapping and machine learning techniques to normalize the linguistic and formatting variations in the text, training the XGBoost model specifically to achieve the high precision and accuracy required for clinical trial screening. To be comprehensive, we enhanced the model vocabulary using a multi-institutional dataset that includes reports from two academic institutions. Results: A dataset of 732 de-identified radiology reports was curated (two MDs agreed on potential new disease/disease progression vs. stable), and the model was repeatedly re-trained for each fold, with folds randomly selected. The final model achieved consistent precision (>0.87) and accuracy (>0.87). See the table for a summary of the results by radiology report type. We are continuing work on the model to validate accuracy and precision using a new and unique set of reports. Conclusions: NLP systems can be used to identify patients who may have new disease or disease progression, reducing the human effort in screening for clinical trials. Efforts are ongoing to integrate the NLP process into existing EHR reporting.
New imaging reports sent via interface to the EHR will be extracted daily using a database query and provided via secure electronic transport to the NLP system. Patients with a higher likelihood of disease progression will be automatically identified, and their reports routed to the clinical trials office for trial screening in parallel with physician EHR mailbox reporting. The overarching goal of the project is to increase clinical trial enrollment. 5-fold cross-validation performance of the NLP model in terms of accuracy, precision and recall averaged across all the folds.[Table: see text]
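The evaluation setup described above can be sketched as a text classifier under 5-fold cross-validation. The report snippets and labels below are invented, and scikit-learn's GradientBoostingClassifier stands in for the XGBoost model named in the abstract; TF-IDF weighting is one common way to normalize wording variation across institutions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Toy report snippets; real inputs are full free-text radiology reports.
reports = [
    "new hepatic lesion concerning for metastatic progression",
    "interval increase in size of pulmonary nodule",
    "stable postsurgical changes no new disease",
    "no significant interval change lesions stable",
] * 10
labels = [1, 1, 0, 0] * 10  # 1 = potential progression, 0 = stable

# TF-IDF normalizes wording variation; boosting stands in for XGBoost.
model = make_pipeline(TfidfVectorizer(),
                      GradientBoostingClassifier(random_state=0))

# 5-fold cross-validation mirrors the evaluation in the abstract.
scores = cross_val_score(model, reports, labels, cv=5)
print(scores.mean())  # mean accuracy across the five folds
```

In deployment, the fitted pipeline's `predict` call on each day's incoming reports would produce the progression flags that route cases to the clinical trials office.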


