Developing an FHIR-Based Computational Pipeline for Automatic Population of Case Report Forms for Colorectal Cancer Clinical Trials Using Electronic Health Records

JCO Clinical Cancer Informatics ◽

10.1200/cci.19.00116 ◽

2020 ◽

pp. 201-209 ◽

Cited By ~ 4

Author(s):

Nansu Zong ◽

Andrew Wen ◽

Daniel J. Stone ◽

Deepak K. Sharma ◽

Chen Wang ◽

...

Keyword(s):

Colorectal Cancer ◽

Clinical Trials ◽

Case Report ◽

Electronic Health Records ◽

Real World ◽

Cancer Clinical Trials ◽

Computational Pipeline ◽

Cancer Trial ◽

Health Records ◽

Electronic Health

PURPOSE The Fast Healthcare Interoperability Resources (FHIR) is emerging as a next-generation standards framework developed by HL7 for exchanging electronic health care data. The modeling capability of FHIR in standardizing cancer data has been gaining increasing attention from the cancer research informatics community. However, few studies have been conducted to examine the capability of FHIR in electronic data capture (EDC) applications for effective cancer clinical trials. The objective of this study was to design, develop, and evaluate an FHIR-based method that enables the automated population of case report forms (CRFs) for cancer clinical trials using real-world electronic health records (EHRs). MATERIALS AND METHODS We developed an FHIR-based computational pipeline of EDC with a case study for modeling colorectal cancer trials. We first leveraged an existing FHIR-based cancer profile to represent EHR data of patients with colorectal cancer, and then we used the FHIR Questionnaire and QuestionnaireResponse resources to represent the CRFs and their data population. To test the accuracy and overall quality of the computational pipeline, we used synoptic reports of 287 Mayo Clinic patients with colorectal cancer from 2013 to 2019 with standard measures of precision, recall, and F1 score. RESULTS Using the computational pipeline, a total of 1,037 synoptic reports were successfully converted into instances of the FHIR-based cancer profile. The average accuracy for converting all data elements (excluding tumor perforation) of the cancer profile was 0.99, using 200 randomly selected records. The average F1 score for populating nine questions of the CRFs in a real-world colorectal cancer trial was 0.95, using 100 randomly selected records. CONCLUSION We demonstrated that it is feasible to populate CRFs with EHR data in an automated manner with satisfactory performance. The outcome of the study provides helpful insight into future directions in implementing FHIR-based EDC applications for modern cancer clinical trials.
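
The two ideas in this abstract (representing a populated CRF as a FHIR QuestionnaireResponse, and scoring the population with precision, recall, and F1) can be illustrated with a minimal sketch. It assumes the FHIR R4 shape of QuestionnaireResponse; the questionnaire URL, the linkIds, the patient id, and the answer values are hypothetical and are not taken from the study. The scoring function treats each populated answer as a (record, value) pair, which is one common way to compute these measures and not necessarily the authors' exact procedure.

```python
def build_questionnaire_response(questionnaire_url, patient_id, answers):
    """Build a minimal FHIR R4 QuestionnaireResponse as a plain dict.

    answers maps a CRF question linkId (hypothetical here) to a string value.
    """
    return {
        "resourceType": "QuestionnaireResponse",
        "questionnaire": questionnaire_url,  # canonical URL of the CRF Questionnaire
        "status": "completed",
        "subject": {"reference": f"Patient/{patient_id}"},
        "item": [
            {"linkId": link_id, "answer": [{"valueString": value}]}
            for link_id, value in answers.items()
        ],
    }


def precision_recall_f1(gold, predicted):
    """Score populated answers against a gold standard.

    Both arguments are sets of (record_id, value) pairs for one CRF question.
    """
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one populated CRF and one scored question.
response = build_questionnaire_response(
    "http://example.org/Questionnaire/crc-crf",  # placeholder URL
    "example-patient",
    {"tumor-site": "Sigmoid colon", "pT-category": "pT3"},
)

gold = {("r1", "pT3"), ("r2", "pT2"), ("r3", "pT4"), ("r4", "pT1")}
predicted = {("r1", "pT3"), ("r2", "pT2"), ("r3", "pT3")}
precision, recall, f1 = precision_recall_f1(gold, predicted)
```

In this toy data two of three populated answers are correct and two of four gold answers are recovered, so precision is 2/3 and recall is 1/2.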

A Distribution-based Method for Assessing The Differences between Clinical Trial Target Populations and Patient Populations in Electronic Health Records

Applied Clinical Informatics ◽

10.4338/aci-2013-12-ra-0105 ◽

2014 ◽

Vol 05 (02) ◽

pp. 463-479 ◽

Cited By ~ 30

Author(s):

P. Ryan ◽

Y. Zhang ◽

F. Liu ◽

J. Gao ◽

J.T. Bigger ◽

...

Keyword(s):

Type 2 Diabetes ◽

Clinical Trial ◽

Clinical Trials ◽

Electronic Health Records ◽

Real World ◽

World Population ◽

Health Records ◽

Target Populations ◽

Electronic Health

Summary Objective: To improve the transparency of clinical trial generalizability and to illustrate the method using Type 2 diabetes as an example. Methods: Our data included 1,761 diabetes clinical trials and the electronic health records (EHR) of 26,120 patients with Type 2 diabetes who visited Columbia University Medical Center of New-York Presbyterian Hospital. The two populations were compared using the Generalizability Index for Study Traits (GIST) on the earliest diagnosis age and the mean hemoglobin A1c (HbA1c) values. Results: Greater than 70% of Type 2 diabetes studies allow patients with HbA1c measures between 7 and 10.5, but less than 40% of studies allow HbA1c<7 and fewer than 45% of studies allow HbA1c>10.5. In the real-world population, only 38% of patients had HbA1c between 7 and 10.5, with 12% having values above the range and 52% having HbA1c<7. The GIST for HbA1c was 0.51. Most studies adopted broad age value ranges, with the most common restrictions excluding patients >80 or <18 years. Most of the real-world population fell within this range, but 2% of patients were <18 at time of first diagnosis and 8% were >80. The GIST for age was 0.75. Conclusions: We contribute a scalable method to profile and compare aggregated clinical trial target populations with EHR patient populations. We demonstrate that Type 2 diabetes studies are more generalizable with regard to age than they are with regard to HbA1c. We found that the generalizability of age increased from Phase 1 to Phase 3 while the generalizability of HbA1c decreased during those same phases. This method can generalize to other medical conditions and other continuous or binary variables. We envision the potential use of EHR data for examining the generalizability of clinical trials and for defining population-representative clinical trial eligibility criteria.
Citation: Weng C, Li Y, Ryan P, Zhang Y, Liu F, Gao J, Bigger JT, Hripcsak G. A distribution-based method for assessing the differences between clinical trial target populations and patient populations in electronic health records. Appl Clin Inf 2014; 5: 463–479 http://dx.doi.org/10.4338/ACI-2013-12-RA-0105
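
The abstract does not give the GIST formula, so the following is only a sketch under an assumed reading: for each value interval of a trait, take the share of studies whose criteria admit that interval, then average those shares weighted by the share of real-world patients falling in the interval. The function name and all numbers below are hypothetical; the abstract reports only bounds for the study shares, so its figures cannot be plugged in exactly.

```python
def gist_sketch(intervals):
    """Assumed, simplified generalizability score for one trait.

    intervals is a list of (study_share, patient_share) pairs, one per value
    interval: the share of studies admitting the interval and the share of
    real-world patients whose value falls in it.
    """
    total_patient_share = sum(patient_share for _, patient_share in intervals)
    if total_patient_share == 0:
        return 0.0
    weighted = sum(study_share * patient_share
                   for study_share, patient_share in intervals)
    return weighted / total_patient_share


# Hypothetical HbA1c-like intervals: below range, within range, above range.
toy_intervals = [
    (0.40, 0.50),  # 40% of studies admit it; 50% of patients fall here
    (0.70, 0.40),
    (0.45, 0.10),
]
score = gist_sketch(toy_intervals)
```

Under this reading, a score near 1 means most patients have trait values that most studies admit, and a score near 0 means most patients sit in intervals that few studies admit.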

Development of CancerLinQ, a Health Information Learning Platform From Multiple Electronic Health Record Systems to Support Improved Quality of Care

JCO Clinical Cancer Informatics ◽

10.1200/cci.20.00064 ◽

2020 ◽

pp. 929-937

Author(s):

Danielle Potter ◽

Raven Brothers ◽

Andrej Kolacevski ◽

Jacob E. Koskimaki ◽

Amy McNutt ◽

...

Keyword(s):

United States ◽

Clinical Trials ◽

Electronic Health Records ◽

Health System ◽

Real World ◽

The United States ◽

Health Records ◽

Patients With Cancer ◽

Learning Health System ◽

Electronic Health

PURPOSE ASCO, through its wholly owned subsidiary, CancerLinQ LLC, developed CancerLinQ, a learning health system for oncology. A learning health system is important for oncology patients because less than 5% of patients with cancer enroll in clinical trials, leaving evidence gaps for patient populations not enrolled in trials. In addition, clinical trial populations often differ from the overall cancer population with respect to age, race, performance status, and other clinical parameters. MATERIALS AND METHODS Working with subscribing practices, CancerLinQ accepts data from electronic health records and transforms the local representation of a patient’s care into a standardized representation on the basis of the Quality Data Model from the National Quality Forum. CancerLinQ provides this information back to the subscribing practice through a series of tools that support quality improvement. CancerLinQ also creates de-identified data sets for secondary research use. RESULTS As of March 2020, CancerLinQ includes data from 63 organizations across the United States that use nine different electronic health records. The database includes 1,426,015 patients with a primary cancer diagnosis, of which 238,680 have had additional information abstracted from unstructured content. CONCLUSION As CancerLinQ continues to onboard subscribing practices, the breadth of potential applications for a learning health care system widens. Future practice-facing tools could include real-world data visualization, recommendations for treatment of patients with actionable genetic variations, and identification of patients who may be eligible for clinical trials. Feeding these insights back into oncology practice ensures that we learn how to treat patients with cancer not just on the basis of the selective experience of the 5% that enroll in clinical trials, but from the real-world experience of the entire spectrum of patients with cancer in the United States.

A Framework for Systematic Assessment of Clinical Trial Population Representativeness Using Electronic Health Records Data

Applied Clinical Informatics ◽

10.1055/s-0041-1733846 ◽

2021 ◽

Vol 12 (04) ◽

pp. 816-825

Author(s):

Yingcheng Sun ◽

Alex Butler ◽

Ibrahim Diallo ◽

Jae Hyun Kim ◽

Casey Ta ◽

...

Keyword(s):

Clinical Trial ◽

Clinical Trials ◽

Electronic Health Records ◽

The United States ◽

Design Stage ◽

Common Data Model ◽

Free Text ◽

Eligibility Criteria ◽

Health Records ◽

Electronic Health

Abstract Background Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to the lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of clinical trial study populations. Objectives This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. Methods We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness for each clinical trial. Results We calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States, respectively, using this framework. With the use of overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of T2DM trials had poor population representativeness. Conclusion This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
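
The step of turning an eligibility criterion into an executable query can be sketched as follows. This is not the authors' framework: it assumes the criterion has already been parsed from free text into a structured form, it uses only the OMOP CDM `person` and `measurement` tables with a few of their standard columns, and it runs against an in-memory SQLite database with made-up rows. The concept id is an assumed placeholder for HbA1c that should be checked against the OMOP vocabulary, and the "eligible fraction" is an illustrative stand-in for the paper's representativeness metric, which the abstract does not define.

```python
import sqlite3

HBA1C_CONCEPT_ID = 3004410  # assumed placeholder; verify in the OMOP vocabulary


def criterion_to_sql(criterion):
    """Translate one structured numeric-range criterion into parameterized SQL."""
    sql = (
        "SELECT DISTINCT person_id FROM measurement "
        "WHERE measurement_concept_id = ? AND value_as_number BETWEEN ? AND ?"
    )
    params = (criterion["concept_id"], criterion["min"], criterion["max"])
    return sql, params


def eligible_fraction(conn, criterion):
    """Share of all persons who satisfy the criterion (illustrative metric)."""
    sql, params = criterion_to_sql(criterion)
    eligible = conn.execute(f"SELECT COUNT(*) FROM ({sql})", params).fetchone()[0]
    total = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
    return eligible / total if total else 0.0


# Toy database with a minimal subset of OMOP CDM columns and made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE measurement (
        person_id INTEGER,
        measurement_concept_id INTEGER,
        value_as_number REAL
    );
    """
)
conn.executemany("INSERT INTO person VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)",
    [
        (1, HBA1C_CONCEPT_ID, 6.5),
        (2, HBA1C_CONCEPT_ID, 8.1),
        (3, HBA1C_CONCEPT_ID, 11.2),
        (4, HBA1C_CONCEPT_ID, 9.0),
    ],
)

# Hypothetical criterion: HbA1c between 7.0 and 10.5 inclusive.
criterion = {"concept_id": HBA1C_CONCEPT_ID, "min": 7.0, "max": 10.5}
fraction = eligible_fraction(conn, criterion)
```

Keeping the criterion as data and binding its values as query parameters means the same query template can be reused across many trials without string-building SQL from free text.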

Population Pharmacokinetic Analysis of Dexmedetomidine in Children using Real World Data from Electronic Health Records and Remnant Specimens

British Journal of Clinical Pharmacology ◽

10.1111/bcp.15194 ◽

2021 ◽

Author(s):

Nathan T. James ◽

Joseph H. Breeyear ◽

Richard Caprioli ◽

Todd Edwards ◽

Brian Hachey ◽

...

Keyword(s):

Electronic Health Records ◽

Real World ◽

Population Pharmacokinetic Analysis ◽

Pharmacokinetic Analysis ◽

Population Pharmacokinetic ◽

Real World Data ◽

Health Records ◽

World Data ◽

Electronic Health

‘Precision Health’: Balancing Reactive Care and Proactive Care Through the Evidence Based Knowledge Graph Constructed from Real-World Electronic Health Records, Disease Trajectories, Diseasome, and Patholome

Big Data Analytics - Lecture Notes in Computer Science ◽

10.1007/978-3-030-66665-1_9 ◽

2020 ◽

pp. 113-133

Author(s):

Asoke K Talukder ◽

Julio Bonis Sanz ◽

Jahnavi Samajpati

Keyword(s):

Electronic Health Records ◽

Real World ◽

Knowledge Graph ◽

Evidence Based ◽

Health Records ◽

Electronic Health ◽

Proactive Care ◽

Precision Health

Assessing function of electronic health records for real-world data generation

BMJ evidence-based medicine ◽

10.1136/bmjebm-2018-111111 ◽

2018 ◽

Vol 24 (3) ◽

pp. 95-98 ◽

Cited By ~ 2

Author(s):

Daphne Guinn ◽

Erin E Wilhelm ◽

Grazyna Lieberman ◽

Sean Khozin

Keyword(s):

Electronic Health Records ◽

Real World ◽

Data Generation ◽

Real World Data ◽

Health Records ◽

World Data ◽

Electronic Health

Secondary Use of Electronic Health Records for Building Large, Real-World ILD Cohorts

10.1164/ajrccm-conference.2021.203.1_meetingabstracts.a1873 ◽

2021 ◽

Author(s):

E.D. Farrand ◽

O. Gologorskaya ◽

H. Mills ◽

L. Radhakrishnan ◽

H.R. Collard ◽

...

Keyword(s):

Electronic Health Records ◽

Real World ◽

Health Records ◽

Secondary Use ◽

Electronic Health

Phenotyping issues for exploring electronic health records to design clinical trials

Clinical Trials ◽

10.1177/1740774520931039 ◽

2020 ◽

Vol 17 (4) ◽

pp. 402-404

Author(s):

Jill Schnall ◽

LingJiao Zhang ◽

Jinbo Chen

Keyword(s):

Clinical Trials ◽

Electronic Health Records ◽

Electronic Health Record ◽

Statistical Methods ◽

Gold Standard ◽

The Other ◽

Health Record ◽

Health Records ◽

Electronic Health ◽

Downstream Analysis

For utilizing electronic health records to help design and conduct clinical trials, an essential first step is to select eligible patients from electronic health records, that is, electronic health record phenotyping. We present two novel statistical methods that can be used in the context of electronic health record phenotyping. One mitigates the requirement for gold-standard control patients in developing phenotyping algorithms, and the other effectively corrects for bias in downstream analysis introduced by study samples contaminated by ineligible subjects.

Publication bias in clinical trials of electronic health records

Journal of Biomedical Informatics ◽

10.1016/j.jbi.2012.08.007 ◽

2013 ◽

Vol 46 (1) ◽

pp. 139-141 ◽

Cited By ~ 24

Author(s):

David K. Vawdrey ◽

George Hripcsak

Keyword(s):

Clinical Trials ◽

Electronic Health Records ◽

Publication Bias ◽

Health Records ◽

Electronic Health

Five analytic challenges in working with electronic health records data to support clinical trials with some solutions

Clinical Trials ◽

10.1177/1740774520931211 ◽

2020 ◽

Vol 17 (4) ◽

pp. 370-376

Author(s):

Benjamin A Goldstein

Keyword(s):

Clinical Trials ◽

Electronic Health Records ◽

Large Scale ◽

Point Of Care ◽

Cross Sectional ◽

Health Records ◽

Data Resource ◽

Electronic Health ◽

Data Elements ◽

Data Efficiency

Electronic health records data are becoming a key data resource in clinical research. Owing to issues of data efficiency, electronic health records data are being used for clinical trials. This includes both large-scale pragmatic trials and smaller, more focused point-of-care trials. While electronic health records data open up a number of scientific opportunities, they also present a number of analytic challenges. This article discusses five particular challenges related to organizing electronic health records data for analytic purposes. These are as follows: (1) data are not organized for research purposes, (2) data are both densely and irregularly observed, (3) we don’t have all data elements we may want or need, (4) data are both cross-sectional and longitudinal, and (5) data may be informatively observed. While laying out these challenges, the article notes how many of these challenges can be addressed by careful and thoughtful study design as well as by integration of clinicians and informaticians into the analytic team.
