A Pragmatic Framework for Single-site and Multisite Data Quality Assessment in Electronic Health Record-based Clinical Research

Medical Care, 2012, Vol 50, pp. S21-S29
Author(s): Michael G. Kahn, Marsha A. Raebel, Jason M. Glanz, Karen Riedlinger, John F. Steiner

2016, Vol 07 (01), pp. 69-88
Author(s): Stuart Speedie, Gyorgy Simon, Vipin Kumar, Bonnie Westra, Steven Johnson

Summary: The goal of this study is to apply an ontology-based assessment process to electronic health record (EHR) data and determine its usefulness in characterizing data quality for calculating an example eMeasure (CMS178). The process uses a data quality ontology that references separate data quality, domain, and task ontologies to compute measures based on the proportions of constraints that are satisfied. These quantities indicate how well the data conform to the domain and how well they fit the task. The process was performed on a de-identified sample of 200,000 encounters from a hospital EHR. CodingConsistency was poor (44%), but DomainConsistency (97%) and TaskRelevance (95%) were very good. Improvements in the data quality measures correlated with improvements in the eMeasure. This approach can encourage the development of new, detailed domain ontologies that can be reused for data quality purposes across different organizations’ EHR data. Automating the data quality assessment process using this method can enable sharing of data quality metrics, which may help make research results that use EHR data more transparent and reproducible.
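The idea of a measure computed as a proportion of satisfied constraints can be illustrated with a minimal Python sketch. This is not the authors' ontology-based implementation: the function `proportion_satisfied`, the two example constraints, and the sample records are all hypothetical and stand in for constraints that the study derives from its ontologies.

```python
from typing import Any, Callable, Iterable, Mapping, Sequence

Record = Mapping[str, Any]
Constraint = Callable[[Record], bool]


def proportion_satisfied(records: Iterable[Record],
                         constraints: Sequence[Constraint]) -> float:
    """Return the fraction of (record, constraint) checks that pass."""
    checks = [constraint(record)
              for record in records
              for constraint in constraints]
    if not checks:
        raise ValueError("no records or no constraints to evaluate")
    return sum(checks) / len(checks)


# Hypothetical constraints, for illustration only.
ALLOWED_UNITS = {"kg", "lb"}


def has_known_unit(record: Record) -> bool:
    """Pass when the unit belongs to the allowed vocabulary."""
    return record.get("unit") in ALLOWED_UNITS


def has_positive_value(record: Record) -> bool:
    """Pass when the value is a positive number."""
    value = record.get("value")
    return isinstance(value, (int, float)) and value > 0


# Invented sample records; not data from the study.
sample = [
    {"value": 80.0, "unit": "kg"},
    {"value": 176.0, "unit": "lb"},
    {"value": -5.0, "unit": "kg"},
    {"value": 70.0, "unit": "stone"},
]

# One measure per constraint set: 3 of 4 records use a known unit.
unit_consistency = proportion_satisfied(sample, [has_known_unit])
# Across both constraints, 6 of 8 checks pass.
overall = proportion_satisfied(sample, [has_known_unit, has_positive_value])
```

Grouping constraints into separate sets gives one proportion per measure, which is how a single data set can score poorly on one measure and well on another.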


2021, Vol 21 (1)
Author(s): Hui Wang, Ilana Belitskaya-Levy, Fan Wu, Jennifer S. Lee, Mei-Chiung Shih, ...

Abstract Background: To describe an automated method for assessing the plausibility of continuous variables collected in electronic health record (EHR) data for use in real-world evidence research. Methods: The most widely used approach to quality assessment (QA) for continuous variables is to detect implausible values using prespecified thresholds. To augment the thresholding method, we developed a score-based method that leverages the longitudinal characteristics of EHR data to detect observations that are inconsistent with a patient's history. The method was applied to the height and weight data in the EHR from the Million Veteran Program Data from the Veteran’s Healthcare Administration (VHA). A validation study was also conducted. Results: On receiver operating characteristic (ROC) metrics, the developed method outperforms the widely used thresholding method. The results also show that different quality assessment methods have a non-ignorable impact on the body mass index (BMI) classification calculated from height and weight data in the VHA’s database. Conclusions: The score-based method enables automated, scalable detection of problematic data points in health care big data while allowing investigators to select high-quality data according to their needs. Leveraging the longitudinal characteristics of EHR data will significantly improve QA performance.
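The contrast between fixed thresholds and a history-based score can be sketched in Python. The abstract does not specify the published scoring formula, so the score below is a simple stand-in (each value's deviation from the patient's own median, scaled by the median absolute deviation). The function names, the 20 to 300 kg limits, and the weight series are assumptions made for illustration.

```python
from statistics import median
from typing import List, Sequence


def threshold_flags(values: Sequence[float],
                    low: float, high: float) -> List[bool]:
    """Flag values that fall outside prespecified plausibility limits."""
    return [not (low <= v <= high) for v in values]


def history_scores(values: Sequence[float]) -> List[float]:
    """Score each value by its deviation from the patient's own median.

    Stand-in for a longitudinal score: the deviation is scaled by the
    median absolute deviation (MAD), with a fallback scale of 1.0 when
    the MAD is zero.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    scale = mad if mad > 0 else 1.0
    return [abs(v - center) / scale for v in values]


# Invented weight history (kg) for one patient; 180.0 is a likely entry error.
weights_kg = [80.0, 81.5, 79.0, 180.0, 80.5]

# Hypothetical population-level limits: 180 kg is plausible in general,
# so thresholding alone flags nothing here.
flags = threshold_flags(weights_kg, low=20.0, high=300.0)

# The history-based score singles out the value that disagrees with
# the rest of this patient's record.
scores = history_scores(weights_kg)
most_suspect = scores.index(max(scores))
```

In this example the fourth measurement passes the threshold check but receives by far the highest score, which is the kind of case a longitudinal method is meant to catch.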

