A longitudinal analysis of data quality in a large pediatric data research network

Ritu Khare; Levon Utidjian; Byron J Ruth; Michael G Kahn; Evanette Burrows; Keith Marsolo; Nandan Patibandla; Hanieh Razzaghi; Ryan Colvin; Daksha Ranade; Melody Kitzmiller; Daniel Eckrich; L Charles Bailey

doi:10.1093/jamia/ocx033


Journal of the American Medical Informatics Association ◽

10.1093/jamia/ocx033 ◽

2017 ◽

Vol 24 (6) ◽

pp. 1072-1079 ◽

Cited By ~ 16

Author(s):

Ritu Khare ◽

Levon Utidjian ◽

Byron J Ruth ◽

Michael G Kahn ◽

Evanette Burrows ◽

...

Keyword(s):

Data Quality ◽

Large Scale ◽

Research Capacity ◽

Lessons Learned ◽

Research Network ◽

Common Data Model ◽

Electronic Health Record Data ◽

Quality Issues ◽

Conducting Research ◽

Quality Assessments

Abstract. Objective: PEDSnet is a clinical data research network (CDRN) that aggregates electronic health record data from multiple children’s hospitals to enable large-scale research. Assessing data quality to ensure suitability for conducting research is a key requirement in PEDSnet. This study presents a range of data quality issues identified over a period of 18 months and interprets them to evaluate the research capacity of PEDSnet. Materials and Methods: Results were generated by a semiautomated data quality assessment workflow. Two investigators reviewed programmatic data quality issues and conducted discussions with the data partners’ extract-transform-load analysts to determine the cause for each issue. Results: The results include a longitudinal summary of 2182 data quality issues identified across 9 data submission cycles. The metadata from the most recent cycle includes annotations for 850 issues: most frequent types, including missing data (>300) and outliers (>100); most complex domains, including medications (>160) and lab measurements (>140); and primary causes, including source data characteristics (83%) and extract-transform-load errors (9%). Discussion: The longitudinal findings demonstrate the network’s evolution from identifying difficulties with aligning the data to a common data model to learning norms in clinical pediatrics and determining research capability. Conclusion: While data quality is recognized as a critical aspect in establishing and utilizing a CDRN, the findings from data quality assessments are largely unpublished. This paper presents a real-world account of studying and interpreting data quality findings in a pediatric CDRN, and the lessons learned could be used by other CDRNs.
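The abstract names missing data and outliers as the most frequent issue types but does not describe how the checks are implemented. As a rough illustration only, a programmatic check of that kind might look like the sketch below. The record layout, the `weight_kg` field, the plausibility bounds, and the `find_issues` helper are all hypothetical and are not taken from PEDSnet.

```python
def find_issues(records, field, low, high):
    """Return (missing_ids, outlier_ids) for one numeric field.

    A value counts as missing when the field is absent or None, and as an
    outlier when it falls outside the assumed plausible range [low, high].
    """
    missing, outliers = [], []
    for rec in records:
        value = rec.get(field)
        if value is None:
            missing.append(rec["id"])
        elif not (low <= value <= high):
            outliers.append(rec["id"])
    return missing, outliers


# Hypothetical records; the bounds below are illustrative, not clinical guidance.
records = [
    {"id": 1, "weight_kg": 12.4},
    {"id": 2, "weight_kg": None},    # explicitly null in the source
    {"id": 3, "weight_kg": 1240.0},  # possibly a unit or decimal error
    {"id": 4},                       # field not populated at all
]
missing, outliers = find_issues(records, "weight_kg", low=0.3, high=300.0)
print("missing:", missing)
print("outliers:", outliers)
```

A check like this only flags candidates. According to the abstract, deciding whether a flagged issue reflects source data characteristics or an extract-transform-load error still required review by investigators and discussion with each site's analysts.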


Electronic Health Record Data Quality Issues Are Not Remedied by Increasing Granularity of Diagnosis Codes

JAMA Cardiology ◽

10.1001/jamacardio.2019.0830 ◽

2019 ◽

Vol 4 (5) ◽

pp. 465 ◽

Cited By ~ 2

Author(s):

Ann Marie Navar

Keyword(s):

Electronic Health Record ◽

Data Quality ◽

Health Record ◽

Electronic Health Record Data ◽

Diagnosis Codes ◽

Record Data ◽

Quality Issues ◽

Electronic Health


Prediction of incident myocardial infarction using machine learning applied to harmonized electronic health record data

BMC Medical Informatics and Decision Making ◽

10.1186/s12911-020-01268-x ◽

2020 ◽

Vol 20 (1) ◽

Author(s):

Divneet Mandair ◽

Premanand Tiwari ◽

Steven Simon ◽

Kathryn L. Colborn ◽

Michael A. Rosenberg

Keyword(s):

Neural Network ◽

Machine Learning ◽

Risk Factors ◽

Myocardial Infarction ◽

Logistic Regression ◽

Large Scale ◽

Deep Neural Network ◽

Low Frequency ◽

Common Data Model ◽

Electronic Health Record Data

Abstract. Background: With cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models to a baseline logistic regression using only ‘known’ risk factors in predicting incident myocardial infarction (MI) from harmonized EHR data. Methods: Large-scale case-control study with outcome of 6-month incident MI, conducted using the top 800 of an initial 52 k procedures, diagnoses, and medications within the UCHealth system, harmonized to the Observational Medical Outcomes Partnership common data model, performed on 2.27 million patients. We compared several over- and under-sampling techniques to address the imbalance in the dataset. We compared regularized logistic regression, random forest, gradient boosted machines, and shallow and deep neural networks. A baseline model for comparison was a logistic regression using a limited set of ‘known’ risk factors for MI. Hyper-parameters were identified using 10-fold cross-validation. Results: A total of 20,591 patients were diagnosed with MI, compared with 2.25 million who were not. A deep neural network with random undersampling provided superior classification compared with other methods. However, the benefit of the deep neural network was only moderate, showing an F1 Score of 0.092 and AUC of 0.835, compared to a logistic regression model using only ‘known’ risk factors. Calibration for all models was poor despite adequate discrimination, due to overfitting from low frequency of the event of interest. Conclusions: Our study suggests that a deep neural network (DNN) may not offer substantial benefit when trained on harmonized data, compared to traditional methods using established risk factors for MI.
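The abstract reports that random undersampling worked best but gives no implementation details. The sketch below shows the general technique on toy data: majority-class rows are dropped at random until every class has as many rows as the smallest one. The `random_undersample` helper and the data are hypothetical and are not the authors' code.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest class."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(members) for members in by_class.values())

    sampled = []
    for label, members in by_class.items():
        # rng.sample draws without replacement, so no row is duplicated
        for row in rng.sample(members, n_min):
            sampled.append((row, label))
    rng.shuffle(sampled)

    new_rows = [row for row, _ in sampled]
    new_labels = [label for _, label in sampled]
    return new_rows, new_labels


# Toy data with 1% positives, loosely echoing the imbalance in the abstract
rows = list(range(1000))
labels = [1 if i % 100 == 0 else 0 for i in rows]
balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

Undersampling is normally applied to the training split only. Evaluating on the original class distribution is what exposes the low F1 score and poor calibration that the abstract describes.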


Large-Scale Research via App Stores

Advances in Wireless Technologies and Telecommunication - Emerging Perspectives on the Design, Use, and Evaluation of Mobile and Handheld Devices ◽

10.4018/978-1-4666-8583-3.ch012 ◽

2015 ◽

pp. 269-292

Author(s):

Matthias Kranz ◽

Andreas Möller ◽

Florian Michahelles

Keyword(s):

Data Collection ◽

Human Computer Interaction ◽

Real World ◽

Mobile Applications ◽

Large Scale ◽

Mobile App ◽

Lessons Learned ◽

Conducting Research ◽

Collection Methods

Large-scale research has gained momentum in the context of Mobile Human-Computer Interaction (Mobile HCI), as many aspects of mobile app usage can only be evaluated in the real world. In this chapter, we present findings on the challenges of research in the large via app stores, together with selected data collection methods (logging, self-reporting) that we identified and that have proven useful in our research. As a case study, we investigated the adoption of NFC technology, based on a gamification approach. To that end, we describe the development of the game NFC Heroes, which involved two release cycles. We conclude with lessons learned and provide recommendations for conducting research in the large for mobile applications.


The Effects of Survey Enhancements on the Quality of Reporting in the Medical Expenditure Panel Survey, 2008–2015

Journal of Survey Statistics and Methodology ◽

10.1093/jssam/smz014 ◽

2019 ◽

Vol 8 (3) ◽

pp. 589-616 ◽

Cited By ~ 1

Author(s):

Samuel H Zuvekas ◽

Adam I Biener ◽

Wendy D Hicks

Keyword(s):

Health Care ◽

Data Quality ◽

Large Scale ◽

Medical Expenditure Panel Survey ◽

Lessons Learned ◽

Health Care Use ◽

Medical Expenditure ◽

Panel Survey ◽

Quality Of Reporting

Abstract. It is well established that survey respondents imperfectly recall health care use in surveys. However, careful attention to both survey design and fielding procedures can enhance recall. We examine the effects of a comprehensive, multi-pronged approach to changing field procedures in the Medical Expenditure Panel Survey (MEPS) to improve quality of health care use reporting. Conducted annually since 1996, the MEPS is the leading large-scale nationally representative health survey with detailed individual and household information on health care use and expenditures. These survey enhancements were undertaken in 2013–2014 because of concerns over a drop in the quality of reporting in 2010 that persisted into 2011–2012. The approach combined focused retraining of field supervisors and interviewers, developing quality metrics and reports for ongoing monitoring of interviewers, and revising advance letters and materials sent to respondents. We seek to determine the extent to which changes in field procedures and trainings improved interviewer and respondent behaviors associated with better reporting, and more importantly, improved reporting accuracy. We use longitudinal MEPS data from 2008 through 2015, combining household reported use with sociodemographic and health status characteristics, and paradata on the characteristics of the interviews and interviewers. We exploit the longitudinal data and timings of major trainings and changes in field procedures in regression models, separating out the effects of the trainings and other fielding changes to the extent possible. We find that the 2013–2014 data quality improvement activities substantially improved reporting quality. Positive interviewer behaviors increased substantially to above pre-2010 levels, and utilization reporting has recovered to above pre-2010 levels, returning MEPS to trend. Importantly, these substantial gains occurred in 2013, prior to extensive in-person training for most of the field force. We examine the lessons learned from this data quality initiative both for the MEPS program and for other large household surveys.


Extracting Electronic Health Record Data in a Practice-Based Research Network: Lessons Learned from Collaborations with Translational Researchers

eGEMs (Generating Evidence & Methods to improve patient outcomes) ◽

10.13063/2327-9214.1206 ◽

2016 ◽

Vol 4 (2) ◽

pp. 4 ◽

Cited By ~ 4

Author(s):

Allison M. Cole ◽

Kari A. Stephens ◽

Gina A. Keppel ◽

Hossein Estiri ◽

Laura-Mae Baldwin

Keyword(s):

Electronic Health Record ◽

Lessons Learned ◽

Research Network ◽

Health Record ◽

Electronic Health Record Data ◽

Record Data ◽

Practice Based Research ◽

Electronic Health


Lessons Learned through Research Partnership and Capacity Enhancement in Inuit Nunangat

ARCTIC ◽

10.14430/arctic69507 ◽

2019 ◽

Vol 72 (4) ◽

pp. 381-403

Author(s):

Natalie Ann Carter ◽

Jackie Dawson ◽

Natasha Simonee ◽

Shirley Tagalik ◽

Gita Ljubicic

Keyword(s):

Learning Experience ◽

Research Capacity ◽

Lessons Learned ◽

The Arctic ◽

Community Research ◽

Management Options ◽

Local Organizations ◽

Working Together ◽

Conducting Research ◽

Partnership Approach

Facilitating research and enhancing community research capacity through a partnered approach in Inuit Nunangat (the Inuit homeland of Canada, located in Arctic Canada) presents learning opportunities and challenges for southern-based, non-Inuit researchers and community members alike. This article outlines lessons learned through the Arctic Corridors and Northern Voices (AC-NV) project, which involved 14 communities across Inuit Nunangat. The AC-NV focused on understanding community-identified impacts and potential management options of increased shipping in Inuit Nunangat due to sea ice reductions and a changing climate. The approach used to conduct the research involved visiting researchers and community partners working together with local organizations, and training and hiring northern youth as cultural liaisons and workshop co-facilitators. We strove to develop a model of collaborative partnership and strong north-south research relationships. In this paper, we draw on our broad learning experiences from four community case studies conducted as part of the AC-NV project: Arviat, Cambridge Bay, Gjoa Haven, and Pond Inlet, Nunavut. Close partnerships were formed in each of these communities, and 32 youth were trained in participatory mapping and workshop facilitation. For our diverse team of Inuit, northern- (i.e., non-Inuit, living in Inuit Nunangat), and southern-based non-Inuit researchers, our efforts to engage in partnered research were a critical component of the research and learning experience. In this article we share methodological reflections and lessons learned from what collaborative-partnered research means in practice. In so doing, we aim to contribute to the increasing dialogue and efforts around knowledge co-production and Inuit self-determination in research. Key conclusions of this reflective exercise include the importance of 1) conducting research that is relevant to local needs and interests, 2) visiting researchers and local organizations partnering together, 3) co-creating and refining knowledge documentation tools, 4) including youth cultural liaisons as co-facilitators, 5) conducting results validation and sharing exercises, and 6) being open to forming personal friendships. For the AC-NV, this community-based partnership approach resulted in more robust research results, strengthened north-south relations, and enhanced local capacity for community-led projects.


Experiences With and Lessons Learned From Developing, Implementing, and Evaluating a Support Program for Older Hearing Aid Users and Their Communication Partners in the Hearing Aid Dispensing Setting

American Journal of Audiology ◽

10.1044/2020_aja-19-00072 ◽

2020 ◽

Vol 29 (3S) ◽

pp. 638-647 ◽

Cited By ~ 2

Author(s):

Janine F. J. Meijerink ◽

Marieke Pronk ◽

Sophia E. Kramer

Keyword(s):

Large Scale ◽

Hearing Aid ◽

Critical Discussion ◽

Lessons Learned ◽

Research Note ◽

Support Program ◽

Large Sample Size ◽

Long Term Effects ◽

Communication Program ◽

Communication Partners

Purpose: The SUpport PRogram (SUPR) study was carried out in the context of a private academic partnership and is the first study to evaluate the long-term effects of a communication program (SUPR) for older hearing aid users and their communication partners on a large scale in a hearing aid dispensing setting. The purpose of this research note is to reflect on the lessons that we learned during the different development, implementation, and evaluation phases of the SUPR project. Procedure: This research note describes the procedures that were followed during the different phases of the SUPR project and provides a critical discussion to describe the strengths and weaknesses of the approach taken. Conclusion: This research note might provide researchers and intervention developers with useful insights as to how aural rehabilitation interventions, such as the SUPR, can be developed by incorporating the needs of the different stakeholders, evaluated by using a robust research design (including a large sample size and a longer term follow-up assessment), and implemented widely by collaborating with a private partner (hearing aid dispensing practice chain).


Engaging Supply-Chain Manufacturers to Optimize Delivery of Automation: Case Study and Lessons Learned from Optimizing BNR and Energy Efficiency within Large-Scale Aeration Automation at Tallman Island WPCP, New York City

Proceedings of the Water Environment Federation ◽

10.2175/193864715819542278 ◽

2015 ◽

Vol 2015 (10) ◽

pp. 3006-3014

Author(s):

R. J Kowalski ◽

J Finnigan ◽

A Kreel ◽

M Zaman

Keyword(s):

New York ◽

Energy Efficiency ◽

New York City ◽

Supply Chain ◽

York City ◽

Large Scale ◽

Lessons Learned


Political Science Research in the Middle East and North Africa

10.1093/oso/9780190882969.001.0001 ◽

2018 ◽

Cited By ~ 5

Keyword(s):

Middle East ◽

Research Methods ◽

North Africa ◽

Quantitative Research ◽

Real Life ◽

Field Research ◽

Science Research ◽

Lessons Learned ◽

Conducting Research ◽

Personal Accounts

Based on personal accounts of their experiences conducting qualitative and quantitative research in the countries of the Middle East and North Africa, the contributors to this volume share the real-life obstacles they have encountered in applying research methods in practice and the possible solutions to overcome them. The volume is an important companion book to more standard methods books, which focus on the “how to” of methods but are often devoid of any real discussion of the practicalities, challenges, and common mistakes of fieldwork. The volume is divided into three parts, highlighting the challenges of (1) specific contexts, including conducting research in areas of violence; (2) a range of research methods, including interviewing, process-tracing, ethnography, experimental research, and the use of online media; and (3) the ethics of field research. In sharing their lessons learned, the contributors raise issues of concern to both junior and experienced researchers, particularly those of the Global South but also to those researching the Global North.


The graph neural networking challenge

ACM SIGCOMM Computer Communication Review ◽

10.1145/3477482.3477485 ◽

2021 ◽

Vol 51 (3) ◽

pp. 9-16

Author(s):

José Suárez-Varela ◽

Miquel Ferriol-Galmés ◽

Albert López ◽

Paul Almasan ◽

Guillermo Bernárdez ◽

...

Keyword(s):

Machine Learning ◽

Computer Networks ◽

Real World ◽

Large Scale ◽

Lessons Learned ◽

Educational Resources ◽

Global Competition ◽

International Telecommunication Union ◽

International Telecommunication ◽

Broad Audience

During the last decade, Machine Learning (ML) has increasingly become a hot topic in the field of Computer Networks and is expected to be gradually adopted for a plethora of control, monitoring and management tasks in real-world deployments. This creates a need for new generations of students, researchers and practitioners with a solid background in ML applied to networks. During 2020, the International Telecommunication Union (ITU) organized the "ITU AI/ML in 5G challenge", an open global competition that introduced a broad audience to some of the current main challenges in ML for networks. This large-scale initiative gathered 23 different challenges proposed by network operators, equipment manufacturers and academia, and attracted a total of 1300+ participants from 60+ countries. This paper narrates our experience organizing one of the proposed challenges: the "Graph Neural Networking Challenge 2020". We describe the problem presented to participants, the tools and resources provided, some organization aspects and participation statistics, an outline of the top-3 awarded solutions, and a summary of some lessons learned throughout this journey. As a result, this challenge leaves a curated set of educational resources openly available to anyone interested in the topic.
