Missing data in longitudinal studies: Comparison of multiple imputation methods in a real clinical setting

Author(s):  
Rosalba Rosato ◽  
Eva Pagano ◽  
Silvia Testa ◽  
Paolo Zola ◽  
Daniela di Cuonzo
2018 ◽  
Vol 18 (1) ◽  
Author(s):  
Md Hamidul Huque ◽  
John B. Carlin ◽  
Julie A. Simpson ◽  
Katherine J. Lee

2014 ◽  
Vol 134 ◽  
pp. 23-33 ◽  
Author(s):  
M.P. Gómez-Carracedo ◽  
J.M. Andrade ◽  
P. López-Mahía ◽  
S. Muniategui ◽  
D. Prada

2021 ◽  
Author(s):  
Adrienne D. Woods ◽  
Pamela Davis-Kean ◽  
Max Andrew Halvorson ◽  
Kevin Michael King ◽  
Jessica A. R. Logan ◽  
...  

A common challenge in developmental research is the amount of incomplete and missing data that occurs from respondents failing to complete tasks or questionnaires, as well as from disengaging from the study (i.e., attrition). This missingness can lead to biases in parameter estimates and, hence, in the interpretation of findings. These biases can be addressed through statistical techniques that adjust for missing data, such as multiple imputation. Although this technique is highly effective, it has not been widely adopted by developmental scientists, owing to barriers such as a lack of training and misconceptions about imputation methods; instead, researchers often fall back on software defaults such as listwise deletion. This manuscript is intended to provide practical guidelines for developmental researchers to follow when examining their data for missingness, making decisions about how to handle that missingness, and reporting the extent of missing data biases and specific multiple imputation procedures in publications.
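The workflow the guidelines describe — quantify missingness, create several completed datasets, and pool the estimates — can be sketched in miniature. The following is a hedged illustration only (the variable names and the MAR pattern are hypothetical, and stochastic regression imputation stands in for a full MICE procedure); it shows why the pooled estimate corrects the bias of complete-case analysis and how Rubin's rules combine the imputations:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical developmental data: outcome y correlates with covariate x;
# y is missing more often for high-x respondents (a MAR pattern).
n = 500
x = rng.normal(size=n)
y = 0.6 * x + rng.normal(scale=0.8, size=n)
miss = rng.random(n) < 1 / (1 + np.exp(-x))
y_obs = np.where(miss, np.nan, y)
obs = ~np.isnan(y_obs)

# m completed datasets via stochastic regression imputation of y from x.
b1, b0 = np.polyfit(x[obs], y_obs[obs], 1)       # slope, intercept
resid_sd = np.std(y_obs[obs] - (b0 + b1 * x[obs]))
m = 20
means, within = [], []
for _ in range(m):
    y_imp = y_obs.copy()
    y_imp[~obs] = b0 + b1 * x[~obs] + rng.normal(scale=resid_sd, size=(~obs).sum())
    means.append(y_imp.mean())
    within.append(y_imp.var(ddof=1) / n)          # variance of the mean, this dataset

# Rubin's rules: pooled estimate, between- and total variance.
q_bar = np.mean(means)
b_var = np.var(means, ddof=1)
t_var = np.mean(within) + (1 + 1 / m) * b_var
print(f"complete-case mean: {y_obs[obs].mean():.3f}")
print(f"pooled mean:        {q_bar:.3f} (full-data mean {y.mean():.3f})")
```

Because missingness depends only on the observed covariate, the regression fitted on complete cases remains valid, and the pooled mean lands near the full-data mean while the complete-case mean is visibly biased.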


2019 ◽  
Vol 6 (339) ◽  
pp. 73-98
Author(s):  
Małgorzata Aleksandra Misztal

The problem of incomplete data and its implications for drawing valid conclusions from statistical analyses is not confined to any particular scientific domain; it arises in economics, sociology, education, the behavioural sciences and medicine. Almost all standard statistical methods presume that every object has information on every variable to be included in the analysis, and the typical approach to missing data is simply to delete them. However, this leads to ineffective and biased analysis results and is not recommended in the literature. The state-of-the-art technique for handling missing data is multiple imputation. In the paper, selected multiple imputation methods were taken into account. Special attention was paid to using principal components analysis (PCA) as an imputation method. The goal of the study was to assess the quality of PCA‑based imputations as compared to two other multiple imputation techniques: multivariate imputation by chained equations (MICE) and missForest. The comparison was made by artificially simulating different proportions (10–50%) and mechanisms of missing data using 10 complete data sets from the UCI repository of machine learning databases. Then, missing values were imputed with the use of MICE, missForest and the PCA‑based method (MIPCA). The normalised root mean square error (NRMSE) was calculated as a measure of imputation accuracy. On the basis of the conducted analyses, missForest can be recommended as a multiple imputation method providing the lowest rates of imputation errors for all types of missingness. PCA‑based imputation does not perform well in terms of accuracy.
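The accuracy measure used in the comparison, NRMSE, is straightforward to compute once true and imputed values are both known. A minimal sketch (mean imputation stands in as a baseline here, since MICE, missForest and MIPCA require dedicated packages; note that normalisation conventions vary, and this version divides by the standard deviation of the true values):

```python
import numpy as np

rng = np.random.default_rng(1)

def nrmse(x_true, x_imp, mask):
    """Normalised RMSE over the imputed entries only."""
    err = x_true[mask] - x_imp[mask]
    return np.sqrt(np.mean(err ** 2)) / np.std(x_true[mask])

# Complete data, then 30% MCAR missingness.
x = rng.normal(loc=10, scale=2, size=1000)
mask = rng.random(x.size) < 0.30

# Simplest possible imputation: fill with the observed mean.
x_imp = x.copy()
x_imp[mask] = x[~mask].mean()

print(f"NRMSE of mean imputation: {nrmse(x, x_imp, mask):.3f}")
```

Mean imputation scores an NRMSE near 1.0 by construction, which is why it serves as the reference point that methods like missForest are expected to beat.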


2010 ◽  
Vol 7 (1) ◽  
Author(s):  
Claudio Quintano ◽  
Rosalia Castellano ◽  
Antonella Rocca

In the field of data quality, imputation is the most widely used method for handling missing data. The performance of imputation techniques is influenced by various factors, especially when the data represent only a sample of the population, for example through the survey design characteristics. In this paper, we compare the results of different multiple imputation methods in terms of final estimates when outliers occur in a dataset. To evaluate the influence of outliers on the performance of these methods, the procedure is applied both before and after identifying and removing them. For this purpose, missing data were simulated on data from the ISTAT annual sample survey on Small and Medium Enterprises. A MAR mechanism is assumed for the missing data. The methods are based on multiple imputation through Markov Chain Monte Carlo (MCMC), the propensity score and mixture models. The results highlight the strong influence of data characteristics on the final estimates.
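The pre-processing step the comparison hinges on — identifying and removing outliers before imputing — can be illustrated with a simple IQR fence. The abstract does not specify the detection rule actually used on the ISTAT data, so the 1.5×IQR rule and the revenue-like variable below are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(2)

# Revenue-like skewed variable with a few gross outliers injected.
x = rng.lognormal(mean=3.0, sigma=0.5, size=500)
x[:5] *= 50  # five contaminated records

q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
outlier = (x < q1 - 1.5 * iqr) | (x > q3 + 1.5 * iqr)
print(f"flagged {outlier.sum()} outliers")

# Estimates before / after removal show the influence on a final estimate.
print(f"mean with outliers:    {x.mean():.1f}")
print(f"mean without outliers: {x[~outlier].mean():.1f}")
```

The gap between the two means is the kind of sensitivity the paper measures: any imputation model fitted to the contaminated data inherits that distortion.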


2017 ◽  
Author(s):  
Brett K. Beaulieu-Jones ◽  
Daniel R. Lavage ◽  
John W. Snyder ◽  
Jason H. Moore ◽  
Sarah A Pendergrass ◽  
...  

Missing data is a challenge for all studies; however, this is especially true for electronic health record (EHR) based analyses. Failure to appropriately consider missing data can lead to biased results. Here, we provide detailed procedures for when and how to conduct imputation of EHR data. We demonstrate how the mechanism of missingness can be assessed, evaluate the performance of a variety of imputation methods, and describe some of the most frequent problems that can be encountered. We analyzed clinical lab measures from 602,366 patients in the Geisinger Health System EHR. Using these data, we constructed a representative set of complete cases and assessed the performance of 12 different imputation methods for missing data that was simulated based on 4 mechanisms of missingness. Our results show that several methods including variations of Multivariate Imputation by Chained Equations (MICE) and softImpute consistently imputed missing values with low error; however, only a subset of the MICE methods were suitable for multiple imputation. The analyses described provide an outline of considerations for dealing with missing EHR data, steps that researchers can perform to characterize missingness within their own data, and an evaluation of methods that can be applied to impute clinical data. While the performance of methods may vary between datasets, the process we describe can be generalized to the majority of structured data types that exist in EHRs and all of our methods and code are publicly available.


Author(s):  
Marc J. Lajeunesse

This chapter discusses possible solutions for dealing with partial information and missing data from published studies. These solutions can improve the amount of information extracted from individual studies, and increase the representation of data for meta-analysis. It begins with a description of the mechanisms that generate missing information within studies, followed by a discussion of how gaps of information can influence meta-analysis and the way studies are quantitatively reviewed. It then suggests some practical solutions for recovering missing statistics from published studies. These include statistical acrobatics to convert available information (e.g., a t-test) into forms more useful for computing effect sizes, as well as heuristic approaches that impute (fill in) missing information when pooling effect sizes. Finally, the chapter discusses multiple-imputation methods that account for the uncertainty associated with filling gaps of information when performing meta-analysis.
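One of the "statistical acrobatics" mentioned — converting a reported t-test into an effect size — has a standard closed form: for an independent-samples t with group sizes n1 and n2, Cohen's d = t·sqrt(1/n1 + 1/n2). A minimal sketch, including the conventional large-sample approximation for the variance of d used to weight effect sizes in meta-analysis:

```python
import math

def t_to_cohens_d(t, n1, n2):
    """Convert an independent-samples t statistic to Cohen's d."""
    return t * math.sqrt(1 / n1 + 1 / n2)

def d_variance(d, n1, n2):
    """Approximate sampling variance of d, used to weight effect sizes."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))

# Example: a study reports t = 2.5 with 30 participants per group.
d = t_to_cohens_d(t=2.5, n1=30, n2=30)
print(f"d = {d:.3f}, var = {d_variance(d, 30, 30):.4f}")
```

This recovers an effect size from studies that report only a test statistic and sample sizes, which is exactly the situation the chapter's recovery strategies target.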

