Imputation-based strategies for clinical trial longitudinal data with nonignorable missing values

Abstract Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicabilities and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that can be achieved through the proposed data-driven approach.
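The feature-wise selection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two candidate imputers (mean and median), the 20% masking fraction, the mean-absolute-error criterion, and the names `rank_imputers` and `impute_featurewise` are all assumptions made for illustration, whereas the article ranks five methods that are not named here. The idea shown is only the general one: hide some known values of a feature, see which method recovers them best, and use that method for the feature's real gaps.

```python
import numpy as np

def mean_impute(col):
    """Fill NaNs with the mean of the observed values."""
    return np.where(np.isnan(col), np.nanmean(col), col)

def median_impute(col):
    """Fill NaNs with the median of the observed values."""
    return np.where(np.isnan(col), np.nanmedian(col), col)

# Hypothetical candidate pool; the article ranks five methods, not these two.
CANDIDATES = {"mean": mean_impute, "median": median_impute}

def rank_imputers(col, candidates=CANDIDATES, mask_fraction=0.2, seed=0):
    """Score each candidate on artificially hidden known values of one feature.

    Returns {method name: mean absolute error}; empty if fewer than two
    values are known, since nothing can then be hidden and re-estimated.
    """
    rng = np.random.default_rng(seed)
    known = np.flatnonzero(~np.isnan(col))
    if known.size < 2:
        return {}
    # Hide at least one known value, but always leave at least one visible.
    n_hide = min(known.size - 1, max(1, int(mask_fraction * known.size)))
    hidden = rng.choice(known, size=n_hide, replace=False)
    masked = col.copy()
    masked[hidden] = np.nan
    return {
        name: float(np.mean(np.abs(impute(masked)[hidden] - col[hidden])))
        for name, impute in candidates.items()
    }

def impute_featurewise(X, candidates=CANDIDATES, **kwargs):
    """Impute each column with the candidate that ranked best for that column."""
    X = np.asarray(X, dtype=float)
    filled = X.copy()
    chosen = {}
    for j in range(X.shape[1]):
        errors = rank_imputers(X[:, j], candidates, **kwargs)
        # Fall back to the first candidate when the column cannot be scored.
        best = min(errors, key=errors.get) if errors else next(iter(candidates))
        filled[:, j] = candidates[best](X[:, j])
        chosen[j] = best
    return filled, chosen
```

Calling `impute_featurewise(X)` on a two-dimensional array with `NaN` for missing entries returns the filled array and a dictionary recording which method was chosen per column. A column with no observed values stays `NaN` in this sketch.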


Semiparametric inverse propensity weighting for nonignorable missing data

Biometrika ◽

10.1093/biomet/asv071 ◽

2016 ◽

Vol 103 (1) ◽

pp. 175-187 ◽

Cited By ~ 31

Author(s):

Jun Shao ◽

Lei Wang

Keyword(s):

Missing Data ◽

Missing Values ◽

Generalized Method Of Moments ◽

Estimating Equations ◽

Real Data ◽

Population Parameters ◽

Finite Sample ◽

External Data ◽

Nonignorable Missing ◽

Inverse Propensity Weighting

Abstract To estimate unknown population parameters based on data having nonignorable missing values with a semiparametric exponential tilting propensity, Kim & Yu (2011) assumed that the tilting parameter is known or can be estimated from external data, in order to avoid the identifiability issue. To remove this serious limitation on the methodology, we use an instrument, i.e., a covariate related to the study variable but unrelated to the missing data propensity, to construct some estimating equations. Because these estimating equations are semiparametric, we profile the nonparametric component using a kernel-type estimator and then estimate the tilting parameter based on the profiled estimating equations and the generalized method of moments. Once the tilting parameter is estimated, so is the propensity, and then other population parameters can be estimated using the inverse propensity weighting approach. Consistency and asymptotic normality of the proposed estimators are established. The finite-sample performance of the estimators is studied through simulation, and a real-data example is also presented.
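The final step described in this abstract, estimating a population mean by inverse propensity weighting once the propensity is available, can be sketched as follows. This is a hedged illustration only: it does not implement the paper's instrument-based estimating equations, kernel profiling, or generalized method of moments. The function name `ipw_mean`, the normalized (Hajek-type) form of the estimator, and the simulated logistic-type propensity with intercept `alpha` and tilting parameter `gamma` are assumptions; in the simulation the true propensity is plugged in directly rather than estimated.

```python
import numpy as np

def ipw_mean(y, observed, propensity):
    """Normalized inverse-propensity-weighted estimate of E[Y].

    y          : responses (entries for unobserved units are ignored)
    observed   : boolean indicator, True where y was observed
    propensity : P(observed | data) for every unit, assumed known or estimated
    """
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    propensity = np.asarray(propensity, dtype=float)
    weights = observed / propensity  # zero weight for unobserved units
    return float(np.sum(weights[observed] * y[observed]) / np.sum(weights))

# Simulated nonignorable missingness: the chance of observing y depends on y.
rng = np.random.default_rng(0)
n = 20_000
y_full = rng.normal(loc=1.0, scale=1.0, size=n)
alpha, gamma = -1.0, 0.8  # hypothetical values chosen for illustration
propensity = 1.0 / (1.0 + np.exp(alpha + gamma * y_full))
observed = rng.random(n) < propensity
y = np.where(observed, y_full, np.nan)  # missing responses become NaN

naive_estimate = float(np.mean(y[observed]))  # ignores the missingness mechanism
ipw_estimate = ipw_mean(y, observed, propensity)
```

Because larger responses are less likely to be observed in this simulation, the complete-case mean is pulled below the true mean of 1.0, while the weighted estimate corrects for it by up-weighting the under-represented units.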


Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values

Test ◽

10.1007/s11749-018-0612-4 ◽

2018 ◽

Vol 28 (1) ◽

pp. 196-222 ◽

Cited By ~ 3

Author(s):

Wan-Lun Wang

Keyword(s):

Longitudinal Data ◽

Mixed Models ◽

Missing Values ◽

Nonlinear Mixed Models ◽

Multivariate T


Classification of Multivariate Linear-Circular Data with Nonignorable Missing Values

Contributions to Statistics - Complex Models and Computational Methods in Statistics ◽

10.1007/978-88-470-2871-5_13 ◽

2012 ◽

pp. 161-173

Author(s):

Francesco Lagona ◽

Marco Picone

Keyword(s):

Missing Values ◽

Circular Data ◽

Nonignorable Missing


Analysing intensive longitudinal data after summarization at landmarks: an application to daily pain evaluation in a clinical trial

Journal of the Royal Statistical Society Series A (Statistics in Society) ◽

10.1111/j.1467-985x.2011.01014.x ◽

2011 ◽

Vol 175 (2) ◽

pp. 513-534 ◽

Cited By ~ 1

Author(s):

P. Bunouf ◽

J.-M. Grouin ◽

G. Molenberghs ◽

G. Koch

Keyword(s):

Clinical Trial ◽

Longitudinal Data ◽

Intensive Longitudinal Data ◽

Pain Evaluation ◽

Daily Pain


EM algorithm for longitudinal data with non-ignorable missing values: An application to health data

Bangladesh Journal of Scientific Research ◽

10.3329/bjsr.v27i2.26231 ◽

2016 ◽

Vol 27 (2) ◽

pp. 133-142

Author(s):

Radia Taisir ◽

M Ataharul Islam

Keyword(s):

Longitudinal Data ◽

EM Algorithm ◽

Expectation Maximization ◽

Missing Values ◽

Likelihood Method ◽

Estimation Methods ◽

Binary Response ◽

Complex Method ◽

Missing Responses ◽

Regression Parameters

Abstract Longitudinal studies involve repeated observations over time on the same experimental units, and missingness may occur in a non-ignorable fashion. For such longitudinal missing data, a Markov model may be used to model the binary response along with a suitable non-response model for the missing portion of the data. The primary interest is in estimating the effects of covariates on the binary response. A similar model for such incomplete longitudinal data exists, in which estimates of the regression parameters are obtained by the likelihood method, summing over all possible values of the missing responses. In this paper, we propose an expectation-maximization (EM) algorithm for estimating the regression parameters that is computationally simple and produces estimates similarly efficient to those of the existing complex method. The existing and proposed estimation methods are compared by analyzing the Health and Retirement Survey (HRS) data of the United States.

Bangladesh J. Sci. Res. 27(2): 133-142, December-2014


Some issues on longitudinal data with nonignorable dropout, a discussion of “Statistical Inference for Nonignorable Missing-Data Problems: A Selective Review” by Niansheng Tang and Yuanyuan Ju

Statistical Theory and Related Fields ◽

10.1080/24754269.2018.1522575 ◽

2018 ◽

Vol 2 (2) ◽

pp. 137-139

Author(s):

Lei Wang

Keyword(s):

Missing Data ◽

Longitudinal Data ◽

Statistical Inference ◽

Nonignorable Missing Data ◽

Selective Review ◽

Nonignorable Dropout ◽

Nonignorable Missing


Empirical likelihood and Wilks phenomenon for data with nonignorable missing values

Scandinavian Journal of Statistics ◽

10.1111/sjos.12379 ◽

2019 ◽

Vol 46 (4) ◽

pp. 1003-1024 ◽

Cited By ~ 1

Author(s):

Puying Zhao ◽

Lei Wang ◽

Jun Shao

Keyword(s):

Empirical Likelihood ◽

Missing Values ◽

Wilks Phenomenon ◽

Nonignorable Missing


A mixed-effects estimating equation approach to nonignorable missing longitudinal data with refreshment samples

Statistica Sinica ◽

10.5705/ss.202015.0317 ◽

2018 ◽

Author(s):

Xuan Bi ◽

Annie Qu

Keyword(s):

Longitudinal Data ◽

Estimating Equation ◽

Mixed Effects ◽

Equation Approach ◽

Nonignorable Missing
