Imputation-based strategies for clinical trial longitudinal data with nonignorable missing values

2008 ◽  
Vol 27 (15) ◽  
pp. 2826-2849 ◽  
Author(s):  
Xiaowei Yang ◽  
Jinhui Li ◽  
Steven Shoptaw


Author(s):  
Caio Ribeiro ◽  
Alex A. Freitas

Abstract Longitudinal datasets of human ageing studies usually have a high volume of missing data, and one way to handle missing values in a dataset is to replace them with estimations. However, there are many methods to estimate missing values, and no single method is the best for all datasets. In this article, we propose a data-driven missing value imputation approach that performs a feature-wise selection of the best imputation method, using known information in the dataset to rank the five methods we selected, based on their estimation error rates. We evaluated the proposed approach in two sets of experiments: a classifier-independent scenario, where we compared the applicability and error rates of each imputation method; and a classifier-dependent scenario, where we compared the predictive accuracy of Random Forest classifiers generated with datasets prepared using each imputation method and a baseline approach of doing no imputation (letting the classification algorithm handle the missing values internally). Based on our results from both sets of experiments, we concluded that the proposed data-driven missing value imputation approach generally resulted in models with more accurate estimations for missing data and better-performing classifiers, in longitudinal datasets of human ageing. We also observed that imputation methods devised specifically for longitudinal data had very accurate estimations. This reinforces the idea that using the temporal information intrinsic to longitudinal data is a worthwhile endeavour for machine learning applications, and that this can be achieved through the proposed data-driven approach.
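The feature-wise selection idea in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not name the five methods, so mean and median imputation stand in as hypothetical candidates, and the function names (`select_method`, `data_driven_impute`), the hold-out fraction, and the absolute-error criterion are all assumptions made for the example.

```python
import numpy as np


def impute_mean(col, missing):
    """Fill the entries flagged in `missing` with the mean of the other entries."""
    filled = col.copy()
    filled[missing] = col[~missing].mean()
    return filled


def impute_median(col, missing):
    """Fill the entries flagged in `missing` with the median of the other entries."""
    filled = col.copy()
    filled[missing] = np.median(col[~missing])
    return filled


# Hypothetical stand-ins for the candidate methods (the paper uses five).
CANDIDATES = {"mean": impute_mean, "median": impute_median}


def select_method(col, rng, holdout_frac=0.2):
    """Rank candidates for one feature by hiding some known values and
    measuring how well each candidate re-estimates them."""
    missing = np.isnan(col)
    known_idx = np.flatnonzero(~missing)
    if len(known_idx) < 2:
        raise ValueError("need at least two known values to rank methods")
    # Hide at least one known value, but always leave at least one visible.
    n_hide = min(max(1, int(len(known_idx) * holdout_frac)), len(known_idx) - 1)
    hidden = rng.choice(known_idx, size=n_hide, replace=False)
    trial_missing = missing.copy()
    trial_missing[hidden] = True

    errors = {}
    for name, method in CANDIDATES.items():
        filled = method(col, trial_missing)
        # Mean absolute error on the artificially hidden known values.
        errors[name] = float(np.mean(np.abs(filled[hidden] - col[hidden])))
    best = min(errors, key=errors.get)
    return best, errors


def data_driven_impute(X, seed=0):
    """Impute each feature (column) with the candidate that ranked best for it."""
    rng = np.random.default_rng(seed)
    X_out = np.asarray(X, dtype=float).copy()
    chosen = {}
    for j in range(X_out.shape[1]):
        col = X_out[:, j]
        missing = np.isnan(col)
        if not missing.any():
            continue  # nothing to impute for this feature
        best, _ = select_method(col, rng)
        X_out[:, j] = CANDIDATES[best](col, missing)
        chosen[j] = best
    return X_out, chosen
```

Calling `data_driven_impute(X)` on a NumPy array with `np.nan` marking missing entries returns the completed array and a dictionary mapping each imputed column index to the name of the method selected for it. A method that exploits the temporal structure of longitudinal data would be added as another entry in `CANDIDATES`, although it would need access to the other waves rather than a single column.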


Biometrika ◽  
2016 ◽  
Vol 103 (1) ◽  
pp. 175-187 ◽  
Author(s):  
Jun Shao ◽  
Lei Wang

Abstract To estimate unknown population parameters based on data having nonignorable missing values with a semiparametric exponential tilting propensity, Kim & Yu (2011) assumed that the tilting parameter is known or can be estimated from external data, in order to avoid the identifiability issue. To remove this serious limitation on the methodology, we use an instrument, i.e., a covariate related to the study variable but unrelated to the missing data propensity, to construct some estimating equations. Because these estimating equations are semiparametric, we profile the nonparametric component using a kernel-type estimator and then estimate the tilting parameter based on the profiled estimating equations and the generalized method of moments. Once the tilting parameter is estimated, so is the propensity, and then other population parameters can be estimated using the inverse propensity weighting approach. Consistency and asymptotic normality of the proposed estimators are established. The finite-sample performance of the estimators is studied through simulation, and a real-data example is also presented.
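The chain of steps described in this abstract can be written out compactly. The abstract gives no formulas, so the notation below is assumed for illustration: $\delta$ is the response indicator, $y$ the study variable, $u$ the covariates that enter the propensity, $x$ the instrument, $g(\cdot)$ the unspecified nonparametric component, $\gamma$ the tilting parameter, and $h(\cdot)$ a user-chosen vector of functions of the instrument. The population mean is used as one example of a parameter estimated by inverse propensity weighting.

```latex
% Sketch under assumed notation; not copied from the paper.
\begin{align*}
% Semiparametric exponential tilting propensity (free of the instrument x)
\pi(u, y) &= \Pr(\delta = 1 \mid u, x, y) = \frac{1}{1 + \exp\{g(u) + \gamma y\}}, \\
% Moment conditions built from the instrument, solved for \gamma by GMM
0 &= E\!\left[ h(x) \left\{ \frac{\delta}{\pi(u, y)} - 1 \right\} \right], \\
% Inverse propensity weighting estimator of the mean E(y)
\hat{\mu} &= \frac{1}{n} \sum_{i=1}^{n} \frac{\delta_i \, y_i}{\hat{\pi}(u_i, y_i)}.
\end{align*}
```

The moment condition holds because $E(\delta \mid u, x, y) = \pi(u, y)$, so $\delta / \pi(u, y) - 1$ has conditional mean zero. The estimator $\hat{\mu}$ only involves $y_i$ for respondents, since nonrespondents have $\delta_i = 0$.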


2016 ◽  
Vol 27 (2) ◽  
pp. 133-142
Author(s):  
Radia Taisir ◽  
M Ataharul Islam

Longitudinal studies involve repeated observations over time on the same experimental units, and missingness may occur in a non-ignorable fashion. For such longitudinal missing data, a Markov model may be used to model the binary response, along with a suitable non-response model for the missing portion of the data. The primary interest is in estimating the effects of covariates on the binary response. A similar model for such incomplete longitudinal data exists, in which the regression parameters are estimated by a likelihood method that sums over all possible values of the missing responses. In this paper, we propose an expectation-maximization (EM) algorithm for estimating the regression parameters that is computationally simple and produces estimates of similar efficiency to those of the existing, more complex estimation method. The existing and proposed estimation methods are compared by analyzing Health and Retirement Survey (HRS) data from the United States.
Bangladesh J. Sci. Res. 27(2): 133-142, December-2014
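For orientation, a generic EM formulation for a first-order Markov model of binary responses with a non-ignorable non-response model might look like the following. This is a textbook-style sketch under assumed notation, not the specification used in the paper: $y_{ij}$ is the binary response of unit $i$ at occasion $j$, $r_i$ the vector of response indicators, $x_i$ the covariates, $\beta$ the regression parameters, $\psi$ the non-response parameters, and $\theta = (\beta, \psi)$.

```latex
% Generic sketch; symbols are assumptions, not taken from the paper.
\begin{align*}
% First-order Markov factorization of the response model
f(y_i \mid x_i; \beta) &= f(y_{i1} \mid x_i; \beta)
  \prod_{j=2}^{J} f(y_{ij} \mid y_{i,j-1}, x_i; \beta), \\
% E-step: posterior weight of each possible completion of the missing responses
w_i(y_{\mathrm{mis}}; \theta^{(t)}) &=
  \frac{f(y_{\mathrm{obs},i}, y_{\mathrm{mis}} \mid x_i; \beta^{(t)}) \,
        f(r_i \mid y_{\mathrm{obs},i}, y_{\mathrm{mis}}, x_i; \psi^{(t)})}
       {\sum_{y'_{\mathrm{mis}}} f(y_{\mathrm{obs},i}, y'_{\mathrm{mis}} \mid x_i; \beta^{(t)}) \,
        f(r_i \mid y_{\mathrm{obs},i}, y'_{\mathrm{mis}}, x_i; \psi^{(t)})}, \\
% M-step: maximize the weighted complete-data log-likelihood over \theta
Q(\theta \mid \theta^{(t)}) &= \sum_{i=1}^{n} \sum_{y_{\mathrm{mis}}}
  w_i(y_{\mathrm{mis}}; \theta^{(t)}) \,
  \log\!\left\{ f(y_{\mathrm{obs},i}, y_{\mathrm{mis}} \mid x_i; \beta) \,
                f(r_i \mid y_{\mathrm{obs},i}, y_{\mathrm{mis}}, x_i; \psi) \right\}.
\end{align*}
```

Because the responses are binary, the inner sums run over a finite set of completions, and the M-step reduces to weighted fits of the response and non-response models, alternated with the E-step until the estimates stabilize.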

