Normal Theory GLS Estimator for Missing Data: An Application to Item-Level Missing Data and a Comparison to Two-Stage ML

Two-stage maximum likelihood approach for item-level missing data in regression

Behavior Research Methods ◽

10.3758/s13428-020-01355-x ◽

2020 ◽

Vol 52 (6) ◽

pp. 2306-2323 ◽

Cited By ~ 1

Author(s):

Lihan Chen ◽

Victoria Savalei ◽

Mijke Rhemtulla

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Structural Equation ◽

High Efficiency ◽

Structural Equation Models ◽

Small Sample ◽

Parameter Estimates ◽

Two Stage ◽

Item Level ◽

Scale Scores

Psychologists use scales comprised of multiple items to measure underlying constructs. Missing data on such scales often occur at the item level, whereas the model of interest to the researcher is at the composite (scale score) level. Existing analytic approaches cannot easily accommodate item-level missing data when models involve composites. A very common practice in psychology is to average all available items to produce scale scores. This approach, referred to as available-case maximum likelihood (ACML), may produce biased parameter estimates. Another approach researchers use to deal with item-level missing data is scale-level full information maximum likelihood (SL-FIML), which treats the whole scale as missing if any item is missing. SL-FIML is inefficient and it may also exhibit bias. Multiple imputation (MI) produces the correct results using a simulation-based approach. We study a new analytic alternative for item-level missingness, called two-stage maximum likelihood (TSML; Savalei & Rhemtulla, Journal of Educational and Behavioral Statistics, 42(4), 405–431. 2017). The original work showed the method outperforming ACML and SL-FIML in structural equation models with parcels. The current simulation study examined the performance of ACML, SL-FIML, MI, and TSML in the context of univariate regression. We demonstrated performance issues encountered by ACML and SL-FIML when estimating regression coefficients, under both MCAR and MAR conditions. Aside from convergence issues with small sample sizes and high missingness, TSML performed similarly to MI in all conditions, showing negligible bias, high efficiency, and good coverage. This fast analytic approach is therefore recommended whenever it achieves convergence. R code and a Shiny app to perform TSML are provided.
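
The two scoring rules contrasted in this abstract can be illustrated with a minimal Python sketch. It is not the authors' code (they provide R code and a Shiny app); the data matrix and the function names `available_item_mean` and `scale_level_score` are hypothetical and only show how the scale scores that feed ACML and SL-FIML differ when items are missing.

```python
import numpy as np

# Hypothetical 4-item scale for 3 respondents; np.nan marks a missing item.
items = np.array([
    [4.0, 5.0, 3.0, 4.0],
    [2.0, np.nan, 3.0, 1.0],
    [np.nan, np.nan, 5.0, 5.0],
])

def available_item_mean(x):
    """Average whatever items each respondent answered (available-case scoring)."""
    return np.nanmean(x, axis=1)

def scale_level_score(x):
    """Treat the whole scale score as missing if any item is missing."""
    any_missing = np.isnan(x).any(axis=1)
    return np.where(any_missing, np.nan, np.mean(x, axis=1))

acml_scores = available_item_mean(items)  # every respondent gets a score
sl_scores = scale_level_score(items)      # only complete respondents get a score
```

Under the first rule every respondent keeps a score, even the one who answered only two items; under the second, any respondent with a missing item loses the whole scale score, which is the information loss the abstract attributes to SL-FIML.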

Normal Theory Two-Stage ML Estimator When Data Are Missing at the Item Level

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998617694880 ◽

2017 ◽

Vol 42 (4) ◽

pp. 405-431 ◽

Cited By ~ 1

Author(s):

Victoria Savalei ◽

Mijke Rhemtulla

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Structural Equation ◽

Structural Equation Models ◽

Estimation Method ◽

Analytic Approach ◽

Parameter Estimates ◽

Two Stage ◽

Data Set ◽

Item Level

In many modeling contexts, the variables in the model are linear composites of the raw items measured for each participant; for instance, regression and path analysis models rely on scale scores, and structural equation models often use parcels as indicators of latent constructs. Currently, no analytic estimation method exists to appropriately handle missing data at the item level. Item-level multiple imputation (MI), however, can handle such missing data straightforwardly. In this article, we develop an analytic approach for dealing with item-level missing data—that is, one that obtains a unique set of parameter estimates directly from the incomplete data set and does not require imputations. The proposed approach is a variant of the two-stage maximum likelihood (TSML) methodology, and it is the analytic equivalent of item-level MI. We compare the new TSML approach to three existing alternatives for handling item-level missing data: scale-level full information maximum likelihood, available-case maximum likelihood, and item-level MI. We find that the TSML approach is the best analytic approach, and its performance is similar to item-level MI. We recommend its implementation in popular software and its further study.
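
The idea of moving from item-level estimates to a composite-level model can be sketched as follows. This is a simplified illustration in Python, not the authors' implementation: the stage-1 estimates `mu_items` and `sigma_items` are made-up numbers standing in for saturated ML estimates from the incomplete item-level data, the weight matrix `A` is a hypothetical definition of two scale scores, and standard errors are not addressed.

```python
import numpy as np

# Hypothetical stage-1 output: estimated item means and covariance matrix.
# Items 0-1 form the predictor scale X; items 2-3 form the outcome scale Y.
mu_items = np.array([3.0, 3.2, 2.5, 2.7])
sigma_items = np.array([
    [1.0, 0.6, 0.3, 0.3],
    [0.6, 1.0, 0.3, 0.3],
    [0.3, 0.3, 1.0, 0.5],
    [0.3, 0.3, 0.5, 1.0],
])

# Each row of A averages the items that belong to one scale.
A = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
])

# Composites are linear in the items, so their moments follow directly.
mu_comp = A @ mu_items
sigma_comp = A @ sigma_items @ A.T

# Stage 2 (simple regression of Y on X) from the composite moments.
slope = sigma_comp[0, 1] / sigma_comp[0, 0]
intercept = mu_comp[1] - slope * mu_comp[0]
```

Because the composites are linear functions of the items, no scale score ever has to be computed for an individual respondent; only the estimated item-level moments are transformed.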

A Two-stage Deep Autoencoder-based Missing Data Imputation Method for Wind Farm SCADA Data

IEEE Sensors Journal ◽

10.1109/jsen.2021.3061109 ◽

2021 ◽

pp. 1-1

Author(s):

Xin Liu ◽

Zijun Zhang

Keyword(s):

Missing Data ◽

Wind Farm ◽

Imputation Method ◽

Data Imputation ◽

Two Stage ◽

Missing Data Imputation

Examining the effect of missing data on RMSEA and CFI under normal theory full-information maximum likelihood

Structural Equation Modeling A Multidisciplinary Journal ◽

10.1080/10705511.2019.1642111 ◽

2019 ◽

Vol 27 (2) ◽

pp. 219-239 ◽

Cited By ~ 3

Author(s):

Xijuan Zhang ◽

Victoria Savalei

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Full Information ◽

Normal Theory ◽

Full Information Maximum Likelihood

Using Information Criteria Under Missing Data: Full Information Maximum Likelihood Versus Two-Stage Estimation

Structural Equation Modeling A Multidisciplinary Journal ◽

10.1080/10705511.2020.1780925 ◽

2020 ◽

pp. 1-14

Author(s):

Keke Lai

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Information Criteria ◽

Full Information ◽

Two Stage ◽

Full Information Maximum Likelihood ◽

Stage Estimation

Variance Estimation in Two-Stage Cluster Sampling under Imputation for Missing Data

Journal of Statistical Theory and Practice ◽

10.1080/15598608.2010.10412021 ◽

2010 ◽

Vol 4 (4) ◽

pp. 827-844 ◽

Cited By ~ 6

Author(s):

David Haziza ◽

J. N. K. Rao

Keyword(s):

Missing Data ◽

Variance Estimation ◽

Cluster Sampling ◽

Two Stage

Hierarchical Bayes Approach for Analysis of Item-Level Missing Data

International Journal of Clinical Biostatistics and Biometrics ◽

10.23937/2469-5831/1510009 ◽

2016 ◽

Vol 2 (1) ◽

Author(s):

Junshan Qiu

Keyword(s):

Missing Data ◽

Hierarchical Bayes ◽

Item Level ◽

Bayes Approach

Addressing Item-Level Missing Data: A Comparison of Proration and Full Information Maximum Likelihood Estimation

Multivariate Behavioral Research ◽

10.1080/00273171.2015.1068157 ◽

2015 ◽

Vol 50 (5) ◽

pp. 504-519 ◽

Cited By ~ 40

Author(s):

Gina L. Mazza ◽

Craig K. Enders ◽

Linda S. Ruehlman

Keyword(s):

Missing Data ◽

Maximum Likelihood ◽

Maximum Likelihood Estimation ◽

Likelihood Estimation ◽

Full Information ◽

Full Information Maximum Likelihood ◽

Item Level

PC program extending the two-stage polynomial growth curve model to allow missing data

International Journal of Bio-Medical Computing ◽

10.1016/0020-7101(93)90042-5 ◽

1993 ◽

Vol 33 (3-4) ◽

pp. 287-296 ◽

Cited By ~ 2

Author(s):

Amy M. Furey ◽

Thomas R. Ten Have ◽

Charles J. Kowalski ◽

Emet D. Schneiderman ◽

Stephen M. Willis

Keyword(s):

Missing Data ◽

Growth Curve ◽

Polynomial Growth ◽

Growth Curve Model ◽

Two Stage ◽

Curve Model

Multistep autoregressive reconstruction of seismic records

Geophysics ◽

10.1190/1.2771685 ◽

2007 ◽

Vol 72 (6) ◽

pp. V111-V118 ◽

Cited By ~ 95

Author(s):

Mostafa Naghizadeh ◽

Mauricio D. Sacchi

Keyword(s):

Missing Data ◽

Field Data ◽

Linear Prediction ◽

Fourier Method ◽

Weighted Norm ◽

Sampled Data ◽

Regular Grid ◽

Two Stage ◽

Low Frequencies ◽

Seismic Records

Linear prediction filters in the [Formula: see text] domain are widely used to interpolate regularly sampled data. We study the problem of reconstructing irregularly missing data on a regular grid using linear prediction filters. We propose a two-stage algorithm. First, we reconstruct the unaliased part of the data spectrum using a Fourier method (minimum-weighted norm interpolation). Then, prediction filters for all the frequencies are extracted from the reconstructed low frequencies. The latter is implemented via a multistep autoregressive (MSAR) algorithm. Finally, these prediction filters are used to reconstruct the complete data in the [Formula: see text] domain. The applicability of the proposed method is examined using synthetic and field data examples.
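
The building block named in this abstract, a linear prediction filter, can be sketched in Python. This is a generic least-squares autoregressive estimate on a synthetic series, not the MSAR algorithm itself; the function `estimate_prediction_filter` and the test signal are assumptions made for illustration.

```python
import numpy as np

def estimate_prediction_filter(x, order):
    """Least-squares forward filter a with x[n] ~ sum_k a[k] * x[n-1-k]."""
    # Each row holds the `order` samples preceding x[n], most recent first.
    rows = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    targets = x[order:]
    a, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return a

# Synthetic series: two complex harmonics, which an order-2 filter predicts exactly.
n = np.arange(20)
x = np.exp(1j * 0.3 * n) + 0.5 * np.exp(-1j * 0.8 * n)

# Estimate the filter from the first 15 samples, then predict sample 15.
a = estimate_prediction_filter(x[:15], order=2)
predicted = a @ x[13:15][::-1]
prediction_error = abs(predicted - x[15])
```

A sum of p complex harmonics is predicted exactly by a filter of order p, which is why filters of this kind suit regularly sampled data; the contribution described in the abstract lies in how the filters are obtained when samples are irregularly missing.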
