Improved Small Sample Inference Methods for a Mixed-Effects Model for Repeated Measures Approach in Incomplete Longitudinal Data Analysis

Stats, 2019, Vol. 2(2), pp. 174–188
Author(s): Yoshifumi Ukyo, Hisashi Noma, Kazushi Maruo, Masahiko Gosho

The mixed-effects model for repeated measures (MMRM) approach has been widely applied in longitudinal clinical trials. Many standard inference methods for MMRM can inflate the type I error rate of tests of treatment effect when the longitudinal dataset is small and involves missing measurements. We propose two improved inference methods for MMRM analyses: (1) the Bartlett correction, with the adjustment term approximated by bootstrap, and (2) a Monte Carlo test using a bootstrap estimate of the null distribution. Both methods can be implemented via a unified computational framework, regardless of model complexity and missing-data patterns. In simulation studies, the proposed methods maintained the type I error rate properly, even in small, incomplete longitudinal clinical trial settings. Applications to a postnatal depression clinical trial are also presented.
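Both proposals share a single computational backbone: simulate the null distribution of the test statistic by parametric bootstrap, then either rescale the statistic so its mean matches the chi-square reference (Bartlett) or read the p-value directly off the simulated distribution (Monte Carlo). The sketch below illustrates that idea on a deliberately simple likelihood-ratio test for a normal mean, not a full MMRM; the toy model, `lrt_stat`, and all settings are illustrative stand-ins, not the authors' implementation.

```python
# A minimal sketch of bootstrap-based Bartlett correction and Monte Carlo
# testing for a likelihood-ratio (LR) statistic. The model is a toy normal-mean
# test (H0: mu = 0), standing in for the far richer MMRM likelihood.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def lrt_stat(y):
    """LR statistic for H0: mu = 0 vs H1: mu free (normal, unknown variance)."""
    n = y.size
    s2_alt = np.var(y)           # MLE of the variance under H1
    s2_null = np.mean(y ** 2)    # MLE of the variance under H0 (mu fixed at 0)
    return n * (np.log(s2_null) - np.log(s2_alt))

def corrected_tests(y, n_boot=2000, df=1):
    """Return (Bartlett-corrected p, Monte Carlo p) from one bootstrap run."""
    t_obs = lrt_stat(y)
    sigma0 = np.sqrt(np.mean(y ** 2))            # null-model variance estimate
    t_boot = np.array([lrt_stat(rng.normal(0.0, sigma0, y.size))
                       for _ in range(n_boot)])
    # Bartlett: rescale so the bootstrap-estimated null mean matches df
    p_bartlett = stats.chi2.sf(t_obs * df / t_boot.mean(), df)
    # Monte Carlo: compare the observed statistic to the simulated null draws
    p_monte_carlo = (1 + (t_boot >= t_obs).sum()) / (1 + n_boot)
    return p_bartlett, p_monte_carlo

y = rng.normal(0.0, 1.0, 12)    # a small sample, as in the paper's setting
print(corrected_tests(y))
```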

2017, Vol. 28(3), pp. 801–821
Author(s): Thomas O. Jemielita, Mary E. Putt, Devan V. Mehrotra

Incomplete block crossover trials with period-specific baseline and post-baseline (outcome) measures for each subject are often used in clinical drug development; without loss of generality, we focus on the three-treatment, two-period (3 × 2) crossover. Data from such trials are commonly analyzed using a mixed effects model with indicator terms for treatment and period, and an unstructured covariance matrix for the vector of intra-subject measurements. It is well known that treatment effect estimates from this analysis are complex functions of both within-subject and between-subject treatment contrasts. We caution that the associated type I error rate and power for hypothesis testing can be non-trivially influenced by how the baselines are utilized. Specifically, the mixed effects analysis that uses change from baseline as the dependent variable is shown to consistently underperform corresponding analyses in which the outcome is the dependent variable and linear combinations of the baselines are used as period-specific and/or period-invariant covariates. A simpler fixed effects analysis of covariance involving only within-subject contrasts is also described for small-sample situations in which the mixed effects analyses can suffer from increased type I error rates. Theoretical insights, simulation results, and an illustrative example with real data are used to develop the main points.
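The contrast between the two baseline strategies is easy to reproduce in miniature. The toy simulation below, a parallel-group simplification rather than the 3 × 2 crossover, with made-up data-generating values, compares the sampling variability of the treatment estimate from a change-score analysis against an analysis of the outcome with baseline as a covariate; the covariate analysis is noticeably more precise whenever the outcome-baseline regression slope is far from one.

```python
# Toy comparison: change-from-baseline analysis vs. baseline-as-covariate
# (ANCOVA). All data-generating values are illustrative, not from the paper.
import numpy as np

rng = np.random.default_rng(4)

def one_trial(n=40):
    base = rng.normal(10.0, 2.0, n)
    trt = rng.integers(0, 2, n)
    out = 0.5 * base + rng.normal(0.0, 1.0, n)    # no true treatment effect
    # change-score analysis: regress (out - base) on treatment
    X_chg = np.column_stack([np.ones(n), trt])
    b_chg = np.linalg.lstsq(X_chg, out - base, rcond=None)[0][1]
    # ANCOVA: regress out on treatment with baseline as a covariate
    X_anc = np.column_stack([np.ones(n), trt, base])
    b_anc = np.linalg.lstsq(X_anc, out, rcond=None)[0][1]
    return b_chg, b_anc

est = np.array([one_trial() for _ in range(2000)])
print("SD of change-score estimate:", est[:, 0].std())
print("SD of ANCOVA estimate:     ", est[:, 1].std())
```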


2019, Vol. 3
Author(s): Nicolas Haverkamp, André Beauducel

To derive recommendations on how to analyze longitudinal data, we examined Type I error rates of multilevel linear models (MLM) and repeated measures analysis of variance (rANOVA) using SAS and SPSS. To explore the effects of many measurement occasions and small sample sizes on Type I error, we simulated m = 9 and 12 measurement occasions and sample sizes of n = 15, 20, 25, and 30, and also inspected the effect of non-sphericity in the population: 5,000 random samples were drawn from two populations containing neither a within-subject nor a between-group effect. They were analyzed with the most common correction options for rANOVA and MLM results: the Huynh-Feldt correction for rANOVA (rANOVA-HF) and the Kenward-Roger correction for MLM with an unstructured covariance matrix (MLM-UN-KR), which could help correct the progressive bias of MLM-UN. Uncorrected rANOVA and MLM assuming a compound symmetry covariance structure (MLM-CS) were also taken into account. The results showed a progressive bias for MLM-UN in small samples that was stronger in SPSS than in SAS. Moreover, rANOVA-HF corrected the Type I error bias appropriately, whereas MLM-UN-KR corrected it insufficiently for n < 30. These findings suggest using MLM-CS or uncorrected rANOVA when sphericity holds, and rANOVA-HF when sphericity is violated. If an analysis requires MLM, SPSS yields more accurate Type I error rates for MLM-CS and SAS yields more accurate Type I error rates for MLM-UN.
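The core of such a study is a simple loop: draw null data, run the test, record rejections. A minimal sketch under stated assumptions follows, using a hand-coded, uncorrected one-way rANOVA F test on a spherical (iid) null population, where the empirical rate should sit near the nominal 5%; the corrections studied above become relevant once the population covariance violates sphericity.

```python
# Empirical Type I error of an uncorrected one-way repeated measures ANOVA
# under a null population with no effects. Settings mirror the paper's ranges;
# the iid draws satisfy sphericity, so the uncorrected test should be accurate.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, m, alpha, n_sims = 15, 9, 0.05, 5000
rejections = 0
for _ in range(n_sims):
    y = rng.normal(size=(n, m))              # null data: no within-subject effect
    grand, subj, occ = y.mean(), y.mean(1), y.mean(0)
    ss_occ = n * ((occ - grand) ** 2).sum()
    ss_err = ((y - subj[:, None] - occ[None, :] + grand) ** 2).sum()
    f = (ss_occ / (m - 1)) / (ss_err / ((n - 1) * (m - 1)))
    if stats.f.sf(f, m - 1, (n - 1) * (m - 1)) < alpha:
        rejections += 1
print(f"empirical Type I error: {rejections / n_sims:.3f}")
```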


1994, Vol. 19(1), pp. 57–71
Author(s): Stephen M. Quintana, Scott E. Maxwell

The purpose of this study was to evaluate seven univariate procedures for testing omnibus null hypotheses with data from repeated measures designs. Five alternative approaches are compared with the two more traditional adjustment procedures (Geisser and Greenhouse's ε̂ and Huynh and Feldt's ε̃), neither of which may be entirely adequate when sample sizes are small and the number of levels of the repeated factor is large. Empirical Type I error rates and power levels were obtained by simulation for conditions in which small samples occur in combination with many levels of the repeated factor. Results suggested that the alternative univariate approaches were improvements over the traditional approaches. One alternative approach in particular was found to be most effective in controlling Type I error rates without unduly sacrificing power.
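For reference, the Geisser-Greenhouse ε̂ that both traditional procedures build on can be computed directly from the sample covariance matrix of the repeated measurements. A sketch follows (illustrative, not the study's code):

```python
# Greenhouse-Geisser epsilon-hat (the Box epsilon estimate) from a data matrix
# y of shape (n subjects, k occasions); the adjusted test refers F to an
# F(eps*(k-1), eps*(n-1)*(k-1)) reference distribution.
import numpy as np

def gg_epsilon(y):
    n, k = y.shape
    s = np.cov(y, rowvar=False)          # sample covariance of the k occasions
    mean_diag = np.trace(s) / k
    mean_all = s.mean()
    row_means = s.mean(axis=1)
    num = (k * (mean_diag - mean_all)) ** 2
    den = (k - 1) * ((s ** 2).sum()
                     - 2 * k * (row_means ** 2).sum()
                     + k ** 2 * mean_all ** 2)
    return num / den

rng = np.random.default_rng(5)
print(gg_epsilon(rng.normal(size=(12, 6))))   # small n, several repeated levels
```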


2021
Author(s): Josue E. Rodriguez, Donald Ray Williams, Paul-Christian Bürkner

Categorical moderators are often included in mixed-effects meta-analyses to explain heterogeneity in effect sizes. Tests of moderator effects assume a constant between-study variance across all levels of the moderator. Although this assumption rarely receives serious thought, imposing it can have drastic ramifications. We propose that researchers should instead assume unequal between-study variances by default. To achieve this, we suggest using a mixed-effects location-scale model (MELSM) to allow group-specific estimates of the between-study variances. In two extensive simulation studies, we show that in terms of Type I error and statistical power, nearly nothing is lost by using the MELSM for moderator tests, but there can be serious costs when a mixed-effects model with equal variances is used. Most notably, in scenarios with balanced sample sizes or equal between-study variances, the Type I error and power rates are nearly identical between the mixed-effects model and the MELSM. With imbalanced sample sizes and unequal variances, however, the Type I error rate under the mixed-effects model can be grossly inflated or overly conservative, whereas the MELSM controlled the Type I error rate well across all scenarios. With respect to power, the MELSM had comparable or higher power than the mixed-effects model in all conditions where the latter produced valid (i.e., not inflated) Type I error rates. Altogether, our results strongly support assuming unequal between-study variances as the default strategy when testing categorical moderators.
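The essential move, group-specific between-study variances, can be illustrated without the full location-scale machinery. The sketch below estimates a separate τ² per moderator level with the DerSimonian-Laird estimator, a deliberately crude frequentist stand-in for the Bayesian MELSM the authors propose; names and inputs are illustrative.

```python
# Estimating a separate between-study variance (tau^2) for each level of a
# categorical moderator via DerSimonian-Laird, a crude stand-in for the
# MELSM's group-specific scale parameters.
import numpy as np

def dl_tau2(effects, variances):
    """DerSimonian-Laird tau^2 estimate for one moderator level."""
    w = 1.0 / variances
    theta = (w * effects).sum() / w.sum()           # fixed-effect pooled mean
    q = (w * (effects - theta) ** 2).sum()          # Cochran's Q
    c = w.sum() - (w ** 2).sum() / w.sum()
    return max(0.0, (q - (effects.size - 1)) / c)   # truncated at zero

rng = np.random.default_rng(6)
for level, tau in (("A", 0.4), ("B", 0.1)):         # unequal true between-study SDs
    v = rng.uniform(0.01, 0.05, size=20)            # sampling variances
    d = rng.normal(0.3, tau, size=20) + rng.normal(0, np.sqrt(v))
    print(level, dl_tau2(d, v))
```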


2021, pp. 001316442199489
Author(s): Luyao Peng, Sandip Sinharay

Wollack et al. (2015) suggested the erasure detection index (EDI) for detecting fraudulent erasures for individual examinees. Wollack and Eckerly (2017) and Sinharay (2018) extended the index to three EDIs for detecting fraudulent erasures at the aggregate or group level. This article follows up on that research and suggests a new aggregate-level EDI that incorporates the empirical best linear unbiased predictor (EBLUP) from the linear mixed-effects model literature (e.g., McCulloch et al., 2008). A simulation study shows that the new EDI has greater power than the indices of Wollack and Eckerly (2017) and Sinharay (2018), while maintaining satisfactory Type I error rates. A real data example is also included.
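The EBLUP at the heart of the new index is the familiar shrinkage predictor from random-intercept models: each group's observed mean is pulled toward the overall mean, more strongly for small groups. A schematic version, with variance components treated as known for simplicity (in practice they are estimated from the fitted mixed model):

```python
# Schematic EBLUP for a random-intercept model: the predicted group-level
# deviation from the overall mean mu, shrunk by the reliability-type factor
# lambda_j = tau2 / (tau2 + sigma2 / n_j).
import numpy as np

def eblup(group_means, group_sizes, mu, tau2, sigma2):
    n_j = np.asarray(group_sizes, dtype=float)
    lam = tau2 / (tau2 + sigma2 / n_j)      # more data -> less shrinkage
    return lam * (np.asarray(group_means) - mu)

# a large group is shrunk less than a small one with the same raw deviation
print(eblup([1.2, 1.2], [100, 5], mu=1.0, tau2=0.04, sigma2=1.0))
```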


2021
Author(s): Megha Joshi, James E. Pustejovsky, S. Natasha Beretvas

The most common and well-known meta-regression models work under the assumption that there is only one effect size estimate per study and that the estimates are independent. However, meta-analytic reviews of social science research often include multiple effect size estimates per primary study, leading to dependence in the estimates. Some meta-analyses also include multiple studies conducted by the same lab or investigator, creating another potential source of dependence. An increasingly popular method for handling dependence is robust variance estimation (RVE), but this method can produce inflated Type I error rates when the number of studies is small. Small-sample correction methods for RVE have been shown to control Type I error rates adequately but may be overly conservative, especially for tests of multiple-contrast hypotheses. We evaluated an alternative method for handling dependence, cluster wild bootstrapping, which has been examined in the econometrics literature but not in the context of meta-analysis. Results from two simulation studies indicate that cluster wild bootstrapping maintains adequate Type I error rates and provides more power than existing small-sample correction methods, particularly for multiple-contrast hypothesis tests. We recommend cluster wild bootstrapping for hypothesis tests in meta-analyses with a small number of studies, and we have created an R package that implements such tests.
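In outline, cluster wild bootstrapping re-signs the residuals from the null-constrained fit with one Rademacher draw per cluster (here, per study), refits, and compares the observed statistic to the resulting distribution. Below is a self-contained sketch for a single meta-regression slope, using unweighted OLS rather than the inverse-variance-weighted fits of a real RVE meta-regression; all names are illustrative.

```python
# Cluster wild bootstrap p-value for H0: slope = 0 in a toy regression with
# clustered (study-level) dependence.
import numpy as np

rng = np.random.default_rng(2)

def cwb_pvalue(y, x, cluster, n_boot=1999):
    X = np.column_stack([np.ones_like(x), x])
    slope_obs = abs(np.linalg.lstsq(X, y, rcond=None)[0][1])
    resid0 = y - y.mean()            # residuals from the null (intercept-only) fit
    ids = np.unique(cluster)         # sorted cluster labels
    count = 0
    for _ in range(n_boot):
        signs = rng.choice([-1.0, 1.0], size=ids.size)   # one Rademacher draw per cluster
        y_star = y.mean() + signs[np.searchsorted(ids, cluster)] * resid0
        slope_star = abs(np.linalg.lstsq(X, y_star, rcond=None)[0][1])
        count += slope_star >= slope_obs
    return (1 + count) / (1 + n_boot)

y = rng.normal(size=30)
x = rng.normal(size=30)
cluster = np.repeat(np.arange(10), 3)    # 10 "studies" with 3 estimates each
print(cwb_pvalue(y, x, cluster))
```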


Methodology, 2009, Vol. 5(2), pp. 60–70
Author(s): W. Holmes Finch, Teresa Davenport

Permutation testing has been suggested as an alternative to the standard approximate F tests used in multivariate analysis of variance (MANOVA). These approximate tests, such as Wilks' Lambda and Pillai's Trace, have been shown to perform poorly when the assumptions of normally distributed dependent variables and homogeneous group covariance matrices are violated. Because Monte Carlo permutation tests do not rely on distributional assumptions, they may be expected to work better than their approximate cousins when the data do not conform to these assumptions. The current simulation study compared the performance of four standard MANOVA test statistics with their Monte Carlo permutation-based counterparts under a variety of small-sample conditions, both when the assumptions were met and when they were not. Results suggest that for samples of 50 subjects, power is very low for all of the statistics. In addition, Type I error rates for both the approximate F and Monte Carlo tests were inflated when the data were nonnormal and the covariance matrices unequal. In general, the Monte Carlo permutation tests performed slightly better in terms of Type I error rates and power when the assumptions of normality and homogeneous covariance matrices were both violated. These simulations were based on the three-group case only, so the results generalize only to similar situations.
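The permutation counterpart of one of those statistics is easy to sketch: recompute Wilks' Lambda under random relabelings of group membership and take the proportion of permuted values at least as extreme (smaller Lambda indicates stronger group separation). A minimal illustration, not the study's simulation code:

```python
# Monte Carlo permutation test of Wilks' Lambda for a one-way MANOVA.
import numpy as np

rng = np.random.default_rng(3)

def wilks_lambda(y, g):
    """y: (n, p) outcomes; g: (n,) group labels. Smaller = more separation."""
    grand = y.mean(axis=0)
    w = sum((y[g == k] - y[g == k].mean(0)).T @ (y[g == k] - y[g == k].mean(0))
            for k in np.unique(g))                       # within-group SSCP
    b = sum((g == k).sum() * np.outer(y[g == k].mean(0) - grand,
                                      y[g == k].mean(0) - grand)
            for k in np.unique(g))                       # between-group SSCP
    return np.linalg.det(w) / np.linalg.det(w + b)

def perm_pvalue(y, g, n_perm=999):
    lam_obs = wilks_lambda(y, g)
    count = sum(wilks_lambda(y, rng.permutation(g)) <= lam_obs
                for _ in range(n_perm))
    return (1 + count) / (1 + n_perm)

y = rng.normal(size=(50, 4))              # ~50 subjects, 4 dependent variables
g = np.repeat([0, 1, 2], [17, 17, 16])    # three groups, as in the study
print(perm_pvalue(y, g))
```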


1994, Vol. 19(2), pp. 119–126
Author(s): Ru San Chen, William P. Dunlap

Lecoutre (1991) pointed out an error in the Huynh and Feldt (1976) formula for ε̃, which is used to adjust the degrees of freedom of an approximate test in repeated measures designs with two or more independent groups. The present simulation study confirms that Lecoutre's corrected ε̃ yields less biased estimation of the population ε and reduces Type I error rates compared to Huynh and Feldt's (1976) ε̃. The gain in Type I error accuracy for group-treatment interactions can become substantial when sample sizes are close to the number of treatment levels.
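For concreteness, a sketch of the corrected formula: with N total subjects in g groups and k repeated measures, Lecoutre's fix replaces N with N − g + 1 in the numerator of Huynh and Feldt's ε̃, given a Greenhouse-Geisser ε̂ already computed from the data.

```python
# Lecoutre's (1991) corrected Huynh-Feldt epsilon-tilde for designs with
# g independent groups, N total subjects, and k repeated measures. eps_gg is
# the Greenhouse-Geisser epsilon-hat from the sample covariance matrix.
def hf_epsilon_lecoutre(eps_gg, n_total, k, g):
    num = (n_total - g + 1) * (k - 1) * eps_gg - 2
    den = (k - 1) * (n_total - g - (k - 1) * eps_gg)
    return min(1.0, num / den)            # epsilon-tilde is capped at 1

# small N with group sizes close to k, the regime the abstract highlights
print(hf_epsilon_lecoutre(0.60, n_total=12, k=4, g=2))
```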

